VoltAgent
diff --git a/.changeset/all-zebras-grin.md b/.changeset/all-zebras-grin.md
Lines changed: 43 additions & 0 deletions
diff --git a/.changeset/lemon-shirts-look.md b/.changeset/lemon-shirts-look.md
Lines changed: 115 additions & 0 deletions
diff --git a/.changeset/polite-cups-wear.md b/.changeset/polite-cups-wear.md
Lines changed: 66 additions & 0 deletions
diff --git a/packages/core/src/agent/agent.ts b/packages/core/src/agent/agent.ts
Lines changed: 45 additions & 12 deletions
@@ -0,0 +1,43 @@
+---
+"@voltagent/core": patch
+---
+
+feat: encapsulate tool-specific metadata in toolContext + prevent AI SDK context collision
+
+## Changes
+
+### 1. Tool Context Encapsulation
+
+Tool-specific metadata is now grouped under an optional `toolContext` field, which separates it from session context and leaves room for new fields.
+
+**Migration:**
+
+```typescript
+// Before
+execute: async ({ location }, options) => {
+  // Fields were flat (planned, not released)
+};
+
+// After
+execute: async ({ location }, options) => {
+  const { name, callId, messages, abortSignal } = options?.toolContext || {};
+
+  // Session context remains flat
+  const userId = options?.userId;
+  const logger = options?.logger;
+  const context = options?.context;
+};
+```
+
+### 2. AI SDK Context Field Protection
+
+`context` is now explicitly excluded from the options spread into AI SDK calls. This prevents a naming collision if the AI SDK renames `experimental_context` to `context`.
+
+## Benefits
+
+- ✅ Better organization - tool metadata in one place
+- ✅ Clearer separation - session context vs tool context
+- ✅ Future-proof - easy to add new tool metadata fields
+- ✅ Namespace safety - no collision with OperationContext or AI SDK fields
+- ✅ Backward compatible - `toolContext` is optional for external callers (MCP servers)
+- ✅ Protected from AI SDK breaking changes
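
The exclusion described in item 2 relies on destructuring: a field that is named in the pattern does not end up in the rest object. The following is a minimal sketch of that idea, not the code in `agent.ts`. The `GenerateOptions` shape and the `toSdkSettings` helper are hypothetical; only the field names `userId`, `conversationId`, and `context` are taken from the diff.

```typescript
// Hypothetical option shape for illustration only.
interface GenerateOptions {
  userId?: string;
  conversationId?: string;
  context?: Record<string, unknown>;
  temperature?: number;
}

// Naming a field in the destructuring pattern keeps it out of the rest object,
// so it can never be spread into a downstream SDK call by accident.
function toSdkSettings(options: GenerateOptions): Record<string, unknown> {
  const { userId, conversationId, context, ...sdkSettings } = options;
  // The named fields are handled elsewhere; only the remainder is forwarded.
  void userId;
  void conversationId;
  void context;
  return sdkSettings;
}

const settings = toSdkSettings({
  userId: "user-1",
  context: { tenant: "acme" },
  temperature: 0.2,
});
```

Here `settings` contains only `temperature`, so a later rename on the SDK side cannot silently pick up VoltAgent's `context` value.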
@@ -0,0 +1,115 @@
+---
+"@voltagent/core": patch
+---
+
+feat: add multi-modal tool results support with toModelOutput - #722
+
+Tools can now return images, media, and rich content to AI models using the `toModelOutput` function.
+
+## The Problem
+
+AI agents couldn't receive visual information from tools - everything had to be text or JSON. This limited use cases like:
+
+- Computer use agents that need to see screenshots
+- Image analysis workflows
+- Visual debugging tools
+- Any tool that produces media output
+
+## The Solution
+
+Added `toModelOutput?: (output) => ToolResultOutput` to tool options. This function transforms your tool's output into a format the AI model can understand, including images and media.
+
+```typescript
+import { createTool } from "@voltagent/core";
+import fs from "fs";
+import { z } from "zod";
+
+const screenshotTool = createTool({
+  name: "take_screenshot",
+  description: "Takes a screenshot of the screen",
+  parameters: z.object({
+    region: z.string().optional().describe("Region to capture"),
+  }),
+  execute: async ({ region }) => {
+    const imageData = fs.readFileSync("./screenshot.png").toString("base64");
+    return {
+      type: "image",
+      data: imageData,
+      timestamp: new Date().toISOString(),
+    };
+  },
+  toModelOutput: (result) => ({
+    type: "content",
+    value: [
+      { type: "text", text: `Screenshot captured at ${result.timestamp}` },
+      { type: "media", data: result.data, mediaType: "image/png" },
+    ],
+  }),
+});
+```
+
+## Return Formats
+
+The `toModelOutput` function can return multiple formats:
+
+**Text output:**
+
+```typescript
+toModelOutput: (result) => ({
+  type: "text",
+  value: result.summary,
+});
+```
+
+**JSON output:**
+
+```typescript
+toModelOutput: (result) => ({
+  type: "json",
+  value: { status: "success", data: result },
+});
+```
+
+**Multi-modal content (text + media):**
+
+```typescript
+toModelOutput: (result) => ({
+  type: "content",
+  value: [
+    { type: "text", text: "Analysis complete" },
+    { type: "media", data: result.imageBase64, mediaType: "image/png" },
+  ],
+});
+```
+
+**Error handling:**
+
+```typescript
+toModelOutput: (result) => ({
+  type: "error-text",
+  value: result.errorMessage,
+});
+```
+
+## Impact
+
+- **Visual AI Workflows**: Build computer use agents that can see and interact with UIs
+- **Image Generation**: Tools can return generated images directly to the model
+- **Debugging**: Return screenshots and visual debugging information
+- **Rich Responses**: Combine text explanations with visual evidence
+
+## Usage with Anthropic
+
+```typescript
+import { Agent } from "@voltagent/core";
+import { anthropic } from "@ai-sdk/anthropic";
+
+const agent = new Agent({
+  name: "visual-assistant",
+  tools: [screenshotTool],
+  model: anthropic("claude-3-5-sonnet-20241022"),
+});
+
+const result = await agent.generateText("Take a screenshot and describe what you see");
+// The agent receives both the text and the image, so it can analyze the screenshot
+```
+
+See [AI SDK documentation](https://sdk.vercel.ai/docs/ai-sdk-core/tools-and-tool-calling#multi-modal-tool-results) for more details on multi-modal tool results.
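
The return formats listed in this changeset can be combined in one `toModelOutput` function that picks a variant per result. The sketch below assumes a simplified local union, `SketchToolOutput`, that mirrors only the four variants shown above. It is a stand-in, not the AI SDK's `ToolResultOutput` type, and `CaptureResult` is a hypothetical result shape.

```typescript
// Simplified local stand-in for the output variants shown in the changeset.
type TextPart = { type: "text"; text: string };
type MediaPart = { type: "media"; data: string; mediaType: string };
type SketchToolOutput =
  | { type: "text"; value: string }
  | { type: "json"; value: unknown }
  | { type: "error-text"; value: string }
  | { type: "content"; value: Array<TextPart | MediaPart> };

// Hypothetical result shape for a screenshot-style tool.
interface CaptureResult {
  ok: boolean;
  timestamp: string;
  imageBase64?: string;
  errorMessage?: string;
}

function toModelOutput(result: CaptureResult): SketchToolOutput {
  if (!result.ok) {
    // Failures are reported as error text.
    return { type: "error-text", value: result.errorMessage ?? "Capture failed" };
  }
  if (!result.imageBase64) {
    // Without an image there is nothing multi-modal to send.
    return { type: "text", value: `No image captured at ${result.timestamp}` };
  }
  return {
    type: "content",
    value: [
      { type: "text", text: `Screenshot captured at ${result.timestamp}` },
      { type: "media", data: result.imageBase64, mediaType: "image/png" },
    ],
  };
}

const failed = toModelOutput({ ok: false, timestamp: "t0", errorMessage: "no display" });
const captured = toModelOutput({ ok: true, timestamp: "t1", imageBase64: "aGVsbG8=" });
```

Keeping the branching inside `toModelOutput` lets `execute` return one plain result object while the model-facing format is decided in a single place.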
@@ -0,0 +1,66 @@
+---
+"@voltagent/core": patch
+---
+
+feat: add providerOptions support to tools for provider-specific features - #759
+
+Tools can now accept `providerOptions` to enable provider-specific features like Anthropic's cache control. This aligns VoltAgent tools with the AI SDK's tool API.
+
+## The Problem
+
+Users wanted to use provider-specific features like Anthropic's prompt caching to reduce costs and latency, but VoltAgent's `createTool()` didn't support the `providerOptions` field that AI SDK tools have.
+
+## The Solution
+
+**What Changed:**
+
+- Added `providerOptions?: ProviderOptions` field to `ToolOptions` type
+- VoltAgent tools now accept and pass through provider options to the AI SDK
+- Supports all provider-specific features: cache control, reasoning settings, etc.
+
+**What Gets Enabled:**
+
+```typescript
+import { createTool } from "@voltagent/core";
+import { z } from "zod";
+
+const cityAttractionsTool = createTool({
+  name: "get_city_attractions",
+  description: "Get tourist attractions for a city",
+  parameters: z.object({
+    city: z.string().describe("The city name"),
+  }),
+  providerOptions: {
+    anthropic: {
+      cacheControl: { type: "ephemeral" },
+    },
+  },
+  execute: async ({ city }) => {
+    return await fetchAttractions(city);
+  },
+});
+```
+
+## Impact
+
+- **Cost Optimization:** Anthropic cache control can reduce API costs and latency when the same tool definitions are sent across repeated requests
+- **Future-Proof:** New provider-specific options pass through without further changes to `createTool()`
+- **Type-Safe:** Uses official AI SDK `ProviderOptions` type
+- **Zero Breaking Changes:** Optional field, fully backward compatible
+
+## Usage
+
+Use with any provider that supports provider-specific options:
+
+```typescript
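+import { Agent } from "@voltagent/core";
+import { anthropic } from "@ai-sdk/anthropic";
+
+const agent = new Agent({
+  name: "Travel Assistant",
+  model: anthropic("claude-3-5-sonnet-20241022"),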
+  tools: [cityAttractionsTool], // Tool with cacheControl enabled
+});
+
+await agent.generateText("What are the top attractions in Paris?");
+// Tool definition cached by Anthropic for improved performance
+```
+
+Learn more: [Anthropic Cache Control](https://ai-sdk.dev/providers/ai-sdk-providers/anthropic#cache-control)
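
The pass-through described in this changeset can be pictured as a conversion step that copies `providerOptions` only when the tool sets it. This is a sketch under assumptions: `SketchProviderOptions`, `SketchTool`, and `toSdkTool` are hypothetical local names, and the conversion in VoltAgent itself may look different.

```typescript
// Local stand-in; the real ProviderOptions type is exported by the AI SDK.
type SketchProviderOptions = Record<string, Record<string, unknown>>;

interface SketchTool {
  name: string;
  description: string;
  providerOptions?: SketchProviderOptions;
}

// Hypothetical conversion step: forward providerOptions unchanged when present
// and omit the key entirely when the tool does not define it.
function toSdkTool(tool: SketchTool): Record<string, unknown> {
  return {
    description: tool.description,
    ...(tool.providerOptions ? { providerOptions: tool.providerOptions } : {}),
  };
}

const cached = toSdkTool({
  name: "get_city_attractions",
  description: "Get tourist attractions for a city",
  providerOptions: { anthropic: { cacheControl: { type: "ephemeral" } } },
});
const plain = toSdkTool({ name: "echo", description: "Echo the input" });
```

Omitting the key for tools without options keeps the converted object identical to what it was before the field existed, which matches the backward-compatibility claim above.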
@@ -1,4 +1,9 @@
-import type { ModelMessage, ProviderOptions, SystemModelMessage } from "@ai-sdk/provider-utils";
+import type {
+  ModelMessage,
+  ProviderOptions,
+  SystemModelMessage,
+  ToolCallOptions,
+} from "@ai-sdk/provider-utils";
 import type { Span } from "@opentelemetry/api";
 import { SpanKind, SpanStatusCode } from "@opentelemetry/api";
 import type { Logger } from "@voltagent/internal";
@@ -510,6 +515,7 @@ export class Agent {
         const {
           userId,
           conversationId,
+          context, // Explicitly exclude to prevent collision with AI SDK's future 'context' field
           parentAgentId,
           parentOperationContext,
           hooks,
@@ -724,6 +730,7 @@ export class Agent {
         const {
           userId,
           conversationId,
+          context, // Explicitly exclude to prevent collision with AI SDK's future 'context' field
           parentAgentId,
           parentOperationContext,
           hooks,
@@ -1191,6 +1198,7 @@ export class Agent {
         const {
           userId,
           conversationId,
+          context, // Explicitly exclude to prevent collision with AI SDK's future 'context' field
           parentAgentId,
           parentOperationContext,
           hooks,
@@ -1414,6 +1422,7 @@ export class Agent {
         const {
           userId,
           conversationId,
+          context, // Explicitly exclude to prevent collision with AI SDK's future 'context' field
           parentAgentId,
           parentOperationContext,
           hooks,
@@ -2691,9 +2700,23 @@ export class Agent {
   private createToolExecutionFactory(
     oc: OperationContext,
     hooks: AgentHooks,
-  ): (tool: BaseTool) => (args: any, options?: ToolExecuteOptions) => Promise<any> {
-    return (tool: BaseTool) => async (args: any, options?: ToolExecuteOptions) => {
+  ): (tool: BaseTool) => (args: any, options?: ToolCallOptions) => Promise<any> {
+    return (tool: BaseTool) => async (args: any, options?: ToolCallOptions) => {
+      // AI SDK passes ToolCallOptions with fields: toolCallId, messages, abortSignal
       const toolCallId = options?.toolCallId ?? randomUUID();
+      const messages = options?.messages ?? [];
+      const abortSignal = options?.abortSignal;
+
+      // Convert ToolCallOptions to ToolExecuteOptions by merging with OperationContext
+      const executionOptions: ToolExecuteOptions = {
+        ...oc,
+        toolContext: {
+          name: tool.name,
+          callId: toolCallId,
+          messages: messages,
+          abortSignal: abortSignal,
+        },
+      };
 
       // Event tracking now handled by OpenTelemetry spans
       const toolSpan = oc.traceContext.createChildSpan(`tool.execution:${tool.name}`, "tool", {
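
The merge performed in the hunk above can be exercised in isolation. The sketch below uses simplified local types. The field names `toolCallId`, `messages`, `abortSignal`, `name`, and `callId` follow the diff. `AbortSignalLike`, `nextCallId`, and `buildExecuteOptions` are hypothetical, and a counter replaces `randomUUID()` so the example has no runtime dependencies.

```typescript
// Simplified local types; only the field names are taken from the diff.
interface AbortSignalLike {
  aborted: boolean;
}
interface SketchCallOptions {
  toolCallId?: string;
  messages?: unknown[];
  abortSignal?: AbortSignalLike;
}
interface SketchOperationContext {
  userId?: string;
  conversationId?: string;
}
interface SketchExecuteOptions extends SketchOperationContext {
  toolContext?: {
    name: string;
    callId: string;
    messages: unknown[];
    abortSignal?: AbortSignalLike;
  };
}

// Stand-in for randomUUID(): deterministic IDs keep the sketch testable.
let fallbackCounter = 0;
const nextCallId = (): string => `call-${++fallbackCounter}`;

// Session fields stay flat; tool-specific metadata is nested under toolContext.
function buildExecuteOptions(
  toolName: string,
  oc: SketchOperationContext,
  options?: SketchCallOptions,
): SketchExecuteOptions {
  return {
    ...oc,
    toolContext: {
      name: toolName,
      callId: options?.toolCallId ?? nextCallId(),
      messages: options?.messages ?? [],
      abortSignal: options?.abortSignal,
    },
  };
}

const merged = buildExecuteOptions("get_weather", { userId: "user-1" }, { toolCallId: "abc" });
const defaulted = buildExecuteOptions("get_weather", { userId: "user-1" });
```

When the caller supplies no options, as an external caller might, the defaults still produce a complete `toolContext` with a generated call ID and an empty message list.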
@@ -2715,13 +2738,19 @@ export class Agent {
       return await oc.traceContext.withSpan(toolSpan, async () => {
         try {
           // Call tool start hook - can throw ToolDeniedError
-          await hooks.onToolStart?.({ agent: this, tool, context: oc, args, options });
+          await hooks.onToolStart?.({
+            agent: this,
+            tool,
+            context: oc,
+            args,
+            options: executionOptions,
+          });
 
-          // Execute tool with OperationContext directly
+          // Execute tool with merged options
           if (!tool.execute) {
             throw new Error(`Tool ${tool.name} does not have "execute" method`);
           }
-          const result = await tool.execute(args, oc, options);
+          const result = await tool.execute(args, executionOptions);
           const validatedResult = await this.validateToolOutput(result, tool);
 
           // End OTEL span
@@ -2736,7 +2765,7 @@ export class Agent {
             output: validatedResult,
             error: undefined,
             context: oc,
-            options,
+            options: executionOptions,
           });
 
           return result;
@@ -2755,7 +2784,7 @@ export class Agent {
             output: undefined,
             error: errorResult as any,
             context: oc,
-            options,
+            options: executionOptions,
           });
 
           if (isToolDeniedError(e)) {
@@ -3379,16 +3408,20 @@ export class Agent {
       name: toolName,
       description: toolDescription,
       parameters: parametersSchema,
-      execute: async (args, context) => {
+      execute: async (args, options) => {
         // Extract the prompt from args
         const prompt = (args as any).prompt || args;
 
+        // Extract OperationContext from options if available
+        // Since ToolExecuteOptions extends Partial<OperationContext>, we can extract the fields
+        const oc = options as OperationContext | undefined;
+
         // Generate response using this agent
         const result = await this.generateText(prompt, {
           // Pass through the operation context if available
-          parentOperationContext: context,
-          conversationId: context?.conversationId,
-          userId: context?.userId,
+          parentOperationContext: oc,
+          conversationId: options?.conversationId,
+          userId: options?.userId,
         });
 
         // Return the text result
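
The sub-agent tool in the last hunk forwards session fields from the flat options object to the nested `generateText` call. Below is a typed sketch of that hand-off. `SessionOptions`, `GenerateFn`, and `makeSubAgentExecute` are hypothetical names, a fake generator stands in for the agent, and the `typeof` check is a typed alternative to the `(args as any).prompt || args` expression in the diff.

```typescript
interface SessionOptions {
  userId?: string;
  conversationId?: string;
}

type GenerateFn = (prompt: string, options: SessionOptions) => Promise<string>;

// Hypothetical wrapper: accept either a bare prompt string or { prompt },
// then pass the flat session fields through to the generator.
function makeSubAgentExecute(generate: GenerateFn) {
  return async (
    args: { prompt: string } | string,
    options?: SessionOptions,
  ): Promise<string> => {
    const prompt = typeof args === "string" ? args : args.prompt;
    return generate(prompt, {
      conversationId: options?.conversationId,
      userId: options?.userId,
    });
  };
}

// Fake generator that records what it was called with.
const calls: Array<{ prompt: string; options: SessionOptions }> = [];
const execute = makeSubAgentExecute(async (prompt, options) => {
  calls.push({ prompt, options });
  return `echo: ${prompt}`;
});

const pending = execute({ prompt: "Summarize the report" }, { userId: "user-1" });
```

Because the session fields are read from the top level of the options object, this hand-off does not depend on what is nested under `toolContext`.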