@voltagent/[email protected]

voltagent-bot released this 04 Nov 06:00
· 32 commits to main since this release
7df9b59

Minor Changes

  • #761 0d13b73 Thanks @omeraplak! - feat: add onHandoffComplete hook for early termination in supervisor/subagent workflows

    The Problem

    When using the supervisor/subagent pattern, subagents always return to the supervisor for processing, even when they generate final outputs (like JSON structures or reports) that need no additional handling. This causes unnecessary token consumption.

    Current flow:

    Supervisor → SubAgent (generates 2K token JSON) → Supervisor (processes JSON) → User
                                                        ↑ Wastes ~2K tokens
    

    Example impact:

    • Current: ~2,650 tokens per request
    • With bail: ~560 tokens per request
    • Savings: 79% (~2,000 tokens / ~$0.020 per request)

    The Solution

    Added onHandoffComplete hook that allows supervisors to intercept subagent results and optionally bail (skip supervisor processing) when the subagent produces final output.

    New flow:

    Supervisor → SubAgent → bail() → User ✅
    

    API

    The hook receives a bail() function that can be called to terminate early:

    import { Agent } from "@voltagent/core";

    // exerciseAgent and workoutBuilder are subagents defined elsewhere
    const supervisor = new Agent({
      name: "Workout Supervisor",
      subAgents: [exerciseAgent, workoutBuilder],
      hooks: {
        onHandoffComplete: async ({ agent, result, bail, context }) => {
          // Workout Builder produces final JSON - no processing needed
          if (agent.name === "Workout Builder") {
            context.logger?.info("Final output received, bailing");
            bail(); // Skip supervisor, return directly to user
            return;
          }
    
          // Large result - bail to save tokens
          if (result.length > 2000) {
            context.logger?.warn("Large result, bailing to save tokens");
            bail();
            return;
          }
    
          // Transform and bail
          if (agent.name === "Report Generator") {
            const transformed = `# Final Report\n\n${result}\n\n---\nGenerated at: ${new Date().toISOString()}`;
            bail(transformed); // Bail with transformed result
            return;
          }
    
          // Default: continue to supervisor for processing
        },
      },
    });

    Hook Arguments

    interface OnHandoffCompleteHookArgs {
      agent: Agent; // Target agent (subagent)
      sourceAgent: Agent; // Source agent (supervisor)
      result: string; // Subagent's output
      messages: UIMessage[]; // Full conversation messages
      usage?: UsageInfo; // Token usage info
      context: OperationContext; // Operation context
      bail: (transformedResult?: string) => void; // Call to bail
    }

    Features

    • Clean API: No return value needed, just call bail()
    • True early termination: Supervisor execution stops immediately, no LLM calls wasted
    • Conditional bail: Decide based on agent, result content, size, etc.
    • Optional transformation: bail(newResult) to transform before bailing
    • Observability: Automatic logging and OpenTelemetry events with visual indicators
    • Backward compatible: Existing code works without changes
    • Error handling: Hook errors logged, flow continues normally

    How Bail Works (Implementation Details)

    When bail() is called in the onHandoffComplete hook:

    1. Hook Level (packages/core/src/agent/subagent/index.ts):

    • Sets bailed: true flag in handoff return value
    • Adds OpenTelemetry span attributes to both supervisor and subagent spans
    • Logs the bail event with metadata

    2. Tool Level (delegate_task tool):

    • Includes bailed: true in tool result structure
    • Adds note: "One or more subagents produced final output. No further processing needed."

    3. Step Handler Level (createStepHandler in agent.ts):

    • Detects bail during step execution when tool results arrive
    • Creates BailError and aborts execution via abortController.abort(bailError)
    • Stores bailed result in systemContext for retrieval
    • Works for both generateText and streamText

    4. Catch Block Level (method-specific handling):

    • generateText: Catches BailError, retrieves bailed result from systemContext, applies guardrails, calls hooks, returns as successful generation
    • streamText: onError catches BailError gracefully (not logged as error), onFinish retrieves and uses bailed result

    This unified abort-based implementation ensures true early termination for all generation methods.
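
    As a rough sketch of that flow (the names and shapes below are illustrative, not the exact internal code):

    // Illustrative sketch only - simplified from the description above.
    class BailError extends Error {
      constructor(public readonly bailedResult: string) {
        super("Subagent bailed; supervisor processing skipped");
      }
    }

    // Step handler: inspect tool results as they arrive and abort on bail.
    function handleStepFinish(
      step: { toolResults?: Array<{ output?: { bailed?: boolean; result?: string } }> },
      ctx: { abortController: AbortController; systemContext: Map<string, unknown> }
    ) {
      for (const toolResult of step.toolResults ?? []) {
        if (toolResult.output?.bailed) {
          const bailError = new BailError(toolResult.output.result ?? "");
          ctx.systemContext.set("bailedResult", bailError.bailedResult);
          ctx.abortController.abort(bailError); // stops further LLM calls immediately
          return;
        }
      }
    }

    // The catch block (generateText) and onError/onFinish (streamText) then treat
    // BailError as a successful early termination and return the stored result.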

    Stream Support (NEW)

    For streamText supervisors:

    When a subagent bails during streaming, the supervisor stream is immediately aborted using a BailError:

    1. Detection during streaming (createStepHandler):
      • Tool results are checked in onStepFinish handler
      • If bailed: true found, BailError is created and stream is aborted via abortController.abort(bailError)
      • Bailed result stored in systemContext for retrieval in onFinish
    2. Graceful error handling (streamText onError):
      • BailError is detected and handled gracefully (not logged as error)
      • Error hooks are NOT called for bail
      • Stream abort is treated as successful early termination
    3. Final result (streamText onFinish):
      • Bailed result retrieved from systemContext
      • Output guardrails applied to bailed result
      • onEnd hook called with bailed result

    Benefits for streaming:

    • ✅ Stream stops immediately when bail detected (no wasted supervisor chunks)
    • ✅ No unnecessary LLM calls after bail
    • ✅ Works with fullStreamEventForwarding - subagent chunks already forwarded
    • ✅ Clean abort semantic with BailError class
    • ✅ Graceful handling - not treated as error

    Supported methods:

    • generateText - Aborts execution during step handler, catches BailError and returns bailed result
    • streamText - Aborts stream during step handler, handles BailError in onError and onFinish
    • generateObject - No tool support, bail not applicable
    • streamObject - No tool support, bail not applicable

    Key difference from initial implementation:

    • OLD: Post-execution check in generateText (after AI SDK completes) - redundant
    • NEW: Unified abort mechanism in createStepHandler - works for both methods, stops execution immediately
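
    From the caller's perspective, a bailed stream simply ends after the subagent's forwarded chunks; nothing special needs to be handled. A minimal usage sketch, assuming the supervisor from the API section above and a streamText result exposing a textStream async iterable:

    const stream = await supervisor.streamText("Build me a 3-day workout plan");

    for await (const chunk of stream.textStream) {
      process.stdout.write(chunk);
    }
    // If the Workout Builder subagent bails, the stream ends right after its
    // forwarded chunks - no supervisor tokens follow and no error is thrown.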

    Use Cases

    Perfect for scenarios where specialized subagents generate final outputs:

    1. JSON/Structured data generators: Workout builders, report generators
    2. Large content producers: Document creators, data exports
    3. Token optimization: Skip processing for expensive results
    4. Business logic: Conditional routing based on result characteristics

    Observability

    When bail occurs, both logging and OpenTelemetry tracking provide full visibility:

    Logging:

    • Log event: Supervisor bailed after handoff
    • Includes: supervisor name, subagent name, result length, transformation status

    OpenTelemetry:

    • Span event: supervisor.handoff.bailed (for timeline events)
    • Span attributes added to both supervisor and subagent spans:
      • bailed: true
      • bail.supervisor: supervisor agent name (on subagent span)
      • bail.subagent: subagent name (on supervisor span)
      • bail.transformed: true if result was transformed

    Console Visualization:
    Bailed subagents are visually distinct in the observability react-flow view:

    • Purple border with shadow (border-purple-500 shadow-purple-600/50)
    • "⚡ BAILED" badge in the header (shows "⚡ BAILED (T)" if transformed)
    • Tooltip showing which supervisor initiated the bail
    • Node opacity remains at 1.0 (fully visible)
    • Status badge shows "BAILED" with purple styling instead of error
    • Details panel shows "Early Termination" info section with supervisor info

    Type Safety Improvements

    Also improved type safety by replacing usage?: any with a proper UsageInfo type:

    export type UsageInfo = {
      promptTokens: number;
      completionTokens: number;
      totalTokens: number;
      cachedInputTokens?: number;
      reasoningTokens?: number;
    };

    This provides:

    • ✅ Better autocomplete in IDEs
    • ✅ Compile-time type checking
    • ✅ Clear documentation of available fields
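
    For example, the typed usage field can drive the bail decision inside the supervisor's onHandoffComplete hook (an illustrative sketch; the 2,000-token threshold is arbitrary):

    onHandoffComplete: async ({ agent, result, usage, bail }) => {
      // usage is UsageInfo | undefined, so totalTokens autocompletes and is type-checked
      if (usage && usage.totalTokens > 2000) {
        bail(); // skip supervisor processing for expensive results
      }
    },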

    Breaking Changes

    None - this is a purely additive feature. The UsageInfo type structure is fully compatible with existing code.

Patch Changes

  • #754 c80d18f Thanks @omeraplak! - feat: encapsulate tool-specific metadata in toolContext + prevent AI SDK context collision

    Changes

    1. Tool Context Encapsulation

    Tool-specific metadata is now organized under an optional toolContext field for better separation and future-proofing.

    Migration:

    // Before
    execute: async ({ location }, options) => {
      // Fields were flat (planned, not released)
    };
    
    // After
    execute: async ({ location }, options) => {
      const { name, callId, messages, abortSignal } = options?.toolContext || {};
    
      // Session context remains flat
      const userId = options?.userId;
      const logger = options?.logger;
      const context = options?.context;
    };
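
    Put together, a full tool definition might look like this (a sketch only; the weather endpoint and its response shape are hypothetical):

    import { createTool } from "@voltagent/core";
    import { z } from "zod";

    const weatherTool = createTool({
      name: "get_weather",
      description: "Get the current weather for a location",
      parameters: z.object({
        location: z.string().describe("City name"),
      }),
      execute: async ({ location }, options) => {
        // Tool-specific metadata now lives under toolContext
        const { callId, abortSignal } = options?.toolContext ?? {};

        // Session context stays flat on options
        options?.logger?.info(`get_weather call ${callId} for ${location}`);

        const response = await fetch(
          `https://api.example.com/weather?q=${encodeURIComponent(location)}`,
          { signal: abortSignal }
        );
        return await response.json();
      },
    });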

    2. AI SDK Context Field Protection

    Explicitly exclude context from being spread into AI SDK calls to prevent future naming collisions if the AI SDK renames experimental_context to context.

    Benefits

    • ✅ Better organization - tool metadata in one place
    • ✅ Clearer separation - session context vs tool context
    • ✅ Future-proof - easy to add new tool metadata fields
    • ✅ Namespace safety - no collision with OperationContext or AI SDK fields
    • ✅ Backward compatible - toolContext is optional for external callers (MCP servers)
    • ✅ Protected from AI SDK breaking changes
  • #754 c80d18f Thanks @omeraplak! - feat: add multi-modal tool results support with toModelOutput - #722

    Tools can now return images, media, and rich content to AI models using the toModelOutput function.

    The Problem

    AI agents couldn't receive visual information from tools - everything had to be text or JSON. This limited use cases like:

    • Computer use agents that need to see screenshots
    • Image analysis workflows
    • Visual debugging tools
    • Any tool that produces media output

    The Solution

    Added toModelOutput?: (output) => ToolResultOutput to tool options. This function transforms your tool's output into a format the AI model can understand, including images and media.

    import { createTool } from "@voltagent/core";
    import fs from "fs";
    import { z } from "zod";
    
    const screenshotTool = createTool({
      name: "take_screenshot",
      description: "Takes a screenshot of the screen",
      parameters: z.object({
        region: z.string().optional().describe("Region to capture"),
      }),
      execute: async ({ region }) => {
        const imageData = fs.readFileSync("./screenshot.png").toString("base64");
        return {
          type: "image",
          data: imageData,
          timestamp: new Date().toISOString(),
        };
      },
      toModelOutput: (result) => ({
        type: "content",
        value: [
          { type: "text", text: `Screenshot captured at ${result.timestamp}` },
          { type: "media", data: result.data, mediaType: "image/png" },
        ],
      }),
    });

    Return Formats

    The toModelOutput function can return multiple formats:

    Text output:

    toModelOutput: (result) => ({
      type: "text",
      value: result.summary,
    });

    JSON output:

    toModelOutput: (result) => ({
      type: "json",
      value: { status: "success", data: result },
    });

    Multi-modal content (text + media):

    toModelOutput: (result) => ({
      type: "content",
      value: [
        { type: "text", text: "Analysis complete" },
        { type: "media", data: result.imageBase64, mediaType: "image/png" },
      ],
    });

    Error handling:

    toModelOutput: (result) => ({
      type: "error-text",
      value: result.errorMessage,
    });

    Impact

    • Visual AI Workflows: Build computer use agents that can see and interact with UIs
    • Image Generation: Tools can return generated images directly to the model
    • Debugging: Return screenshots and visual debugging information
    • Rich Responses: Combine text explanations with visual evidence

    Usage with Anthropic

    const agent = new Agent({
      name: "visual-assistant",
      tools: [screenshotTool],
      model: anthropic("claude-3-5-sonnet-20241022"),
    });
    
    const result = await agent.generateText("Take a screenshot and describe what you see");
    // Agent receives both text and image, can analyze the screenshot

    See AI SDK documentation for more details on multi-modal tool results.

  • #754 c80d18f Thanks @omeraplak! - feat: add providerOptions support to tools for provider-specific features - #759

    Tools can now accept providerOptions to enable provider-specific features like Anthropic's cache control. This aligns VoltAgent tools with the AI SDK's tool API.

    The Problem

    Users wanted to use provider-specific features like Anthropic's prompt caching to reduce costs and latency, but VoltAgent's createTool() didn't support the providerOptions field that AI SDK tools have.

    The Solution

    What Changed:

    • Added providerOptions?: ProviderOptions field to ToolOptions type
    • VoltAgent tools now accept and pass through provider options to the AI SDK
    • Supports all provider-specific features: cache control, reasoning settings, etc.

    What Gets Enabled:

    import { createTool } from "@voltagent/core";
    import { z } from "zod";
    
    const cityAttractionsTool = createTool({
      name: "get_city_attractions",
      description: "Get tourist attractions for a city",
      parameters: z.object({
        city: z.string().describe("The city name"),
      }),
      providerOptions: {
        anthropic: {
          cacheControl: { type: "ephemeral" },
        },
      },
      execute: async ({ city }) => {
        // fetchAttractions: your own data source, defined elsewhere
        return await fetchAttractions(city);
      },
    });

    Impact

    • Cost Optimization: Anthropic cache control reduces API costs for repeated tool calls
    • Future-Proof: Any new provider features work automatically
    • Type-Safe: Uses official AI SDK ProviderOptions type
    • Zero Breaking Changes: Optional field, fully backward compatible

    Usage

    Use with any provider that supports provider-specific options:

    const agent = new Agent({
      name: "Travel Assistant",
      model: anthropic("claude-3-5-sonnet"),
      tools: [cityAttractionsTool], // Tool with cacheControl enabled
    });
    
    await agent.generateText("What are the top attractions in Paris?");
    // Tool definition cached by Anthropic for improved performance

    Learn more: Anthropic Cache Control