Releases: deepset-ai/haystack
v2.19.0
⭐️ Highlights
🛡️ Try Multiple LLMs with FallbackChatGenerator
Introduced FallbackChatGenerator, a resilient chat generator that runs multiple LLMs sequentially and automatically falls back when one fails. It tries each generator in order until one succeeds, handling errors like timeouts, rate limits, or server issues. Ideal for building robust, production-grade chat systems that stay responsive across providers.
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator
from haystack_integrations.components.generators.anthropic import AnthropicChatGenerator
from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack.components.generators.chat.fallback import FallbackChatGenerator
anthropic_generator = AnthropicChatGenerator(model="claude-sonnet-4-5", timeout=1) # force failure with low timeout
google_generator = GoogleGenAIChatGenerator(model="gemini-2.5-flashy") # force failure with typo in model name
openai_generator = OpenAIChatGenerator(model="gpt-4o-mini") # success
chat_generator = FallbackChatGenerator(chat_generators=[anthropic_generator, google_generator, openai_generator])
response = chat_generator.run(messages=[ChatMessage.from_user("What is the plot twist in Shawshank Redemption?")])
print("Successful ChatGenerator: ", response["meta"]["successful_chat_generator_class"])
print("Response: ", response["replies"][0].text)Output:
WARNING:haystack.components.generators.chat.fallback:ChatGenerator AnthropicChatGenerator failed with error: Request timed out or interrupted...
WARNING:haystack.components.generators.chat.fallback:ChatGenerator GoogleGenAIChatGenerator failed with error: Error in Google Gen AI chat generation: 404 NOT_FOUND...
Successful ChatGenerator: OpenAIChatGenerator
Response: In "The Shawshank Redemption," ....🛠️ Mix Tool and Toolset in Agents
You can now combine both Tool and Toolset objects in the same tools list for Agent and ToolInvoker components. This update brings more flexibility, letting you organize tools into logical groups while still adding standalone tools in one go.
from haystack.components.agents import Agent
from haystack.tools import Tool, Toolset
math_toolset = Toolset([add_tool, multiply_tool])
weather_toolset = Toolset([weather_tool, forecast_tool])
agent = Agent(
    chat_generator=generator,
    tools=[math_toolset, weather_toolset, calendar_tool],  # ✨ Now supported!
)

⚙️ Faster Agents with Tool Warmup
Tool and Toolset objects can now perform initialization during Agent or ToolInvoker warmup. This allows setup tasks such as connecting to databases, loading models, or initializing connection pools before the first use.
from haystack.tools import Toolset
from haystack.components.agents import Agent
# Custom toolset with initialization needs
class DatabaseToolset(Toolset):
    def __init__(self, connection_string):
        self.connection_string = connection_string
        self.pool = None
        super().__init__([query_tool, update_tool])

    def warm_up(self):
        # Initialize the connection pool before first use
        self.pool = create_connection_pool(self.connection_string)

🚀 New Features
- Updated our serialization and deserialization of PipelineSnapshots to work with Python Enum classes.
- Added FallbackChatGenerator, which automatically retries different chat generators and returns the first successful response, with detailed information about which providers were tried.
- Added pipeline_snapshot and pipeline_snapshot_file_path parameters to BreakpointException to provide more context when a pipeline breakpoint is triggered. Added the pipeline_snapshot_file_path parameter to PipelineRuntimeError to include a reference to the stored pipeline snapshot so it can be easily found.
- A new component, RegexTextExtractor, which extracts text from chat messages or string inputs based on a custom regex pattern.
- CSVToDocument: added conversion_mode='row' with an optional content_column; each row becomes a Document, and the remaining columns are stored in meta. The default 'file' mode is preserved. See the sketch after this list.
- Added the ability to resume an Agent from an AgentSnapshot while specifying a new breakpoint in the same run call. This allows stepwise debugging and precise control over chat generator inputs and tool inputs before execution, improving flexibility when inspecting intermediate states. This addresses a previous limitation where passing both a snapshot and a breakpoint simultaneously would throw an exception.
- Introduced SentenceTransformersSparseTextEmbedder and SentenceTransformersSparseDocumentEmbedder components. These components embed text and documents using sparse embedding models compatible with Sentence Transformers. Sparse embeddings are interpretable, efficient when used with inverted indexes, combine classic information retrieval with neural models, and are complementary to dense embeddings. Currently, the produced SparseEmbedding objects are compatible with the QdrantDocumentStore. Usage example:

from haystack.components.embedders import SentenceTransformersSparseTextEmbedder

text_embedder = SentenceTransformersSparseTextEmbedder()
text_embedder.warm_up()
print(text_embedder.run("I love pizza!"))
# {'sparse_embedding': SparseEmbedding(indices=[999, 1045, ...], values=[0.918, 0.867, ...])}
- Added a warm_up() function to the Tool dataclass, allowing tools to perform resource-intensive initialization before execution. Tools and Toolsets can now override the warm_up() method to establish connections to remote services, load models, or perform other preparatory operations. The ToolInvoker and Agent automatically call warm_up() on their tools during their own warm-up phase, ensuring tools are ready before use.
- Fixed a serialization issue related to function objects in a pipeline; they are now converted to None, since functions cannot be serialized. This issue was preventing breakpoints in agents from being set successfully and used as resume points. If an error occurs during an Agent execution, for instance during tool calling, a snapshot of the last successful step is raised, allowing the caller to catch it, inspect the possible reason for the crash, and use it to resume the pipeline execution from that point onwards.
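A minimal sketch of the new CSVToDocument row mode described above; the file name and column name are hypothetical example data.

from haystack.components.converters import CSVToDocument

converter = CSVToDocument(conversion_mode="row", content_column="title")  # "title" is a hypothetical column
result = converter.run(sources=["books.csv"])  # "books.csv" is a hypothetical file
# One Document per row: content comes from content_column, the remaining columns land in meta
for doc in result["documents"]:
    print(doc.content, doc.meta)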
⚡️ Enhancement Notes
- Added tools to agent run parameters to enhance the agent's flexibility. Users can now choose a subset of tools for the agent at runtime by providing a list of tool names, or supply an entirely new set by passing Tool objects or a Toolset. See the sketch after this list.
- Enhanced the tools parameter across all tool-accepting components (Agent, ToolInvoker, OpenAIChatGenerator, AzureOpenAIChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator) to accept either a mixed list of Tool and Toolset objects or just a Toolset object. Previously, components required either a list of Tool objects OR a single Toolset, but not both in the same list. Now users can organize tools into logical Toolsets while also including standalone Tool objects, providing greater flexibility in tool organization. For example: Agent(chat_generator=generator, tools=[math_toolset, weather_toolset, standalone_tool]). This change is fully backward compatible and preserves structure during serialization/deserialization, enabling proper round-trip support for mixed tool configurations.
- Refactored _save_pipeline_snapshot to consolidate try-except logic and added a raise_on_failure option to control whether save failures raise an exception or are logged. _create_pipeline_snapshot now wraps _serialize_value_with_schema in try-except blocks to prevent failures from non-serializable pipeline inputs.
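A minimal sketch of runtime tool selection on an Agent, assuming an OpenAI backend; the add tool is a hypothetical example.

from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool

def add(a: int, b: int) -> int:
    return a + b

add_tool = Tool(
    name="add",
    description="Add two numbers",
    parameters={"type": "object", "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}}, "required": ["a", "b"]},
    function=add,
)

agent = Agent(chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"), tools=[add_tool])
agent.warm_up()

# Select a subset of the configured tools for this run only, by tool name:
result = agent.run(messages=[ChatMessage.from_user("What is 2 + 2?")], tools=["add"])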
🐛 Bug Fixes
- Fix Agent run_async method to correctly handle async streaming callbacks. This previously triggered errors due to a bug.
- Prevent duplication of the last assistant message in the chat history when initializing from an AgentSnapshot.
- We were setting response_format to None in OpenAIChatGenerator by default, which doesn't follow the API spec. We now omit the variable if response_format is not passed by the user.
- Ensure that the OpenAIChatGenerator is properly serialized when response_format in generation_kwargs is provided as a dictionary (for example, {"type": "json_object"}). Previously, this caused serialization errors.
- Fixed parameter schema generation in ComponentTool when using inputs_from_state. Previously, parameters were only removed from the schema if the state key and parameter name matched exactly. For example, inputs_from_state={"text": "text"} removed text as expected, but inputs_from_state={"state_text": "text"} did not. This is now resolved, and such cases work as intended.
- Refactored SentenceTransformersEmbeddingBackend to ensure unique embedding IDs by incorporating all relevant arguments.
- Fixed Agent to correctly raise a BreakpointException when a ToolBreakpoint with a specific tool_name is provided in an assistant chat message containing multiple tool calls.
- The OpenAIChatGenerator implementation uses ChatCompletionMessageCustomToolCall, which is only available in OpenAI client >= 1.99.2. We now require openai >= 1.99.2.
💙 Big thank you to everyone who contributed to this release!
@anakin87, @bilgeyucel, @davidsbatista, @dfokina, @...
v2.19.0-rc1
v2.18.1
⚡️ Enhancement Notes
- Added tools to agent run parameters to enhance the agent's flexibility. Users can now choose a subset of tools for the agent at runtime by providing a list of tool names, or supply an entirely new set by passing Tool objects or a Toolset.
🐛 Bug Fixes
- Fix Agent run_async method to correctly handle async streaming callbacks. This previously triggered errors due to a bug.
- Prevent duplication of the last assistant message in the chat history when initializing from an AgentSnapshot.
- We were setting response_format to None in OpenAIChatGenerator by default, which doesn't follow the API spec. We now omit the variable if response_format is not passed by the user.
v2.18.0
⭐️ Highlights
🔁 Pipeline Error Recovery with Snapshots
Pipelines now capture a snapshot of the last successful step when a run fails, including intermediate outputs. This lets you diagnose issues (e.g., failed tool calls), fix them, and resume from the checkpoint instead of restarting the entire run. Currently supported for the synchronous Pipeline and Agent (not yet in AsyncPipeline).
The snapshot is part of the exception raised with the PipelineRuntimeError when the pipeline run fails. You need to wrap your pipeline.run() in a try-except block.
try:
    pipeline.run(data=input_data)
except PipelineRuntimeError as exc:
    snapshot = exc.pipeline_snapshot
    intermediate_outputs = snapshot.pipeline_state.pipeline_outputs
    # The snapshot can be used to resume execution by passing it to run() via the snapshot argument
    pipeline.run(data={}, snapshot=snapshot)

🧠 Structured Outputs for OpenAI/Azure OpenAI
OpenAIChatGenerator and AzureOpenAIChatGenerator support structured outputs via response_format (Pydantic model or JSON schema).
from pydantic import BaseModel
from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
class CalendarEvent(BaseModel):
    event_name: str
    event_date: str
    event_location: str

generator = OpenAIChatGenerator(generation_kwargs={"response_format": CalendarEvent})
message = "The Open NLP Meetup is going to be in Berlin at deepset HQ on September 19, 2025"
result = generator.run([ChatMessage.from_user(message)])
print(result["replies"][0].text)
# {"event_name":"Open NLP Meetup","event_date":"September 19","event_location":"deepset HQ, Berlin"}

🛠️ Convert Pipelines into Tools with PipelineTool
The new PipelineTool lets you expose entire Haystack Pipelines as LLM-compatible tools. It simplifies the previous SuperComponent + ComponentTool pattern into a single abstraction and directly exposes input_mapping and output_mapping for fine-grained control.
from haystack import Pipeline
from haystack.tools import PipelineTool
retrieval_pipeline = Pipeline()
retrieval_pipeline.add_component...
..
retrieval_tool = PipelineTool(
    pipeline=retrieval_pipeline,
    input_mapping={"query": ["bm25_retriever.query"]},
    output_mapping={"ranker.documents": "documents"},
    name="retrieval_tool",
    description="Use to retrieve documents",
)

🗺️ Runtime System Prompt for Agents
Agent’s system_prompt can now be updated dynamically at runtime for more flexible behavior.
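For example, a minimal sketch assuming an OpenAI backend; the prompt and question are placeholders.

from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

agent = Agent(chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"))
agent.warm_up()

# Override the system prompt for this run only:
result = agent.run(
    messages=[ChatMessage.from_user("Summarize our release process.")],
    system_prompt="You are a terse assistant. Answer in one sentence.",
)
print(result["messages"][-1].text)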
🚀 New Features
- OpenAIChatGenerator and AzureOpenAIChatGenerator now support structured outputs using the response_format parameter, which can be passed in generation_kwargs. The response_format parameter can be a Pydantic model or a JSON schema for non-streaming responses. For streaming responses, response_format must be a JSON schema. Example usage:

from pydantic import BaseModel
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

class NobelPrizeInfo(BaseModel):
    recipient_name: str
    award_year: int
    category: str
    achievement_description: str
    nationality: str

client = OpenAIChatGenerator(
    model="gpt-4o-2024-08-06",
    generation_kwargs={"response_format": NobelPrizeInfo}
)
response = client.run(messages=[
    ChatMessage.from_user("In 2021, American scientist David Julius received the Nobel Prize in"
                          " Physiology or Medicine for his groundbreaking discoveries on how the human body"
                          " senses temperature and touch.")
])
print(response["replies"][0].text)
>>> {"recipient_name":"David Julius","award_year":2021,"category":"Physiology or Medicine","achievement_description":"David Julius was awarded for his transformative findings regarding the molecular mechanisms underlying the human body's sense of temperature and touch. Through innovative experiments, he identified specific receptors responsible for detecting heat and mechanical stimuli, ranging from gentle touch to pain-inducing pressure.","nationality":"American"}
- Added PipelineTool, a new tool wrapper that allows Haystack Pipelines to be exposed as LLM-compatible tools.
  - Previously, this was achievable by first wrapping a pipeline in a SuperComponent and then passing it to ComponentTool. PipelineTool streamlines that pattern into a dedicated abstraction. It uses the same approach under the hood but directly exposes input_mapping and output_mapping so users can easily control which pipeline inputs and outputs are made available.
  - Automatically generates input schemas for LLM tool calling from pipeline inputs.
  - Extracts descriptions from underlying component docstrings for better tool documentation.
  - Can be passed directly to an Agent, enabling seamless integration of full pipelines as tools in multi-step reasoning workflows.
- Added a reasoning field to StreamingChunk that optionally takes a ReasoningContent dataclass. This allows a structured way to pass reasoning contents to streaming chunks.
- If an error occurs during the execution of a pipeline, the pipeline will raise a PipelineRuntimeError exception containing an error message and the component outputs up to the point of failure. This allows you to inspect and debug the pipeline up to the point of failure.
- LinkContentFetcher: added request_headers to allow custom per-request HTTP headers. Header precedence: httpx client defaults → component defaults → request_headers → rotating User-Agent. Also made HTTP/2 handling import-safe: if h2 isn't installed, fall back to HTTP/1.1 with a warning. Thanks @xoaryaa. (Fixes #9064)
- A snapshot of the last successful step is also raised when an error occurs during a Pipeline run, allowing the caller to catch it, inspect the possible reason for the crash, and use it to resume the pipeline execution from that point onwards.
- Added the exclude_subdomains parameter to the SerperDevWebSearch component. When set to True, this parameter restricts search results to only the exact domains specified in allowed_domains, excluding any subdomains. For example, with allowed_domains=["example.com"] and exclude_subdomains=True, results from "blog.example.com" or "shop.example.com" will be filtered out, returning only results from "example.com". The parameter defaults to False to maintain backward compatibility with existing behavior. See the sketch after this list.
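A minimal sketch of exclude_subdomains, using the parameters from the note above; it assumes the SERPERDEV_API_KEY environment variable is set, and the query is a placeholder.

from haystack.components.websearch import SerperDevWebSearch

search = SerperDevWebSearch(
    allowed_domains=["example.com"],
    exclude_subdomains=True,  # drops results from blog.example.com, shop.example.com, ...
)
result = search.run(query="product announcements")
print(result["documents"])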
⚡️ Enhancement Notes
- Added system_prompt to agent run parameters to enhance customization and control over agent behavior.
- The internal Agent logic was refactored to improve readability and maintainability. This should help developers understand and extend the internal Agent logic moving forward.
🐛 Bug Fixes
- Reintroduce the verbose error message when deserializing a ChatMessage with invalid content parts. While LLMs may still generate messages in the wrong format, this error provides guidance on the expected structure, making retries easier and more reliable during agent runs. The error message was unintentionally removed during a previous refactoring.
- The English and German abbreviation files used by the SentenceSplitter are now included in the distribution. They were previously missing due to a config in the .gitignore file.
- Preserve an explicit lambda_threshold=0.0 in SentenceTransformersDiversityRanker instead of overriding it with 0.5 due to short-circuit evaluation.
- Fix MetaFieldGroupingRanker to still work when subgroup_by values are unhashable types like list. We handle this by stringifying the contents of doc.meta[subgroup_by] in the same way we do for values of doc.meta[group_by].
- Fixed missing trace parentage for tools executed via the synchronous ToolInvoker path. Updated ToolInvoker.run() to propagate contextvars into ThreadPoolExecutor workers, ensuring all tool spans (ComponentTool, Agent wrapped in ComponentTool, or custom tools) are correctly linked to the outer Agent's trace instead of starting new root traces. This improves end-to-end observability across the entire tool execution chain.
- Fixed the from_dict method of MetadataRouter so the output_type parameter introduced in Haystack 2.17 is now optional when loading from YAML. This ensures compatibility with older Haystack pipelines.
- In OpenAIChatGenerator, improved the logic to exclude unsupported custom tool calls. The previous implementation caused compatibility issues with the Mistral Haystack core integration, which extends OpenAIChatGenerator.
- Fixed parameter schema generation in ComponentTool when using inputs_from_state. Previously, parameters were only removed from the schema if the state key and parameter name matched exactly. For example, inputs_from_state={"text": "text"} removed text as expected, but inputs_from_state={"state_text": "text"} did not. This is now resolved, and such cases work as intended.
💙 Big thank you to everyone who contributed to this release!
@Amnah199, @Ujjwal-Bajpayee, @abdokaseb, @anakin87, @davidsbatista, @dfokina, @rigved-telang, @sjrl, @tstadel, @vblagoje, @xoaryaa
v2.18.0-rc1
v2.17.1
Bug Fixes
- Fixed the from_dict method of MetadataRouter so the output_type parameter introduced in Haystack 2.17 is now optional when loading from YAML. This ensures compatibility with older Haystack pipelines.
- In OpenAIChatGenerator, improved the logic to exclude unsupported custom tool calls. The previous implementation caused compatibility issues with the Mistral Haystack core integration, which extends OpenAIChatGenerator.
v2.17.0
⭐️ Highlights
🖼️ Image support for several model providers
Following the introduction of image support in Haystack 2.16.0, we've expanded this to more model providers in Haystack and Haystack Core integrations.
Now supported: Amazon Bedrock, Anthropic, Azure, Google, Hugging Face API, Meta Llama API, Mistral, Nvidia, Ollama, OpenAI, OpenRouter, STACKIT.
🧩 Extended components
We've improved several components to make them more flexible:
- MetadataRouter, which is used to route Documents based on metadata, has been extended to also support routing ByteStream objects.
- The SentenceWindowRetriever, which retrieves neighboring sentences around relevant Documents to provide full context, is now more flexible. Previously, its source_id_meta_field parameter accepted only a single field containing the ID of the original document. It now also accepts a list of fields, so that only documents matching all of the specified meta fields will be retrieved. See the sketch after this list.
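A minimal sketch of the list form of source_id_meta_field; the meta field names here are hypothetical.

from haystack.components.retrievers import SentenceWindowRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

retriever = SentenceWindowRetriever(
    document_store=InMemoryDocumentStore(),
    window_size=2,
    # Neighboring documents are retrieved only if they match all listed meta fields:
    source_id_meta_field=["source_id", "page_number"],
)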
⬆️ Upgrade Notes
- MultiFileConverter outputs a new key failed in the result dictionary, which contains a list of files that failed to convert. The documents output is included only if at least one file is successfully converted. Previously, documents could still be present but empty if a file with a supported MIME type was provided but did not actually exist.
- The finish_reason field behavior in HuggingFaceAPIChatGenerator has been updated. Previously, the new finish_reason mapping (introduced in the Haystack 2.15.0 release) was only applied when streaming was enabled. When streaming was disabled, the old finish_reason was still returned. This change ensures the updated finish_reason values are consistently returned regardless of streaming mode.
  How to know if you're affected: If you rely on finish_reason in responses from HuggingFaceAPIChatGenerator with streaming disabled, you may see different values after this upgrade.
  What to do: Review the updated mapping:
  - length → length
  - eos_token → stop
  - stop_sequence → stop
  - If tool calls are present → tool_calls
🚀 New Features
- Add support for ByteStream objects in MetadataRouter. It can now be used to route list[Document] or list[ByteStream] based on metadata.
- Add support for the union type operator | (added in Python 3.10) in serialize_type and Pipeline.connect(). These functions support both the typing.Union and | operators, and mixtures of them, for backwards compatibility.
- Added ReasoningContent as a new content part to the ChatMessage dataclass. This allows storing model reasoning text and additional metadata in assistant messages. Assistant messages can now include reasoning content using the reasoning parameter in ChatMessage.from_assistant(); see the sketch after this list. We will progressively update the implementations for Chat Generators with LLMs that support reasoning to use this new content part.
- Updated SentenceWindowRetriever's source_id_meta_field parameter to also accept a list of strings. If a list of fields is provided, then only documents matching all of the specified fields will be retrieved.
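For illustration, a minimal sketch of attaching reasoning to an assistant message; the ReasoningContent constructor field (reasoning_text) and the reasoning accessor shown here are assumptions based on the description above.

from haystack.dataclasses import ChatMessage, ReasoningContent

message = ChatMessage.from_assistant(
    "The capital of Australia is Canberra.",
    reasoning=ReasoningContent(reasoning_text="Sydney is larger, but Canberra is the capital."),  # field name assumed
)
print(message.reasoning)  # accessor assumed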
⚡️ Enhancement Notes
- Added multimodal support to HuggingFaceAPIChatGenerator to enable vision-language model (VLM) usage with images and text. Users can now send both text and images to VLM models through Hugging Face APIs. The implementation follows the HF VLM API format specification and maintains full backward compatibility with text-only messages.
- Added serialization/deserialization methods for the TextContent and ImageContent parts of ChatMessage.
- Made the lazy import error message clearer, explaining that the optional dependency is missing.
- Adopted modern type hinting syntax using PEP 585 throughout the codebase. This improves readability and removes unnecessary imports from the typing module.
- Support subclasses of ChatMessage in Agent state schema validation. The validation now checks for issubclass(args[0], ChatMessage) instead of requiring exact type equality, allowing custom ChatMessage subclasses to be used in the messages field.
- The ToolInvoker run method now accepts a list of tools. When provided, this list overrides the tools set in the constructor, allowing you to switch tools at runtime in previously built pipelines. See the sketch after this list.
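A minimal sketch of overriding tools at run time on ToolInvoker; both tools are hypothetical examples.

from haystack.components.tools import ToolInvoker
from haystack.dataclasses import ChatMessage, ToolCall
from haystack.tools import Tool

def add(a: int, b: int) -> int:
    return a + b

def multiply(a: int, b: int) -> int:
    return a * b

params = {"type": "object", "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}}, "required": ["a", "b"]}
add_tool = Tool(name="add", description="Add two numbers", parameters=params, function=add)
multiply_tool = Tool(name="multiply", description="Multiply two numbers", parameters=params, function=multiply)

invoker = ToolInvoker(tools=[add_tool])
call = ToolCall(tool_name="multiply", arguments={"a": 2, "b": 3})
# The tools passed to run() override the ones set in the constructor:
result = invoker.run(messages=[ChatMessage.from_assistant(tool_calls=[call])], tools=[multiply_tool])
print(result["tool_messages"][0].tool_call_result.result)  # 6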
🐛 Bug Fixes
- The English and German abbreviation files used by the SentenceSplitter are now included in the distribution. They were previously missing due to a config in the .gitignore file.
- Add the encoding format keyword argument to the OpenAI client when creating embeddings.
- Addressed incorrect assumptions in the ChatMessage class that raised errors in valid usage scenarios:
  - ChatMessage.from_user with content_parts: Previously, at least one text part was required, even though some model providers support messages with only image parts. This restriction has been removed. If a provider has such a limitation, it should now be enforced in the provider's implementation.
  - ChatMessage.to_openai_dict_format: Messages containing multiple text parts weren't supported, despite this being allowed by the OpenAI API. This has now been corrected.
- Improved validation in the ChatMessage.from_user class method. The method now raises an error if neither text nor content_parts are provided. It does not raise an error if text is an empty string.
- Ensure that the score field in SentenceTransformersSimilarityRanker is returned as a Python float instead of numpy.float32. This prevents potential serialization issues in downstream integrations.
- Raise a RuntimeError when AsyncPipeline.run is called from within an async context, indicating that run_async should be used instead.
- Prevented in-place mutation of input Document objects in all Extractor and Classifier components by creating copies with dataclasses.replace before processing.
- Prevented in-place mutation of input Document objects in all DocumentEmbedder components by creating copies with dataclasses.replace before processing.
- FileTypeRouter has a new parameter raise_on_failure with a default value of False. When set to True, FileNotFoundError is always raised for non-existent files. Previously, this exception was raised only when processing a non-existent file and the meta parameter was provided to run().
- Return a more informative error message when attempting to connect two components and the sender component does not have any OutputSockets defined.
- Fix tracing context not being propagated to tools when running via ToolInvoker.run_async.
- Ensure consistent behavior in SentenceTransformersDiversityRanker. Like other rankers, it now returns all documents instead of raising an error when top_k exceeds the number of available documents.
💙 Big thank you to everyone who contributed to this release!
@abdokaseb @Amnah199 @anakin87 @bilgeyucel @ChinmayBansal @datbth @davidsbatista @dfokina @LastRemote
@mpangrazzi @RafaelJohn9 @rolshoven @SaraCalla @SaurabhLingam @sjrl
v2.17.0-rc2
v2.17.0-rc1
v2.16.1
Bug Fixes
- Improved validation in the ChatMessage.from_user class method. The method now raises an error if neither text nor content_parts are provided. It does not raise an error if text is an empty string.