Integration: MLflow
Trace, evaluate, and monitor your Haystack applications with MLflow.
Overview
MLflow is an open-source platform for managing the end-to-end machine learning and AI lifecycle. It provides native tracing support for Haystack through its autolog integration, giving you full visibility into your Haystack pipeline execution.
MLflow Tracing offers:
- Hierarchical trace visualization of every component, LLM call, retriever step, and pipeline execution
- Automatic token usage and cost tracking for each LLM call
- Built-in evaluation framework with LLM judges and custom scorers (a sketch follows the tracing example below)
- Prompt versioning and management across your AI applications (see the sketch after this overview)
- Fully open source with no vendor lock-in: self-host it or use Managed MLflow in the cloud
You can learn more about the integration in MLflow’s Haystack integration guide.
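As a quick taste of the prompt management feature above, here is a minimal sketch, assuming MLflow 3.x's prompt registry API under mlflow.genai; the prompt name and template are illustrative:
import mlflow

# Register a versioned prompt template (illustrative name and template).
mlflow.genai.register_prompt(
    name="rag-question",
    template="Given these documents, answer the question: {{question}}",
    commit_message="initial version",
)

# Later, load a specific version and fill in its variables.
prompt = mlflow.genai.load_prompt("prompts:/rag-question/1")
print(prompt.format(question="Who lives in Paris?"))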
Installation
pip install mlflow haystack-ai
To start the MLflow tracking server:
mlflow server --port 5000
The MLflow UI will be available at http://localhost:5000.
Usage
Enable tracing for Haystack with a single line of code. This automatically captures traces from all Haystack pipelines and components.
Trace a RAG Pipeline
import mlflow
from haystack import Document, Pipeline
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.utils import Secret
# Point the client at the tracking server started above, then enable MLflow tracing for Haystack
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.haystack.autolog()
mlflow.set_experiment("Haystack")
# Write documents to InMemoryDocumentStore
document_store = InMemoryDocumentStore()
document_store.write_documents(
[
Document(content="My name is Jean and I live in Paris."),
Document(content="My name is Mark and I live in Berlin."),
Document(content="My name is Giorgio and I live in Rome."),
]
)
# Build a RAG pipeline
prompt_template = [
ChatMessage.from_system("You are a helpful assistant."),
ChatMessage.from_user(
"Given these documents, answer the question.\n"
"Documents:\n{% for doc in documents %}{{ doc.content }}{% endfor %}\n"
"Question: {{question}}\n"
"Answer:"
),
]
prompt_builder = ChatPromptBuilder(
    template=prompt_template, required_variables=["question", "documents"]
)
retriever = InMemoryBM25Retriever(document_store=document_store)
llm = OpenAIChatGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY"))
rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", retriever)
rag_pipeline.add_component("prompt_builder", prompt_builder)
rag_pipeline.add_component("llm", llm)
rag_pipeline.connect("retriever", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "llm.messages")
# Ask a question
question = "Who lives in Paris?"
results = rag_pipeline.run(
{
"retriever": {"query": question},
"prompt_builder": {"question": question},
}
)
print(results["llm"]["replies"])
Open the MLflow UI at http://localhost:5000 and navigate to the Traces tab to see detailed traces of your pipeline execution, including component spans, LLM calls, and token usage.
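Traces can also be fetched programmatically. A minimal sketch, assuming the server and experiment from the example above; mlflow.search_traces returns a pandas DataFrame with one row per trace:
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("Haystack")

# Pull the five most recent traces from the active experiment.
traces = mlflow.search_traces(max_results=5)
print(traces.shape)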
Disable Tracing
Auto-tracing for Haystack can be disabled by calling:
mlflow.haystack.autolog(disable=True)
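Evaluate a Pipeline
The Overview mentions MLflow's built-in evaluation framework. Below is a minimal sketch that scores the RAG pipeline from the Usage example with a custom scorer, assuming MLflow 3.x's mlflow.genai API; the mentions_jean check and the one-row dataset are illustrative only.
import mlflow
from mlflow.genai.scorers import scorer

# Hypothetical custom scorer: passes when the answer names the right person.
@scorer
def mentions_jean(outputs) -> bool:
    return "Jean" in str(outputs)

# Wrap the RAG pipeline from the Usage example so MLflow can call it per row.
def predict_fn(question: str) -> str:
    results = rag_pipeline.run(
        {"retriever": {"query": question}, "prompt_builder": {"question": question}}
    )
    return results["llm"]["replies"][0].text

mlflow.genai.evaluate(
    data=[{"inputs": {"question": "Who lives in Paris?"}}],
    predict_fn=predict_fn,
    scorers=[mentions_jean],
)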
Use MLflow AI Gateway as an LLM Backend
MLflow AI Gateway (MLflow ≥ 3.0) is a database-backed LLM proxy that routes requests to multiple providers (OpenAI, Anthropic, Gemini, Mistral, Bedrock, Ollama, and more) through a single OpenAI-compatible endpoint. Provider API keys are stored encrypted on the server, and features like fallback/retry, traffic splitting, and budget tracking are configured in the MLflow UI with no code changes needed.
Since the gateway exposes an OpenAI-compatible API, you can use Haystack’s built-in OpenAIChatGenerator with a custom api_base_url:
1. Install MLflow and start the server:
pip install mlflow[genai]
mlflow server --host 127.0.0.1 --port 5000
2. Create a gateway endpoint in the MLflow UI at http://localhost:5000. Navigate to AI Gateway → Create Endpoint, select a provider and model, and enter your provider API key. See the MLflow AI Gateway documentation for details.
3. Use the endpoint in a Haystack pipeline:
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret
pipe = Pipeline()
pipe.add_component("prompt_builder", ChatPromptBuilder())
pipe.add_component(
"llm",
OpenAIChatGenerator(
model="my-chat-endpoint", # your MLflow Gateway endpoint name
api_key=Secret.from_token("unused"), # provider keys live on the server
api_base_url="http://localhost:5000/gateway/openai/v1",
),
)
pipe.connect("prompt_builder", "llm")
messages = [ChatMessage.from_user("What is MLflow AI Gateway?")]
result = pipe.run({"prompt_builder": {"template": messages}})
print(result["llm"]["replies"][0].text)
You can also use it standalone:
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret
generator = OpenAIChatGenerator(
model="my-chat-endpoint",
api_key=Secret.from_token("unused"),
api_base_url="http://localhost:5000/gateway/openai/v1",
)
messages = [ChatMessage.from_user("What is MLflow AI Gateway?")]
result = generator.run(messages)
print(result["replies"][0].text)
License
MLflow is distributed under the terms of the Apache-2.0 license.
