From 0664dfbcca9d06ec0e2d0aab25fd145106c9edc3 Mon Sep 17 00:00:00 2001 From: albertodiazdurana <52709586+albertodiazdurana@users.noreply.github.com> Date: Wed, 6 May 2026 18:15:34 +0200 Subject: [PATCH 1/3] docs(ollama): add streaming-with-tools example to OllamaChatGenerator reference Closes deepset-ai/haystack-core-integrations#3263 (follow-up). The component reference page already covers Tool Support and Streaming in separate sections, but no example shows them combined. Adds a Streaming with Tools section between the two, with an executable example verified empirically against llama3.1:8b on Ollama. Notable behavior captured in the doc: when the model invokes a tool, streamed chunks carry tool_calls deltas and chunk.content is empty; the final ChatMessage has text=None and tool_calls populated. --- .../generators/ollamachatgenerator.mdx | 48 +++++++++++++++++++ 1 file changed, 48 insertions(+) diff --git a/docs-website/docs/pipeline-components/generators/ollamachatgenerator.mdx b/docs-website/docs/pipeline-components/generators/ollamachatgenerator.mdx index 7f6b3fad5f..2084037c0d 100644 --- a/docs-website/docs/pipeline-components/generators/ollamachatgenerator.mdx +++ b/docs-website/docs/pipeline-components/generators/ollamachatgenerator.mdx @@ -88,6 +88,54 @@ See our [Streaming Support](guides-to-generators/choosing-the-right-generator.md Give preference to `print_streaming_chunk` by default. Write a custom callback only if you need a specific transport (for example, SSE/WebSocket) or custom UI formatting. +### Streaming with Tools + +You can combine streaming with tool calling. Pass both `tools` and `streaming_callback`; when the model decides to invoke a tool, the streamed chunks carry tool-call deltas instead of text tokens, and the final reconstructed `ChatMessage` exposes the resolved `tool_calls` list on `replies[0]`. 
+ +```python +from haystack.dataclasses import ChatMessage +from haystack.dataclasses.streaming_chunk import StreamingChunk +from haystack.tools import create_tool_from_function +from haystack_integrations.components.generators.ollama import OllamaChatGenerator + + +def get_weather(city: str) -> str: + """Get current weather for a city.""" + return f"Sunny, 22°C in {city}" + + +def callback(chunk: StreamingChunk) -> None: + if chunk.tool_calls: + print(f"[tool delta] {chunk.tool_calls}") + elif chunk.content: + print(chunk.content, end="", flush=True) + + +generator = OllamaChatGenerator( + model="llama3.1:8b", + generation_kwargs={"temperature": 0.0}, + tools=[create_tool_from_function(get_weather)], + streaming_callback=callback, +) + +response = generator.run( + messages=[ChatMessage.from_user( + "What's the weather in Berlin? Use the get_weather tool." + )] +) + +# Final reconstructed message: tool_calls populated, text is None +assistant_message = response["replies"][0] +print(assistant_message.tool_calls) +# -> [ToolCall(tool_name='get_weather', arguments={'city': 'Berlin'}, ...)] +``` + +:::tip[What to expect when tools fire] +When the model emits a tool call rather than free-form text, streamed chunks carry `tool_calls` deltas and `chunk.content` is empty. The final `replies[0].text` will be `None`, and `replies[0].tool_calls` holds the reconstructed call list. Plain text streaming and tool calling are mutually exclusive within a single generation step. +::: + +You can use the built-in `print_streaming_chunk` callback (which handles both text tokens and tool events) instead of writing your own. + ## Usage 1. You need a running instance of Ollama. The installation instructions are [in the Ollama GitHub repository](https://github.com/jmorganca/ollama). 
From e22be72b4a027e451cbd5aaa5bc20b0b088bc1c3 Mon Sep 17 00:00:00 2001 From: albertodiazdurana <52709586+albertodiazdurana@users.noreply.github.com> Date: Wed, 6 May 2026 23:46:29 +0200 Subject: [PATCH 2/3] docs(ollama): add release note for streaming-with-tools doc Per CONTRIBUTING.md, every PR requires a release note under releasenotes/notes/. Categorized as `enhancements` to match the shape of prior docs-only release notes (e.g., docs-cleaner-markdown-ocr-examples-...yaml). --- ...-with-tools-ollamachatgenerator-docs-8e339d62f38ebd06.yaml | 4 ++++ 1 file changed, 4 insertions(+) create mode 100644 releasenotes/notes/streaming-with-tools-ollamachatgenerator-docs-8e339d62f38ebd06.yaml diff --git a/releasenotes/notes/streaming-with-tools-ollamachatgenerator-docs-8e339d62f38ebd06.yaml b/releasenotes/notes/streaming-with-tools-ollamachatgenerator-docs-8e339d62f38ebd06.yaml new file mode 100644 index 0000000000..cf1599ceef --- /dev/null +++ b/releasenotes/notes/streaming-with-tools-ollamachatgenerator-docs-8e339d62f38ebd06.yaml @@ -0,0 +1,4 @@ +--- +enhancements: + - | + Document the combined use of streaming and tool calling in ``OllamaChatGenerator``: pass both ``streaming_callback`` and ``tools`` to receive ``tool_calls`` deltas through the streaming callback, and read the reconstructed call list from ``replies[0].tool_calls`` once the run returns. The component reference page now includes an executable example and notes that, within a single generation step, streamed text tokens and tool-call deltas are mutually exclusive (when the model invokes a tool, ``chunk.content`` is empty and ``replies[0].text`` is ``None``). 
From de747a4bc5b865b64d1d42b28c0673b1728e9ef4 Mon Sep 17 00:00:00 2001 From: albertodiazdurana <52709586+albertodiazdurana@users.noreply.github.com> Date: Thu, 7 May 2026 14:50:57 +0200 Subject: [PATCH 3/3] docs(ollama): address review feedback on PR #11268 - Remove releasenotes/notes/streaming-with-tools-ollamachatgenerator-docs-8e339d62f38ebd06.yaml: docs-only change does not need a release note. - Remove the :::tip[What to expect when tools fire] admonition from docs-website/docs/pipeline-components/generators/ollamachatgenerator.mdx: the inline comments in the streaming-with-tools example already convey the same information. - Add the Streaming with Tools section to docs-website/versioned_docs/version-2.28/pipeline-components/generators/ollamachatgenerator.mdx (latest stable), byte-identical to the v3 docs section. --- .../generators/ollamachatgenerator.mdx | 4 -- .../generators/ollamachatgenerator.mdx | 44 +++++++++++++++++++ ...machatgenerator-docs-8e339d62f38ebd06.yaml | 4 -- 3 files changed, 44 insertions(+), 8 deletions(-) delete mode 100644 releasenotes/notes/streaming-with-tools-ollamachatgenerator-docs-8e339d62f38ebd06.yaml diff --git a/docs-website/docs/pipeline-components/generators/ollamachatgenerator.mdx b/docs-website/docs/pipeline-components/generators/ollamachatgenerator.mdx index 2084037c0d..b141d5889e 100644 --- a/docs-website/docs/pipeline-components/generators/ollamachatgenerator.mdx +++ b/docs-website/docs/pipeline-components/generators/ollamachatgenerator.mdx @@ -130,10 +130,6 @@ print(assistant_message.tool_calls) # -> [ToolCall(tool_name='get_weather', arguments={'city': 'Berlin'}, ...)] ``` -:::tip[What to expect when tools fire] -When the model emits a tool call rather than free-form text, streamed chunks carry `tool_calls` deltas and `chunk.content` is empty. The final `replies[0].text` will be `None`, and `replies[0].tool_calls` holds the reconstructed call list. 
Plain text streaming and tool calling are mutually exclusive within a single generation step. -::: - You can use the built-in `print_streaming_chunk` callback (which handles both text tokens and tool events) instead of writing your own. ## Usage diff --git a/docs-website/versioned_docs/version-2.28/pipeline-components/generators/ollamachatgenerator.mdx b/docs-website/versioned_docs/version-2.28/pipeline-components/generators/ollamachatgenerator.mdx index 816ed66ad1..d16e1964cf 100644 --- a/docs-website/versioned_docs/version-2.28/pipeline-components/generators/ollamachatgenerator.mdx +++ b/docs-website/versioned_docs/version-2.28/pipeline-components/generators/ollamachatgenerator.mdx @@ -87,6 +87,50 @@ See our [Streaming Support](guides-to-generators/choosing-the-right-generator.md Give preference to `print_streaming_chunk` by default. Write a custom callback only if you need a specific transport (for example, SSE/WebSocket) or custom UI formatting. +### Streaming with Tools + +You can combine streaming with tool calling. Pass both `tools` and `streaming_callback`; when the model decides to invoke a tool, the streamed chunks carry tool-call deltas instead of text tokens, and the final reconstructed `ChatMessage` exposes the resolved `tool_calls` list on `replies[0]`. 
+ +```python +from haystack.dataclasses import ChatMessage +from haystack.dataclasses.streaming_chunk import StreamingChunk +from haystack.tools import create_tool_from_function +from haystack_integrations.components.generators.ollama import OllamaChatGenerator + + +def get_weather(city: str) -> str: + """Get current weather for a city.""" + return f"Sunny, 22°C in {city}" + + +def callback(chunk: StreamingChunk) -> None: + if chunk.tool_calls: + print(f"[tool delta] {chunk.tool_calls}") + elif chunk.content: + print(chunk.content, end="", flush=True) + + +generator = OllamaChatGenerator( + model="llama3.1:8b", + generation_kwargs={"temperature": 0.0}, + tools=[create_tool_from_function(get_weather)], + streaming_callback=callback, +) + +response = generator.run( + messages=[ChatMessage.from_user( + "What's the weather in Berlin? Use the get_weather tool." + )] +) + +# Final reconstructed message: tool_calls populated, text is None +assistant_message = response["replies"][0] +print(assistant_message.tool_calls) +# -> [ToolCall(tool_name='get_weather', arguments={'city': 'Berlin'}, ...)] +``` + +You can use the built-in `print_streaming_chunk` callback (which handles both text tokens and tool events) instead of writing your own. + ## Usage 1. You need a running instance of Ollama. The installation instructions are [in the Ollama GitHub repository](https://github.com/jmorganca/ollama). 
diff --git a/releasenotes/notes/streaming-with-tools-ollamachatgenerator-docs-8e339d62f38ebd06.yaml b/releasenotes/notes/streaming-with-tools-ollamachatgenerator-docs-8e339d62f38ebd06.yaml deleted file mode 100644 index cf1599ceef..0000000000 --- a/releasenotes/notes/streaming-with-tools-ollamachatgenerator-docs-8e339d62f38ebd06.yaml +++ /dev/null @@ -1,4 +0,0 @@ ---- -enhancements: - - | - Document the combined use of streaming and tool calling in ``OllamaChatGenerator``: pass both ``streaming_callback`` and ``tools`` to receive ``tool_calls`` deltas through the streaming callback, and read the reconstructed call list from ``replies[0].tool_calls`` once the run returns. The component reference page now includes an executable example and notes that, within a single generation step, streamed text tokens and tool-call deltas are mutually exclusive (when the model invokes a tool, ``chunk.content`` is empty and ``replies[0].text`` is ``None``).
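
Illustrative follow-up (not part of the patches above): the reconstruction step the docs describe — per-chunk `tool_calls` deltas merged into the final call list on `replies[0]` — can be sketched in plain Python. The delta dict shape below (`index`, `name`, `arguments` fragments) is a hypothetical stand-in for demonstration, not the exact structure of Haystack's `StreamingChunk.tool_calls` entries.

```python
import json


def accumulate_tool_call_deltas(deltas):
    """Merge streamed per-chunk deltas into finished {name, arguments} tool calls.

    Assumes each delta carries an ``index`` identifying the call it belongs to,
    an optional ``name``, and a fragment of the JSON ``arguments`` string.
    """
    calls = {}  # index -> {"name": str, "arguments": str (accumulating JSON)}
    for delta in deltas:
        slot = calls.setdefault(delta["index"], {"name": "", "arguments": ""})
        if delta.get("name"):
            slot["name"] = delta["name"]
        slot["arguments"] += delta.get("arguments", "")
    # Only once streaming has finished is each accumulated argument string
    # valid JSON, so parsing happens here rather than per chunk.
    return [
        {"name": call["name"], "arguments": json.loads(call["arguments"])}
        for _, call in sorted(calls.items())
    ]


# Two chunks split one call's arguments mid-token, as streaming APIs may do.
deltas = [
    {"index": 0, "name": "get_weather", "arguments": '{"ci'},
    {"index": 0, "arguments": 'ty": "Berlin"}'},
]
print(accumulate_tool_call_deltas(deltas))
# -> [{'name': 'get_weather', 'arguments': {'city': 'Berlin'}}]
```

This mirrors why the documented callback sees only deltas while `replies[0].tool_calls` holds complete `ToolCall` objects: argument JSON is unparseable until the last fragment arrives.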