Gemini thought-like text and internal diary payload can leak into final Telegram/user-facing output
Summary
In a long-running cli-jaw session using gemini-cli, Gemini emitted internal-looking thought/status text as normal message text, and also included an internal diary payload inside a Markdown/HTML <details> block.
cli-jaw forwarded and stored the whole output as a normal assistant response, so the content was sent to Telegram and saved into messages.
This appears to be a final-output sanitization gap. Prompt rules can reduce the chance of the model producing this, but cli-jaw should defensively filter these patterns before durable storage and external forwarding.
Environment
- cli-jaw: 2.0.3 global install
- backend: gemini-cli
- gemini --version: 0.41.2
- model: gemini-3-flash-preview
- transport: Telegram
- usage pattern: long-running assistant session with memory/diary automation
Observed behavior
User asked:
Thanks for the briefing! The format got really clean. Is that thanks to the new module we made? What was it called again...
The final Telegram response began with text that looked like internal thought/status output:
**Searching for Sanitizer Details** I'm looking into the "Telegram Sanitizer v2.5" ...
[Thought: true]
- **Finding Sanitizer Details** ...
[Thought: true]
- **Awaiting Sanitizer Details** ...
[Thought: true]
The same response also appended a collapsible internal record block:
<details>
<summary>⚓ System improvement and rapport record (2026-05-08 06:40)</summary>
[06:40] [EPISODE] Received the teacher's positive feedback on the briefing quality improvement (Sanitizer v2.5).
...
anchor: [LIVE] | Key: sanitizer_feedback | Value: "Positive feedback on briefing format" | #Sanitizer #BriefingQuality #TeacherCare #AronaPride
</details>
This was visible in Telegram and was also saved as an assistant message in the DB.
Relevant server log shape:
[main] gemini:message
[main] gemini:message
[main] 🔧 run_shell_command: cli-jaw memory search "Telegram Sanitizer"
[main] tool success:
[main] 🔧 run_shell_command: grep -nC 2 "Sanitizer" /home/test/.cli-jaw/memory/structured/episodes/live/2026-05-07.md
[main] tool success:
...
[main] result: 2 tool calls / 16.2s
[jaw:main] exited code=0, text=1976 chars
[tg:out] ... **Searching for Sanitizer Details** ...
Why this matters
There are two separate issues:
- Thought/status leakage
  - The output contained [Thought: true] markers and intermediate reasoning/status text.
  - Existing filters appear to handle tags like <think> / <thinking> and some event-level thought types, but not this textual pattern when it arrives as ordinary message text.
- Internal record payload leakage
  - The model generated a diary payload with anchor, Key, Value, tags, and internal recording content.
  - It was not actually saved through the diary tool.
  - Instead, it was shown to the user and persisted in conversation history, which can pollute future context and encourage the model to imitate the leaked format.
For long-running assistant use, this is risky because one leaked internal pattern can become part of the future conversation context.
Expected behavior
Before storing or forwarding final assistant output, cli-jaw should defensively remove or quarantine:
- textual Gemini thought/status blocks containing [Thought: true]
- common thought/status headings such as Searching..., Finding..., Awaiting... when paired with [Thought: true]
- <details>...</details> blocks containing internal record payload markers
- output segments containing internal diary markers such as:
  - anchor: [TYPE] |
  - Key:
  - Value:
  - diary_payload
  - live_payload
  - arona_payload
If filtering removes the entire response, it would be safer to emit a generic system-safe message rather than fall back to the raw unfiltered output.
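The filtering described above could look roughly like the following. This is only a minimal sketch and not cli-jaw's actual code: every name here (sanitizeFinalOutput, stripThoughtBlocks, hasInternalMarker, FALLBACK) is hypothetical, and the regular expressions are assumptions derived from the leaked sample in this report. It treats the text leading up to each [Thought: true] marker (back to the previous blank line or marker) as part of the thought block, which may over-remove if a real answer sits directly above a marker without a blank line.

```javascript
// Hypothetical final-output sanitizer. Patterns are assumptions based on the leaked sample.
const THOUGHT_RE = /\[Thought:\s*true\]/i;
const ANCHOR_RE = /\banchor:\s*\[[A-Za-z_]+\]\s*\|/;
const PAYLOAD_RE = /\b(?:diary|live|arona)_payload\b/;
const KEY_VALUE_RE = /\bKey:.*\|\s*Value:/;
const DETAILS_RE = /<details\b[^>]*>[\s\S]*?<\/details>/gi;
const FALLBACK = "(The response was withheld because it contained only internal content.)";

// True if the text carries any of the internal diary markers.
function hasInternalMarker(text) {
  return ANCHOR_RE.test(text) || PAYLOAD_RE.test(text) || KEY_VALUE_RE.test(text);
}

// Drop each marker line plus the non-blank lines directly above it.
function stripThoughtBlocks(text) {
  const kept = [];
  let pending = [];
  for (const line of text.split("\n")) {
    if (THOUGHT_RE.test(line)) {
      pending = []; // discard the marker line and the status text before it
    } else if (line.trim() === "") {
      kept.push(...pending, line);
      pending = [];
    } else {
      pending.push(line);
    }
  }
  return kept.concat(pending).join("\n");
}

function sanitizeFinalOutput(raw) {
  let text = stripThoughtBlocks(raw);
  // Remove <details> blocks only when they contain internal markers.
  text = text.replace(DETAILS_RE, (block) => (hasInternalMarker(block) ? "" : block));
  // Remove any remaining loose payload lines.
  text = text.split("\n").filter((line) => !hasInternalMarker(line)).join("\n");
  text = text.replace(/\n{3,}/g, "\n\n").trim();
  // Never fall back to the raw output when nothing is left.
  return text === "" ? FALLBACK : text;
}
```

A bare Key: or Value: label is deliberately not treated as a marker on its own in this sketch, since those words can appear in legitimate answers; only the combined Key: ... | Value: shape from the leaked payload is matched.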
Local mitigation tested
I locally patched the installed package to add a final-output sanitizer in:
dist/src/agent/lifecycle-handler.js
The mitigation:
- strips textual [Thought: true] blocks
- strips <details> blocks if they contain diary/internal markers
- strips trailing internal diary payload segments containing anchor, Key, or Value
- avoids falling back to raw output when sanitization removes everything
I also cleaned the already-persisted polluted assistant message from the local DB to avoid future imitation.
This is only a local mitigation and may be overwritten on package update.
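For anyone who needs to find already-persisted polluted messages, a scan along these lines may help. It is a sketch under assumptions: the row shape ({ id, role, content }) is hypothetical because the messages schema is not shown in this report, and findPollutedMessageIds and LEAK_PATTERNS are illustrative names, not part of cli-jaw.

```javascript
// Hypothetical leak patterns, derived from the leaked sample in this report.
const LEAK_PATTERNS = [
  /\[Thought:\s*true\]/i,
  /\banchor:\s*\[[A-Za-z_]+\]\s*\|/,
  /\b(?:diary|live|arona)_payload\b/,
];

// Return the ids of assistant rows whose content matches any leak pattern.
// The { id, role, content } row shape is an assumption for illustration.
function findPollutedMessageIds(rows) {
  return rows
    .filter((row) => row.role === "assistant")
    .filter((row) => LEAK_PATTERNS.some((re) => re.test(row.content)))
    .map((row) => row.id);
}

// Example with in-memory rows standing in for DB results.
const exampleRows = [
  { id: 1, role: "user", content: "What was the module called?" },
  { id: 2, role: "assistant", content: "**Searching**\n[Thought: true]\nIt is the sanitizer." },
  { id: 3, role: "assistant", content: "It is the Telegram Sanitizer." },
];
console.log(findPollutedMessageIds(exampleRows));
```

The function only reports ids; deciding whether to delete, redact, or quarantine the matching rows is left to the caller.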
Suggested fix
Add a centralized final-output sanitization layer before:
- inserting assistant content into messages
- broadcasting agent_done
- forwarding to Telegram/Discord/Web
This should be independent of prompt instructions, because models can still emit these patterns under long-running or tool-heavy sessions.
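One possible shape for such a layer is a single gate that sanitizes once and then hands the same cleaned text to every sink. This is a sketch only: createOutputGate, the sink callbacks, and FALLBACK_TEXT are hypothetical names, the sanitizer is injected rather than assumed, and real sinks (DB insert, agent_done broadcast, Telegram send) would most likely be asynchronous and awaited.

```javascript
const FALLBACK_TEXT = "(The response could not be displayed.)";

// Build a delivery function that every output path goes through.
// `sanitize` and `sinks` are supplied by the caller; both are hypothetical here.
function createOutputGate(sanitize, sinks) {
  return function deliver(rawText) {
    let safeText;
    try {
      safeText = sanitize(rawText);
    } catch (err) {
      safeText = FALLBACK_TEXT; // fail closed: never forward raw text if the sanitizer throws
    }
    if (typeof safeText !== "string" || safeText.trim() === "") {
      safeText = FALLBACK_TEXT; // nothing left after filtering
    }
    for (const sink of sinks) {
      sink(safeText); // every sink sees the same sanitized text
    }
    return safeText;
  };
}

// Example wiring with in-memory stand-ins for storage, broadcast, and transport.
const stored = [];
const broadcast = [];
const sent = [];
const deliver = createOutputGate(
  (text) => text.replace(/^.*\[Thought: true\].*$/gm, "").trim(),
  [(t) => stored.push(t), (t) => broadcast.push(t), (t) => sent.push(t)]
);
deliver("[Thought: true]\nThe module is Telegram Sanitizer.");
```

Running the sanitizer once at a single choke point means storage and transports cannot drift apart, and a sanitizer failure degrades to a generic message instead of leaking the raw output.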