fix(session): prevent resume data loss and add /resume slash command#3225
Closed
Mibayy wants to merge 1 commit intoNousResearch:mainfrom
Closed
fix(session): prevent resume data loss and add /resume slash command#3225Mibayy wants to merge 1 commit intoNousResearch:mainfrom
Mibayy wants to merge 1 commit intoNousResearch:mainfrom
Conversation
Three related bugs all rooted in the same session registration flow. Bug 1 — session written to DB with zero messages (NousResearch#3123 root cause): AIAgent.__init__ calls create_session() for the session_id. In the CLI flow, the CLI already called create_session() earlier, so the INSERT hits a UNIQUE constraint. The exception handler set self._session_db = None, silently dropping every subsequent append_message call. The session row existed in the DB but had no messages, so --resume found it and printed 'has no messages'. Fix: change INSERT to INSERT OR IGNORE in SessionDB.create_session(). Re-registering an existing session is now a safe no-op. Also remove the self._session_db = None fallback — it caused more harm than the original failure it was guarding against. Bug 2 — --resume overwrites session JSON with truncated history: When --resume is used on a session with zero SQLite messages (due to bug 1), the CLI loads empty history. The first turn then calls _save_session_log(messages=[user+assistant]) which atomically writes a 2-message session JSON file — overwriting the pre-existing file that had the full conversation history. The user loses all data. Fix: _save_session_log now reads the existing file before writing. If the file has more messages than the current batch, the write is skipped. The full history is preserved until the in-memory messages list catches up. Bug 3 — /resume is an unknown command: The commands registry listed 'resume' but process_command() had no handler for it, so /resume <id> produced 'Unknown command'. Fix: add elif canonical == 'resume': to process_command(). The handler resolves the target by title or ID, ends the current session, switches session_id + conversation_history, re-opens the target session in the DB, and syncs the cached agent if one exists. Closes NousResearch#3123
teknium1
added a commit
that referenced
this pull request
Mar 27, 2026
…reopen_session API Three improvements salvaged from PR #3225 by Mibayy: 1. Add /resume slash command handler in CLI process_command(). The command was registered in the commands registry but had no handler, so typing /resume produced 'Unknown command'. The handler resolves by title or session ID, ends the current session cleanly, loads conversation history from SQLite, re-opens the target session, and syncs the AIAgent instance. Follows the same pattern as new_session(). 2. Add truncation guard in _save_session_log(). When resuming a session whose messages weren't fully written to SQLite, the agent starts with partial history and the first save would overwrite the full JSON log on disk. The guard reads the existing file and skips the write if it already has more messages than the current batch. 3. Add reopen_session() method to SessionDB. Proper API for clearing ended_at/end_reason instead of reaching into _conn directly. Note: Bug 1 from the original PR (INSERT OR IGNORE + _session_db = None) is already fixed on main — skipped as redundant. Closes #3123.
Merged
2 tasks
teknium1
added a commit
that referenced
this pull request
Mar 27, 2026
…reopen_session API (#3315) Three improvements salvaged from PR #3225 by Mibayy: 1. Add /resume slash command handler in CLI process_command(). The command was registered in the commands registry but had no handler, so typing /resume produced 'Unknown command'. The handler resolves by title or session ID, ends the current session cleanly, loads conversation history from SQLite, re-opens the target session, and syncs the AIAgent instance. Follows the same pattern as new_session(). 2. Add truncation guard in _save_session_log(). When resuming a session whose messages weren't fully written to SQLite, the agent starts with partial history and the first save would overwrite the full JSON log on disk. The guard reads the existing file and skips the write if it already has more messages than the current batch. 3. Add reopen_session() method to SessionDB. Proper API for clearing ended_at/end_reason instead of reaching into _conn directly. Note: Bug 1 from the original PR (INSERT OR IGNORE + _session_db = None) is already fixed on main — skipped as redundant. Closes #3123.
Contributor
|
Merged via PR #3315. Salvaged the session log truncation guard, the /resume CLI handler (with a proper reopen_session() API instead of raw SQL), and added the missing SessionDB method. Bug 1 (INSERT OR IGNORE) was already fixed on main so that piece was skipped. Thanks for the thorough analysis of the cascading failures! |
StreamOfRon
pushed a commit
to StreamOfRon/hermes-agent
that referenced
this pull request
Mar 29, 2026
…reopen_session API (NousResearch#3315) Three improvements salvaged from PR NousResearch#3225 by Mibayy: 1. Add /resume slash command handler in CLI process_command(). The command was registered in the commands registry but had no handler, so typing /resume produced 'Unknown command'. The handler resolves by title or session ID, ends the current session cleanly, loads conversation history from SQLite, re-opens the target session, and syncs the AIAgent instance. Follows the same pattern as new_session(). 2. Add truncation guard in _save_session_log(). When resuming a session whose messages weren't fully written to SQLite, the agent starts with partial history and the first save would overwrite the full JSON log on disk. The guard reads the existing file and skips the write if it already has more messages than the current batch. 3. Add reopen_session() method to SessionDB. Proper API for clearing ended_at/end_reason instead of reaching into _conn directly. Note: Bug 1 from the original PR (INSERT OR IGNORE + _session_db = None) is already fixed on main — skipped as redundant. Closes NousResearch#3123.
angelburgosrosado
pushed a commit
to angelburgosrosado/hermes-agent
that referenced
this pull request
Apr 27, 2026
…reopen_session API (NousResearch#3315) Three improvements salvaged from PR NousResearch#3225 by Mibayy: 1. Add /resume slash command handler in CLI process_command(). The command was registered in the commands registry but had no handler, so typing /resume produced 'Unknown command'. The handler resolves by title or session ID, ends the current session cleanly, loads conversation history from SQLite, re-opens the target session, and syncs the AIAgent instance. Follows the same pattern as new_session(). 2. Add truncation guard in _save_session_log(). When resuming a session whose messages weren't fully written to SQLite, the agent starts with partial history and the first save would overwrite the full JSON log on disk. The guard reads the existing file and skips the write if it already has more messages than the current batch. 3. Add reopen_session() method to SessionDB. Proper API for clearing ended_at/end_reason instead of reaching into _conn directly. Note: Bug 1 from the original PR (INSERT OR IGNORE + _session_db = None) is already fixed on main — skipped as redundant. Closes NousResearch#3123.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #3123
Three bugs in the session resume flow, all cascading from the same root cause.
Bug 1 — Sessions recorded with zero messages (root cause)
AIAgent.__init__callscreate_session()for the session ID. In the CLI flow,__init__already calledcreate_session()earlier, so the INSERT hits a UNIQUE constraint. The exception handler did:This is self-defeating: it caused data loss by making every subsequent
append_message()call silently no-op. The session row existed in the DB with zero messages.--resumefound the row and printed"has no messages. Starting fresh.".Fix: Change
INSERT→INSERT OR IGNOREinSessionDB.create_session(). Re-registering an existing session is now a safe no-op. Also removeself._session_db = None— it made the situation worse, not better.Bug 2 —
--resumeoverwrites session JSON with truncated history (data loss)This is what the new comment reports. Sequence:
Xhas 200 messages in JSONL +session_X.json, but 0 in SQLite (due to bug 1)hermes --resume X_init_agentloads 0 messages from SQLite, setsconversation_history = []_save_session_log([user, assistant])writes a 2-message JSON file → overwrites the 200-messagesession_X.jsonFix:
_save_session_lognow reads the existing file before writing. If the file already has more messages than the current batch, the write is skipped. The full history is preserved until the in-memorymessageslist catches up through normal turns.Bug 3 —
/resumeis an unknown commandThe commands registry (
hermes_cli/commands.py) listsresumewithoutcli_only=True, implying CLI availability — butprocess_command()had no handler for it. Typing/resume <id>produced"Unknown command: /resume".Fix: Add
elif canonical == "resume":toprocess_command(). The handler:_resolve_session_by_name_or_id)self.session_idand loadsconversation_historyfrom SQLiteended_at)AIAgentinstance if one is already initialised (session_id,_last_flushed_db_idx, system prompt cache)Files changed
hermes_state.pyINSERT OR IGNOREincreate_sessionrun_agent.pyself._session_db = None; add truncation guard in_save_session_logcli.py/resumehandler inprocess_commandTests
1661 gateway + session + run_agent tests pass, 21 skipped, 0 failures.