feat(storage): add conversation memory store substrate #1065

zhoug9127 wants to merge 24 commits into lightseekorg:main
Conversation
📝 Walkthrough

Threads an optional ConversationMemoryWriter through storage creation (StorageBundle) into AppContext and the router; adds an in-memory ConversationMemoryWriter, memory HTTP headers, MemoryRuntimeConfig, MemoryExecutionContext, middleware helpers, and validation/tests to gate memory store/recall.
Sequence Diagram

```mermaid
sequenceDiagram
    autonumber
    participant Client as HTTP Request
    participant Middleware as build_memory_execution_context
    participant Headers as MemoryHeaderView
    participant ExecCtx as MemoryExecutionContext
    participant Factory as create_storage (StorageBundle)
    participant Storage as StorageBundle
    participant AppCtx as AppContext
    participant Router as OpenAI Router

    Client->>Middleware: send request with headers
    Middleware->>Headers: MemoryHeaderView::from_http_headers
    Headers-->>Middleware: normalized header view
    Middleware->>ExecCtx: MemoryExecutionContext::from_headers(view, runtime)
    ExecCtx-->>Middleware: exec context (requested vs active)
    Factory->>Storage: create_storage(config) -> StorageBundle (includes optional memory writer)
    Storage-->>AppCtx: provide storages + conversation_memory_writer
    AppCtx->>Router: start request with ExecCtx and StorageHandles (may include writer)
    Router->>Storage: use storages and conversation_memory_writer (if present) during handling
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Code Review
This pull request introduces a new ConversationMemoryWriter trait and its implementations (Memory and NoOp) to the data connector, along with a new StorageBundle structure to manage storage backends. It also adds a MemoryRuntimeConfig and MemoryExecutionContext to the model gateway to handle memory-related HTTP headers and runtime configuration, including validation logic. My feedback highlights that header parsing for policies and boolean flags should be case-insensitive for robustness, and that the HeaderMap clone in RequestContext::refresh_memory_execution_context should be avoided to improve efficiency.
Force-pushed: d7f018a → b344599
Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
Force-pushed: b344599 → 0996098
Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
model_gateway/src/routers/openai/context.rs (1)

125-170: 🛠️ Refactor suggestion | 🟠 Major

Build `memory_execution_context` in the constructors, not as a separate refresh step.

`RequestContext::for_chat()` and `RequestContext::for_responses()` already have both the request headers and the router config, but they still seed `memory_execution_context` with `Default` and rely on a later `refresh_memory_execution_context()` call. That makes header parsing opt-in and easy to miss on new code paths; forgetting the extra call silently drops the memory controls for that request. Compute the context once during construction, and keep a refresh method only if headers can actually change after the `RequestContext` is built.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@model_gateway/src/routers/openai/context.rs` around lines 125 - 170, The constructors RequestContext::for_responses and RequestContext::for_chat currently set memory_execution_context to MemoryExecutionContext::default() and rely on refresh_memory_execution_context() later; instead, call middleware::build_memory_execution_context(self.components.router_config(), &headers) inside each constructor using the same headers extraction (the provided headers param or HeaderMap::default()) to initialize memory_execution_context at construction time, leaving refresh_memory_execution_context() only for cases where headers may change; update the fields in for_responses and for_chat to compute and assign memory_execution_context accordingly (referencing for_responses, for_chat, memory_execution_context, middleware::build_memory_execution_context, and refresh_memory_execution_context).
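A minimal, std-only sketch of the suggested fix: the constructor derives the memory context from the headers it already has instead of seeding `Default` and relying on a later refresh. The header name comes from this thread; everything else (plain `(String, String)` header pairs instead of `HeaderMap`, the `RouterConfig` field) is an illustrative assumption, not the crate's real API.

```rust
#[derive(Debug, Default, Clone)]
struct MemoryExecutionContext {
    store_ltm_active: bool,
}

struct RouterConfig {
    // Assumed runtime flag gating long-term-memory stores.
    ltm_enabled: bool,
}

fn build_memory_execution_context(
    config: &RouterConfig,
    headers: &[(String, String)],
) -> MemoryExecutionContext {
    // Header name taken from the review thread; matching is case-insensitive.
    let requested = headers.iter().any(|(k, v)| {
        k.eq_ignore_ascii_case("x-smg-memory-ltm-store-enabled")
            && v.trim().eq_ignore_ascii_case("true")
    });
    MemoryExecutionContext {
        // Active only when the client asked AND the runtime allows it.
        store_ltm_active: requested && config.ltm_enabled,
    }
}

struct RequestContext {
    memory_execution_context: MemoryExecutionContext,
}

impl RequestContext {
    // The constructor computes the context up front, so new code paths
    // cannot forget a separate refresh_memory_execution_context() call.
    fn for_chat(config: &RouterConfig, headers: &[(String, String)]) -> Self {
        Self {
            memory_execution_context: build_memory_execution_context(config, headers),
        }
    }
}

fn main() {
    let config = RouterConfig { ltm_enabled: true };
    let headers = vec![(
        "X-SMG-Memory-LTM-Store-Enabled".to_string(),
        "true".to_string(),
    )];
    let ctx = RequestContext::for_chat(&config, &headers);
    assert!(ctx.memory_execution_context.store_ltm_active);
}
```

With this shape, a refresh method only needs to exist if headers can genuinely change after construction.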
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/data_connector/src/factory.rs`:
- Around line 168-185: The hook path currently wraps
response/conversation/conversation_item storages but leaves
bundle.conversation_memory_writer unwrapped, letting
ConversationMemoryWriter::create_memory bypass hooks; either wrap it in a new
HookedConversationMemoryWriter (create HookedConversationMemoryWriter that
implements the same trait as conversation_memory_writer and invokes
StorageHook::before/after/current_request_context() like
HookedResponseStorage/HookedConversationStorage/HookedConversationItemStorage)
and return
Arc::new(HookedConversationMemoryWriter::new(bundle.conversation_memory_writer,
hook.clone())) in the StorageBundle, or explicitly reject combinations by
returning an Err when config.hook.is_some() &&
bundle.conversation_memory_writer.is_some() until the wrapper exists; update
create_storage_bundle() to use the chosen approach and add a test asserting
hook-enabled bundles either wrap the memory writer or are rejected.
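The wrapper suggested above can be sketched as follows. This is a simplified, synchronous stand-in: the real `ConversationMemoryWriter` trait is presumably async and takes richer types, and the `StorageHook` methods here are assumptions modeled on the `before`/`after` bracketing the comment describes for `HookedResponseStorage`.

```rust
use std::sync::Arc;

// Simplified stand-ins for the crate's traits.
trait ConversationMemoryWriter {
    fn create_memory(&self, content: &str) -> String;
}

trait StorageHook {
    fn before(&self, op: &str);
    fn after(&self, op: &str);
}

// Mirrors HookedResponseStorage: delegate to the inner writer,
// bracketing the call with the hook so memory writes stop bypassing it.
struct HookedConversationMemoryWriter {
    inner: Arc<dyn ConversationMemoryWriter>,
    hook: Arc<dyn StorageHook>,
}

impl ConversationMemoryWriter for HookedConversationMemoryWriter {
    fn create_memory(&self, content: &str) -> String {
        self.hook.before("create_memory");
        let id = self.inner.create_memory(content);
        self.hook.after("create_memory");
        id
    }
}

// Trivial impls so the sketch runs standalone.
struct NoHook;
impl StorageHook for NoHook {
    fn before(&self, _op: &str) {}
    fn after(&self, _op: &str) {}
}

struct DummyWriter;
impl ConversationMemoryWriter for DummyWriter {
    fn create_memory(&self, content: &str) -> String {
        format!("mem_{}", content.len())
    }
}

fn main() {
    let hooked = HookedConversationMemoryWriter {
        inner: Arc::new(DummyWriter),
        hook: Arc::new(NoHook),
    };
    assert_eq!(hooked.create_memory("hello"), "mem_5");
}
```

Because the wrapper implements the same trait, `create_storage_bundle()` can return it in place of the bare writer whenever a hook is configured.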
In `@model_gateway/src/app_context.rs`:
- Around line 559-564: After awaiting create_storage_bundle(storage_config), add
a fail-fast validation that if memory_runtime.ltm_store_enabled (the runtime
config flag) is true but bundle.conversation_memory_writer is None, return an
error (or propagate a Config/Init error) instead of continuing; update the code
path that sets self.conversation_memory_writer to only accept the Some(writer)
case (or fail when absent). Reference create_storage_bundle,
bundle.conversation_memory_writer, and memory_runtime.ltm_store_enabled so the
gateway will refuse to start when store-enabled is requested but the selected
backend exposes no memory writer.
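The fail-fast check described above amounts to one guard at startup. The types below are illustrative stand-ins (the real check would live in `AppContext` setup and return the gateway's own error type):

```rust
struct MemoryRuntimeConfig {
    ltm_store_enabled: bool,
}

// Refuse to start when long-term-memory stores are requested but the
// selected storage backend exposes no conversation memory writer.
fn validate_memory_writer<W>(
    runtime: &MemoryRuntimeConfig,
    writer: Option<&W>,
) -> Result<(), String> {
    if runtime.ltm_store_enabled && writer.is_none() {
        return Err(
            "ltm_store_enabled is set but the storage backend exposes no conversation memory writer"
                .to_string(),
        );
    }
    Ok(())
}

fn main() {
    let enabled = MemoryRuntimeConfig { ltm_store_enabled: true };
    assert!(validate_memory_writer(&enabled, None::<&()>).is_err());
    assert!(validate_memory_writer(&enabled, Some(&())).is_ok());
}
```

Failing here, rather than at the first memory write, makes the misconfiguration visible the moment the gateway boots.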
In `@model_gateway/src/config/validation.rs`:
- Around line 1168-1170: The test currently only checks that
ConfigValidator::validate(&config) returns an Err, which is too broad; update
the assertion to match the specific error variant ConfigError::ValidationFailed
(and optionally inspect its reason/message) by pattern-matching on result (e.g.,
let err = result.expect_err(...); match err {
ConfigError::ValidationFailed(reason) => { /* assert reason contains expected
text */ }, _ => panic!("unexpected error: {:?}", err) }); this ensures the
failure is the new validation rule rather than some unrelated error.
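The tighter assertion pattern looks roughly like this. The `ConfigError` enum and message text are stand-ins for the gateway's real validation error:

```rust
#[derive(Debug)]
enum ConfigError {
    ValidationFailed(String),
    #[allow(dead_code)]
    Io(String),
}

// Stand-in for ConfigValidator::validate on a store-enabled-without-ltm config.
fn validate() -> Result<(), ConfigError> {
    Err(ConfigError::ValidationFailed(
        "ltm_store_enabled requires ltm_enabled to be true".to_string(),
    ))
}

fn main() {
    let err = validate().expect_err("expected validation failure");
    // Match the specific variant and inspect the reason, instead of
    // accepting any Err the validator might produce.
    match err {
        ConfigError::ValidationFailed(reason) => {
            assert!(reason.contains("ltm_store_enabled"));
        }
        other => panic!("unexpected error: {other:?}"),
    }
}
```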
In `@model_gateway/src/memory/context.rs`:
- Around line 12-43: MemoryExecutionContext currently drops the parsed Policy so
callers cannot distinguish store_and_recall vs store_only; update
MemoryExecutionContext (and its constructors from_headers/from_http_headers) to
preserve the policy intent by adding explicit fields (e.g., recall_requested and
recall_active) or keeping the parsed Policy, and set them based on
Policy::from_value(headers.policy.as_deref()) and runtime flags; ensure
store_ltm_requested, store_ltm_active logic remains but also compute
recall_requested = policy.allows_recall() and recall_active = recall_requested
&& runtime.recall_enabled (or equivalent), populate
subject_id/recall_method/embedding_model/extraction_model as before, and
propagate these new fields so downstream hooks can observe recall intent in
MemoryExecutionContext and from_headers.
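A sketch of the requested change, keeping recall intent alongside store intent. The policy strings `store_and_recall`/`store_only` come from this comment; the exact `Policy` parsing and runtime flag names are assumptions:

```rust
#[derive(Debug, PartialEq)]
enum Policy {
    StoreAndRecall,
    StoreOnly,
}

impl Policy {
    fn from_value(value: Option<&str>) -> Option<Policy> {
        match value.map(|v| v.trim().to_ascii_lowercase()).as_deref() {
            Some("store_and_recall") => Some(Policy::StoreAndRecall),
            Some("store_only") => Some(Policy::StoreOnly),
            _ => None,
        }
    }

    fn allows_recall(&self) -> bool {
        matches!(self, Policy::StoreAndRecall)
    }
}

#[derive(Debug, Default)]
struct MemoryExecutionContext {
    // Preserved so downstream hooks can observe recall intent,
    // distinguishing "client asked" from "runtime permits".
    recall_requested: bool,
    recall_active: bool,
}

fn from_headers(policy_header: Option<&str>, recall_enabled: bool) -> MemoryExecutionContext {
    let recall_requested = Policy::from_value(policy_header)
        .map(|p| p.allows_recall())
        .unwrap_or(false);
    MemoryExecutionContext {
        recall_requested,
        recall_active: recall_requested && recall_enabled,
    }
}

fn main() {
    // Client asked for recall, but the runtime has it disabled:
    // the request is visible even though nothing activates.
    let ctx = from_headers(Some("store_and_recall"), false);
    assert!(ctx.recall_requested);
    assert!(!ctx.recall_active);
}
```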
---
Outside diff comments:
In `@model_gateway/src/routers/openai/context.rs`:
- Around line 125-170: The constructors RequestContext::for_responses and
RequestContext::for_chat currently set memory_execution_context to
MemoryExecutionContext::default() and rely on refresh_memory_execution_context()
later; instead, call
middleware::build_memory_execution_context(self.components.router_config(),
&headers) inside each constructor using the same headers extraction (the
provided headers param or HeaderMap::default()) to initialize
memory_execution_context at construction time, leaving
refresh_memory_execution_context() only for cases where headers may change;
update the fields in for_responses and for_chat to compute and assign
memory_execution_context accordingly (referencing for_responses, for_chat,
memory_execution_context, middleware::build_memory_execution_context, and
refresh_memory_execution_context).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: eefa691b-bd5d-491a-8137-761a1dd03f8e
📒 Files selected for processing (17)
crates/data_connector/src/factory.rs, crates/data_connector/src/lib.rs, crates/data_connector/src/memory.rs, crates/data_connector/src/noop.rs, model_gateway/src/app_context.rs, model_gateway/src/config/builder.rs, model_gateway/src/config/types.rs, model_gateway/src/config/validation.rs, model_gateway/src/lib.rs, model_gateway/src/memory/context.rs, model_gateway/src/memory/headers.rs, model_gateway/src/memory/mod.rs, model_gateway/src/middleware.rs, model_gateway/src/routers/openai/context.rs, model_gateway/src/routers/openai/responses/route.rs, model_gateway/src/routers/openai/router.rs, model_gateway/src/service_discovery.rs
Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2c035f5398
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
model_gateway/src/app_context.rs (1)

286-360: ⚠️ Potential issue | 🟠 Major

Enforce the memory-writer invariant in `build()` too.

The new readiness check only runs in `with_storage()`. Any caller that uses `AppContextBuilder` directly can still build an `AppContext` with `memory_runtime.ltm_store_enabled = true` and `conversation_memory_writer = None`, which recreates the missing-writer state this PR is trying to fail fast on.

Suggested fix
```diff
 pub fn build(self) -> Result<AppContext, AppContextBuildError> {
     let router_config = self
         .router_config
         .ok_or(AppContextBuildError::MissingField("router_config"))?;
+    validate_memory_writer_configuration(
+        &router_config,
+        self.conversation_memory_writer.as_ref(),
+    )
+    .map_err(AppContextBuildError::InvalidConfig)?;
+
     let configured_reasoning_parser = router_config.reasoning_parser.clone();
     let configured_tool_parser = router_config.tool_call_parser.clone();
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@model_gateway/src/app_context.rs` around lines 286 - 360, The build() method must enforce the same invariant as with_storage(): if router_config.memory_runtime.ltm_store_enabled (or equivalent flag on the RouterConfig) is true then conversation_memory_writer must be Some; add an early check in AppContextBuilder::build that reads router_config (or self.router_config) and returns Err(AppContextBuildError::InvalidConfig(...)) when ltm_store_enabled is true but self.conversation_memory_writer.is_none(), mirroring the with_storage() readiness check to fail fast; reference the build() function, with_storage(), RouterConfig.memory_runtime.ltm_store_enabled, and conversation_memory_writer when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/data_connector/src/factory.rs`:
- Around line 169-175: The current check in factory.rs unconditionally rejects
any config.hook when bundle.conversation_memory_writer.is_some(), which wrongly
blocks HistoryBackend::Memory and HistoryBackend::None even if the LTM store is
disabled; change the guard so it only errors when hooks are enabled AND the
runtime will actually use the conversation writer (e.g., check
memory_runtime.ltm_store_enabled or equivalent runtime flag) or move this
validation out of the factory into the runtime-aware layer that has access to
memory_runtime.ltm_store_enabled; specifically modify the logic around
config.hook and bundle.conversation_memory_writer to defer or conditionalize the
rejection (do not unconditionally reject for HistoryBackend::Memory/None).
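The conditional guard described above can be reduced to one predicate. The flag and field names here are illustrative stand-ins for the factory config:

```rust
struct FactoryInput {
    hook_configured: bool,
    has_memory_writer: bool,
    ltm_store_enabled: bool,
}

fn check_hook_compatibility(input: &FactoryInput) -> Result<(), String> {
    // Before: rejected whenever a hook and a memory writer coexisted,
    // wrongly blocking Memory/None backends with the LTM store disabled.
    // After: only an error when the writer will actually be exercised.
    if input.hook_configured && input.has_memory_writer && input.ltm_store_enabled {
        return Err(
            "storage hooks are not yet supported with an active conversation memory writer"
                .to_string(),
        );
    }
    Ok(())
}

fn main() {
    let disabled = FactoryInput {
        hook_configured: true,
        has_memory_writer: true,
        ltm_store_enabled: false,
    };
    // Hook + writer is fine as long as the store stays disabled.
    assert!(check_hook_compatibility(&disabled).is_ok());

    let enabled = FactoryInput { ltm_store_enabled: true, ..disabled };
    assert!(check_hook_compatibility(&enabled).is_err());
}
```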
---
Outside diff comments:
In `@model_gateway/src/app_context.rs`:
- Around line 286-360: The build() method must enforce the same invariant as
with_storage(): if router_config.memory_runtime.ltm_store_enabled (or equivalent
flag on the RouterConfig) is true then conversation_memory_writer must be Some;
add an early check in AppContextBuilder::build that reads router_config (or
self.router_config) and returns Err(AppContextBuildError::InvalidConfig(...))
when ltm_store_enabled is true but self.conversation_memory_writer.is_none(),
mirroring the with_storage() readiness check to fail fast; reference the build()
function, with_storage(), RouterConfig.memory_runtime.ltm_store_enabled, and
conversation_memory_writer when making the change.
Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/memory/context.rs`:
- Around line 95-108: The is_enabled helper currently treats "1", "true", "yes",
and "enabled" as truthy; update is_enabled to also accept "on" as a truthy value
by adding a check for value.eq_ignore_ascii_case("on") in the boolean expression
that compares the normalized value (function: is_enabled, uses normalize to trim
input). Ensure normalize(&str) remains used so values like " on " are correctly
handled.
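The fixed helper is small enough to show in full. The truthy set mirrors the values listed in this comment ("1", "true", "yes", "enabled", plus the newly requested "on"); `normalize` is assumed to trim and case-fold as described:

```rust
fn normalize(value: &str) -> String {
    value.trim().to_ascii_lowercase()
}

fn is_enabled(value: &str) -> bool {
    // "on" added to the truthy set; normalize handles " on " and "ON".
    let v = normalize(value);
    matches!(v.as_str(), "1" | "true" | "yes" | "enabled" | "on")
}

fn main() {
    assert!(is_enabled(" on "));
    assert!(is_enabled("TRUE"));
    assert!(!is_enabled("off"));
}
```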
Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/config/validation.rs`:
- Around line 82-94: Change the validation error text in validate_memory_runtime
(the ConfigError::ValidationFailed returned when
memory_runtime.ltm_store_enabled is true but memory_runtime.ltm_enabled is
false) to a more explicit constraint wording such as stating that
"ltm_store_enabled requires ltm_enabled to be true" so the message reads clearly
and directly conveys the dependency between memory_runtime.ltm_store_enabled and
memory_runtime.ltm_enabled.
Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/data_connector/src/memory.rs`:
- Around line 280-283: Update the doc comment on MemoryConversationMemoryWriter
to accurately describe which backend it serves: replace the misleading
"memory/noop history backends" text with a clear note that
MemoryConversationMemoryWriter is used only for HistoryBackend::Memory (and that
HistoryBackend::None uses NoOpConversationMemoryWriter). Locate the struct
MemoryConversationMemoryWriter and its doc comment and change the description to
reference HistoryBackend::Memory and NoOpConversationMemoryWriter explicitly so
readers aren’t confused about backend wiring.
In `@model_gateway/src/memory/headers.rs`:
- Around line 5-10: The header-name constants (MEMORY_POLICY_HEADER,
MEMORY_LTM_STORE_ENABLED_HEADER, MEMORY_SUBJECT_ID_HEADER,
MEMORY_RECALL_METHOD_HEADER, MEMORY_EMBEDDING_MODEL_HEADER,
MEMORY_EXTRACTION_MODEL_HEADER) should be made public so other modules and tests
can reuse the single source of truth: change each const to pub const in
model_gateway/src/memory/headers.rs and update any callers (e.g., tests in
model_gateway/src/routers/openai/context.rs) to import and reference these
constants instead of hardcoding "x-smg-memory-ltm-store-enabled" (and other
header strings).
In `@model_gateway/src/routers/openai/context.rs`:
- Around line 301-359: Add a regression test that follows the Responses path
through RequestContext::for_responses and then calls into_streaming_context() to
ensure the new fields are preserved; specifically, construct a RequestContext
via RequestContext::for_responses (using
shared_components_with_memory_enabled()), call ctx.into_streaming_context(), and
assert that the resulting StorageHandles contains a non-none
conversation_memory_writer and that its memory_execution_context has
store_ltm_active set (i.e., verify both conversation_memory_writer and
memory_execution_context survive into StorageHandles).
Actionable comments posted: 1
♻️ Duplicate comments (2)
crates/data_connector/src/memory.rs (1)
282-286: ⚠️ Potential issue | 🟡 Minor

Fix the backend description in this doc comment.

The doc comment states "memory/noop history backends", but `MemoryConversationMemoryWriter` is only wired for `HistoryBackend::Memory`. The `HistoryBackend::None` variant uses `NoOpConversationMemoryWriter` instead.

📝 Suggested fix

```diff
 #[derive(Default, Clone)]
-/// In-memory conversation memory writer used by memory/noop history backends.
+/// In-memory conversation memory writer used by the Memory history backend.
 pub struct MemoryConversationMemoryWriter {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/data_connector/src/memory.rs` around lines 282 - 286, The doc comment for MemoryConversationMemoryWriter incorrectly says it's used by "memory/noop history backends" — update the comment to correctly state that MemoryConversationMemoryWriter is used for the HistoryBackend::Memory (in-memory) backend only (and not for HistoryBackend::None, which uses NoOpConversationMemoryWriter); adjust the text on the MemoryConversationMemoryWriter struct to reflect that it is an in-memory conversation memory writer wired for HistoryBackend::Memory.

model_gateway/src/routers/openai/context.rs (1)
368-427: 🧹 Nitpick | 🔵 Trivial

Add a regression test for the streaming handoff.

These tests verify `RequestContext` initialization, but the new fields are consumed later in `into_streaming_context()` (lines 316-352). A test using `ResponsesComponents` should verify that `conversation_memory_writer` and `memory_execution_context` survive into `StorageHandles`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@model_gateway/src/routers/openai/context.rs` around lines 368 - 427, Add a regression test that ensures RequestContext fields used by the streaming handoff survive into StorageHandles: create a test that builds a RequestContext (using ResponsesRequest and ResponsesComponents/SharedComponents similar to the existing tests), call RequestContext::into_streaming_context() (or the method that produces StorageHandles) and assert that the returned StorageHandles contain a non-None conversation_memory_writer and that memory_execution_context (e.g., store_ltm_active) is preserved; reference RequestContext::into_streaming_context, ResponsesComponents, StorageHandles, conversation_memory_writer, and memory_execution_context when locating where to add the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/data_connector/src/memory.rs`:
- Around line 297-307: Add unit tests for MemoryConversationMemoryWriter that
validate create_memory behavior: write tests that call
MemoryConversationMemoryWriter::create_memory with a sample
NewConversationMemory and assert the returned ConversationMemoryId starts with
"mem_" and is unique across multiple calls (use ulid-based format expectations),
inspect the writer's inner store (via its .inner RwLock) to confirm the created
entry was inserted under that id and contains the expected NewConversationMemory
content, and add a clone-sharing-state test similar to
test_conversation_item_storage_clone_shares_state that clones
MemoryConversationMemoryWriter and verifies mutations (creating a memory) on one
instance are visible through the other. Use the types ConversationMemoryId,
NewConversationMemory, and MemoryConversationMemoryWriter to locate targets.
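The requested tests can be sketched against a stripped-down writer. The `Arc<RwLock<HashMap>>` inner store and the `mem_` id prefix follow this comment; a global counter stands in for the crate's ulid-based ids, and string content stands in for `NewConversationMemory`:

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};

#[derive(Default, Clone)]
struct MemoryConversationMemoryWriter {
    // Clones share this Arc, so a clone-shares-state test is meaningful.
    inner: Arc<RwLock<HashMap<String, String>>>,
}

// Counter stand-in for ulid; only uniqueness and the "mem_" prefix matter here.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

impl MemoryConversationMemoryWriter {
    fn create_memory(&self, content: &str) -> String {
        let id = format!("mem_{}", NEXT_ID.fetch_add(1, Ordering::Relaxed));
        self.inner.write().unwrap().insert(id.clone(), content.to_string());
        id
    }
}

fn main() {
    let writer = MemoryConversationMemoryWriter::default();
    let clone = writer.clone();

    let a = writer.create_memory("first");
    let b = writer.create_memory("second");

    // Ids carry the expected prefix and are unique across calls.
    assert!(a.starts_with("mem_") && b.starts_with("mem_"));
    assert_ne!(a, b);

    // Mutations through one instance are visible through the clone.
    assert_eq!(
        clone.inner.read().unwrap().get(&a).map(String::as_str),
        Some("first")
    );
}
```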
---
Duplicate comments:
In `@crates/data_connector/src/memory.rs`:
- Around line 282-286: The doc comment for MemoryConversationMemoryWriter
incorrectly says it's used by "memory/noop history backends" — update the
comment to correctly state that MemoryConversationMemoryWriter is used for the
HistoryBackend::Memory (in-memory) backend only (and not for
HistoryBackend::None, which uses NoOpConversationMemoryWriter); adjust the text
on the MemoryConversationMemoryWriter struct to reflect that it is an in-memory
conversation memory writer wired for HistoryBackend::Memory.
In `@model_gateway/src/routers/openai/context.rs`:
- Around line 368-427: Add a regression test that ensures RequestContext fields
used by the streaming handoff survive into StorageHandles: create a test that
builds a RequestContext (using ResponsesRequest and
ResponsesComponents/SharedComponents similar to the existing tests), call
RequestContext::into_streaming_context() (or the method that produces
StorageHandles) and assert that the returned StorageHandles contain a non-None
conversation_memory_writer and that memory_execution_context (e.g.,
store_ltm_active) is preserved; reference
RequestContext::into_streaming_context, ResponsesComponents, StorageHandles,
conversation_memory_writer, and memory_execution_context when locating where to
add the test.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/config/types.rs`:
- Around line 40-41: Add a unit test that deserializes a legacy config string
that omits the memory_runtime field and asserts the resulting struct's
memory_runtime equals MemoryRuntimeConfig::default(); this verifies the
#[serde(default)] compatibility. In the test (e.g.,
test_deserialize_legacy_config_memory_runtime_default) call serde_json::from_str
(or the repo's preferred format) to deserialize into the top-level config type
containing the memory_runtime field, then assert_eq!(config.memory_runtime,
MemoryRuntimeConfig::default()) to catch accidental removal of
#[serde(default)].
Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
model_gateway/src/routers/openai/context.rs (1)
37-49: 🛠️ Refactor suggestion | 🟠 Major

Expose the memory writer on the shared OpenAI context.

`RequestContext::for_chat()` can now mark `memory_execution_context.store_ltm_active`, but `ComponentRefs::Shared` still hardcodes `conversation_memory_writer()` to `None`. That leaves the Chat path with an "active store" signal and no writer capability, while the Responses path has both. If this PR is meant to establish a generic memory substrate for later request hooks, hoist the writer into `SharedComponents` (or onto `RequestContext`) so Chat and Responses expose the same memory surface.

Also applies to: 100-105, 153-176
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@model_gateway/src/routers/openai/context.rs` around lines 37 - 49, RequestContext::for_chat() can set memory_execution_context.store_ltm_active but ComponentRefs::Shared still returns None for the writer; fix by moving the conversation_memory_writer into SharedComponents (add pub conversation_memory_writer: Option<Arc<dyn ConversationMemoryWriter>>) and remove the hardcoded None in ComponentRefs::Shared so both chat and responses paths use the same writer surface, updating constructors/initializers that create SharedComponents and any uses in RequestContext, ResponsesComponents, and ComponentRefs; also apply corresponding changes where SharedComponents is constructed or referenced (the areas around the earlier noted sections ~100-105 and ~153-176) so the writer is threaded through consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@model_gateway/src/routers/openai/context.rs`:
- Around line 37-49: RequestContext::for_chat() can set
memory_execution_context.store_ltm_active but ComponentRefs::Shared still
returns None for the writer; fix by moving the conversation_memory_writer into
SharedComponents (add pub conversation_memory_writer: Option<Arc<dyn
ConversationMemoryWriter>>) and remove the hardcoded None in
ComponentRefs::Shared so both chat and responses paths use the same writer
surface, updating constructors/initializers that create SharedComponents and any
uses in RequestContext, ResponsesComponents, and ComponentRefs; also apply
corresponding changes where SharedComponents is constructed or referenced (the
areas around the earlier noted sections ~100-105 and ~153-176) so the writer is
threaded through consistently.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 650599b91f
Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/memory/context.rs`:
- Around line 124-131: Update the test missing_headers_produce_inactive_context
to assert that all recall/store-related active/requested flags are false in
addition to store_ltm_active; after creating ctx via
MemoryExecutionContext::from_headers(&MemoryHeaderView::default(),
&MemoryRuntimeConfig::default()), add assertions such as
assert!(!ctx.recall_requested) and assert!(!ctx.recall_active) (or the exact
recall/request flag names used on MemoryExecutionContext) so the test verifies
both inactive store_ltm and inactive recall/request flags.
📒 Files selected for processing (1)
model_gateway/src/memory/context.rs
Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
@@ -53,22 +73,36 @@ pub struct StorageFactoryConfig<'a> {
/// # Errors
/// Returns error string if required configuration is missing or initialization fails
pub async fn create_storage(config: StorageFactoryConfig<'_>) -> Result<StorageTuple, String> {
do we need to keep both create_storage and create_storage_bundle? It feels redundant; how about deleting create_storage and renaming create_storage_bundle?
create_storage_bundle was redundant, so I removed the split and kept a single create_storage() that returns StorageBundle. That keeps the API surface smaller and avoids two names for the same operation.
#[derive(Debug, Clone, Default, PartialEq, Eq, Serialize, Deserialize)]
/// Runtime feature flags that gate memory store/recall behavior.
#[serde(default)]
pub struct MemoryRuntimeConfig {
If this is config, move it to the config module.
Config should be self-contained instead of depending on other code in the gateway.
Moved MemoryRuntimeConfig into the config module so config stays self-contained and does not depend on the gateway memory module anymore. memory/context.rs now imports it from crate::config.
Revised since latest change pushed:
MemoryRuntimeConfig now lives under the config layer (model_gateway/src/config/types.rs) and is re-exported via crate::config, so memory/context.rs depends on config rather than defining config-owned types itself.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
/// Normalized per-request memory execution context derived from headers and runtime flags.
pub struct MemoryExecutionContext {
Can you elaborate on these?
I can barely tell what any of them mean just by reading the variable names.
Sure, I expanded the docs on MemoryExecutionContext and its fields. The code now explains the distinction between *_requested (caller intent from headers/policy) and *_active (actually enabled after runtime gating), and also documents the purpose of subject_id, recall_method, embedding_model, and extraction_model.
For the header-related code, I think Sai is taking care of it and should have a PR shortly.
}

#[cfg(test)]
mod tests {
This module’s main responsibility is translating memory headers + runtime flags into MemoryExecutionContext, and these tests cover that decision logic directly:
- missing headers => inactive context
- store headers => store requested/active only when runtime permits it
- store_and_recall => recall requested/active
- store_only => no recall
Let me know if you find them redundant.
// Placeholder header names for memory controls.
// These names are under active discussion and may change in a follow-up PR.
pub const MEMORY_POLICY_HEADER: &str = "x-smg-memory-policy";
move header specifics to
model_gateway/src/routers/header_utils.rs
Hi @zhoug9127, this PR has merge conflicts that must be resolved before it can be merged. Please rebase your branch:

git fetch origin main
git rebase origin/main
# resolve any conflicts, then:
git push --force-with-lease
Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
- Remove create_storage_bundle; consolidate into create_storage returning StorageBundle directly (removes redundant tuple wrapper)
- Move MemoryRuntimeConfig from memory::context to config::types so config is self-contained without gateway dependencies
- Add doc comments to all MemoryExecutionContext fields explaining requested vs active semantics and each field's purpose
- Move memory header constants from memory/headers.rs to routers/common/header_utils.rs where other x-smg-* headers live
- Remove duplicated serde deserialization test (already covered in config::types::tests)

Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/data_connector/src/factory.rs`:
- Around line 15-19: The None history backend currently does not expose a
conversation_memory_writer which causes AppContext to reject store-enabled
configs; update crates/data_connector/src/factory.rs so that
HistoryBackend::None returns/include the no-op memory writer (e.g., expose a
NoOpConversationMemoryWriter or equivalent alongside
NoOpConversationItemStorage/NoOpConversationStorage/NoOpResponseStorage) under
the None branch where conversation_memory_writer is currently None, and also
update
model_gateway/src/app_context.rs::history_backend_supports_memory_writer() to
treat HistoryBackend::None as supporting a no-op memory writer (align the
precheck logic with the factory change).
In `@model_gateway/src/routers/common/header_utils.rs`:
- Around line 12-19: Move ownership of the memory header constants
(MEMORY_POLICY_HEADER, MEMORY_LTM_STORE_ENABLED_HEADER,
MEMORY_SUBJECT_ID_HEADER, MEMORY_RECALL_METHOD_HEADER,
MEMORY_EMBEDDING_MODEL_HEADER, MEMORY_EXTRACTION_MODEL_HEADER) out of
routers::common into the memory module (define them in memory::headers), then
update routers::common to import and re-export those constants from
memory::headers so the dependency flows from router -> memory rather than memory
depending on routers; update any use sites to import the constants from the
memory module or the router re-export as appropriate.
In `@model_gateway/src/routers/openai/responses/route.rs`:
- Around line 165-166: The call to refresh_memory_execution_context() is
redundant because RequestContext::for_responses already initializes
memory_execution_context; remove the extra
ctx.refresh_memory_execution_context() call so the context is built only once,
or alternatively refactor so that RequestContext::for_responses does not build
memory_execution_context and you explicitly build it after setting
ctx.storage_request_context (i.e., move the initial memory build out of
for_responses and invoke refresh_memory_execution_context() only once after
ctx.storage_request_context is assigned); update uses of
RequestContext::for_responses and refresh_memory_execution_context() accordingly
to keep a single, deterministic build path.
📒 Files selected for processing (15)
- crates/data_connector/src/factory.rs
- crates/data_connector/src/lib.rs
- crates/data_connector/src/noop.rs
- model_gateway/src/app_context.rs
- model_gateway/src/config/builder.rs
- model_gateway/src/config/types.rs
- model_gateway/src/config/validation.rs
- model_gateway/src/memory/context.rs
- model_gateway/src/memory/headers.rs
- model_gateway/src/memory/mod.rs
- model_gateway/src/routers/common/header_utils.rs
- model_gateway/src/routers/openai/context.rs
- model_gateway/src/routers/openai/responses/route.rs
- model_gateway/src/routers/openai/router.rs
- model_gateway/src/service_discovery.rs
Signed-off-by: zhoug9127 <zhoug9127@gmail.com>
Superseded by #1149, which contains the updated implementation and follow-up fixes. Closing this PR to keep review history focused in the new thread.
…ector

Both have been the consistent primary authors of recent PRs in these subsystems:
- @zhoug9127 (Daisy): #1168, #1149, #1065, #1061, #976 — mcp + data_connector
- @zhaowenzi (Ziwen): #1174, #1163, #1123 — mcp

Signed-off-by: Simo Lin <linsimo.mark@gmail.com>
Description
Problem
SMG already supports response, conversation, and conversation-item persistence, but it does not yet expose a generic conversation-memory store substrate that later request hooks can consume. We need that generic substrate in upstream first, without introducing specific planners, row semantics, or persistence behavior changes.
Header Note (Placeholder)
Current memory request headers are placeholders and are still under discussion.
They are intentionally treated as provisional and may be renamed in a follow-up PR:
- x-smg-memory-policy
- x-smg-memory-ltm-store-enabled
- x-smg-memory-subject-id
- x-smg-memory-recall-method
- x-smg-memory-embedding-model
- x-smg-memory-extraction-model

Solution
Add a generic conversation-memory store substrate and normalized request memory execution context.
This PR:
- ConversationMemoryWriter exposure through the storage factory and app context, including the None backend
- memory_runtime config plus validation for store readiness
- MemoryExecutionContext for normalized per-request memory state

Non-goals in this PR:
Changes
- create_storage_bundle plumbing and optional conversation_memory_writer
- memory_runtime config and validation
- model_gateway::memory module for normalized request memory context
- cargo test -p data-connector test_create_storage_bundle_memory_exposes_memory_writer -- --nocapture
- cargo test -p data-connector test_create_storage_bundle_none_exposes_memory_writer -- --nocapture
- cargo test -p smg test_builder_memory_runtime_flags_round_trip -- --nocapture
- cargo test -p smg test_reject_ltm_store_when_ltm_runtime_disabled -- --nocapture
- cargo test -p smg header_policy_store_and_recall_enables_store_request -- --nocapture
- cargo test -p smg missing_headers_produce_inactive_context -- --nocapture
- cargo test -p smg store_headers_become_active_when_runtime_is_ready -- --nocapture
- git diff --check

Summary by CodeRabbit
New Features
Behavioral Changes
Tests