SEP-XXXX: Multi Round-Trip Requests#12
Conversation
…col rules, and add security note
Co-authored-by: Caitie McCaffrey <caitiem20@github.com>
| ``` | ||
|
|
||
| However, we need to consider how to avoid breaking things for existing | ||
| tools that are written the old way. Ideally, we will be able to modify |
There was a problem hiding this comment.
Hmm, this "ideally" is doing a lot of heavy lifting. I don't think we can accept this without a full story here? It seems really non-trivial to make this just work, and if we can't make it just work then this becomes a breaking change for lots of people, which could be a major problem.
I think before we move forward with this SEP we need to figure out what's happening here.
There was a problem hiding this comment.
I agree that we need to figure this out, and I think we'll need help from some of the SDK maintainers to do so. I suspect that there won't be a single answer here; I think we'll need a story in each language, and they may not all be the same. If you can help get the right set of people involved, I'm happy to provide input.
That having been said, I'm not sure that this issue needs to block the SEP from moving forward. One question I would ask here is, if there is some subset of languages in which this change may be a bit disruptive, are we going to reject this SEP as a result? I'm not sure we can afford to do that -- this change is badly needed, and I don't know that we have a good alternative. And if not, it seems like we shouldn't prevent this from moving forward while we investigate this.
We can probably discuss this further at today's meeting.
There was a problem hiding this comment.
I suspect this will require some breaking change to get to a stateless protocol, but agree we need a clear migration path here.
There was a problem hiding this comment.
Chiming in here from the python & typescript sdk side of things; there are a couple of different things we might mean by "breaking things" here, and I think it's worth separating them:
1. If server or client implementers upgrade the SDK, will their code still compile/work?
I'd argue this is a non-goal. The code changes for tool authors are small and mechanical (see the migration demos in typescript-sdk#1597 or even the python snippet right here). This is the kind of migration that can be documented in a migration.md and automated with AI tooling. Trying to keep this non-breaking adds permanent complexity to the SDKs for a one-time migration cost.
2. New client → old server: what happens?
This seems mostly manageable. The client SDK keeps support for the existing SSE-based elicitation flow alongside the new MRTR path, and uses capability negotiation to determine which to use. It's a bit unfortunate to carry the legacy code, but it's contained to the client SDK (not tool authors) and can be deprecated on a timeline tied to a protocol version bump.
3. Old client → new server: what happens?
This is the hard one, and I think it's fundamentally unresolvable. The whole motivation for MRTR is enabling horizontally scaled servers without sticky sessions or long-lived SSE connections. An old client requires those things for elicitation - it expects the server to hold an SSE stream open and send elicitation requests on it. A new server that's deployed for horizontal scalability can't provide that by definition.
We could build a server-side SDK shim that translates new-style tool code back to the old SSE flow, but that only works if the server can actually hold SSE connections - which kind of defeats the purpose of adopting MRTR in the first place. It makes maintenance of such servers more complicated, not less.
I'd propose explicitly making graceful degradation the policy in this case: old clients can still connect to new servers and use tools normally, but elicitation and sampling aren't available until the client upgrades. The impact of this is scoped to tools that actually use server-initiated requests - I believe the vast majority of tools are simple request/response and would be completely unaffected.
There was a problem hiding this comment.
I've addressed the following points that we discussed in today's meeting:
- allow progress on SSE stream followed by incomplete response
- generalize wording to be non-tool-specific in motivation section
- clarify wording on server encryption algorithms (it should be able to use any it wants)
- move "client may stop processing at this point" from step 4 to step 1 of persistent workflow
- note method name in step 5 for persistent
- document why we didn't include the elicitation on tasks/get
- highlight incomplete response with request_state but without input_requests
There is one other item that we talked about today but didn't come to consensus on, which is whether to support returning an error from the tasks/provide_input RPC. That will need further discussion.
Please let me know if I missed anything or got anything wrong. Thanks!
|
|
||
| Since `Tasks` are likely longer running, have state associated with them, and are likely more costly to compute, the request for more information does not end the originally requested operation (e.g., the tool call). Instead, the server can resume processing once the necessary information is provided. | ||
|
|
||
| To align with MRTR semantics, the server will respond to the `task/result` request with a `InputRequests` object. Both of these will have the same JsonRPC `id`. When the client responds with a `InputResponses` object this is a new client request wit a new JSONRPC `id` and therefore needs a new method name. We propose `tasks/input_response`. |
There was a problem hiding this comment.
Are you suggesting that the client use the tasks/result method to provide the input responses? That seems a little odd to me, but I see the logic. It's okay with me if that's our consensus.
@CaitieM20, thoughts?
Adding Information on Error Handling
| - Examples: accessing an agent, spinning up a VM and needing user | ||
| interaction to manipulate the VM | ||
|
|
||
| The vast majority of MCP tools will be ephemeral, and it is extremely |
There was a problem hiding this comment.
An alternative framing of this is that I think is consistent with the shift to the protocol being stateless is that the "persistent" cases are better suited to tasks w/ input_required than elicitation during a tool call.
edit: i see this called out below, might just be an earlier mention
| "method": "elicitation/create", | ||
| "params": { | ||
| "mode": "form", | ||
| "message": "Please provide your GitHub username", |
There was a problem hiding this comment.
nit: I think a stronger example here would help clarify what cases this flow is targeting.
e.g. an argument to the tool that's github_username would be a better fit.
edit: Okay I need to read further! The ADO one is a solid example, but does take more space to explain why it's good. maybe we can come up with a more compact one that doesn't degenerate to tool arguments
|
|
||
| ##### Use Case 1: Rolling Upgrades | ||
|
|
||
| Let's say that you are doing a rolling upgrade of your horizontally |
There was a problem hiding this comment.
would this case be useful for changing tool args as well? not just changing your elicitations, but you could use an elicitation when an optional or new arg was left out (e.g. client had an outdated tool description).
There was a problem hiding this comment.
Yeah, I think that would work if the arg is optional or if there's no input schema. If the arg is required and there is an input schema, then I think the SDK would reject the tool call request before the tool call impl has the opportunity to do this. But presumably if you want to do this, you can just make the arg optional.
Three side-by-side demo servers exploring the migration story for
existing `await elicitInput()`-style handlers under the MRTR SEP
(transports-wg#12), bucketed by migration cost:
1. Simple retry — idempotent; mechanical inversion to
'check inputResponses, else return IncompleteResult'
2. Continuation state — multi-step dialogue; handler becomes a
re-entrant state machine threading requestState through the client
3. Persistent — mutation before elicitation; MRTR alone cannot
rescue this, must migrate to Tasks workflow
Each demo registers <name>_before (today's pattern) and <name>_after
(MRTR pattern) so the diff is visible in one file.
The SDK doesn't have the real types yet so shims.ts stubs
IncompleteResult/InputRequests and carries the MRTR params via a
reserved `arguments._mrtr` key as a transport stand-in.
Relates to the backwards-compatibility discussion at
modelcontextprotocol/transports-wg#12
| This workflow would look like this: | ||
|
|
||
| 1. Client sends tool call request with task metadata. | ||
| 2. Server sends back `inputRequests` response indicating that more information is needed to process the request. This terminates the original request. |
There was a problem hiding this comment.
I'm wondering whether this is worth the potential added complexity for client side logic?
Without this change task augmented tool calls can already do elicitation/sampling in the new stateless transport. So, adding the ability for a task augmented tool call to return an inputRequests response doesn't gain any new functionality, it's only a potential optimization to avoid having to create a task before the initial required elicitation/sampling requests.
There was a problem hiding this comment.
I don't actually think this is going to add any significant complexity to client-side logic. I think the ability to do this falls naturally out of the fact that we need to support both the ephemeral and persistent workflows. Each retry attempt in the ephemeral workflow is essentially a completely new request, and the server can create a task in response to any such request.
Keep in mind that just because the client request indicates that it will accept a task doesn't mean that the server is required to start a task. And also, SEP-2229 is proposing simplifying this such that the client doesn't need to explicitly ask for a task in the first place; the server can just decide to send one whenever it needs to.
| processing the request at this point. | ||
| 2. Client retrieves the Task Status by calling `tasks/get` and sees that more information is needed. | ||
| 3. Client calls `task/result` | ||
| 4. Server returns the `InputRequests` object. |
There was a problem hiding this comment.
This is a breaking change to the tasks endpoint. Currently:
When a receiver receives a
tasks/resultrequest for a task in any other non-terminal status (workingorinput_required), it MUST block the response until the task reaches a terminal status.
Should probably be called out explicitly as a breaking change.
I'm also worried at some of these breaking changes around tasks being quite difficult to implement in the SDKs to support new/old server/client combinations. Especially making it so both the client and server is spec compliant in their interactions (i.e. we could just expose raw request methods and say "good luck" to sdk users but that's not fun haha).
Although, I suppose since we considered tasks to be experimental, we could just purposefully break it and say there's no backwards compatibility path? Either way I think it's worth noting that it's explicitly breaking.
There was a problem hiding this comment.
+1 for using the experimental tag to explain a breaking change here.
There was a problem hiding this comment.
@LucaButBoring may want to weigh in on this.
There was a problem hiding this comment.
Yes, we'll need to make a breaking change to Tasks to support MRTR, and should call that out explicitly. I think the exact shape of the breaking change we make is still open to discussion, however - I plan to open a separate SEP for the relevant changes, which will probably involve (1) introducing a consolidated tasks/continue method that gets called repeatedly to advance the flow, and (2) making a breaking change to tasks/result to make it return an error if the task isn't complete, yet.
For now, can we simply state that MRTR will require breaking changes to Tasks, and that those will be addressed in a subsequent proposal? Let's describe the ephemeral/persistent split, delegate persistent use cases to Tasks, and explain the rationale (mostly as we're doing right now in the SEP), but remove the low-level details and example flow. I'll draft a TWG proposal by Wednesday addressing the full scope of the required changes.
There was a problem hiding this comment.
I don't think it makes sense to remove the tasks workflow from this SEP, because it's really one of the key workflows that we need to work here, and there are a lot of common data structures that we want for both.
Let's try to push forward the tasks simplification SEP in parallel with this one. I'm perfectly happy to update this SEP to use whatever tasks workflow we converge on before it gets finalized.
There was a problem hiding this comment.
| // At least one of the the inputRequests and requestState fields must be | ||
| // present, since the presence of these fields allow the client to | ||
| // determine that the result is incomplete. | ||
| export interface IncompleteResult extends Result { |
There was a problem hiding this comment.
Have you considered this instead being an JSONRPC error?
I am thinking this could potentially cause issues becaues this would mean that tools/call now returns multiple types for the same request.
The only case of this happening before is with task augmented tool calls, but importantly it was dependant on the type of request being sent out. A tool call would only send back a CreateTaskResult but only if the request was task augmented as well. So in the SDKs this is easily separated out, and the result type to try and parse the response into is known ahead of time.
This will probably be a big issues for both the Typescript and Python SDKs which explicitly set the expected return type when making a request:
- Python does this on all requests sent out: https://github.com/modelcontextprotocol/python-sdk/blob/cb07adeca345effa111598889f6a8924a8722e6d/src/mcp/client/session.py#L313
- Same with Typescript: https://github.com/modelcontextprotocol/typescript-sdk/blob/3391bfb57fb05f07482480b83ec63b219d913292/packages/client/src/client/client.ts#L874
So by having this same request return two different types would beak this.
There was a problem hiding this comment.
We'll definitely have to talk about how to handle this.
I don't think we really want this to be an error, because then we have to start extending the error type with individual fields. Also, we want to use this same type in the task-based workflow for persistent tools, and an error doesn't really make sense in that context.
It's worth noting that for tasks, SEP-2229 is proposing that the server be able to return a task to any call, even if the client does not explicitly ask for one. So I think that even for that, the approach of determining the result schema based on the request is going to stop working.
There was a problem hiding this comment.
This SEP is explicitly removing URLElicitationRequiredError error.
|
Thas has now been published as SEP-2322, so please move all further discussion there. Thanks! |
|
Merging this to keep a record of the original draft, similar to IEFT Internet Drafts. Future updates and comments should be directed to SEP-2322. |
This is a draft SEP for Multi Round-Trip Requests (MRTR).
Motivation and Context
See SEP for details.
How Has This Been Tested?
Not tested yet, that will happen when we implement.
Breaking Changes
See SEP for details.
Types of changes
Checklist
Additional context
N/A