Description
I started testing Spacebot and noticed that the prompt cache was never being used when chatting with the Channel agent.
I'm using local models via llama.cpp/llama-proxy/litellm. I'm running a fresh Spacebot instance, and the inference server was not receiving any other queries that could interfere with prompt caching.
When querying the Cortex agent directly via the web-ui, prompt caching does work as expected.
I did a quick diff check on the Channel agent messages and it revealed that the system prompt changes with every request.
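To illustrate why this defeats prefix-based prompt caching: the cache can only reuse the longest matching prefix of the prompt, so a timestamp near the top of the system prompt invalidates nearly everything after it. A minimal sketch (byte-level comparison for simplicity, with hypothetical timestamps; real caches match on tokens):

```rust
/// Length of the shared prefix between two prompts -- roughly the
/// portion a prefix-based prompt cache could reuse. (Simplified:
/// compares bytes, while real caches compare token sequences.)
fn shared_prefix_len(a: &str, b: &str) -> usize {
    a.bytes().zip(b.bytes()).take_while(|(x, y)| x == y).count()
}

fn main() {
    // Hypothetical system prompts for two consecutive requests.
    let prompt_a = "## System\nTime: 10:41:03\n...rest of a long prompt...";
    let prompt_b = "## System\nTime: 10:41:17\n...rest of a long prompt...";
    // The timestamp differs a few bytes in, so only a tiny prefix
    // matches and the remainder must be reprocessed on every request.
    println!("reusable prefix: {} bytes", shared_prefix_len(prompt_a, prompt_b));
}
```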
The offending section appears to be the current system time, which is embedded in the system prompt and therefore changes with every request.
This seems to be by design:
Line 122 in 6f81f39:

```rust
let mut output = String::from("## System\n");
```
but perhaps there is another way to accomplish this that is not so expensive? I'm not entirely sure whether having the agent call a tool to check the current timestamp when relevant would work reliably (either based on the model simply knowing the tool exists, or by tweaking the system prompt to nudge it to call the tool eagerly).
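One cheaper alternative might be to keep the time in the prompt but round it down to a coarse bucket, so the prompt prefix only changes once per bucket instead of on every request. A minimal sketch, not Spacebot's actual code; the one-hour `bucket_secs` granularity is my assumption:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Round a unix timestamp down to the nearest bucket so the time
/// string embedded in the system prompt stays stable within the bucket.
fn bucketed_unix_time(now_secs: u64, bucket_secs: u64) -> u64 {
    now_secs - (now_secs % bucket_secs)
}

fn main() {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock before epoch")
        .as_secs();
    // One-hour buckets: all requests within the same hour produce an
    // identical system prompt, so the prompt cache can be reused.
    let stable = bucketed_unix_time(now, 3600);
    let mut output = String::from("## System\n");
    output.push_str(&format!("Time: {stable}\n"));
    println!("{output}");
}
```

The trade-off is that the agent's notion of "now" can be up to one bucket stale, which may or may not be acceptable depending on how the time is used.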
(Side note: is that newline between `## System` and `Time:` supposed to be there?)