
Tool call ID error causing model unload - responding after tool use broken? #48

@DatCaptainHorse

Description

...
2025-11-10 14:17:55,786 - INFO - 192.168.32.85:32898 - "POST /v1/chat/completions HTTP/1.1" 200
2025-11-10 14:17:56,448 - ERROR - LLM inference failed!
Traceback (most recent call last):
  File "/mnt/Mediaz/Projekt/LLMao/OpenArc/src/server/worker_registry.py", line 86, in infer_llm
    async for item in llm_instance.generate_type(packet.gen_config):
  File "/mnt/Mediaz/Projekt/LLMao/OpenArc/src/engine/ov_genai/llm.py", line 96, in generate_text
    prompt_token_ids = self.prepare_inputs(gen_config.messages, gen_config.tools)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/Mediaz/Projekt/LLMao/OpenArc/src/engine/ov_genai/llm.py", line 47, in prepare_inputs
    prompt_token_ids = self.encoder_tokenizer.apply_chat_template(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/Mediaz/Projekt/LLMao/OpenArc/.venv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1652, in apply_chat_template
    rendered_chat, generation_indices = render_jinja_template(
                                        ^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/Mediaz/Projekt/LLMao/OpenArc/.venv/lib/python3.11/site-packages/transformers/utils/chat_template_utils.py", line 498, in render_jinja_template
    rendered_chat = compiled_template.render(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/Mediaz/Projekt/LLMao/OpenArc/.venv/lib/python3.11/site-packages/jinja2/environment.py", line 1295, in render
    self.environment.handle_exception()
  File "/mnt/Mediaz/Projekt/LLMao/OpenArc/.venv/lib/python3.11/site-packages/jinja2/environment.py", line 942, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 62, in top-level template code
  File "/mnt/Mediaz/Projekt/LLMao/OpenArc/.venv/lib/python3.11/site-packages/jinja2/sandbox.py", line 401, in call
    return __context.call(__obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/Mediaz/Projekt/LLMao/OpenArc/.venv/lib/python3.11/site-packages/transformers/utils/chat_template_utils.py", line 423, in raise_exception
    raise jinja2.exceptions.TemplateError(message)
jinja2.exceptions.TemplateError: Tool call IDs should be alphanumeric strings with length 9!
2025-11-10 14:17:56,449 - ERROR - [Ministral-8B LLM Worker] Inference failed, triggering model unload...
2025-11-10 14:17:56,747 - INFO - [Ministral-8B] unloaded successfully

Unloading the model whenever jinja templating fails feels a bit aggressive 🤔. I'm also not sure whether there's a more general issue with tool calling: I tried both Letta and AutoGen, and both triggered the error above when having the model respond after tool use.

I'm just throwing this out there, so if you need any further context or information, or want me to make code changes for debugging, let me know 👍

Screenshots from the Letta side.

With llama.cpp, tool calls are executed just fine and the ID is valid:
[screenshot]

With OpenArc, the ID is invalid (it contains `call_`):
[screenshot]
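For what it's worth, here's a rough sketch of one possible server-side workaround: rewriting incoming OpenAI-style IDs (e.g. `call_abc123...`) into the 9-character alphanumeric form the template demands, before calling `apply_chat_template`. This assumes the server can intercept the message list first; `normalize_tool_call_id` and `normalize_messages` are invented helper names, not existing OpenArc code:

```python
import hashlib


def normalize_tool_call_id(raw_id: str) -> str:
    """Map an arbitrary tool call ID to a 9-character alphanumeric string.

    Deterministic (a hash of the raw ID), so the assistant's tool_calls
    entry and the matching 'tool' role message stay consistent.
    """
    if len(raw_id) == 9 and raw_id.isalnum():
        return raw_id  # already in the expected form
    # sha256 hexdigest is lowercase hex, so any 9-char slice is alphanumeric
    return hashlib.sha256(raw_id.encode("utf-8")).hexdigest()[:9]


def normalize_messages(messages: list[dict]) -> list[dict]:
    """Return a copy of the message list with all tool call IDs normalized."""
    fixed = []
    for msg in messages:
        msg = dict(msg)
        if msg.get("tool_calls"):
            msg["tool_calls"] = [
                {**tc, "id": normalize_tool_call_id(tc["id"])}
                for tc in msg["tool_calls"]
            ]
        if msg.get("role") == "tool" and "tool_call_id" in msg:
            msg["tool_call_id"] = normalize_tool_call_id(msg["tool_call_id"])
        fixed.append(msg)
    return fixed
```

Since the mapping is a deterministic hash, the same `call_...` ID coming back from the client in the follow-up `tool` message normalizes to the same 9-character ID, so the template's cross-reference between the call and its result still matches.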

Also, it seems OpenArc can't handle multiple consecutive user messages before an assistant response; jinja fails in that case as well, while llama.cpp handles it gracefully.
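For that second case, some templates require strict user/assistant alternation, so one common workaround is to merge consecutive same-role user messages before templating. A minimal sketch, assuming message dicts with plain string `content` (the helper name is invented, not OpenArc code):

```python
def merge_consecutive_user_messages(messages: list[dict]) -> list[dict]:
    """Collapse runs of consecutive 'user' messages into one message,
    joining their contents with a blank line, so strict-alternation
    chat templates don't reject the conversation."""
    merged: list[dict] = []
    for msg in messages:
        if (
            merged
            and msg.get("role") == "user"
            and merged[-1].get("role") == "user"
        ):
            merged[-1] = {
                **merged[-1],
                "content": merged[-1]["content"] + "\n\n" + msg["content"],
            }
        else:
            merged.append(dict(msg))
    return merged
```

This loses the message boundaries, of course, but it at least keeps the template from raising, which seems preferable to unloading the model.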
