Skip to content

Comments

feat: per task chat template and extra body args#672

Open
cmunley1 wants to merge 4 commits intomainfrom
cmunley1/chat-template-per-task
Open

feat: per task chat template and extra body args#672
cmunley1 wants to merge 4 commits intomainfrom
cmunley1/chat-template-per-task

Conversation

@cmunley1
Copy link
Contributor

@cmunley1 cmunley1 commented Feb 11, 2026

support different chat template kwargs and extra body params per task, such as reasoning on/off or thinking budget.

see #619

I have not tested extra body outside of the new tests (tho I hope to with thinking budget). Only tested chat template enable thinking on/off with single-step no tool environment. We should test in training and with tools.

this allows adding to tasks in dataset:

"metadata": {
      "chat_template_kwargs": "{\"enable_thinking\": false}"
    }

or true.

then we get results like:

{
  "responses_create_params": {
    "background": null,
    "include": null,
    "input": [
      {
        "content": "A very special island is inhabited only by pioneers and laggards. Pioneers always tell the truth, and laggards always lie. You meet 2 inhabitants: Olivia, and Lily. \"Lily is a laggard and Lily is a pioneer\" - Olivia. Lily noted, \"if Olivia is a pioneer then Lily is a pioneer\". So who is a pioneer and who is a laggard? (Format your answer like: \"Olivia is a pioneer/laggard, and Lily is a pioneer/laggard\")",
        "role": "user",
        "type": "message"
      }
    ],
    "instructions": null,
    "max_output_tokens": null,
    "max_tool_calls": null,
    "metadata": {
      "chat_template_kwargs": "{\"enable_thinking\": false}"
    },
    "model": null,
    "parallel_tool_calls": true,
    "previous_response_id": null,
    "prompt": null,
    "reasoning": null,
    "service_tier": null,
    "store": null,
    "temperature": null,
    "text": null,
    "tool_choice": "auto",
    "tools": [],
    "top_logprobs": null,
    "top_p": null,
    "truncation": null,
    "user": null,
    "stream": null
  },
  "response": {
    "id": "resp_cb9d84dd26534667a546a362d3d773d1",
    "created_at": 1770849328.0,
    "error": null,
    "incomplete_details": null,
    "instructions": null,
    "metadata": {
      "chat_template_kwargs": "{\"enable_thinking\": false}"
    },
    "model": "Qwen/Qwen3-0.6B",
    "object": "response",
    "output": [
      {
        "id": "msg_18dfdb3049d54ffca1d0ed9c42ec398d",
        "content": [
          {
            "annotations": [],
            "text": "Olivia is a **pioneer**, and Lily is a **laggard**.  \n\n**Explanation:**  \n- Olivia says, \"Lily is a laggard and Lily is a pioneer.\"  \n  - Since Olivia is a pioneer, she tells the truth.  \n  - Therefore, Lily must be both a laggard and a pioneer.  \n- Lily says, \"If Olivia is a pioneer then Lily is a pioneer.\"  \n  - This is a logical implication.  \n  - Since Olivia is a pioneer, the implication becomes true only if Lily is a pioneer.  \n  - But Lily is a laggard, so this implication is false.  \n  - Therefore, Lily must be lying.",
            "type": "output_text",
            "logprobs": null
          }
        ],
        "role": "assistant",
        "status": "completed",
        "type": "message"
      }
    ],
    "parallel_tool_calls": true,
    "temperature": null,
    "tool_choice": "auto",
    "tools": [],
    "top_p": null,
    "background": null,
    "conversation": null,
    "max_output_tokens": null,
    "max_tool_calls": null,
    "previous_response_id": null,
    "prompt": null,
    "prompt_cache_key": null,
    "reasoning": null,
    "safety_identifier": null,
    "service_tier": null,
    "status": null,
    "text": null,
    "top_logprobs": null,
    "truncation": null,
    "usage": null,
    "user": null
  },

or

{
  "responses_create_params": {
    "background": null,
    "include": null,
    "input": [
      {
        "content": "A very special island is inhabited only by pioneers and laggards. Pioneers always tell the truth, and laggards always lie. You meet 2 inhabitants: Olivia, and Lily. \"Lily is a laggard and Lily is a pioneer\" - Olivia. Lily noted, \"if Olivia is a pioneer then Lily is a pioneer\". So who is a pioneer and who is a laggard? (Format your answer like: \"Olivia is a pioneer/laggard, and Lily is a pioneer/laggard\")",
        "role": "user",
        "type": "message"
      }
    ],
    "instructions": null,
    "max_output_tokens": null,
    "max_tool_calls": null,
    "metadata": {
      "chat_template_kwargs": "{\"enable_thinking\": true}"
    },
    "model": null,
    "parallel_tool_calls": true,
    "previous_response_id": null,
    "prompt": null,
    "reasoning": null,
    "service_tier": null,
    "store": null,
    "temperature": null,
    "text": null,
    "tool_choice": "auto",
    "tools": [],
    "top_logprobs": null,
    "top_p": null,
    "truncation": null,
    "user": null,
    "stream": null
  },
  "response": {
    "id": "resp_3937b9aab8204d63bd9e48d7869b90be",
    "created_at": 1770849330.0,
    "error": null,
    "incomplete_details": null,
    "instructions": null,
    "metadata": {
      "chat_template_kwargs": "{\"enable_thinking\": true}"
    },
    "model": "Qwen/Qwen3-0.6B",
    "object": "response",
    "output": [
      {
        "id": "rs_398d14afc57c4cf38d65afec2b3f1b27",
        "summary": [
          {
            "text": "\nOkay, let's try to figure out who is a pioneer and who is a laggard here. So, there are two people: Olivia and Lily. Pioneers always tell the truth, and laggards always lie. \n\nFirst, Olivia says, \"Lily is a laggard and Lily is a pioneer.\" Wait, that seems contradictory. Because if Olivia is a pioneer, then she's telling the truth, so both parts of her statement must be true. But Lily being a laggard and a pioneer can't both be true. That's impossible. So maybe Olivia is lying? But if she's a pioneer, then she can't lie. Hmm, this is confusing.\n\nLet me break it down. Olivia's statement is a conjunction: \"Lily is a laggard AND Lily is a pioneer.\" Since both parts must be true for the conjunction to be true, but a person can't be both a laggard and a pioneer. Therefore, Olivia's statement is impossible. Therefore, Olivia must be a laggard because she's lying. But wait, if she's a laggard, then her statement is false. So the negation of her statement would be \"Lily is not a laggard OR Lily is not a pioneer.\" But Lily's status is either laggard or pioneer. So maybe Lily is a pioneer? But then Olivia's statement being false would mean that either Lily isn't a laggard (which would be true if Lily is a pioneer) or she isn't a pioneer (which would be false if she is). So the negation is true. Therefore, Olivia is lying, so she's a laggard. \n\nNow, moving to Lily's statement: \"If Olivia is a pioneer, then Lily is a pioneer.\" Let's analyze this. Since Lily is either a pioneer or a laggard. If Lily is a pioneer, then her statement must be true. If she's a laggard, then her statement is false. Let's check both possibilities.\n\nCase 1: Suppose Olivia is a pioneer. Then her statement is \"Lily is a laggard AND Lily is a pioneer.\" But that's impossible. So Olivia cannot be a pioneer. Therefore, Olivia must be a laggard. Which we already established. \n\nSo now, since Olivia is a laggard, her statement is false. Her statement was \"Lily is a laggard AND Lily is a pioneer.\" So the negation is \"Lily is not a laggard OR Lily is not a pioneer.\" Since Lily is either a pioneer or a laggard, this negation is always true. Therefore, Lily could be either. But let's check Lily's own statement.\n\nIf Lily is a pioneer, then her statement must be true: \"If Olivia is a pioneer, then Lily is a pioneer.\" Since Olivia is a laggard (so Olivia is not a pioneer), the implication \"If P then Q\" is automatically true, regardless of Q. Therefore, Lily's statement would be true, which is consistent with her being a pioneer. \n\nIf Lily were a laggard, then her statement would be false. The statement is \"If Olivia is a pioneer, then Lily is a pioneer.\" Since Olivia is a laggard, the antecedent (Olivia is a pioneer) is false, making the implication true regardless. But a laggard cannot say a true statement. Therefore, Lily cannot be a laggard because her statement would be true. Therefore, Lily must be a pioneer. \n\nSo putting it all together: Olivia is a laggard, Lily is a pioneer. \n\nWait, let me double-check. Olivia's status: since she can't be a pioneer (because her statement is impossible), she's a laggard. Lily is a pioneer, so her statement is true. Therefore, the answer is Olivia is a laggard, and Lily is a pioneer.\n",
            "type": "summary_text"
          }
        ],
        "type": "reasoning",
        "encrypted_content": null
      },
      {
        "id": "msg_898c246655dd470ba6f7a75fd81ba1e0",
        "content": [
          {
            "annotations": [],
            "text": "\n\nOlivia is a laggard, and Lily is a pioneer. \n\n\"Olivia is a laggard/laggard, and Lily is a pioneer/pioneer\"",
            "type": "output_text",
            "logprobs": null
          }
        ],
        "role": "assistant",
        "status": "completed",
        "type": "message"
      }
    ],
    "parallel_tool_calls": true,
    "temperature": null,
    "tool_choice": "auto",
    "tools": [],
    "top_p": null,
    "background": null,
    "conversation": null,
    "max_output_tokens": null,
    "max_tool_calls": null,
    "previous_response_id": null,
    "prompt": null,
    "prompt_cache_key": null,
    "reasoning": null,
    "safety_identifier": null,
    "service_tier": null,
    "status": null,
    "text": null,
    "top_logprobs": null,
    "truncation": null,
    "usage": null,
    "user": null
  },

Signed-off-by: Christian Munley <cmunley@nvidia.com>
@copy-pr-bot
Copy link

copy-pr-bot bot commented Feb 11, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@cmunley1 cmunley1 marked this pull request as ready for review February 11, 2026 22:44
if body_dict.get("metadata") and isinstance(body_dict["metadata"], dict):
metadata_chat_kwargs_str = body_dict["metadata"].get("chat_template_kwargs")
if metadata_chat_kwargs_str:
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we take out the try except here and fail loudly when we cannot decode properly pls?

chat_template_kwargs = deepcopy(self.config.chat_template_kwargs)

# Merge global config chat_template_kwargs with per-request overrides in metadata (e.g. per-sample reasoning on/off)
if body_dict.get("metadata") and isinstance(body_dict["metadata"], dict):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we apply to both chat template kwargs and extra_body?

Signed-off-by: Christian Munley <cmunley@nvidia.com>
Signed-off-by: Christian Munley <cmunley@nvidia.com>
@cmunley1 cmunley1 requested a review from bxyu-nvidia February 12, 2026 00:06
@cmunley1 cmunley1 changed the title feat: per task chat template args feat: per task chat template and extra body args Feb 12, 2026
@cmunley1
Copy link
Contributor Author

tested extra body, is working with thinking budget

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants