feat: per task chat template and extra body args by cmunley1 · Pull Request #672 · NVIDIA-NeMo/Gym

cmunley1 · 2026-02-11T22:19:35Z

support different chat template kwargs and extra body params per task, such as reasoning on/off or thinking budget.

I have not tested extra body outside of the new tests (tho I hope to with thinking budget). Only tested chat template enable thinking on/off with single-step no tool environment. We should test in training and with tools.

this allows adding to tasks in dataset:

"metadata": {
      "chat_template_kwargs": "{\"enable_thinking\": false}"
    }

or true.

then we get results like:

{
  "responses_create_params": {
    "background": null,
    "include": null,
    "input": [
      {
        "content": "A very special island is inhabited only by pioneers and laggards. Pioneers always tell the truth, and laggards always lie. You meet 2 inhabitants: Olivia, and Lily. \"Lily is a laggard and Lily is a pioneer\" - Olivia. Lily noted, \"if Olivia is a pioneer then Lily is a pioneer\". So who is a pioneer and who is a laggard? (Format your answer like: \"Olivia is a pioneer/laggard, and Lily is a pioneer/laggard\")",
        "role": "user",
        "type": "message"
      }
    ],
    "instructions": null,
    "max_output_tokens": null,
    "max_tool_calls": null,
    "metadata": {
      "chat_template_kwargs": "{\"enable_thinking\": false}"
    },
    "model": null,
    "parallel_tool_calls": true,
    "previous_response_id": null,
    "prompt": null,
    "reasoning": null,
    "service_tier": null,
    "store": null,
    "temperature": null,
    "text": null,
    "tool_choice": "auto",
    "tools": [],
    "top_logprobs": null,
    "top_p": null,
    "truncation": null,
    "user": null,
    "stream": null
  },
  "response": {
    "id": "resp_cb9d84dd26534667a546a362d3d773d1",
    "created_at": 1770849328.0,
    "error": null,
    "incomplete_details": null,
    "instructions": null,
    "metadata": {
      "chat_template_kwargs": "{\"enable_thinking\": false}"
    },
    "model": "Qwen/Qwen3-0.6B",
    "object": "response",
    "output": [
      {
        "id": "msg_18dfdb3049d54ffca1d0ed9c42ec398d",
        "content": [
          {
            "annotations": [],
            "text": "Olivia is a **pioneer**, and Lily is a **laggard**.  \n\n**Explanation:**  \n- Olivia says, \"Lily is a laggard and Lily is a pioneer.\"  \n  - Since Olivia is a pioneer, she tells the truth.  \n  - Therefore, Lily must be both a laggard and a pioneer.  \n- Lily says, \"If Olivia is a pioneer then Lily is a pioneer.\"  \n  - This is a logical implication.  \n  - Since Olivia is a pioneer, the implication becomes true only if Lily is a pioneer.  \n  - But Lily is a laggard, so this implication is false.  \n  - Therefore, Lily must be lying.",
            "type": "output_text",
            "logprobs": null
          }
        ],
        "role": "assistant",
        "status": "completed",
        "type": "message"
      }
    ],
    "parallel_tool_calls": true,
    "temperature": null,
    "tool_choice": "auto",
    "tools": [],
    "top_p": null,
    "background": null,
    "conversation": null,
    "max_output_tokens": null,
    "max_tool_calls": null,
    "previous_response_id": null,
    "prompt": null,
    "prompt_cache_key": null,
    "reasoning": null,
    "safety_identifier": null,
    "service_tier": null,
    "status": null,
    "text": null,
    "top_logprobs": null,
    "truncation": null,
    "usage": null,
    "user": null
  },

or

{
  "responses_create_params": {
    "background": null,
    "include": null,
    "input": [
      {
        "content": "A very special island is inhabited only by pioneers and laggards. Pioneers always tell the truth, and laggards always lie. You meet 2 inhabitants: Olivia, and Lily. \"Lily is a laggard and Lily is a pioneer\" - Olivia. Lily noted, \"if Olivia is a pioneer then Lily is a pioneer\". So who is a pioneer and who is a laggard? (Format your answer like: \"Olivia is a pioneer/laggard, and Lily is a pioneer/laggard\")",
        "role": "user",
        "type": "message"
      }
    ],
    "instructions": null,
    "max_output_tokens": null,
    "max_tool_calls": null,
    "metadata": {
      "chat_template_kwargs": "{\"enable_thinking\": true}"
    },
    "model": null,
    "parallel_tool_calls": true,
    "previous_response_id": null,
    "prompt": null,
    "reasoning": null,
    "service_tier": null,
    "store": null,
    "temperature": null,
    "text": null,
    "tool_choice": "auto",
    "tools": [],
    "top_logprobs": null,
    "top_p": null,
    "truncation": null,
    "user": null,
    "stream": null
  },
  "response": {
    "id": "resp_3937b9aab8204d63bd9e48d7869b90be",
    "created_at": 1770849330.0,
    "error": null,
    "incomplete_details": null,
    "instructions": null,
    "metadata": {
      "chat_template_kwargs": "{\"enable_thinking\": true}"
    },
    "model": "Qwen/Qwen3-0.6B",
    "object": "response",
    "output": [
      {
        "id": "rs_398d14afc57c4cf38d65afec2b3f1b27",
        "summary": [
          {
            "text": "\nOkay, let's try to figure out who is a pioneer and who is a laggard here. So, there are two people: Olivia and Lily. Pioneers always tell the truth, and laggards always lie. \n\nFirst, Olivia says, \"Lily is a laggard and Lily is a pioneer.\" Wait, that seems contradictory. Because if Olivia is a pioneer, then she's telling the truth, so both parts of her statement must be true. But Lily being a laggard and a pioneer can't both be true. That's impossible. So maybe Olivia is lying? But if she's a pioneer, then she can't lie. Hmm, this is confusing.\n\nLet me break it down. Olivia's statement is a conjunction: \"Lily is a laggard AND Lily is a pioneer.\" Since both parts must be true for the conjunction to be true, but a person can't be both a laggard and a pioneer. Therefore, Olivia's statement is impossible. Therefore, Olivia must be a laggard because she's lying. But wait, if she's a laggard, then her statement is false. So the negation of her statement would be \"Lily is not a laggard OR Lily is not a pioneer.\" But Lily's status is either laggard or pioneer. So maybe Lily is a pioneer? But then Olivia's statement being false would mean that either Lily isn't a laggard (which would be true if Lily is a pioneer) or she isn't a pioneer (which would be false if she is). So the negation is true. Therefore, Olivia is lying, so she's a laggard. \n\nNow, moving to Lily's statement: \"If Olivia is a pioneer, then Lily is a pioneer.\" Let's analyze this. Since Lily is either a pioneer or a laggard. If Lily is a pioneer, then her statement must be true. If she's a laggard, then her statement is false. Let's check both possibilities.\n\nCase 1: Suppose Olivia is a pioneer. Then her statement is \"Lily is a laggard AND Lily is a pioneer.\" But that's impossible. So Olivia cannot be a pioneer. Therefore, Olivia must be a laggard. Which we already established. \n\nSo now, since Olivia is a laggard, her statement is false. Her statement was \"Lily is a laggard AND Lily is a pioneer.\" So the negation is \"Lily is not a laggard OR Lily is not a pioneer.\" Since Lily is either a pioneer or a laggard, this negation is always true. Therefore, Lily could be either. But let's check Lily's own statement.\n\nIf Lily is a pioneer, then her statement must be true: \"If Olivia is a pioneer, then Lily is a pioneer.\" Since Olivia is a laggard (so Olivia is not a pioneer), the implication \"If P then Q\" is automatically true, regardless of Q. Therefore, Lily's statement would be true, which is consistent with her being a pioneer. \n\nIf Lily were a laggard, then her statement would be false. The statement is \"If Olivia is a pioneer, then Lily is a pioneer.\" Since Olivia is a laggard, the antecedent (Olivia is a pioneer) is false, making the implication true regardless. But a laggard cannot say a true statement. Therefore, Lily cannot be a laggard because her statement would be true. Therefore, Lily must be a pioneer. \n\nSo putting it all together: Olivia is a laggard, Lily is a pioneer. \n\nWait, let me double-check. Olivia's status: since she can't be a pioneer (because her statement is impossible), she's a laggard. Lily is a pioneer, so her statement is true. Therefore, the answer is Olivia is a laggard, and Lily is a pioneer.\n",
            "type": "summary_text"
          }
        ],
        "type": "reasoning",
        "encrypted_content": null
      },
      {
        "id": "msg_898c246655dd470ba6f7a75fd81ba1e0",
        "content": [
          {
            "annotations": [],
            "text": "\n\nOlivia is a laggard, and Lily is a pioneer. \n\n\"Olivia is a laggard/laggard, and Lily is a pioneer/pioneer\"",
            "type": "output_text",
            "logprobs": null
          }
        ],
        "role": "assistant",
        "status": "completed",
        "type": "message"
      }
    ],
    "parallel_tool_calls": true,
    "temperature": null,
    "tool_choice": "auto",
    "tools": [],
    "top_p": null,
    "background": null,
    "conversation": null,
    "max_output_tokens": null,
    "max_tool_calls": null,
    "previous_response_id": null,
    "prompt": null,
    "prompt_cache_key": null,
    "reasoning": null,
    "safety_identifier": null,
    "service_tier": null,
    "status": null,
    "text": null,
    "top_logprobs": null,
    "truncation": null,
    "usage": null,
    "user": null
  },

Signed-off-by: Christian Munley <cmunley@nvidia.com>

copy-pr-bot · 2026-02-11T22:19:40Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

bxyu-nvidia · 2026-02-11T23:07:06Z

responses_api_models/vllm_model/app.py

+        if body_dict.get("metadata") and isinstance(body_dict["metadata"], dict):
+            metadata_chat_kwargs_str = body_dict["metadata"].get("chat_template_kwargs")
+            if metadata_chat_kwargs_str:
+                try:


can we take out the try except here and fail loudly when we cannot decode properly pls?

bxyu-nvidia · 2026-02-11T23:07:15Z

responses_api_models/vllm_model/app.py

+            chat_template_kwargs = deepcopy(self.config.chat_template_kwargs)
+
+        # Merge global config chat_template_kwargs with per-request overrides in metadata (e.g. per-sample reasoning on/off)
+        if body_dict.get("metadata") and isinstance(body_dict["metadata"], dict):


can we apply to both chat template kwargs and extra_body?

Signed-off-by: Christian Munley <cmunley@nvidia.com>

cmunley1 · 2026-02-20T23:20:40Z

tested extra body, is working with thinking budget

per sample chat template support

10a06cc

Signed-off-by: Christian Munley <cmunley@nvidia.com>

cmunley1 requested review from bxyu-nvidia and hrossnv February 11, 2026 22:41

cmunley1 marked this pull request as ready for review February 11, 2026 22:44

bxyu-nvidia requested changes Feb 11, 2026

View reviewed changes

cmunley1 added 2 commits February 11, 2026 15:38

extra body per task

6ce97d7

Signed-off-by: Christian Munley <cmunley@nvidia.com>

test

1327b6f

Signed-off-by: Christian Munley <cmunley@nvidia.com>

cmunley1 requested a review from bxyu-nvidia February 12, 2026 00:06

cmunley1 changed the title ~~feat: per task chat template args~~ feat: per task chat template and extra body args Feb 12, 2026

Merge branch 'main' into cmunley1/chat-template-per-task

bdc2839

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

feat: per task chat template and extra body args#672

feat: per task chat template and extra body args#672
cmunley1 wants to merge 4 commits intomainfrom
cmunley1/chat-template-per-task

cmunley1 commented Feb 11, 2026 •

edited

Loading

Uh oh!

copy-pr-bot bot commented Feb 11, 2026

Uh oh!

bxyu-nvidia Feb 11, 2026

Uh oh!

bxyu-nvidia Feb 11, 2026

Uh oh!

cmunley1 commented Feb 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Comments

Conversation

cmunley1 commented Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

copy-pr-bot bot commented Feb 11, 2026

Uh oh!

bxyu-nvidia Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

bxyu-nvidia Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

cmunley1 commented Feb 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cmunley1 commented Feb 11, 2026 •

edited

Loading