Improve tool call success rate: Allow arbitrary tool call parameter ordering for up to 8 params #13
Open
florianbrede-ayet wants to merge 1 commit into pwilkin:autoparser
Thanks @pwilkin for working on this branch; it's a nice improvement over the fixed (and often partially broken) chat templates.
However, I noticed tool call failures, especially under mistral-vibe, which I could reliably reproduce (`read_file` with `offset` and `limit`). This seems to happen for any qwen35 model, including 27b dense (albeit with a higher chance of the correct order).
I debugged your autoparser with Claude and found that it enforces a strict parameter order for tool calls.
With its native XML tool calls, Qwen has a very high chance of generating the parameter entities in one particular order, which does not necessarily match the order the autoparser expects. I also tested this at different temperatures and penalties; with different seeds, whether a tool call succeeds is essentially a matter of chance.
To limit the number of permutations, I capped the "allow any order" behavior at 8 parameters; tools with more parameters fall back to sequential order.
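For reviewers, here is a minimal sketch of the acceptance rule described above. It is not the code in this PR: `match_parameters` and `MAX_ANY_ORDER_PARAMS` are hypothetical names, and the sketch assumes every parameter is optional and identified by name only. It is worth noting that 8 parameters already allow 8! = 40320 orderings, which is presumably why some cap is needed if orderings are enumerated explicitly.

```python
# Hypothetical illustration of the rule in this PR, not the autoparser code.
MAX_ANY_ORDER_PARAMS = 8  # cap taken from the PR description


def match_parameters(expected: list[str], generated: list[str]) -> bool:
    """Return True if the generated parameter names are acceptable.

    Assumption: all parameters are optional. Tools with at most
    MAX_ANY_ORDER_PARAMS declared parameters accept any order; larger
    tools accept only the declared (sequential) order.
    """
    # Reject duplicated parameters.
    if len(set(generated)) != len(generated):
        return False
    # Reject names the tool does not declare.
    if any(name not in expected for name in generated):
        return False
    # Small tools: any permutation of known, unique names is fine.
    if len(expected) <= MAX_ANY_ORDER_PARAMS:
        return True
    # Large tools: fall back to requiring the declared order.
    positions = [expected.index(name) for name in generated]
    return positions == sorted(positions)
```

With this rule, `match_parameters(["path", "offset", "limit"], ["path", "limit", "offset"])` is accepted, whereas a strict-order check would reject it.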
Disclosure: The code was mostly written by Opus. I ran the tests, built it against ROCm, and tested it with several hundred tool calls. I don't have a CI/CD setup locally.