Conversation
There was a problem hiding this comment.
Pull request overview
Adds LFM2 tool-call parsing support to OVMS LLM I/O processing, wiring it into the existing OutputParser tool-parser selection and providing a dedicated LFM2 output parser test suite.
Changes:
- Introduced
Lfm2ToolParser(unary + streaming parsing) undersrc/llm/io_processing/lfm2/. - Registered the new parser in
OutputParser(parser name"lfm2") and Bazel build targets. - Added comprehensive LFM2 output parser tests (including streaming scenarios and malformed inputs).
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
src/test/llm/output_parsers/lfm2_output_parser_test.cpp |
New gtest coverage for LFM2 tool-call parsing (unary + streaming). |
src/llm/io_processing/output_parser.cpp |
Registers "lfm2" tool parser and includes its header. |
src/llm/io_processing/lfm2/tool_parser.hpp |
Declares Lfm2ToolParser state machine and parsing helpers. |
src/llm/io_processing/lfm2/tool_parser.cpp |
Implements LFM2 unary + streaming parsing and argument normalization. |
src/llm/BUILD |
Adds Bazel library target for the LFM2 parser and links it into output_parsers. |
| argument.value = argumentStr.substr(equalPos + 1); | ||
|
|
||
| if (argument.value[0] == '\'' && argument.value.back() == '\'') { | ||
| argument.value[0] = '"'; | ||
| argument.value[argument.value.size() - 1] = '"'; | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Argument value is enclosed in single quotes, replaced with double quotes. Modified value: {}", argument.value); | ||
| } | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed argument - name: {}, value: {}", argument.name, argument.value); |
There was a problem hiding this comment.
parseSingleArgument indexes argument.value[0] and argument.value.back() without checking whether the value is empty. Inputs like arg= (or trailing = due to malformed output) will make argument.value empty and cause out-of-bounds access. Please guard for empty values before indexing and mark the argument invalid or treat it as an empty string safely.
| argument.value = argumentStr.substr(equalPos + 1); | |
| if (argument.value[0] == '\'' && argument.value.back() == '\'') { | |
| argument.value[0] = '"'; | |
| argument.value[argument.value.size() - 1] = '"'; | |
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Argument value is enclosed in single quotes, replaced with double quotes. Modified value: {}", argument.value); | |
| } | |
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed argument - name: {}, value: {}", argument.name, argument.value); | |
| if (equalPos + 1 < argumentStr.size()) { | |
| argument.value = argumentStr.substr(equalPos + 1); | |
| } else { | |
| argument.value.clear(); | |
| } | |
| if (argument.value.empty()) { | |
| argument.isValid = false; | |
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Argument '{}' has an empty value after '=', marking argument as invalid", argument.name); | |
| } else { | |
| if (argument.value[0] == '\'' && argument.value.back() == '\'') { | |
| argument.value[0] = '"'; | |
| argument.value[argument.value.size() - 1] = '"'; | |
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Argument value is enclosed in single quotes, replaced with double quotes. Modified value: {}", argument.value); | |
| } | |
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed argument - name: {}, value: {}", argument.name, argument.value); | |
| } |
| void Lfm2ToolParser::writeArgumentOfAnyType(const std::string& arg, rapidjson::Writer<rapidjson::StringBuffer>& writer) { | ||
| std::string normalized = normalizeArgStr(arg); | ||
|
|
||
| rapidjson::Document doc; | ||
| doc.Parse(normalized.c_str()); | ||
|
|
||
| if (doc.HasParseError()) { | ||
| SPDLOG_LOGGER_ERROR(llm_calculator_logger, "Failed to parse argument string as JSON. Argument string: {}", normalized); | ||
| return; | ||
| } | ||
|
|
||
| rapidjson::Value& argumentDoc = doc; | ||
| writeArgumentOfAnyType(argumentDoc, writer); | ||
| } |
There was a problem hiding this comment.
writeArgumentOfAnyType(const std::string&, ...) returns early on JSON parse errors. Callers have already emitted the JSON key, so returning without writing a value can produce invalid JSON (and may trigger RapidJSON writer assertions). Instead, always emit a value (e.g., fall back to writer.String(arg.c_str()) or writer.Null()) when parsing fails.
| bool Lfm2ToolParser::parseSingleToolCall(const std::string& toolStr, ToolCall& toolCall) { | ||
| size_t argsPos = toolStr.find(TOOL_ARGS_START_INDICATOR); | ||
| if (argsPos != std::string::npos) { | ||
| std::string toolName = toolStr.substr(0, argsPos); | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed tool name: {}", toolName); | ||
|
|
||
| int argsStrLen = toolStr.length() - argsPos - TOOL_ARGS_START_INDICATOR.length() - TOOL_ARGS_END_INDICATOR.length(); | ||
| std::string argsStr = toolStr.substr(argsPos + TOOL_ARGS_START_INDICATOR.length(), argsStrLen); | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed args string: {}", argsStr); | ||
| std::vector<Lfm2ToolParser::Argument> arguments = parseArguments(argsStr); |
There was a problem hiding this comment.
parseSingleToolCall assumes the tool string ends with ) and computes argsStrLen as an int. For malformed inputs missing the closing ) this length becomes negative, which can lead to exceptions or undefined behavior when passed to substr. Please validate the closing ) (and bounds) before extracting arguments and use size_t for lengths.
| this->currentState = State::Content; | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
parseChunk ignores finishReason. If generation ends while the tool call is incomplete (e.g., missing <|tool_call_end|> or )), the parser may never emit the final arguments/content and will leave state stuck. Other parsers use finishReason != NONE to flush/close structures; consider doing the same here to finalize whatever has been accumulated.
| if (finishReason != ov::genai::GenerationFinishReason::NONE) { | |
| if (this->currentState == State::ToolCallParameters || this->currentState == State::ToolCallEnded) { | |
| if (!this->toolCall.arguments.empty()) { | |
| return wrapDeltaArgs(this->toolCall.arguments, toolCallIndex); | |
| } | |
| } else if (this->currentState == State::Content) { | |
| if (this->streamingPosition < this->streamingContent.size()) { | |
| auto content = this->streamingContent.substr(this->streamingPosition); | |
| this->streamingPosition = this->streamingContent.size(); | |
| if (!content.empty()) { | |
| return wrapDeltaContent(content); | |
| } | |
| } | |
| } | |
| } |
There was a problem hiding this comment.
Consider this. We need to remain functional even if max_tokens is exceeded
| bool Lfm2ToolParser::parseInContentState() { | ||
| size_t pos = this->streamingContent.find(TOOL_CALL_START_TAG, this->streamingPosition); | ||
| size_t toolCallEndTagPos = this->streamingContent.find(TOOL_CALL_END_TAG, this->streamingPosition); | ||
| if (toolCallEndTagPos != std::string::npos && pos == std::string::npos) { | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Detected end of tool call at position: {}", toolCallEndTagPos); | ||
| this->streamingPosition = toolCallEndTagPos + TOOL_CALL_END_TAG.length(); | ||
| return false; | ||
| } | ||
| if (pos != std::string::npos) { | ||
| this->streamingPosition = pos + TOOL_CALL_START_TAG.length() + TOOL_LIST_START_INDICATOR.length(); | ||
| this->currentState = State::ToolCallStarted; | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Detected start of tool call at position: {}", pos); | ||
| return false; | ||
| } | ||
|
|
There was a problem hiding this comment.
Streaming: parseInContentState advances streamingPosition past <|tool_call_start|>[ as soon as it finds the tag, but it never emits the content that may precede the tag in the same buffer. Because OutputParser can call toolParser->parseChunk() with a buffer that contains both normal content and the start tag (during UNKNOWN→TOOL transition), this will drop user-visible content. Please return a content delta for any prefix before the start tag (or keep the state as Content until the prefix has been emitted).
| std::optional<rapidjson::Document> Lfm2ToolParser::parseChunk(const std::string& chunk, ov::genai::GenerationFinishReason finishReason) { | ||
| if (chunk.empty()) { | ||
| return std::nullopt; | ||
| } | ||
|
|
||
| this->streamingContent += chunk; | ||
|
|
||
| if (parseNewContent()) { | ||
| if (this->currentState == State::ToolCallParameters) { | ||
| return BaseOutputParser::wrapFirstDelta(this->toolCall.name, toolCallIndex); | ||
| } | ||
| if (this->currentState == State::ToolCallEnded) { | ||
| return wrapDeltaArgs(this->toolCall.arguments, toolCallIndex); | ||
| } | ||
| if (this->currentState == State::Content) { | ||
| auto content = this->streamingContent.substr(this->streamingPosition); | ||
| this->streamingPosition += content.size(); | ||
| return wrapDeltaContent(content); | ||
| } | ||
| if (this->currentState == State::AfterToolCall) { | ||
| this->currentState = State::Content; | ||
| } | ||
| } | ||
|
|
||
| return std::nullopt; | ||
| } |
There was a problem hiding this comment.
Streaming: streamingContent is only appended to and never trimmed, while streamingPosition monotonically increases. For long responses this can grow without bound and increase substring/search costs over time. Consider erasing the processed prefix (or keeping only an unprocessed tail) once streamingPosition moves forward, to keep memory and time bounded.
| std::vector<Lfm2ToolParser::Argument> Lfm2ToolParser::parseArguments(const std::string& argumentsStr) { | ||
| std::vector<std::string> args; | ||
| std::vector<Lfm2ToolParser::Argument> parsedArgs; | ||
|
|
||
| size_t argPos = 0; | ||
| while (argPos < argumentsStr.length()) { | ||
| size_t commaPos = findInStringRespectingSpecialChars(argumentsStr, TOOL_SEPARATOR_STR, argPos); | ||
| if (commaPos == std::string::npos) { | ||
| auto remainingStr = argumentsStr.substr(argPos); | ||
| args.push_back(remainingStr); | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "No more commas found, adding remaining argument string: {}", remainingStr); | ||
| break; | ||
| } | ||
| auto argStr = argumentsStr.substr(argPos, commaPos - argPos); | ||
| args.push_back(argStr); | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed argument string: {}", argStr); | ||
| argPos = commaPos + TOOL_SEPARATOR_STR.length(); | ||
| } | ||
|
|
||
| for (const std::string& arg : args) { | ||
| parsedArgs.push_back(parseSingleArgument(arg)); | ||
| } | ||
| return parsedArgs; | ||
| } | ||
|
|
||
| bool Lfm2ToolParser::parseInContentState() { | ||
| size_t pos = this->streamingContent.find(TOOL_CALL_START_TAG, this->streamingPosition); | ||
| size_t toolCallEndTagPos = this->streamingContent.find(TOOL_CALL_END_TAG, this->streamingPosition); | ||
| if (toolCallEndTagPos != std::string::npos && pos == std::string::npos) { | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Detected end of tool call at position: {}", toolCallEndTagPos); | ||
| this->streamingPosition = toolCallEndTagPos + TOOL_CALL_END_TAG.length(); | ||
| return false; | ||
| } | ||
| if (pos != std::string::npos) { | ||
| this->streamingPosition = pos + TOOL_CALL_START_TAG.length() + TOOL_LIST_START_INDICATOR.length(); | ||
| this->currentState = State::ToolCallStarted; | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Detected start of tool call at position: {}", pos); | ||
| return false; | ||
| } | ||
|
|
||
| return true; | ||
| } | ||
|
|
||
| bool Lfm2ToolParser::parseInToolCallState() { | ||
| size_t pos = this->streamingContent.find(TOOL_ARGS_START_INDICATOR, this->streamingPosition); | ||
|
|
||
| if (pos == std::string::npos) { | ||
| return false; | ||
| } | ||
|
|
||
| std::string toolName = this->streamingContent.substr(this->streamingPosition, pos - this->streamingPosition); | ||
| this->toolCall = ToolCall{generateRandomId(), toolName, ""}; | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed tool name: {}", toolName); | ||
| this->streamingPosition = pos + TOOL_ARGS_START_INDICATOR.length(); | ||
| this->currentState = State::ToolCallParameters; | ||
| this->toolCallIndex++; | ||
| return true; | ||
| } | ||
|
|
||
|
|
||
| bool Lfm2ToolParser::parseToolCallParametersState() { | ||
| size_t pos = this->streamingContent.find(TOOL_ARGS_END_INDICATOR, this->streamingPosition); | ||
| if (pos == std::string::npos) { | ||
| return false; | ||
| } | ||
| std::string argumentsStr = this->streamingContent.substr(this->streamingPosition, pos - this->streamingPosition); | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed arguments string: {}", argumentsStr); | ||
| std::vector<Argument> arguments = parseArguments(argumentsStr); | ||
|
|
||
| rapidjson::Document argsDoc(rapidjson::kObjectType); | ||
| rapidjson::StringBuffer sb; | ||
| rapidjson::Writer<rapidjson::StringBuffer> argsWriter(sb); | ||
| argsWriter.StartObject(); | ||
|
|
||
| for (const Argument& argument : arguments) { | ||
| argsWriter.Key(argument.name.c_str()); | ||
| writeArgumentOfAnyType(argument.value, argsWriter); | ||
| } | ||
|
|
||
| argsWriter.EndObject(); | ||
| this->toolCall.arguments = sb.GetString(); | ||
| this->currentState = State::ToolCallEnded; | ||
| this->streamingPosition = pos + TOOL_ARGS_END_INDICATOR.length(); | ||
|
|
||
| return true; | ||
| } | ||
|
|
||
| bool Lfm2ToolParser::parseInToolCallEndedState() { | ||
| size_t pos = this->streamingContent.find(TOOL_LIST_END_INDICATOR, this->streamingPosition); | ||
| size_t toolSeparatorPos = this->streamingContent.find(TOOL_SEPARATOR_STR, this->streamingPosition); | ||
| size_t toolCallEndTagPos = this->streamingContent.find(TOOL_CALL_END_TAG, this->streamingPosition); | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Current state: ToolCallEnded. Streaming content from current position: {}", this->streamingContent.substr(this->streamingPosition)); | ||
| if (pos == std::string::npos && toolSeparatorPos == std::string::npos && toolCallEndTagPos == std::string::npos) { | ||
| return false; | ||
| } else if (toolSeparatorPos != std::string::npos && toolSeparatorPos < pos) { | ||
| this->streamingPosition = toolSeparatorPos + TOOL_SEPARATOR_STR.length(); | ||
| this->currentState = State::ToolCallStarted; | ||
| SPDLOG_LOGGER_INFO(llm_calculator_logger, "Detected separator between tool calls at position: {}, expecting another tool call to start", toolSeparatorPos); | ||
| } else if (toolCallEndTagPos != std::string::npos) { |
There was a problem hiding this comment.
The parser logs many per-token/per-chunk events at INFO level (e.g., every argument split, every state transition). In streaming mode this can produce very high log volume and impact latency/throughput. Please downgrade these to DEBUG/TRACE (and keep INFO for genuinely actionable events/errors).
| #include <gtest/gtest.h> | ||
| #include <openvino/genai/tokenizer.hpp> | ||
| #include <string> | ||
| #include <vector> | ||
|
|
||
| #include "../../../llm/io_processing/base_output_parser.hpp" | ||
| #include "../../../llm/io_processing/output_parser.hpp" | ||
| #include "../../platform_utils.hpp" |
There was a problem hiding this comment.
This test file uses standard library types/functions that aren't directly included (e.g., std::tuple, std::optional, std::all_of, isalnum). Please add the corresponding headers (<tuple>, <optional>, <algorithm>, <cctype>) to avoid relying on transitive includes, which can break builds on different platforms/toolchains.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| const std::string Lfm2ToolParser::TOOL_ARGS_END_INDICATOR = ")"; | ||
| const std::string Lfm2ToolParser::TOOL_SEPARATOR_STR = ", "; | ||
|
|
||
| void Lfm2ToolParser::writeArgumentOfAnyType(const rapidjson::Value& arg, rapidjson::Writer<rapidjson::StringBuffer>& writer) { |
There was a problem hiding this comment.
If that's a standalone function could it be static?
Also it sounds quite generic. Can we extract it to utils? Not sure if we already have such file in io_processing...
| } | ||
| writer.EndObject(); | ||
| } else { | ||
| writer.String(""); |
There was a problem hiding this comment.
Should we write anything if logging error?
There was a problem hiding this comment.
in other case the key will be left like this {key: bad_value, another_key: value} -> {"key", "another_key", "value"}
| const char last = normalized.back(); | ||
| if ((first == '{' && last == '}') || (first == '[' && last == ']')) { | ||
| std::replace(normalized.begin(), normalized.end(), '\'', '"'); | ||
| SPDLOG_LOGGER_TRACE(llm_calculator_logger, "Argument contains curly braces or square brackets, replaced single quotes with double quotes for JSON parsing. Modified string: {}", normalized); |
There was a problem hiding this comment.
Is it always safe? What if we get argument with list of strings:
["hello", "it's me"]
?
There was a problem hiding this comment.
it's right, I will think how can I change it
| SPDLOG_LOGGER_TRACE(llm_calculator_logger, "Argument contains curly braces or square brackets, replaced single quotes with double quotes for JSON parsing. Modified string: {}", normalized); | ||
| } | ||
|
|
||
| if (normalized[0] == '"' && normalized.back() == '"' && normalized.find("\\") != std::string::npos) { |
There was a problem hiding this comment.
Don't we already have some methods that implement escaping? Please check and if they are, we could extract them and put to some utils file.
Note that other special chars might require escaping too. Like /n, /t etc.
| std::string lowerArg = normalized; | ||
| std::transform(lowerArg.begin(), lowerArg.end(), lowerArg.begin(), ::tolower); | ||
|
|
||
| if (lowerArg == "true" || lowerArg == "false") { |
There was a problem hiding this comment.
We don't want to capture situation where false/true has for example trailing whitespace? Or it's not expected in that format.
| this->currentState = State::Content; | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Consider this. We need to remain functional even if max_tokens is exceeded
| size_t pos = 0; | ||
| int main_guard = 0; | ||
|
|
||
| while (pos != std::string::npos && main_guard < MAX_TOOL_CALLS) { |
There was a problem hiding this comment.
Is this guard really needed?
| if(end == std::string::npos) { | ||
| end = parsedOutput.content.rfind(TOOL_LIST_END_INDICATOR, end); | ||
| if (end == std::string::npos) { | ||
| SPDLOG_LOGGER_ERROR(llm_calculator_logger, "Malformed tool call in content, no tool end tag or tool list end tag found for tool call starting at position {}", start); |
There was a problem hiding this comment.
I would put it to trace or debug at most
| EXPECT_EQ(parsedOutput.toolCalls[0].id.empty(), false); | ||
| } | ||
|
|
||
| TEST_F(LFM2OutputParserTest, HolisticStreaming) { |
There was a problem hiding this comment.
For better coverage of escaping, could you also add test where we have long argument in string format?
I mean like:
impl = "import package\nimport package2\n\ndef func(a, b):\n\td={"python": "dict"}\n\tl = ["list \"with escaped text\"", 123, []]
|
|
||
| this->streamingContent += chunk; | ||
|
|
||
| if (parseNewContent()) { |
There was a problem hiding this comment.
What happens if you receive chunk with both name and first argument? like:
dummy(arg1=
Do we support scenario with bigger chunks that would fit in multiple states?
🛠 Summary
CVS-182500
Adding toolp parser for LFM2 model
🧪 Checklist
``