LFM2 parser by przepeck · Pull Request #4085 · openvinotoolkit/model_server

przepeck · 2026-03-25T12:41:57Z

🛠 Summary

CVS-182500
Adding toolp parser for LFM2 model

🧪 Checklist

Unit tests added.
The documentation updated.
Change follows security best practices.
``

Copilot

Pull request overview

Adds LFM2 tool-call parsing support to OVMS LLM I/O processing, wiring it into the existing OutputParser tool-parser selection and providing a dedicated LFM2 output parser test suite.

Changes:

Introduced Lfm2ToolParser (unary + streaming parsing) under src/llm/io_processing/lfm2/.
Registered the new parser in OutputParser (parser name "lfm2") and Bazel build targets.
Added comprehensive LFM2 output parser tests (including streaming scenarios and malformed inputs).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 13 comments.

Show a summary per file

File	Description
`src/test/llm/output_parsers/lfm2_output_parser_test.cpp`	New gtest coverage for LFM2 tool-call parsing (unary + streaming).
`src/llm/io_processing/output_parser.cpp`	Registers `"lfm2"` tool parser and includes its header.
`src/llm/io_processing/lfm2/tool_parser.hpp`	Declares `Lfm2ToolParser` state machine and parsing helpers.
`src/llm/io_processing/lfm2/tool_parser.cpp`	Implements LFM2 unary + streaming parsing and argument normalization.
`src/llm/BUILD`	Adds Bazel library target for the LFM2 parser and links it into `output_parsers`.

src/llm/io_processing/lfm2/tool_parser.cpp

Copilot · 2026-03-25T12:50:02Z

src/llm/io_processing/lfm2/tool_parser.cpp

+        argument.value = argumentStr.substr(equalPos + 1);
+
+        if (argument.value[0] == '\'' && argument.value.back() == '\'') {
+            argument.value[0] = '"';
+            argument.value[argument.value.size() - 1] = '"';
+            SPDLOG_LOGGER_INFO(llm_calculator_logger, "Argument value is enclosed in single quotes, replaced with double quotes. Modified value: {}", argument.value);
+        } 
+        SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed argument - name: {}, value: {}", argument.name, argument.value);


parseSingleArgument indexes argument.value[0] and argument.value.back() without checking whether the value is empty. Inputs like arg= (or trailing = due to malformed output) will make argument.value empty and cause out-of-bounds access. Please guard for empty values before indexing and mark the argument invalid or treat it as an empty string safely.

Suggested change

argument.value = argumentStr.substr(equalPos + 1);

if (argument.value[0] == '\'' && argument.value.back() == '\'') {

argument.value[0] = '"';

argument.value[argument.value.size() - 1] = '"';

SPDLOG_LOGGER_INFO(llm_calculator_logger, "Argument value is enclosed in single quotes, replaced with double quotes. Modified value: {}", argument.value);

}

SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed argument - name: {}, value: {}", argument.name, argument.value);

if (equalPos + 1 < argumentStr.size()) {

argument.value = argumentStr.substr(equalPos + 1);

} else {

argument.value.clear();

}

if (argument.value.empty()) {

argument.isValid = false;

SPDLOG_LOGGER_INFO(llm_calculator_logger, "Argument '{}' has an empty value after '=', marking argument as invalid", argument.name);

} else {

if (argument.value[0] == '\'' && argument.value.back() == '\'') {

argument.value[0] = '"';

argument.value[argument.value.size() - 1] = '"';

SPDLOG_LOGGER_INFO(llm_calculator_logger, "Argument value is enclosed in single quotes, replaced with double quotes. Modified value: {}", argument.value);

}

SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed argument - name: {}, value: {}", argument.name, argument.value);

}

Copilot · 2026-03-25T12:50:02Z

src/llm/io_processing/lfm2/tool_parser.cpp

+void Lfm2ToolParser::writeArgumentOfAnyType(const std::string& arg, rapidjson::Writer<rapidjson::StringBuffer>& writer) {
+    std::string normalized = normalizeArgStr(arg);
+
+    rapidjson::Document doc;
+    doc.Parse(normalized.c_str());
+
+    if (doc.HasParseError()) {
+        SPDLOG_LOGGER_ERROR(llm_calculator_logger, "Failed to parse argument string as JSON. Argument string: {}", normalized);
+        return;
+    }
+
+    rapidjson::Value& argumentDoc = doc;
+    writeArgumentOfAnyType(argumentDoc, writer);
+}


writeArgumentOfAnyType(const std::string&, ...) returns early on JSON parse errors. Callers have already emitted the JSON key, so returning without writing a value can produce invalid JSON (and may trigger RapidJSON writer assertions). Instead, always emit a value (e.g., fall back to writer.String(arg.c_str()) or writer.Null()) when parsing fails.

Copilot · 2026-03-25T12:50:02Z

src/llm/io_processing/lfm2/tool_parser.cpp

+bool Lfm2ToolParser::parseSingleToolCall(const std::string& toolStr, ToolCall& toolCall) {
+    size_t argsPos = toolStr.find(TOOL_ARGS_START_INDICATOR);
+    if (argsPos != std::string::npos) {
+        std::string toolName = toolStr.substr(0, argsPos);                
+        SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed tool name: {}", toolName);
+
+        int argsStrLen = toolStr.length() - argsPos - TOOL_ARGS_START_INDICATOR.length() - TOOL_ARGS_END_INDICATOR.length();
+        std::string argsStr = toolStr.substr(argsPos + TOOL_ARGS_START_INDICATOR.length(), argsStrLen);
+        SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed args string: {}", argsStr);
+        std::vector<Lfm2ToolParser::Argument> arguments = parseArguments(argsStr);


parseSingleToolCall assumes the tool string ends with ) and computes argsStrLen as an int. For malformed inputs missing the closing ) this length becomes negative, which can lead to exceptions or undefined behavior when passed to substr. Please validate the closing ) (and bounds) before extracting arguments and use size_t for lengths.

Copilot · 2026-03-25T12:50:03Z

src/llm/io_processing/lfm2/tool_parser.cpp

+            this->currentState = State::Content;
+        }
+    }
+


parseChunk ignores finishReason. If generation ends while the tool call is incomplete (e.g., missing <|tool_call_end|> or )), the parser may never emit the final arguments/content and will leave state stuck. Other parsers use finishReason != NONE to flush/close structures; consider doing the same here to finalize whatever has been accumulated.

Suggested change

if (finishReason != ov::genai::GenerationFinishReason::NONE) {

if (this->currentState == State::ToolCallParameters || this->currentState == State::ToolCallEnded) {

if (!this->toolCall.arguments.empty()) {

return wrapDeltaArgs(this->toolCall.arguments, toolCallIndex);

}

} else if (this->currentState == State::Content) {

if (this->streamingPosition < this->streamingContent.size()) {

auto content = this->streamingContent.substr(this->streamingPosition);

this->streamingPosition = this->streamingContent.size();

if (!content.empty()) {

return wrapDeltaContent(content);

}

}

}

}

Consider this. We need to remain functional even if max_tokens is exceeded

src/llm/io_processing/lfm2/tool_parser.cpp

Copilot · 2026-03-25T12:50:04Z

src/llm/io_processing/lfm2/tool_parser.cpp

+bool Lfm2ToolParser::parseInContentState() {
+    size_t pos = this->streamingContent.find(TOOL_CALL_START_TAG, this->streamingPosition);
+    size_t toolCallEndTagPos = this->streamingContent.find(TOOL_CALL_END_TAG, this->streamingPosition);
+    if (toolCallEndTagPos != std::string::npos && pos == std::string::npos) {
+        SPDLOG_LOGGER_INFO(llm_calculator_logger, "Detected end of tool call at position: {}", toolCallEndTagPos);
+        this->streamingPosition = toolCallEndTagPos + TOOL_CALL_END_TAG.length();
+        return false;
+    }
+    if (pos != std::string::npos) {
+        this->streamingPosition = pos + TOOL_CALL_START_TAG.length() + TOOL_LIST_START_INDICATOR.length();
+        this->currentState = State::ToolCallStarted;
+        SPDLOG_LOGGER_INFO(llm_calculator_logger, "Detected start of tool call at position: {}", pos);
+        return false;
+    }
+


Streaming: parseInContentState advances streamingPosition past <|tool_call_start|>[ as soon as it finds the tag, but it never emits the content that may precede the tag in the same buffer. Because OutputParser can call toolParser->parseChunk() with a buffer that contains both normal content and the start tag (during UNKNOWN→TOOL transition), this will drop user-visible content. Please return a content delta for any prefix before the start tag (or keep the state as Content until the prefix has been emitted).

Copilot · 2026-03-25T12:50:04Z

src/llm/io_processing/lfm2/tool_parser.cpp

+std::optional<rapidjson::Document> Lfm2ToolParser::parseChunk(const std::string& chunk, ov::genai::GenerationFinishReason finishReason) {
+    if (chunk.empty()) {
+        return std::nullopt;
+    }
+
+    this->streamingContent += chunk;
+
+    if (parseNewContent()) {
+        if (this->currentState == State::ToolCallParameters) {
+            return BaseOutputParser::wrapFirstDelta(this->toolCall.name, toolCallIndex);
+        }
+        if (this->currentState == State::ToolCallEnded) {
+            return wrapDeltaArgs(this->toolCall.arguments, toolCallIndex);
+        }
+        if (this->currentState == State::Content) {
+            auto content = this->streamingContent.substr(this->streamingPosition);
+            this->streamingPosition += content.size();
+            return wrapDeltaContent(content);
+        }
+        if (this->currentState == State::AfterToolCall) {
+            this->currentState = State::Content;
+        }
+    }
+
+    return std::nullopt;
+}


Streaming: streamingContent is only appended to and never trimmed, while streamingPosition monotonically increases. For long responses this can grow without bound and increase substring/search costs over time. Consider erasing the processed prefix (or keeping only an unprocessed tail) once streamingPosition moves forward, to keep memory and time bounded.

Copilot · 2026-03-25T12:50:05Z

src/llm/io_processing/lfm2/tool_parser.cpp

+std::vector<Lfm2ToolParser::Argument> Lfm2ToolParser::parseArguments(const std::string& argumentsStr) {
+    std::vector<std::string> args;
+    std::vector<Lfm2ToolParser::Argument> parsedArgs;
+
+    size_t argPos = 0;
+    while (argPos < argumentsStr.length()) {
+        size_t commaPos = findInStringRespectingSpecialChars(argumentsStr, TOOL_SEPARATOR_STR, argPos);
+        if (commaPos == std::string::npos) {
+            auto remainingStr = argumentsStr.substr(argPos);
+            args.push_back(remainingStr);
+            SPDLOG_LOGGER_INFO(llm_calculator_logger, "No more commas found, adding remaining argument string: {}", remainingStr);
+            break;
+        }
+        auto argStr = argumentsStr.substr(argPos, commaPos - argPos);
+        args.push_back(argStr);
+        SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed argument string: {}", argStr);
+        argPos = commaPos + TOOL_SEPARATOR_STR.length();
+    }
+
+    for (const std::string& arg : args) {
+        parsedArgs.push_back(parseSingleArgument(arg));
+    }
+    return parsedArgs;
+}
+
+bool Lfm2ToolParser::parseInContentState() {
+    size_t pos = this->streamingContent.find(TOOL_CALL_START_TAG, this->streamingPosition);
+    size_t toolCallEndTagPos = this->streamingContent.find(TOOL_CALL_END_TAG, this->streamingPosition);
+    if (toolCallEndTagPos != std::string::npos && pos == std::string::npos) {
+        SPDLOG_LOGGER_INFO(llm_calculator_logger, "Detected end of tool call at position: {}", toolCallEndTagPos);
+        this->streamingPosition = toolCallEndTagPos + TOOL_CALL_END_TAG.length();
+        return false;
+    }
+    if (pos != std::string::npos) {
+        this->streamingPosition = pos + TOOL_CALL_START_TAG.length() + TOOL_LIST_START_INDICATOR.length();
+        this->currentState = State::ToolCallStarted;
+        SPDLOG_LOGGER_INFO(llm_calculator_logger, "Detected start of tool call at position: {}", pos);
+        return false;
+    }
+
+    return true;
+}
+
+bool Lfm2ToolParser::parseInToolCallState() {
+    size_t pos = this->streamingContent.find(TOOL_ARGS_START_INDICATOR, this->streamingPosition);
+
+    if (pos == std::string::npos) {
+        return false;
+    }
+
+    std::string toolName = this->streamingContent.substr(this->streamingPosition, pos - this->streamingPosition);
+    this->toolCall = ToolCall{generateRandomId(), toolName, ""};
+    SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed tool name: {}", toolName);
+    this->streamingPosition = pos + TOOL_ARGS_START_INDICATOR.length();
+    this->currentState = State::ToolCallParameters;
+    this->toolCallIndex++;
+    return true;
+}
+
+
+bool Lfm2ToolParser::parseToolCallParametersState() {
+    size_t pos = this->streamingContent.find(TOOL_ARGS_END_INDICATOR, this->streamingPosition);
+    if (pos == std::string::npos) {
+        return false;
+    }
+    std::string argumentsStr = this->streamingContent.substr(this->streamingPosition, pos - this->streamingPosition);
+    SPDLOG_LOGGER_INFO(llm_calculator_logger, "Parsed arguments string: {}", argumentsStr);
+    std::vector<Argument> arguments = parseArguments(argumentsStr);
+
+    rapidjson::Document argsDoc(rapidjson::kObjectType);
+    rapidjson::StringBuffer sb;
+    rapidjson::Writer<rapidjson::StringBuffer> argsWriter(sb);
+    argsWriter.StartObject();
+
+    for (const Argument& argument : arguments) {
+        argsWriter.Key(argument.name.c_str());
+        writeArgumentOfAnyType(argument.value, argsWriter);
+    }
+
+    argsWriter.EndObject();
+    this->toolCall.arguments = sb.GetString();
+    this->currentState = State::ToolCallEnded;
+    this->streamingPosition = pos + TOOL_ARGS_END_INDICATOR.length();
+
+    return true;
+}
+
+bool Lfm2ToolParser::parseInToolCallEndedState() {
+    size_t pos = this->streamingContent.find(TOOL_LIST_END_INDICATOR, this->streamingPosition);
+    size_t toolSeparatorPos = this->streamingContent.find(TOOL_SEPARATOR_STR, this->streamingPosition);
+    size_t toolCallEndTagPos = this->streamingContent.find(TOOL_CALL_END_TAG, this->streamingPosition);
+    SPDLOG_LOGGER_INFO(llm_calculator_logger, "Current state: ToolCallEnded. Streaming content from current position: {}", this->streamingContent.substr(this->streamingPosition));
+    if (pos == std::string::npos && toolSeparatorPos == std::string::npos && toolCallEndTagPos == std::string::npos) {
+        return false;
+    } else if (toolSeparatorPos != std::string::npos && toolSeparatorPos < pos) {
+        this->streamingPosition = toolSeparatorPos + TOOL_SEPARATOR_STR.length();
+        this->currentState = State::ToolCallStarted;
+        SPDLOG_LOGGER_INFO(llm_calculator_logger, "Detected separator between tool calls at position: {}, expecting another tool call to start", toolSeparatorPos);
+    } else if (toolCallEndTagPos != std::string::npos) {


The parser logs many per-token/per-chunk events at INFO level (e.g., every argument split, every state transition). In streaming mode this can produce very high log volume and impact latency/throughput. Please downgrade these to DEBUG/TRACE (and keep INFO for genuinely actionable events/errors).

Copilot · 2026-03-25T12:50:05Z

src/test/llm/output_parsers/lfm2_output_parser_test.cpp

+#include <gtest/gtest.h>
+#include <openvino/genai/tokenizer.hpp>
+#include <string>
+#include <vector>
+
+#include "../../../llm/io_processing/base_output_parser.hpp"
+#include "../../../llm/io_processing/output_parser.hpp"
+#include "../../platform_utils.hpp"


This test file uses standard library types/functions that aren't directly included (e.g., std::tuple, std::optional, std::all_of, isalnum). Please add the corresponding headers (<tuple>, <optional>, <algorithm>, <cctype>) to avoid relying on transitive includes, which can break builds on different platforms/toolchains.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

mzegla · 2026-03-25T14:14:31Z

src/llm/io_processing/lfm2/tool_parser.cpp

+const std::string Lfm2ToolParser::TOOL_ARGS_END_INDICATOR = ")";
+const std::string Lfm2ToolParser::TOOL_SEPARATOR_STR = ", ";
+
+void Lfm2ToolParser::writeArgumentOfAnyType(const rapidjson::Value& arg, rapidjson::Writer<rapidjson::StringBuffer>& writer) {


If that's a standalone function could it be static?
Also it sounds quite generic. Can we extract it to utils? Not sure if we already have such file in io_processing...

mzegla · 2026-03-25T14:16:03Z

src/llm/io_processing/lfm2/tool_parser.cpp

+        }
+        writer.EndObject();
+    } else {
+        writer.String("");


Should we write anything if logging error?

in other case the key will be left like this {key: bad_value, another_key: value} -> {"key", "another_key", "value"}

mzegla · 2026-03-25T14:21:24Z

src/llm/io_processing/lfm2/tool_parser.cpp

+    const char last = normalized.back();
+    if ((first == '{' && last == '}') || (first == '[' && last == ']')) {
+        std::replace(normalized.begin(), normalized.end(), '\'', '"');
+        SPDLOG_LOGGER_TRACE(llm_calculator_logger, "Argument contains curly braces or square brackets, replaced single quotes with double quotes for JSON parsing. Modified string: {}", normalized);


Is it always safe? What if we get argument with list of strings:

["hello", "it's me"]

?

it's right, I will think how can I change it

mzegla · 2026-03-25T14:28:07Z

src/llm/io_processing/lfm2/tool_parser.cpp

+        SPDLOG_LOGGER_TRACE(llm_calculator_logger, "Argument contains curly braces or square brackets, replaced single quotes with double quotes for JSON parsing. Modified string: {}", normalized);
+    }
+
+    if (normalized[0] == '"' && normalized.back() == '"' && normalized.find("\\") != std::string::npos) {


Don't we already have some methods that implement escaping? Please check and if they are, we could extract them and put to some utils file.
Note that other special chars might require escaping too. Like /n, /t etc.

mzegla · 2026-03-25T14:31:46Z

src/llm/io_processing/lfm2/tool_parser.cpp

+    std::string lowerArg = normalized;
+    std::transform(lowerArg.begin(), lowerArg.end(), lowerArg.begin(), ::tolower);
+
+    if (lowerArg == "true" || lowerArg == "false") {


We don't want to capture situation where false/true has for example trailing whitespace? Or it's not expected in that format.

mzegla · 2026-03-25T15:16:17Z

src/llm/io_processing/lfm2/tool_parser.cpp

+            this->currentState = State::Content;
+        }
+    }
+


Consider this. We need to remain functional even if max_tokens is exceeded

mzegla · 2026-03-25T15:22:54Z

src/llm/io_processing/lfm2/tool_parser.cpp

+    size_t pos = 0;
+    int main_guard = 0;
+
+    while (pos != std::string::npos && main_guard < MAX_TOOL_CALLS) {


Is this guard really needed?

mzegla · 2026-03-25T15:26:08Z

src/llm/io_processing/lfm2/tool_parser.cpp

+        if(end == std::string::npos) {
+            end = parsedOutput.content.rfind(TOOL_LIST_END_INDICATOR, end);
+            if (end == std::string::npos) {
+                SPDLOG_LOGGER_ERROR(llm_calculator_logger, "Malformed tool call in content, no tool end tag or tool list end tag found for tool call starting at position {}", start);


I would put it to trace or debug at most

mzegla · 2026-03-25T15:37:22Z

src/test/llm/output_parsers/lfm2_output_parser_test.cpp

+    EXPECT_EQ(parsedOutput.toolCalls[0].id.empty(), false);
+}
+
+TEST_F(LFM2OutputParserTest, HolisticStreaming) {


For better coverage of escaping, could you also add test where we have long argument in string format?
I mean like:

impl = "import package\nimport package2\n\ndef func(a, b):\n\td={"python": "dict"}\n\tl = ["list \"with escaped text\"", 123, []]

mzegla · 2026-03-25T15:40:20Z

src/llm/io_processing/lfm2/tool_parser.cpp

+
+    this->streamingContent += chunk;
+
+    if (parseNewContent()) {


What happens if you receive chunk with both name and first argument? like:
dummy(arg1=
Do we support scenario with bigger chunks that would fit in multiple states?

przepeck added 15 commits March 10, 2026 10:05

save

d10ef38

save

cd90b91

save

087d054

save

2f6d306

save

f6e900d

build fix

5cdcd2e

save

d835022

save

7af29b0

save

53cc07c

save

c2bda14

streaming

70b063b

fix and test

1782e6a

save

e7d8581

save

f69fed0

improvements

bb6366d

przepeck requested review from Copilot, dtrawins and mzegla March 25, 2026 12:41

Copilot started reviewing on behalf of przepeck March 25, 2026 12:42 View session

Copilot AI reviewed Mar 25, 2026

View reviewed changes

dtrawins requested a review from atobiszei March 25, 2026 13:15

przepeck and others added 5 commits March 25, 2026 14:52

Style

daf83e8

Apply suggestions from code review

d481a5a

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

fixes

7076415

clang-format

93806de

cpplint

3f32e01

mzegla reviewed Mar 25, 2026

View reviewed changes

adding model to script, enabling tests

15d8c5f

+    if (finishReason != ov::genai::GenerationFinishReason::NONE) {
+        if (this->currentState == State::ToolCallParameters || this->currentState == State::ToolCallEnded) {
+            if (!this->toolCall.arguments.empty()) {
+                return wrapDeltaArgs(this->toolCall.arguments, toolCallIndex);
+            }
+        } else if (this->currentState == State::Content) {
+            if (this->streamingPosition < this->streamingContent.size()) {
+                auto content = this->streamingContent.substr(this->streamingPosition);
+                this->streamingPosition = this->streamingContent.size();
+                if (!content.empty()) {
+                    return wrapDeltaContent(content);
+                }
+            }
+        }
+    }

Conversation

przepeck commented Mar 25, 2026

🛠 Summary

🧪 Checklist

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants