Summary
When SkillSpector runs its LLM-backed analyzers (semantic analysis, meta-analyzer), the LangChain AIMessage response carries token usage data via response.usage_metadata (input_tokens, output_tokens, total_tokens). That data is currently discarded after findings are extracted. Nothing in the JSON report tells callers how many tokens were consumed.
Motivation
Downstream pipelines that orchestrate multiple scanners need token counts to compute LLM call costs, e.g. (input_tokens × rate_in + output_tokens × rate_out) / 1_000_000 with rates quoted per million tokens. Without these counts in the report, cost attribution requires fragile workarounds such as patching LangChain internals or proxying HTTP traffic, both of which are tightly coupled to implementation details that break across versions.
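As a rough illustration of the cost formula above, here is a minimal sketch. The function name `llm_cost_usd` is hypothetical, and the rates are placeholders rather than real provider prices; they are assumed to be quoted in USD per million tokens.

```python
def llm_cost_usd(input_tokens: int, output_tokens: int,
                 rate_in: float, rate_out: float) -> float:
    """Cost of one scan, with rates given in USD per 1M tokens (assumed unit)."""
    return (input_tokens * rate_in + output_tokens * rate_out) / 1_000_000


# Placeholder rates, chosen only to make the arithmetic easy to follow:
# (1234 * 2.0 + 567 * 8.0) / 1_000_000 = 7004 / 1_000_000
cost = llm_cost_usd(1234, 567, rate_in=2.0, rate_out=8.0)
```

With the token counts exposed in the report, a downstream pipeline could apply whatever rate table it maintains without touching SkillSpector internals.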
Proposed change
src/skillspector/state.py
Extend LLMCallRecord to carry token counts, and update llm_call_record() to accept them:
from typing import TypedDict

class LLMCallRecord(TypedDict):
    node: str
    ok: bool
    error: str | None
    input_tokens: int   # 0 when ok=False
    output_tokens: int  # 0 when ok=False
src/skillspector/llm_analyzer_base.py
run_batches() and arun_batches() invoke the LLM and discard the response object after calling parse_response(). LangChain's AIMessage exposes usage_metadata; those counts should be extracted and accumulated:
# before the batch loop:
input_tokens = 0
output_tokens = 0
# inside the loop, after: response = self._structured_llm.invoke(prompt) (or ainvoke)
usage = getattr(response, "usage_metadata", None) or {}
input_tokens += usage.get("input_tokens", 0)
output_tokens += usage.get("output_tokens", 0)
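To show how the accumulation behaves when usage data is missing, here is a self-contained sketch. The helper name `sum_usage` is hypothetical, and the `SimpleNamespace` objects are stand-ins for AIMessage responses so the example runs without a model or API key.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def sum_usage(responses: Iterable[Any]) -> tuple[int, int]:
    """Accumulate input/output token counts over a sequence of responses."""
    input_tokens = 0
    output_tokens = 0
    for response in responses:
        # usage_metadata may be absent or None; treat both as zero tokens.
        usage = getattr(response, "usage_metadata", None) or {}
        input_tokens += usage.get("input_tokens", 0)
        output_tokens += usage.get("output_tokens", 0)
    return input_tokens, output_tokens


# Stand-ins for AIMessage objects (not real LangChain responses).
fake_responses = [
    SimpleNamespace(
        usage_metadata={"input_tokens": 100, "output_tokens": 20, "total_tokens": 120}
    ),
    SimpleNamespace(usage_metadata=None),  # attribute present but empty
    SimpleNamespace(),                     # attribute missing entirely
]
totals = sum_usage(fake_responses)
```

One caveat to check during implementation: if `_structured_llm` is built with LangChain's `with_structured_output`, the call returns the parsed object rather than an AIMessage unless `include_raw=True` is passed, in which case the message sits under the `"raw"` key. Whether that applies depends on how SkillSpector constructs the runnable, which this issue does not show.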
src/skillspector/nodes/report.py
Include aggregated token counts in the JSON output path, summed from state["llm_call_log"]:
{
"llm_usage": {
"input_tokens": 1234,
"output_tokens": 567
}
}
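The aggregation itself could be as small as the sketch below. The helper name `aggregate_llm_usage` and the node names in the sample log are hypothetical; only the `llm_call_log` key and the `llm_usage` shape come from the proposal.

```python
from typing import Any


def aggregate_llm_usage(llm_call_log: list[dict[str, Any]]) -> dict[str, int]:
    """Sum token counts over all call records for the report's llm_usage field."""
    return {
        # .get() tolerates records written before the token fields existed.
        "input_tokens": sum(r.get("input_tokens", 0) for r in llm_call_log),
        "output_tokens": sum(r.get("output_tokens", 0) for r in llm_call_log),
    }


# Sample state with made-up records; failed calls contribute zero tokens.
state = {
    "llm_call_log": [
        {"node": "semantic", "ok": True, "error": None,
         "input_tokens": 1000, "output_tokens": 500},
        {"node": "meta", "ok": True, "error": None,
         "input_tokens": 234, "output_tokens": 67},
        {"node": "meta", "ok": False, "error": "timeout",
         "input_tokens": 0, "output_tokens": 0},
    ]
}
report = {"llm_usage": aggregate_llm_usage(state["llm_call_log"])}
```

For this sample log the result matches the JSON shape shown above, with 1234 input tokens and 567 output tokens.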
Acceptance criteria
llm_usage: {input_tokens: N, output_tokens: N} when LLM analyzers ran
--no-llm / static-only scan
Notes
Per the contribution guide I plan to follow the fork → branch → PR process and reference this issue. Happy to take it on; just flagging here first as requested.