Description
Currently, the LLM extraction pipeline in src/llm.py fires a separate HTTP request
to Ollama for every field in the form via LLM.main_loop(). A form with 20 fields
produces 20 sequential round-trips to the local model, making the pipeline slow,
fragile, and wasteful. There is also no structured output contract: the LLM returns
raw strings, and the -1 sentinel for missing values can silently be written
into a PDF field.
Proposed Solution
Replace the per-field loop with a single batch extraction call, LLM.extract_all(),
that sends all field names to Mistral at once and receives a single JSON object
containing every field value in one shot.
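The shape of that contract can be sketched as follows (the field names below are hypothetical, chosen only to illustrate the plural and missing-value rules):

```python
# Illustrative batch response: one JSON object, one key per requested field.
expected = {
    "patient_name": "Jane Doe",
    "medications": "aspirin; ibuprofen",  # plural field -> ";"-separated values
    "allergies": None,                    # not found in transcript -> JSON null
}
```

A missing value arrives as null (Python None), never as a sentinel string, so downstream code can distinguish "absent" from a real answer.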
Suggested Implementation
New method: LLM.extract_all()
```python
# Requires `import json` at the top of src/llm.py.
def extract_all(self, fields: list[str], transcript: str) -> dict:
    """Extract every field value from the transcript in a single Ollama call."""
    prompt = f"""
SYSTEM PROMPT:
You are an AI assistant designed to extract information from transcribed voice
recordings and return the results as a structured JSON object.
You will receive a list of field names and a transcript. For each field, identify
its value in the transcript and include it in the JSON response.
Rules:
- Return a single JSON object where every key is a field name from the list.
- If a field is plural and multiple values are found, return them separated by ";".
- If a value cannot be found in the transcript, return null for that field.
- Return JSON only. No explanation, no markdown, no extra text.
---
Fields to extract: {json.dumps(fields)}
Transcript: {transcript}
"""
    # format="json" asks Ollama to constrain the model to valid JSON output.
    response = self.client.chat(
        model="mistral",
        messages=[{"role": "user", "content": prompt}],
        format="json",
    )
    return json.loads(response.message.content)
```
Fallback for backward compatibility
The old per-field loop is retained as LLM._legacy_per_field_extract() and is called
automatically if the batch method fails (e.g. on older Ollama versions that do not
support JSON mode).
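A minimal sketch of that dispatch, assuming both methods keep the signatures proposed above (the stub bodies here exist only to make the sketch self-contained):

```python
class LLM:
    def __init__(self, client):
        self.client = client  # an ollama.Client, injected by the pipeline

    def extract(self, fields: list[str], transcript: str) -> dict:
        """Try the batch call first; fall back to the legacy loop on any failure."""
        try:
            return self.extract_all(fields, transcript)
        except Exception:
            # Older Ollama without JSON mode, or unparseable model output.
            return self._legacy_per_field_extract(fields, transcript)

    def extract_all(self, fields, transcript):
        raise RuntimeError("JSON mode unsupported")  # stub for this sketch

    def _legacy_per_field_extract(self, fields, transcript):
        return {f: None for f in fields}  # stub for this sketch
```

Catching a broad Exception is deliberate here: any batch failure (HTTP error, malformed JSON) should degrade to the slower but proven path rather than abort the form.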
Benefits
- Performance — eliminates N sequential Ollama round-trips, replacing them with
a single call regardless of form size
- Safety — null for missing values is unambiguous and handled cleanly by the
  downstream Validator, preventing sentinel strings from reaching the PDF
- Backward compatibility — legacy per-field loop is preserved as a fallback,
so deployments on older Ollama versions are unaffected
Files Affected
src/llm.py — primary change: new extract_all() method plus the
_legacy_per_field_extract() fallback
tests/test_llm.py — new unit tests for batch extraction, JSON parsing, plural
handling, null sentinel, and fallback behaviour
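One way to exercise the JSON-parsing path offline (the FakeClient and parse_batch_response helpers below are hypothetical test scaffolding, not part of src/llm.py):

```python
import json

class FakeClient:
    """Mimics ollama Client.chat() with a canned JSON reply, so no server is needed."""
    def __init__(self, content: str):
        self._content = content

    def chat(self, model, messages, format):
        msg = type("Msg", (), {"content": self._content})()
        return type("Resp", (), {"message": msg})()

def parse_batch_response(response) -> dict:
    """Mirrors the parsing step of extract_all(), isolated for testing."""
    return json.loads(response.message.content)

reply = FakeClient('{"name": "Jane Doe", "dob": null}').chat(
    model="mistral", messages=[], format="json")
result = parse_batch_response(reply)
assert result["name"] == "Jane Doe"
assert result["dob"] is None  # null, never a "-1" sentinel string
```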