📝 Description
FireForm currently extracts PDF field names as machine-generated identifiers (e.g. textbox_0_0, textbox_0_1). When these are sent to Mistral for extraction, the model has no semantic context and either returns null for all fields or hallucinates a single value repeated across unrelated fields (see related Bug #173).
A Department Profile system would ship pre-built mappings between human-readable field labels and the internal PDF field identifiers for common agency forms used by Fire Departments, Police, and EMS.
💡 Rationale
FireForm's mission is to serve real first responders out of the box. Currently:
- A firefighter uploads a Cal Fire incident form
- Mistral receives {"textbox_0_0": "", "textbox_0_1": ""}
- It has no idea what these fields mean → returns null or wrong values
- The filled PDF is blank or incorrect
With department profiles:
- The profile provides {"Officer Name": "textbox_0_0", "Incident Location": "textbox_0_1"}
- Mistral receives human-readable labels → extracts correctly
- The filled PDF is accurate
This solves the root cause of Issue #173 without requiring changes to the LLM pipeline.
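As a rough illustration of the flow above, the sketch below builds a prompt from a profile's human-readable labels and then translates a label-keyed answer back to PDF field identifiers. The names `build_prompt` and `labels_to_pdf_fields` and the prompt wording are assumptions made for this example, not FireForm's existing API, and the call to Mistral itself is left out.

```python
import json

# Example mapping copied from the rationale above: label -> PDF field identifier.
PROFILE_FIELDS = {
    "Officer Name": "textbox_0_0",
    "Incident Location": "textbox_0_1",
}


def build_prompt(fields: dict, transcript: str) -> str:
    """Hypothetical helper: ask for a JSON object keyed by readable labels."""
    labels = json.dumps(list(fields))
    return (
        "Extract the following fields from the transcript and reply with a "
        f"JSON object keyed by these labels: {labels}\n"
        f"Transcript: {transcript}"
    )


def labels_to_pdf_fields(fields: dict, extracted: dict) -> dict:
    """Hypothetical helper: map label-keyed values back to PDF field ids.

    Labels missing from the model's answer become None rather than being
    filled with a guessed value.
    """
    return {pdf_id: extracted.get(label) for label, pdf_id in fields.items()}


if __name__ == "__main__":
    print(build_prompt(PROFILE_FIELDS, "Officer Smith at 742 Evergreen Terrace."))
    # Pretend this dict is the parsed model response.
    answer = {"Officer Name": "Smith", "Incident Location": "742 Evergreen Terrace"}
    print(labels_to_pdf_fields(PROFILE_FIELDS, answer))
```

Keeping the translation in one small function means the PDF filler still receives the identifiers it already expects; only the keys shown to the model change.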
🛠️ Proposed Solution
Profiles live as JSON files in a src/profiles/ directory.
Profile schema:
{
  "department": "Fire Department",
  "description": "Standard Cal Fire incident report",
  "fields": {
    "Officer Name": "textbox_0_0",
    "Badge Number": "textbox_0_1",
    "Incident Location": "textbox_0_2",
    "Incident Date": "textbox_0_3",
    "Number of Victims": "textbox_0_4"
  },
  "example_transcript": "Officer Smith, badge 4421, responding to structure fire at 742 Evergreen Terrace on March 8th. Two victims on scene."
}
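A profile in this shape could be loaded and sanity-checked along the following lines. This is a minimal sketch: the function names `validate_profile` and `load_profile`, the decision to treat all four top-level keys as required, and the rule that no two labels may share a PDF field identifier are assumptions for illustration, not part of the proposal itself.

```python
import json
from pathlib import Path

# Assumption: every key shown in the example schema is mandatory.
REQUIRED_KEYS = {"department", "description", "fields", "example_transcript"}


def validate_profile(profile: dict) -> dict:
    """Raise ValueError if the profile does not match the sketched schema."""
    missing = REQUIRED_KEYS - profile.keys()
    if missing:
        raise ValueError(f"profile is missing keys: {sorted(missing)}")

    fields = profile["fields"]
    if not isinstance(fields, dict) or not fields:
        raise ValueError("'fields' must be a non-empty object")

    for label, pdf_id in fields.items():
        if not isinstance(pdf_id, str) or not pdf_id:
            raise ValueError(f"label {label!r} has no PDF field identifier")

    # Assumption: each PDF field should be the target of at most one label.
    pdf_ids = list(fields.values())
    if len(set(pdf_ids)) != len(pdf_ids):
        raise ValueError("two labels map to the same PDF field identifier")

    return profile


def load_profile(path) -> dict:
    """Read a JSON profile from disk and validate it."""
    text = Path(path).read_text(encoding="utf-8")
    return validate_profile(json.loads(text))
```

Rejecting duplicate identifiers at load time is one way to catch a profile that would write two different values into the same PDF field.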
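Profiles to implement:
- fire_department.json: Cal Fire incident report
- police_report.json: Standard police incident form
- ems_medical.json: EMS patient care report
src/llm.py would use the profile labels in the prompt.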
✅ Acceptance Criteria
📌 Additional Context
Related bugs this directly addresses: #173 (PDF filler hallucinates repeating values)
Related features this complements: #111 (Field Mapping Wizard — for custom PDFs not covered by profiles)
This is especially important for FireForm's stated mission as a UN Digital Public Good — the system should work correctly for real first responders without requiring technical setup.