feat: add validation and review pipeline for extraction workflow by Arijit429 · Pull Request #417 · fireform-core/FireForm

Arijit429 · 2026-04-08T20:18:12Z

🚀 Summary

This PR strengthens FireForm’s extraction workflow by introducing a dedicated validation and review pipeline for AI-generated structured outputs. The goal of this change is to improve extraction reliability, enable safer human-in-the-loop verification, and move the system closer to a production-ready workflow.

✨ What Changed

Added a dedicated validator module

Created a new utility file:

src/utils/extraction_validator.py

This introduces the ExtractionValidator class responsible for:

validating extracted incident fields
detecting missing / malformed values
generating a confidence score
determining whether manual review is required
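
A minimal sketch of what a validator with these four responsibilities might look like. The required field names, the scoring rule (the fraction of required fields that are present and well-formed), and the review threshold are assumptions made for illustration; they are not taken from src/utils/extraction_validator.py.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical required fields; the real list is defined in the PR's validator.
REQUIRED_FIELDS = ("incident_type", "location", "date", "description")


@dataclass
class ValidationResult:
    missing_fields: list[str] = field(default_factory=list)
    malformed_fields: list[str] = field(default_factory=list)
    confidence: float = 0.0
    requires_review: bool = True


class ExtractionValidator:
    """Illustrative validator; the scoring rule below is an assumption."""

    def __init__(self, required_fields=REQUIRED_FIELDS, review_threshold=1.0):
        self.required_fields = tuple(required_fields)
        self.review_threshold = review_threshold

    def validate(self, extracted: dict[str, Any]) -> ValidationResult:
        missing, malformed = [], []
        for name in self.required_fields:
            value = extracted.get(name)
            if value is None:
                # Key absent or explicitly null: treated as missing.
                missing.append(name)
            elif not isinstance(value, str) or not value.strip():
                # Non-string or blank string: treated as malformed.
                malformed.append(name)

        total = len(self.required_fields)
        valid = total - len(missing) - len(malformed)
        confidence = round(valid / total, 2) if total else 0.0
        return ValidationResult(
            missing_fields=missing,
            malformed_fields=malformed,
            confidence=confidence,
            # Anything below the threshold is flagged for a human to check.
            requires_review=confidence < self.review_threshold,
        )


result = ExtractionValidator().validate(
    {"incident_type": "fire", "location": "", "date": "2026-04-08"}
)
print(result)
```

With the default threshold of 1.0 in this sketch, any missing or malformed required field sets requires_review; a lower threshold would tolerate partial extractions.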

Integrated validation into extraction workflow

Updated:

src/file_manipulator.py

The extraction flow now includes a dedicated validation stage before PDF filling.

Updated flow

Frontend Input
   ↓
Structured LLM Extraction
   ↓
Fallback to Legacy Extraction (if needed)
   ↓
Validation Layer
   ↓
Confidence Scoring + requires_review
   ↓
PDF Filling
   ↓
Final Output
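
The flow above could be wired together roughly as follows. This is a sketch under stated assumptions: the extractors, validator, and PDF filler are passed in as plain callables with hypothetical names, and the PDF is filled even when review is required, with the flag carried into the final output. The actual code in src/file_manipulator.py may be structured differently.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)

Extractor = Callable[[str], dict[str, Any]]


def run_extraction_pipeline(
    text: str,
    structured_extract: Extractor,   # hypothetical: structured LLM extraction
    legacy_extract: Extractor,       # hypothetical: legacy extraction path
    validate: Callable[[dict[str, Any]], dict[str, Any]],
    fill_pdf: Callable[[dict[str, Any]], str],
) -> dict[str, Any]:
    # 1. Structured extraction, with fallback on failure or empty output.
    try:
        data = structured_extract(text)
    except Exception:
        logger.warning("Structured extraction failed; falling back", exc_info=True)
        data = {}
    if not data:
        data = legacy_extract(text)

    # 2. Validation layer: confidence score and requires_review flag.
    report = validate(data)
    logger.info(
        "Validation: confidence=%s requires_review=%s",
        report.get("confidence"),
        report.get("requires_review"),
    )

    # 3. PDF filling; the review flag travels with the final output.
    pdf_path = fill_pdf(data)
    return {
        "data": data,
        "validation": report,
        "requires_review": bool(report.get("requires_review", True)),
        "pdf_path": pdf_path,
    }
```

Passing the stages in as callables keeps the sketch independent of FireForm's real function names and makes the fallback path easy to exercise with stubs.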

📌 Before

Previously, extracted data was passed directly into the PDF generation workflow after structured extraction / fallback.

This could allow:

incomplete fields
empty values
malformed extraction outputs
low-confidence reports

to move forward without sufficient validation.

✅ After

With this PR, every extracted output now passes through a validation pipeline that:

checks required fields
identifies missing data
assigns a confidence score
flags incomplete outputs for manual review

This improves both reliability and safety of the generated reports.

🎯 Impact

Improves extraction consistency
Enables human-in-the-loop verification
Reduces incomplete report generation risk
Strengthens production readiness
Improves maintainability through modular validation logic

🧪 Testing

Tested locally using FastAPI Swagger routes.

Verified:

successful extraction flow
fallback extraction path
validation output logging
requires_review generation
successful PDF output creation

🔮 Future Scope

This validation layer also creates a strong foundation for future improvements such as:

field-level confidence scoring
schema-based validation
advanced NLP entity verification
route-level validation reporting
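
As one illustration of the first item, field-level confidence scoring could assign each field its own score instead of a single overall number. The scorer functions, field names, and score values below are hypothetical and only show the shape such an extension might take.

```python
import re
from typing import Any, Callable


def score_non_empty(value: Any) -> float:
    """Hypothetical scorer: 1.0 for a non-blank string, otherwise 0.0."""
    return 1.0 if isinstance(value, str) and value.strip() else 0.0


def score_iso_date(value: Any) -> float:
    """Hypothetical scorer: full confidence only for YYYY-MM-DD strings."""
    if not isinstance(value, str) or not value.strip():
        return 0.0
    # A present but non-ISO date gets partial confidence in this sketch.
    return 1.0 if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value.strip()) else 0.5


# Assumed mapping from field name to its scorer.
FIELD_SCORERS: dict[str, Callable[[Any], float]] = {
    "location": score_non_empty,
    "date": score_iso_date,
}


def field_confidences(extracted: dict[str, Any]) -> dict[str, float]:
    """Return a per-field confidence in [0.0, 1.0] for each known field."""
    return {name: scorer(extracted.get(name)) for name, scorer in FIELD_SCORERS.items()}


print(field_confidences({"location": "Main St", "date": "April 8"}))
```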

Arijit429 added 7 commits March 18, 2026 18:27

Fixing error handling message when PDF generation fails

a27df64

Added 30 seconds of timeout handling time for Api request on Ollamas

c96ab2a

Replace print statements with logging for better observability

a027545

Improve README with clearer local setup steps

dd6ac2d

Add requires_review flag for incomplete LLM extraction validation

aa98b33

feat: add structured extraction flow with safe fallback

7048088

feat: add extraction validation and review pipeline

11ccc2e


Arijit429 wants to merge 7 commits into fireform-core:main from Arijit429:validation-review-pipeline
