NLP Assignment 3 | Master AI — Natural Language Processing 2026
This project is built for Assignment 3: LLM Language Application & Evaluation (Option 1). The assignment asks students to design a language-based application using an LLM, with methodology informed by course theory (prompting strategies, model behaviour, human-LLM interaction), and to critically evaluate the system's reliability and consistency.
Getting a CV into a clean, professional format is tedious. You have to parse your own writing, restructure it into sections, and then make it look good on paper.
CV Generator automates this with an LLM pipeline: upload your existing CV (or paste raw text), the model extracts all your information into a structured schema, you fill in any gaps or correct mistakes through a guided UI, and the app generates a polished PDF via LaTeX — with bullet points optionally rewritten to remove AI-sounding phrasing.
Input (PDF / DOCX / raw text)
↓
[1] Text extraction extraction.py — pdfplumber / python-docx
↓
[2] LLM structured extraction GPT-4o-mini → schema.json
↓
[3] Human QA & editing Streamlit UI (app.py / qa.py)
↓
[4] Bullet humanization GPT-4o-mini rewrites AI phrasing
↓
[5] LaTeX rendering Jinja2 → cv_template.tex
↓
[6] Faithfulness check key facts verified in .tex source
↓
[7] PDF compilation pdflatex → downloadable PDF
| File | Role |
|---|---|
app.py |
Streamlit web UI — 4-stage flow (upload → QA → preview → download) |
extraction.py |
Text extraction from PDF/DOCX + LLM-based structured parsing |
qa.py |
CLI alternative for gap-filling interactively |
generation.py |
Bullet humanization, LaTeX escaping, template rendering, PDF compilation, faithfulness check |
schema.json |
CV data schema (personal, education, experience, projects, skills, languages) |
templates/cv_template.tex |
Jinja2-annotated LaTeX CV template |
eval/run_extraction_eval.py |
Automated extraction evaluation against ground-truth test cases |
eval/results/ |
Eval report (JSON + text) from the last evaluation run |
test_cases/ |
5 synthetic test CVs (3 standard profiles + 2 challenging cases) with ground-truth JSON |
Model: GPT-4o-mini — chosen for cost-efficiency and speed. Structured extraction is a well-scoped task that does not require the reasoning depth of larger models. JSON mode (response_format={"type": "json_object"}) guarantees parseable output.
Temperature: 0 for extraction (determinism, reproducibility) and faithfulness judging; 0.4 for bullet rewriting (some creativity while staying factual).
Prompting strategy: The extraction prompt explicitly distinguishes skills.languages (programming languages) from languages (spoken languages) — a common LLM confusion point — and instructs the model to use null rather than hallucinate missing fields. The bullet rewriter is given a banned-phrase list ("leveraged", "spearheaded", "synergy" etc.) and a hard rule to not add or remove facts.
Faithfulness check: After rendering the LaTeX template, key facts (name, email, all institution/company names) are verified to be present in the source before PDF compilation. Discrepancies are surfaced as warnings in the UI.
Human-in-the-loop: The QA stage lets users correct extraction errors, add missing entries, or fill fields the LLM could not find. This hybrid design reduces dependence on perfect LLM output.
The extraction pipeline is evaluated with eval/run_extraction_eval.py against 5 synthetic test CVs:
cv_cs_graduate— standard CS profilecv_ai_graduate— AI/ML researcher profilecv_business_graduate— non-technical business profilecv_challenging_1_formatting— poorly formatted, inconsistent structurecv_challenging_2_content— mixed-language CV (Dutch/English), vague job descriptions
Each test case has a hand-crafted ground-truth JSON. Fields are evaluated with 4 labels:
| Label | Meaning |
|---|---|
| PASS | Correct or semantically equivalent |
| FAIL | Meaningfully wrong |
| MISS | Ground truth has a value, model extracted null/empty |
| HALLUCINATION | Ground truth is null/empty, model invented a value |
Semantic fields (summaries, bullet points, institution names) use an LLM-as-judge (GPT-4o-mini, temperature=0). Dates and skill lists use exact set matching.
| CV | Score |
|---|---|
| cv_cs_graduate | 32/53 (60.4%) |
| cv_ai_graduate | 33/54 (61.1%) |
| cv_business_graduate | 34/57 (59.6%) |
| cv_challenging_1_formatting | 22/59 (37.3%) |
| cv_challenging_2_content | 31/49 (63.3%) |
| Overall | 152/272 (55.9%) |
By category:
| Category | Score |
|---|---|
| personal | 40/40 (100%) |
| skills | 17/20 (85%) |
| experience | 53/110 (48%) |
| projects | 22/52 (42%) |
| education | 20/50 (40%) |
Personal fields (name, email, phone, location) are extracted perfectly. The main failure mode is systematic MISS on dates, roles, and project names — the model extracts the surrounding text but not these structured sub-fields when the CV format deviates from expectations. The challenging formatting case (cv_challenging_1_formatting) additionally triggers hallucination on extra bullet points the model infers from vague text. Zero hallucinations on well-structured CVs.
- Python 3.11+
- An OpenAI API key
pdflatexfor PDF generation:brew install --cask basictex(macOS) orapt install texlive(Linux)
cd CV_LLM_project
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txtCreate a .env file in CV_LLM_project/:
OPENAI_API_KEY=sk-...
Or enter the key directly in the app sidebar.
streamlit run app.pyThe app opens at http://localhost:8501. Steps:
- Upload — drop a PDF/DOCX or paste raw text (or start from an empty form)
- Fill gaps — review extracted data, correct mistakes, add missing entries
- Preview & generate — optionally render a live preview, then generate the PDF
- Download — download your formatted CV
python eval/run_extraction_eval.pyResults are written to eval/results/eval_report.txt and eval/results/eval_report.json.
# Extract structured data from a CV file
python extraction.py path/to/cv.pdf -o extracted.json
# Fill missing fields interactively
python qa.py extracted.json -o complete.json
# Generate PDF
python generation.py complete.json -o output/