Upload Validator — AI-Powered Content Matching

Validates that a user's text description actually matches what they uploaded.
100% local — no ChatGPT, no external AI APIs. All models run on your machine.

┌─────────────────────────────────────────┐
│  Score ≥ 75 %  →  ✅ Approved            │
│  50 – 74 %     →  ⚠️  Warning (user can  │
│                      improve or submit) │
│  Score < 50 %  →  ❌ Rejected            │
└─────────────────────────────────────────┘

Tech Stack

Layer	Technology	Purpose
Backend	Python · FastAPI · Uvicorn	REST API, file handling
CV/OCR	Tesseract · PyMuPDF · OpenCV	Text extraction from files
Embeddings	CLIP (ViT-B/32, local weights)	Visual ↔ text similarity
NLP	Sentence-BERT (all-MiniLM-L6-v2)	Semantic text ↔ OCR similarity
Frontend	React 18 · Vite · JSX	Drag-and-drop upload UI
Container	Docker · Docker Compose	One-command deployment

Project Layout

upload-validator/
├── backend/
│   ├── main.py              # FastAPI app — all CV/NLP logic
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── UploadValidator.jsx   # Main component
│   │   └── main.jsx
│   ├── index.html
│   ├── package.json
│   └── vite.config.js
├── tests/
│   ├── test_validator.py    # Pytest suite (unit + integration)
│   └── generate_samples.py  # Creates sample_data/ test files
├── sample_data/             # Generated by generate_samples.py
└── docker/
    ├── Dockerfile.backend
    ├── Dockerfile.frontend
    └── docker-compose.yml

Quick Start (Docker — recommended)

Prerequisites

Docker Desktop ≥ 4.x

# 1. Clone / unzip the project
cd upload-validator

# 2. Build and start both services
docker compose -f docker/docker-compose.yml up --build

# Frontend → http://localhost:3000
# Backend  → http://localhost:8000

The first run downloads ~600 MB of model weights (CLIP + SBERT).
They are cached in the container layer, so subsequent starts skip the download.

Quick Start (Local / No Docker)

Prerequisites

Tool	Version	Install
Python	3.10 +	python.org
Tesseract OCR	5.x	see below
Node.js	18 +	nodejs.org

Install Tesseract:

# macOS
brew install tesseract

# Ubuntu / Debian
sudo apt-get install tesseract-ocr tesseract-ocr-eng

# Windows — download installer from:
# https://github.com/UB-Mannheim/tesseract/wiki
# then add install dir to PATH

Backend

cd upload-validator/backend

python -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate

pip install -r requirements.txt

uvicorn main:app --reload --port 8000
# → http://localhost:8000
# → http://localhost:8000/docs   (Swagger UI)

Frontend

cd upload-validator/frontend

npm install
npm run dev
# → http://localhost:3000

Running the Tests

cd upload-validator

# Install test deps (if not already done)
pip install pytest httpx PyMuPDF

# Run the full suite
pytest tests/test_validator.py -v

What the tests cover

Class	What it tests
`TestCombineScores`	Score weighting math
`TestThresholdLogic`	approved / warning / rejected at exact boundaries
`TestHealthEndpoint`	`/health` returns 200
`TestValidateEndpoint`	Empty desc → 400, bad MIME → 415, PNG/PDF/MP4 routing
`TestScoreAccuracy`	High similarity → high score, low → low (mocked models)

All tests mock CLIP and SBERT so they run without GPU or internet access.
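
The mocking approach can be sketched roughly as follows. This is a minimal illustration, not the actual suite: `backend`, `clip_similarity`, `text_similarity`, and `score` are hypothetical stand-ins for whatever backend/main.py really exposes, and both similarities are assumed to lie in 0..1.

```python
from types import SimpleNamespace
from unittest.mock import patch


def _needs_model(*_args):
    # Stands in for a real model call that would need downloaded weights.
    raise RuntimeError("model weights not loaded")


# Hypothetical stand-in for backend/main.py; the real function names may differ.
backend = SimpleNamespace(clip_similarity=_needs_model,
                          text_similarity=_needs_model)


def score(image_bytes, ocr_text, description):
    """Blend the two similarities with the 0.6 / 0.4 weights from this README."""
    clip_sim = backend.clip_similarity(image_bytes, description)
    text_sim = backend.text_similarity(ocr_text, description)
    return round((0.6 * clip_sim + 0.4 * text_sim) * 100, 1)


def test_high_similarity_gives_high_score():
    # Patch both model calls so the test never touches CLIP or SBERT.
    with patch.object(backend, "clip_similarity", return_value=0.9), \
         patch.object(backend, "text_similarity", return_value=0.8):
        assert score(b"fake-bytes", "red square", "a red square") == 86.0
```

Because the patches are scoped to the `with` block, the stand-in functions are restored afterwards and one test cannot leak mocked values into another.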

Manual Testing with Sample Files

# Generate sample files
python tests/generate_samples.py
# Creates: sample_data/red_square.png, invoice.pdf, sample_video.mp4, etc.

Open http://localhost:3000 and try these combinations:

File	Good description (expect ≥75%)	Bad description (expect <50%)
`red_square.png`	"a red coloured square image"	"quarterly financial report"
`invoice.pdf`	"an invoice with payment amount"	"a photo of a sunset"
`contract.pdf`	"a service agreement between two companies"	"cat video compilation"
`blue_square.png`	"a blue square illustration"	"legal contract document"

API test with curl

# Good match — should be approved
curl -X POST http://localhost:8000/validate \
  -F "file=@sample_data/red_square.png" \
  -F "description=a red square image"

# Bad match — should be rejected
curl -X POST http://localhost:8000/validate \
  -F "file=@sample_data/invoice.pdf" \
  -F "description=a video about cats"
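
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the two form fields from the curl commands (`file` and `description`). The helper names `build_multipart` and `validate` are illustrative, and the response is assumed to be JSON; its exact fields are not documented here.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def build_multipart(description: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Encode the `description` and `file` form fields as multipart/form-data."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="description"\r\n\r\n'
        f"{description}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def validate(path: str, description: str,
             url: str = "http://localhost:8000/validate") -> dict:
    """POST a file and its description; assumes the backend replies with JSON."""
    file_path = Path(path)
    body, content_type = build_multipart(
        description, file_path.name, file_path.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST")
    # urlopen raises urllib.error.HTTPError for 4xx/5xx replies (e.g. 400, 415).
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Example (requires the backend to be running):
# print(validate("sample_data/red_square.png", "a red square image"))
```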

How the Scoring Works

Image / PDF:
  CLIP score  (visual content ↔ description)   × 0.6
+ SBERT score (OCR text ↔ description)         × 0.4
= final %

Video (.mp4, .mov, .avi, .mkv, .webm):
  Keyframes extracted (OpenCV) → same pipeline as images
  CLIP applied to each keyframe → max similarity used

If CLIP is unavailable (import error), the system falls back to OCR + SBERT only.
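
The weighting, the CLIP fallback, and the keyframe rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the code in backend/main.py: both similarities are assumed to be already scaled to 0..1, a missing CLIP score is represented as `None`, and `video_clip_similarity` is a hypothetical helper name.

```python
from typing import Optional, Sequence


def combine_scores(clip_sim: Optional[float], text_sim: float,
                   clip_weight: float = 0.6, text_weight: float = 0.4) -> float:
    """Blend two similarities in 0..1 into a percentage."""
    if clip_sim is None:
        # Assumed fallback: without CLIP, the OCR + SBERT score is the result.
        return round(text_sim * 100, 1)
    blended = clip_weight * clip_sim + text_weight * text_sim
    return round(blended * 100, 1)


def video_clip_similarity(keyframe_sims: Sequence[float]) -> Optional[float]:
    """The best-matching keyframe wins; None if no keyframes were extracted."""
    return max(keyframe_sims) if keyframe_sims else None


# A video whose best keyframe matches well, with little on-screen text:
final = combine_scores(video_clip_similarity([0.2, 0.7, 0.4]), text_sim=0.1)
```

Taking the maximum over keyframes means one clearly matching frame is enough to lift the visual score, which suits descriptions that cover only part of a video.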

Adjusting Thresholds

Edit the decision block in backend/main.py:

if score >= 75:      # ← change to e.g. 80 for stricter approval
    decision = "approved"
elif score >= 50:    # ← change to e.g. 60 to reject more borderline uploads
    decision = "warning"
else:
    decision = "rejected"
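
To experiment with thresholds without editing the block in place, the same logic can be wrapped in a function. The sketch below is a hypothetical helper (`decide`, `approve_at`, and `warn_at` are not names from backend/main.py) that also makes the boundary behaviour explicit: a score exactly on a threshold lands in the higher band.

```python
def decide(score: float, approve_at: float = 75, warn_at: float = 50) -> str:
    """Map a percentage score to approved / warning / rejected."""
    if warn_at > approve_at:
        raise ValueError("warn_at must not exceed approve_at")
    if score >= approve_at:
        return "approved"
    if score >= warn_at:
        return "warning"
    return "rejected"


# Stricter settings, as suggested in the comments above:
strict = decide(78, approve_at=80, warn_at=60)
```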

Adjust model weights:

# In analyse_image / analyse_pdf / analyse_video
score = combine_scores(clip_sim, text_sim,
                       clip_weight=0.6,   # ← visual weight
                       text_weight=0.4)   # ← OCR/text weight

Supported File Types

Type	Extensions	Analysis method
Image	`.jpg` `.jpeg` `.png` `.webp` `.gif` `.bmp`	CLIP + Tesseract OCR
PDF	`.pdf`	PyMuPDF text extraction + page renders → CLIP
Video	`.mp4` `.mov` `.avi` `.mkv` `.webm`	OpenCV keyframes → CLIP
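
As an illustration of how the table maps to analysis paths, here is a small extension-based router. It is a sketch only: the test table above mentions a 415 for a bad MIME type, which suggests the backend checks MIME types rather than extensions, and `EXTENSIONS` and `file_kind` are hypothetical names.

```python
from pathlib import Path
from typing import Optional

# Extension sets copied from the table above.
EXTENSIONS = {
    "image": {".jpg", ".jpeg", ".png", ".webp", ".gif", ".bmp"},
    "pdf": {".pdf"},
    "video": {".mp4", ".mov", ".avi", ".mkv", ".webm"},
}


def file_kind(filename: str) -> Optional[str]:
    """Return "image", "pdf", or "video", or None for unsupported files."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match
    for kind, suffixes in EXTENSIONS.items():
        if suffix in suffixes:
            return kind
    return None
```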
