codinggita/medclear


🏥 MedClear

Detect Healthcare Overcharging Instantly




MedClear is an AI-powered healthcare billing audit tool. It detects overcharging in hospital bills by combining OCR (Optical Character Recognition) with price comparison against government pricing standards from the NPPA (National Pharmaceutical Pricing Authority) and CGHS (Central Government Health Scheme).

Think of it as a "TurboTax for medical bills": upload your hospital bill, and MedClear tells you whether you've been overcharged and by how much.


📸 Preview

MedClear Dashboard

🚨 Problem Statement

Patients in India receive expensive hospital bills but cannot verify whether they are being overcharged, because standard pricing data is not accessible to them.

Healthcare billing is one of the most opaque industries in the world. Patients are expected to pay thousands, sometimes lakhs, of rupees without any way to verify that the charges are legitimate.

  • Hidden Charges: Patients have no way to verify whether hospital bill items are correctly priced. A simple "Injection" can cost ₹50 or ₹5,000 with no explanation.
  • No Transparency: Medical billing lacks accessible standard pricing benchmarks. Unlike grocery shopping, where you can compare prices, hospital bills are treated as non-negotiable.
  • Exploitation: An estimated ₹50,000+ crore is lost annually to overcharging in India alone. In the US, medical billing errors cost patients billions every year.
  • No Recourse: Patients rarely challenge bills because they don't have the data or expertise to prove overcharging. Hospitals know this and exploit it.

💡 Solution

MedClear bridges the information gap between patients and healthcare pricing.

How it works:

  1. Upload: The patient uploads a hospital bill (image or PDF)
  2. Extract: OCR pulls all line items and prices from the bill
  3. Match: A fuzzy matching algorithm maps each item to NPPA/CGHS standard codes
  4. Compare: Each item is checked against official government-defined rates
  5. Report: A detailed savings report shows exactly where you were overcharged

The result? Instant clarity. Real savings. Empowerment.
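
As a rough illustration of the Extract step above, the sketch below turns raw OCR text into structured line items. It is not MedClear's actual parser: the line layout (item name followed by a rupee amount), the `LineItem` type, and the `parse_line_items` function are all assumptions made for the example, and real bills will need more tolerant rules.

```python
import re
from dataclasses import dataclass

# Assumed layout: "<item name>  Rs. 45,000" or "<item name>  ₹600".
# Real OCR output is messier; this pattern is only illustrative.
LINE_RE = re.compile(
    r"^(?P<name>.+?)\s+(?:Rs\.?|₹)\s*(?P<amount>[\d,]+(?:\.\d{1,2})?)\s*$"
)


@dataclass
class LineItem:
    name: str
    amount: float  # rupees


def parse_line_items(ocr_text: str) -> list[LineItem]:
    """Keep only the lines that look like '<name> <currency> <amount>'."""
    items = []
    for raw in ocr_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # headers, blank lines, and unparseable rows are skipped
        amount = float(match.group("amount").replace(",", ""))
        items.append(LineItem(match.group("name").strip(), amount))
    return items


sample = """APOLLO HOSPITALS
Private Ward (3 days)    Rs. 45,000
Blood Test (CBC)         ₹600
Thank you"""

items = parse_line_items(sample)
for item in items:
    print(item)
```

Lines without a recognizable amount are dropped silently here; a production parser would more likely keep them for manual review.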


⚙️ Features

| Feature | Description |
| --- | --- |
| 📄 Upload Hospital Bill | Drag-and-drop support for bills in JPG, PNG, or PDF format |
| 🖼️ OCR Text Extraction | Tesseract-powered extraction pulls line items and prices from scanned documents |
| 🔍 Smart Matching | Fuzzy matching algorithm maps bill items to standard NPPA/CGHS codes |
| 📊 Price Comparison | Real-time comparison against official NPPA + CGHS databases |
| ⚠️ Overcharge Detection | Flags items exceeding government-defined rates with percentage overcharge |
| 💰 Savings Report | Generates downloadable PDF report with itemized savings breakdown |
| 📱 User Dashboard | History of all uploaded bills and their audit results |
| 🔐 Secure Storage | Bills encrypted and stored securely with user-level access control |

🧠 How It Works

System Workflow


Step-by-Step Pipeline:

  1. Upload: The user drags and drops a hospital bill (image/PDF)
  2. Preprocessing: The image is enhanced, rotated, and noise-reduced for better OCR
  3. OCR: Tesseract + OpenCV extract all text, line items, and prices
  4. Entity Extraction: NLP parses the extracted text into structured data (item name, quantity, unit price, total)
  5. Code Mapping: Fuzzy matching maps each item to NPPA drug codes or CGHS service codes
  6. Price Lookup: The database is queried for government-defined rates
  7. Comparison: The overcharge amount and percentage are calculated for each item
  8. Output: A JSON response and a PDF report are generated for the user
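
The Comparison step in the pipeline above comes down to simple arithmetic. Here is a minimal sketch, assuming the percentage is measured relative to the allowed rate (which is what the figures in the Example Output section imply). The `compare_item` function and its field names are hypothetical and not taken from the codebase.

```python
def compare_item(item: str, charged: float, allowed: float) -> dict:
    """Compute the overcharge amount and percentage for one bill line."""
    if allowed <= 0:
        raise ValueError("allowed rate must be positive")
    # Items billed at or below the allowed rate are not overcharged.
    overcharge = max(charged - allowed, 0)
    # Percentage is relative to the allowed rate (assumption, see lead-in).
    percentage = round(overcharge / allowed * 100)
    return {
        "item": item,
        "charged": charged,
        "allowed": allowed,
        "overcharge": overcharge,
        "overcharge_percentage": percentage,
        "flagged": overcharge > 0,
    }


ward = compare_item("Private Ward (3 days)", 45000, 12000)
injection = compare_item("Injection (Ceftriaxone 1g)", 850, 45)
cbc = compare_item("Blood Test (CBC)", 600, 150)

for result in (ward, injection, cbc):
    print(result["item"], result["overcharge"], f'{result["overcharge_percentage"]}%')
```

With the sample values, this gives 275%, 1789%, and 300%, matching the percentages in the sample JSON.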

🏗️ Tech Stack

Why These Technologies?

We chose each technology in MedClear based on performance, developer experience, ecosystem maturity, and scalability. Here's why:


🖥️ Frontend

| Technology | Why We Chose It |
| --- | --- |
| React 18 | Widely adopted component library with strong performance via concurrent rendering. Used by Netflix, Airbnb, and Instagram. |
| TypeScript | Static typing catches type errors at compile time, which matters in a financial/healthcare application where errors are costly. |
| Tailwind CSS | Utility-first CSS allows rapid UI development without context-switching between files, and typically ships less CSS than traditional frameworks. |
| Vite | Next-generation build tool with near-instant dev server start and fast hot module replacement (HMR), noticeably quicker than webpack in development. |
| Zustand | Lightweight state management without the boilerplate of Redux. A good fit for our simple auth + UI state needs. |
| React Query | Handles server state, caching, and background refetching. Eliminates manual loading-state management. |
| Framer Motion | Production-ready animations that make the app feel polished. |

React + TypeScript + Tailwind + Vite + Zustand + React Query + Framer Motion

⚙️ Backend

| Technology | Why We Chose It |
| --- | --- |
| Node.js | Event-driven, non-blocking I/O suits a pipeline that spends most of its time waiting on uploads and the OCR service. Same language as the frontend, which helps full-stack productivity. |
| Express.js | Minimal, unopinionated framework with a large middleware ecosystem. We only pay for what we use. |
| TypeScript | End-to-end type safety from backend to frontend, with auto-complete everywhere. |
| Prisma | Type-safe ORM with a query-builder feel and a built-in migration system. |
| MongoDB | NoSQL document database for flexible storage of semi-structured billing data. |
| JWT | Stateless authentication that suits horizontally scalable APIs. |

Node.js + Express + TypeScript + Prisma + MongoDB + JWT + Zod

🤖 OCR & AI Services

| Technology | Why We Chose It |
| --- | --- |
| Python | Dominant language for AI/ML. Tesseract, OpenCV, and scikit-learn all have well-supported Python bindings. |
| Tesseract OCR | Open-source, battle-tested OCR that supports 100+ languages, with no per-request API costs. |
| OpenCV | Computer vision library for image preprocessing (contrast, deskew, denoising). Critical for accurate OCR on blurry hospital bills. |
| FuzzyWuzzy | String matching library for mapping bill items to NPPA codes. Handles typos and naming variations. |
| Pandas | Data processing for analyzing large NPPA datasets efficiently. |
| FastAPI | Async Python web framework that serves OCR results over a high-performance API, with Uvicorn as the ASGI server. |

Python + Tesseract + OpenCV + FuzzyWuzzy + Pandas + FastAPI + Uvicorn
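
To give a feel for how fuzzy matching maps bill items to reference entries, here is a small sketch. It deliberately uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for FuzzyWuzzy, so it runs without extra dependencies. The reference names, the codes (`NPPA-0001` and so on), the token-sorting normalization, and the 0.6 threshold are all made up for illustration.

```python
import re
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical reference entries; real NPPA/CGHS names and codes will differ.
REFERENCE_ITEMS = {
    "Ceftriaxone Injection 1g": "NPPA-0001",
    "Complete Blood Count (CBC)": "CGHS-0002",
    "Private Ward (per day)": "CGHS-0003",
}


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and sort tokens so word order is ignored."""
    return " ".join(sorted(re.findall(r"[a-z0-9]+", text.lower())))


def best_match(bill_item: str, threshold: float = 0.6) -> Optional[tuple[str, str, float]]:
    """Return (reference name, code, score) for the closest entry, or None."""
    query = normalize(bill_item)
    best = None
    for name, code in REFERENCE_ITEMS.items():
        score = SequenceMatcher(None, query, normalize(name)).ratio()
        if best is None or score > best[2]:
            best = (name, code, score)
    # Below the (arbitrary) threshold we prefer "no match" over a wrong code.
    return best if best is not None and best[2] >= threshold else None


match = best_match("Injection (Ceftriaxone 1g)")
no_match = best_match("Parking fee")
print(match)
print(no_match)
```

Sorting tokens before comparing makes "Injection (Ceftriaxone 1g)" and "Ceftriaxone Injection 1g" identical after normalization. Returning `None` below the threshold avoids flagging an item against an unrelated rate.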

🗄️ Database & Infrastructure

| Technology | Why We Chose It |
| --- | --- |
| PostgreSQL | Mature relational database with JSON support for flexible metadata. A good fit for billing records. |
| Pinecone | Vector database for semantic search. Enables "find similar drugs/services" functionality. |
| Docker | Containerization keeps the environment identical from dev to production, which matters when Python and Node services run side by side. |
| Docker Compose | Local development with one command. All services (DB, Redis, API) start together. |
| GitHub Actions | Free CI/CD for open-source projects. Automated testing and deployment. |

PostgreSQL + Pinecone + Docker + Docker Compose + GitHub Actions

📁 Project Structure (Monorepo)

medclear/
├── frontend/          # React + TypeScript + Vite
│   ├── src/
│   │   ├── components/
│   │   ├── pages/
│   │   ├── hooks/
│   │   ├── stores/
│   │   └── utils/
│   └── package.json
│
├── backend/           # Node.js + Express + Prisma
│   ├── src/
│   │   ├── controllers/
│   │   ├── routes/
│   │   ├── middleware/
│   │   ├── services/
│   │   └── utils/
│   └── package.json
│
├── ocr-service/       # Python + FastAPI
│   ├── app/
│   │   ├── routers/
│   │   ├── services/
│   │   └── utils/
│   └── requirements.txt
│
└── docker-compose.yml # All services orchestration

📊 Example Output

{
  "audit_result": {
    "bill_id": "BILL-2024-00123",
    "hospital_name": "Apollo Hospitals",
    "total_bill": "₹1,42,500",
    "overcharged": "₹42,300",
    "savings_percentage": "29.7%",
    "status": "audit_complete",
    "flagged_items": [
      {
        "item": "Private Ward (3 days)",
        "category": "room_charges",
        "charged": "₹45,000",
        "allowed_cghs": "₹12,000",
        "overcharge": "₹33,000",
        "overcharge_percentage": "275%"
      },
      {
        "item": "Injection (Ceftriaxone 1g)",
        "category": "medications",
        "charged": "₹850",
        "allowed_nppa": "₹45",
        "overcharge": "₹805",
        "overcharge_percentage": "1789%"
      },
      {
        "item": "Blood Test (CBC)",
        "category": "diagnostics",
        "charged": "₹600",
        "allowed_cghs": "₹150",
        "overcharge": "₹450",
        "overcharge_percentage": "300%"
      }
    ],
    "recommendations": [
      "Dispute the ward charges with hospital billing department",
      "Request itemized bill with drug NDC codes",
      "File a complaint with NPPA if charges are not rectified"
    ]
  }
}

⚠️ You were overcharged ₹42,300, including ₹33,000 on ward charges alone (275% over CGHS rates).
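
The summary figures in the sample can be re-derived from its formatted amounts. The snippet below is an illustrative check only. The `parse_rupees` helper is hypothetical, and it assumes whole-rupee amounts, as in the sample.

```python
import re


def parse_rupees(text: str) -> int:
    """Turn a display string such as '₹1,42,500' into whole rupees.

    Everything except digits is discarded, so this only works for
    amounts without paise (an assumption that holds for the sample).
    """
    return int(re.sub(r"\D", "", text))


total_bill = parse_rupees("₹1,42,500")
overcharged = parse_rupees("₹42,300")
savings_percentage = round(overcharged / total_bill * 100, 1)

# (charged, allowed) pairs for the three flagged items in the sample.
flagged = [("₹45,000", "₹12,000"), ("₹850", "₹45"), ("₹600", "₹150")]
flagged_total = sum(parse_rupees(c) - parse_rupees(a) for c, a in flagged)

print(savings_percentage)  # 29.7
print(flagged_total)       # 34255
```

The three flagged items shown add up to ₹34,255, so the remaining ₹8,045 of the ₹42,300 total would have to come from items the sample does not list.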


🚀 Getting Started

Prerequisites

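| Tool | Version | Purpose |
| --- | --- | --- |
| Node.js | 18+ | Frontend & Backend runtime |
| Python | 3.9+ | OCR Service |
| PostgreSQL | 14+ | Primary database (matches the DATABASE_URL used by Prisma) |
| npm / pip | Latest | Package management |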

Clone & Install

# Clone the repository
git clone https://github.com/Rachit-Kakkad1/medclear.git
cd medclear

# Install frontend dependencies
cd frontend
npm install

# Go back and install backend dependencies
cd ../backend
npm install

# Install OCR service dependencies
cd ../ocr-service
pip install -r requirements.txt

Configure Environment

Create a .env file in backend/:

# Database
DATABASE_URL="postgresql://postgres:password@localhost:5432/medclear"

# Authentication
JWT_SECRET="your-super-secret-jwt-key-change-in-production"
JWT_EXPIRES_IN="7d"

# Redis
REDIS_URL="redis://localhost:6379"

# OCR Service
OCR_SERVICE_URL="http://localhost:8000"

# NPPA API (Government pricing data)
NPPA_API_URL="https://api.nppa.gov.in/pricing"
NPPA_API_KEY="your-api-key"

# App Config
NODE_ENV="development"
PORT=3000

Create a .env file in ocr-service/:

# Tesseract OCR
TESSDATA_PREFIX="/usr/share/tesseract-5/tessdata"

# Image Processing
MAX_IMAGE_SIZE=10485760
SUPPORTED_FORMATS="jpg,jpeg,png,pdf"

# Server
HOST="0.0.0.0"
PORT=8000
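
For illustration, here is one way the OCR service could read these settings and use them to reject unsupported uploads. It assumes the variables are already present in the process environment (for example exported by Docker Compose or a dotenv loader). The `load_ocr_settings` and `is_upload_allowed` names are hypothetical.

```python
import os
from typing import Mapping


def load_ocr_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read OCR service settings, falling back to the values shown above."""
    formats = env.get("SUPPORTED_FORMATS", "jpg,jpeg,png,pdf")
    return {
        # 10485760 bytes = 10 * 1024 * 1024, i.e. 10 MiB
        "max_image_size": int(env.get("MAX_IMAGE_SIZE", "10485760")),
        "supported_formats": {f.strip().lower() for f in formats.split(",") if f.strip()},
        "host": env.get("HOST", "0.0.0.0"),
        "port": int(env.get("PORT", "8000")),
    }


def is_upload_allowed(filename: str, size_bytes: int, settings: dict) -> bool:
    """Accept a file only if its extension and size are within the limits."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return (
        extension in settings["supported_formats"]
        and 0 < size_bytes <= settings["max_image_size"]
    )


settings = load_ocr_settings({})  # empty mapping: use the defaults
print(is_upload_allowed("bill.PDF", 2_000_000, settings))
print(is_upload_allowed("bill.exe", 2_000_000, settings))
```

Checking the extension is only a first filter. Validating the actual file content would be a sensible addition in a real service.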

Database Setup

cd backend

# Run Prisma migrations
npx prisma migrate dev

# Seed NPPA/CGHS pricing data
npx prisma db seed

Run the Application

Option 1: Manual Start (3 terminals)

# Terminal 1: Backend API
cd backend
npm run dev
# Starts at http://localhost:3000

# Terminal 2: Frontend
cd frontend
npm run dev
# Starts at http://localhost:5173

# Terminal 3: OCR Service
cd ocr-service
uvicorn app.main:app --reload
# Starts at http://localhost:8000

Option 2: Docker Compose (One command)

# Start all services with Docker
docker-compose up --build

Visit http://localhost:5173 to start auditing bills.


🔮 Future Scope

We're just getting started. Here's what's on our roadmap:

| Feature | Status | Description |
| --- | --- | --- |
| 🏥 Prescription Scanner | Planned | Analyze doctor prescriptions for medication overpricing |
| 🛡️ Insurance Integration | Planned | Auto-submit audit reports to insurance providers |
| 📈 Real-time Pricing | Planned | Live API integration from NPPA for latest drug prices |
| 📱 Mobile App | Planned | Native iOS and Android applications |
| 🌍 Multi-country Support | Researching | Support for US (Medicare), UK (NHS), EU pricing standards |
| 🤖 AI Recommendations | Researching | Personalized cost-saving suggestions based on medical history |
| 📊 Analytics Dashboard | Planned | Hospital-level pricing analytics for researchers |
| 🔔 Alert System | Planned | Push notifications when pricing data updates |

🤝 Contribution

We welcome contributions from developers, designers, and healthcare professionals!

How to Contribute

# 1. Fork the repository
# 2. Clone your fork
git clone https://github.com/YOUR_USERNAME/medclear.git

# 3. Create a feature branch
git checkout -b feature/amazing-new-feature

# 4. Make your changes
# 5. Run tests
npm test        # Frontend
npm run test    # Backend
pytest          # OCR Service

# 6. Commit with descriptive message
git commit -m "Add: New feature that does X"

# 7. Push to your fork
git push origin feature/amazing-new-feature

# 8. Open a Pull Request

Contribution Areas

  • 🐛 Bug Fixes: Help us squash bugs
  • 🎨 UI/UX: Make MedClear beautiful
  • 📈 Features: Build new capabilities
  • 📚 Documentation: Improve docs
  • 🧪 Testing: Increase test coverage
  • 🔐 Security: Audit for vulnerabilities

💡 Looking for a way to contribute? Check out our Good First Issues label.


📜 License

MIT License. See LICENSE for details.


Built with ❤️ for healthcare transparency
