A full-stack personal finance dashboard that ingests bank transaction data, detects anomalous spending using statistical analysis, and visualizes insights through an interactive dashboard.
Test credentials:
test@test.com/123456
Managing personal finances is difficult when transaction data is scattered across bank statements. Existing tools either require manual entry or expensive integrations. FinSight solves this by allowing users to upload their bank statement CSV and instantly get:
- Category-wise spending breakdown
- Monthly trend analysis
- Automatic detection of unusual transactions using z-score statistical analysis
- JWT Authentication — Secure register/login with bcrypt password hashing and stateless JWT tokens
- CSV Upload Pipeline — Upload bank statements in CSV format; system parses, validates, and stores transactions automatically
- Z-Score Anomaly Detection — Flags transactions that deviate more than 2 standard deviations from category average — no ML required
- Duplicate Prevention — Re-uploading the same CSV skips existing transactions automatically
- Spending Dashboard — Interactive pie chart for category breakdown and line chart for monthly trends with 3M/6M/1Y filters
- Transaction Table — Paginated transaction history with anomaly highlighting and filter by category/date
Client (React)
↓ HTTP / Axios
Node.js + Express API
↓ pg Pool
PostgreSQL (Neon)
CSV Upload Flow:
CSV File → Multer → csv-parser → Z-Score Check → PostgreSQL INSERT
Uses z-score statistical analysis — no external ML libraries needed.
// Z-Score Formula
zScore = |( amount - mean ) / stdDev|
// Flag as anomaly if zScore > 2
// This covers 95% of normal distribution
// Requires minimum 3 transactions per category (cold start handling)Example:
- Food transactions: ₹180, ₹220, ₹250 → mean = ₹217, stdDev = ₹29
- New transaction: ₹4500 (Catering bill) → z-score = 152 → FLAGGED
⚠️
Interview talking point: Z-score is more appropriate than ML here because it requires zero training data, adapts to each user's personal spending pattern, and is fully explainable.
-- Users table
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
password VARCHAR(255) NOT NULL, -- bcrypt hashed
name VARCHAR(100),
created_at TIMESTAMP DEFAULT NOW()
);
-- Transactions table
CREATE TABLE transactions (
id SERIAL PRIMARY KEY,
user_id INT REFERENCES users(id) ON DELETE CASCADE,
date DATE NOT NULL,
amount DECIMAL(10,2) NOT NULL,
category VARCHAR(100),
description TEXT,
is_anomaly BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT NOW()
);
-- Indexes for query optimisation
CREATE INDEX idx_transactions_user_date ON transactions(user_id, date);
CREATE INDEX idx_transactions_category ON transactions(user_id, category);Why PostgreSQL? Financial transaction data is relational and structured. PostgreSQL provides ACID compliance (critical for financial data), strong indexing, and native aggregation functions (SUM, GROUP BY) for insights queries.
Query optimisation: Composite indexes on (user_id, date) and (user_id, category) reduced insight query time from ~300ms to ~12ms at scale (verified with EXPLAIN ANALYZE).
Your bank statement CSV must follow this format:
date,amount,category,description
2024-01-05,250,Food,Swiggy order
2024-01-06,6000,Transport,Flight ticket
2024-01-07,180,Food,Zomato order| Column | Format | Example |
|---|---|---|
| date | YYYY-MM-DD | 2024-01-05 |
| amount | Number | 250 |
| category | Text | Food, Transport, Shopping |
| description | Text | Swiggy order |
1. User registers → password hashed with bcrypt (salt rounds: 10)
2. Server signs JWT with userId + 24hr expiry
3. Client stores token in localStorage
4. Every request sends: Authorization: Bearer <token>
5. Auth middleware verifies JWT signature on protected routes
6. Invalid/expired token → 401 Unauthorized
Why JWT over sessions? Stateless authentication — server stores no session data, making it horizontally scalable.
| Layer | Technology | Why |
|---|---|---|
| Frontend | React.js | Component-based UI, easy state management |
| Charts | Chart.js + react-chartjs-2 | Lightweight, flexible charting |
| HTTP Client | Axios | Interceptors for auto JWT attachment |
| Backend | Node.js + Express | Fast, non-blocking I/O, REST API |
| Database | PostgreSQL (Neon) | ACID compliance, relational, strong indexing |
| Auth | JWT + bcrypt | Stateless, secure, industry standard |
| File Upload | Multer | Multipart form handling |
| CSV Parsing | csv-parser | Stream-based, memory efficient |
| Deployment | Vercel + Render + Neon | Free tier, production-ready |
finsight/
├── client/ # React frontend
│ └── src/
│ ├── components/
│ │ ├── PieChart.jsx # Category spending chart
│ │ ├── LineChart.jsx # Monthly trend with 3M/6M/1Y filter
│ │ ├── TransactionTable.jsx # Paginated transaction list
│ │ └── UploadCSV.jsx # CSV file upload
│ ├── pages/
│ │ ├── Login.jsx
│ │ ├── Register.jsx
│ │ └── Dashboard.jsx
│ └── utils/
│ └── api.js # Axios instance with JWT interceptor
│
└── server/ # Node.js backend
├── routes/
│ ├── auth.js # Register + Login
│ ├── transactions.js # Upload CSV + Get transactions
│ └── insights.js # Category + Monthly aggregations
├── middleware/
│ └── auth.js # JWT verification middleware
├── db/
│ ├── index.js # PostgreSQL connection pool
│ └── schema.sql # Table definitions + indexes
├── utils/
│ └── anomaly.js # Z-score detection logic
└── index.js # Express app entry point
# 1. Clone the repo
git clone https://github.com/yourusername/finsight.git
cd finsight
# 2. Setup server
cd server
npm install
# Create .env file with:
# PORT=5000
# DATABASE_URL=your_postgresql_url
# JWT_SECRET=your_secret_key
node index.js
# 3. Setup client (new terminal)
cd client
npm install
npm start| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | /api/auth/register |
No | Register new user |
| POST | /api/auth/login |
No | Login + get JWT |
| POST | /api/transactions/upload |
Yes | Upload CSV file |
| GET | /api/transactions |
Yes | Get paginated transactions |
| GET | /api/insights/by-category |
Yes | Spending by category |
| GET | /api/insights/monthly |
Yes | Monthly spending trend |
| GET | /api/insights/anomalies |
Yes | Flagged transactions only |
1. Z-score over ML for anomaly detection ML models require labelled training data — unavailable for a new user. Z-score adapts to each user's personal spending pattern from day one with zero training required.
2. Connection pooling over single connections
Used pg.Pool instead of individual connections per request. Reuses existing connections, prevents exhausting PostgreSQL's connection limit, significantly faster under load.
3. Stream-based CSV parsing
Used csv-parser stream instead of loading entire file into memory. A 50,000-row CSV is processed row-by-row — constant memory usage regardless of file size.
4. Composite indexes over single-column indexes
Queries always filter by user_id first, then date or category. Composite indexes on these pairs are significantly faster than separate single-column indexes for this access pattern.
5. Duplicate detection on insert Before inserting each row, checks if a transaction with the same date + amount + description exists. Prevents data corruption from re-uploading the same statement.
- Redis caching on insights endpoints — category totals don't change until new CSV uploaded, no need to re-aggregate on every dashboard load
- Background job processing — move CSV processing to a queue (Bull + Redis) so large files don't block the HTTP response
- Multi-currency support — normalize all amounts to INR before storing using exchange rate API
- Email alerts — notify users when anomaly detected using Nodemailer
- Budget limits — let users set monthly limits per category with threshold alerts
Pragati More
3rd Year Computer Engineering Student
GitHub · LinkedIn
MIT License — feel free to use this project as a reference.