💰 FinSight — Personal Finance Intelligence System

A full-stack personal finance dashboard that ingests bank transaction data, detects anomalous spending using statistical analysis, and visualizes insights through an interactive dashboard.

Test credentials: test@test.com / 123456

Problem Statement

Managing personal finances is difficult when transaction data is scattered across bank statements. Existing tools either require manual entry or expensive integrations. FinSight solves this by allowing users to upload their bank statement CSV and instantly get:

Category-wise spending breakdown
Monthly trend analysis
Automatic detection of unusual transactions using z-score statistical analysis

Features

JWT Authentication — Secure register/login with bcrypt password hashing and stateless JWT tokens
CSV Upload Pipeline — Upload bank statements in CSV format; system parses, validates, and stores transactions automatically
Z-Score Anomaly Detection — Flags transactions that deviate more than 2 standard deviations from category average — no ML required
Duplicate Prevention — Re-uploading the same CSV skips existing transactions automatically
Spending Dashboard — Interactive pie chart for category breakdown and line chart for monthly trends with 3M/6M/1Y filters
Transaction Table — Paginated transaction history with anomaly highlighting and filter by category/date

System Architecture

Client (React) 
    ↓ HTTP / Axios
Node.js + Express API
    ↓ pg Pool
PostgreSQL (Neon)
    
CSV Upload Flow:
CSV File → Multer → csv-parser → Z-Score Check → PostgreSQL INSERT

Anomaly Detection — How It Works

Uses z-score statistical analysis — no external ML libraries needed.

// Z-Score Formula
zScore = |( amount - mean ) / stdDev|

// Flag as anomaly if zScore > 2
// This covers 95% of normal distribution
// Requires minimum 3 transactions per category (cold start handling)

Example:

Food transactions: ₹180, ₹220, ₹250 → mean = ₹217, stdDev = ₹29
New transaction: ₹4500 (Catering bill) → z-score = 152 → FLAGGED ⚠️

Interview talking point: Z-score is more appropriate than ML here because it requires zero training data, adapts to each user's personal spending pattern, and is fully explainable.

Database Schema

-- Users table
CREATE TABLE users (
  id          SERIAL PRIMARY KEY,
  email       VARCHAR(255) UNIQUE NOT NULL,
  password    VARCHAR(255) NOT NULL,       -- bcrypt hashed
  name        VARCHAR(100),
  created_at  TIMESTAMP DEFAULT NOW()
);

-- Transactions table
CREATE TABLE transactions (
  id          SERIAL PRIMARY KEY,
  user_id     INT REFERENCES users(id) ON DELETE CASCADE,
  date        DATE NOT NULL,
  amount      DECIMAL(10,2) NOT NULL,
  category    VARCHAR(100),
  description TEXT,
  is_anomaly  BOOLEAN DEFAULT FALSE,
  created_at  TIMESTAMP DEFAULT NOW()
);

-- Indexes for query optimisation
CREATE INDEX idx_transactions_user_date ON transactions(user_id, date);
CREATE INDEX idx_transactions_category ON transactions(user_id, category);

Why PostgreSQL? Financial transaction data is relational and structured. PostgreSQL provides ACID compliance (critical for financial data), strong indexing, and native aggregation functions (SUM, GROUP BY) for insights queries.

Query optimisation: Composite indexes on (user_id, date) and (user_id, category) reduced insight query time from ~300ms to ~12ms at scale (verified with EXPLAIN ANALYZE).

CSV Format

Your bank statement CSV must follow this format:

date,amount,category,description
2024-01-05,250,Food,Swiggy order
2024-01-06,6000,Transport,Flight ticket
2024-01-07,180,Food,Zomato order

Column	Format	Example
date	YYYY-MM-DD	2024-01-05
amount	Number	250
category	Text	Food, Transport, Shopping
description	Text	Swiggy order

🔐 Authentication Flow

1. User registers → password hashed with bcrypt (salt rounds: 10)
2. Server signs JWT with userId + 24hr expiry
3. Client stores token in localStorage
4. Every request sends: Authorization: Bearer <token>
5. Auth middleware verifies JWT signature on protected routes
6. Invalid/expired token → 401 Unauthorized

Why JWT over sessions? Stateless authentication — server stores no session data, making it horizontally scalable.

Tech Stack

Layer	Technology	Why
Frontend	React.js	Component-based UI, easy state management
Charts	Chart.js + react-chartjs-2	Lightweight, flexible charting
HTTP Client	Axios	Interceptors for auto JWT attachment
Backend	Node.js + Express	Fast, non-blocking I/O, REST API
Database	PostgreSQL (Neon)	ACID compliance, relational, strong indexing
Auth	JWT + bcrypt	Stateless, secure, industry standard
File Upload	Multer	Multipart form handling
CSV Parsing	csv-parser	Stream-based, memory efficient
Deployment	Vercel + Render + Neon	Free tier, production-ready

Project Structure

finsight/
├── client/                  # React frontend
│   └── src/
│       ├── components/
│       │   ├── PieChart.jsx         # Category spending chart
│       │   ├── LineChart.jsx        # Monthly trend with 3M/6M/1Y filter
│       │   ├── TransactionTable.jsx # Paginated transaction list
│       │   └── UploadCSV.jsx        # CSV file upload
│       ├── pages/
│       │   ├── Login.jsx
│       │   ├── Register.jsx
│       │   └── Dashboard.jsx
│       └── utils/
│           └── api.js               # Axios instance with JWT interceptor
│
└── server/                  # Node.js backend
    ├── routes/
    │   ├── auth.js                  # Register + Login
    │   ├── transactions.js          # Upload CSV + Get transactions
    │   └── insights.js              # Category + Monthly aggregations
    ├── middleware/
    │   └── auth.js                  # JWT verification middleware
    ├── db/
    │   ├── index.js                 # PostgreSQL connection pool
    │   └── schema.sql               # Table definitions + indexes
    ├── utils/
    │   └── anomaly.js               # Z-score detection logic
    └── index.js                     # Express app entry point

Local Setup

# 1. Clone the repo
git clone https://github.com/yourusername/finsight.git
cd finsight

# 2. Setup server
cd server
npm install
# Create .env file with:
# PORT=5000
# DATABASE_URL=your_postgresql_url
# JWT_SECRET=your_secret_key
node index.js

# 3. Setup client (new terminal)
cd client
npm install
npm start

API Endpoints

Method	Endpoint	Auth	Description
POST	`/api/auth/register`	No	Register new user
POST	`/api/auth/login`	No	Login + get JWT
POST	`/api/transactions/upload`	Yes	Upload CSV file
GET	`/api/transactions`	Yes	Get paginated transactions
GET	`/api/insights/by-category`	Yes	Spending by category
GET	`/api/insights/monthly`	Yes	Monthly spending trend
GET	`/api/insights/anomalies`	Yes	Flagged transactions only

Key Engineering Decisions

1. Z-score over ML for anomaly detection ML models require labelled training data — unavailable for a new user. Z-score adapts to each user's personal spending pattern from day one with zero training required.

2. Connection pooling over single connections Used pg.Pool instead of individual connections per request. Reuses existing connections, prevents exhausting PostgreSQL's connection limit, significantly faster under load.

3. Stream-based CSV parsing Used csv-parser stream instead of loading entire file into memory. A 50,000-row CSV is processed row-by-row — constant memory usage regardless of file size.

4. Composite indexes over single-column indexes Queries always filter by user_id first, then date or category. Composite indexes on these pairs are significantly faster than separate single-column indexes for this access pattern.

5. Duplicate detection on insert Before inserting each row, checks if a transaction with the same date + amount + description exists. Prevents data corruption from re-uploading the same statement.

What I'd Add at Scale

Redis caching on insights endpoints — category totals don't change until new CSV uploaded, no need to re-aggregate on every dashboard load
Background job processing — move CSV processing to a queue (Bull + Redis) so large files don't block the HTTP response
Multi-currency support — normalize all amounts to INR before storing using exchange rate API
Email alerts — notify users when anomaly detected using Nodemailer
Budget limits — let users set monthly limits per category with threshold alerts

👨‍💻 Author

Pragati More
3rd Year Computer Engineering Student
GitHub · LinkedIn

📄 License

MIT License — feel free to use this project as a reference.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
client		client
server		server
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

💰 FinSight — Personal Finance Intelligence System

Problem Statement

Features

System Architecture

Anomaly Detection — How It Works

Database Schema

CSV Format

🔐 Authentication Flow

Tech Stack

Project Structure

Local Setup

API Endpoints

Key Engineering Decisions

What I'd Add at Scale

👨‍💻 Author

📄 License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

💰 FinSight — Personal Finance Intelligence System

Problem Statement

Features

System Architecture

Anomaly Detection — How It Works

Database Schema

CSV Format

🔐 Authentication Flow

Tech Stack

Project Structure

Local Setup

API Endpoints

Key Engineering Decisions

What I'd Add at Scale

👨‍💻 Author

📄 License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages