SAARTHI

System for Aadhaar Analytics, Risk & Trend Highlighting

SAARTHI is a data-driven analytical framework developed as part of the UIDAI Data Hackathon, aimed at uncovering meaningful societal and administrative insights from anonymised Aadhaar enrolment and update datasets.
The project introduces a novel metric, the Update Dependency Index (UDI), and applies statistical anomaly detection to identify regions exhibiting abnormal update behaviour.

🎯 Problem Statement

Aadhaar enrolment and update activities reflect large-scale demographic, societal, and operational dynamics across India.
While high-level statistics are available, there is limited analytical visibility into post-enrolment update dependency, regional instability, and anomalous update patterns.

SAARTHI addresses this gap by:

- Quantifying update dependency using a unified index
- Detecting statistically abnormal regions
- Translating analytics into actionable insights for governance and system improvement

🧠 Key Concepts

Update Dependency Index (UDI)

UDI measures how dependent Aadhaar records are on post-enrolment updates.

UDI = (Total Demographic Updates + Total Biometric Updates) / Total Enrolments

Low UDI → Stable Aadhaar lifecycle
High UDI → Frequent corrections or lifecycle-driven updates
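
The formula above can be sketched in pandas as follows. This is a minimal illustration, assuming the three datasets have already been aggregated to one row per region. The column names (`district`, `enrolments`, `demographic_updates`, `biometric_updates`), the helper `compute_udi`, and the sample figures are all hypothetical and are not taken from the UIDAI files or the notebook.

```python
import numpy as np
import pandas as pd

def compute_udi(df: pd.DataFrame) -> pd.Series:
    """Return (demographic + biometric updates) / enrolments for each row.

    Rows with zero enrolments yield NaN instead of an infinite ratio.
    """
    updates = df["demographic_updates"] + df["biometric_updates"]
    enrolments = df["enrolments"].replace(0, np.nan)
    return updates / enrolments

# Illustrative numbers only, not real UIDAI data.
regions = pd.DataFrame({
    "district": ["A", "B", "C"],
    "enrolments": [1000, 500, 0],
    "demographic_updates": [100, 400, 5],
    "biometric_updates": [50, 350, 2],
})
regions["udi"] = compute_udi(regions)
print(regions[["district", "udi"]])
```

Treating zero-enrolment regions as missing is a design choice made for this sketch; the notebook may handle them differently.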

Anomaly Detection

Statistical Z-score–based anomaly detection is applied to UDI values to flag regions with unusually high update dependency, representing potential risk signals or areas requiring administrative attention.
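
A minimal sketch of Z-score flagging on UDI values is shown below. The cut-off is not stated in this README, so the `threshold=3.0` default is an assumption (a common convention), and the function name `flag_high_udi` and the sample values are hypothetical. The standard deviation uses the population form (`ddof=0`), which matches the default of `scipy.stats.zscore`.

```python
import numpy as np

def flag_high_udi(udi_values, threshold=3.0):
    """Return (z_scores, is_anomaly) for a 1-D sequence of UDI values.

    Only unusually HIGH values are flagged (one-sided test), in line with
    the focus on high update dependency. `threshold` is an assumed default.
    """
    udi = np.asarray(udi_values, dtype=float)
    mean = np.nanmean(udi)
    std = np.nanstd(udi)  # population standard deviation (ddof=0)
    if std == 0:
        # All values identical: nothing can be an outlier.
        z = np.zeros_like(udi)
    else:
        z = (udi - mean) / std
    return z, z > threshold

# Illustrative values: nineteen stable regions and one extreme region.
udi_sample = [0.10] * 19 + [5.0]
z_scores, is_anomaly = flag_high_udi(udi_sample)
print(int(is_anomaly.sum()), "region(s) flagged")
```

Because the mean and standard deviation are themselves pulled by extreme values, a Z-score rule can under-flag when several outliers are present; a median-based score is a common alternative, though nothing here suggests the project uses one.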

📊 Datasets Used

The analysis uses anonymised, aggregated datasets provided by UIDAI:

Aadhaar Enrolment Dataset
- Age-wise enrolment counts (0–5, 5–17, 18+)
- Spatial attributes: State, District, Pincode
- Temporal attribute: Date
Aadhaar Demographic Update Dataset
- Aggregated demographic update activity across age groups and regions
Aadhaar Biometric Update Dataset
- Aggregated biometric update information reflecting revalidation and lifecycle changes

Due to large data volume, datasets are provided as multiple state-wise CSV files and consolidated programmatically.
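
One way to consolidate the state-wise files is sketched below. It assumes every CSV in a dataset folder shares the same columns; the helper name `consolidate_csvs` is hypothetical and the notebook may do this differently.

```python
from pathlib import Path

import pandas as pd

def consolidate_csvs(folder, pattern="*.csv"):
    """Read every CSV matching `pattern` in `folder` into one DataFrame."""
    paths = sorted(Path(folder).glob(pattern))
    if not paths:
        raise FileNotFoundError(f"No files matching {pattern!r} in {folder}")
    # Read each file, then stack them with a fresh 0..n-1 index.
    frames = [pd.read_csv(path) for path in paths]
    return pd.concat(frames, ignore_index=True)
```

With the repository layout described in this README, a call such as `consolidate_csvs("data/enrolment-data")` would return a single table for the enrolment dataset.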

🛠 Methodology Overview

- Consolidation of state-wise CSV datasets
- Robust parsing of mixed-format date fields
- Aggregation of age-wise enrolments and updates
- Dataset integration using spatial and temporal keys
- Computation of Update Dependency Index (UDI)
- Statistical anomaly detection using Z-score analysis
- Visualisation and interpretation of findings
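
The date-parsing step listed above could be handled along these lines. The README does not say which formats occur in the data, so the entries in `DATE_FORMATS` are assumptions (day-first variants plus ISO), and `parse_mixed_date` is a hypothetical helper rather than the notebook's code.

```python
from datetime import datetime

# Candidate formats are assumptions; adjust to what the CSVs actually contain.
DATE_FORMATS = ("%d-%m-%Y", "%d/%m/%Y", "%Y-%m-%d")

def parse_mixed_date(text):
    """Try each known format in turn; return a date, or None if none match."""
    cleaned = str(text).strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # this format did not match; try the next one
    return None

samples = ["01-03-2025", "15/03/2025", "2025-03-20", "not a date"]
parsed = [parse_mixed_date(s) for s in samples]
print(parsed)
```

Applied to a pandas column, this would be `df["date"].map(parse_mixed_date)`. Returning `None` for unparseable values keeps bad rows visible for inspection instead of silently dropping them.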

📈 Key Insights

- Most regions exhibit low update dependency, indicating stable Aadhaar lifecycles
- A limited subset of districts and pincodes shows disproportionately high UDI values
- Demographic and biometric updates jointly contribute to observed instability
- Anomalous regions are structurally distinct from normal regions

🧪 Technology Stack

Language: Python
Libraries:
- Pandas
- NumPy
- SciPy
- Matplotlib
Environment: Jupyter Notebook

📁 Repository Structure

SAARTHI/
│
├── data/
│ ├── enrolment-data/
│ ├── demographic-data/
│ └── biometric-data/
│
├── saarthi.ipynb
├── SAARTHI.pdf
├── README.md
└── LICENSE

🚀 How to Run the Analysis

Clone the repository:

git clone https://github.com/your-username/SAARTHI.git
cd SAARTHI

Install required dependencies:

pip install pandas numpy scipy matplotlib

Launch Jupyter Notebook:
```
jupyter notebook
```
Open and run:
```
saarthi.ipynb
```

Ensure the dataset folders are placed under the data/ directory as shown above.

🔒 Data Privacy Notice

This project uses only anonymised and aggregated datasets provided for the UIDAI Data Hackathon. No personal or sensitive resident-level information is used or inferred.

👤 Author

Vrajkumar Shah

B.Tech, Computer Science & Engineering

Dharmsinh Desai University, Nadiad

📜 License

This project is released under the MIT License. See the LICENSE file for details.

⭐ Acknowledgements

Thanks to the following for providing the datasets and organising the hackathon:

- Unique Identification Authority of India (UIDAI)
- National Informatics Centre (NIC)
- Ministry of Electronics and Information Technology (MeitY)
