DarkShield is a premium-style transparency tool that helps teams detect and explain dark patterns in e-commerce interfaces.
DarkShield combines:
- A Chrome extension for on-page detection
- A Flask API for classification and reporting
- A Streamlit dashboard for analytics and trends
- A local SQLite database for persistent evidence
In one sentence: it turns manipulative UI signals into readable risk insights with traceable evidence.
Dark patterns are design/copy tactics that pressure users into decisions they may not intend (for example: fake urgency, hidden fees, hard cancellation flows).
- What this means: users are nudged by friction, pressure, or confusion.
- Why it matters: these patterns can reduce informed consent and trust.
- Example: "Only 1 left" + hidden opt-out path can increase rushed purchases.
- DarkShield scans UI text snippets on supported domains.
- It classifies likely pattern types (local ONNX or API mode).
- It stores detections and site scores in SQLite.
- Dashboard + extension popup explain:
  - what was flagged,
  - why it was flagged,
  - what risk level it implies.
┌─────────────────────────────────────────────────────────────────────┐
│ Chrome Extension (MV3) │
│ - content script extracts snippets │
│ - local ONNX inference or API fallback │
│ - highlights + popup explanations │
└──────────────────────┬──────────────────────────────────────────────┘
│ classify / report
▼
┌─────────────────────────────────────────────────────────────────────┐
│ Flask API │
│ - /api/classify, /api/site-score, /api/v1/* │
│ - optional compliance context mapping │
└──────────────────────┬──────────────────────────────────────────────┘
│ writes + reads
▼
┌─────────────────────────────────────────────────────────────────────┐
│ SQLite (data/darkshield.db) │
│ - sites, detections, score_history, user_reports │
└──────────────────────┬──────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ Streamlit Dashboard │
│ - leaderboard, vertical analysis, pattern report, A/B heuristic │
└─────────────────────────────────────────────────────────────────────┘
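To make the storage layer in the diagram more concrete, here is a minimal sketch using Python's built-in sqlite3 module. Only the four table names come from this README; every column name and type, and the helper names open_db and record_detection, are assumptions for illustration, and the project's real schema may differ.

```python
import sqlite3

# Hypothetical schema: table names are from the README, columns are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sites (
    domain TEXT PRIMARY KEY,
    vertical TEXT,
    score REAL
);
CREATE TABLE IF NOT EXISTS detections (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    domain TEXT NOT NULL REFERENCES sites(domain),
    pattern TEXT NOT NULL,
    snippet TEXT NOT NULL,
    confidence REAL
);
CREATE TABLE IF NOT EXISTS score_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    domain TEXT NOT NULL REFERENCES sites(domain),
    score REAL NOT NULL,
    recorded_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS user_reports (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    domain TEXT NOT NULL,
    note TEXT
);
"""


def open_db(path=":memory:"):
    """Open (or create) a database and make sure the sketch tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_detection(conn, domain, pattern, snippet, confidence):
    """Store one flagged snippet, creating the site row if it is missing."""
    conn.execute("INSERT OR IGNORE INTO sites (domain) VALUES (?)", (domain,))
    conn.execute(
        "INSERT INTO detections (domain, pattern, snippet, confidence) "
        "VALUES (?, ?, ?, ?)",
        (domain, pattern, snippet, confidence),
    )
    conn.commit()


# In-memory demo with made-up values.
conn = open_db()
record_detection(conn, "example.com", "fake_urgency", "Only 1 left", 0.91)
rows = conn.execute("SELECT pattern, snippet FROM detections").fetchall()
print(rows)
```

Keeping detections in their own table, keyed by domain, is what would let a dashboard show the snippet evidence behind each site score.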
Core model flow:
- TF-IDF + Random Forest classifier
- Optional confirm-shaming second pass
- ONNX export for browser inference
- Evaluation outputs under ml/models/saved/eval_results/
- Real-time snippet classification in extension
- Pattern-specific explanations in popup and dashboard
- Regulatory context mappings (educational, not legal advice)
- Domain leaderboard and risk score
- Vertical-level comparison and pattern distributions
- Site deep dive with trend lines and snippets
- DistilBERT benchmark against RF baseline
- Adversarial robustness scripts
- Human annotation agreement pipeline
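One common agreement statistic for two annotators is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is a plain-Python illustration of that formula, not the code in annotation_validator.py; the label names are made up.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each annotator labeled at random
    # according to their own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four snippets.
annotator_1 = ["urgency", "urgency", "clean", "clean"]
annotator_2 = ["urgency", "clean", "clean", "clean"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.5
```

Here the annotators agree on 3 of 4 items (0.75), chance agreement is 0.5, so kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.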
cd /path/to/darkshield
python3 -m venv .venv
source .venv/bin/activate
pip install -U pip setuptools wheel
pip install -r requirements.txt
python -c "import nltk; nltk.download('wordnet', quiet=True); nltk.download('omw-1.4', quiet=True)"
pip install -r requirements-bert.txt
npm install
cp node_modules/onnxruntime-web/dist/ort.min.js extension/vendor/
cp node_modules/onnxruntime-web/dist/*.wasm extension/vendor/
source .venv/bin/activate
python ml/dataset/builder.py
python ml/models/train.py
python ml/models/confirm_shaming_nlp.py
python ml/models/export_onnx.py
python ml/models/evaluate.py
source .venv/bin/activate
python -m api.app
If port 5000 is busy:
PORT=5001 python -m api.app
source .venv/bin/activate
streamlit run dashboard/app.py
- Open chrome://extensions
- Enable Developer mode
- Click Load unpacked
- Select the extension/ folder
- Manipulation Risk score
- Pattern breakdown by type + share
- Expandable details per pattern and snippet
- Leaderboard of high-risk domains
- Vertical comparison and pattern distributions
- Detailed site-level snippet evidence
- A/B variability heuristic report
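Elsewhere this README describes the A/B heuristic as multi-user-agent fetches plus variance. The sketch below illustrates only the variance step, on page text that has already been fetched. Using the first rupee price as the compared signal is an assumption, as are the names first_price and price_variability; the real ab_test_detector.py may compare something else.

```python
import re
import statistics

# Matches prices such as "₹1,299" or "Rs. 1,399" (assumed formats).
PRICE_RE = re.compile(r"(?:₹|Rs\.?)\s*([\d,]+(?:\.\d+)?)")


def first_price(page_text):
    """Return the first rupee price found in the text, or None."""
    match = PRICE_RE.search(page_text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


def price_variability(pages_by_agent):
    """Summarize how much the displayed price differs across user agents.

    pages_by_agent maps a user-agent label to the page text fetched with it.
    """
    prices = {ua: first_price(text) for ua, text in pages_by_agent.items()}
    found = [p for p in prices.values() if p is not None]
    # Population variance; needs at least two prices to be meaningful.
    variance = statistics.pvariance(found) if len(found) >= 2 else 0.0
    return {"prices": prices, "variance": variance, "varies": variance > 0.0}


# Made-up responses for two user agents.
report = price_variability({
    "desktop-chrome": "Deal price ₹1,299 today",
    "mobile-safari": "Deal price Rs. 1,399 today",
})
print(report["variance"])  # 2500.0
```

A non-zero variance is only a hint: prices can also differ because of caching, location, or stock changes between fetches.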
Key scripts:
- ml/models/bert_benchmark.py (DistilBERT comparison)
- ml/models/adversarial_test.py (robustness checks)
- ml/dataset/annotation_validator.py (agreement metrics)
- scripts/generate_research_summary.py (paper draft artifacts)
Main evaluation artifacts:
- summary.json, metrics.json, chi_square_results.json
- confusion_matrix.html, vertical_manipulation_bar.html
- vertical_pattern_heatmap.png
- Pattern detection is probabilistic, not legal adjudication.
- Domain support is bounded by extension host permissions.
- Some manipulative behavior is visual/interaction-only (harder to infer from text).
- Chi-square findings show association in this dataset, not causality.
- Wider domain coverage beyond current allowlist
- Better visual-pattern detection (layout and interaction cues)
- Human-in-the-loop review workflow in dashboard
- Time-aware drift monitoring and alerting
- Richer developer docs for deployment and CI
See full details in API_DOCS.md.
curl -s http://127.0.0.1:5000/api/classify \
-H 'Content-Type: application/json' \
-d '{"texts":["Only 2 left — ends in 00:15:00"],"include_compliance":true}'
curl -s 'http://127.0.0.1:5000/api/site-score?domain=amazon.in'
curl -s 'http://127.0.0.1:5000/api/v1/score?domain=amazon.in'
DarkShield is for research and transparency workflows. Regulatory strings are illustrative and are not legal advice.
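The /api/classify call from the curl examples can also be made from Python with only the standard library. This is a sketch under assumptions: the helper names are invented here, the sample text is made up, and the response is returned as decoded JSON without assuming any particular fields, since the response shape is documented in API_DOCS.md rather than in this README.

```python
import json
import urllib.request


def build_classify_request(texts, base_url="http://127.0.0.1:5000",
                           include_compliance=True):
    """Build (but do not send) a POST request for /api/classify."""
    payload = {"texts": list(texts), "include_compliance": include_compliance}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/classify",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(texts, **kwargs):
    """Send the request and decode the JSON reply (needs a running API)."""
    request = build_classify_request(texts, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network.
request = build_classify_request(["Only 2 left, ends in 00:15:00"])
print(request.get_method(), request.full_url)
```

If the API runs on another port (for example 5001), pass base_url="http://127.0.0.1:5001" to either helper.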
DarkShield is a dark-pattern transparency stack focused on Indian e-commerce: it classifies manipulative UI copy, stores aggregate scores in SQLite, exposes a Flask API, ships a Chrome extension (Manifest V3), and includes a Streamlit dashboard for exploration and reporting.
All data in this repository is synthetic, rule-labeled, or scraped from public pages for research — not user PII.
| Area | Features |
|---|---|
| ML core | TF-IDF + Random Forest pipeline; optional confirm-shaming cosine second pass; training-time text normalization (Unicode / leet / zero-width mitigations). |
| Dataset | Builder merges seed, scraped JSON, synthetic clean lines, Princeton DAPD–inspired samples, FTC-themed lines, CNIL-style lines; cosine dedupe (~0.95); augmentation; adversarial row copies; target 5k+ rows. |
| Evaluation | Holdout metrics, confusion matrix HTML, Cramér’s V, per-pattern χ² vs vertical, seaborn heatmap (vertical_pattern_heatmap.png), summary.json, metrics.json, chi_square_results.json. |
| Research / benchmark | ml/models/bert_benchmark.py — fine-tunes DistilBERT on the same 80/20 split as RF; writes model_comparison.json (also optional requirements-bert.txt + accelerate — not in core requirements.txt). |
| Robustness | ml/models/adversarial_test.py — obfuscation transforms, F1 drop report, JSON under eval_results/. |
| Human validation | ml/dataset/annotation_validator.py — stratified CSV + Cohen’s / Fleiss’ kappa → annotation_agreement.json. Guide: data/labeled/ANNOTATION_GUIDE.md. |
| ONNX / browser | ml/models/export_onnx.py — full pipeline → extension/models/darkshield_pipeline.onnx + labels.json; keyword_rules.json; extension/utils/onnx_classifier.js loads ONNX Runtime Web from extension/vendor/ (copy ort.min.js + .wasm from onnxruntime-web/dist/). |
| Extension UX | Local inference default (chrome.storage inference_mode: local or api); keyword fast-path + ONNX; popup shows score, breakdown, detailed expanders (explanation + each snippet), regulatory risk banner; options page for API base + mode. |
| API | /api/classify (optional include_compliance), /api/compliance-report, site score, leaderboard, stats, health; versioned public API in api/public_routes.py (/api/v1/score, leaderboard, categories, stats) with Flask-Limiter (wired in api/app.py). |
| Compliance | ml/compliance/regulatory_mapper.py — pattern → illustrative regulation snippets (educational only). |
| Database | score_history table + temporal_tracker.py (weekly snapshots); dashboard time-series + sale-window hints. |
| A/B heuristic | ml/analysis/ab_test_detector.py — multi–user-agent fetches + variance; dashboard tab. |
| Landing + docs | landing/index.html (static demo UI); API_DOCS.md — endpoints, curl, rate limits. |
| Paper draft | scripts/generate_research_summary.py → research_paper.pdf (reads real JSON metrics). |
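The ML core row mentions text normalization with Unicode, leet, and zero-width mitigations. A minimal sketch of what such a step could look like follows; the character lists, the leet mapping, and the function name normalize are all assumptions, not the project's actual rules.

```python
import unicodedata

# Zero-width and BOM characters that can split a keyword invisibly.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))

# Assumed leetspeak substitutions; a real mapping would likely be tuned.
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "@": "a"})


def _fold_token(token):
    """Undo leet only in tokens that contain letters.

    Plain numbers such as "1" or "00:15:00" are left untouched so that
    counts and timers keep their meaning.
    """
    if any(ch.isalpha() for ch in token):
        return token.translate(LEET)
    return token


def normalize(text):
    """Return a canonical lowercase form of a UI snippet."""
    text = unicodedata.normalize("NFKC", text)  # fold full-width forms etc.
    text = text.translate(ZERO_WIDTH)           # drop invisible characters
    tokens = text.lower().split()               # also collapses whitespace
    return " ".join(_fold_token(t) for t in tokens)


cleaned = normalize("L\u200bimited  0FF3R, only 1 left")
print(cleaned)  # limited offer, only 1 left
```

The token-level check is a deliberate trade-off: it avoids rewriting genuine numbers, but a mixed token such as "2day" would still be altered, so a production rule set would need more care.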
darkshield/
├── api/
│ ├── app.py # Flask factory + CORS + public blueprint + limiter
│ ├── routes.py # classify, compliance, site-score, leaderboard, …
│ ├── public_routes.py # /api/v1/* (rate-limited)
│ └── schemas.py
├── dashboard/
│ └── app.py # Streamlit: leaderboard, verticals, site deep dive,
│ # Full pattern report, A/B tab, statistics
├── extension/
│ ├── manifest.json
│ ├── content.js # DOM scan; local ONNX or API
│ ├── popup.html / popup.js
│ ├── options.html / options.js
│ ├── models/ # ONNX, labels.json, keyword_rules.json (from export)
│ ├── vendor/ # ort.min.js + *.wasm (from onnxruntime-web)
│ └── utils/ # onnx_classifier.js, classifier.js, highlighter.js
├── landing/
│ └── index.html # Public API landing (point at your API origin)
├── ml/
│ ├── analysis/ # ab_test_detector.py
│ ├── compliance/ # regulatory_mapper.py
│ ├── dataset/ # builder, augmenter, external_datasets, annotation_validator, …
│ ├── database/ # transparency_db.py, temporal_tracker.py, site_scorer
│ ├── models/ # train, evaluate, export_onnx, bert_benchmark, adversarial_test, …
│ └── scraper/
├── scripts/
│ ├── generate_research_summary.py
│ └── gen_extension_icon.py # optional raster icons
├── data/
│ ├── labeled/ # dataset.csv, ANNOTATION_GUIDE.md, human_validation_set.csv (generated)
│ ├── scraped_raw/
│ └── darkshield.db # created/updated by evaluate.py + API usage
├── requirements.txt # Core stack (Flask, sklearn, skl2onnx, onnx, onnxruntime, …)
├── requirements-bert.txt # torch, transformers, accelerate — DistilBERT only
├── API_DOCS.md
└── README.md
- Python: 3.10–3.12 is the smoothest for ML wheels. 3.14 works for the core app if pip can install onnx/skl2onnx wheels; DistilBERT may need requirements-bert.txt or a 3.12 venv if tokenizers fails to install.
- Chrome (for the extension and optional Selenium scraper).
cd /path/to/darkshield
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -U pip setuptools wheel
pip install -r requirements.txt
NLTK (WordNet augmenter; may download on first augment):
python -c "import nltk; nltk.download('wordnet', quiet=True); nltk.download('omw-1.4', quiet=True)"
Optional: DistilBERT benchmark
pip install -r requirements-bert.txt
Optional: extension toolbar icon (PNG)
python scripts/gen_extension_icon.py
ONNX Runtime Web (browser): copy ort.min.js and *.wasm from node_modules/onnxruntime-web/dist/ into extension/vendor/ so local inference works offline.
source .venv/bin/activate
cd /path/to/darkshield
# 1) Dataset (offline-capable; optional scrape feeds scraped_raw/)
python ml/dataset/builder.py
# 2) Train RF + persist pipeline
python ml/models/train.py
# 3) Confirm-shaming helper (if you use the merged eval path)
python ml/models/confirm_shaming_nlp.py
# 4) Export ONNX + keyword JSON for the extension
python ml/models/export_onnx.py
# 5) Evaluate, write plots/JSON, refresh SQLite from holdout logic
python ml/models/evaluate.py
Optional:
python ml/models/bert_benchmark.py # DistilBERT vs RF (research)
python ml/models/adversarial_test.py # Robustness JSON
python -m ml.database.temporal_tracker # Append score_history snapshots
python scripts/generate_research_summary.py # research_paper.pdf
python ml/dataset/annotation_validator.py build # human validation CSV
API (default port 5000):
source .venv/bin/activate
python -m api.app
If port 5000 is busy (common on macOS, where AirPlay or another process may hold it):
PORT=5001 python -m api.app
Set the extension API base URL (Options) to match.
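The PORT override above implies the server reads its port from the environment. How api/app.py does this is not shown in this README, so the following is only a sketch of one reasonable approach; resolve_port is a hypothetical helper.

```python
import os


def resolve_port(environ=None, default=5000):
    """Pick the listening port from PORT, falling back to the default."""
    env = os.environ if environ is None else environ
    raw = env.get("PORT", "").strip()
    if not raw:
        return default
    port = int(raw)  # raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port


print(resolve_port({"PORT": "5001"}))  # 5001
print(resolve_port({}))                # 5000
```

Failing loudly on a bad value is usually preferable to silently binding the default port, because the extension's API base URL has to match whatever port is actually in use.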
Dashboard:
source .venv/bin/activate
streamlit run dashboard/app.py
Use the sidebar → Full pattern report for plain-language pattern definitions, regulatory context, and every stored detection per site.
Extension: chrome://extensions → Developer mode → Load unpacked → select the extension/ folder. Reload the extension after code changes.
- inference_mode (storage): local (default) = ONNX + keywords in the browser; api = POST to Flask /api/classify.
- Console (local mode): [DarkShield] Running local inference — no data leaves your browser.
- Popup: manipulation score from DB, pattern counts, expandable detailed view (explanation + snippets), optional regulatory banner.
Full detail: API_DOCS.md.
Examples:
curl -s http://127.0.0.1:5000/api/classify \
-H 'Content-Type: application/json' \
-d '{"texts":["Only 2 left — ends in 00:15:00"],"include_compliance":true}'
curl -s 'http://127.0.0.1:5000/api/site-score?domain=amazon.in'
curl -s 'http://127.0.0.1:5000/api/v1/score?domain=amazon.in'
Under ml/models/saved/eval_results/:
- summary.json, metrics.json, chi_square_results.json
- confusion_matrix.html, vertical_manipulation_bar.html, vertical_pattern_heatmap.png
Chi-square / Cramér’s V: these test whether dark-pattern prevalence is associated with vertical on the holdout slice. A small p-value indicates dependence in this dataset; it is not proof of causality.
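For readers unfamiliar with these statistics, the sketch below computes the Pearson chi-square statistic and Cramér’s V for a small contingency table in plain Python. The counts are invented, and this is not the evaluation code in evaluate.py. It omits the p-value, which needs the chi-square distribution (available, for instance, through scipy.stats.chi2_contingency).

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for a 2-D contingency table of counts.

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(table):
    """Cramér's V effect size: 0 means no association, 1 means perfect."""
    total = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_square(table) / (total * k))


# Made-up counts: rows are two verticals, columns are dark / not dark.
table = [[30, 10],
         [10, 30]]
print(chi_square(table))  # 20.0
print(cramers_v(table))   # 0.5
```

Every expected cell count here is 20, so the statistic is 4 × (10² / 20) = 20 and V = sqrt(20 / 80) = 0.5, a moderate-to-strong association that still says nothing about cause.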
Open landing/index.html in a browser (or serve statically). Point the demo API base field at your running Flask origin (e.g. http://127.0.0.1:5000).
┌─────────────────┐ local ONNX / optional API ┌─────────────────┐
│ Chrome extension│ ◄────────────────────────────► │ Flask API │
│ (DOM + keywords)│ │ (RF + confirm) │
└────────┬────────┘ └────────┬────────┘
│ │
│ transparency score / detections │
▼ ▼
┌─────────────────────────────────────────────────────────────────────────┐
│ SQLite (darkshield.db) — sites, detections, score_history, user_reports │
└─────────────────────────────────────────────────────────────────────────┘
▲
│ reads
┌────────┴────────┐
│ Streamlit │
│ dashboard │
└─────────────────┘
Regulatory strings in the codebase are illustrative and not legal advice. The extension and API are intended for research and transparency, not automated enforcement.