perf: add database indexes and optimize dashboard query for production-scale workloads by KaparthyReddy · Pull Request #268 · utksh1/SecuScan

KaparthyReddy · 2026-05-23T16:51:14Z

Description

Profiles and optimizes the four hot query paths identified in the issue scope: dashboard aggregation, findings list, reports list, and task queries.

Query optimization (routes.py):

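Replaced the SELECT * FROM findings full table load + Python-side severity counting loop with a single SELECT severity, COUNT(*) GROUP BY severity DB-level aggregation. The database still visits every finding (or index entry), but the dashboard now receives one row per severity level instead of one row per finding, so transfer and Python-side work no longer grow with the size of the findings table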
Replaced full findings fetch for recent_findings with SELECT ... ORDER BY discovered_at DESC LIMIT 5 — only 5 rows transferred instead of the full collection
Added SELECT COUNT(*) AS total FROM findings for total count instead of len(all_findings) after full fetch
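
The three query changes above can be sketched with the standard-library sqlite3 module. This is a minimal illustration under stated assumptions, not the code in routes.py: it assumes a SQLite database, and the table layout, the dashboard_summary helper, and the severity labels are hypothetical.

```python
import sqlite3

# Hypothetical severity labels; the project's real set may differ.
SEVERITIES = ("critical", "high", "medium", "low", "info")


def dashboard_summary(conn: sqlite3.Connection) -> dict:
    """Aggregate findings inside the database instead of in Python."""
    # One result row per severity present; counting happens in the database.
    counts = dict.fromkeys(SEVERITIES, 0)
    for severity, n in conn.execute(
        "SELECT severity, COUNT(*) FROM findings GROUP BY severity"
    ):
        counts[severity] = n

    # Total count without fetching any finding rows.
    total = conn.execute("SELECT COUNT(*) AS total FROM findings").fetchone()[0]

    # Only the five newest rows are transferred.
    recent = conn.execute(
        "SELECT id, severity, discovered_at FROM findings "
        "ORDER BY discovered_at DESC LIMIT 5"
    ).fetchall()
    return {"severity_counts": counts, "total": total, "recent_findings": recent}


# Assumed minimal schema and seed data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE findings ("
    "id INTEGER PRIMARY KEY, severity TEXT, discovered_at TEXT)"
)
conn.executemany(
    "INSERT INTO findings (severity, discovered_at) VALUES (?, ?)",
    [(SEVERITIES[i % 5], f"2026-01-01T00:00:{i:02d}") for i in range(50)],
)
summary = dashboard_summary(conn)
print(summary["severity_counts"], summary["total"])
```

Pre-filling the counts with zeros keeps the response shape stable when a severity has no findings, which is the case the zero-state test exercises.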

Index additions (database.py + migrations/001_add_performance_indexes.sql):

idx_tasks_status_created(status, created_at DESC) — composite index for dashboard running tasks query; eliminates full scan + filter
idx_findings_severity — supports GROUP BY severity aggregation
idx_findings_task_id — supports foreign key lookups from tasks
idx_findings_discovered_at DESC — supports ORDER BY on findings list
idx_findings_plugin_id, idx_findings_target — common filter columns
idx_findings_task_severity(task_id, severity) — composite for per-task severity breakdown
idx_reports_task_id, idx_reports_generated_at DESC, idx_reports_status — reports list and filter queries
idx_audit_timestamp DESC, idx_audit_event_type, idx_audit_task_id — audit log queries
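
As a rough sketch of how two of these indexes could be created and checked, the snippet below applies idempotent CREATE INDEX statements and inspects the query plan. It assumes SQLite; the table definitions are deliberately minimal stand-ins for the real schema, and query_plan is a hypothetical helper, not part of database.py.

```python
import sqlite3

# Two of the index definitions listed above, written as idempotent DDL.
INDEX_DDL = [
    "CREATE INDEX IF NOT EXISTS idx_tasks_status_created "
    "ON tasks(status, created_at DESC)",
    "CREATE INDEX IF NOT EXISTS idx_findings_severity ON findings(severity)",
]


def query_plan(conn, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Assumed minimal tables; the real schema has more columns.
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT, created_at TEXT)"
)
conn.execute("CREATE TABLE findings (id INTEGER PRIMARY KEY, severity TEXT)")
for ddl in INDEX_DDL:
    conn.execute(ddl)

# The equality filter on status lets the planner search the composite index.
tasks_plan = query_plan(
    conn,
    "SELECT id FROM tasks WHERE status = ? ORDER BY created_at DESC",
    ("running",),
)
group_plan = query_plan(
    conn, "SELECT severity, COUNT(*) FROM findings GROUP BY severity"
)
print(tasks_plan)
print(group_plan)
```

Using IF NOT EXISTS means the same statements can run both at schema creation and in a migration against an existing database without failing on a second run.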

Related Issues

Closes #257

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update

How Has This Been Tested?

Integration tests — testing/backend/integration/test_database_indexes.py (14 tests, all passing):

10 index existence tests: verify every new index is present after schema migration
test_dashboard_severity_counts_correct: seeds 50 findings (10 per severity), asserts exact counts from dashboard endpoint
test_dashboard_recent_findings_limit: seeds 200 findings, asserts recent_findings length ≤ 5
test_dashboard_empty_findings: asserts correct zero-state response with no findings
test_dashboard_severity_counts_with_single_severity: seeds 15 critical findings, asserts all other severities return 0
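
An index existence check of the kind described above might look like the following. This is a sketch assuming SQLite, where index names can be read from sqlite_master; the schema, the EXPECTED set, and the existing_indexes helper are hypothetical and are not taken from test_database_indexes.py.

```python
import sqlite3

# Hypothetical subset of the index names this PR adds.
EXPECTED = {"idx_findings_severity", "idx_findings_task_id"}


def existing_indexes(conn: sqlite3.Connection) -> set:
    """Return the names of all indexes recorded in the schema table."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
    return {name for (name,) in rows}


def test_indexes_exist():
    conn = sqlite3.connect(":memory:")
    # Stand-in for running the real schema setup or migration script.
    conn.executescript(
        """
        CREATE TABLE findings (
            id INTEGER PRIMARY KEY, task_id TEXT, severity TEXT
        );
        CREATE INDEX idx_findings_severity ON findings(severity);
        CREATE INDEX idx_findings_task_id ON findings(task_id);
        """
    )
    missing = EXPECTED - existing_indexes(conn)
    assert not missing, f"missing indexes: {sorted(missing)}"
```

Comparing sets and reporting the difference makes a failure name the exact index that was not created.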

Benchmark script — scripts/benchmark_db.py:

Seeds 10,000 findings and 1,000 tasks
Prints EXPLAIN QUERY PLAN output for hot paths
Reports avg/min/max execution time over 10 runs for each query
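
The measurement loop described above can be approximated as follows. This is a simplified sketch, not the contents of scripts/benchmark_db.py: it assumes SQLite, seeds only a reduced findings table, and time_query is a hypothetical helper.

```python
import sqlite3
import statistics
import time

SEVERITIES = ("critical", "high", "medium", "low", "info")  # assumed labels


def time_query(conn, sql, params=(), runs=10):
    """Run a query several times and report avg/min/max wall time in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()  # fetch all rows so the query completes
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "avg_ms": statistics.mean(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
    }


conn = sqlite3.connect(":memory:")
# Assumed minimal table; the real findings table has more columns.
conn.execute("CREATE TABLE findings (id INTEGER PRIMARY KEY, severity TEXT)")
conn.executemany(
    "INSERT INTO findings (severity) VALUES (?)",
    ((SEVERITIES[i % 5],) for i in range(10_000)),
)
conn.execute("CREATE INDEX idx_findings_severity ON findings(severity)")

sql = "SELECT severity, COUNT(*) FROM findings GROUP BY severity"
for row in conn.execute("EXPLAIN QUERY PLAN " + sql):
    print(row[-1])  # last column holds the human-readable plan step
stats = time_query(conn, sql)
print({name: round(value, 3) for name, value in stats.items()})
```

Timings from an in-memory database on a development machine are only indicative; the query plan output is the more portable signal that an index is being used.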

Run tests:

python -m pytest testing/backend/integration/test_database_indexes.py -v

Run benchmark:

python scripts/benchmark_db.py

Checklist

My code follows the code style of this project.
I have performed a self-review of my own code.
I have commented my code, particularly in hard-to-understand areas.
I have made corresponding changes to the documentation.
My changes generate no new warnings.

- Add composite index idx_tasks_status_created(status, created_at DESC) for dashboard running tasks query; eliminates full scan + filter
- Add findings indexes: severity, task_id, discovered_at DESC, plugin_id, target, and composite (task_id, severity) for grouped severity counts
- Add reports indexes: task_id, generated_at DESC, status
- Add audit_log indexes: timestamp DESC, event_type, task_id
- Refactor dashboard query: replace SELECT * FROM findings full table load + Python-side severity counting with a single DB-level GROUP BY query, so the dashboard receives one row per severity instead of every finding
- Fetch only 5 most recent findings for dashboard instead of entire table
- Add migration script 001_add_performance_indexes.sql for existing DBs
- Add integration tests: verify all 10 new indexes exist post-migration, verify dashboard severity counts correct on seeded dataset, verify recent_findings limited to 5 regardless of total count
- Add benchmark script scripts/benchmark_db.py: seeds 10k findings + 1k tasks, prints EXPLAIN QUERY PLAN and timed results for hot paths

KaparthyReddy added 3 commits May 23, 2026 22:01

fix: remove redundant fetchone for last_scan_time to restore cache test compatibility

47816c6

KaparthyReddy wants to merge 3 commits into utksh1:main from KaparthyReddy:perf/database-indexes-and-query-optimization
