Profiler: support an optional reference population baseline (--reference) #56

Description

@yakew7

Summary

faircode/SPEC.md §2 mentions an optional reference distribution, but it is not implemented. Add a --reference baseline.csv flag (CLI) and an optional reference upload (web) so a dataset can be scored against a real population baseline (e.g. US Census age×sex), not just internal balance.

Why

"Balanced internally" ≠ "representative of the target population." A reference baseline catches under-sampling relative to who the model will actually serve.
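To make the distinction concrete, here is a small illustrative sketch. The group labels, sample sizes, and reference shares are invented for the example (they are not census figures), and the min/max ratio is only one assumed way to express internal balance, not necessarily the profiler's metric.

```python
from collections import Counter

# Hypothetical sample: a dataset that is perfectly balanced internally.
sample = ["A"] * 50 + ["B"] * 50

# Hypothetical reference shares for the target population (assumed values).
reference = {"A": 0.80, "B": 0.20}

counts = Counter(sample)
total = sum(counts.values())
actual = {group: counts[group] / total for group in reference}

# Internal balance: smallest share divided by largest share (1.0 = perfectly balanced).
balance_ratio = min(actual.values()) / max(actual.values())

# Deviation from the reference: actual share minus expected share, per group.
delta = {group: round(actual[group] - reference[group], 4) for group in reference}

print(balance_ratio)  # 1.0
print(delta)          # {'A': -0.3, 'B': 0.3}
```

The dataset scores as perfectly balanced, yet group A is under-sampled by 30 percentage points relative to the assumed population.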

Tasks

  • Define reference format in SPEC.md (column, expected share)
  • Implement in faircode/profiler.py and mirror in assets/profiler-engine.js
  • Surface deviation as flags + a per-group expected-vs-actual delta
  • Add tests in tests/test_profiler.py; keep CLI and web output bit-for-bit identical
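One possible shape for the tasks above, as a rough sketch rather than a proposed implementation. The CSV column names (`group`, `expected_share`), the function names `load_reference` and `compare_to_reference`, the `tolerance` default, and the flag values are all hypothetical; the actual format is still to be defined in SPEC.md.

```python
import csv
import io
from collections import Counter


def load_reference(text):
    """Parse reference CSV text into {group: expected_share}.

    Assumes hypothetical columns 'group' and 'expected_share'.
    """
    rows = csv.DictReader(io.StringIO(text))
    shares = {row["group"]: float(row["expected_share"]) for row in rows}
    if abs(sum(shares.values()) - 1.0) > 1e-6:
        raise ValueError("expected shares must sum to 1")
    return shares


def compare_to_reference(values, reference, tolerance=0.05):
    """Return one report row per reference group with expected, actual, delta, flag.

    Groups present in `values` but absent from `reference` are ignored in this sketch.
    """
    counts = Counter(values)
    total = len(values)
    report = []
    for group in sorted(reference):  # sorted for deterministic output order
        expected = reference[group]
        actual = counts.get(group, 0) / total if total else 0.0
        delta = round(actual - expected, 6)  # rounding is an assumption, to stabilize output
        if delta < -tolerance:
            flag = "under"
        elif delta > tolerance:
            flag = "over"
        else:
            flag = None
        report.append({
            "group": group,
            "expected": expected,
            "actual": round(actual, 6),
            "delta": delta,
            "flag": flag,
        })
    return report


# Usage with made-up data: a 30/70 sample scored against a 50/50 reference.
reference = load_reference("group,expected_share\nF,0.5\nM,0.5\n")
report = compare_to_reference(["F"] * 30 + ["M"] * 70, reference)
for row in report:
    print(row)
```

Returning plain dictionaries in a sorted order is meant to make the result easy to serialize and to compare against whatever assets/profiler-engine.js produces, though matching the two outputs exactly would still need a shared rounding rule.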

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request), help wanted (Extra attention is needed)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
