Summary
faircode/SPEC.md §2 mentions an optional reference distribution, but it is not implemented. Add a --reference baseline.csv flag (CLI) and an optional reference upload (web) so a dataset can be scored against a real population baseline (e.g. US Census age×sex), not just internal balance.
Why
"Balanced internally" ≠ "representative of the target population." A reference baseline catches under-sampling relative to who the model will actually serve.
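A minimal sketch of the distinction, not the profiler's actual scoring: the function name `reference_gaps`, the group labels, and the reference shares below are all made up for illustration, and SPEC.md may define the comparison differently. It computes observed share minus expected share per group, so a negative value marks under-sampling relative to the baseline.

```python
from collections import Counter


def reference_gaps(values, reference):
    """Return observed share minus expected share for every group.

    `values` is one dataset column; `reference` maps group -> expected share.
    Both names are hypothetical, not part of faircode's API.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot score an empty column")
    # Include groups that appear only in the reference: those are the
    # ones internal balance alone can never flag.
    groups = set(counts) | set(reference)
    return {
        g: counts.get(g, 0) / total - reference.get(g, 0.0)
        for g in sorted(groups)
    }


# Illustrative data only: a sample that is perfectly balanced internally...
sample = ["18-34"] * 50 + ["35-64"] * 50
# ...scored against an invented baseline that contains a third group.
reference = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

gaps = reference_gaps(sample, reference)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

Here the two groups present in the sample are split 50/50, yet the "65+" group is missing entirely, which only the comparison against the reference reveals.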
Tasks
- SPEC.md (column, expected share)
- faircode/profiler.py and mirror in assets/profiler-engine.js
- tests/test_profiler.py; keep CLI/web bit-for-bit identical