Merged
11 changes: 7 additions & 4 deletions 2.0/problems/erdos_unit_distance/evaluator.py
@@ -22,6 +22,7 @@
 DISTANCE_REL_TOL = 1e-10
 DISTANCE_ABS_TOL = 1e-10
 MIN_SEPARATION = 1e-3
+SCORE_POWER = 3.0
 
 
 def _protect_evaluator_source() -> None:
@@ -227,14 +228,16 @@ def evaluate(solution_path: str) -> tuple[float, float, str]:
     _validate_points(points)
     unit_pairs = _count_unit_distance_pairs(points)
 
-    if unit_pairs <= 0:
-        score = 0.0
+    if unit_pairs <= BASELINE_EDGES:
+        raw_score = 0.0
     else:
-        score = max(0.0, 100.0 * (unit_pairs - BASELINE_EDGES) / unit_pairs)
+        raw_score = 100.0 * (unit_pairs - BASELINE_EDGES) / unit_pairs
+    score = 100.0 * (raw_score / 100.0) ** SCORE_POWER
     score_unbounded = score
     message = (
         f"N={N_POINTS}; unit_pairs={unit_pairs}; unit_distance={UNIT_DISTANCE:.12g}; "
-        f"baseline={BASELINE_EDGES}; "
+        f"baseline={BASELINE_EDGES}; score_power={SCORE_POWER:.12g}; "
+        f"raw_score={raw_score:.6f}; "
         f"score={score:.6f}; score_unbounded={score_unbounded:.6f}"
     )
     return score, score_unbounded, message
15 changes: 12 additions & 3 deletions 2.0/problems/erdos_unit_distance/readme
@@ -63,11 +63,20 @@ baseline = N
 X = M
 ```
 
-If the point set is invalid, or if `X <= baseline`, the score is `0`. Otherwise:
+If the point set is invalid, or if `X <= baseline`, the score is `0`. Otherwise
+the raw score is:
 
 ```text
-score = 100 * (X - baseline) / X
+raw_score = 100 * (X - baseline) / X
 ```
 
+The reported score applies a cubic scale:
+
+```text
+score = 100 * (raw_score / 100)^3
+```
+
 This makes the simple `N`-pair baseline worth `0`, rewards every improvement
-above the baseline, and avoids saturating the benchmark at a finite target.
+above the baseline, and keeps high-scoring constructions from saturating the
+benchmark too quickly. The bounded and unbounded score fields both report this
+cubic-scaled score; evaluator messages also include `raw_score` for reference.
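
To see how the cubic scale changes reported values, here is a minimal standalone sketch of the two formulas from the readme. The helper name `scaled_score` and the baseline of 100 pairs are hypothetical and chosen only for illustration; the evaluator itself uses `BASELINE_EDGES` and computes the score inline in `evaluate`.

```python
SCORE_POWER = 3.0  # same value as the constant this change adds to evaluator.py


def scaled_score(unit_pairs: int, baseline_edges: int,
                 power: float = SCORE_POWER) -> tuple[float, float]:
    """Return (raw_score, score) using the readme formulas.

    Hypothetical helper for illustration; not part of evaluator.py.
    """
    if unit_pairs <= baseline_edges:
        # At or below the baseline, both the raw and scaled scores are 0.
        return 0.0, 0.0
    raw_score = 100.0 * (unit_pairs - baseline_edges) / unit_pairs
    score = 100.0 * (raw_score / 100.0) ** power
    return raw_score, score


if __name__ == "__main__":
    # Illustrative inputs only: an assumed baseline of 100 unit-distance pairs.
    for pairs in (100, 125, 200, 400):
        raw, score = scaled_score(pairs, 100)
        print(f"unit_pairs={pairs}: raw_score={raw:.4f}, score={score:.4f}")
```

Under these assumed inputs, doubling the baseline count gives a raw score of 50 but a reported score of 12.5, so constructions only slightly above the baseline score close to zero while larger gains are separated more widely.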