
Add plies-until-progress moves-left head #241

Open
mooskagh wants to merge 3 commits into LeelaChessZero:master from mooskagh:progress-head


@mooskagh

Member

Exposes the per-frame plies_until_progress signal (computed by FillPliesUntilProgress) to training as a second moves-left-style scalar head, and tidies up moves-left target selection.

Data

  • Extend the values tensor [6,3] → [6,4] ([q, d, m, p] per value type). Component 3 (p) carries plies_until_progress on the RESULT row only (NaN elsewhere). The 0xff "unknown" sentinel (truncated games / positions after the last progress move) maps to NaN.
  • Rename ply_until_progress → plies_until_progress (and FillPlyUntilProgress → FillPliesUntilProgress) for consistency.
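The layout described above can be sketched in plain Python. This is an illustration only, not the C++ tensor generator from the PR: the helper names `progress_target` and `add_progress_column` are hypothetical, and the index of the RESULT row is an assumption, since the description does not state it.

```python
import math

UNKNOWN = 0xFF        # "unknown" sentinel byte named in the description
NUM_VALUE_TYPES = 6   # rows of the values tensor
RESULT_ROW = 0        # assumption: the PR does not say which row is RESULT


def progress_target(raw: int) -> float:
    """Map a raw plies_until_progress byte to a float target (NaN if unknown)."""
    if not 0 <= raw <= 0xFF:
        raise ValueError(f"expected a byte, got {raw}")
    return math.nan if raw == UNKNOWN else float(raw)


def add_progress_column(values, raw):
    """Extend [6, 3] rows of [q, d, m] to [6, 4] rows of [q, d, m, p].

    Only the RESULT row receives the progress target; every other row gets NaN.
    """
    if len(values) != NUM_VALUE_TYPES or any(len(row) != 3 for row in values):
        raise ValueError("expected a [6, 3] nested list")
    p = progress_target(raw)
    return [
        [*row, p if i == RESULT_ROW else math.nan]
        for i, row in enumerate(values)
    ]
```

Storing NaN rather than a numeric placeholder keeps "no target" distinguishable from any real ply count, which is what the masking in the losses relies on.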

Loss config

  • MovesLeftLossConfig gains a Component enum (Q=0, D=1, MOVES_LEFT=2, PLIES_UNTIL_PROGRESS=3) selecting the target column, plus required scale (a multiplier, e.g. 0.05) and huber_delta (threshold in plies). Q/D are rejected for moves-left losses; no silent defaults.
  • The progress head needs no new model code: MovesLeftHead is reused via config.
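The config rules above (explicit target column, Q/D rejected, no defaults for `scale` and `huber_delta`) can be mirrored in a small Python sketch. `MovesLeftLossSettings` and `select_target` are hypothetical stand-ins, not the actual `MovesLeftLossConfig` protobuf or the loss code in this PR; only the enum values and field names come from the description.

```python
from dataclasses import dataclass
from enum import IntEnum


class Component(IntEnum):
    """Column index into a [q, d, m, p] values row, as listed in the PR."""
    Q = 0
    D = 1
    MOVES_LEFT = 2
    PLIES_UNTIL_PROGRESS = 3


@dataclass(frozen=True)
class MovesLeftLossSettings:
    """Hypothetical stand-in for the config; every field is required."""
    component: Component
    scale: float        # multiplier, e.g. 0.05
    huber_delta: float  # threshold in plies

    def __post_init__(self):
        # Q and D are value components, not ply counts, so reject them.
        if self.component in (Component.Q, Component.D):
            raise ValueError(
                f"{self.component.name} is not a valid moves-left target"
            )


def select_target(values_row, settings: MovesLeftLossSettings) -> float:
    """Pick the configured target column from one [q, d, m, p] row."""
    return values_row[settings.component]
```

Because the dataclass fields have no defaults, omitting `scale` or `huber_delta` raises a `TypeError` at construction time, which is the sketch's analogue of "no silent defaults".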

NaN masking

  • Data losses (MovesLeft/Value/ValueError/ValueCategorical) now zero their contribution per-sample when the target is NaN. Targets are sanitized before the loss so gradients stay finite, then masked. Also fixes latent NaN poisoning from ORIG targets.
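The ordering "sanitize, then mask" matters because multiplying a NaN loss by a zero mask still yields NaN. A minimal pure-Python sketch of the idea follows, using a Huber loss with its analytic gradient in place of a real autodiff framework. The function names are hypothetical, the replacement value 0.0 is an assumption, and how the PR reduces per-sample losses (and where `scale` enters) is not shown here.

```python
import math


def huber(err: float, delta: float) -> float:
    """Huber loss: quadratic within delta, linear beyond it."""
    a = abs(err)
    return 0.5 * err * err if a <= delta else delta * (a - 0.5 * delta)


def huber_grad(err: float, delta: float) -> float:
    """Derivative of the Huber loss with respect to the prediction."""
    return max(-delta, min(delta, err))


def masked_huber(preds, targets, delta):
    """Per-sample losses and gradients, with NaN targets contributing zero."""
    losses, grads = [], []
    for pred, target in zip(preds, targets):
        valid = not math.isnan(target)
        safe_target = target if valid else 0.0  # sanitize BEFORE the loss
        mask = 1.0 if valid else 0.0
        err = pred - safe_target
        losses.append(mask * huber(err, delta))     # then mask the loss
        grads.append(mask * huber_grad(err, delta))
    return losses, grads


# Masking alone is not enough: 0.0 * NaN is still NaN.
naive = 0.0 * huber(5.0 - math.nan, 10.0)
```

With `preds = [3.0, 5.0, 40.0]`, `targets = [1.0, nan, 10.0]` and `delta = 10.0`, the second sample contributes zero loss and zero gradient while the other two are unaffected, whereas `naive` is NaN.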

Tests / docs

  • New model/test_loss_function.py (component selection, Q/D rejection, required scaling, NaN masking with finite grads).
  • C++ tensor_generator_test / chunk_rescorer_test updated for [6,4] + the NaN sentinel.
  • docs/example.textproto demonstrates a progress head + loss.

All checks green via just pre-commit (clang-format, ruff, mypy, 16/16 C++ tests, 29/29 pytest).

Note: not yet verified end-to-end in the TUI.

mooskagh added 3 commits May 23, 2026 18:44
Expose the per-frame plies_until_progress signal to training as a second
moves-left-style scalar head, and clean up moves-left target selection.

- Extend the values tensor from [6,3] to [6,4] ([q,d,m,p] per value type).
  Component 3 (p) holds plies_until_progress on the RESULT row only (NaN
  elsewhere); the 0xff "unknown" sentinel (truncated games / after the last
  progress move) maps to NaN.
- MovesLeftLossConfig gains a Component enum (Q=0, D=1, MOVES_LEFT=2,
  PLIES_UNTIL_PROGRESS=3) selecting the target column, plus required scale
  (multiplier) and huber_delta (plies) fields. Q/D are rejected for
  moves-left losses.
- Mask NaN targets in the data losses (MovesLeft/Value/ValueError/
  ValueCategorical): sanitize before computing so gradients stay finite, then
  zero the loss. Fixes latent NaN poisoning from ORIG targets too.
- Rename ply_until_progress -> plies_until_progress (and FillPlyUntilProgress ->
  FillPliesUntilProgress) for consistency.
- Add loss tests; demonstrate the progress head in docs/example.textproto.