Add plies-until-progress moves-left head #241
Open
mooskagh wants to merge 3 commits into
Expose the per-frame plies_until_progress signal to training as a second moves-left-style scalar head, and clean up moves-left target selection.

- Extend the values tensor from [6,3] to [6,4] ([q,d,m,p] per value type). Component 3 (p) holds plies_until_progress on the RESULT row only (NaN elsewhere); the 0xff "unknown" sentinel (truncated games / after the last progress move) maps to NaN.
- MovesLeftLossConfig gains a Component enum (Q=0, D=1, MOVES_LEFT=2, PLIES_UNTIL_PROGRESS=3) selecting the target column, plus required scale (multiplier) and huber_delta (plies) fields. Q/D are rejected for moves-left losses.
- Mask NaN targets in the data losses (MovesLeft/Value/ValueError/ValueCategorical): sanitize before computing so gradients stay finite, then zero the loss. Fixes latent NaN poisoning from ORIG targets too.
- Rename ply_until_progress -> plies_until_progress (and FillPliesUntilProgress) for consistency.
- Add loss tests; demonstrate the progress head in docs/example.textproto.
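The widened values layout described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the PR's actual data pipeline: `plies_to_target`, `widen_values`, and the `RESULT_ROW = 0` index are hypothetical names and values chosen for the example; only the `[6,4]` shape, the `[q, d, m, p]` column order, and the `0xff` sentinel come from the description.

```python
import math

NAN = float("nan")
UNKNOWN_SENTINEL = 0xFF  # "unknown" plies_until_progress, per the PR text
NUM_VALUE_TYPES = 6      # rows of the values tensor
RESULT_ROW = 0           # ASSUMPTION: the real index of the RESULT row is not stated
Q, D, M, P = 0, 1, 2, 3  # column order [q, d, m, p]


def plies_to_target(raw: int) -> float:
    """Map a raw byte to a training target; the sentinel becomes NaN."""
    return NAN if raw == UNKNOWN_SENTINEL else float(raw)


def widen_values(values_6x3, raw_plies: int):
    """Turn a [6,3] table into [6,4], filling p only on the RESULT row."""
    if len(values_6x3) != NUM_VALUE_TYPES:
        raise ValueError("expected one row per value type")
    # Every row gets a fourth column that starts out as NaN.
    widened = [list(row) + [NAN] for row in values_6x3]
    # Only the RESULT row carries a real plies_until_progress target.
    widened[RESULT_ROW][P] = plies_to_target(raw_plies)
    return widened
```

A loss that reads column `P` from any non-RESULT row, or from a position with the `0xff` sentinel, therefore sees NaN and has to mask it.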
Exposes the per-frame `plies_until_progress` signal (computed by `FillPliesUntilProgress`) to training as a second moves-left-style scalar head, and tidies up moves-left target selection.

Data
- Values tensor extended `[6,3] → [6,4]` (`[q, d, m, p]` per value type). Component 3 (`p`) carries `plies_until_progress` on the RESULT row only (NaN elsewhere). The `0xff` "unknown" sentinel (truncated games / positions after the last progress move) maps to NaN.
- Renamed `ply_until_progress → plies_until_progress` (and `FillPlyUntilProgress → FillPliesUntilProgress`) for consistency.

Loss config
- `MovesLeftLossConfig` gains a `Component` enum (`Q=0, D=1, MOVES_LEFT=2, PLIES_UNTIL_PROGRESS=3`) selecting the target column, plus required `scale` (a multiplier, e.g. `0.05`) and `huber_delta` (threshold in plies).
- `Q`/`D` are rejected for moves-left losses; no silent defaults.
- `MovesLeftHead` is reused via config.

NaN masking
- The data losses (`MovesLeft`/`Value`/`ValueError`/`ValueCategorical`) now zero their contribution per-sample when the target is NaN. Targets are sanitized before the loss so gradients stay finite, then masked.
- Also fixes latent NaN poisoning from `ORIG` targets.

Tests / docs
- New `model/test_loss_function.py` (component selection, Q/D rejection, required scaling, NaN masking with finite grads).
- `tensor_generator_test` / `chunk_rescorer_test` updated for `[6,4]` + the NaN sentinel.
- `docs/example.textproto` demonstrates a `progress` head + loss.

All checks green via `just pre-commit` (clang-format, ruff, mypy, 16/16 C++ tests, 29/29 pytest).
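To make the loss-config and NaN-masking behaviour concrete, here is a minimal pure-Python sketch. It is not the code in this PR: `MovesLeftLossSketch`, `huber`, and `masked_loss` are hypothetical names, and the way `scale` is applied (multiplying both the target and `huber_delta`) is an assumption, since the description only says `scale` is a multiplier and `huber_delta` is in plies. The sketch shows why the target is sanitized before the loss is computed: multiplying a NaN loss by a zero mask still yields NaN.

```python
import math
from dataclasses import dataclass
from enum import IntEnum


class Component(IntEnum):
    Q = 0
    D = 1
    MOVES_LEFT = 2
    PLIES_UNTIL_PROGRESS = 3


@dataclass(frozen=True)
class MovesLeftLossSketch:  # hypothetical stand-in for MovesLeftLossConfig
    component: Component
    scale: float        # required: no default value
    huber_delta: float  # required: threshold in plies

    def __post_init__(self) -> None:
        if self.component in (Component.Q, Component.D):
            raise ValueError("Q/D are not valid moves-left targets")


def huber(err: float, delta: float) -> float:
    """Standard Huber loss: quadratic near zero, linear beyond delta."""
    a = abs(err)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


def masked_loss(cfg: MovesLeftLossSketch, prediction: float, target_row) -> float:
    """Per-sample loss that contributes exactly 0.0 when the target is NaN."""
    target = target_row[cfg.component]
    valid = not math.isnan(target)
    # Sanitize first so no NaN ever enters the loss computation.
    safe_target = target if valid else 0.0
    # ASSUMPTION: the head predicts in scaled units, so target and delta are scaled.
    loss = huber(prediction - safe_target * cfg.scale, cfg.huber_delta * cfg.scale)
    return loss if valid else 0.0


# Masking after the fact does not help: NaN * 0.0 is still NaN.
naive = huber(1.2 - float("nan"), 0.5) * 0.0
```

In an autodiff framework the same ordering matters for gradients as well as values, which is the "finite grads" property the new tests are described as checking.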