Record: Order-12 N-gram Backoff + 256K Chunks — 0.2834 BPB by quietsmile · Pull Request #843 · openai/parameter-golf

quietsmile · 2026-03-26T12:03:37Z

Summary

Extended eval-time n-gram backoff from order 9 to order 12 with 6 additional hash primes
Reduced chunk size from 1M to 256K tokens so the n-gram cache refreshes 4x as often during eval
Increased alpha_max from 0.60 to 0.70 for stronger n-gram mixing at high entropy
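The three changes above can be pictured with a minimal sketch, under loud assumptions: the PR does not show its hash function, its primes, its table size, or the exact entropy schedule, so `PRIMES`, `TABLE_SIZE`, `context_hash`, `BackoffCache` and the linear ramp in `mix` are all hypothetical stand-ins. Only the order limit of 12 and alpha_max of 0.70 come from the summary.

```python
from collections import defaultdict

MAX_ORDER = 12        # from the PR: backoff extended from order 9 to order 12
ALPHA_MAX = 0.70      # from the PR: raised from 0.60
# Hypothetical multipliers, one per context position; the PR's primes are not shown.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
TABLE_SIZE = 1 << 20  # hypothetical number of hash buckets


def context_hash(context):
    """Map a tuple of token ids to a bucket (illustrative scheme only)."""
    h = 0
    for i, tok in enumerate(context):
        h = (h + (tok + 1) * PRIMES[i]) % TABLE_SIZE
    return h


class BackoffCache:
    def __init__(self):
        # counts[(order, bucket)][next_token] -> number of occurrences
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, tokens):
        """Add every n-gram of order 2..MAX_ORDER found in tokens."""
        for pos in range(1, len(tokens)):
            for order in range(2, MAX_ORDER + 1):
                if pos - (order - 1) < 0:
                    break  # not enough history for this order
                ctx = tuple(tokens[pos - order + 1:pos])
                self.counts[(order, context_hash(ctx))][tokens[pos]] += 1

    def prob(self, history, token):
        """Try the longest context first and back off to shorter ones.

        Returns (probability, order), or (None, 0) if no context matches.
        """
        for order in range(min(MAX_ORDER, len(history) + 1), 1, -1):
            ctx = tuple(history[-(order - 1):])
            bucket = self.counts.get((order, context_hash(ctx)))
            if bucket:
                return bucket.get(token, 0) / sum(bucket.values()), order
        return None, 0


def mix(p_model, p_ngram, entropy, max_entropy):
    """Blend model and n-gram probabilities.

    Assumption: alpha ramps linearly with model entropy up to ALPHA_MAX.
    """
    if p_ngram is None:
        return p_model
    alpha = ALPHA_MAX * min(1.0, entropy / max_entropy)
    return (1.0 - alpha) * p_model + alpha * p_ngram


cache = BackoffCache()
cache.update([1, 2, 3, 1, 2, 3, 1, 2])
p_ngram, matched_order = cache.prob([9, 9, 1, 2], 3)  # long contexts miss, order 3 hits
p_final = mix(0.2, p_ngram, entropy=5.0, max_entropy=5.0)
```

In this toy run the order-5 and order-4 contexts are unseen, so the lookup backs off to the order-3 context (1, 2). Raising ALPHA_MAX only changes how much weight that cached estimate receives when the model is uncertain.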

val_bpb: 0.2834 (2-seed mean, std 0.0001) | ~13.4 MB artifact | 525s training + 431s eval

Seed	Pre-Quant BPB	N-gram BPB
1337	1.1454	0.2835
42	1.1454	0.2833

Improvement over PR #809 (0.2952 BPB): -0.0118 BPB
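The headline numbers can be rechecked from the table with a few lines of arithmetic. The values below are copied from this PR. Using the population standard deviation is an assumption, since the PR does not say which estimator produced the quoted std; the sample estimator over two seeds would give roughly 0.00014 instead.

```python
from statistics import mean, pstdev

# N-gram BPB per seed, copied from the table in this PR.
seed_bpb = {1337: 0.2835, 42: 0.2833}
baseline_bpb = 0.2952  # PR #809, as quoted in this PR

values = list(seed_bpb.values())
mean_bpb = mean(values)
std_bpb = pstdev(values)  # assumption: population std over the two seeds
delta_bpb = round(mean_bpb, 4) - baseline_bpb

print(f"mean  = {mean_bpb:.4f}")
print(f"std   = {std_bpb:.4f}")
print(f"delta = {delta_bpb:+.4f}")
```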

All changes are eval-time only. No training modifications. Score-first compliance maintained.

Test plan

2-seed validation on 8xL20Z (H100 equivalent)
Artifact size under 16MB (13.4MB)
Training under 600s (525s)
Eval under 600s (437s total)
Score-first compliance verified

🤖 Generated with Claude Code

Key innovation: reduce NGRAM_EVAL_CHUNK_TOKENS from 1M to 65K. The N-gram cache updates after each chunk, so smaller chunks mean more frequent cache refreshes and richer n-gram statistics.

Results (3-seed mean): 0.2873 BPB (std 0.0001).

Fully legal: no pre-eval TTT, score-first N-gram only.

11L 512d GQA 8/4, MLP 3.0x, XSA-4, LeakyReLU(0.9)², BigramHash(4096), GPTQ int5, LZMA. 600s train + 405s eval.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
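The chunking argument in this commit message can be illustrated with a toy loop. It assumes that "score-first" means each chunk is scored against a cache built only from earlier chunks and is added to the cache afterwards. `score_first_eval`, `make_toy_cache` and the 1-bit and 8-bit costs are hypothetical and are not taken from the submission's code.

```python
def score_first_eval(tokens, chunk_tokens, score_fn, update_fn):
    """Score each chunk before the cache sees it, then update the cache."""
    total_bits = 0.0
    refreshes = 0
    for start in range(0, len(tokens), chunk_tokens):
        chunk = tokens[start:start + chunk_tokens]
        total_bits += score_fn(chunk)  # cache holds earlier chunks only
        update_fn(chunk)               # visible to later chunks, never to itself
        refreshes += 1
    return total_bits, refreshes


def make_toy_cache():
    """Toy cache: a set of seen tokens with made-up per-token costs."""
    seen = set()

    def score_fn(chunk):
        # Hypothetical cost: 1 bit if the token is already cached, else 8 bits.
        return sum(1.0 if tok in seen else 8.0 for tok in chunk)

    def update_fn(chunk):
        seen.update(chunk)

    return score_fn, update_fn


tokens = [7] * 8
big_chunks = score_first_eval(tokens, 8, *make_toy_cache())
small_chunks = score_first_eval(tokens, 2, *make_toy_cache())
print(big_chunks, small_chunks)
```

With one large chunk, every token is scored against an empty cache. With chunks a quarter of that size, only the first chunk pays the cold-cache cost, which is the effect the commit message attributes to shrinking NGRAM_EVAL_CHUNK_TOKENS.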

Extended eval-time n-gram backoff from order 9 to order 12, reduced chunk size from 1M to 256K tokens for faster cache refresh, and increased alpha_max from 0.60 to 0.70.

Two-seed validation: 0.2835 (seed=1337), 0.2833 (seed=42). Improvement over PR openai#809 baseline: -0.0118 BPB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

quietsmile and others added 2 commits March 26, 2026 11:02

notapplica mentioned this pull request Mar 26, 2026

⛳ Parameter Golf Live AI Commentary ⛳ + Analysis / Ideas | every 10 minutes #140

Open

quietsmile mentioned this pull request Mar 26, 2026

Record: Two-Pass Order-12 N-gram Backoff + 256K Chunks — 0.1315 BPB #853

Open

6 tasks

quietsmile wants to merge 2 commits into openai:main from quietsmile:submission/order12-chunk256k-alpha070
