Pull requests: openai/parameter-golf
Progressive Depth Training — val_bpb 1.1980
#835
opened Mar 26, 2026 by
iverbovoy
4 tasks done
Record: 0.1663 BPB - N-gram-Aware Training + Frozen N-gram Oracle + Backoff TTT
#834
opened Mar 26, 2026 by
AnirudhRahul
6 tasks done
Non-record: Byte-level transformer + JEPA auxiliary loss (val_bpb: 1.1903)
#832
opened Mar 26, 2026 by
jfprincz
Research: Why Novel Architectures Fail at 16MB — Throughput-Quantization Co-optimization
#831
opened Mar 26, 2026 by
sseanliu
3 tasks done
Non-record: LeakyMixer: 11L leaky_relu(0.5)^2 + backoff n-gram mixer
#830
opened Mar 26, 2026 by
zlxi02
Record: 0.9076 BPB — 10L + N-gram Backoff + Matrix LR 0.03
#828
opened Mar 26, 2026 by
bigbag
4 of 5 tasks
Record: LeakyReLU² + XSA4 + LN Scale + Partial RoPE — val_bpb 1.3999
#827
opened Mar 26, 2026 by
Programmerryoki
Record: Order-Adaptive BackoffMixer (mean val_bpb=0.5440)
#825
opened Mar 26, 2026 by
hypery11
4 tasks done
GatedAttn + ValueResid + XSA6 + HedgeMixer + Legal TTT — val_bpb: 1.08965 (3-seed mean)
#824
opened Mar 26, 2026 by
sahiee-dev
Add non-record 16MB submission: MOEA outer-loop proxy F2 on 1xRTX 3090
#823
opened Mar 26, 2026 by
ai-wes
Add baseline and depth recurrence submissions (1xH100 20min runs)
#822
opened Mar 26, 2026 by
henrycashe26
[non-record] Masked Diffusion Language Model (val_var_bpb=1.625)
#820
opened Mar 26, 2026 by
mtybadger
Add PES RosehipV1: Precision Error Signal + frontier stack
#819
opened Mar 26, 2026 by
Tetrahedroned
Record submission: Poly5 Softcap + BigramHash(3072) + Wider GPTQ-lite…
#816
opened Mar 26, 2026 by
jimliu741523
Record: X-WING 3D Cubric + Complementary Training (val_bpb=0.4820)
#814
opened Mar 26, 2026 by
newjordan
6 tasks done
Record: BackoffNgramMixer (mean val_bpb=0.6671)
#813
opened Mar 26, 2026 by
hypery11
4 tasks done
[non-record track] BankLinear: cross-layer shared weight bank with learned + random mixtures
#812
opened Mar 26, 2026 by
andrewmouldon
Record: Per-Order Adaptive Alpha + N-gram Backoff (val_bpb=0.2995, 3-seed)
#810
opened Mar 26, 2026 by
Idan3011
Record: Chunk-Based N-gram Backoff + Score-First TTT (0.295 BPB)
#809
opened Mar 26, 2026 by
AayushBaniya2006
Record: 0.6364 BPB - Depth Recurrence + Multi-Order N-gram Backoff
#808
opened Mar 26, 2026 by
Naazimsnh02
Non-record: Sequential Momentum TTT (val_bpb=1.0116, 3-seed mean, 4xA10G)
#807
opened Mar 26, 2026 by
connectwithprakash
3 tasks
Record: Backoff N-gram Cache + LeakyReLU(0.9)² (val_bpb=0.6678)
#806
opened Mar 26, 2026 by
ibarrajo
2 of 4 tasks
[Non-record] CAGE5 Colab T4 smoke: strictly causal 5-gram mixer
#804
opened Mar 26, 2026 by
Devchandrasen