feat: add MoLT (Mixture of Linear Transforms) model #21

Open
Ky-Ng wants to merge 1 commit into Goreg12345:master from Ky-Ng:feat/molt-on-master

Ky-Ng (Collaborator) commented Apr 22, 2026

Port the MoLT model from the original pre-reorg branch onto the current crosslayer_transcoder/ package layout.

  • New crosslayer_transcoder/model/molt.py containing the Molt nn.Module (paired U/V factors across a configurable list of ranks, a shared gate produced by an arbitrary nonlinearity, and a transform_norm() helper for weighted sparsity).
  • New MoltModule in crosslayer_transcoder/model/clt_lightning.py built on top of CrossLayerTranscoderModule: single-layer training loop, sparsity penalty on gate * ||U@V||, and forward override.
  • CrossLayerTranscoderModule.__init__ accepts Union[CrossLayerTranscoder, Molt] and sizes last_active from model.n_features when the model is a Molt.
  • JumpReLU supports n_layers=1 by dropping the layer dimension from theta (shape becomes (1, d_features)). n_layers > 1 is unchanged.
  • config/molt.yaml with class_paths wired to the current package.
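
For readers who have not seen the original branch, here is a rough sketch of the structure the first two bullets describe. It is not the code in molt.py: the class name MoltSketch, the gate_proj layer, the ReLU stand-in for the nonlinearity, the "one transform per entry in ranks" reading, and the sparsity_penalty helper are all assumptions made for illustration. Only the paired U/V factors, the shared gate, transform_norm(), and the gate * ||U@V|| penalty come from the description above.

```python
import torch
import torch.nn as nn


class MoltSketch(nn.Module):
    """Illustrative mixture of low-rank linear transforms (hypothetical, not the PR's Molt)."""

    def __init__(self, d_model: int, ranks: list[int]):
        super().__init__()
        # Assumption: each entry in `ranks` defines one transform of that rank.
        self.n_transforms = len(ranks)
        # Paired factors: U_i has shape (d_model, r_i), V_i has shape (r_i, d_model).
        self.U = nn.ParameterList(
            [nn.Parameter(torch.randn(d_model, r) * 0.02) for r in ranks]
        )
        self.V = nn.ParameterList(
            [nn.Parameter(torch.randn(r, d_model) * 0.02) for r in ranks]
        )
        # Shared gate: one pre-activation per transform (hypothetical layer name).
        self.gate_proj = nn.Linear(d_model, self.n_transforms)
        # Stand-in nonlinearity; the PR allows an arbitrary one (e.g. JumpReLU).
        self.nonlinearity = nn.ReLU()

    def transform_norm(self) -> torch.Tensor:
        # Frobenius norm of each full transform U_i @ V_i, shape (n_transforms,).
        return torch.stack(
            [torch.linalg.matrix_norm(u @ v) for u, v in zip(self.U, self.V)]
        )

    def forward(self, x: torch.Tensor):
        # x: (batch, d_model); gate: (batch, n_transforms)
        gate = self.nonlinearity(self.gate_proj(x))
        out = torch.zeros_like(x)
        for i, (u, v) in enumerate(zip(self.U, self.V)):
            # Apply (U_i @ V_i) to each row of x without forming the dense matrix.
            out = out + gate[:, i : i + 1] * (x @ v.T @ u.T)
        return out, gate


def sparsity_penalty(gate: torch.Tensor, norms: torch.Tensor) -> torch.Tensor:
    # Weighted sparsity: each gate activation is scaled by its transform's norm,
    # summed over transforms and averaged over the batch.
    return (gate * norms).sum(dim=-1).mean()
```

Weighting the gate by the transform norm means the penalty cannot be reduced simply by shrinking the gate and scaling U and V up to compensate, which is presumably why transform_norm() exists as a helper.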

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>