Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models. CVPR Oral 2025.
Updated Apr 4, 2025 - Python
Next-token prediction in JavaScript — build fast language and diffusion models.
Generative model for sequential recommendation based on Convolutional Neural Networks (CNNs)
[ICLR 2025] Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction
Curated collection of research on the limitations of next-token prediction and methods that go beyond it.
LLM pipeline: data → tokenizer → attention → GPT training/evaluation → instruction fine-tuning → sampling. Reproducible, with clean configs, RTX 4060 defaults, and ready for AMP/LoRA/DDP.
Training repo for Toy GPT: context-3 model + small structured corpus (001_animals.txt)
Training repo for Toy GPT: Context-2 model with embeddings + small domain corpus (030_analytics.txt). Much more efficient use of space.
Training repo for Toy GPT: unigram + small neutral corpus (000_cat_dog.txt)
Implementation of a simplified GPT model trained on dialogue from the Friends TV series to generate dialogue-like text, using a transformer architecture with multi-head attention and feed-forward networks. Includes training scripts, model definition, and text-generation functionality.
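Several of the repos above center on the same core operation: causal self-attention in a GPT-style decoder. As an illustrative sketch only (not any listed repo's actual code; the weight matrices here are stand-ins), single-head scaled dot-product attention with a causal mask looks like this:

```python
import numpy as np

def causal_attention(x, wq, wk, wv):
    """Single-head causal self-attention over a (tokens, dim) input.

    x: (t, d) token embeddings; wq/wk/wv: (d, d) projection matrices.
    Illustrative sketch; real models add multiple heads, an output
    projection, and batching.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    t, d = q.shape
    scores = q @ k.T / np.sqrt(d)            # similarity of each query to each key
    mask = np.triu(np.ones((t, t), dtype=bool), k=1)
    scores[mask] = -np.inf                   # block attention to future tokens
    # numerically stable softmax over each row
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                       # weighted mix of value vectors
```

Because of the causal mask, the first token can attend only to itself, so its output is exactly its own value vector; later tokens mix values from all earlier positions.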
Training repo for Toy GPT: context-2 model + small domain corpus (010_llm_glossary.txt)
Training repo for Toy GPT: unigram model + small structured corpus (001_animals.txt)
N-gram language model for next-token prediction
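The n-gram entries above reduce next-token prediction to counting. A minimal sketch of the bigram (n = 2) case, assuming a greedy "most frequent follower" decoding rule (function and variable names here are illustrative, not from any listed repo):

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    """Greedily return the most frequent follower, or None if unseen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

tokens = "the cat sat on the mat the cat ran".split()
model = train_bigram(tokens)
print(predict_next(model, "the"))  # "cat" follows "the" most often here
```

Higher-order context-2 and context-3 models generalize this by keying the counts on the last two or three tokens instead of one.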
Training repo for Toy GPT: bigram model + small domain corpus (010_llm_glossary.txt)
Training repo for Toy GPT: Context-3 model + small neutral corpus (000_cat_dog.txt)
Interactive visualization of next-word (token) prediction in GPT-style language models.
Training repo for Toy GPT: Context-3 model with attention + small domain corpus (030_analytics.txt). Attention requires scale.
Training repo for Toy GPT: bigram model + small structured corpus (001_animals.txt)
Training repo for Toy GPT: context-2 model + small structured corpus (001_animals.txt)
Training repo for Toy GPT: Context-2 model with attention + small domain corpus (030_analytics.txt). Attention requires scale.