roulbac/titanpg

titanpg

titanpg (TorchTitan Playground) is a learning project for understanding how to build and run training workloads with TorchTitan.

Current examples in this repo:

  • nanogpt (GPT-style language modeling)
  • dinov3 (self-supervised vision pre-training)

Why this repo exists

The goal is practical learning, not a polished framework. This repo is where we implement, test, and compare a small set of TorchTitan-based training examples end to end.

Project structure

  • train.py: unified launcher for all examples
  • nanogpt/: NanoGPT model, data pipeline, and parallelization wiring
  • dinov3/: DINOv3 SSL model, data pipeline, trainer, and infra hooks
  • docs/: guided walkthroughs (environment setup, data, first runs, scaling, debugging)
  • tests/: behavior and integration tests for launcher, models, and distributed setup

Quick start

1) Install dependencies

Using uv:

uv sync

Or using pip in a virtualenv:

pip install -e .

2) Run NanoGPT smoke training (1 GPU)

torchrun --standalone --nnodes=1 --nproc-per-node=1 train.py \
  --model.name=nanogpt \
  --model.flavor=gpt2_small \
  --training.dataset=tinyshakespeare \
  --training.seq_len=128 \
  --training.local_batch_size=4 \
  --training.steps=20
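
To get a feel for how small this smoke run is, the token budget can be worked out from the flags above. The sketch below assumes pure data parallelism with no gradient accumulation; `tokens_per_step` is a hypothetical helper for illustration, not part of this repo or of TorchTitan.

```python
def tokens_per_step(local_batch_size: int, seq_len: int, world_size: int = 1) -> int:
    """Tokens consumed per optimizer step.

    Assumption: every rank processes `local_batch_size` sequences of
    `seq_len` tokens, and there is no gradient accumulation.
    """
    return local_batch_size * seq_len * world_size


# Values taken from the 1-GPU NanoGPT smoke command.
per_step = tokens_per_step(local_batch_size=4, seq_len=128, world_size=1)
total = per_step * 20  # --training.steps=20

print(per_step, total)  # 512 10240
```

Roughly ten thousand tokens is enough to confirm that the launcher, data pipeline, and optimizer step all run, but not enough to expect a meaningful change in loss.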

3) Run DINOv3 SSL smoke training (1 GPU)

torchrun --standalone --nnodes=1 --nproc-per-node=1 train.py \
  --model.name=dinov3 \
  --training.dataset_path='ImageFolder:root=/data/imagenet1k_hf:split=train' \
  --training.local_batch_size=8 \
  --training.steps=20
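
The `--training.dataset_path` value above packs a dataset kind and its options into one colon-separated string. The sketch below shows one way such a string could be split. `parse_dataset_spec` is a hypothetical function written for illustration, and the actual parsing in the dinov3 data pipeline may differ.

```python
def parse_dataset_spec(spec: str) -> tuple[str, dict[str, str]]:
    """Split 'Kind:key=value:key=value' into (kind, options).

    Hypothetical helper: it assumes option values never contain ':'.
    """
    kind, *pairs = spec.split(":")
    options: dict[str, str] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        options[key] = value
    return kind, options


kind, options = parse_dataset_spec("ImageFolder:root=/data/imagenet1k_hf:split=train")
print(kind, options)
```

Under this reading, the smoke command selects an `ImageFolder` dataset rooted at `/data/imagenet1k_hf` and uses its `train` split.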

Data helpers

Prepare FineWeb shards for NanoGPT:

python nanogpt/populate_fineweb.py /data/edu_fineweb10B

Prepare ImageNet-style layout for DINOv3:

python dinov3/populate_imagenet.py /data/imagenet1k_hf train 0.33

Documentation

Guided walkthroughs live in docs/ and cover environment setup, data, first runs, scaling, and debugging.

Notes

  • The launcher defaults to nanogpt when --model.name is omitted.
  • For --model.name=dinov3, the launcher injects --job.custom_config_module=dinov3.job_config unless that flag is already set.
  • This repository is intentionally narrow in scope: it currently focuses only on NanoGPT and DINOv3 SSL pre-training.
