titanpg

titanpg (TorchTitan Playground) is a learning project for understanding how to build and run training workloads with TorchTitan.

Current examples in this repo:

nanogpt (GPT-style language modeling)
dinov3 (self-supervised vision pre-training)

Why this repo exists

The goal is practical learning, not a polished framework. This repo is where we implement, test, and compare a small set of TorchTitan-based training examples end to end.

Project structure

train.py: unified launcher for all examples
nanogpt/: NanoGPT model, data pipeline, and parallelization wiring
dinov3/: DINOv3 SSL model, data pipeline, trainer, and infra hooks
docs/: guided walkthroughs (environment setup, data, first runs, scaling, debugging)
tests/: behavior and integration tests for launcher, models, and distributed setup

Quick start

1) Install dependencies

Using uv:

uv sync

Or using pip in a virtualenv:

pip install -e .

2) Run NanoGPT smoke training (1 GPU)

torchrun --standalone --nnodes=1 --nproc-per-node=1 train.py \
  --model.name=nanogpt \
  --model.flavor=gpt2_small \
  --training.dataset=tinyshakespeare \
  --training.seq_len=128 \
  --training.local_batch_size=4 \
  --training.steps=20
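
As a rough sanity check on how small this smoke run is, the flags above imply the token counts below. This is plain arithmetic on the values in the command. It assumes each step consumes exactly one local batch per GPU with no gradient accumulation, which this README does not state.

```python
# Values copied from the NanoGPT smoke command above.
seq_len = 128
local_batch_size = 4
steps = 20
num_gpus = 1  # --nproc-per-node=1

# Assumption: one local batch per GPU per step, no gradient accumulation.
tokens_per_step = seq_len * local_batch_size * num_gpus
total_tokens = tokens_per_step * steps

print(f"tokens per step: {tokens_per_step}")  # 512
print(f"total tokens:    {total_tokens}")     # 10240
```

At roughly ten thousand tokens in total, a run like this can show that the pipeline executes end to end, but it is far too short to say anything about model quality.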

3) Run DINOv3 SSL smoke training (1 GPU)

torchrun --standalone --nnodes=1 --nproc-per-node=1 train.py \
  --model.name=dinov3 \
  --training.dataset_path='ImageFolder:root=/data/imagenet1k_hf:split=train' \
  --training.local_batch_size=8 \
  --training.steps=20
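
The --training.dataset_path value above packs a dataset name and its options into one colon-separated string. The sketch below shows one way such a string could be split apart. The function name parse_dataset_path is hypothetical and is not taken from this repo or from DINOv3; the actual parsing code may differ.

```python
def parse_dataset_path(spec: str) -> tuple[str, dict[str, str]]:
    """Split 'Name:key=value:key=value' into (name, {key: value}).

    Hypothetical helper for illustration only.
    """
    name, *pairs = spec.split(":")
    options: dict[str, str] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        options[key] = value
    return name, options


name, options = parse_dataset_path(
    "ImageFolder:root=/data/imagenet1k_hf:split=train"
)
print(name)     # ImageFolder
print(options)  # {'root': '/data/imagenet1k_hf', 'split': 'train'}
```

Because the separator is a colon, this sketch would mis-split a root path that itself contains a colon. Quoting the whole value in the shell, as the command above does, keeps it intact as a single argument.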

Data helpers

Prepare FineWeb shards for NanoGPT:

python nanogpt/populate_fineweb.py /data/edu_fineweb10B

Prepare ImageNet-style layout for DINOv3:

python dinov3/populate_imagenet.py /data/imagenet1k_hf train 0.33

Documentation

Start with the guided walkthroughs in docs/.

Notes

The launcher defaults to nanogpt if --model.name is omitted.
For --model.name=dinov3, the launcher injects --job.custom_config_module=dinov3.job_config unless that flag is already set.
This repository is intentionally narrow in scope: it currently covers only NanoGPT language-model training and DINOv3 SSL pre-training.
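
The two launcher defaults in these notes can be sketched as a small argument-resolution step. This is an illustrative reconstruction, not the code in train.py: the function name resolve_launcher_args is hypothetical, and it only recognizes the --flag=value form used in the commands above.

```python
def resolve_launcher_args(argv: list[str]) -> tuple[str, list[str]]:
    """Return (model_name, argv) after applying the defaults from the notes.

    Hypothetical sketch; the real launcher may be structured differently.
    """
    model_name = "nanogpt"  # default when --model.name is omitted
    has_custom_config = False
    for arg in argv:
        if arg.startswith("--model.name="):
            model_name = arg.split("=", 1)[1]
        elif arg.startswith("--job.custom_config_module="):
            has_custom_config = True

    resolved = list(argv)  # copy so the caller's list is not mutated
    if model_name == "dinov3" and not has_custom_config:
        resolved.append("--job.custom_config_module=dinov3.job_config")
    return model_name, resolved


print(resolve_launcher_args(["--training.steps=20"]))
print(resolve_launcher_args(["--model.name=dinov3"]))
```

The first call falls back to nanogpt and leaves the arguments unchanged. The second appends the dinov3 config module because the caller did not supply one.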
