NewComputeBench is a project to develop a benchmark suite for emerging compute paradigms (spiking neural networks, optical computing, in-memory computing, etc.). The project is divided into three main components:
- Model Training
- Model Behavior-Level Simulation
- Hardware-Performance Simulation
We aim to support the following features:
- Pretraining
  - Model Architecture: `ARIA_LLM`
    - Transformer building blocks: `GPT`, `Block`, `CausalSelfAttention`, etc.
  - Datasets: `src/aria_models/data`
- Generation (inference)
  - Generation function: `LLM.generate(...)` (see the sketch after this list)
- 🚧 TODO: Supervised fine-tuning
- 🚧 TODO: LoRA fine-tuning
- 🚧 TODO: Evaluation
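To make the generation entry point concrete, here is a minimal usage sketch. Only the `LLM.generate(...)` call comes from the feature list above; the import path, the checkpoint-loading helper, and the sampling parameters are assumptions for illustration, not the project's confirmed API.

```python
# Minimal generation sketch (assumed API).
# Only LLM.generate(...) appears in the feature list above; the module path
# `aria_models.llm`, the `from_pretrained` loader, and the sampling arguments
# are hypothetical placeholders.
from aria_models.llm import LLM  # assumed import path


def sample_story(prompt: str, checkpoint: str = "out/aria-llm-135m") -> str:
    """Load a pretrained checkpoint and generate a continuation for `prompt`."""
    model = LLM.from_pretrained(checkpoint)  # hypothetical checkpoint loader
    output = model.generate(                 # generation entry point from the feature list
        prompt,
        max_new_tokens=128,                  # assumed sampling parameters
        temperature=0.8,
        top_k=50,
    )
    return output


if __name__ == "__main__":
    print(sample_story("Once upon a time"))
```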
ARIA-LLM-135M
- Model config: `ARIA_LLM["ARIA-LLM-135M"]` (a usage sketch follows this list)
- Pretraining data: TinyStories
- Pretraining scripts: `aria-llm.py` and the `justfile`

```sh
# the justfile wraps the following commands
# to preprocess the data and pretrain aria-llama-135m with 3B tokens
just aria-135m
```
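For orientation, the sketch below shows how the registered 135M config might be looked up and turned into a model instance before launching pretraining. Only the `ARIA_LLM["ARIA-LLM-135M"]` lookup and the `GPT` class name come from this README; the import paths and the `GPT(config)` constructor signature are illustrative assumptions.

```python
# Sketch of wiring the registered config into a model instance (assumed API).
# Only ARIA_LLM["ARIA-LLM-135M"] and the GPT class name appear in this README;
# the import paths and constructor signature are illustrative guesses.
from aria_models.config import ARIA_LLM  # assumed import path
from aria_models.model import GPT        # assumed import path

config = ARIA_LLM["ARIA-LLM-135M"]       # registered 135M config from the list above
model = GPT(config)                      # assumed constructor: build the transformer from its config

# Quick sanity check on model size before handing off to the pretraining script.
num_params = sum(p.numel() for p in model.parameters())
print(f"ARIA-LLM-135M parameter count: {num_params / 1e6:.1f}M")
```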
🚧 TODO: ARIA-LLM-1B (we aim to scale the ARIA-LLM-135M model to 1B parameters and pretrain it on 3T tokens)
- Model config
- Pretraining data: SlimPajama
- Pretraining scripts
- Supervised fine-tuning data: SmolTalk
- Supervised fine-tuning scripts
🚧 TODO: 8B (we aim to fine-tune an LLM of around 7B parameters using LoRA)
- LoRA fine-tuning data
- LoRA fine-tuning scripts