ksd3/jobber

Jobber Documentation

Jobber is a command-line helper that can:

  • Build Docker images (optionally from templates)
  • Push images to AWS ECR or GCP Artifact Registry
  • Submit training jobs to AWS SageMaker or GCP Vertex AI (config-driven defaults)
  • Sync local data to S3 or GCS and seed empty data prefixes

Contents

  • configuration.md: Config file format, merging, hyperparameters.
  • cli.md: Command-by-command reference with examples.
  • templates.md: Dockerfile templates (list/show/add/delete).
  • docker.md: Building images, .dockerignore, local smoke tests.
  • sagemaker.md: Channel paths, hyperparameters, outputs, ensure_data.
  • troubleshooting.md: Common issues and fixes.

Prerequisites

  • Docker installed (and NVIDIA Container Toolkit if using GPU images).
  • AWS CLI installed and configured (aws configure).
  • Python 3.10+, with Jobber installed via pip install -e . (or uv pip install -e .).
  • SageMaker execution role ARN with S3 List/Get/Put and ECR pull permissions.
  • For ECR push: IAM permissions to create the repository, log in, and push images.
  • (Optional but recommended) uv installed from https://github.com/astral-sh/uv
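The checklist above can be verified from a script before the first build. The following is a minimal sketch, not part of Jobber: the function name `check_prerequisites` and the tool lists are illustrative assumptions, and it only checks that the executables are on PATH, not that AWS credentials or IAM permissions are valid.

```python
import shutil
import sys

# Tools named in the prerequisites list; "uv" is optional there.
REQUIRED_TOOLS = ("docker", "aws")
OPTIONAL_TOOLS = ("uv",)


def check_prerequisites(which=shutil.which, version=sys.version_info):
    """Return a list of human-readable problems; an empty list means none found.

    `which` and `version` are injectable so the check can be tested
    without touching the real environment (hypothetical helper).
    """
    problems = []
    if tuple(version[:2]) < (3, 10):
        problems.append(f"Python 3.10+ required, found {version[0]}.{version[1]}")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"missing required tool: {tool}")
    for tool in OPTIONAL_TOOLS:
        if which(tool) is None:
            problems.append(f"optional tool not found: {tool}")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current interpreter and PATH; printing the returned list gives a quick report before running `jobber build`.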

How to install

Using uv (recommended):

uv pip install -e .

Using pip (inside your venv):

pip install -e .

Quick sanity check

jobber --help

Quickstart (no config)

# 1) Build (render a template)
jobber build --image my-training --template gpu-cu121

# 2) Push to ECR
jobber push --image my-training --repo my-training --region us-east-1

# 3) Submit a SageMaker job
jobber submit \
  --image-uri <acct>.dkr.ecr.us-east-1.amazonaws.com/my-training:latest \
  --role-arn arn:aws:iam::<acct>:role/<sagemakerrole> \
  --bucket your-bucket --prefix custom-run \
  --entry-point train.py --source-dir code-bundle \
  --param epochs=5 --param batch-size=64 \
  --tail-logs

# 3b) Submit a Vertex AI job (GCP)
jobber submit \
  --provider gcp \
  --project my-gcp-project \
  --region us-central1 \
  --image-uri us-central1-docker.pkg.dev/my-gcp-project/my-repo/my-training:latest \
  --gcs-bucket your-gcs-bucket --gcs-prefix custom-run \
  --entry-point train.py \
  --param epochs=5 --param batch-size=64 \
  --machine-type a2-highgpu-1g \
  --accelerator-type NVIDIA_TESLA_A100 --accelerator-count 1
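The commands above pass hyperparameters as repeated `--param key=value` flags and reference the image by a full ECR URI. As an illustration of how those two pieces fit together, here is a minimal Python sketch. The helper names `parse_params` and `ecr_image_uri` are hypothetical and are not taken from Jobber's source; keeping values as strings is an assumption about how a tool like this might forward them.

```python
def parse_params(pairs):
    """Turn ["epochs=5", "batch-size=64"] into a dict of string values.

    Values are left as strings (an assumption of this sketch); only the
    first "=" splits, so values may themselves contain "=".
    """
    params = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        params[key] = value
    return params


def ecr_image_uri(account_id, region, repo, tag="latest"):
    """Compose an ECR image URI in the form used by --image-uri above."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"


# Example values only; the account ID is a placeholder.
hyperparameters = parse_params(["epochs=5", "batch-size=64"])
image_uri = ecr_image_uri("123456789012", "us-east-1", "my-training")
```

Composing the URI from its parts avoids hand-editing the account and region in several places when the same image is pushed and then submitted.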

Quickstart (with config)

  1. Create a config (interactive prompts; choose aws or gcp):
    jobber init --path jobber.yml
  2. Edit jobber.yml (bucket/prefix/image-uri/params/etc.).
  3. Run:
    jobber build --config jobber.yml
    jobber push  --config jobber.yml
    jobber submit --config jobber.yml --tail-logs
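The three config-driven commands can also be chained from a script, for example in CI. The sketch below assumes only the CLI flags shown in the steps above (`--config`, `--tail-logs`); the function names `pipeline_commands` and `run_pipeline` are illustrative and do not come from Jobber itself.

```python
import subprocess


def pipeline_commands(config_path="jobber.yml", tail_logs=True):
    """Return the build, push, and submit commands as argv lists."""
    submit = ["jobber", "submit", "--config", config_path]
    if tail_logs:
        submit.append("--tail-logs")
    return [
        ["jobber", "build", "--config", config_path],
        ["jobber", "push", "--config", config_path],
        submit,
    ]


def run_pipeline(config_path="jobber.yml", tail_logs=True, runner=subprocess.run):
    """Run each step in order; check=True stops at the first failing step."""
    for cmd in pipeline_commands(config_path, tail_logs):
        runner(cmd, check=True)
```

Passing argv lists rather than a shell string avoids quoting issues with paths, and the injectable `runner` makes it possible to dry-run the sequence without invoking Docker or the cloud provider.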

Data upload

  • Auto-placeholder: enabled by default (ensure_data); seeds prefix/data/placeholder.txt when the data prefix is empty.
  • Upload real data:
    jobber sync-data --src ./mnist_data --dest s3://bucket/prefix/data --region us-east-1
    jobber sync-data --provider gcp --src ./mnist_data --dest gs://bucket/prefix/data --region us-central1
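The two destinations above differ only in URI scheme (s3:// or gs://). As a rough sketch of how a destination can be split into provider, bucket, and prefix, and where a placeholder object would sit under a run prefix, consider the following. The names `split_dest` and `placeholder_key` are hypothetical; Jobber's own handling may differ.

```python
from urllib.parse import urlparse

# Mapping from URI scheme to the provider names used by --provider.
PROVIDER_BY_SCHEME = {"s3": "aws", "gs": "gcp"}


def split_dest(dest):
    """Split "s3://bucket/prefix/data" into (provider, bucket, key_prefix)."""
    parsed = urlparse(dest)
    provider = PROVIDER_BY_SCHEME.get(parsed.scheme)
    if provider is None or not parsed.netloc:
        raise ValueError(f"expected s3://bucket/... or gs://bucket/..., got {dest!r}")
    return provider, parsed.netloc, parsed.path.lstrip("/")


def placeholder_key(prefix):
    """Object key of the placeholder seeded when the data prefix is empty."""
    return f"{prefix.strip('/')}/data/placeholder.txt"
```

For example, `split_dest("gs://bucket/prefix/data")` yields the provider `gcp`, the bucket `bucket`, and the key prefix `prefix/data`.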

About

Python package for programmatic job submission to cloud platforms
