Getting Started

This guide walks through the current git-drs workflow, using the cleaned-up command set.

What git-drs Does

git-drs manages:

  • Git-compatible pointer files
  • local DRS metadata
  • remote Syfon/Gen3 configuration
  • pointer hydration and object registration workflows

It no longer acts as a catch-all wrapper around Git, Git LFS, and DRS transport commands.

Cloning an Existing Repository

  1. Clone the repository:

    git clone <repo-clone-url>.git
    cd <name-of-repo>

  2. If you use SSH remotes, make sure your SSH setup is already working for that host.

    A typical keepalive configuration in ~/.ssh/config looks like:

    Host github.com
        TCPKeepAlive yes
        ServerAliveInterval 30

  3. Initialize git-drs in the repo:

    git drs init

  4. Hydrate tracked files if needed:

    git drs pull

This is the normal onboarding flow for an existing repo, assuming git-drs is already installed on the machine (see One-Time Machine Setup). git drs pull hydrates pointer files already present in the checkout; it does not replace git pull.

One-Time Machine Setup

Install git-drs and the global Git filter configuration:

git drs install

One-Time Repository Setup

After cloning or creating a repository:

git drs init

That sets up repository-local git-drs state and hooks.

Add a Gen3 Remote

The current command syntax is:

git drs remote add gen3 [remote-name] <organization/project> [--cred <file> | --token <token>]

Example:

git drs remote add gen3 production HTAN_INT/BForePC --cred /path/to/credentials.json

Notes:

  • scope is one positional argument: organization/project
  • users do not provide --bucket
  • users do not provide --url
  • bucket resolution is scope-based and server-backed
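
Because the scope is a single organization/project argument, a wrapper script may want to validate it before calling git drs remote add. The sketch below is one way to do that in plain shell. split_scope is a hypothetical helper, not part of git-drs, and the "exactly one slash, both halves non-empty" rule is an assumption about the scope shape rather than documented behavior.

```shell
# Hypothetical helper (not part of git-drs): split "organization/project"
# into two output lines, or fail if the value is not shaped that way.
# Assumption: a valid scope has exactly one slash and two non-empty halves.
split_scope() {
  scope=$1
  case $scope in
    */*/*|/*|*/|'') return 1 ;;   # extra slashes, or an empty half
    */*) ;;                        # exactly one slash, text on both sides
    *) return 1 ;;                 # no slash at all
  esac
  printf '%s\n%s\n' "${scope%%/*}" "${scope#*/}"
}
```

For example, split_scope HTAN_INT/BForePC prints HTAN_INT and BForePC on separate lines, while split_scope HTAN_INT returns a non-zero status.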

Verify:

git drs remote list

New Repository Setup

For a new repository or a repository that has not yet been configured with git-drs:

  1. Initialize the repository:

    git drs init

  2. Add the target remote:

    git drs remote add gen3 production HTAN_INT/BForePC --cred /path/to/credentials.json

  3. Verify the configuration:

    git drs remote list

Steward/Admin Prerequisite

Push and pull depend on server-side bucket mapping for the target scope.

That usually means a steward or admin has already run commands such as:

git drs bucket add production \
  --bucket cbds \
  --region us-east-1 \
  --access-key "$AWS_ACCESS_KEY_ID" \
  --secret-key "$AWS_SECRET_ACCESS_KEY"

git drs bucket add-organization production \
  --organization HTAN_INT \
  --path s3://cbds/htan-int

git drs bucket add-project production \
  --organization HTAN_INT \
  --project BForePC \
  --path s3://cbds/htan-int/bforepc

End users generally should not need to know the bucket name.

Credentials

For Gen3-backed deployments:

  • obtain a credential JSON or token from the target data commons
  • the common path is: log in -> profile -> create API key -> download JSON
  • refresh it when it expires
  • re-run git drs remote add gen3 ... --cred ... when you need to refresh the stored profile

Example:

git drs remote add gen3 production HTAN_INT/BForePC --cred /path/to/new-credentials.json
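
Before re-adding the remote, it can help to confirm that the downloaded file is readable and looks like a credential JSON. The following is a minimal pre-flight sketch. check_cred_file is a hypothetical helper, and the presence of an "api_key" field is an assumption based on how Gen3 API key downloads are commonly structured; adjust it if your data commons uses a different layout.

```shell
# Hypothetical pre-flight check (not part of git-drs).
# Assumption: the credential JSON contains an "api_key" field.
check_cred_file() {
  cred=$1
  # Fail early if the file is missing or unreadable.
  [ -r "$cred" ] || { echo "cannot read $cred" >&2; return 1; }
  # Fail if the expected field name does not appear anywhere in the file.
  grep -q '"api_key"' "$cred" || { echo "no api_key field in $cred" >&2; return 1; }
}
```

A typical use would be to chain it in front of the command above: check_cred_file /path/to/new-credentials.json && git drs remote add gen3 production HTAN_INT/BForePC --cred /path/to/new-credentials.json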

Managing Additional Remotes

You can add multiple remotes for multi-environment workflows.

git drs remote add gen3 staging HTAN_INT/BForePC --cred /path/to/staging-credentials.json
git drs remote list
git drs remote set staging

Or target a non-default remote for a single command:

git drs push production
git drs copy-records staging production HTAN_INT/BForePC
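
In a multi-environment setup, the single-command form can be repeated for several remotes. The sketch below loops over remote names and stops at the first failure. push_to_remotes is a hypothetical helper, not a git-drs command; it relies only on the git drs push <remote> form shown above.

```shell
# Hypothetical helper (not part of git-drs): run "git drs push" once per
# named remote, in the order given, stopping at the first failure.
push_to_remotes() {
  for remote in "$@"; do
    git drs push "$remote" || return 1
  done
}
```

For example, push_to_remotes staging production pushes to staging first and only continues to production if that succeeds.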

Track Files

Track file types or paths you want managed by git-drs:

git drs track "*.bam"
git add .gitattributes

You can also track explicit paths or path globs:

git drs track "data/**"
git add .gitattributes

View current tracking:

git drs track

Stop tracking patterns:

git drs untrack "*.bam"
git add .gitattributes
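
Since track and untrack are followed by git add .gitattributes, stock Git can be used to confirm which paths a pattern actually covers. The sketch below wraps git check-attr, a standard Git command. This guide does not state which attribute names git-drs writes, so the helper (show_attrs, a hypothetical name) lists every attribute on a path instead of assuming one.

```shell
# Print all gitattributes that apply to each given path.
# Uses only stock Git and makes no assumption about which attribute
# git-drs sets; paths with no attributes produce no output.
show_attrs() {
  git check-attr -a -- "$@"
}
```

Running show_attrs sample.bam notes.txt after tracking "*.bam" should list attributes for sample.bam only, if the pattern was recorded as expected.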

Add, Commit, and Push

git add sample.bam
git commit -m "Add sample"
git push

git-drs handles pointer and object registration around these standard Git commands, using the filter configuration and hooks set up by git drs install and git drs init.

Inspect Tracked Files

Use ls-files as the local inventory command:

git drs ls-files
git drs ls-files -l
git drs ls-files --drs
git drs ls-files -I "*.bam"

Interpretation:

  • * means localized/hydrated in the worktree
  • - means the worktree still contains a pointer
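
The markers make it easy to summarize how much of a checkout is hydrated. The sketch below counts them from piped input. count_hydration is a hypothetical helper, and the line layout is an assumption: it expects the * or - marker to be the second whitespace-separated field on each line, which may not match the real git drs ls-files output. Adjust the field number after checking your own output.

```shell
# Hypothetical summary helper (not part of git-drs).
# Assumption: each input line looks like "<id> <marker> <path>", with the
# "*" or "-" marker as the second whitespace-separated field.
count_hydration() {
  awk '$2 == "*" { h++ } $2 == "-" { p++ }
       END { printf "hydrated=%d pointers=%d\n", h, p }'
}
```

Intended use is git drs ls-files | count_hydration, which prints one summary line.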

Hydrate Files

Use git drs pull only for hydration.

git drs pull
git drs pull -I "*.bam"
git drs pull -I "results/**" -I "*.txt"

Important:

  • git drs pull does not run git pull
  • run plain git pull yourself when you want new commits/trees
  • then run git drs pull if you need to hydrate pointer files in the checkout
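
The two-step sequence above can be wrapped in a small shell function if you run it often. drs_sync is a hypothetical name, not a git-drs command; the sketch only chains the two commands already described and passes any extra arguments to git drs pull.

```shell
# Hypothetical convenience wrapper (not part of git-drs):
# update Git history first, then hydrate pointer files.
# Extra arguments, such as -I "*.bam", are passed to git drs pull.
# Hydration is skipped if git pull fails.
drs_sync() {
  git pull && git drs pull "$@"
}
```

For example, drs_sync -I "*.bam" runs git pull and then hydrates only the matching files.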

Add Existing Bucket Objects

If the object already exists in provider storage, use add-url:

# Track the file pattern first
git drs track "myfile.txt"
git add .gitattributes

# Add the object reference when the SHA-256 is already known
git drs add-url s3://bucket/path/to/file myfile.txt \
  --sha256 <file-hash>

# Or omit --sha256 to use the unknown-sha path
git drs add-url s3://bucket/path/to/file myfile.txt

# Commit and push
git add myfile.txt
git commit -m "Add S3 file reference"
git push
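
If you have a local copy of the same bytes that are stored in the bucket, the value for --sha256 can be computed with standard tools. The sketch below is one way to do it. file_sha256 is a hypothetical helper name; it uses sha256sum where available and falls back to shasum -a 256, which is common on macOS.

```shell
# Print the SHA-256 hex digest of a file, using whichever common tool exists.
# Only the digest is printed, without the trailing file name.
file_sha256() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{ print $1 }'
  else
    shasum -a 256 "$1" | awk '{ print $1 }'
  fi
}
```

The result can then be passed inline, for example --sha256 "$(file_sha256 /path/to/local-copy)". The digest is only meaningful if the local copy is byte-identical to the stored object.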

Scoped bucket-key mode also works:

git drs add-url path/to/object.bin data/from-bucket.bin --scheme s3
git add data/from-bucket.bin
git commit -m "Add bucket-backed object reference"
git push

Explicit provider URL mode also works:

git drs add-url s3://my-bucket/path/to/object.bin data/from-bucket.bin

Session Workflow

Note: You do not need to run git drs init again. Initialization is a one-time setup per local repository clone.

For a normal work session:

  1. Refresh credentials if needed:

    git drs remote add gen3 production HTAN_INT/BForePC --cred /path/to/new-credentials.json

  2. Update Git history if needed:

    git pull

  3. Hydrate tracked files if needed:

    git drs pull

  4. Work with files normally:

    git add ...
    git commit -m "..."
    git push

Configuration Management

View current remote configuration:

git drs remote list

Refresh or update credentials by re-adding the remote:

git drs remote add gen3 production HTAN_INT/BForePC --cred /path/to/new-credentials.json

Local DRS Server Setup

Use this flow when developing against a local Syfon/DRS server instead of a hosted Gen3 deployment.

  1. Initialize the repo:

    git drs init

  2. Add the local remote:

    git drs remote add local origin http://localhost:8080

    If your local server requires basic auth, include the local auth flags supported by that command.

  3. Track and push:

    git drs track "*.bin"
    git add .gitattributes data/example.bin
    git commit -m "Add local DRS test file"
    git drs push

  4. Verify hydration:

    git drs pull

For full local/remote runbooks, see E2E Modes + Local Setup.

Copy Metadata Between Remotes

Use copy-records to copy Syfon metadata records between remotes for a single scope:

git drs copy-records dev prod HTAN_INT/BForePC

Or let the default remote be the source:

git drs copy-records prod HTAN_INT/BForePC

This copies metadata only. It does not copy object bytes between buckets.

Common Flow Summary

git drs install
git drs init
git drs remote add gen3 production HTAN_INT/BForePC --cred /path/to/credentials.json
git drs track "*.bam"
git add .gitattributes
git add sample.bam
git commit -m "Add sample"
git push
git drs ls-files
git drs pull -I "*.bam"

For command details, see commands.md.