Small Python utility to turn a local directory into one LLM-friendly text digest.
It walks a directory, prints the included file tree, and appends the content of each included text file into a single output text file.

Install dependencies:

    python -m pip install -r requirements.txt

Useful when:
- a repo is private
- you want to share project context with an LLM without pushing the repo
- you want a simple local alternative to remote ingestion tools

Features:
- scans a local directory only
- applies the source directory's .gitignore by default
- skips common cache / build / binary files
- includes code, config, and doc files up to a configurable size limit
- supports include / exclude glob filters
- can convert .ipynb notebooks into readable text
- estimates tokens with tiktoken using the o200k_base encoding
- can print lightweight scan progress
- writes to a file or stdout
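
As an illustration of the notebook conversion mentioned above, here is a minimal sketch of how an .ipynb file could be flattened into readable text. It is not the tool's actual code: the function name `notebook_to_text`, the `include_output` parameter, and the cell header format are all hypothetical. It relies only on the standard nbformat 4 JSON layout and handles stream outputs only.

```python
import json


def notebook_to_text(raw_json: str, include_output: bool = False) -> str:
    """Flatten a Jupyter notebook (nbformat 4 JSON) into plain text.

    Hypothetical helper for illustration; not taken from local_dir_ingest.py.
    """
    nb = json.loads(raw_json)
    parts = []
    for index, cell in enumerate(nb.get("cells", []), start=1):
        kind = cell.get("cell_type", "unknown")
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        # The header format below is an assumption, chosen for readability.
        parts.append(f"# --- cell {index} ({kind}) ---\n{source.rstrip()}")
        if include_output and kind == "code":
            for out in cell.get("outputs", []):
                # Only stream outputs (stdout/stderr) carry a "text" field.
                text = out.get("text", "")
                if isinstance(text, list):
                    text = "".join(text)
                if text:
                    parts.append(f"# output:\n{text.rstrip()}")
    return "\n\n".join(parts) + "\n"
```

Dropping outputs by default keeps the digest small, since cell outputs can be far larger than the code that produced them.
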

Requirements:
- Python 3.9+
- pathspec>=0.12.1
- tiktoken>=0.7.0

Usage:

    python local_dir_ingest.py /path/to/project

This writes digest.txt in the current directory.

Other invocations:

    python local_dir_ingest.py /path/to/project -o project_digest.txt
    python local_dir_ingest.py /path/to/project -o -
    python local_dir_ingest.py /path/to/project --progress
    python local_dir_ingest.py /path/to/project --max-file-kb 200
    python local_dir_ingest.py /path/to/project -i "*.py" -i "*.md"
    python local_dir_ingest.py /path/to/project -e "data/*" -e "*.pt"
    python local_dir_ingest.py /path/to/project --include-notebook-output
    python local_dir_ingest.py /path/to/project --follow-symlinks
    python local_dir_ingest.py /path/to/project --no-gitignore

The generated digest contains:
- source directory summary
- traversal stats
- included directory tree
- file-by-file content blocks
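
To make the last two items concrete, the sketch below shows one way a directory tree and a file content block could be rendered from a list of relative paths. The helper names `render_tree` and `file_block` are hypothetical, and the name-sorted ordering and the 48-character separator width are assumptions, not confirmed details of the tool.

```python
SEPARATOR = "=" * 48  # assumed width of the "=" rule around each file header


def render_tree(root_name, rel_paths):
    """Render POSIX-style relative paths as a box-drawing tree (hypothetical helper)."""
    # Nested dict: directories map to dicts, files map to None.
    tree = {}
    for rel in rel_paths:
        node = tree
        parts = rel.split("/")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = None

    lines = [f"└── {root_name}/"]

    def walk(node, prefix):
        names = sorted(node)  # plain name order; the real ordering may differ
        for i, name in enumerate(names):
            last = i == len(names) - 1
            child = node[name]
            label = name + "/" if child is not None else name
            lines.append(f"{prefix}{'└── ' if last else '├── '}{label}")
            if child is not None:
                # Keep a vertical guide only while siblings remain below.
                walk(child, prefix + ("    " if last else "│   "))

    walk(tree, "    ")
    return "\n".join(lines)


def file_block(rel_path, text):
    """Wrap one file's content in a header block (hypothetical helper)."""
    return f"{SEPARATOR}\nFILE: {rel_path}\n{SEPARATOR}\n{text}\n"


if __name__ == "__main__":
    print(render_tree("project", ["README.md", "app.py", "utils/helpers.py"]))
    print(file_block("README.md", "..."))
```
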
Example:
Directory: /path/to/project
Scanned directories: 24
Scanned files: 312
Files analyzed: 12
Included bytes: 41,203 (40.2 KB)
Estimated tokens: 12.3k
...
Directory structure:
└── project/
    ├── README.md
    ├── app.py
    └── utils/
        └── helpers.py
================================================
FILE: README.md
================================================
...

Notes:
- default max file size is 50 KB
- default include patterns cover common code, config, and docs such as *.py, *.sh, *.md, *.txt, *.yaml, *.json, and *.toml
- the source directory's .gitignore is applied by default
- binary / media / archive files are skipped
- common directories like .git, node_modules, dist, build, and virtual envs are skipped
- token estimates use the same o200k_base tokenizer approach as gitingest
- if no files match, the tool still writes a valid digest with a message
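
The filtering rules above can be sketched as a single predicate. This is a simplified illustration, not the tool's implementation: `should_include` and the constants are hypothetical names, the include list holds only the patterns named above, `.venv` and `venv` are assumed spellings for "virtual envs", and .gitignore handling (done with pathspec in the real tool) is left out so the example needs only the standard library.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Assumed values, taken from the notes above where they are stated.
SKIP_DIRS = {".git", "node_modules", "dist", "build", ".venv", "venv"}
DEFAULT_INCLUDE = ("*.py", "*.sh", "*.md", "*.txt", "*.yaml", "*.json", "*.toml")
MAX_FILE_BYTES = 50 * 1024  # 50 KB default limit


def _matches(rel_path, patterns):
    """True if the full relative path or the bare file name matches any glob."""
    name = PurePosixPath(rel_path).name
    return any(fnmatchcase(rel_path, p) or fnmatchcase(name, p) for p in patterns)


def should_include(rel_path, size_bytes, include=DEFAULT_INCLUDE,
                   exclude=(), max_bytes=MAX_FILE_BYTES):
    """Decide whether a file (POSIX-style relative path) belongs in the digest."""
    parents = PurePosixPath(rel_path).parts[:-1]
    if any(part in SKIP_DIRS for part in parents):
        return False  # inside a skipped directory
    if size_bytes > max_bytes:
        return False  # over the size limit
    if _matches(rel_path, exclude):
        return False  # exclude patterns win over include patterns (assumption)
    return _matches(rel_path, include)


if __name__ == "__main__":
    print(should_include("src/app.py", 1_000))
    print(should_include("data/raw.json", 10, exclude=["data/*"]))
```

Checking the cheap conditions (directory name, size) before the glob patterns means most rejected files never reach pattern matching.
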

Full example:

    python local_dir_ingest.py ~/dev/my_private_repo -i "*.py" -i "*.md" -e "data/*" -o digest.txt

License: MIT