Extremely fast CLI tool for previewing CSV and Parquet files — built for data engineers and data scientists.
Think head/tail but data-aware: shows row counts, column counts, inferred types, and colorized tabular output.
Built on Polars — a high-performance DataFrame library written in Rust. Output uses the standard Polars table format (column names, types, and values).
uvx dpeek file.parquet # auto-installs on first run (uv required)
uvx is an alias for uv tool run. It downloads dpeek on first use, caches it, and runs it, so no manual install step is needed. Get uv at docs.astral.sh/uv.
# Add to global PATH via uv (recommended for daily use)
uv tool install dpeek # then just: dpeek file.parquet
# Homebrew (macOS / Linux)
brew install Xyc2016/tap/dpeek
# Rust toolchain — downloads pre-built binary, no compile needed
cargo binstall dpeek
# Rust toolchain — compile from source
cargo install dpeek
# Python / pip
pip install dpeek
Download a pre-built binary from GitHub Releases:
| Platform | File |
|---|---|
| macOS (Apple Silicon) | dpeek-aarch64-apple-darwin |
| Linux (x86-64) | dpeek-x86_64-unknown-linux-gnu |
| Windows (x86-64) | dpeek-x86_64-pc-windows-msvc.exe |
# macOS example
sudo curl -fsSL https://github.com/Xyc2016/dpeek/releases/latest/download/dpeek-aarch64-apple-darwin \
-o /usr/local/bin/dpeek && sudo chmod +x /usr/local/bin/dpeek
dpeek file.parquet # head 5 rows
dpeek file.csv -n 20 # head 20 rows
dpeek tail file.parquet # tail 5 rows
dpeek schema file.parquet # show column names and types
dpeek -c col1,col2 file.csv # select columns by name
dpeek -c 0:5 file.parquet # select columns by range (0-based)
dpeek -d '|' file.csv # custom delimiter (tab: \t)
dpeek --fast file.csv # skip full scan (faster, no row count)
$ dpeek examples/titanic.parquet
examples/titanic.parquet 891 rows × 15 cols (showing top 5)
┌──────────┬────────┬────────┬──────┬───────┬───────┬─────────┬──────────┬───────┬───────┬────────────┬──────┬─────────────┬───────┬───────┐
│ survived ┆ pclass ┆ sex ┆ age ┆ sibsp ┆ parch ┆ fare ┆ embarked ┆ class ┆ who ┆ adult_male ┆ deck ┆ embark_town ┆ alive ┆ alone │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ str ┆ f64 ┆ i64 ┆ i64 ┆ f64 ┆ str ┆ str ┆ str ┆ bool ┆ str ┆ str ┆ str ┆ bool │
╞══════════╪════════╪════════╪══════╪═══════╪═══════╪═════════╪══════════╪═══════╪═══════╪════════════╪══════╪═════════════╪═══════╪═══════╡
│ 0 ┆ 3 ┆ male ┆ 22.0 ┆ 1 ┆ 0 ┆ 7.25 ┆ S ┆ Third ┆ man ┆ true ┆ null ┆ Southampton ┆ no ┆ false │
│ 1 ┆ 1 ┆ female ┆ 38.0 ┆ 1 ┆ 0 ┆ 71.2833 ┆ C ┆ First ┆ woman ┆ false ┆ C ┆ Cherbourg ┆ yes ┆ false │
│ 1 ┆ 3 ┆ female ┆ 26.0 ┆ 0 ┆ 0 ┆ 7.925 ┆ S ┆ Third ┆ woman ┆ false ┆ null ┆ Southampton ┆ yes ┆ true │
│ 1 ┆ 1 ┆ female ┆ 35.0 ┆ 1 ┆ 0 ┆ 53.1 ┆ S ┆ First ┆ woman ┆ false ┆ C ┆ Southampton ┆ yes ┆ false │
│ 0 ┆ 3 ┆ male ┆ 35.0 ┆ 0 ┆ 0 ┆ 8.05 ┆ S ┆ Third ┆ man ┆ true ┆ null ┆ Southampton ┆ no ┆ true │
└──────────┴────────┴────────┴──────┴───────┴───────┴─────────┴──────────┴───────┴───────┴────────────┴──────┴─────────────┴───────┴───────┘
$ dpeek schema examples/titanic.parquet
examples/titanic.parquet 891 rows × 15 cols
survived i64
pclass i64
sex str
age f64
sibsp i64
parch i64
fare f64
embarked str
class str
who str
adult_male bool
deck str
embark_town str
alive str
alone bool
$ dpeek -c survived,sex,age examples/titanic.parquet
examples/titanic.parquet 891 rows × 15 cols (showing top 5, 3 cols)
┌──────────┬────────┬──────┐
│ survived ┆ sex ┆ age │
│ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ f64 │
╞══════════╪════════╪══════╡
│ 0 ┆ male ┆ 22.0 │
│ 1 ┆ female ┆ 38.0 │
│ 1 ┆ female ┆ 26.0 │
│ 1 ┆ female ┆ 35.0 │
│ 0 ┆ male ┆ 35.0 │
└──────────┴────────┴──────┘
| Flag | Description |
|---|---|
| -n N | Number of rows to show (default: 5) |
| --fast | Fast mode: skip full CSV scan (CSV only, see below) |
| -c COLS | Column selection: col1,col2 (names) or 0:5 (0-based range) |
| -d CHAR | Field delimiter for CSV (default: ,). Use \t for tab |
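To make the two forms of the -c flag concrete, here is a minimal Python sketch of how such a spec could be resolved against a header. It is not dpeek's implementation: `parse_column_spec` is a hypothetical helper, and the sketch assumes the range end is exclusive (like a Python slice), which the text above does not state.

```python
import re

# Header taken from the titanic example; used only for illustration.
HEADER = ["survived", "pclass", "sex", "age", "sibsp"]


def parse_column_spec(spec: str, header: list[str]) -> list[str]:
    """Resolve a -c style spec to column names (hypothetical helper)."""
    range_match = re.fullmatch(r"(\d*):(\d*)", spec)
    if range_match:
        start_text, end_text = range_match.groups()
        start = int(start_text) if start_text else 0
        end = int(end_text) if end_text else len(header)
        # ASSUMPTION: 0-based start, exclusive end. The real tool may differ.
        return header[start:end]

    # Otherwise treat the spec as a comma-separated list of names.
    names = [name.strip() for name in spec.split(",")]
    missing = [name for name in names if name not in header]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    return names


by_name = parse_column_spec("survived,sex,age", HEADER)
by_range = parse_column_spec("0:2", HEADER)
```

Under these assumptions, `by_name` keeps the order given on the command line and `by_range` selects the first two columns.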
--fast only affects CSV files. Parquet stores schema and row count in its file footer — dpeek reads that metadata directly at no extra cost, so there's nothing to skip.
dpeek defaults to accuracy. In default mode:
- Parquet: row count comes from file metadata (free, no scan needed). Column types are stored in the file schema, so they are exact rather than inferred.
- CSV: dpeek scans the entire file to count rows and infer types across all data.
--fast trades accuracy for speed. With --fast:
- Type inference uses only the first 100 rows (may mis-detect types in dirty data)
- Row count is skipped (not shown in output header)
- tail is disabled for CSV (requires a full scan to find the end)
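The accuracy trade-off can be illustrated with a small Python sketch that infers column types from either a sample of rows or the whole file. This is a simplified stand-in, not the Polars inference algorithm that dpeek relies on: `infer_type` and `infer_schema` are hypothetical helpers that only distinguish i64, f64, and str.

```python
import csv
import io
from itertools import islice


def infer_type(values):
    """Return the narrowest of i64 / f64 / str that fits every non-empty value."""
    kinds = ["i64", "f64", "str"]
    level = 0  # index into kinds; only ever widens
    for value in values:
        if value == "":
            continue  # treat empty fields as nulls
        while level < 2:
            try:
                (int if level == 0 else float)(value)
                break
            except ValueError:
                level += 1
    return kinds[level]


def infer_schema(csv_text, sample_rows=None):
    """Infer types from the first sample_rows rows, or all rows if None."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = list(islice(reader, sample_rows))
    return {name: infer_type(col) for name, col in zip(header, zip(*rows))}


# 100 clean rows followed by one dirty value in the "id" column.
lines = ["id,fare"] + [f"{i},7.25" for i in range(100)] + ["unknown,8.05"]
text = "\n".join(lines)

sampled = infer_schema(text, sample_rows=100)  # {'id': 'i64', 'fare': 'f64'}
full = infer_schema(text)                      # {'id': 'str', 'fare': 'f64'}
```

The sampled pass never sees row 101, so it reports `id` as an integer column, while the full pass correctly falls back to a string type.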
Use --fast when you just need a quick look and the file is large.
| Subcommand | Description |
|---|---|
| tail | Show the last N rows |
| schema | Show column names and types without loading data |
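As a rough illustration of why tail on a CSV needs a pass over the whole file, here is a Python sketch that keeps only the last N parsed rows in a bounded buffer. It is one straightforward approach under the assumption of a streaming reader, not a description of how dpeek does it; `csv_tail` is a hypothetical helper.

```python
import csv
import io
from collections import deque


def csv_tail(csv_text, n=5):
    """Return (header, last n rows); every row is parsed once."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # A deque with maxlen discards older rows as newer ones arrive,
    # so memory stays bounded even though the whole input is read.
    last_rows = deque(reader, maxlen=n)
    return header, list(last_rows)


sample = "id,name\n" + "\n".join(f"{i},row{i}" for i in range(10))
header, rows = csv_tail(sample, n=3)
```

Because CSV has no index and quoted fields may contain newlines, the reader cannot safely jump to the end, which is why the work grows with file size.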
Measured on Apple M4, macOS, release build. All times are for the default head command (5 rows). All times include process startup.
| File | Size | Mode | Time |
|---|---|---|---|
| titanic.parquet | 11 KB | default | ~23ms |
| iris.csv | 4 KB | default | ~24ms |
| yellow_tripdata_2015-01.parquet | 167 MB | default | ~35ms |
| yellow_tripdata_2015-01.csv | 1.8 GB | --fast | ~24ms |
| yellow_tripdata_2015-01.csv | 1.8 GB | default | ~30s |
The last row shows why --fast exists: default mode on a 1.8 GB CSV scans the entire file to count rows and infer types accurately. --fast drops that to 24ms by reading only the first 100 rows.
Cold cache is the more realistic metric for day-to-day use — the first time you open a file after receiving it, it won't be in the OS cache.
| File | Size | Mode | Time |
|---|---|---|---|
| titanic.parquet | 11 KB | default | ~80ms |
| iris.csv | 4 KB | default | ~80ms |
| yellow_tripdata_2015-01.parquet | 167 MB | default | ~210ms |
| yellow_tripdata_2015-01.csv | 1.8 GB | --fast | ~90ms |
Why Parquet is fast even cold: dpeek reads only the file footer (schema + row count metadata) and the first row group. For a 167 MB file, that's typically well under 10 MB of actual I/O.
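The footer access pattern can be sketched in a few lines of Python. A plaintext Parquet file ends with the Thrift-encoded file metadata, a 4-byte little-endian metadata length, and the magic bytes PAR1, so a reader can locate the metadata with one small read from the end of the file. The sketch below only locates the metadata bytes and does not decode them; `read_footer_bytes` is a hypothetical helper and the input is a fabricated byte layout, not a real Parquet file.

```python
import io
import struct

MAGIC = b"PAR1"


def read_footer_bytes(f):
    """Return the raw footer metadata bytes using two small reads."""
    f.seek(0, io.SEEK_END)
    size = f.tell()
    if size < 12:
        raise ValueError("too small to be a Parquet file")

    # Last 8 bytes: 4-byte little-endian footer length + magic.
    f.seek(size - 8)
    tail = f.read(8)
    if tail[4:] != MAGIC:
        raise ValueError("missing PAR1 magic at end of file")
    (footer_len,) = struct.unpack("<I", tail[:4])

    start = size - 8 - footer_len
    if start < len(MAGIC):
        raise ValueError("footer length exceeds file size")
    f.seek(start)
    return f.read(footer_len)


# Fabricated layout for illustration: magic, dummy data pages, fake metadata.
fake_metadata = b"thrift-metadata"
fake_file = io.BytesIO(
    MAGIC
    + b"\x00" * 1000
    + fake_metadata
    + struct.pack("<I", len(fake_metadata))
    + MAGIC
)
footer = read_footer_bytes(fake_file)
```

The 1000 bytes of dummy data are never read, which mirrors why the amount of I/O is largely independent of total file size.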
Why CSV --fast is fast cold: with --fast, dpeek reads only the first 100 rows (~a few KB). Even cold, almost no I/O is needed.
For comparison: Python tools (pandas, pyarrow) add ~500ms of interpreter startup on top of these numbers when invoked as CLI commands.
| Format | Extension |
|---|---|
| Parquet | .parquet |
| CSV | .csv |
cargo build --release # binary at target/release/dpeek
cargo test # run unit tests