cloakrs is a Rust library and CLI for detecting and masking personally identifiable information in text, logs, JSON, CSV, and database dumps.
It ships universal recognizers for emails, phone numbers, credit cards, IBANs, IP addresses, URLs, API keys, JWTs, AWS access keys, MAC addresses, crypto wallet addresses, and context-dependent dates of birth. Locale bundles add identifiers such as US SSNs, Dutch BSNs, UK NINO/NHS numbers, German Steuer-IDs, Indian Aadhaar/PAN values, Brazilian CPF/CNPJ values, and French INSEE/NIR numbers.
See supported entities for the full detection matrix, including validation algorithms, confidence ranges, and examples.
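As an illustration of the kind of validation algorithm a recognizer can apply after a pattern match, here is a minimal sketch of the Luhn checksum, the standard check for payment card numbers. It is not taken from the cloakrs source. The function name `luhn_valid` and the 12 to 19 digit length bounds are assumptions made for this example.

```rust
/// Returns true if `input` passes the Luhn checksum.
/// Spaces and hyphens are skipped; any other non-digit character fails the check.
/// Hypothetical helper for illustration, not part of the cloakrs API.
fn luhn_valid(input: &str) -> bool {
    let mut digits = Vec::new();
    for c in input.chars() {
        match c {
            ' ' | '-' => continue,
            _ => match c.to_digit(10) {
                Some(d) => digits.push(d),
                None => return false,
            },
        }
    }
    // Assumed length bounds for this sketch.
    if digits.len() < 12 || digits.len() > 19 {
        return false;
    }
    // Walk from the rightmost digit, doubling every second one.
    let sum: u32 = digits
        .iter()
        .rev()
        .enumerate()
        .map(|(i, &d)| {
            if i % 2 == 1 {
                let doubled = d * 2;
                if doubled > 9 { doubled - 9 } else { doubled }
            } else {
                d
            }
        })
        .sum();
    sum % 10 == 0
}

fn main() {
    // A widely used test card number that satisfies the checksum.
    assert!(luhn_valid("4111 1111 1111 1111"));
    // Changing the final check digit breaks it.
    assert!(!luhn_valid("4111 1111 1111 1112"));
}
```

A checksum like this lets a recognizer discard digit runs that merely look like card numbers, which is one way a detector can raise confidence beyond a bare regex match.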
cargo install cloakrs-cli

For local development:
cargo build --workspace
cargo test --workspace
cargo run -p cloakrs-cli -- scan tests/fixtures/sample_text.txt

use cloakrs_core::Locale;
let scanner = cloakrs_locales::default_registry()
.into_scanner_builder()
.locale(Locale::US)
.build()?;
let result = scanner.scan("Contact jane@example.com or ssn 123-45-6789")?;
assert_eq!(result.masked_text.as_deref(), Some("Contact [EMAIL] or ssn [SSN]"));
# Ok::<(), cloakrs_core::CloakError>(())

# Scan a file and print a human-readable report.
cloakrs scan tests/fixtures/sample_text.txt --locale us --output-format text
# Produce SARIF for code scanning systems.
cloakrs audit . --output-format sarif --output cloakrs.sarif
# Mask a CSV file, scanning selected columns only.
cloakrs scan users.csv --format csv --columns email,phone --output users.masked.csv

The workspace is split into five crates with one-way dependencies:
cloakrs-core -> cloakrs-patterns -> cloakrs-locales -> cloakrs-adapters -> cloakrs-cli
- cloakrs-core: scanner, recognizer trait, shared types, masking strategies
- cloakrs-patterns: universal recognizers such as email, phone, card, IBAN
- cloakrs-locales: country-specific recognizers such as US SSN and Dutch BSN
- cloakrs-adapters: streaming handlers for text, JSON, CSV, logs, and SQL dumps
- cloakrs-cli: the cloakrs command-line interface
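To make the layering concrete, the following is a rough sketch of how a recognizer trait, a scanner, and a label-replacement masking strategy could fit together. Every name here (`Finding`, `Recognizer`, `ToyEmail`, `Scanner`) is hypothetical and chosen for illustration; the real cloakrs-core types and signatures may differ.

```rust
// Hypothetical sketch: these names are illustrative and are not the cloakrs API.

/// A detected span of sensitive text.
#[derive(Debug, Clone, PartialEq)]
struct Finding {
    label: &'static str,
    start: usize, // byte offset, inclusive
    end: usize,   // byte offset, exclusive
}

/// Anything that can find one kind of entity in text.
trait Recognizer {
    fn scan(&self, text: &str) -> Vec<Finding>;
}

/// Toy recognizer: flags space-separated tokens that contain '@'.
struct ToyEmail;

impl Recognizer for ToyEmail {
    fn scan(&self, text: &str) -> Vec<Finding> {
        let mut findings = Vec::new();
        let mut offset = 0;
        for token in text.split(' ') {
            if token.contains('@') {
                findings.push(Finding {
                    label: "EMAIL",
                    start: offset,
                    end: offset + token.len(),
                });
            }
            offset += token.len() + 1; // skip the separating space
        }
        findings
    }
}

/// Runs every recognizer and replaces each finding with `[LABEL]`.
struct Scanner {
    recognizers: Vec<Box<dyn Recognizer>>,
}

impl Scanner {
    fn mask(&self, text: &str) -> String {
        let mut findings: Vec<Finding> = self
            .recognizers
            .iter()
            .flat_map(|r| r.scan(text))
            .collect();
        // Replace from the end so earlier offsets stay valid.
        // Overlapping findings are not handled in this sketch.
        findings.sort_by(|a, b| b.start.cmp(&a.start));
        let mut masked = text.to_string();
        for f in findings {
            masked.replace_range(f.start..f.end, &format!("[{}]", f.label));
        }
        masked
    }
}

fn main() {
    let recognizers: Vec<Box<dyn Recognizer>> = vec![Box::new(ToyEmail)];
    let scanner = Scanner { recognizers };
    assert_eq!(
        scanner.mask("Contact jane@example.com today"),
        "Contact [EMAIL] today"
    );
}
```

Keeping the trait in the lowest crate is what allows pattern and locale crates to add recognizers without the scanner knowing about any of them, which matches the one-way dependency chain described above.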
| Tool | Language | Runtime requirements | Primary fit | Benchmark status |
|---|---|---|---|---|
| cloakrs | Rust | Single native binary | Fast local scanning and masking | Criterion suite included |
| Microsoft Presidio | Python | Python plus NLP dependencies | NLP-rich enterprise workflows | Run locally for same-hardware numbers |
| DataFog | Python | Python runtime | App-level PII detection | Run locally for same-hardware numbers |
| scrubadub | Python | Python runtime | Text scrubbing | Not benchmarked in-tree |
| piidetect | Go | Native binary | Lightweight PII detection | Not benchmarked in-tree |
Run the local benchmark suite with:
cargo bench -p cloakrs-cli --bench scan_benchmark

The benchmark harness covers 1KB through 10MB inputs for plain text, JSON, and CSV, each recognizer individually, and all masking strategies. See docs/benchmarking.md.
- Adding recognizers
- Adding locale recognizers
- Supported entities
- CI/CD integration
- Benchmarking
- Release checklist
The first Rust release is published on crates.io. See implementation status for completed work and known gaps.
MIT. See LICENSE.md.