Train and run a personal task-type classifier (e.g. coding / writing / meetings) using privacy-preserving computer activity signals such as foreground app/window metadata and aggregated input statistics (counts/rates only).
This project is intentionally scoped as a personalized classifier (single-user first). The architecture keeps:
- Collectors (platform/tool dependent) isolated behind adapters
- Features as a versioned, validated contract
- Models as bundled artifacts with schema checks
- Inference as a small, stable loop that emits task segments and daily summaries
- Fast iteration: first useful model in < 1 week of data
- Privacy: no raw keystrokes, no raw window titles persisted
- Stability: feature schema versioning + schema hash gates
- Extensibility: add new collectors and models without breaking consumers
- Universal (multi-user) generalization out of the box
- Storing or analyzing raw typed content
- "Perfect" labeling UI (start minimal, iterate later)
Eight core labels defined in schema/labels_v1.json:
| ID | Label | Description |
|---|---|---|
| 0 | Build |
Writing or implementing structured content in editor/terminal |
| 1 | Debug |
Investigating issues, terminal-heavy troubleshooting |
| 2 | Review |
Reviewing technical material or diffs with light edits |
| 3 | Write |
Writing structured non-code content |
| 4 | ReadResearch |
Consuming information with minimal production |
| 5 | Communicate |
Asynchronous coordination (chat/email) |
| 6 | Meet |
Synchronous meetings or calls |
| 7 | BreakIdle |
Idle or break period |
Labels are stored as time spans (not per-keystroke events). Users can remap
core labels to personal categories via a taxonomy config
(see configs/user_taxonomy_example.yaml).
- ETL pipeline reads raw → produces features parquet
- Training pipeline reads features + labels → produces model
- Inference pipeline reads new events → emits predictions + segments
- Ingest: pull ActivityWatch export →
data/raw/aw/ - Feature build: events → per-minute features →
data/processed/features_v1/ - Label import: label spans →
data/processed/labels_v1/ - Build dataset: join features + labels, split by time → training arrays
- Train: fit model →
models/<run_id>/ - Evaluate: metrics, acceptance checks, calibration
- Report: daily summaries →
artifacts/
Every N seconds:
- read the last minute(s) of events
- compute the latest feature bucket
- predict + smooth (with optional calibration and taxonomy mapping)
- append predictions →
artifacts/
At end-of-day:
- produce report
This repo enforces the following:
- No raw keystrokes are stored (only aggregate counts/rates).
- No raw window titles are stored by default.
- Titles are hashed or locally tokenized; you can keep a local mapping if you choose.
- Dataset artifacts stay local-first.
- Python >= 3.14
- For the recommended CLI install: uv
Command-line (PyPI) — install the taskclf CLI:
uv tool install taskclfOr with pip only:
pip install taskclfThen run taskclf --help.
Desktop app (optional) — a small Electron shell is built for Windows, Linux, and macOS. Download the latest launcher installers from GitHub Releases. Choose the file for your OS:
| OS | File |
|---|---|
| Windows | *.exe (NSIS installer) |
| Linux | *.AppImage |
| macOS | *.dmg (open and drag taskclf to Applications) |
Those assets are published on GitHub releases whose tag starts with
launcher-v. The PyInstaller backend that the shell downloads at runtime is
published on separate v* tags (see make build-payload and payload release CI).
Development (from a git checkout):
uv sync
uv run taskclf --helpuv run taskclf ingest aw --input /path/to/activitywatch-export.jsonThis parses an ActivityWatch JSON export, normalizes app names to reverse-domain
identifiers, hashes window titles (never storing raw text), and writes
privacy-safe events to data/raw/aw/<YYYY-MM-DD>/events.parquet partitioned by
date.
Options:
--out-dir— output directory (default:data/raw/aw)--title-salt— salt for hashing window titles (default:taskclf-default-salt)
uv run taskclf features build --date 2026-02-16uv run taskclf labels import --file labels.csvOr add individual label blocks:
uv run taskclf labels add-block \
--start 2026-02-16T09:00:00 --end 2026-02-16T10:00:00 --label BuildOr label what you're doing right now (no timestamps needed):
uv run taskclf labels label-now --minutes 10 --label BuildThis queries ActivityWatch for a live summary of apps used in the last N minutes and creates the label span automatically.
uv run taskclf labels export --out my_labels.csvuv run taskclf train lgbm --from 2026-02-01 --to 2026-02-16uv run taskclf infer batch --model-dir models/<run_id> --from 2026-02-01 --to 2026-02-16uv run taskclf infer online --model-dir models/<run_id>Starts a polling loop that queries a running ActivityWatch server, builds
feature rows from live window events, predicts task types using a trained model,
smooths predictions, and writes running outputs to artifacts/. Press Ctrl+C
to stop; a final daily report is generated on shutdown.
Options:
--poll-seconds— seconds between polls (default: 60)--aw-host— ActivityWatch server URL (default:http://localhost:5600)--smooth-window— rolling majority window size (default: 3)--title-salt— salt for hashing window titles (default:taskclf-default-salt)--out-dir— output directory (default:artifacts)--label-queue/--no-label-queue— auto-enqueue low-confidence predictions for manual labeling--label-confidence— confidence threshold for auto-enqueue (default: 0.55)
uv run taskclf infer baseline --from 2026-02-01 --to 2026-02-16Rule-based classifier useful for day-1 bootstrapping before you have a trained model.
uv run taskclf report daily --segments-file artifacts/segments.jsonAll commands: uv run taskclf --help
| Group | Commands | Purpose |
|---|---|---|
ingest |
aw |
Import ActivityWatch exports |
features |
build |
Build per-minute feature rows |
labels |
import, add-block, label-now, show-queue, project |
Manage label spans and labeling queue |
train |
build-dataset, lgbm, evaluate, tune-reject, calibrate, retrain, check-retrain |
Training, evaluation, and retraining pipeline |
taxonomy |
validate, show, init |
User-defined label groupings |
infer |
batch, online, baseline, compare |
Prediction (ML, rule-based, comparison) |
report |
daily |
Daily summaries (JSON/CSV/Parquet) |
monitor |
drift-check, telemetry, show |
Feature drift and telemetry tracking |
| (top-level) | tray |
System tray labeling app with activity transition detection |
| (top-level) | ui |
Web UI for labeling, queue, and live prediction streaming |
Full CLI docs: docs/api/cli/main.md
src/taskclf/— application code (adapters, core, features, labels, train, infer, report, ui)schema/— versioned JSON schemas for features and labelsconfigs/— configuration files (model params, retrain policy, taxonomy examples)docs/— API reference and guides (served viamake docs-serve)data/— raw and processed datasets (local, gitignored)models/— trained model bundles (one folder per run)artifacts/— predictions, segments, reports, evaluation outputstests/— test suite
Every saved model bundle (models/<run_id>/) contains:
- the model file
metadata.json: feature schema version + hash, label set, training date range, params, dataset hashmetrics.json: macro/weighted F1, per-class metricsconfusion_matrix.csv- categorical encoders (if applicable)
Inference refuses to run if the schema hash mismatches the model bundle.
Common tasks are in the Makefile:
make lint # ruff check .
make test # pytest
make typecheck # mypy src
make docs-serve # local preview at http://127.0.0.1:8000
make docs-build # static site in site/Electron backend payload (for packaged app downloads): after make ui-build,
build the PyInstaller one-folder sidecar zip used by GitHub releases (v* tags):
uv sync --group bundle
make build-payload # writes build/payload-<triple>.zipSee docs/api/scripts/payload_build.md for details.
TBD (local-first personal project by default).