12 data science skills that turn any AI coding assistant into a data agent.
Load, profile, clean, transform, visualize, validate, and synthesize data — all through natural language. Works with Claude Code, Cursor, Windsurf, Gemini CLI, and 30 AI tools in total.
Each skill is a self-contained knowledge package. When an agent receives a data task:
- Agent reads SKILL.md → gets domain knowledge, code patterns, and procedures
- Agent reads references/*.md → gets detailed patterns on demand
- Agent reads scripts/*.py → sees reference implementations (not executed directly)
- Agent writes its own code adapted to the specific task
- Agent verifies quality using the patterns and constraints from the skill
Skills provide knowledge and patterns. The agent decides how to act — it may follow the reference scripts closely, adapt them, or write entirely custom code.
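The reading order described above can be sketched in a few lines of Python. This is only an illustration: `skill_reading_order` is a hypothetical helper, not part of the package, and it assumes the skill directory uses the SKILL.md, references/, and scripts/ layout named above. It lists files and executes nothing.

```python
from pathlib import Path


def skill_reading_order(skill_dir: Path) -> list[Path]:
    """Return a skill's files in the order an agent would read them.

    Hypothetical helper: SKILL.md first, then references/*.md,
    then scripts/*.py. Missing files or folders are skipped.
    """
    ordered: list[Path] = []
    skill_md = skill_dir / "SKILL.md"
    if skill_md.is_file():
        ordered.append(skill_md)
    # Sorting keeps the listing deterministic across platforms.
    ordered.extend(sorted((skill_dir / "references").glob("*.md")))
    ordered.extend(sorted((skill_dir / "scripts").glob("*.py")))
    return ordered
```

Calling `skill_reading_order(Path("skills/magic-data-loading"))` from the repository root would return the definition file first, followed by any reference documents and reference scripts that exist.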
| Domain | Skill | Description |
|---|---|---|
| Setup | magic-workspace-init | Workspace scaffolding, environment verification, dependency installation |
| Lifecycle | magic-data-lifecycle | Multi-skill orchestration, routing, quality gating |
| Ingestion | magic-data-loading | Multi-format file detection, auto-encoding, CJK support, databases, HuggingFace |
| Profiling | magic-data-profiling | Quality scoring, distribution analysis, outlier detection, correlation |
| Cleaning | magic-data-cleaning | Missing values, normalization, sentinel replacement, cleaning plans |
| Validation | magic-data-validation | Schema inference, constraint checking, fitness-for-use assessment |
| Exploration | magic-data-exploration | Pattern discovery, segment analysis, relationship exploration |
| Transformation | magic-data-transformation | Reshape, aggregate, merge, derive columns, deliver to DB/HuggingFace |
| Synthesis | magic-data-synthesis | LLM-based generation via DataDesigner: fill missing, translate, enrich |
| Analysis | magic-statistical-analysis | Descriptive stats, hypothesis testing, correlation analysis |
| Visualization | magic-data-visualization | Chart selection, generation (static + interactive), validation |
| Reporting | magic-report-generation | Structured report assembly, table formatting |
- Node.js 20+ — required for the CLI installer
- Python 3.12+ — required for skill scripts and reference implementations
- After installing, run pip install -r requirements.txt for Python dependencies
Installs skills and slash commands for 30 AI coding tools.
# Install globally
npm install -g @votee-ai/magic-data-agent-skills
# Interactive — auto-detects tools in your project
magic-data-agent-skills init
# Non-interactive — specify tools directly
magic-data-agent-skills init --tools claude,cursor,windsurf
# Install for all supported tools
magic-data-agent-skills init --tools all
Or run without installing:
npx @votee-ai/magic-data-agent-skills init --tools claude,cursor
The installer copies skill packages into each tool's configuration directory (e.g., .claude/skills/, .cursor/skills/) and adapts slash commands to each tool's native format.
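To confirm what the installer placed in a tool's directory, a short script along these lines may help. It is a sketch under one assumption: that each installed skill keeps the source layout, a directory containing a SKILL.md file. `installed_skills` is a hypothetical name, not a command or API provided by the package.

```python
from pathlib import Path


def installed_skills(project_root: Path, tool_dir: str = ".claude") -> list[str]:
    """List skill names found under <project_root>/<tool_dir>/skills/.

    Assumption: a directory counts as a skill only if it contains SKILL.md.
    """
    skills_root = project_root / tool_dir / "skills"
    if not skills_root.is_dir():
        return []
    return sorted(
        entry.name
        for entry in skills_root.iterdir()
        if (entry / "SKILL.md").is_file()
    )


if __name__ == "__main__":
    # Check the two example directories mentioned above.
    for tool in (".claude", ".cursor"):
        print(tool, installed_skills(Path.cwd(), tool))
```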
Managing your installation:
# Update skills after upgrading the package
magic-data-agent-skills update
# Remove skills from specific tools
magic-data-agent-skills remove --tools cursor
# Remove all installed skills and config
magic-data-agent-skills remove
# Reinstall over an existing installation
magic-data-agent-skills init --force
A magic-data-agent-skills.json config file is saved in your project root to track installed tools.
Compatible with the open agent skills ecosystem:
# Install all MAGIC data skills
npx skills add Votee-AI/magic-data-agent-skills
# Install only specific skills
npx skills add Votee-AI/magic-data-agent-skills --skill magic-data-synthesis
npx skills add Votee-AI/magic-data-agent-skills --skill magic-data-cleaning --skill magic-data-profiling
# Target a specific agent (Claude Code, Cursor, Windsurf, Codex, ...)
npx skills add Votee-AI/magic-data-agent-skills --agent claude-code
# Install globally (available across all projects)
npx skills add Votee-AI/magic-data-agent-skills --global
# List / update / remove
npx skills list
npx skills update
npx skills remove magic-data-cleaning
Each skill works independently when installed individually, so no shared dependencies are required.
| Method | Skills | Commands | Tools supported |
|---|---|---|---|
| CLI Installer (npx @votee-ai/...) | 12 | 13 | 30 AI tools |
| Vercel Skills CLI (npx skills add) | 12 | - | Any (convention-based) |
After installing, just ask your agent:
"Load and profile this CSV, clean any issues, and generate a summary report"
The agent reads the installed SKILL.md files and handles the rest.
Each skill is a directory with a SKILL.md definition and reference scripts:
skills/magic-data-loading/
├── SKILL.md                      # Skill definition (the agent reads this)
├── scripts/
│   ├── detect_format.py          # Scriptable tool: auto-detect file format
│   ├── load_file.py              # Scriptable tool: load any supported format
│   └── inspect_hf_dataset.py     # Reference: HuggingFace dataset inspection
├── evals/                        # Quality benchmarks
└── references/                   # Domain knowledge documents
The SKILL.md contains structured knowledge — thinking patterns, rules,
seed patterns, and script documentation — that the agent studies before
writing code tailored to your specific task.
Commands are available via the CLI installer.
| Command | Description | Category |
|---|---|---|
| /magic:lifecycle | Run the full data pipeline (Discover → Plan → Execute → Validate → Deliver) | Core |
| /magic:explore | Enter interactive data exploration mode | Core |
| /magic:status | Quick snapshot of workspace state | Core |
| /magic:connect | Connect to a database and inspect schema | Core |
| /magic:deliver | Deliver processed data to DB, HuggingFace, or local file | Core |
| /magic:init-workspace | Initialize workspace directory structure | Core |
| /magic:findings | Show structured findings from profiling/analysis | Support |
| /magic:propose | Generate a processing plan from findings | Support |
| /magic:decide | Record a user decision | Support |
| /magic:spec | Create or show the data processing spec | Support |
| /magic:review | Review decisions and progress | Support |
| /magic:rollback | Revert to a previous checkpoint | Support |
| /magic:help | Show command reference | Support |
1. Setup → magic-workspace-init (create dirs, verify environment)
2. Load → magic-data-loading (ingest files, detect format/encoding)
3. Profile → magic-data-profiling (quality score, distributions, outliers)
4. Clean → magic-data-cleaning (fix issues found by profiling)
5. Transform → magic-data-transformation (reshape, join, aggregate)
6. Synthesize → magic-data-synthesis (LLM-based fill/translate/enrich via DataDesigner)
7. Validate → magic-data-validation (schema + constraints + sanity checks)
8. Analyze → magic-statistical-analysis (hypothesis testing, correlations)
9. Visualize → magic-data-visualization (charts, plots)
10. Report → magic-report-generation (structured deliverable)
The agent uses magic-data-lifecycle to orchestrate this flow, adapting the sequence to the specific task. Not every step is needed for every task.
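The idea that a task uses only some steps, but always in pipeline order, can be illustrated with a small Python sketch. The step names and skills come from the list above. The `plan` function is hypothetical and does not describe how magic-data-lifecycle is implemented.

```python
# Step order and skill names are copied from the numbered list above.
PIPELINE = [
    ("Setup", "magic-workspace-init"),
    ("Load", "magic-data-loading"),
    ("Profile", "magic-data-profiling"),
    ("Clean", "magic-data-cleaning"),
    ("Transform", "magic-data-transformation"),
    ("Synthesize", "magic-data-synthesis"),
    ("Validate", "magic-data-validation"),
    ("Analyze", "magic-statistical-analysis"),
    ("Visualize", "magic-data-visualization"),
    ("Report", "magic-report-generation"),
]


def plan(needed: set[str]) -> list[tuple[str, str]]:
    """Keep only the requested steps, preserving pipeline order.

    Hypothetical helper for illustration; raises on unknown step names.
    """
    unknown = needed - {step for step, _ in PIPELINE}
    if unknown:
        raise ValueError(f"unknown steps: {sorted(unknown)}")
    return [(step, skill) for step, skill in PIPELINE if step in needed]
```

For a quick "load, profile, and report" request, `plan({"Report", "Load", "Profile"})` yields the three matching skills with loading first and reporting last, regardless of the order in which the steps were named.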
magic-data-agent-skills/
├── skills/                       # 12 skill packages
│   ├── magic-data-loading/       # Each skill is self-contained:
│   │   ├── SKILL.md              # Domain knowledge + patterns + procedures
│   │   ├── evals/evals.json      # Machine-verifiable eval assertions
│   │   ├── references/           # Detailed reference docs (loaded on demand)
│   │   ├── scripts/              # Reference implementations (read, not executed)
│   │   └── templates/            # Copy-adaptable starter files
│   └── ... (12 total)
├── commands/                     # 13 slash commands
│   └── magic/
│       ├── explore.md
│       ├── lifecycle.md
│       └── ... (13 total)
├── cli/                          # CLI installer (npm package)
├── tests/                        # Unit, integration, and eval tests
├── .github/                      # CI workflows and templates
├── LICENSE                       # Apache 2.0
├── CONTRIBUTING.md
└── README.md
- ENV_VARS.md — All environment variables reference
- PRIVACY.md — No-telemetry policy
- ROADMAP.md — Planned features and direction
- CONTRIBUTING.md — How to contribute
- SECURITY.md — Vulnerability disclosure
- CHANGELOG.md — Release history
Skills provide knowledge; the agent decides execution:
- Agents read SKILL.md to understand patterns, then write custom code
- Reference scripts demonstrate approaches but are not called directly
- The agent adapts patterns to each unique task rather than following rigid pipelines
Each skill has evals/evals.json with machine-verifiable assertions:
{
  "skill_name": "magic-data-synthesis",
  "evals": [
    {
      "name": "fill-missing-definitions",
      "prompt": "I have a CSV with 10 product items. Generate English descriptions.",
      "assertions": [
        {"type": "must_contain_one", "values": ["DataDesigner", "data-designer", "preview"]},
        {"type": "must_not_contain", "values": ["batch_synthesize.py --mode"]}
      ]
    }
  ]
}
For complex multi-step tasks, the agent may create:
- workspace_state.md: tracks processing state across sessions
- logs/analysis_journal.md: records findings and decisions
- data/checkpoints/ckpt_NN_*.csv: intermediate results for rollback
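Returning to the evals/evals.json format shown earlier, the two assertion types could be checked against an agent's response roughly as follows. This is a sketch, not the project's eval runner: the semantics of `must_contain_one` and `must_not_contain` are inferred from their names, case-sensitive substring matching is an assumption, and `check_assertions` is a hypothetical function.

```python
def check_assertions(output: str, assertions: list[dict]) -> list[str]:
    """Return failure messages; an empty list means every assertion passed.

    Assumed semantics (not confirmed by the source):
      must_contain_one -> at least one value appears in the output
      must_not_contain -> none of the values appears in the output
    """
    failures: list[str] = []
    for assertion in assertions:
        kind, values = assertion["type"], assertion["values"]
        # Case-sensitive substring match; an assumption for this sketch.
        hits = [value for value in values if value in output]
        if kind == "must_contain_one" and not hits:
            failures.append(f"expected one of {values}")
        elif kind == "must_not_contain" and hits:
            failures.append(f"forbidden text found: {hits}")
        elif kind not in ("must_contain_one", "must_not_contain"):
            failures.append(f"unknown assertion type: {kind}")
    return failures
```

With the assertions from the example, a response that mentions DataDesigner and avoids the batch_synthesize.py --mode invocation would produce no failures.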
The CLI installer supports 30 AI coding tools. Use the --tools value when running init:
Tools with slash command support:
| Tool | --tools value | Command format |
|---|---|---|
| Claude Code | claude | markdown |
| Cursor | cursor | markdown |
| Windsurf | windsurf | markdown |
| Gemini CLI | gemini | TOML |
| GitHub Copilot | github-copilot | prompt |
| Cline | cline | markdown |
| RooCode | roocode | markdown |
| Continue | continue | prompt |
| Kilo Code | kilocode | markdown |
| Kiro | kiro | markdown |
| Junie | junie | markdown |
| Amazon Q Developer | amazon-q | markdown |
| Auggie | auggie | markdown |
| OpenCode | opencode | markdown |
| Qoder | qoder | markdown |
| Qwen Code | qwen | TOML |
| Lingma | lingma | TOML |
| CodeBuddy | codebuddy | markdown |
| Crush | crush | markdown |
| Antigravity | antigravity | markdown |
Tools with skills only (no commands):
codex, bob, costrict, factory, forgecode, iflow, kimi, pi, trae, agents
# Unit tests
MPLBACKEND=Agg pytest tests/unit/ -q --tb=short
# Integration tests
MPLBACKEND=Agg pytest tests/integration/ -q --tb=short
# Eval dry-run (validates eval JSON structure)
python tests/e2e/evals/run_evals.py --all --dry-run
See CONTRIBUTING.md for development setup, testing, and how to add new skills. Please follow our Code of Conduct.
See SECURITY.md for reporting vulnerabilities.
Apache 2.0 — see LICENSE for details.
Built by Votee AI.


