Votee-AI/magic-data-agent-skills

MAGIC Data Agent Skills

12 data science skills that turn any AI coding assistant into a data agent.

Load, profile, clean, transform, visualize, validate, and synthesize data, all through natural language. Works with 30 AI coding tools, including Claude Code, Cursor, Windsurf, and Gemini CLI.

How It Works

Each skill is a self-contained knowledge package. When an agent receives a data task:

  1. Agent reads SKILL.md → gets domain knowledge, code patterns, and procedures
  2. Agent reads references/*.md → gets detailed patterns on demand
  3. Agent reads scripts/*.py → sees reference implementations (not executed directly)
  4. Agent writes its own code adapted to the specific task
  5. Agent verifies quality using the patterns and constraints from the skill

Skills provide knowledge and patterns. The agent decides how to act — it may follow the reference scripts closely, adapt them, or write entirely custom code.
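
The reading order in the steps above can be sketched in Python. This is a minimal illustration, assuming the skill layout this README describes (SKILL.md, references/*.md, scripts/*.py). The helper name skill_reading_order is hypothetical and not part of the package; it only lists files and never executes them.

```python
from pathlib import Path


def skill_reading_order(skill_dir: Path) -> list[Path]:
    """List one skill's files in the order an agent is described as reading them.

    Hypothetical helper: SKILL.md first, then references/*.md, then scripts/*.py.
    Missing files or folders are skipped rather than treated as errors.
    """
    ordered: list[Path] = []
    skill_md = skill_dir / "SKILL.md"
    if skill_md.is_file():
        ordered.append(skill_md)
    for subdir, pattern in (("references", "*.md"), ("scripts", "*.py")):
        folder = skill_dir / subdir
        if folder.is_dir():
            ordered.extend(sorted(folder.glob(pattern)))
    return ordered


if __name__ == "__main__":
    # The path is an assumption; point it at wherever the skill was installed.
    for path in skill_reading_order(Path("skills/magic-data-loading")):
        print(path)
```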

[Figure: The agent works in three layers]

Skills

| Domain | Skill | Description |
| --- | --- | --- |
| Setup | magic-workspace-init | Workspace scaffolding, environment verification, dependency installation |
| Lifecycle | magic-data-lifecycle | Multi-skill orchestration, routing, quality gating |
| Ingestion | magic-data-loading | Multi-format file detection, auto-encoding, CJK support, databases, HuggingFace |
| Profiling | magic-data-profiling | Quality scoring, distribution analysis, outlier detection, correlation |
| Cleaning | magic-data-cleaning | Missing values, normalization, sentinel replacement, cleaning plans |
| Validation | magic-data-validation | Schema inference, constraint checking, fitness-for-use assessment |
| Exploration | magic-data-exploration | Pattern discovery, segment analysis, relationship exploration |
| Transformation | magic-data-transformation | Reshape, aggregate, merge, derive columns, deliver to DB/HuggingFace |
| Synthesis | magic-data-synthesis | LLM-based generation via DataDesigner: fill missing, translate, enrich |
| Analysis | magic-statistical-analysis | Descriptive stats, hypothesis testing, correlation analysis |
| Visualization | magic-data-visualization | Chart selection, generation (static + interactive), validation |
| Reporting | magic-report-generation | Structured report assembly, table formatting |

Installation

Prerequisites

  • Node.js 20+ — required for the CLI installer
  • Python 3.12+ — required for skill scripts and reference implementations
  • After installing, run pip install -r requirements.txt for Python dependencies

Option 1: CLI Installer (Recommended)

Installs skills and slash commands for 30 AI coding tools.

# Install globally
npm install -g @votee-ai/magic-data-agent-skills

# Interactive — auto-detects tools in your project
magic-data-agent-skills init

# Non-interactive — specify tools directly
magic-data-agent-skills init --tools claude,cursor,windsurf

# Install for all supported tools
magic-data-agent-skills init --tools all

Or run without installing:

npx @votee-ai/magic-data-agent-skills init --tools claude,cursor

The installer copies skill packages into each tool's configuration directory (e.g., .claude/skills/, .cursor/skills/) and adapts slash commands to each tool's native format.

Managing your installation:

# Update skills after upgrading the package
magic-data-agent-skills update

# Remove skills from specific tools
magic-data-agent-skills remove --tools cursor

# Remove all installed skills and config
magic-data-agent-skills remove

# Reinstall over an existing installation
magic-data-agent-skills init --force

A magic-data-agent-skills.json config file is saved in your project root to track installed tools.

Option 2: Vercel Skills CLI (skills only)

Compatible with the open agent skills ecosystem:

# Install all MAGIC data skills
npx skills add Votee-AI/magic-data-agent-skills

# Install only specific skills
npx skills add Votee-AI/magic-data-agent-skills --skill magic-data-synthesis
npx skills add Votee-AI/magic-data-agent-skills --skill magic-data-cleaning --skill magic-data-profiling

# Target a specific agent (Claude Code, Cursor, Windsurf, Codex, ...)
npx skills add Votee-AI/magic-data-agent-skills --agent claude-code

# Install globally (available across all projects)
npx skills add Votee-AI/magic-data-agent-skills --global

# List / update / remove
npx skills list
npx skills update
npx skills remove magic-data-cleaning

Each skill works independently when installed individually — no shared dependencies required.

Comparison

| Method | Skills | Commands | Tools supported |
| --- | --- | --- | --- |
| CLI Installer (npx @votee-ai/...) | 12 | 13 | 30 AI tools |
| Vercel Skills CLI (npx skills add) | 12 | none (skills only) | Any (convention-based) |

After installing, just ask your agent:

"Load and profile this CSV, clean any issues, and generate a summary report"

The agent reads the installed SKILL.md files and handles the rest.

What a Skill Looks Like

Each skill is a directory with a SKILL.md definition and reference scripts:

skills/magic-data-loading/
├── SKILL.md              # Skill definition (the agent reads this)
├── scripts/
│   ├── detect_format.py  # Scriptable tool: auto-detect file format
│   ├── load_file.py      # Scriptable tool: load any supported format
│   └── inspect_hf_dataset.py  # Reference: HuggingFace dataset inspection
├── evals/                # Quality benchmarks
└── references/           # Domain knowledge documents

The SKILL.md contains structured knowledge — thinking patterns, rules, seed patterns, and script documentation — that the agent studies before writing code tailored to your specific task.

Commands

Commands are available via the CLI installer.

| Command | Description | Category |
| --- | --- | --- |
| /magic:lifecycle | Run the full data pipeline (Discover → Plan → Execute → Validate → Deliver) | Core |
| /magic:explore | Enter interactive data exploration mode | Core |
| /magic:status | Quick snapshot of workspace state | Core |
| /magic:connect | Connect to a database and inspect schema | Core |
| /magic:deliver | Deliver processed data to DB, HuggingFace, or local file | Core |
| /magic:init-workspace | Initialize workspace directory structure | Core |
| /magic:findings | Show structured findings from profiling/analysis | Support |
| /magic:propose | Generate a processing plan from findings | Support |
| /magic:decide | Record a user decision | Support |
| /magic:spec | Create or show the data processing spec | Support |
| /magic:review | Review decisions and progress | Support |
| /magic:rollback | Revert to a previous checkpoint | Support |
| /magic:help | Show command reference | Support |

Typical Workflow

 1. Setup       → magic-workspace-init (create dirs, verify environment)
 2. Load        → magic-data-loading (ingest files, detect format/encoding)
 3. Profile     → magic-data-profiling (quality score, distributions, outliers)
 4. Clean       → magic-data-cleaning (fix issues found by profiling)
 5. Transform   → magic-data-transformation (reshape, join, aggregate)
 6. Synthesize  → magic-data-synthesis (LLM-based fill/translate/enrich via DataDesigner)
 7. Validate    → magic-data-validation (schema + constraints + sanity checks)
 8. Analyze     → magic-statistical-analysis (hypothesis testing, correlations)
 9. Visualize   → magic-data-visualization (charts, plots)
10. Report      → magic-report-generation (structured deliverable)

The agent uses magic-data-lifecycle to orchestrate this flow, adapting the sequence to the specific task. Not every step is needed for every task.
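
As a rough illustration of "not every step is needed", the sketch below selects a subset of the workflow while keeping its order. It is not how magic-data-lifecycle is implemented (the agent does the orchestration by reading the skill); the WORKFLOW list simply mirrors the steps above, and plan_steps is a hypothetical helper.

```python
# Step names and skills copied from the workflow list above.
WORKFLOW: list[tuple[str, str]] = [
    ("Setup", "magic-workspace-init"),
    ("Load", "magic-data-loading"),
    ("Profile", "magic-data-profiling"),
    ("Clean", "magic-data-cleaning"),
    ("Transform", "magic-data-transformation"),
    ("Synthesize", "magic-data-synthesis"),
    ("Validate", "magic-data-validation"),
    ("Analyze", "magic-statistical-analysis"),
    ("Visualize", "magic-data-visualization"),
    ("Report", "magic-report-generation"),
]


def plan_steps(needed: set[str]) -> list[str]:
    """Return the skills for the needed steps, keeping the workflow order."""
    known = {step for step, _ in WORKFLOW}
    unknown = needed - known
    if unknown:
        raise ValueError(f"unknown steps: {sorted(unknown)}")
    return [skill for step, skill in WORKFLOW if step in needed]


if __name__ == "__main__":
    # A "load, profile, and report" request skips the other seven steps.
    print(plan_steps({"Report", "Load", "Profile"}))
```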

Directory Structure

magic-data-agent-skills/
├── skills/                          # 12 skill packages
│   ├── magic-data-loading/          # Each skill is self-contained:
│   │   ├── SKILL.md                 #   Domain knowledge + patterns + procedures
│   │   ├── evals/evals.json         #   Machine-verifiable eval assertions
│   │   ├── references/              #   Detailed reference docs (loaded on demand)
│   │   ├── scripts/                 #   Reference implementations (read, not executed)
│   │   └── templates/               #   Copy-adaptable starter files
│   └── ... (12 total)
├── commands/                        # 13 slash commands
│   └── magic/
│       ├── explore.md
│       ├── lifecycle.md
│       └── ... (13 total)
├── cli/                             # CLI installer (npm package)
├── tests/                           # Unit, integration, and eval tests
├── .github/                         # CI workflows and templates
├── LICENSE                          # Apache 2.0
├── CONTRIBUTING.md
└── README.md

Documentation

Key Concepts

[Figure: Library stack: data plane, analytics, visualization, LLM plane]

Agent Autonomy

Skills provide knowledge; the agent decides execution:

  • Agents read SKILL.md to understand patterns, then write custom code
  • Reference scripts demonstrate approaches but are not called directly
  • The agent adapts patterns to each unique task rather than following rigid pipelines

Evals (Anthropic Format)

Each skill has evals/evals.json with machine-verifiable assertions:

{
  "skill_name": "magic-data-synthesis",
  "evals": [
    {
      "name": "fill-missing-definitions",
      "prompt": "I have a CSV with 10 product items. Generate English descriptions.",
      "assertions": [
        {"type": "must_contain_one", "values": ["DataDesigner", "data-designer", "preview"]},
        {"type": "must_not_contain", "values": ["batch_synthesize.py --mode"]}
      ]
    }
  ]
}
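
One way to read the two assertion types in the example above is sketched below. The semantics are an assumption inferred from the type names: must_contain_one passes when at least one value appears in the agent's output, and must_not_contain passes when none do, both using case-sensitive substring matching. The functions check_assertion and passes are hypothetical and are not the project's eval runner.

```python
def check_assertion(assertion: dict, output: str) -> bool:
    """Evaluate a single assertion against the agent's output text.

    Assumed semantics (not confirmed by the project): plain, case-sensitive
    substring matching on each entry of "values".
    """
    kind = assertion["type"]
    values = assertion["values"]
    if kind == "must_contain_one":
        return any(value in output for value in values)
    if kind == "must_not_contain":
        return not any(value in output for value in values)
    raise ValueError(f"unknown assertion type: {kind}")


def passes(eval_case: dict, output: str) -> bool:
    """An eval case passes only if every one of its assertions passes."""
    return all(check_assertion(a, output) for a in eval_case["assertions"])
```

Under these assumptions, an output that mentions DataDesigner and never mentions batch_synthesize.py --mode would pass the fill-missing-definitions case shown above.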

Workspace Patterns

For complex multi-step tasks, the agent may create:

  • workspace_state.md — tracks processing state across sessions
  • logs/analysis_journal.md — records findings and decisions
  • data/checkpoints/ckpt_NN_*.csv — intermediate results for rollback

Supported Tools

The CLI installer supports 30 AI coding tools. Pass the values listed below to --tools when running init:

Tools with slash command support:

| Tool | --tools value | Command format |
| --- | --- | --- |
| Claude Code | claude | markdown |
| Cursor | cursor | markdown |
| Windsurf | windsurf | markdown |
| Gemini CLI | gemini | TOML |
| GitHub Copilot | github-copilot | prompt |
| Cline | cline | markdown |
| RooCode | roocode | markdown |
| Continue | continue | prompt |
| Kilo Code | kilocode | markdown |
| Kiro | kiro | markdown |
| Junie | junie | markdown |
| Amazon Q Developer | amazon-q | markdown |
| Auggie | auggie | markdown |
| OpenCode | opencode | markdown |
| Qoder | qoder | markdown |
| Qwen Code | qwen | TOML |
| Lingma | lingma | TOML |
| CodeBuddy | codebuddy | markdown |
| Crush | crush | markdown |
| Antigravity | antigravity | markdown |

Tools with skills only (no commands):

codex, bob, costrict, factory, forgecode, iflow, kimi, pi, trae, agents

Running Tests

# Unit tests
MPLBACKEND=Agg pytest tests/unit/ -q --tb=short

# Integration tests
MPLBACKEND=Agg pytest tests/integration/ -q --tb=short

# Eval dry-run (validates eval JSON structure)
python tests/e2e/evals/run_evals.py --all --dry-run
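
To give a feel for what validating eval JSON structure can involve, here is a minimal sketch of a structural check. It does not reproduce run_evals.py, whose internals are not shown in this README; the set of required fields is an assumption inferred from the single evals.json example in this README, and validate_evals is a hypothetical name.

```python
import json
import sys


def validate_evals(doc: object) -> list[str]:
    """Return a list of structural problems; an empty list means the shape looks right.

    Assumed required fields: skill_name (str), evals (list), and per eval
    name (str), prompt (str), assertions (list of {type: str, values: list}).
    """
    if not isinstance(doc, dict):
        return ["top level must be an object"]
    problems: list[str] = []
    if not isinstance(doc.get("skill_name"), str):
        problems.append("skill_name must be a string")
    evals = doc.get("evals")
    if not isinstance(evals, list):
        return problems + ["evals must be a list"]
    for i, case in enumerate(evals):
        if not isinstance(case, dict):
            problems.append(f"evals[{i}] must be an object")
            continue
        for key in ("name", "prompt"):
            if not isinstance(case.get(key), str):
                problems.append(f"evals[{i}].{key} must be a string")
        assertions = case.get("assertions")
        if not isinstance(assertions, list):
            problems.append(f"evals[{i}].assertions must be a list")
            continue
        for j, item in enumerate(assertions):
            ok = (
                isinstance(item, dict)
                and isinstance(item.get("type"), str)
                and isinstance(item.get("values"), list)
            )
            if not ok:
                problems.append(f"evals[{i}].assertions[{j}] needs type and values")
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_evals.py path/to/evals.json (script name is hypothetical)
    with open(sys.argv[1], encoding="utf-8") as handle:
        print(validate_evals(json.load(handle)) or "structure OK")
```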

Contributing

See CONTRIBUTING.md for development setup, testing, and how to add new skills. Please follow our Code of Conduct.

Security

See SECURITY.md for reporting vulnerabilities.

License

Apache 2.0 — see LICENSE for details.

Built by Votee AI.