35 changes: 35 additions & 0 deletions .github/workflows/regression-gate.yml
@@ -4,22 +4,57 @@ on:
  pull_request:
  workflow_dispatch:

defaults:
  run:
    working-directory: .

jobs:
  regression-gate:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: "18"

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Environment info
        run: |
          python3 --version
          node --version
          node bin/stylekit-skill.js doctor

      - name: Install test dependencies
        run: |
          pip install pytest jsonschema

      - name: Run unit tests
        run: |
          pytest tests/ -m "not slow" -v

      - name: Run contract tests
        run: |
          pytest tests/ -m slow -v

      - name: Run smoke test
        run: |
          python3 scripts/smoke_test.py

      - name: Run taxonomy guard
        run: |
          python3 scripts/validate_taxonomy.py --max-unused-style-tags 0 --fail-on-warning

      - name: Run output-contract sync guard
        run: |
          python3 scripts/validate_output_contract_sync.py --format text --fail-on-warning

      - name: Run benchmark regression gate
        run: |
          bash scripts/ci_regression_gate.sh \
3 changes: 3 additions & 0 deletions .gitignore
@@ -3,3 +3,6 @@ __pycache__/
tmp/
*.log
.DS_Store
node_modules/
.pytest_cache/
*.tgz
50 changes: 50 additions & 0 deletions GO_LIVE_CHECKLIST.md
@@ -0,0 +1,50 @@
# StyleKit Skill Go-Live Checklist

## 1) Gate Stability

- [ ] `main` branch passes CI without a failure for at least 7 consecutive days (or 20 consecutive runs).
- [ ] No flaky tests in `pytest -m slow`.
- [ ] `regression_gate.passed=true` on the latest benchmark run.

## 2) Quality Baseline

- [ ] 30 real-world prompts (blog/saas/dashboard/docs/ecommerce/landing-page/portfolio/general) are evaluated.
- [ ] Overall pass rate >= 90%.
- [ ] No P0/P1 issues in design recommendation outputs.
- [ ] Re-run consistency: with 5 repeated runs per case, key fields (`style_choice`, `site_profile`, `tag_bundle`) are >= 95% identical across runs.
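
The re-run consistency item can be measured in more than one way, and the checklist does not fix a formula. The sketch below is one possible reading, not the project's actual metric: for each case and key field it takes the most common value across the repeated runs and counts how many runs agree with it. The function name, the input shape (a dict of case id to list of pipeline outputs), and the sample values are all hypothetical.

```python
import json
from collections import Counter

# Key fields named in the checklist item above.
KEY_FIELDS = ("style_choice", "site_profile", "tag_bundle")


def _canonical(value):
    # Serialize with sorted keys so equal dicts compare equal as strings.
    return json.dumps(value, sort_keys=True, ensure_ascii=False)


def consistency_rate(runs_by_case, key_fields=KEY_FIELDS):
    """Share of (run, field) observations matching the modal value for that case and field.

    Assumption: `runs_by_case` maps a case id to a list of output dicts,
    one per repeated run. This layout is illustrative only.
    """
    matched = 0
    total = 0
    for runs in runs_by_case.values():
        for field in key_fields:
            values = [_canonical(run.get(field)) for run in runs]
            if not values:
                continue
            # Count of runs that agree with the most common value.
            matched += Counter(values).most_common(1)[0][1]
            total += len(values)
    return matched / total if total else 0.0


if __name__ == "__main__":
    stable = {"style_choice": "style-a", "site_profile": "dashboard", "tag_bundle": ["t1", "t2"]}
    drifted = dict(stable, style_choice="style-b")
    example = {"case-01": [stable] * 4 + [drifted]}  # hypothetical data
    rate = consistency_rate(example)
    print(f"consistency={rate:.3f} meets_threshold={rate >= 0.95}")
```

Under this reading a single drifting field in one of five runs already drops a lone case below 95%, so the threshold is more naturally applied to the aggregate over all evaluated cases.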

## 3) Required Local Checks

Run all checks before release:

```bash
node bin/stylekit-skill.js doctor
python3 scripts/validate_taxonomy.py --format text --max-unused-style-tags 0 --fail-on-warning
python3 scripts/validate_output_contract_sync.py --format text --fail-on-warning
python3 scripts/smoke_test.py
pytest tests/ -m "not slow" -q
pytest tests/ -m slow -q
python3 scripts/benchmark_pipeline.py --format json --baseline-snapshot references/benchmark-baseline.json --fail-on-regression
```

## 4) Release Preparation

- [ ] Update `package.json` version (SemVer).
- [ ] Update `RELEASE.md` with notable changes and rollback notes.
- [ ] Confirm GitHub workflow `.github/workflows/regression-gate.yml` is green on the release commit.
- [ ] Verify `npm pack --dry-run` contains expected payload only.

## 5) Publish

```bash
npm login
npm whoami
npm publish --access public
```

## 6) Post-Release

- [ ] Tag release: `git tag -a vX.Y.Z -m "vX.Y.Z" && git push origin vX.Y.Z`
- [ ] Install smoke check via npx: `npx @anxforever/stylekit-skill doctor`
- [ ] Archive benchmark snapshot under `tmp/` for traceability.
44 changes: 32 additions & 12 deletions README.md
@@ -1,4 +1,4 @@
# @anxforever/stylekit-style-prompts-skill
# @anxforever/stylekit-skill

StyleKit style-prompt Skill (standalone repository edition).
Goal: install StyleKit's style capabilities directly into the skills directory of Codex / Claude.
@@ -8,70 +8,90 @@ StyleKit style-prompt Skill (standalone repository edition).
### Install to Codex

```bash
npx @anxforever/stylekit-style-prompts-skill install --tool codex
npx @anxforever/stylekit-skill install --tool codex
```

### Install to Claude

```bash
npx @anxforever/stylekit-style-prompts-skill install --tool claude
npx @anxforever/stylekit-skill install --tool claude
```

### Auto-detect local tools and install

```bash
npx @anxforever/stylekit-style-prompts-skill install --tool auto
npx @anxforever/stylekit-skill install --tool auto
```

### Overwrite an existing installation

```bash
npx @anxforever/stylekit-style-prompts-skill install --tool codex --force
npx @anxforever/stylekit-skill install --tool codex --force
```

### Uninstall

```bash
npx @anxforever/stylekit-style-prompts-skill uninstall --tool codex
npx @anxforever/stylekit-style-prompts-skill uninstall --tool claude
npx @anxforever/stylekit-skill uninstall --tool codex
npx @anxforever/stylekit-skill uninstall --tool claude
```

### Environment self-check

```bash
npx @anxforever/stylekit-style-prompts-skill doctor
npx @anxforever/stylekit-skill doctor
```

## 2) Local Development Verification (maintainers)

```bash
node bin/stylekit-style-prompts-skill.js doctor
node bin/stylekit-style-prompts-skill.js install --tool codex --target /tmp/stylekit-skill-test --force
node bin/stylekit-style-prompts-skill.js uninstall --target /tmp/stylekit-skill-test
node bin/stylekit-skill.js doctor
node bin/stylekit-skill.js install --tool codex --target /tmp/stylekit-skill-test --force
node bin/stylekit-skill.js uninstall --target /tmp/stylekit-skill-test
python3 scripts/audit_style_rule_conflicts.py --format text
python3 scripts/validate_taxonomy.py --format json --max-unused-style-tags 0 --fail-on-warning
python3 scripts/validate_output_contract_sync.py --format json --fail-on-warning
python3 scripts/smoke_test.py
python3 scripts/run_pipeline.py --query "高端科技SaaS财务后台,玻璃质感,强调可读性" --stack nextjs --format json
python3 scripts/run_pipeline.py --workflow codegen --query "高端科技SaaS财务后台,玻璃质感,强调可读性" --stack nextjs --format json
python3 scripts/run_pipeline.py --query "高端科技SaaS财务后台,玻璃质感,强调可读性" --stack nextjs --site-type dashboard --recommendation-mode hybrid --content-depth skeleton --decision-speed fast --format json
python3 scripts/merge_taxonomy_expansion.py --type animation --input tmp/gemini-output.json --dry-run
python3 scripts/propose_upgrade.py --pipeline-output tmp/pipeline-output.json --out-dir tmp/upgrade-proposals --format json
python3 scripts/review_upgrade_candidate.py --candidate tmp/upgrade-proposals/<candidate>.json --format json
```

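Notes:
- The default is `--workflow manual` (handbook / knowledge-base mode): it outputs a design brief plus handbook-style suggestions and does not force prompt QA.
- To generate a prompt and audit it strictly, pass `--workflow codegen` explicitly.
- v2 supports site-type routing: `--site-type` (blog/saas/dashboard/docs/ecommerce/landing-page/portfolio/general).
- v2 supports composition decision parameters: `--recommendation-mode`, `--content-depth`, `--decision-speed`.
- The taxonomy gate can use `--max-unused-style-tags 0 --fail-on-warning` to require that the style tag registry has no unused entries and to treat warnings as failures.
- The contract anti-drift gate is `validate_output_contract_sync.py`: it checks that `references/output-contract.md` is consistent with `tests/schemas` (the gate validates the first JSON example in each required section; `--fail-on-warning` promotes warnings to failures).
- The taxonomy expansion script supports a `new_style_tags` field and updates `references/taxonomy/style-tag-registry.json` on apply.
- In manual mode, the output additionally includes `manual_assistant.decision_assistant`, which contains candidate style cards, guiding questions for beginners, and next-step command templates for after the user picks a style.
- The conversation template can be reused directly: `references/cc-decision-conversation-template.md`.
- For a "human-reviewed upgrade" loop: first run `propose_upgrade.py` to generate candidates, then validate them with `review_upgrade_candidate.py` before opening a PR.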
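
To make the contract anti-drift note more concrete, here is a minimal sketch of the "first JSON example per section" idea. It is not the code in `validate_output_contract_sync.py`. It assumes sections are introduced by `## ` headings, that examples sit in fenced blocks tagged `json`, and that a schema exposes a top-level `required` list. The helper names are hypothetical, and a real check would more likely run full JSON Schema validation (the CI job installs `jsonschema`).

```python
import json

FENCE = "`" * 3  # built at runtime so this example contains no nested fence


def first_json_example_per_section(markdown):
    """Map each '## ' heading to the parsed first json-tagged fenced block under it."""
    examples = {}
    section = None
    fence_lang = None  # None while outside any fenced block
    buffer = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if fence_lang is not None:
            if stripped == FENCE:
                # Closing fence: keep only the first JSON block per section.
                if fence_lang == "json" and section is not None and section not in examples:
                    examples[section] = json.loads("\n".join(buffer))
                fence_lang, buffer = None, []
            else:
                buffer.append(line)
        elif stripped.startswith(FENCE):
            fence_lang = stripped[len(FENCE):].strip().lower()
        elif line.startswith("## "):
            section = line[3:].strip()
    return examples


def missing_required_keys(example, schema):
    """Return required top-level keys (per a schema-like dict) absent from the example."""
    return [key for key in schema.get("required", []) if key not in example]


if __name__ == "__main__":
    doc = "\n".join([
        "## Manual output",  # hypothetical section name
        FENCE + "json",
        '{"style_choice": "example-style"}',
        FENCE,
    ])
    schema = {"required": ["style_choice", "site_profile"]}  # hypothetical schema
    example = first_json_example_per_section(doc)["Manual output"]
    print(missing_required_keys(example, schema))
```

Treating only the first block in each section as canonical lets later blocks in the same section show variants or partial snippets without tripping the gate.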

## 3) Regression Gate

```bash
python3 scripts/validate_taxonomy.py --format json --max-unused-style-tags 0 --fail-on-warning
python3 scripts/validate_output_contract_sync.py --format text --fail-on-warning
bash scripts/ci_regression_gate.sh --baseline references/benchmark-baseline.json --snapshot-out tmp/benchmark-ci-latest.json
```

## 4) Publish to npm (available to everyone)

```bash
node bin/stylekit-skill.js doctor
npm pack --dry-run
npm login
npm whoami
npm publish --access public
```

After publishing, anyone can install directly via `npx @anxforever/stylekit-style-prompts-skill ...`.
After publishing, anyone can install directly via `npx @anxforever/stylekit-skill ...`.

## 5) Go-Live Checklist

Before going live, complete every item in `GO_LIVE_CHECKLIST.md`.
1 change: 1 addition & 0 deletions RELEASE.md
@@ -1,6 +1,7 @@
# Release Process

This repository uses benchmark regression gates to keep skill quality stable.
For the full set of release readiness gates, also complete `GO_LIVE_CHECKLIST.md`.

## 1) Pre-release Checks

64 changes: 60 additions & 4 deletions SKILL.md
@@ -9,12 +9,28 @@ description: Use when users ask to generate beautiful frontend prompts from Styl

Generate better-looking frontend output by combining StyleKit style identity, actionable constraints, and quality checks.

## When to Use

Activate this skill when the user:
- Asks to generate a frontend design prompt or style prompt
- Wants to select, compare, or blend StyleKit visual styles
- Needs a design brief for a new page, dashboard, or landing page
- Asks to audit or fix an existing frontend prompt's quality
- Mentions StyleKit, style prompts, or design system prompt generation
- Wants to convert a screenshot or Figma frame into a style-constrained prompt

Do NOT use this skill for general CSS questions, backend logic, or non-visual tasks.

## Quick One-shot Command

Run handbook mode in one command (default):

`python scripts/run_pipeline.py --query "<requirement>" --stack nextjs --format json`

Run site-type routed composition (style + layout + motion + interaction):

`python scripts/run_pipeline.py --query "<requirement>" --stack nextjs --site-type dashboard --recommendation-mode hybrid --content-depth skeleton --decision-speed fast --format json`

Run prompt-generation mode with QA gate:

`python scripts/run_pipeline.py --workflow codegen --query "<requirement>" --stack nextjs --format json`
@@ -59,14 +75,26 @@ CI one-command gate:

`bash scripts/ci_regression_gate.sh --baseline references/benchmark-baseline.json --snapshot-out tmp/benchmark-ci-latest.json`

Run taxonomy guard with strict style-tag registry usage:

`python scripts/validate_taxonomy.py --format json --max-unused-style-tags 0 --fail-on-warning`

Run output-contract sync guard (docs example JSON vs tests schema):

`python scripts/validate_output_contract_sync.py --format text --fail-on-warning`

Dry-run taxonomy expansion (including optional `new_style_tags` in input JSON):

`python scripts/merge_taxonomy_expansion.py --type animation --input tmp/gemini-output.json --dry-run`

## Workflow 1: Requirement -> Style Candidates -> Design Brief -> Prompt

1. Refresh dataset when needed:
`bash scripts/refresh-style-prompts.sh /mnt/d/stylekit`
2. Retrieve top style candidates:
`python scripts/search_stylekit.py --query "<requirement>" --top 5`
`python scripts/search_stylekit.py --query "<requirement>" --top 5 --site-type <auto|blog|saas|dashboard|docs|ecommerce|landing-page|portfolio|general>`
3. Generate design brief and prompts:
`python scripts/generate_brief.py --query "<requirement>" --stack nextjs --mode brief+prompt`
`python scripts/generate_brief.py --query "<requirement>" --stack nextjs --site-type dashboard --recommendation-mode hybrid --content-depth skeleton --decision-speed fast --mode brief+prompt`
4. If needed, force multi-style blend ownership:
`python scripts/generate_brief.py --query "<requirement>" --stack nextjs --mode brief+prompt --blend-mode on`
5. For iterative work, set refine mode:
@@ -87,7 +115,7 @@ CI one-command gate:
3. Read `manual_assistant.decision_assistant.recommended_style_options` and explain 3-4 options with trade-offs.
4. Ask `manual_assistant.decision_assistant.decision_questions` to help user pick direction.
5. After user selects one option, run codegen mode with forced style:
`python scripts/run_pipeline.py --workflow codegen --query "<requirement>" --stack nextjs --style <slug> --blend-mode off --format json`
`python scripts/run_pipeline.py --workflow codegen --query "<requirement>" --stack nextjs --style <slug> --site-type <type> --content-depth skeleton --blend-mode off --format json`
6. Follow `references/cc-decision-conversation-template.md` for a turn-by-turn assistant script.

## Workflow 2: Existing Prompt -> Quality Audit -> Fix Suggestions
@@ -125,6 +153,12 @@ Primary output object fields:
- `soft_prompt`
- `ai_rules`
- `style_choice`
- `site_profile`
- `tag_bundle`
- `composition_plan`
- `decision_flow`
- `content_plan`
- `upgrade_candidates`
- `quality_gate` (for audits)
- `design_brief.refine_mode`
- `design_brief.input_context.reference_type`
@@ -141,7 +175,22 @@ Primary output object fields:
- Include pre-delivery validation tests (swap/squint/signature/token).
- Include an anti-pattern blacklist (absolute layout misuse, nested scroll, missing focus states, etc.).
- Preserve user language (Chinese in -> Chinese out; English in -> English out).
- If intent is ambiguous, return top 3 candidates with reasons before final prompt.
- If intent is ambiguous, return top 5 candidates with reasons before final prompt.

## Error Handling

- If `quality_gate.status` is `"fail"`, read `violations` and `autofix_suggestions`, apply fixes, then re-run the QA audit. Repeat up to 3 rounds.
- If the pipeline exits with a non-zero code, check stderr for `ModuleNotFoundError` (missing Python dependency or wrong cwd) or `FileNotFoundError` (missing reference data — run `refresh-style-prompts.sh` first).
- If `search_candidates` returns 0 results, broaden the query or remove `--site-type` constraint.
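
The fix-and-re-audit rule above amounts to a bounded retry loop. The sketch below shows only that control flow. `run_audit` and `apply_fixes` are hypothetical callables standing in for the QA audit and for whatever applies `autofix_suggestions`, and `run_audit` is assumed to return the `quality_gate` object with a `status` field.

```python
from typing import Callable, Tuple

MAX_ROUNDS = 3  # "Repeat up to 3 rounds" from the rule above


def fix_until_pass(
    prompt: str,
    run_audit: Callable[[str], dict],
    apply_fixes: Callable[[str, dict], str],
    max_rounds: int = MAX_ROUNDS,
) -> Tuple[str, dict]:
    """Audit once, then fix and re-audit at most `max_rounds` times.

    Returns the last prompt and its quality gate, which may still be a
    failure if the round budget is exhausted.
    """
    gate = run_audit(prompt)
    rounds = 0
    while gate.get("status") == "fail" and rounds < max_rounds:
        prompt = apply_fixes(prompt, gate)  # uses violations / autofix_suggestions
        gate = run_audit(prompt)
        rounds += 1
    return prompt, gate


if __name__ == "__main__":
    # Toy stand-ins: the "audit" requires two keywords to be present.
    def toy_audit(text):
        missing = [word for word in ("focus", "contrast") if word not in text]
        return {"status": "fail" if missing else "pass", "violations": missing}

    def toy_fix(text, gate):
        return text + " " + gate["violations"][0]

    final_prompt, final_gate = fix_until_pass("base", toy_audit, toy_fix)
    print(final_prompt, final_gate["status"])
```

Returning the final gate rather than raising leaves the caller to decide what to do when three rounds are not enough, for example surfacing the remaining `violations` to the user.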

## Parameter Interactions

- `--blend-mode on` + `--style <slug>`: forces blend OFF (explicit style selection overrides blend).
- `--refine-mode` requires `--workflow codegen`; ignored in handbook mode.
- `--reference-type` + `--strict-reference-schema`: strict mode validates the reference JSON payload against the expected schema and fails fast on mismatch.
- `--recommendation-mode hybrid` uses both BM25 search and taxonomy routing; `rules` skips BM25 and relies solely on site-type routing rules.
- `--content-depth skeleton` produces minimal structure; `storyboard` adds section copy; `near-prod` generates production-ready content blocks.
- `validate_taxonomy.py --max-unused-style-tags 0 --fail-on-warning` enforces zero unused tags in `style-tag-registry` and treats warnings as failures.
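
Two of the interactions above (explicit `--style` overriding `--blend-mode on`, and `--refine-mode` being ignored outside codegen) can be expressed as a small normalization step. This is an illustrative sketch, not the argument handling in `run_pipeline.py`. The `PipelineArgs` class, its default values, and the sample slug and refine value are assumptions.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PipelineArgs:
    # Hypothetical container for a subset of the CLI flags; defaults are assumed.
    workflow: str = "manual"           # "manual" (handbook) or "codegen"
    style: Optional[str] = None        # explicit style slug, if given
    blend_mode: str = "off"            # "on" or "off"
    refine_mode: Optional[str] = None  # only meaningful in codegen


def resolve_interactions(args: PipelineArgs) -> PipelineArgs:
    """Return a copy of args with the documented precedence rules applied."""
    blend_mode = args.blend_mode
    if args.style is not None and blend_mode == "on":
        # Explicit style selection overrides blending.
        blend_mode = "off"
    # Refine mode is ignored in handbook (manual) mode.
    refine_mode = args.refine_mode if args.workflow == "codegen" else None
    return replace(args, blend_mode=blend_mode, refine_mode=refine_mode)


if __name__ == "__main__":
    raw = PipelineArgs(workflow="manual", style="example-slug",
                       blend_mode="on", refine_mode="example-refine")
    print(resolve_interactions(raw))
```

Resolving the flags once, up front, keeps the precedence rules in one place instead of scattering them across the pipeline stages.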

## Stack Adapters

@@ -173,5 +222,12 @@ If stack is unknown, fallback to framework-agnostic Tailwind semantics.
- `scripts/benchmark_pipeline.py`: benchmark pass-rate, hard-check pass rate, bucket pass-rate (`strict-domain`/`balanced`/`expressive`), snapshot export, and baseline regression gate.
- `scripts/ci_regression_gate.sh`: CI wrapper for benchmark regression gate (supports baseline bootstrap).
- `scripts/smoke_test.py`: validate end-to-end script integrity.
- `scripts/validate_taxonomy.py`: taxonomy consistency + style-tag-registry coverage guard (`--fail-on-warning` promotes warnings to failures).
- `scripts/validate_output_contract_sync.py`: output-contract markdown JSON examples vs tests schema sync guard (uses the first JSON block in each required section as canonical; `--fail-on-warning` promotes warnings to failures).
- `scripts/merge_taxonomy_expansion.py`: merge Gemini taxonomy expansion payloads (animation/interaction + optional `new_style_tags`).
- `scripts/propose_upgrade.py`: generate manual-review upgrade candidates from pipeline output.
- `scripts/review_upgrade_candidate.py`: validate upgrade candidate schema and gate requirements.
- `references/benchmark-baseline.json`: default baseline snapshot for CI gate.
- `references/github-actions-regression-gate.yml`: GitHub Actions template for regression automation.
- `references/taxonomy/style-tag-registry.json`: controlled style tag dictionary used by routing validation.
- `references/taxonomy/*`: site-type routing, controlled tags, alias mapping, and style-tag overrides.
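
The `validate_taxonomy.py` entry above describes a registry coverage guard with a `--max-unused-style-tags` limit and `--fail-on-warning`. A rough sketch of that decision logic follows. It is an assumption about the behavior, not the script's implementation: the function names and the flat tag lists are hypothetical, and the real registry is a JSON file whose structure is not shown here.

```python
def unused_style_tags(registry_tags, used_tags):
    """Tags declared in the registry but never referenced, sorted for stable output."""
    return sorted(set(registry_tags) - set(used_tags))


def taxonomy_gate(registry_tags, used_tags, max_unused=0, warnings=(), fail_on_warning=False):
    """Return (passed, messages) for a simplified taxonomy guard."""
    messages = []
    unused = unused_style_tags(registry_tags, used_tags)
    if len(unused) > max_unused:
        messages.append(
            f"{len(unused)} unused style tag(s) exceed limit {max_unused}: {', '.join(unused)}"
        )
    failed = bool(messages)
    if warnings:
        messages.extend(f"warning: {text}" for text in warnings)
        # With fail_on_warning, any warning is promoted to a failure.
        failed = failed or fail_on_warning
    return (not failed, messages)


if __name__ == "__main__":
    passed, notes = taxonomy_gate(
        registry_tags=["tag-a", "tag-b", "tag-c"],  # hypothetical tags
        used_tags=["tag-a", "tag-b"],
        max_unused=0,
    )
    print(passed)
    for note in notes:
        print(note)
```

Setting the limit to 0 makes every newly registered tag require at least one consumer, which is the strict mode used in the CI workflow and the README.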