Native desktop companion for the AhaKey-X1 keyboard — with a built-in voice agent.
AhakeyAI/desktop is the official source repository for the AhaKey desktop baseline — a companion suite for the AhaKey-X1 keyboard (Vibecoding Keyboard) on Windows and macOS.
The desktop app does two things:
- Keyboard control — connect to the AhaKey-X1 over BLE, configure the 4-key × 3-mode mapping, push OLED art, and reflect IDE state on the LED light bar.
- On-device AI workflows — drive a voice-first agent that runs locally on the user's machine, talks to LLMs, and reaches into productivity tools (e.g. Feishu / Lark) on the user's behalf.
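As a rough illustration of the 4-key × 3-mode mapping, the sketch below models it as a small table indexed by mode and key. It is not taken from the repository: the `KeyMap` class, its method names, and the action labels are hypothetical, and the real client may store and transmit the mapping differently.

```python
from dataclasses import dataclass, field

NUM_KEYS = 4   # from the README: 4 keys
NUM_MODES = 3  # from the README: 3 modes


def _empty_bindings():
    # One row per mode, one slot per key; "" means "unassigned".
    return [[""] * NUM_KEYS for _ in range(NUM_MODES)]


@dataclass
class KeyMap:
    """Hypothetical in-memory model of the 4-key x 3-mode mapping."""

    bindings: list = field(default_factory=_empty_bindings)

    def assign(self, mode: int, key: int, action: str) -> None:
        # Reject indices outside the 3 x 4 grid instead of silently wrapping.
        if not (0 <= mode < NUM_MODES and 0 <= key < NUM_KEYS):
            raise IndexError(f"mode={mode}, key={key} is outside the 3 x 4 grid")
        self.bindings[mode][key] = action

    def lookup(self, mode: int, key: int) -> str:
        return self.bindings[mode][key]


# Example: bind the last key in the first mode to an illustrative action label.
keymap = KeyMap()
keymap.assign(0, 3, "accept-suggestion")
```

Keeping the mapping as a plain mode-by-key grid makes it easy to validate before anything is sent to the keyboard; how the real app serializes it over BLE is described in the platform docs, not here.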
The macOS client provides:
- Native BLE stack. SwiftUI + CoreBluetooth, with no Python / .NET / TCP bridge in the loop. One signed `.app` bundle plus a background `ahakeyconfig-agent` LaunchAgent that keeps pushing LED state after the GUI closes.
- Voice Agent system. A `VoiceAgent` Swift module with a supervisor + sub-agent orchestrator (`VoiceAgentOrchestrator`), structured tool-calling, per-agent memory, concurrency limiting, and an OpenAI-compatible `LLMClient`. Sessions persist across launches via `VoiceAgentSessionStore`.
- Feishu / Lark integration. Send messages and look up contacts through `lark-cli` under the user's own identity; the app never stores Feishu credentials. Contacts can be aliased locally (`FeishuContactBook`) so sub-agents can resolve names like "智能助手" → `open_id`.
- Dual workspace. A root workspace toggle between IDE 工作台 (IDE Workbench, the classic keyboard config) and Agent 工作台 (Agent Workbench, the voice agent), backed by a unified design system (`AhaKeyDesignSystem`) and a single onboarding flow (`UnifiedTypelessOnboardingView`).
- Voice input HUD. A floating `VoiceInputFloatingHUD` driven by `NativeSpeechTranscriptionService` (Apple Speech), with push-to-talk relay routes for IDEs, WeChat, and similar targets. The held-key state survives view rebuilds.
- LLM configuration UI. `LLMConfigView` surfaces model / endpoint / key settings. The provider speaks the OpenAI protocol, so any compatible backend works.
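To make the supervisor + sub-agent idea concrete, here is a minimal Python sketch of a supervisor that fans tasks out to sub-agents under a concurrency limit, with one sub-agent resolving a local contact alias to an `open_id`. The actual `VoiceAgent` module is written in Swift and its API is not shown in this README, so `Supervisor`, `ContactBook`, `make_contact_agent`, `echo_agent`, and the `ou_example_123` value are all invented for illustration.

```python
import asyncio


class ContactBook:
    """Hypothetical local alias table: display name -> open_id."""

    def __init__(self, aliases):
        self._aliases = dict(aliases)

    def resolve(self, name):
        # Returns None when the alias is unknown.
        return self._aliases.get(name)


def make_contact_agent(book):
    """Build a sub-agent that resolves a contact name through the alias table."""

    async def agent(name):
        await asyncio.sleep(0.01)  # stands in for real I/O
        return {"name": name, "open_id": book.resolve(name)}

    return agent


async def echo_agent(text):
    """Trivial sub-agent used only to show mixed workloads."""
    await asyncio.sleep(0.01)
    return {"echo": text}


class Supervisor:
    """Dispatches (agent_name, task) pairs with at most max_concurrent running."""

    def __init__(self, sub_agents, max_concurrent=2):
        self._sub_agents = sub_agents
        self._max_concurrent = max_concurrent
        self._running = 0
        self.peak = 0  # highest number of sub-agents observed running at once

    async def _run_one(self, limiter, agent_name, task):
        async with limiter:  # blocks while max_concurrent tasks are in flight
            self._running += 1
            self.peak = max(self.peak, self._running)
            try:
                return await self._sub_agents[agent_name](task)
            finally:
                self._running -= 1

    async def dispatch(self, requests):
        # Create the semaphore inside the running event loop.
        limiter = asyncio.Semaphore(self._max_concurrent)
        jobs = [self._run_one(limiter, name, task) for name, task in requests]
        return await asyncio.gather(*jobs)  # results keep request order


book = ContactBook({"智能助手": "ou_example_123"})  # placeholder open_id
supervisor = Supervisor(
    {"contacts": make_contact_agent(book), "echo": echo_agent},
    max_concurrent=2,
)
results = asyncio.run(
    supervisor.dispatch(
        [("contacts", "智能助手"), ("echo", "hi"), ("contacts", "unknown")]
    )
)
```

The semaphore is the whole concurrency limit here: the third request waits until one of the first two finishes. Unknown aliases resolve to `None`, leaving it to the caller to decide whether to ask the user or fall back to a contact lookup.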
See platforms/macos/README.md and platforms/macos/client/README.md for build instructions, BLE protocol docs, and the source map.
The Windows client (Python PySide6 + .NET BLE bridge + Swift-equivalent helper) is preserved as the imported baseline. No major refactor this cycle. See platforms/windows/README.md.
desktop/
├── platforms/
│ ├── macos/client/ # Swift + SwiftUI client (active)
│ └── windows/ # Windows client baseline
├── docs/ # Repo-level docs (architecture, releases, layout)
├── scripts/ # Repo-level helper scripts
├── assets/ # Shared brand / build assets
└── releases/ # Release notes (binaries live in GitHub Releases)
This repository stores source code, project files, required assets, and documentation only.
Build artifacts (.exe, .msi, .app, .dmg) are not committed. Installers are distributed exclusively through GitHub Releases.
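One way to enforce this policy is a small check over the list of paths about to be committed. The sketch below is an assumption about how such a check could look, not a script that ships in `scripts/`; the function name `find_build_artifacts` and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath

# Extensions the README says must not be committed.
FORBIDDEN_SUFFIXES = {".exe", ".msi", ".app", ".dmg"}


def find_build_artifacts(paths):
    """Return the paths that are, or live inside, a forbidden build artifact."""
    flagged = []
    for raw in paths:
        # Check every path component so files inside a .app bundle are caught too.
        parts = PurePosixPath(raw).parts
        if any(PurePosixPath(p).suffix.lower() in FORBIDDEN_SUFFIXES for p in parts):
            flagged.append(raw)
    return flagged


staged = [
    "docs/architecture.md",
    "build/AhaKey.app/Contents/Info.plist",
    "dist/setup.EXE",
    "releases/v1.md",
]
offenders = find_build_artifacts(staged)
```

Checking each path component rather than only the file name matters because a macOS `.app` is a directory, so its contents would otherwise slip through.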
- macOS client has moved past the post-migration cleanup phase and is in active feature development — voice agent, Feishu integration, and the new workbench UI all landed in this cycle.
- Windows client remains on the imported baseline.
- Windows / macOS are intentionally kept in separate platform directories: different runtimes, UI models, and system capabilities.
New contributors:
- docs/repo-layout.md
- docs/installation.md
- docs/architecture.md
- docs/releases.md
- platforms/macos/README.md
- platforms/macos/client/README.md
- platforms/windows/README.md