This document describes the current runtime entrypoints and key server modules.
- `server/server.py`
  - Main device server process
  - Auto-detects wired USB serial or accepts Wi-Fi/TCP connections
  - Handles ESP32 packet I/O, the STT pipeline, and Agent mode orchestration
  - Starts the optional web dashboard and logs the dashboard URL(s) plus `/api/docs`
- `ccoli/cli.py`
  - `ccoli setup` / `ccoli install`: onboarding for the LLM install target and the device connection mode (`wired`/`wifi`)
  - `ccoli start`
  - `ccoli config wifi <WiFi Name> password <password> port <port>`
- `server/src/protocol.py`
  - Packet type constants
  - Packet encode/decode helpers
  - CMD/AUDIO_OUT send helpers
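
As a rough illustration of what encode/decode helpers of this kind can look like, the sketch below frames a packet as a 1-byte type, a 2-byte little-endian payload length, and the payload. The frame layout, the constant values, and the names `encode_packet` and `decode_packet` are assumptions made for this example; the real definitions live in `server/src/protocol.py` and may differ.

```python
import struct

# Hypothetical packet type constants; the real values are defined in protocol.py.
PTYPE_CMD = 0x01
PTYPE_AUDIO_OUT = 0x02

# Assumed header layout: type (1 byte) + payload length (2 bytes, little-endian).
_HEADER = struct.Struct("<BH")


def encode_packet(ptype: int, payload: bytes) -> bytes:
    """Prefix the payload with the assumed 3-byte header."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 2-byte length field")
    return _HEADER.pack(ptype, len(payload)) + payload


def decode_packet(buf: bytes):
    """Return (ptype, payload, remaining_bytes), or None if buf is incomplete."""
    if len(buf) < _HEADER.size:
        return None
    ptype, length = _HEADER.unpack_from(buf)
    end = _HEADER.size + length
    if len(buf) < end:
        return None
    return ptype, buf[_HEADER.size:end], buf[end:]
```

Returning the unconsumed bytes lets a caller keep feeding a serial or TCP read buffer into `decode_packet` until it returns `None`.
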
- `server/src/connection_manager.py`
  - USB serial auto-detect + TCP listen/accept loop
  - Live `Wired > WiFi` priority polling in `auto` mode
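
The `Wired > WiFi` rule can be pictured as a small selection function that is re-evaluated on every poll. This is only a sketch: the function name `choose_transport` and its parameters are hypothetical, and the real connection manager also owns the serial and socket handles.

```python
from typing import Optional


def choose_transport(
    serial_port: Optional[str],
    tcp_client_connected: bool,
    mode: str = "auto",
) -> Optional[str]:
    """Return "wired", "wifi", or None according to the Wired > WiFi rule."""
    if mode == "wired":
        return "wired" if serial_port else None
    if mode == "wifi":
        return "wifi" if tcp_client_connected else None
    # auto mode: prefer the wired link whenever a serial port is detected
    if serial_port:
        return "wired"
    if tcp_client_connected:
        return "wifi"
    return None
```

Calling this on each polling tick means that plugging in a USB cable while a Wi-Fi client is connected switches the result to `"wired"` on the next poll.
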
- `server/src/stt_engine.py`
  - Whisper model load + transcription wrapper
  - GPU/CPU device-priority fallback
  - The current production path uses `faster-whisper`, so STT runs on `cpu`/`cuda` and does not use Apple `MPS`
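
A device-priority fallback of this kind could be sketched as follows. The helper name `pick_stt_device` and the way CUDA availability is passed in are assumptions for illustration; the point is only that `mps` entries are skipped and `cpu` is the final fallback.

```python
from typing import Iterable


def pick_stt_device(priority: Iterable[str], cuda_available: bool) -> str:
    """Return the first usable device from the priority list.

    Only "cuda" and "cpu" are considered usable here, because the
    faster-whisper path described above does not run on Apple MPS.
    """
    for device in priority:
        if device == "cuda" and cuda_available:
            return "cuda"
        if device == "cpu":
            return "cpu"
        # Anything else (for example "mps") is skipped.
    return "cpu"  # final fallback when nothing in the list is usable
```

The returned string could then be passed as the `device` argument when constructing faster-whisper's `WhisperModel`.
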
- `server/src/audio_processor.py`
  - Audio quality checks, trimming, and normalization
- `server/src/agent_mode.py`
  - Agent response orchestration (LLM/TTS/services)
  - Sends deterministic time-of-day connection greetings without an LLM call
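
A deterministic greeting needs nothing more than the local hour. The sketch below shows the idea; the hour boundaries, the wording, and the function name `connection_greeting` are made up for this example and are not taken from `agent_mode.py`.

```python
from datetime import datetime


def connection_greeting(now: datetime) -> str:
    """Pick a greeting from the hour alone, so no LLM call is needed."""
    hour = now.hour
    if 5 <= hour < 12:
        return "Good morning."
    if 12 <= hour < 18:
        return "Good afternoon."
    if 18 <= hour < 22:
        return "Good evening."
    return "Hello, it's late."
```

Because the output depends only on the clock, the greeting is available immediately on connect, even when no LLM backend is reachable.
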
- `server/src/integrations/calendar_google.py`
  - Google Calendar OAuth refresh-token handling
  - Calendar Events API list/create/update/delete operations
  - Setup guide: `docs/GOOGLE_CALENDAR_GUIDE.md`
- `server/src/robot_mode.py`
  - Robot command parser (currently gated by a feature flag)
- `server/src/llm_client.py`
  - Multi-provider LLM wrapper
  - Runtime-verified priority order: `ollama -> api -> ollama_cpu -> other`
  - Pins the first successful LLM candidate after startup or a priority/config reload, so later requests reuse the same route until preferences change
  - Disables Gemini thinking with `thinkingBudget: 0` for runtime calls
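
The pinning behaviour described above can be sketched as a small router that walks the priority list once and then sticks with the first backend that answers. The class name `PinnedRouter`, its methods, and the callable-per-backend shape are assumptions for illustration, not the actual `llm_client.py` interface.

```python
from typing import Callable, Dict, List, Optional, Sequence


class PinnedRouter:
    """Try backends in priority order, then reuse the first one that works."""

    def __init__(self, priority: Sequence[str], backends: Dict[str, Callable[[str], str]]):
        self._priority: List[str] = list(priority)
        self._backends = backends
        self.pinned: Optional[str] = None

    def reload(self, priority: Sequence[str]) -> None:
        """Apply a new priority order and drop the pin so routing is re-verified."""
        self._priority = list(priority)
        self.pinned = None

    def generate(self, prompt: str) -> str:
        if self.pinned is not None:
            # Pinned route: errors propagate instead of silently re-routing.
            return self._backends[self.pinned](prompt)
        errors = []
        for name in self._priority:
            backend = self._backends.get(name)
            if backend is None:
                continue  # candidate not configured
            try:
                reply = backend(prompt)
            except Exception as exc:
                errors.append(f"{name}: {exc}")
                continue
            self.pinned = name  # first success wins
            return reply
        raise RuntimeError("no LLM backend available: " + "; ".join(errors))
```

Pinning avoids paying the failure latency of an unreachable higher-priority backend on every request; calling `reload` is the explicit point at which the order is checked again.
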
- `server/src/runtime_preferences.py`
  - Runtime priority defaults, model resolution, hardware detection
- `server/src/runtime_controller.py`
  - Conversational priority commands such as `우선순위 상태` ("priority status")
  - Live runtime reload for LLM/STT/connection preference changes
- `server/src/input_gate.py`
  - Stream gating for turn-based processing
- `server/src/job_queue.py`
  - Queue utility for STT/TTS command flows
- `server/config.yaml` (primary)
- `server/.env` (optional overrides, see `server/env.example`)
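
One way to picture "primary config plus optional overrides" is to parse the `.env` text into key/value pairs and lay selected keys over the already-loaded config. Everything named here is hypothetical: the variable `LLM_PROVIDER`, the `llm.provider` path, and both helper functions are placeholders, and the server's real override rules may differ.

```python
import copy
from typing import Dict, Mapping, Tuple


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def apply_overrides(
    config: dict,
    env: Mapping[str, str],
    mapping: Mapping[str, Tuple[str, ...]],
) -> dict:
    """Return a copy of config with env values written at the mapped paths."""
    merged = copy.deepcopy(config)
    for env_key, path in mapping.items():
        if env_key not in env:
            continue  # no override present; keep the primary value
        node = merged
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = env[env_key]
    return merged
```

Working on a deep copy keeps the values loaded from `config.yaml` intact, which makes it easy to see which settings came from an override.
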
- `GET /api/status`
  - Returns the existing dashboard status payload
  - Now also includes `runtime` with the current model/network/processor priority state
  - If optional web packages such as `uvicorn` are missing, the device server keeps running and logs an install hint instead of exiting
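
The "keep running and log a hint" behaviour for missing web packages can be approximated by probing for the modules before starting the dashboard. This is a sketch under assumptions: the function name `dashboard_available`, the logger name, and the hint wording are invented for the example.

```python
import importlib.util
import logging
from typing import Sequence

logger = logging.getLogger("dashboard")


def dashboard_available(required: Sequence[str] = ("uvicorn",)) -> bool:
    """Return True if every optional top-level package can be imported."""
    missing = [name for name in required if importlib.util.find_spec(name) is None]
    if missing:
        # Log a hint and let the caller continue without the dashboard.
        logger.warning(
            "Web dashboard disabled; install missing packages: pip install %s",
            " ".join(missing),
        )
        return False
    return True
```

The device server would call this once at startup and simply skip the dashboard thread when it returns `False`, rather than raising `ImportError`.
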
- `POST /api/chat`
  - Regular agent chat
  - Also accepts runtime-priority commands that would normally be spoken to the device
- Agent mode: enabled
- Robot mode: disabled by default via `features.robot_mode_enabled: false`
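
Reading the flag defensively might look like the sketch below, which treats a missing `features` section or a missing key as disabled. The helper name `robot_mode_enabled` is hypothetical, and the `config` argument stands for the dictionary obtained after parsing `server/config.yaml`.

```python
def robot_mode_enabled(config: dict) -> bool:
    """Return the features.robot_mode_enabled flag, defaulting to False."""
    features = config.get("features") or {}
    return bool(features.get("robot_mode_enabled", False))
```

Defaulting to `False` matches the "disabled by default" behaviour even when the key is absent from the file.
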