Version: 3.2 Architecture: 10-Component AI Ensemble with Dual-Path Routing + Adaptation Engine Status: MVP Development (99.5% Complete) — installable as a Progressive Web App
ChaosLimbă is a sophisticated adaptive language learning platform designed specifically for Romanian language instruction. The system employs a 10-component AI ensemble with intelligent dual-path routing, a 3-tier fossilization adaptation engine, and grammar workshop system — all grounded in SLA theory.
- Multimodal Input: Accept both speech and text inputs from learners
- Intelligent Routing: Automatically detects input type and activates appropriate analysis components
- Comprehensive Assessment: Grammar, pronunciation, semantic similarity, and intonation analysis
- 3-Tier Adaptation Engine: Fossilization detection with escalating intervention (nudge → push → destabilize)
- Grammar Workshop: Targeted micro-challenges (transform, complete, fix, rewrite) with vocab exercises
- Smart Content Selection: Weighted random content targeting based on error patterns and adaptation profile
- Installable PWA: Web app manifest + service worker, works as a standalone app on desktop and mobile home screens
- Cost Optimized: $0-5/month (72-100% under original budget — nearly all FREE APIs)
| # | Component | Model/Technology | Active For | Status |
|---|---|---|---|---|
| 1 | Speech Recognition | whisper-large-v3 (Groq API) | Speech only | ✅ COMPLETE |
| 2 | Pronunciation Analysis | romanian-wav2vec2 (HF Inference) | Speech only | ✅ COMPLETE |
| 3 | Grammar Correction | Claude Haiku 4.5 (Anthropic API) | Both | ✅ COMPLETE |
| 4 | SPAM-A: Semantic Similarity | multilingual-MiniLM-L12-v2 (HF) | Both | ✅ COMPLETE |
| 5 | SPAM-B: Relevance Scorer | Reuses SPAM-A embeddings (HF) | Both | ✅ COMPLETE |
| 6 | SPAM-D: Intonation Mapper | Rule-based lookup (50-100 pairs) | Speech only | ✅ COMPLETE |
| 7 | Conductor | Conditional logic | Both | ✅ COMPLETE |
| 8 | Feedback Aggregator | Integration logic | Both | ✅ COMPLETE |
| 9 | AI Tutor | Llama 3.3 70B (Groq API) | Both | ✅ COMPLETE |
| 10 | Adaptation Engine | 3-tier fossilization system | Both | ✅ COMPLETE |
Plus: Workshop Challenge Generator (Groq API) — grammar/vocab micro-challenges ✅ COMPLETE
Speech Input Path (1.0-1.5 seconds)
Audio → Speech Recognition (Groq) → Grammar + Pronunciation (parallel) →
SPAM-A Semantic → SPAM-D Intonation → Feedback Aggregator → Llama 3.3 Tutor
Text Input Path (0.5-0.8 seconds)
Text → Grammar (local) + SPAM-A Semantic (parallel) → Feedback Aggregator → Llama 3.3 Tutor
The text path skips pronunciation and intonation analysis, reducing processing time by 40-50% and API costs by ~40%.
- Framework: Next.js 16 (App Router)
- Language: TypeScript
- Styling: Tailwind CSS
- UI Components: shadcn/ui
- Speech Recognition: Groq API (whisper-large-v3) - FREE
- Pronunciation: HuggingFace Inference API (romanian-wav2vec2) - FREE
- Grammar: Anthropic API (Claude Haiku 4.5) - ~$0.001 per check
- Semantic (SPAM-A): HuggingFace Inference API (multilingual-MiniLM) - FREE
- Relevance (SPAM-B): Reuses SPAM-A embeddings - FREE
- AI Tutor: Llama 3.3 70B (Groq API) - FREE
- Workshop Challenges: Llama 3.3 70B (Groq API) - FREE
- Adaptation Engine: In-app logic (3-tier fossilization) - FREE
- Database: Neon PostgreSQL + Drizzle ORM
Analyzes user input (speech or text) and returns comprehensive grading report.
Request:
{
"user_id": "uuid",
"input_type": "speech" | "text",
"content": {
"audio_blob": "base64_encoded_audio", // for speech
"text": "Romanian text string" // for text
},
"context": {
"expected_response": "Expected Romanian text",
"difficulty_level": "A1" | "A2" | "B1" | "B2" | "C1" | "C2"
}
}Response (Speech Input):
{
"input_type": "speech",
"grammar": { "score": 0.85, "errors": [...] },
"pronunciation": { "phoneme_score": 0.78, "stress_accuracy": 0.82 },
"semantic": { "similarity_score": 0.90 },
"intonation": { "warnings": [...] },
"overall_score": 0.85
}Response (Text Input):
{
"input_type": "text",
"grammar": { "score": 0.85, "errors": [...] },
"pronunciation": null,
"semantic": { "similarity_score": 0.90 },
"intonation": null,
"overall_score": 0.87
}- Node.js 20+ and npm/yarn/pnpm/bun
- PostgreSQL database
- API keys:
- Anthropic API key (for Claude Haiku 4.5 grammar checking)
- Groq API key (for Whisper speech recognition & Llama 3.3 tutor)
- HuggingFace token (for pronunciation & semantic analysis)
- Clerk keys (for authentication)
Create a .env.local file:
# --- Required ---
# Authentication (Clerk)
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=your_clerk_key
CLERK_SECRET_KEY=your_clerk_secret
# Database (Neon Postgres)
DATABASE_URL=postgresql://...
# AI Providers
ANTHROPIC_API_KEY=your_anthropic_key # Claude Haiku 4.5 grammar checking
GROQ_API_KEY=your_groq_api_key # Whisper speech recognition + Llama 3.3 tutor/workshop
HUGGINGFACE_API_KEY=your_hf_token # Pronunciation + semantic analysis (HUGGINGFACE_API_TOKEN also accepted)
# Cloudflare R2 Storage (audio assets)
R2_ENDPOINT=your_r2_endpoint
R2_ACCESS_KEY_ID=your_access_key
R2_SECRET_ACCESS_KEY=your_secret_key
R2_BUCKET_NAME=your_bucket_name
R2_PUBLIC_URL=your_public_url
# --- Optional ---
# Clerk webhook signature (for user sync)
CLERK_WEBHOOK_SECRET=whsec_...
# Email notifications (Resend)
RESEND_API_KEY=your_resend_key
RESEND_FROM_EMAIL="ChaosLimbă <onboarding@resend.dev>"
# ElevenLabs TTS (alternative voice pipeline)
ELEVENLABS_API_KEY=your_elevenlabs_key
# Google Cloud TTS (used for bulk content audio generation)
GOOGLE_TTS_CREDENTIALS_JSON='{"type":"service_account",...}'
# or point to a key file:
# GOOGLE_APPLICATION_CREDENTIALS=./google-tts-credentials.json
# Analytics (Umami)
NEXT_PUBLIC_UMAMI_URL=https://your-umami.example.com
NEXT_PUBLIC_UMAMI_WEBSITE_ID=your_website_id
# Cron / app URL
CRON_SECRET=a_random_secret
NEXT_PUBLIC_APP_URL=https://chaoslimba.adhdesigns.dev
# Dev tooling
DEV_PASSCODE=chaoslimba-dev # guards /api/dev/* routes in local dev# Install dependencies
npm install
# or
yarn install
# or
pnpm install
# or
bun installChaosLimbă requires the following system dependencies:
- Node.js 20+ (for Next.js 16)
- FFmpeg (for audio processing)
# macOS brew install ffmpeg # Ubuntu/Debian sudo apt-get install ffmpeg # Windows choco install ffmpeg
-
yt-dlp (extracts audio from YouTube videos when captions unavailable)
# macOS brew install yt-dlp # Linux/macOS (pip) pip install yt-dlp # Windows choco install yt-dlp
Note: Without yt-dlp, videos without captions will show title-only questions (graceful degradation).
# Check installations
ffmpeg -version
yt-dlp --version # Optionalnpm run dev
# or
yarn dev
# or
pnpm dev
# or
bun devOpen http://localhost:5001 with your browser to see the result. (The dev server binds to port 5001 — see package.json scripts.)
# Push schema changes to database
npm run db:push
# Open Drizzle Studio (database GUI)
npm run db:studio
# Seed initial data (optional)
npx tsx scripts/seed-content.ts
npx tsx scripts/seed-grammar-features.ts
npx tsx scripts/seed-curated-content.tschaoslimba/
├── src/
│ ├── app/ # Next.js app router
│ │ ├── (dashboard)/ # Dashboard layout group (12 pages)
│ │ │ ├── ask-tutor/ # On-demand linguistic explanations
│ │ │ ├── ce-inseamna/ # Word meaning lookup
│ │ │ ├── chaos-window/ # Interactive AI tutor sessions
│ │ │ ├── cum-se-pronunta/ # Pronunciation lookup
│ │ │ ├── deep-fog/ # Passive immersion mode
│ │ │ ├── error-garden/ # Error pattern visualization
│ │ │ ├── journey/ # Linguistic autobiography
│ │ │ ├── mystery-shelf/ # Unknown word collection
│ │ │ ├── proficiency-tracker/
│ │ │ ├── settings/ # User preferences + themes
│ │ │ └── workshop/ # Grammar/vocab micro-challenges
│ │ └── api/ # 46 API routes
│ │ ├── speech-to-text/
│ │ ├── analyze-pronunciation/
│ │ ├── aggregate-feedback/
│ │ ├── chaos-window/
│ │ ├── workshop/ # Challenge + evaluation endpoints
│ │ ├── journey/ # Linguistic autobiography
│ │ └── generated-content/ # AI content generation
│ ├── components/ # React components
│ │ ├── ui/ # shadcn/ui components
│ │ └── features/ # Feature-specific components
│ └── lib/
│ ├── ai/ # AI component wrappers
│ │ ├── conductor.ts # Orchestration logic
│ │ ├── aggregator.ts # Feedback combination
│ │ ├── tutor.ts # AI tutor (Llama 3.3 70B)
│ │ ├── adaptation.ts # 3-tier fossilization engine
│ │ ├── workshop.ts # Challenge generation + evaluation
│ │ ├── grammar.ts # Grammar analysis (Claude Haiku)
│ │ ├── groq.ts # Speech recognition + Groq client
│ │ ├── pronunciation.ts # Wav2Vec2 pronunciation
│ │ ├── spamA.ts # Semantic similarity
│ │ ├── spamB.ts # Relevance scoring
│ │ └── spamD.ts # Intonation mapping
│ └── db/
│ ├── schema.ts # Drizzle ORM schema
│ └── queries.ts # Smart content selection + queries
├── content/ # Curated learning content
├── scripts/ # Audio processing + content tools
└── public/ # Static assets
Passive immersion experience with content 1-3 levels above learner's proficiency, designed to build tolerance for ambiguity and contextual awareness.
Interactive practice environment with real-time feedback on grammar, pronunciation, and semantic accuracy.
Collection system for unknown words/phrases encountered during Deep Fog sessions, with contextual preservation.
Pattern clustering and interlanguage analysis system that tracks errors by input type (speech vs. text) and updates user profiles. Powers the Adaptation Engine's fossilization detection.
Grammar and vocabulary micro-challenge system with 4 grammar types (transform, complete, fix, rewrite) and 3 vocab types (use_it, which_one, spot_the_trap). Features non-linear flow, multiple choice UI, and destabilization-tier-aware challenge generation.
3-tier fossilization escalation system: Tier 1 (nudge, 40-69% error frequency), Tier 2 (push, ≥70% + 2 failed interventions), Tier 3 (destabilize, ≥70% + 4 failed interventions). Drives smart content selection and workshop feature targeting with dynamic weight adjustment.
Llama 3.3 70B-powered conversational system (via Groq, FREE) that generates "productive confusion" responses tailored to individual learner patterns. Receives fossilization alerts from the Adaptation Engine to target weak structures.
Installable on desktop and mobile via the browser's "Add to Home Screen" flow. Ships a web app manifest (src/app/manifest.ts), a conservative service worker (public/sw.js) that precaches static assets and serves an offline fallback, and an in-app install prompt that appears for eligible visitors on the dashboard. API, Clerk, and webhook requests always go directly to the network — user-specific responses are never cached.
- Text Input: 0.5-0.8 seconds
- Speech Input: 1.0-1.5 seconds
- Speech Recognition: 10-15% WER
- Pronunciation: 75-85% phoneme accuracy
- Grammar: LLM-based contextual analysis (Claude Haiku 4.5)
- Semantic: 80-85% similarity accuracy
- Intonation: >90% minimal pair detection
- Monthly Hosting: $0-5 (nearly all FREE APIs + minimal Claude Haiku usage)
- Per Request: ~$0.001 (Claude Haiku grammar checking with caching)
Cost Breakdown:
- Speech Recognition (Groq): FREE
- Pronunciation (HuggingFace): FREE
- Grammar (Claude Haiku):
$0.001 per check ($2/mo with caching)- Semantic / Relevance (HuggingFace): FREE
- AI Tutor + Workshop (Groq): FREE
- Adaptation Engine: FREE (in-app logic)
- 72-100% under original $10-18/month budget
Phase 1: Text Path (Days 1-4)
- ✅ Conductor (text branch)
- ✅ SPAM-A deployment (HF Inference)
- ✅ Feedback aggregator (text-only)
- ✅ Integration testing
Phase 2: Speech Path (Days 5-10)
- ✅ Whisper deployment (Groq API)
- ✅ Wav2Vec2 deployment (HF Inference)
- ✅ SPAM-D intonation mapper (implemented)
- ✅ Conductor extension (speech branch)
- ✅ Aggregator extension (speech data)
Phase 3: Integration (Days 11-13)
- ✅ Llama 3.3 70B tutor integration (Groq API)
- ✅ End-to-end testing
- ✅ All 10 components complete!
- ✅ Bug fixes and polish
- System Architecture:
/docs/architecture/system-architecture-description.md - Feature Specifications:
/docs/specifications/feature-specifications.md - AI Ensemble Implementation:
/docs/architecture/AI-ENSEMBLE-IMPLEMENTATION.md - AI Component Testing:
/docs/architecture/AI-COMPONENT-TESTING.md - Curriculum Framework:
/docs/pedagogy/chaoslimbă-curriculum-framework.md - Pedagogical Quality Review:
/docs/pedagogy/Pedagogical-Quality-Review.md - Dataset Documentation:
/docs/reference/datasets/datasets-list.md - CLI Scripts Reference:
/docs/reference/CLI-Scripts-Reference.md - Development Timeline:
/docs/Development Guides/ChaosLimba-Development-Documentation-Timeline.md - Development Guide:
/docs/Development Guides/Naes-Structured-Chaos-Development-Guide.md
Deployed Models:
- whisper-large-v3 (via Groq API - Speech Recognition)
- gigant/romanian-wav2vec2 (Pronunciation)
- Claude Haiku 4.5 (Grammar Correction)
- sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 (SPAM-A/B)
- Llama 3.3 70B (via Groq API - AI Tutor + Workshop)
Planned Models (Post-MVP):
- dumitrescustefan/bert-base-romanian-cased-v1 (for SPAM-C Dialectal)
This is a research and development project. Contributions are welcome! Please see the development guides in /docs/Development Guides/ for coding standards and workflow.
[Your License Here]
For questions or collaboration: lmdrew96@gmail.com
Built with ❤️ for Romanian language learners