inworld-ai · cshape · Feb 20, 2026
@@ -140,6 +140,7 @@ vite.config.ts.timestamp-*
 
 # Project specific
 backend/audio/
+backend/src/graphs/configs/
 .DS_Store
 CLAUDE.md
 templates/

@@ -33,7 +33,10 @@ Thank you for your interest in contributing to the Language Learning App! This d
 
    ```bash
    INWORLD_API_KEY=your_api_key_here
+   # Set one of these:
    ASSEMBLY_AI_API_KEY=your_api_key_here
+   # or
+   SONIOX_API_KEY=your_api_key_here
    ```
 
 5. **Verify the setup**:

@@ -14,7 +14,7 @@ A conversational language learning app powered by Inworld AI Runtime. Practice s
 - Node.js (v20 or higher)
 - npm
 - An Inworld AI account and API key
-- An AssemblyAI account and API key (for speech-to-text)
+- An [AssemblyAI](https://www.assemblyai.com/) or [Soniox](https://soniox.com/) account and API key (for speech-to-text)
 
 ## Get Started
 
@@ -35,17 +35,24 @@ This installs dependencies for the root, backend, and frontend automatically.
 
 ### Step 3: Configure Environment Variables
 
-Create a `backend/.env` file:
+Create a `backend/.env` file with your Inworld key and **one** of the two STT provider keys:
 
 ```bash
 INWORLD_API_KEY=your_inworld_base64_key
+
+# Pick one STT provider:
 ASSEMBLY_AI_API_KEY=your_assemblyai_key
+# or
+SONIOX_API_KEY=your_soniox_key
 ```
 
-| Service        | Get Key From                                        | Purpose                           |
-| -------------- | --------------------------------------------------- | --------------------------------- |
-| **Inworld**    | [platform.inworld.ai](https://platform.inworld.ai/) | AI conversations (Base64 API key) |
-| **AssemblyAI** | [assemblyai.com](https://www.assemblyai.com/)       | Speech-to-text                    |
+The server auto-detects which STT provider to use based on which API key is present. If both are set, Soniox takes priority.
+
+| Service        | Get Key From                                         | Purpose                           |
+| -------------- | ---------------------------------------------------- | --------------------------------- |
+| **Inworld**    | [platform.inworld.ai](https://platform.inworld.ai/)  | AI conversations (Base64 API key) |
+| **AssemblyAI** | [assemblyai.com](https://www.assemblyai.com/)        | Speech-to-text (option 1)         |
+| **Soniox**     | [soniox.com](https://soniox.com/)                    | Speech-to-text (option 2)         |
 
 ### Step 4: Run the Application
 
@@ -102,6 +109,18 @@ VITE_SUPABASE_PUBLISHABLE_KEY=your_anon_key
 
 Find these in: Supabase Dashboard > Settings > API
 
+### Step 6 (Optional): Enable Flashcard Images with Replicate
+
+When exporting flashcards to Anki, the app can generate a unique illustrative image for each vocabulary word using [Replicate](https://replicate.com/)'s FLUX Schnell model. Without a Replicate API token, flashcards are exported with audio only.
+
+Add to `backend/.env`:
+
+```bash
+REPLICATE_API_TOKEN=your_replicate_api_token
+```
+
+Get a token at [replicate.com/account/api-tokens](https://replicate.com/account/api-tokens).
+
 ## Repo Structure
 
 ```
@@ -143,7 +162,7 @@ The app uses a real-time audio streaming architecture:
 
 1. **Frontend** captures microphone audio and streams it via WebSocket
 2. **Backend** processes audio through an Inworld Runtime graph:
-   - AssemblyAI handles speech-to-text with voice activity detection
+   - AssemblyAI or Soniox handles speech-to-text with voice activity detection
    - LLM generates contextual responses in the target language
    - TTS converts responses back to audio
 3. **Flashcards** are auto-generated from conversation vocabulary
@@ -166,16 +185,19 @@ Without Supabase, the app works in anonymous mode using localStorage (no memory
 
 ## Environment Variables Reference
 
-| Variable                    | Required | Description                                                        |
-| --------------------------- | -------- | ------------------------------------------------------------------ |
-| `INWORLD_API_KEY`           | Yes      | Inworld AI Base64 API key                                          |
-| `ASSEMBLY_AI_API_KEY`       | Yes      | AssemblyAI API key                                                 |
-| `PORT`                      | No       | Server port (default: 3000)                                        |
-| `LOG_LEVEL`                 | No       | `trace`, `debug`, `info`, `warn`, `error`, `fatal` (default: info) |
-| `NODE_ENV`                  | No       | Set to `production` for production log format                      |
-| `ASSEMBLY_AI_EAGERNESS`     | No       | Turn detection: `low`, `medium`, `high` (default: high)            |
-| `SUPABASE_URL`              | No       | Supabase project URL (enables memory feature)                      |
-| `SUPABASE_SECRET_KEY`       | No       | Supabase secret key (for backend memory storage)                   |
+| Variable                    | Required           | Description                                                        |
+| --------------------------- | ------------------ | ------------------------------------------------------------------ |
+| `INWORLD_API_KEY`           | Yes                | Inworld AI Base64 API key                                          |
+| `ASSEMBLY_AI_API_KEY`       | One of these two ↕ | AssemblyAI API key                                                 |
+| `SONIOX_API_KEY`            | One of these two ↑ | Soniox API key (takes priority if both are set)                    |
+| `PORT`                      | No                 | Server port (default: 3000)                                        |
+| `LOG_LEVEL`                 | No                 | `trace`, `debug`, `info`, `warn`, `error`, `fatal` (default: info) |
+| `NODE_ENV`                  | No                 | Set to `production` for production log format                      |
+| `ASSEMBLY_AI_EAGERNESS`     | No                 | AssemblyAI turn detection: `low`, `medium`, `high` (default: high) |
+| `SONIOX_EAGERNESS`          | No                 | Soniox endpoint detection: `low`, `medium`, `high` (default: high) |
+| `SUPABASE_URL`              | No                 | Supabase project URL (enables memory feature)                      |
+| `SUPABASE_SECRET_KEY`       | No                 | Supabase secret key (for backend memory storage)                   |
+| `REPLICATE_API_TOKEN`       | No                 | Replicate API token (enables flashcard image generation)           |
 
 ## Testing
 

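The README changes above say the server picks the STT provider from whichever API key is present, with Soniox winning when both are set, and that each provider has its own eagerness variable defaulting to `high`. The server code that does this is not part of this diff, so the following is only a minimal sketch of how such a selection could look. The names `detectSttProvider`, `parseEagerness`, `SttProvider`, `Eagerness`, and `Env` are hypothetical, and treating a whitespace-only value as unset is an assumption.

```typescript
// Hypothetical helper names; the actual server implementation is not shown in this diff.
type SttProvider = 'soniox' | 'assembly';
type Eagerness = 'low' | 'medium' | 'high';
type Env = Record<string, string | undefined>;

// Returns the provider whose key is set. Soniox takes priority when both keys are present.
// Assumption: empty or whitespace-only values count as "not set" (as in a blank .env entry).
function detectSttProvider(env: Env): SttProvider | null {
  if (env.SONIOX_API_KEY?.trim()) return 'soniox';
  if (env.ASSEMBLY_AI_API_KEY?.trim()) return 'assembly';
  return null; // neither key is set; the caller decides how to report this
}

// Reads the eagerness variable for the chosen provider.
// Falls back to the documented default of `high` for missing or unrecognized values.
function parseEagerness(env: Env, provider: SttProvider): Eagerness {
  const raw = provider === 'soniox' ? env.SONIOX_EAGERNESS : env.ASSEMBLY_AI_EAGERNESS;
  const value = raw?.trim().toLowerCase();
  return value === 'low' || value === 'medium' || value === 'high' ? value : 'high';
}
```

In a Node backend this would presumably be called once at startup as `detectSttProvider(process.env)`, failing fast with a clear message when the result is `null`.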
@@ -1,5 +1,11 @@
 INWORLD_API_KEY=
+
+# Speech-to-text: set ONE of these (Soniox takes priority if both are set)
 ASSEMBLY_AI_API_KEY=
+SONIOX_API_KEY=
+
+# Optional: generates images for Anki flashcards
+REPLICATE_API_TOKEN=
 
 SUPABASE_URL=
-SUPABASE_SECRET_KEY=
+SUPABASE_SECRET_KEY=
@@ -58,6 +58,7 @@
     "cors": "^2.8.5",
     "dotenv": "^17.2.1",
     "express": "^4.22.1",
+    "jsonrepair": "^3.13.2",
     "pino": "^10.1.0",
     "uuid": "^11.1.0",
     "ws": "^8.18.0"

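The test hunk that follows calls `getSupportedLanguageCodes` with no argument, with `'soniox'`, and with `'assembly'`, and checks a `requiredSttProvider` field on `SUPPORTED_LANGUAGES` entries. The config module itself is not included in this diff, so here is a minimal sketch of a filter that would satisfy those assertions. The `LanguageConfig` shape, its `name` field, and the five-entry table are illustrative assumptions, not the project's real data.

```typescript
type SttProvider = 'soniox' | 'assembly';

// Assumed shape: only `requiredSttProvider` is confirmed by the tests; `name` is illustrative.
interface LanguageConfig {
  name: string;
  requiredSttProvider?: SttProvider;
}

// Illustrative subset; the real SUPPORTED_LANGUAGES table is not shown in this diff.
const SUPPORTED_LANGUAGES: Record<string, LanguageConfig> = {
  de: { name: 'German' },
  zh: { name: 'Chinese', requiredSttProvider: 'soniox' },
  ja: { name: 'Japanese', requiredSttProvider: 'soniox' },
  ko: { name: 'Korean', requiredSttProvider: 'soniox' },
  ru: { name: 'Russian', requiredSttProvider: 'soniox' },
};

// Keeps languages with no provider requirement, plus those whose requirement
// matches the active provider. With no provider, only unrestricted languages remain.
function getSupportedLanguageCodes(provider?: SttProvider): string[] {
  return Object.entries(SUPPORTED_LANGUAGES)
    .filter(
      ([, lang]) =>
        lang.requiredSttProvider === undefined || lang.requiredSttProvider === provider,
    )
    .map(([code]) => code);
}
```

Under this sketch, `'soniox'` yields every code in the table, while `'assembly'` and the no-argument call both drop `zh`, `ja`, `ko`, and `ru`.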
@@ -74,13 +74,30 @@ describe('languages config', () => {
       expect(codes).toContain('de');
     });
 
-    it('matches SUPPORTED_LANGUAGES keys', () => {
+    it('without provider, returns only languages without requiredSttProvider', () => {
       const codes = getSupportedLanguageCodes();
-      expect(codes.length).toBe(Object.keys(SUPPORTED_LANGUAGES).length);
       for (const code of codes) {
         expect(SUPPORTED_LANGUAGES[code]).toBeDefined();
+        expect(SUPPORTED_LANGUAGES[code].requiredSttProvider).toBeUndefined();
       }
     });
+
+    it('with soniox provider, returns all languages', () => {
+      const codes = getSupportedLanguageCodes('soniox');
+      expect(codes.length).toBe(Object.keys(SUPPORTED_LANGUAGES).length);
+      expect(codes).toContain('zh');
+      expect(codes).toContain('ja');
+      expect(codes).toContain('ko');
+      expect(codes).toContain('ru');
+    });
+
+    it('with assembly provider, excludes soniox-only languages', () => {
+      const codes = getSupportedLanguageCodes('assembly');
+      expect(codes).not.toContain('zh');
+      expect(codes).not.toContain('ja');
+      expect(codes).not.toContain('ko');
+      expect(codes).not.toContain('ru');
+    });
   });
 
   describe('getLanguageOptions', () => {