1 change: 1 addition & 0 deletions .gitignore
@@ -140,6 +140,7 @@ vite.config.ts.timestamp-*

# Project specific
backend/audio/
backend/src/graphs/configs/
.DS_Store
CLAUDE.md
templates/
3 changes: 3 additions & 0 deletions CONTRIBUTING.md
@@ -33,7 +33,10 @@ Thank you for your interest in contributing to the Language Learning App! This d

 ```bash
 INWORLD_API_KEY=your_api_key_here
+# Set one of these:
 ASSEMBLY_AI_API_KEY=your_api_key_here
+# or
+SONIOX_API_KEY=your_api_key_here
 ```

 5. **Verify the setup**:
56 changes: 39 additions & 17 deletions README.md
@@ -14,7 +14,7 @@ A conversational language learning app powered by Inworld AI Runtime. Practice s
 - Node.js (v20 or higher)
 - npm
 - An Inworld AI account and API key
-- An AssemblyAI account and API key (for speech-to-text)
+- An [AssemblyAI](https://www.assemblyai.com/) or [Soniox](https://soniox.com/) account and API key (for speech-to-text)

 ## Get Started

@@ -35,17 +35,24 @@ This installs dependencies for the root, backend, and frontend automatically.

 ### Step 3: Configure Environment Variables

-Create a `backend/.env` file:
+Create a `backend/.env` file with your Inworld key and **one** of the two STT provider keys:

 ```bash
 INWORLD_API_KEY=your_inworld_base64_key
+
+# Pick one STT provider:
 ASSEMBLY_AI_API_KEY=your_assemblyai_key
+# or
+SONIOX_API_KEY=your_soniox_key
 ```

-| Service | Get Key From | Purpose |
-| -------------- | --------------------------------------------------- | --------------------------------- |
-| **Inworld** | [platform.inworld.ai](https://platform.inworld.ai/) | AI conversations (Base64 API key) |
-| **AssemblyAI** | [assemblyai.com](https://www.assemblyai.com/) | Speech-to-text |
+The server auto-detects which STT provider to use based on which API key is present. If both are set, Soniox takes priority.
+
+| Service | Get Key From | Purpose |
+| -------------- | ---------------------------------------------------- | --------------------------------- |
+| **Inworld** | [platform.inworld.ai](https://platform.inworld.ai/) | AI conversations (Base64 API key) |
+| **AssemblyAI** | [assemblyai.com](https://www.assemblyai.com/) | Speech-to-text (option 1) |
+| **Soniox** | [soniox.com](https://soniox.com/) | Speech-to-text (option 2) |
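
The auto-detection rule described in Step 3 can be sketched in a few lines. This is a minimal illustration, not the server's actual code, which this diff does not show. `detectSttProvider` and `Env` are hypothetical names, and the provider identifiers `'soniox'` and `'assembly'` are borrowed from the test file later in this change. Treating blank values as unset and throwing when neither key is present are assumptions.

```typescript
// Hypothetical helper: the real detection code is not part of this diff.
type SttProvider = 'soniox' | 'assembly';

// Shape of process.env, declared locally so the sketch has no Node typings dependency.
type Env = Record<string, string | undefined>;

function detectSttProvider(env: Env): SttProvider {
  // Assumption: an empty or whitespace-only value counts as "not set".
  const has = (name: string): boolean => (env[name] ?? '').trim().length > 0;

  // Soniox takes priority when both keys are present, as the README states.
  if (has('SONIOX_API_KEY')) return 'soniox';
  if (has('ASSEMBLY_AI_API_KEY')) return 'assembly';

  // Assumption: with no STT key at all, fail fast at startup.
  throw new Error('Set SONIOX_API_KEY or ASSEMBLY_AI_API_KEY in backend/.env');
}
```

A caller would typically pass `process.env` once at startup and reuse the result for the lifetime of the server.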

 ### Step 4: Run the Application

@@ -102,6 +109,18 @@ VITE_SUPABASE_PUBLISHABLE_KEY=your_anon_key

 Find these in: Supabase Dashboard > Settings > API

+### Step 6 (Optional): Enable Flashcard Images with Replicate
+
+When exporting flashcards to Anki, the app can generate a unique illustrative image for each vocabulary word using [Replicate](https://replicate.com/)'s FLUX Schnell model. Without this key, flashcards are exported with audio only.
+
+Add to `backend/.env`:
+
+```bash
+REPLICATE_API_TOKEN=your_replicate_api_token
+```
+
+Get a token at [replicate.com/account/api-tokens](https://replicate.com/account/api-tokens).
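
The optional behavior described in Step 6 amounts to a feature flag keyed on the token. The following is a minimal sketch, assuming the export step only needs to know which media to attach. `plannedFlashcardMedia` and `FlashcardMedia` are hypothetical names rather than the app's API, and the Replicate call itself is deliberately left out.

```typescript
// Hypothetical names for illustration; the export code is not part of this diff.
type FlashcardMedia = 'audio' | 'image';

function plannedFlashcardMedia(env: Record<string, string | undefined>): FlashcardMedia[] {
  // Assumption: a blank token is treated the same as a missing one.
  const token = (env['REPLICATE_API_TOKEN'] ?? '').trim();

  // Audio is always exported; images are added only when a token is configured.
  return token.length > 0 ? ['audio', 'image'] : ['audio'];
}
```

Keeping the decision in one small function means a missing token degrades the export to audio only instead of failing it.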

 ## Repo Structure

 ```
@@ -143,7 +162,7 @@ The app uses a real-time audio streaming architecture:

 1. **Frontend** captures microphone audio and streams it via WebSocket
 2. **Backend** processes audio through an Inworld Runtime graph:
-   - AssemblyAI handles speech-to-text with voice activity detection
+   - Speech-to-text with voice activity detection (AssemblyAI or Soniox)
    - LLM generates contextual responses in the target language
    - TTS converts responses back to audio
 3. **Flashcards** are auto-generated from conversation vocabulary
@@ -166,16 +185,19 @@ Without Supabase, the app works in anonymous mode using localStorage (no memory

 ## Environment Variables Reference

-| Variable | Required | Description |
-| --------------------------- | -------- | ------------------------------------------------------------------ |
-| `INWORLD_API_KEY` | Yes | Inworld AI Base64 API key |
-| `ASSEMBLY_AI_API_KEY` | Yes | AssemblyAI API key |
-| `PORT` | No | Server port (default: 3000) |
-| `LOG_LEVEL` | No | `trace`, `debug`, `info`, `warn`, `error`, `fatal` (default: info) |
-| `NODE_ENV` | No | Set to `production` for production log format |
-| `ASSEMBLY_AI_EAGERNESS` | No | Turn detection: `low`, `medium`, `high` (default: high) |
-| `SUPABASE_URL` | No | Supabase project URL (enables memory feature) |
-| `SUPABASE_SECRET_KEY` | No | Supabase secret key (for backend memory storage) |
+| Variable | Required | Description |
+| --------------------------- | ------------------ | ------------------------------------------------------------------ |
+| `INWORLD_API_KEY` | Yes | Inworld AI Base64 API key |
+| `ASSEMBLY_AI_API_KEY` | One of these two ↕ | AssemblyAI API key |
+| `SONIOX_API_KEY` | One of these two ↑ | Soniox API key (takes priority if both are set) |
+| `PORT` | No | Server port (default: 3000) |
+| `LOG_LEVEL` | No | `trace`, `debug`, `info`, `warn`, `error`, `fatal` (default: info) |
+| `NODE_ENV` | No | Set to `production` for production log format |
+| `ASSEMBLY_AI_EAGERNESS` | No | AssemblyAI turn detection: `low`, `medium`, `high` (default: high) |
+| `SONIOX_EAGERNESS` | No | Soniox endpoint detection: `low`, `medium`, `high` (default: high) |
+| `SUPABASE_URL` | No | Supabase project URL (enables memory feature) |
+| `SUPABASE_SECRET_KEY` | No | Supabase secret key (for backend memory storage) |
+| `REPLICATE_API_TOKEN` | No | Replicate API token (enables flashcard image generation) |
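
The defaults listed in this table could be applied with small parsing helpers. The sketch below is illustrative only: `parseEagerness`, `parsePort`, and `readEagerness` are hypothetical names, and falling back to the default on an unrecognized value (rather than rejecting it) and accepting mixed-case input are assumptions, since the diff does not show how the backend reads these variables.

```typescript
type Eagerness = 'low' | 'medium' | 'high';

const EAGERNESS_LEVELS: readonly Eagerness[] = ['low', 'medium', 'high'];

// Returns the documented default ("high") when the value is missing or unrecognized.
function parseEagerness(raw: string | undefined): Eagerness {
  const value = (raw ?? '').trim().toLowerCase();
  return EAGERNESS_LEVELS.find((level) => level === value) ?? 'high';
}

// Returns the documented default (3000) when PORT is missing or not a valid TCP port.
function parsePort(raw: string | undefined): number {
  const port = Number((raw ?? '').trim());
  return Number.isInteger(port) && port > 0 && port < 65536 ? port : 3000;
}

// Each provider has its own eagerness variable, so pick the one that matches.
function readEagerness(
  env: Record<string, string | undefined>,
  provider: 'soniox' | 'assembly',
): Eagerness {
  const name = provider === 'soniox' ? 'SONIOX_EAGERNESS' : 'ASSEMBLY_AI_EAGERNESS';
  return parseEagerness(env[name]);
}
```

Reading the provider-specific variable keeps a leftover `ASSEMBLY_AI_EAGERNESS` from affecting a Soniox setup, and the reverse.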

 ## Testing

8 changes: 7 additions & 1 deletion backend/.env.example
@@ -1,5 +1,11 @@
INWORLD_API_KEY=

# Speech-to-text: set ONE of these (Soniox takes priority if both are set)
ASSEMBLY_AI_API_KEY=
SONIOX_API_KEY=

# Optional: generates images for Anki flashcards
REPLICATE_API_TOKEN=

SUPABASE_URL=
SUPABASE_SECRET_KEY=
SUPABASE_SECRET_KEY=
21 changes: 10 additions & 11 deletions backend/package-lock.json

1 change: 1 addition & 0 deletions backend/package.json
@@ -58,6 +58,7 @@
"cors": "^2.8.5",
"dotenv": "^17.2.1",
"express": "^4.22.1",
"jsonrepair": "^3.13.2",
"pino": "^10.1.0",
"uuid": "^11.1.0",
"ws": "^8.18.0"
21 changes: 19 additions & 2 deletions backend/src/__tests__/config/languages.test.ts
@@ -74,13 +74,30 @@ describe('languages config', () => {
       expect(codes).toContain('de');
     });

-    it('matches SUPPORTED_LANGUAGES keys', () => {
+    it('without provider, returns only languages without requiredSttProvider', () => {
       const codes = getSupportedLanguageCodes();
-      expect(codes.length).toBe(Object.keys(SUPPORTED_LANGUAGES).length);
       for (const code of codes) {
         expect(SUPPORTED_LANGUAGES[code]).toBeDefined();
+        expect(SUPPORTED_LANGUAGES[code].requiredSttProvider).toBeUndefined();
       }
     });
+
+    it('with soniox provider, returns all languages', () => {
+      const codes = getSupportedLanguageCodes('soniox');
+      expect(codes.length).toBe(Object.keys(SUPPORTED_LANGUAGES).length);
+      expect(codes).toContain('zh');
+      expect(codes).toContain('ja');
+      expect(codes).toContain('ko');
+      expect(codes).toContain('ru');
+    });
+
+    it('with assembly provider, excludes soniox-only languages', () => {
+      const codes = getSupportedLanguageCodes('assembly');
+      expect(codes).not.toContain('zh');
+      expect(codes).not.toContain('ja');
+      expect(codes).not.toContain('ko');
+      expect(codes).not.toContain('ru');
+    });
   });

   describe('getLanguageOptions', () => {
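
The three tests above pin down how `getSupportedLanguageCodes` is expected to filter by provider. One way to satisfy them is sketched below. The implementation is not part of this diff, so the `LanguageConfig` shape, its `name` field, and the six-language table (including `es`) are assumptions made for illustration; only `requiredSttProvider`, the provider names, and the codes `de`, `zh`, `ja`, `ko`, and `ru` come from the tests.

```typescript
type SttProvider = 'soniox' | 'assembly';

// Assumed shape; only requiredSttProvider is confirmed by the tests.
interface LanguageConfig {
  name: string;
  // When set, the language is offered only if this STT provider is active.
  requiredSttProvider?: SttProvider;
}

// Illustrative subset; the real SUPPORTED_LANGUAGES table is not shown in the diff.
const SUPPORTED_LANGUAGES: Record<string, LanguageConfig> = {
  es: { name: 'Spanish' },
  de: { name: 'German' },
  zh: { name: 'Chinese', requiredSttProvider: 'soniox' },
  ja: { name: 'Japanese', requiredSttProvider: 'soniox' },
  ko: { name: 'Korean', requiredSttProvider: 'soniox' },
  ru: { name: 'Russian', requiredSttProvider: 'soniox' },
};

function getSupportedLanguageCodes(provider?: SttProvider): string[] {
  return Object.entries(SUPPORTED_LANGUAGES)
    // Keep languages with no provider requirement, plus those whose
    // requirement matches the active provider.
    .filter(
      ([, config]) =>
        config.requiredSttProvider === undefined ||
        config.requiredSttProvider === provider,
    )
    .map(([code]) => code);
}
```

With this shape, calling the function without a provider returns only the languages every provider can handle, which is the behavior the renamed first test checks.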