
feat/azure-asr-support#175

Open
ismailariyan wants to merge 10 commits into THU-MAIC:main from ismailariyan:feat/azure-asr
@ismailariyan
Summary

Add Azure STT (Speech-to-Text) support using Azure's Fast Transcription REST API. This is a minimal, non-breaking addition that follows the existing ASR provider pattern.

Changes

  • lib/audio/types.ts: Add azure-asr to ASRProviderId union type
  • lib/audio/constants.ts: Add Azure provider configuration with supported languages and formats
  • lib/audio/asr-providers.ts: Implement transcribeAzureASR() function for Fast Transcription API
  • lib/store/settings.ts: Add azure-asr to default ASR config
  • lib/i18n/settings.ts: Add zh-CN and en-US translations for "Azure STT"
  • lib/server/provider-config.ts: Add ASR_AZURE env mapping
  • components/settings/audio-settings.tsx: Add Azure to provider name map
  • components/settings/index.tsx: Add Azure to provider name map
  • .env.example: Add ASR_AZURE_API_KEY and ASR_AZURE_BASE_URL

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Refactoring (no functional changes)
  • CI/CD or build changes

Verification

Steps to reproduce / test

  1. Configure ASR_AZURE_API_KEY and ASR_AZURE_BASE_URL in .env (e.g., https://eastus2.stt.speech.microsoft.com or https://eastus2.api.cognitive.microsoft.com)
  2. Select "Azure STT" as the ASR provider in settings
  3. Record audio using the microphone button

What you personally verified

  • Azure URL backward compatibility (old .stt.speech.microsoft.com format auto-converts to new .api.cognitive.microsoft.com format)
  • WebM audio format works with Azure Fast Transcription API
  • Language selection passes correct locale to Azure API
  • Error messages display correctly when API key or URL is misconfigured
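
The URL backward-compatibility check above could be implemented along these lines. This is a minimal sketch, not the code in this PR: the helper name `normalizeAzureBaseUrl` and the exact rewrite rule are assumptions based only on the two hostname formats mentioned in the description.

```typescript
// Hypothetical helper (name and behavior are assumptions, not taken from
// lib/audio/asr-providers.ts). It rewrites the older
// <region>.stt.speech.microsoft.com host to the newer
// <region>.api.cognitive.microsoft.com host and strips trailing slashes.
function normalizeAzureBaseUrl(baseUrl: string): string {
  // Remove surrounding whitespace and any trailing slashes.
  const trimmed = baseUrl.trim().replace(/\/+$/, '');
  // Keep the scheme and region, swap only the domain suffix.
  return trimmed.replace(
    /^(https?:\/\/)([a-z0-9-]+)\.stt\.speech\.microsoft\.com/i,
    '$1$2.api.cognitive.microsoft.com',
  );
}
```

A URL already in the new format passes through unchanged, so the helper can be applied to every configured `ASR_AZURE_BASE_URL` value without checking the format first.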

Evidence

  • Uses Azure Fast Transcription API v2025-10-15
  • Supports WebM, OGG, WAV, MP3, FLAC, M4A formats
  • Supports 12 languages: zh, en, ja, ko, de, fr, es, it, pt, ru, ar, hi
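
For reviewers unfamiliar with the Fast Transcription API, the request shape is roughly as sketched below. The function and type names are hypothetical and do not come from this PR. The endpoint path, the `Ocp-Apim-Subscription-Key` header, the `definition` form field, and the `combinedPhrases` response field reflect my understanding of Azure's documentation and should be checked against the docs for the API version in use.

```typescript
// Hypothetical types and helpers for illustration only; the actual
// transcribeAzureASR() implementation in this PR may differ.
interface FastTranscriptionParts {
  url: string;
  headers: Record<string, string>;
  // JSON string to send as the multipart "definition" field,
  // alongside the recorded file in the "audio" field.
  definition: string;
}

function buildFastTranscriptionParts(
  baseUrl: string,
  apiKey: string,
  locale: string,
  apiVersion: string = '2025-10-15', // version named in this PR
): FastTranscriptionParts {
  const root = baseUrl.replace(/\/+$/, '');
  return {
    url: `${root}/speechtotext/transcriptions:transcribe?api-version=${apiVersion}`,
    headers: { 'Ocp-Apim-Subscription-Key': apiKey },
    definition: JSON.stringify({ locales: [locale] }),
  };
}

// Assumed response shape: the full transcript is carried in combinedPhrases.
function extractTranscript(response: { combinedPhrases?: { text: string }[] }): string {
  return (response.combinedPhrases ?? [])
    .map((phrase) => phrase.text)
    .join(' ')
    .trim();
}
```

A caller would append the audio blob and `definition` to a `FormData` body and POST it to `url` with `headers`, then pass the parsed JSON response to `extractTranscript`.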

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have added/updated documentation as needed
  • My changes do not introduce new warnings
  • CI passes (pnpm check && pnpm lint && npx tsc --noEmit)
  • Manually tested locally
  • Screenshots / recordings attached (if UI changes)

Contributor

@wyuc left a comment

Settings page crashes when Azure STT is selected — asrProvider.models is undefined.

Add models: [] and defaultModelId: '' to the azure-asr config in constants.ts. Other providers without model selection (like browser-native) declare these explicitly.
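
A minimal sketch of the suggested shape, assuming a simplified provider interface. The interface and constant names here are hypothetical, and the real type in lib/audio/constants.ts likely has more fields (languages, formats, and so on).

```typescript
// Hypothetical, simplified provider config type for illustration only.
interface ASRProviderConfigSketch {
  id: string;
  name: string;
  models: { id: string; name: string }[];
  defaultModelId: string;
}

// Declaring models and defaultModelId explicitly means the settings page
// can read asrProvider.models without hitting undefined.
const azureAsrSketch: ASRProviderConfigSketch = {
  id: 'azure-asr',
  name: 'Azure STT',
  models: [],
  defaultModelId: '',
};

// The settings UI can then decide whether to render a model picker.
function hasModelSelection(provider: ASRProviderConfigSketch): boolean {
  return provider.models.length > 0;
}
```

With an empty `models` array, `hasModelSelection(azureAsrSketch)` is false and the model dropdown can simply be hidden for this provider.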

@ismailariyan requested a review from wyuc on April 2, 2026 17:54
@ismailariyan
Author

Settings page crashes when Azure STT is selected — asrProvider.models is undefined.

Add models: [] and defaultModelId: '' to the azure-asr config in constants.ts. Other providers without model selection (like browser-native) declare these explicitly.

I have fixed the bug, @wyuc.
