feat: Support local transcription with WhisperKit #1
Add on-device transcription as an alternative to cloud API:
- TranscriptionProvider protocol for provider abstraction
- CloudTranscriptionProvider for existing OpenAI API
- LocalTranscriptionProvider with WhisperKit integration
- TranscriptionMode enum to switch between cloud/local
- Settings UI to select mode and download local models
- Model download management with progress tracking

Benefits:
- Offline transcription (no internet required)
- Privacy (audio stays on device)
- No API costs for local mode

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pull request overview
This PR adds local on-device transcription using WhisperKit as an alternative to the cloud-based OpenAI API. Users can now choose between cloud and local transcription modes, with multiple model options for local transcription.
Changes:
- Added TranscriptionProvider protocol with Cloud and Local implementations
- Introduced TranscriptionMode enum with cloud/local options
- Added UI for model selection, download management, and status display
- Integrated WhisperKit dependency for local transcription
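The provider abstraction listed above could look roughly like the sketch below. The PR does not show the protocol's actual requirements, so the `transcribe(audioURL:)` signature, the `name` property, the two stub types, and `makeProvider(for:)` are all assumptions made for illustration.

```swift
import Foundation

/// Mode switch; the cloud/local cases follow the options named in the PR.
enum TranscriptionMode { case cloud, local }

/// Assumed shape of the abstraction; the real method signature is not shown in the PR.
protocol TranscriptionProvider {
    var name: String { get }
    func transcribe(audioURL: URL) async throws -> String
}

/// Hypothetical stand-in for CloudTranscriptionProvider.
struct CloudStubProvider: TranscriptionProvider {
    let name = "cloud"
    func transcribe(audioURL: URL) async throws -> String { "cloud transcript (stub)" }
}

/// Hypothetical stand-in for LocalTranscriptionProvider.
struct LocalStubProvider: TranscriptionProvider {
    let name = "local"
    func transcribe(audioURL: URL) async throws -> String { "local transcript (stub)" }
}

/// Picks the provider for the selected mode, so callers depend only on the protocol.
func makeProvider(for mode: TranscriptionMode) -> TranscriptionProvider {
    switch mode {
    case .cloud: return CloudStubProvider()
    case .local: return LocalStubProvider()
    }
}
```

With this shape, switching modes only changes which provider is constructed; the recording and transcription call sites stay the same.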
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| Whisper/Services/TranscriptionProvider.swift | New protocol defining transcription interface |
| Whisper/Services/CloudTranscriptionProvider.swift | Extracted existing OpenAI API logic into provider pattern |
| Whisper/Services/LocalTranscriptionProvider.swift | New provider for WhisperKit-based local transcription |
| Whisper/Helpers/TranscriptionMode.swift | Enum defining cloud vs local transcription modes |
| Whisper/Helpers/Constants.swift | Added LocalWhisperModel enum and ModelDownloadState with model metadata |
| Whisper/AppState.swift | Added mode selection, provider switching, and model management methods |
| Whisper/Views/SettingsView.swift | Added mode picker, model management UI, and ModelRowView component |
| Whisper/Views/MenuBarView.swift | Updated status indicators to reflect download state and transcription mode |
| Whisper.xcodeproj/project.pbxproj | Added WhisperKit package dependency |
| .gitignore | Added entries for build artifacts and documentation files |
```swift
func validateAPIKey(_ apiKey: String) async -> Bool {
    let originalKey = KeychainHelper.shared.getAPIKey()

    _ = KeychainHelper.shared.save(apiKey: apiKey)

    guard let url = URL(string: "https://api.openai.com/v1/models") else {
        if let original = originalKey {
            _ = KeychainHelper.shared.save(apiKey: original)
        }
        return false
    }

    var request = URLRequest(url: url)
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")

    do {
        let (_, response) = try await URLSession.shared.data(for: request)
        if let httpResponse = response as? HTTPURLResponse {
            let isValid = httpResponse.statusCode == 200
            if !isValid, let original = originalKey {
                _ = KeychainHelper.shared.save(apiKey: original)
            }
            return isValid
        }
    } catch {
        if let original = originalKey {
            _ = KeychainHelper.shared.save(apiKey: original)
        }
    }

    return false
}
```
The validateAPIKey function temporarily saves the new API key to the keychain before validation (line 88), which causes a side effect even if validation fails. If the validation succeeds, the key remains saved, but if it fails or an exception occurs, the function attempts to restore the original key. This approach is problematic because it modifies the keychain state during validation. Consider validating the key without saving it first, and only save it after successful validation in the caller (AppState.updateAPIKey).
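A minimal sketch of the side-effect-free approach: the request already carries the candidate key in its `Authorization` header, so validation does not need the keychain at all. The helper name `makeValidationRequest(apiKey:)` is hypothetical; the endpoint and header come from the snippet under review.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Builds the validation request from the candidate key alone (no keychain access).
func makeValidationRequest(apiKey: String) -> URLRequest? {
    guard let url = URL(string: "https://api.openai.com/v1/models") else { return nil }
    var request = URLRequest(url: url)
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    return request
}

/// Returns true only for an HTTP 200 response; never writes to the keychain.
func validateAPIKey(_ apiKey: String) async -> Bool {
    guard let request = makeValidationRequest(apiKey: apiKey) else { return false }
    do {
        let (_, response) = try await URLSession.shared.data(for: request)
        return (response as? HTTPURLResponse)?.statusCode == 200
    } catch {
        // Network failures count as "not validated"; nothing needs restoring.
        return false
    }
}
```

The caller (for example `AppState.updateAPIKey`) would then call `KeychainHelper.shared.save(apiKey:)` only after `validateAPIKey` returns `true`, which removes every restore-the-original-key branch.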
```swift
@Published private(set) var downloadStates: [LocalWhisperModel: ModelDownloadState] = [:]
@Published private(set) var isInitializing = false

private var whisperKit: WhisperKit?
```
The LocalTranscriptionProvider stores a single WhisperKit instance but doesn't track which model it's initialized with. If a user downloads multiple models and switches between them, the transcription will continue using the previously initialized model instead of the newly selected one. Consider storing the currently loaded model identifier and reinitializing WhisperKit when the selected model changes.
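One way to picture the suggestion is a holder that remembers which model its engine was built for and rebuilds only when the selection changes. This is a sketch under assumptions: `FakeEngine` stands in for the WhisperKit instance, `ModelHolder` and `loadCount` are hypothetical, and the enum is a stand-in for the PR's `LocalWhisperModel`. The `currentLoadedModel` name matches the one used in the follow-up commit.

```swift
/// Stand-in for the PR's LocalWhisperModel enum.
enum LocalWhisperModel: String { case base, small }

/// Hypothetical stand-in for an initialized WhisperKit instance.
final class FakeEngine {
    let modelName: String
    init(modelName: String) { self.modelName = modelName }
}

final class ModelHolder {
    private(set) var engine: FakeEngine?
    private(set) var currentLoadedModel: LocalWhisperModel?
    private(set) var loadCount = 0  // counts (re)initializations, for illustration only

    /// Reuses the engine when it matches the requested model, otherwise rebuilds it.
    func engine(for model: LocalWhisperModel) -> FakeEngine {
        if let existing = engine, currentLoadedModel == model {
            return existing
        }
        let fresh = FakeEngine(modelName: model.rawValue)
        loadCount += 1
        engine = fresh
        currentLoadedModel = model
        return fresh
    }
}
```

Requesting the same model twice reuses the engine, while switching from `.base` to `.small` triggers a second initialization.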
```swift
        )
    )

    observeProgress(for: model, kit: kit)
```
The progress observation is set up after the WhisperKit initialization completes (line 74), but the download happens during initialization (lines 63-72). This means the progress updates won't be captured during the actual download, and the UI won't show accurate download progress. The progress observation should be set up before calling WhisperKit initialization.
```swift
private var modelStatusIndicator: some View {
    HStack(spacing: 6) {
        Circle()
            .fill(modelStatusColor)
            .frame(width: 6, height: 6)
            .shadow(color: modelStatusColor.opacity(0.4), radius: 3)

        Text(appState.modelDownloadState.statusText)
            .font(.system(size: 10, weight: .medium))
            .foregroundColor(.secondary)
    }
}

private var modelStatusColor: Color {
    switch appState.modelDownloadState {
    case .downloaded:
        return .green
    case .downloading:
        return .blue
    case .notDownloaded:
        return .orange
    case .error:
        return .red
    }
}
```
The modelStatusIndicator view and modelStatusColor computed property are defined but never used in the SettingsView. These can be removed to reduce code clutter, or if they were intended to be used somewhere, they should be integrated into the UI.
Suggested change: delete the `modelStatusIndicator` and `modelStatusColor` declarations quoted above.
```swift
func initializeIfNeeded(for model: LocalWhisperModel) async throws {
    guard downloadStates[model] != .downloaded else { return }

    isInitializing = true
    defer { isInitializing = false }

    downloadStates[model] = .downloading(progress: 0, bytesDownloaded: 0, bytesTotal: 0)

    do {
        let kit = try await WhisperKit(
            WhisperKitConfig(
                model: model.fileName,
                verbose: false,
                logLevel: .none,
                prewarm: true,
                load: true,
                download: true
            )
        )

        observeProgress(for: model, kit: kit)

        whisperKit = kit
        downloadStates[model] = .downloaded
    } catch {
        downloadStates[model] = .error(error.localizedDescription)
        throw LocalTranscriptionError.initializationFailed(error.localizedDescription)
    }
}
```
There's no protection against concurrent model downloads or initializations. If multiple download requests are made simultaneously (e.g., user clicks download button multiple times, or tries to transcribe while a download is in progress), multiple WhisperKit instances could be created concurrently, leading to race conditions and resource conflicts. Consider adding a guard to check if a download/initialization is already in progress for a specific model.
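The guard described here can be sketched as a small per-model in-flight set. `InFlightGate` and its methods are hypothetical names, the enum is a stand-in for the PR's `LocalWhisperModel`, and the sketch assumes all calls come from a single actor (for example the main actor), so it adds no locking of its own.

```swift
/// Stand-in for the PR's LocalWhisperModel enum.
enum LocalWhisperModel { case base, small }

/// Tracks which models currently have a download or initialization running.
/// Assumption: every call is made from the same actor, so no lock is needed.
final class InFlightGate {
    private var inFlight: Set<LocalWhisperModel> = []

    /// Returns true if the caller may start work, false if work is already running.
    func begin(_ model: LocalWhisperModel) -> Bool {
        return inFlight.insert(model).inserted
    }

    /// Marks the work for this model as finished, successfully or not.
    func end(_ model: LocalWhisperModel) {
        _ = inFlight.remove(model)
    }
}
```

Inside `initializeIfNeeded(for:)`, this would be used as `guard gate.begin(model) else { return }` followed by `defer { gate.end(model) }`, so a second click on the download button returns early instead of creating another WhisperKit instance.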
```swift
case .cloud:
    return hasAPIKey
case .local:
    return modelDownloadState.isReady || modelDownloadState == .notDownloaded
```
The canTranscribe property returns true for local mode when modelDownloadState is .notDownloaded, which would allow users to attempt transcription before downloading a model. This will cause transcription to fail. The condition should only return true when the model is actually ready (.downloaded state), not when it's not downloaded.
Suggested change:

```diff
- return modelDownloadState.isReady || modelDownloadState == .notDownloaded
+ return modelDownloadState.isReady
```
```swift
    whisperKit = nil
}

progressCancellables.removeAll()
```
The progressCancellables are removed for all models (line 144) when deleting a single model, which will stop progress tracking for any ongoing downloads of other models. This should only remove the cancellable for the specific model being deleted, not all cancellables.
Suggested change:

```diff
- progressCancellables.removeAll()
+ progressCancellables[model] = nil
```
```swift
var fileName: String {
    switch self {
    case .base: return "openai_whisper-base"
    case .small: return "openai_whisper-tiny"
```
The fileName for the .small model case is incorrectly set to "openai_whisper-tiny" but should be "openai_whisper-small" to match the enum case name and the display name "Small". This mismatch will cause the wrong model to be downloaded when users select the "Small" model.
Suggested change:

```diff
- case .small: return "openai_whisper-tiny"
+ case .small: return "openai_whisper-small"
```
```swift
}

func initializeIfNeeded(for model: LocalWhisperModel) async throws {
    guard downloadStates[model] != .downloaded else { return }
```
The guard statement returns early if the model is already downloaded, but this prevents initialization when the model is downloaded but WhisperKit hasn't been initialized yet. This will cause transcription to fail if a model is downloaded but the WhisperKit instance wasn't created. The check should verify if whisperKit is nil OR if the model state is not downloaded.
Suggested change:

```diff
- guard downloadStates[model] != .downloaded else { return }
+ guard whisperKit == nil || downloadStates[model] != .downloaded else { return }
```
```swift
if Constants.selectedLocalModel == model, whisperKit != nil {
    whisperKit = nil
}
```
The whisperKit instance is set to nil twice in this function - once at line 141 and again at line 154. The first check at line 140 is redundant since the second check at line 153 already handles the same condition. Remove the first if block (lines 140-142) to eliminate the duplicate logic.
Suggested change:

```diff
- if Constants.selectedLocalModel == model, whisperKit != nil {
-     whisperKit = nil
- }
```
Fixes the following issues identified in PR review:
- Track which model WhisperKit is initialized for (currentLoadedModel)
- Add concurrency protection to prevent race conditions during downloads
- Fix progress observation timing (observe before initialization)
- Only remove specific model's progressCancellable on deletion
- Fix canTranscribe to require model to be ready (not just notDownloaded)
- Fix .small fileName: "openai_whisper-small" (was "openai_whisper-tiny")
- Fix initializeIfNeeded guard to properly check whisperKit instance
- Add proper error message when trying to record without downloaded model
- Remove unused dead code (modelStatusIndicator, modelStatusColor)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
This PR adds on-device transcription using WhisperKit as an alternative to the OpenAI cloud API.
Features added
🎛️ Transcription mode selection
Users can now choose between:
📦 Multiple models available
Four WhisperKit models to choose from, depending on needs:
🔧 Model management
Benefits of local mode
Architecture
- TranscriptionProvider: protocol that abstracts over providers
- CloudTranscriptionProvider: provider for the existing OpenAI API
- LocalTranscriptionProvider: WhisperKit provider with download management
- TranscriptionMode: enum for switching between modes
Backward compatibility
Cloud mode remains the default, so behavior does not change for existing users.
🤖 Developed with Claude Code