@mbailey please take a look I was finally able to run the |
This commit fixes two issues preventing VoiceMode from running on native Windows:

1. conch.py: Replace Unix-only fcntl with platform-aware file locking
   - Use msvcrt.locking() on Windows
   - Use fcntl.flock() on Unix (unchanged behavior)

2. simple_failover.py: Add whisper.cpp /inference endpoint support
   - Try whisper.cpp's native /inference endpoint for local servers
   - Fall back to OpenAI-compatible endpoint if /inference fails
   - Maintains backward compatibility with OpenAI-compatible servers

Tested on Windows 11 with:
- Python 3.13
- whisper.cpp server (whisper-server.exe)
- Kokoro TTS server

Closes mbailey#232

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
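For readers following the thread, the platform-aware locking described in item 1 might look roughly like the sketch below. It is not the code from conch.py: the helper names (`_lock`, `_unlock`, `exclusive_lock`) and the choice of a non-blocking lock are assumptions made for illustration. Only `msvcrt.locking()` and `fcntl.flock()` come from the commit message.

```python
import sys
from contextlib import contextmanager

if sys.platform == "win32":
    import msvcrt

    def _lock(f):
        # msvcrt.locking() locks bytes starting at the current file position,
        # so rewind first. LK_NBLCK raises OSError if the region is already locked.
        f.seek(0)
        msvcrt.locking(f.fileno(), msvcrt.LK_NBLCK, 1)

    def _unlock(f):
        f.seek(0)
        msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, 1)
else:
    import fcntl

    def _lock(f):
        # LOCK_NB makes flock() raise BlockingIOError (an OSError subclass)
        # instead of waiting when another holder has the lock.
        fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)

    def _unlock(f):
        fcntl.flock(f.fileno(), fcntl.LOCK_UN)


@contextmanager
def exclusive_lock(path):
    """Hold an exclusive, non-blocking lock on `path` (hypothetical helper)."""
    f = open(path, "a+")  # "a+" creates the file if missing without truncating it
    try:
        _lock(f)
        try:
            yield f
        finally:
            _unlock(f)
    finally:
        f.close()
```

A caller would wrap the critical section in `with exclusive_lock(lock_path): ...` and treat `OSError` as "someone else holds the lock". The Unix branch is plain `fcntl.flock()`, matching the "unchanged behavior" claim in the commit message.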
b07ffd1 to
dc551d5
|
That sounds awesome. I need to sort out a Windows testing environment, as I don't run Windows on any computers I own. I might need to pay for Parallels virtualisation and a Windows license. |
|
Let me know if I can rent a VM or something similar for you to test on. If it's relatively cheap, I can just provide it as my contribution to the project. I really want this tool to succeed ASAP and become an industry standard. |
|
Thanks for the contribution! This PR addresses two separate concerns: the platform-aware file locking in conch.py and the whisper.cpp /inference endpoint support in simple_failover.py.
We're happy to merge the file locking fix. It's a clean implementation that enables Windows support without affecting Unix behavior.
However, we'd like to discuss the whisper endpoint change separately. Our design philosophy is to support OpenAI-compatible APIs as the de facto standard interface. This keeps both the interface and implementation simple. For Unix systems, we provide builds that expose this standard endpoint, and users on other platforms can use wrappers like whisper-cpp-python or faster-whisper-server that provide OpenAI compatibility. The current implementation tries the whisper.cpp /inference endpoint first and falls back to the OpenAI-compatible endpoint if that fails.
Could you split this into two PRs? We can merge the file locking change right away, and then discuss the whisper endpoint approach separately, perhaps with a configuration option or smarter detection rather than trying both endpoints sequentially.
Thanks again for your work on Windows support! |
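To make the "configuration option or smarter detection" idea concrete, here is a minimal sketch of how the endpoint choice could be made explicit. Everything named here is hypothetical (`stt_endpoints`, the `api_style` values, the localhost check) and is not VoiceMode's actual configuration. The only paths taken as given are whisper.cpp's `/inference` from this PR and the OpenAI transcription route `/v1/audio/transcriptions`.

```python
from urllib.parse import urlsplit, urlunsplit

OPENAI_PATH = "/v1/audio/transcriptions"  # OpenAI-compatible transcription route
WHISPER_CPP_PATH = "/inference"           # whisper.cpp server's native route

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def stt_endpoints(base_url, api_style="openai"):
    """Return the ordered list of URLs to try for speech-to-text.

    api_style is a hypothetical setting:
      "openai"      -> only the OpenAI-compatible route (the default)
      "whisper_cpp" -> only whisper.cpp's native /inference route
      "auto"        -> for local servers, /inference first, then the
                       OpenAI-compatible route (the sequential behavior
                       this PR implements); remote servers get OpenAI only
    """
    parts = urlsplit(base_url)
    # Drop any path such as "/v1" so both routes are built from the server root.
    root = urlunsplit((parts.scheme, parts.netloc, "", "", ""))
    openai_url = root + OPENAI_PATH
    native_url = root + WHISPER_CPP_PATH

    if api_style == "openai":
        return [openai_url]
    if api_style == "whisper_cpp":
        return [native_url]
    if api_style == "auto":
        if parts.hostname in LOCAL_HOSTS:
            return [native_url, openai_url]
        return [openai_url]
    raise ValueError(f"unknown api_style: {api_style!r}")
```

With a default of `"openai"`, existing setups would keep making a single request, and only users who opt in to `"whisper_cpp"` or `"auto"` would hit `/inference`.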
|
@mbailey I ran more tests today and documented the result in #239: the LiveKit transport instead.
Windows STT Working via LiveKit Transport
Finding
Tested native Windows STT successfully using
Test result:
Root Cause Clarification
The
Relationship to This PR
Recommendation
LiveKit transport is the cleanest Windows audio fix. The fcntl changes in this PR are still valuable for complete Windows support (multi-agent scenarios). The whisper endpoint change could be a separate config option as suggested. |
|
Once again, very loudly: LiveKit transport is the cleanest(!) Windows audio fix. |
|
I'm a complete newbie, and I'm not even using it for coding. I just want to get VoiceMode enabled on Claude Desktop for normal work and usage. Is there any way I can get this to work for my purposes? |
|
It's possible now but requires some undocumented steps. I've been building a solution you're welcome to try that supports all of the Claude platform. It's not documented and will change a lot so it's only for the curious but might give you a taste of what's coming. I would strongly recommend you consider installing Claude Code - the terminal is not just for code. :-) |