feat: Add native Windows support #233

Open
kindcreator wants to merge 1 commit into mbailey:master from kindcreator:fix/native-windows-support

@kindcreator

kindcreator commented Feb 1, 2026

This commit fixes two issues preventing VoiceMode from running on native Windows:

  1. conch.py: Replace Unix-only fcntl with platform-aware file locking

    • Use msvcrt.locking() on Windows
    • Use fcntl.flock() on Unix (unchanged behavior)
  2. simple_failover.py: Add whisper.cpp /inference endpoint support

    • Try whisper.cpp's native /inference endpoint for local servers
    • Fall back to OpenAI-compatible endpoint if /inference fails
    • Maintains backward compatibility with OpenAI-compatible servers

Tested on Windows 11 with:

  • Python 3.13
  • whisper.cpp server (whisper-server.exe)
  • Kokoro TTS server

Closes #232
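
For readers following along, the platform-aware locking described in item 1 can be sketched roughly as below. This is a minimal illustration, not the code in conch.py: the helper names `try_lock` and `unlock` are hypothetical, and locking a single byte at offset 0 on Windows is an assumption made for the sketch.

```python
import os
import sys

if sys.platform == "win32":
    import msvcrt

    def try_lock(fd: int) -> bool:
        """Try to take a non-blocking exclusive lock (hypothetical helper)."""
        try:
            # msvcrt.locking() locks bytes starting at the current position,
            # so seek to a fixed offset first. One byte is enough as a token.
            os.lseek(fd, 0, os.SEEK_SET)
            msvcrt.locking(fd, msvcrt.LK_NBLCK, 1)
            return True
        except OSError:
            return False

    def unlock(fd: int) -> None:
        """Release the byte locked by try_lock()."""
        os.lseek(fd, 0, os.SEEK_SET)
        msvcrt.locking(fd, msvcrt.LK_UNLCK, 1)

else:
    import fcntl

    def try_lock(fd: int) -> bool:
        """Try to take a non-blocking exclusive lock (hypothetical helper)."""
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except OSError:
            # BlockingIOError (a subclass of OSError) means someone else holds it.
            return False

    def unlock(fd: int) -> None:
        """Release the lock taken by try_lock()."""
        fcntl.flock(fd, fcntl.LOCK_UN)
```

Both branches expose the same two functions, so calling code does not need to know which platform it runs on.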

@kindcreator
Author

@mbailey please take a look

I was finally able to run VoiceMode on Windows natively (no WSL).
Since WSL had 100500 issues, this seems to be a nice step towards full cross-platform VoiceMode availability.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@kindcreator kindcreator force-pushed the fix/native-windows-support branch 2 times, most recently from b07ffd1 to dc551d5 Compare February 1, 2026 21:43
@mbailey
Owner

mbailey commented Feb 1, 2026

That sounds awesome. I need to sort out a Windows testing environment, as I don't run Windows on any computers I own. I might need to pay for Parallels virtualisation and a Windows license.

@kindcreator
Author

Let me know if I can rent a VM or something similar for you to test on.

If it is relatively cheap, I can simply provide it as my contribution to the project.

I really want this tool to succeed as soon as possible and become an industry standard.
Let me know what you need and I'll see what I can cover.

@ai-cora
Collaborator

ai-cora commented Feb 2, 2026

Thanks for the contribution! This PR addresses two separate concerns:

  1. File locking (conch.py) - Platform-aware locking using msvcrt on Windows and fcntl on Unix
  2. Whisper endpoint (simple_failover.py) - Adding support for whisper.cpp's native /inference endpoint

We're happy to merge the file locking fix - it's a clean implementation that enables Windows support without affecting Unix behavior.

However, we'd like to discuss the whisper endpoint change separately. Our design philosophy is to support OpenAI-compatible APIs as the de facto standard interface. This keeps both the interface and implementation simple. For Unix systems, we provide builds that expose this standard endpoint, and users on other platforms can use wrappers like whisper-cpp-python or faster-whisper-server that provide OpenAI compatibility.

The current implementation tries /inference first for all local providers, which adds latency for the majority of users whose Whisper servers already expose the OpenAI-compatible endpoint.

Could you split this into two PRs? We can merge the file locking change right away, and then discuss the whisper endpoint approach separately - perhaps with a configuration option or smarter detection rather than trying both endpoints sequentially.
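
To make the "configuration option" idea concrete, here is a rough sketch of how endpoint selection could be driven by a setting instead of always probing /inference first. Everything named here is hypothetical (the `stt_endpoints` function and the `openai`, `whisper-cpp`, and `auto` mode names are not existing VoiceMode settings), and it assumes the base URL is the server root without a `/v1` suffix.

```python
# Hypothetical sketch: choose which STT endpoints to try from a config value.
OPENAI_PATH = "v1/audio/transcriptions"  # OpenAI-compatible transcription path
WHISPER_CPP_PATH = "inference"           # whisper.cpp server's native path


def stt_endpoints(base_url: str, mode: str = "openai") -> list[str]:
    """Return the endpoint URLs to try, in order, for the given mode.

    Assumes base_url is the server root (no trailing /v1).
    """
    root = base_url.rstrip("/") + "/"
    openai_url = root + OPENAI_PATH
    native_url = root + WHISPER_CPP_PATH

    if mode == "openai":
        # Default: a single request, no extra latency for standard servers.
        return [openai_url]
    if mode == "whisper-cpp":
        return [native_url]
    if mode == "auto":
        # Standard endpoint first, native endpoint only as a fallback.
        return [openai_url, native_url]
    raise ValueError(f"unknown STT endpoint mode: {mode!r}")
```

With the default mode, users whose servers already expose the OpenAI-compatible endpoint pay no extra round trip, and only users who opt in ever touch /inference.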

Thanks again for your work on Windows support!

@kindcreator
Author

kindcreator commented Feb 2, 2026

@ai-cora @mbailey should I split the PR in two so that there is a separate one for the file locking?
Or will you do it on your own?

@kindcreator
Author

kindcreator commented Feb 2, 2026

@mbailey I ran more tests today and documented the result in #239: use the LiveKit transport instead.


Windows STT Working via LiveKit Transport

Finding

Tested native Windows STT successfully using transport="auto" (default), which auto-detects LiveKit on port 7880.

Test result:

  • Mic input captured and transcribed correctly
  • STT provider: whisper-cpp (port 2022)
  • No explicit transport parameter needed
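
As an illustration of what "auto-detects LiveKit on port 7880" could look like, here is a small sketch that probes a TCP port and falls back to local capture. It is an assumption about the general approach, not VoiceMode's actual detection code; `port_is_open` and `pick_transport` are hypothetical names.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 0.25) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def pick_transport(requested: str = "auto",
                   host: str = "127.0.0.1",
                   livekit_port: int = 7880) -> str:
    """Resolve 'auto' to 'livekit' when something listens on the LiveKit port."""
    if requested != "auto":
        return requested  # an explicit choice always wins
    return "livekit" if port_is_open(host, livekit_port) else "local"
```

A successful TCP connect only shows that something is listening on the port, so a real implementation would probably want a stronger check than this.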

Root Cause Clarification

The WinError 32 temp file locking bug occurs in the local audio capture code, not in Whisper or the STT pipeline.

| Transport | How it captures audio | Windows status |
| --- | --- | --- |
| local | Writes temp WAV file | Broken (file lock) |
| livekit | Streams via WebRTC | Works |
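
For context, WinError 32 means another handle still has the file open. One common way to hit it is reopening or deleting a temp file while the handle that created it is still open. The sketch below shows a pattern that avoids that situation; it is not taken from VoiceMode's capture code, the actual cause there may differ, and `write_temp_wav` and `read_and_remove` are hypothetical names.

```python
import os
import tempfile
import wave


def write_temp_wav(pcm: bytes, sample_rate: int = 16000) -> str:
    """Write 16-bit mono PCM to a temp WAV file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # release our handle before anything else opens the path
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return path  # the wave handle is closed by the time we return


def read_and_remove(path: str) -> bytes:
    """Read the file, then delete it only after the read handle is closed."""
    try:
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

The key point is ordering: every handle is closed before the next open or delete, which is what Windows requires and what Unix happens to tolerate either way.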

Relationship to This PR

| PR #233 component | Still needed with LiveKit? |
| --- | --- |
| fcntl → msvcrt | Yes - for wait_for_conch multi-agent coordination |
| whisper /inference endpoint | Depends on whisper build - ours worked without it |

Recommendation

LiveKit transport is the cleanest Windows audio fix. The fcntl changes in this PR are still valuable for complete Windows support (multi-agent scenarios). The whisper endpoint change could be a separate config option as suggested.

@kindcreator
Author

kindcreator commented Feb 2, 2026

Once again, very loudly:

LiveKit transport is the cleanest(!) Windows audio fix.

@Newuxtreme

I'm a complete newbie, and I'm not even using it for coding. Rather, I just want to get VoiceMode enabled on Claude Desktop for normal work and usage.

Is there any way that I can get this to work for my purposes and needs?

@mbailey
Owner

mbailey commented Mar 20, 2026

It's possible now but requires some undocumented steps.

I've been building a solution you're welcome to try that supports all of the Claude platform.

https://voicemode.dev/

It's not documented and will change a lot so it's only for the curious but might give you a taste of what's coming.

I would strongly recommend you consider installing Claude Code - the terminal is not just for code. :-)


Development

Successfully merging this pull request may close these issues.

Native Windows Support: fcntl ImportError and whisper.cpp endpoint compatibility
