Prism's speech backends on some platforms #12

diamondStar35 · 2026-04-09T17:20:14Z

diamondStar35
Apr 9, 2026

Hello.

First of all, I'm so sorry for this ambiguous title, I just couldn't really find something better to use as a title for the issues or the questions I have.

I've started using Prism recently on DotNet, and it's absolutely awesome! Especially because it's cross-platform and supports a lot of backends.

One note before describing my issues: I am not using the Prismatoid wrapper, for one reason. The wrapper only targets DotNet 10, and my project targets both DotNet 10 and DotNet 4.7.2, so I had to create a minimal wrapper that has the things I need.

My experience with Prism on Windows is awesome. There are just a few minor issues and questions here.

Why do I see UIA as a supported speech backend? Is that an intended behavior?
I know that somebody has opened a pull request that talks about ZDSR and some issues related to it, but I am not really sure if this is also related. My issue is that when people use my project, which by default uses "Context.AcquireBest()" to choose an automatic backend with ZDSR running, they get no speech at all.
I've noticed that One Core backend is much faster than SAPI. I don't mean responsiveness here, but rather the speech rate. A speech rate of 0.5 on One Core makes it behave like it's set to 2.0, or something like the rate boost in NVDA, if you get the idea.
When using the SAPI backend, my project occasionally hangs when navigating menus. I am not sure if the SAPI backend is synchronous or not, but it never happens with any other backend. That's why I'm asking here: I think that behavior is also related to Prism itself and how it handles that.

On Linux, the situation is more interesting. I mentioned before that my project by default tries to acquire the best available backend, but with that setup, people also get no speech on Linux. I don't know what the reason for that is or if there's something to be done on Linux. I haven't tried changing the backend though, because I don't even know what other backends are available, and because there is no speech.

I'm just putting it here so if anyone has faced a similar issue or something needs to be done.

I apologize if this is too long. I appreciate any help with this.

Regards.

ethindp · 2026-04-09T17:48:48Z

ethindp
Apr 9, 2026
Maintainer

Hi there,

Nah, thank you for opening the discussion and asking your questions! There is no such thing as a stupid question, so feel free to ask away!

To answer yours:

Why do I see UIA as a supported speech backend? Is that an intended behavior?

Yes, this is intended. UIA is technically not a TTS/screen reader backend in the typical sense. The UIA backend uses UI Automation to communicate with the screen reader, which effectively makes it screen reader agnostic: it works on anything that can speak UIA, including Narrator. The downside of UIA (and the reason the other backends are always preferred over it) is cost: speaking through the other backends is much cheaper and makes it possible to communicate with your users even when your app is not focused. UIA is a lot more expensive because it needs to instantiate multiple COM objects, create a separate worker thread and window, etc. That said, UIA is pretty good and is used by things like OSARA.

I know that somebody has opened a pull request that talks about ZDSR and some issues related to it, but I am not really sure if this is also related. My issue is that when people use my project, which by default uses "Context.AcquireBest()" to choose an automatic backend with ZDSR running, they get no speech at all.

This is very concerning to me. It may be that once I merge PR #8 this issue will get resolved. I haven't tested ZDSR all that much (I don't have a license for it) so I wasn't aware it was dysfunctional, and I apologize.

I've noticed that One Core backend is much faster than SAPI. I don't mean responsiveness here, but rather the speech rate. A speech rate of 0.5 on One Core makes it behave like it's set to 2.0, or something like the rate boost in NVDA, if you get the idea.

Yeah, someone else privately reported this to me. I'm honestly not sure why it happens, and I didn't fix it because that person provided little detail and I wasn't sure if it was their code or Prism. I may have used the wrong values when converting the ranges to [0.0, 1.0]. What makes it incredibly annoying is that Microsoft is extremely vague and contradictory as to the actual valid values, so I had to improvise. All they say about it is:

The tempo, relative to the default rate of the selected speech synthesis engine (voice).
This value can range from 0.5 (half the default rate) to 6.0 (6x the default rate), inclusive. The default value is 1.0 (the "normal" speaking rate for the current voice).

But, of course, they then go on to say:

Some voices have minimum speaking rates faster than 0.5 and maximum speaking rates slower than 6.0.

Speaking rate cannot be directly translated to words-per-minute because each voice and language can have a different default speaking rate.

To say this made the OneCore backend implementation frustrating would be quite the understatement. So fixing this may be one of those long-term "let's just tinker with it" bug-fix projects, since Microsoft's own words are contradictory.
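For illustration only, here is how a normalized [0.0, 1.0] rate could be mapped onto the documented 0.5 to 6.0 range. How Prism actually performs this conversion is not stated in this thread, so both functions and all names below are hypothetical. The sketch shows why the choice matters: a straight linear stretch sends 0.5 to 3.25x the default rate, while a piecewise mapping that pins 0.5 to 1.0x keeps the midpoint at the voice's normal speed.

```python
# Documented bounds for the OneCore speaking rate (see the quoted text above).
ONECORE_MIN = 0.5
ONECORE_DEFAULT = 1.0
ONECORE_MAX = 6.0


def _clamp01(value: float) -> float:
    """Keep the normalized rate inside [0.0, 1.0]."""
    return min(max(value, 0.0), 1.0)


def linear_rate(normalized: float) -> float:
    """Naive mapping (hypothetical): stretch [0, 1] evenly across [0.5, 6.0]."""
    n = _clamp01(normalized)
    return ONECORE_MIN + n * (ONECORE_MAX - ONECORE_MIN)


def piecewise_rate(normalized: float) -> float:
    """Alternative mapping (hypothetical): 0.5 lands exactly on the default rate."""
    n = _clamp01(normalized)
    if n <= 0.5:
        # Lower half covers [0.5, 1.0].
        return ONECORE_MIN + (n / 0.5) * (ONECORE_DEFAULT - ONECORE_MIN)
    # Upper half covers [1.0, 6.0].
    return ONECORE_DEFAULT + ((n - 0.5) / 0.5) * (ONECORE_MAX - ONECORE_DEFAULT)


if __name__ == "__main__":
    for n in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(n, linear_rate(n), piecewise_rate(n))
```

Because some voices clamp to a narrower range than 0.5 to 6.0, neither mapping can guarantee identical perceived speed across voices.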

When using the SAPI backend, my project occasionally hangs when navigating menus. I am not sure if the SAPI backend is synchronous or not, but it never happens with any other backend. That's why I'm asking here: I think that behavior is also related to Prism itself and how it handles that.

Yeah, SAPI has a lot of thread locking going on internally to make it reasonably thread-safe. I had hoped that this wouldn't cause any issues, but I may need to revisit it. Can you provide more detail as to when the hang happens? Or is it random? A small test program would also be appreciated.

On Linux, the situation is more interesting. I mentioned before that my project by default tries to acquire the best available backend, but with that setup, people also get no speech on Linux. I don't know what the reason for that is or if there's something to be done on Linux. I haven't tried changing the backend though, because I don't even know what other backends are available, and because there is no speech.

Linux has only Orca and Speech Dispatcher backends available. If you use prism_backend_get_features you can test backend availability. If it says that Orca is available but the backend doesn't speak when used, that's definitely a bug. Same for Speech Dispatcher.

diamondStar35 · 2026-04-09T18:05:21Z

diamondStar35
Apr 9, 2026
Author

Hi.

Thanks so much for your answers. I personally made it so that UIA is filtered and excluded from speech backends, because it would be confusing for people, but thank you for your explanation.

For SAPI, it feels like the program is trying to wait for the speech to start to process the next event. Like I said it doesn't happen to me with any other backend, so that's strange to me. Projects that implement SAPI support, such as NVGT, I think do not have this thread locking problem.

It is not actually random for SAPI. It's almost consistent, but the interesting part is that the program does not hang indefinitely when SAPI is speaking. It only happens when it tries to speak the next phrase.

Linux has only Orca and Speech Dispatcher backends available. If you use prism_backend_get_features you can test backend availability. If it says that Orca is available but the backend doesn't speak when used, that's definitely a bug. Same for Speech Dispatcher.

Unfortunately I haven't done much testing on Linux, but I will write a small program that logs the available backends and tries each of them to confirm this.

ethindp · 2026-04-09T18:17:00Z

ethindp
Apr 9, 2026
Maintainer

For SAPI, it feels like the program is trying to wait for the speech to start to process the next event. Like I said it doesn't happen to me with any other backend, so that's strange to me.

I will definitely look into this because, although SAPI is written to be thread-safe, it should definitely not be causing this level of lock contention. I've already initiated the 0.11.2 release workflow so we can incrementally test these changes. Once that's out, can you see if most of your issues have been resolved? I will dig into SAPI and see what I can do about it.

ethindp · 2026-04-09T18:28:35Z

ethindp
Apr 9, 2026
Maintainer

@diamondStar35 I may have a fix for SAPI, but I would like to test it with your setup first. Can you provide me some kind of MRE project I can play with?

diamondStar35 · 2026-04-09T18:32:00Z

diamondStar35
Apr 9, 2026
Author

@ethindp

I may have a fix for SAPI, but I would like to test it with your setup first. Can you provide me some kind of MRE project I can play with?

I use it with one of my projects, called Top Speed. It is a racing audio game written in DotNet and uses this library. Unfortunately I do not have a minimal program yet, but if you would like me to do that I could try to do it.

In the meantime, if you want to test how SAPI behaves in my project, you can grab it from the following link: https://github.com/diamondStar35/top_speed/releases/download/release-build/TopSpeed-windows-x64-Release-v-2026.4.9.3.zip

diamondStar35 · 2026-04-09T18:33:51Z

diamondStar35
Apr 9, 2026
Author

Also, speaking of Linux and Prism: it seems the Linux build has to be revisited, because I've identified the issue.

I've created a minimal program that uses Prism, enumerates the available backends, logs all supported backends and tries to use each of them. However, the program couldn't even be started because of the following:

2026-04-09 18:24:39.556 UTC
Prism probe started.

2026-04-09 18:24:39.607 UTC
Base directory: /home/eyad/Downloads/publish/

2026-04-09 18:24:39.609 UTC
Installed Prism native library resolver.

2026-04-09 18:24:39.610 UTC
Resolving Prism native library from: /home/eyad/Downloads/publish/libprism.so

2026-04-09 18:24:39.611 UTC
Failed to load libprism.so from the current directory.

2026-04-09 18:24:39.645 UTC
Probe failed.

2026-04-09 18:24:39.717 UTC
System.DllNotFoundException: Unable to load shared library 'prism' or one of its dependencies. In order to help diagnose loading problems, consider using a tool like strace. If you're using glibc, consider setting the LD_DEBUG environment variable:
/home/eyad/Downloads/publish/prism.so: cannot open shared object file: No such file or directory
/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /home/eyad/Downloads/publish/libprism.so)
libprism.so: cannot open shared object file: No such file or directory
/home/eyad/Downloads/publish/prism: cannot open shared object file: No such file or directory
/home/eyad/Downloads/publish/libprism: cannot open shared object file: No such file or directory

at TopSpeed.Speech.Prism.LinuxMethods.prism_config_init()
at TopSpeed.Speech.Prism.LinuxMethods.prism_config_init()
at TopSpeed.Speech.Prism.LinuxMethods.ConfigInit() in D:\top_speed\top_speed_net\TopSpeed\Speech\Prism\LinuxMethods.cs:line 131
at TopSpeed.Speech.Prism.Native.ConfigInit() in D:\top_speed\top_speed_net\TopSpeed\Speech\Prism\Native.cs:line 24
at TopSpeed.Speech.Prism.Context..ctor() in D:\top_speed\top_speed_net\TopSpeed\Speech\Prism\Context.cs:line 12
at Program.Main() in D:\top_speed\top_speed_net\PrismProbe\Program.cs:line 34

ethindp · 2026-04-09T18:40:01Z

ethindp
Apr 9, 2026
Maintainer

@diamondStar35 The reason Prism isn't working is right there in the error: your libstdc++ is too old. The system libstdc++.so.6 does not provide GLIBCXX_3.4.32, which libprism.so requires. You need the libstdc++ that ships with GCC 13 or later. I could maybe try to get Prism building against an older version of Ubuntu (right now it uses 24.04) but that may be a problem (especially for ARM builds).
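The loader message in the log names the missing symbol version, GLIBCXX_3.4.32. One way to confirm what a given libstdc++.so.6 provides is to scan the file for GLIBCXX_ version strings, much like running `strings` on it and filtering the output. The following is a minimal sketch under that assumption; the helper names are made up for illustration, and the library path is the one shown in the log above.

```python
import re
from pathlib import Path

# Matches version tags such as GLIBCXX_3.4.32 but not names like GLIBCXX_FORCE_NEW.
_VERSION_RE = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)*)")


def glibcxx_versions(blob: bytes) -> list[tuple[int, ...]]:
    """Return the sorted, de-duplicated GLIBCXX versions found in the raw bytes."""
    found = {
        tuple(int(part) for part in match.group(1).decode().split("."))
        for match in _VERSION_RE.finditer(blob)
    }
    return sorted(found)


def provides(blob: bytes, required: str) -> bool:
    """True if the exact version tag (e.g. '3.4.32') appears in the library."""
    wanted = tuple(int(part) for part in required.split("."))
    return wanted in glibcxx_versions(blob)


if __name__ == "__main__":
    lib = Path("/lib/x86_64-linux-gnu/libstdc++.so.6")  # path taken from the log
    if lib.exists():
        data = lib.read_bytes()
        versions = glibcxx_versions(data)
        if versions:
            print("newest GLIBCXX:", ".".join(map(str, versions[-1])))
        print("has 3.4.32:", provides(data, "3.4.32"))
```

Versions are compared as integer tuples so that 3.4.30 sorts after 3.4.9, which a plain string sort would get wrong.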

diamondStar35 · 2026-04-09T18:41:46Z

diamondStar35
Apr 9, 2026
Author

The reason Prism isn't working is right there in the error: your libstdc++ is too old. The system libstdc++.so.6 does not provide GLIBCXX_3.4.32, which libprism.so requires. You need the libstdc++ that ships with GCC 13 or later. I could maybe try to get Prism building against an older version of Ubuntu (right now it uses 24.04) but that may be a problem (especially for ARM builds).

Yes, I am aware of that, and that's why I've shared it. It requires a newer version of Ubuntu, but this is a problem for many people, because many still use Ubuntu 22.04 or similar.

ethindp · 2026-04-09T19:17:33Z

ethindp
Apr 9, 2026
Maintainer

@diamondStar35 I'm not sure about lowering the Ubuntu version. Ubuntu 22.04 has GCC 4:11.2.0-1ubuntu1 while 24.04 has 4:13.2.0-7ubuntu1, and Prism requires C++23 to compile.

diamondStar35 · 2026-04-09T19:22:10Z

diamondStar35
Apr 9, 2026
Author

But you can actually install GCC 13 on Ubuntu 22.04 if that is required, if I'm not wrong.

Also, I am trying to compile it on an actual Ubuntu 22.04 machine and will share the details soon if it works.

diamondStar35 · 2026-04-09T19:33:28Z

diamondStar35
Apr 9, 2026
Author

@ethindp
This is the result of trying the compiled library on Linux after building it on Ubuntu 22.04. The commands used to build the library were:

cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc-13 -DCMAKE_CXX_COMPILER=g++-13

cmake --build build -j

2026-04-09 19:27:16.461 UTC
AcquireBest() failed.

2026-04-09 19:27:16.567 UTC
TopSpeed.Speech.Prism.PrismException: Already initialized
at TopSpeed.Speech.Prism.Backend.ThrowIfError(Error error) in D:\top_speed\top_speed_net\TopSpeed\Speech\Prism\Backend.cs:line 168
at TopSpeed.Speech.Prism.Backend.Initialize() in D:\top_speed\top_speed_net\TopSpeed\Speech\Prism\Backend.cs:line 106
at TopSpeed.Speech.Prism.Context.OpenBackend(IntPtr handle, UInt64 requestedId) in D:\top_speed\top_speed_net\TopSpeed\Speech\Prism\Context.cs:line 82
at TopSpeed.Speech.Prism.Context.AcquireBest() in D:\top_speed\top_speed_net\TopSpeed\Speech\Prism\Context.cs:line 58
at Program.TryAcquireBest(Context context) in D:\top_speed\top_speed_net\PrismProbe\Program.cs:line 65

2026-04-09 19:27:16.569 UTC
Trying backend by id: Orca

2026-04-09 19:27:16.572 UTC
Backend initialized successfully: Orca

2026-04-09 19:27:16.572 UTC
Backend name: Orca

2026-04-09 19:27:16.572 UTC
Backend requested id: 0x10AA1FC05A17F96C

2026-04-09 19:27:16.579 UTC
Backend features: Speak, Output, Stop

2026-04-09 19:27:16.580 UTC
Supports Speak: True

2026-04-09 19:27:16.580 UTC
Supports Output: True

2026-04-09 19:27:16.581 UTC
Supports Braille: False

2026-04-09 19:27:16.581 UTC
Supports Stop: True

2026-04-09 19:27:16.581 UTC
Trying sample text: Sample text from backend Orca.

2026-04-09 19:27:16.585 UTC
Backend failed: Orca

2026-04-09 19:27:16.585 UTC
TopSpeed.Speech.Prism.PrismException: Speak failure
at TopSpeed.Speech.Prism.Backend.ThrowIfError(Error error) in D:\top_speed\top_speed_net\TopSpeed\Speech\Prism\Backend.cs:line 168
at TopSpeed.Speech.Prism.Backend.Output(String text, Boolean interrupt) in D:\top_speed\top_speed_net\TopSpeed\Speech\Prism\Backend.cs:line 124
at Program.TrySpeakSample(Backend backend, String text) in D:\top_speed\top_speed_net\PrismProbe\Program.cs:line 121
at Program.TrySupportedBackends(Context context, IReadOnlyList`1 backends) in D:\top_speed\top_speed_net\PrismProbe\Program.cs:line 94

The same happens with Speech Dispatcher. The Prism exception does not give any details on why it failed to speak.

ethindp · 2026-04-09T19:43:20Z

ethindp
Apr 9, 2026
Maintainer

@diamondStar35 So, acquire_best returns an already initialized backend, and create/create_best does not. The docs explicitly call this out and you should treat an already initialized error as a sane condition: it does not indicate an actual problem, and backend initialization is done in such a way that it can only be done once unless the backend is explicitly freed first.
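In wrapper code, that advice could translate to something like the sketch below. The error codes, the `FakeBackend` class and `ensure_initialized` are hypothetical stand-ins and not Prism's real API; the only point is that an "already initialized" result is treated as success while every other error still propagates.

```python
from enum import Enum, auto


class Error(Enum):
    """Hypothetical error codes, loosely modeled on the messages in the logs above."""
    OK = auto()
    ALREADY_INITIALIZED = auto()
    SPEAK_FAILURE = auto()


class BackendError(Exception):
    """Raised for errors that really do indicate a problem."""

    def __init__(self, error: Error):
        super().__init__(error.name)
        self.error = error


class FakeBackend:
    """Stand-in for a backend handle; initialization succeeds only once."""

    def __init__(self, initialized: bool = False):
        self.initialized = initialized

    def initialize(self) -> Error:
        if self.initialized:
            return Error.ALREADY_INITIALIZED
        self.initialized = True
        return Error.OK


def ensure_initialized(backend) -> None:
    """Initialize the backend, treating 'already initialized' as a normal outcome."""
    error = backend.initialize()
    if error in (Error.OK, Error.ALREADY_INITIALIZED):
        return
    raise BackendError(error)


if __name__ == "__main__":
    acquired = FakeBackend(initialized=True)  # e.g. what an acquire-style call hands back
    ensure_initialized(acquired)              # does not raise
    print("backend ready:", acquired.initialized)
```

With a helper like this, the same code path can accept backends from either an acquire-style or a create-style call.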

With respect to your other two problems, Orca and Speech Dispatcher both require that the respective engine be installed and running. It's odd that Orca is initializing because it shouldn't be if Orca is not running or does not provide the respective D-Bus service (which in 22.04 it would not). Do you check the IS_SUPPORTED_AT_RUNTIME bit?

As for speech dispatcher, this will take some looking into, as I pretty much completely defer to libspeechd for the actual handling and the backend is quite primitive in terms of actual work.

diamondStar35 · 2026-04-09T19:47:43Z

diamondStar35
Apr 9, 2026
Author

@ethindp
Alright, I understand now that I should use CreateBest() and not AcquireBest(). But the problem is that Orca is already running, and the Speech Dispatcher components are installed. Both fail to speak.
If Orca does not provide the D-Bus service on Ubuntu 22.04, then what about Speech Dispatcher?

ethindp · 2026-04-09T20:06:53Z

ethindp
Apr 9, 2026
Maintainer

@diamondStar35 Can you try with the latest commit? I think I've fixed the libspeechd backend. If it still dies (i.e., libspeechd says nothing), this very well could be a bug with your setup.

As for Orca, I've tried to make it more rigorous with respect to actually detecting that this is Orca we're talking to. But without a full XML parser (which I am hesitant to pull in) and performing D-Bus introspection, something can always "fake" the interface and Prism would be none the wiser.

diamondStar35 · 2026-04-09T20:10:02Z

diamondStar35
Apr 9, 2026
Author

@ethindp
I've checked the IsSupported as you mentioned, and Orca is showing as unsupported at runtime. Speech Dispatcher on my machine makes the program hang indefinitely when trying to use it. I'll try with your latest commit though.

So another question: does that mean that using Prism on Ubuntu 22.04 with Orca is not possible?

ethindp · 2026-04-09T20:13:04Z

ethindp
Apr 9, 2026
Maintainer

I've checked the IsSupported as you mentioned, and Orca is showing as unsupported at runtime.

Then do not use it. If that bit is clear, it explicitly signals that whatever component is needed for the backend to even function is not present. That bit is the only dynamic bit of that entire bitfield. It does not guarantee the backend will even initialize, but it determines whether it is even available.
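A check of that bit is usually a one-liner before a backend is ever offered to the user. In the sketch below, the flag names other than IS_SUPPORTED_AT_RUNTIME and all bit positions are invented for illustration; the real values come from Prism's headers, and the feature words would come from prism_backend_get_features rather than a hand-built dictionary.

```python
from enum import IntFlag


class Features(IntFlag):
    """Hypothetical feature bits; positions are placeholders, not Prism's values."""
    IS_SUPPORTED_AT_RUNTIME = 1 << 0
    SUPPORTS_SPEAK = 1 << 1
    SUPPORTS_OUTPUT = 1 << 2
    SUPPORTS_STOP = 1 << 3


def usable_backends(candidates: dict[str, Features]) -> list[str]:
    """Keep only backends whose runtime-support bit is set."""
    return [
        name
        for name, features in candidates.items()
        if Features.IS_SUPPORTED_AT_RUNTIME in features
    ]


if __name__ == "__main__":
    reported = {
        # Static capability bits set, but the runtime bit is clear: skip it.
        "Orca": Features.SUPPORTS_SPEAK | Features.SUPPORTS_OUTPUT | Features.SUPPORTS_STOP,
        "Speech Dispatcher": Features.IS_SUPPORTED_AT_RUNTIME | Features.SUPPORTS_SPEAK,
    }
    print(usable_backends(reported))
```

As noted above, a set bit only means the backend is available; initialization can still fail and needs its own error handling.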

ethindp · 2026-04-09T20:14:13Z

ethindp
Apr 9, 2026
Maintainer

Does that mean that using Prism on Ubuntu 22.04 with Orca is not possible?

If you do not have Orca 49 or later (I think 49 is the earliest that actually implements it), then no. Whether it is available depends on your Orca version.

diamondStar35 · 2026-04-09T20:35:23Z

diamondStar35
Apr 9, 2026
Author

@ethindp

Then do not use it. If that bit is clear, it explicitly signals that whatever component is needed for the backend to even function is not present. That bit is the only dynamic bit of that entire bitfield. It does not guarantee the backend will even initialize, but it determines whether it is even available.

Thank you for your help. The version of Orca in my distribution was too old and I couldn't update it. It was 42, so I had to delete the entire distribution and try on Ubuntu 24.04. So this issue will probably be resolved by using a later version of Ubuntu.

I hope the issue of SAPI could be looked into though.

Oh my bad sorry. I'll try with the latest commit and SAPI.

ethindp · 2026-04-09T20:38:21Z

ethindp
Apr 9, 2026
Maintainer

I believe I have already solved SAPI in the latest commit. It does not seem to be reproducible in my tests anymore. It may still lag slightly when navigating certain menus for example, but that is more down to the SAPI speech engine used than SAPI itself, and if you speak to memory and play that you will get silence trimming. As is usual, use get features for backends to figure out what you are and aren't allowed to do for any given backend.

diamondStar35 · 2026-04-09T20:40:52Z

diamondStar35
Apr 9, 2026
Author

I believe I have already solved SAPI in the latest commit. It does not seem to be reproducible in my tests anymore. It may still lag slightly when navigating certain menus for example, but that is more down to the SAPI speech engine used than SAPI itself, and if you speak to memory and play that you will get silence trimming. As is usual, use get features for backends to figure out what you are and aren't allowed to do for any given backend.

Yes sorry my bad. I've edited my last comment when I saw the description of the latest commit. Apologies.

Speaking of SpeakToMemory, why does normal speaking produce silence while speaking to memory (as you said) does not? You mentioned that when speaking to memory you get silence trimming.

ethindp · 2026-04-09T20:44:54Z

ethindp
Apr 9, 2026
Maintainer

@diamondStar35 Regular speech gets leading/trailing silence because that is what the speech engine produces and what is sent to the audio device. SAPI does all that internally and Prism doesn't control it. I could make Prism do all of that and get silence trimming, but then I would also need to pull in Miniaudio and a bunch of other things that would dramatically complicate the (already complicated) SAPI backend. My back-of-the-envelope cost-benefit analysis is that the cost of actually implementing that would outweigh any benefit that may exist. speak_to_memory is the way to go if you want to do your own audio processing or custom playback or what have you.
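For anyone taking the speak_to_memory route and handling playback themselves, trimming can be done on the PCM samples before they are played. The following is a minimal sketch that assumes signed 16-bit mono samples already decoded into a Python list; the function name and the threshold are illustrative, and the buffer format speak_to_memory actually produces is not covered here.

```python
def trim_silence(samples: list[int], threshold: int = 256) -> list[int]:
    """Drop leading and trailing samples whose amplitude is at or below threshold.

    Quiet samples in the middle of the utterance are kept, so pauses between
    words survive; only the padding at both ends is removed.
    """
    start = 0
    end = len(samples)
    # Advance past leading silence.
    while start < end and abs(samples[start]) <= threshold:
        start += 1
    # Walk back over trailing silence.
    while end > start and abs(samples[end - 1]) <= threshold:
        end -= 1
    return samples[start:end]


if __name__ == "__main__":
    pcm = [0, 3, -5, 900, -1200, 10, 700, 0, 0]
    print(trim_silence(pcm))
```

A fixed amplitude threshold is the simplest possible detector; a real implementation might prefer a short windowed energy measure to avoid cutting soft consonants.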

diamondStar35 · 2026-04-09T20:51:32Z

diamondStar35
Apr 9, 2026
Author

@ethindp Thanks for your explanation.

I looked for the latest release because I do not want to build it on Windows (I don't have the proper tools for that), but the latest release doesn't include your latest commit. Doesn't the project use workflows to generate release assets?

ethindp · 2026-04-09T20:56:28Z

ethindp
Apr 9, 2026
Maintainer

It is, yes, but I manually initiate it due to PyPI. I didn't want to publish a new release until we could confirm that the issues were solved.

diamondStar35 · 2026-04-09T21:03:44Z

diamondStar35
Apr 9, 2026
Author

Alright, thanks so much for your help. I will be able to test SAPI on my end when the new release is published, whenever that is. I'm not worried about it if it's already fixed, so feel free to delay it if you feel it's necessary.

ethindp · 2026-04-09T22:08:03Z

ethindp
Apr 9, 2026
Maintainer

It has been published.
