Category Archives: Speech

Text to Speech on mobile with Xamarin

Previously I looked at speech synthesis in Windows with C#, but what about speech synthesis on mobile?

Xamarin.Essentials has this covered with the TextToSpeech class, and the package is included by default when you create a Xamarin.Forms mobile application.

It’s really simple to add text to speech capabilities. We simply write code such as

await TextToSpeech.SpeakAsync("Hello World");

As the await keyword (and the standard Async suffix) suggests, the SpeakAsync method returns a task. It also accepts a cancellation token if you need to cancel mid-utterance. There’s also an overload which accepts a SpeechOptions object, which allows us to set the volume, pitch and locale.
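As a minimal sketch of that overload, the following combines SpeechOptions with a cancellation token. The class and method names (SpeechSketch, CreateQuietOptions, SpeakWithTimeoutAsync) are made up for illustration, and the values assume the ranges given in the Xamarin.Essentials documentation (volume 0.0 to 1.0, pitch 0.0 to 2.0).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Xamarin.Essentials;

public static class SpeechSketch
{
    // Hypothetical helper: half volume and a slightly raised pitch.
    public static SpeechOptions CreateQuietOptions() =>
        new SpeechOptions { Volume = 0.5f, Pitch = 1.2f };

    // Hypothetical helper: speaks the text but requests cancellation after the timeout.
    public static async Task SpeakWithTimeoutAsync(string text, TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                await TextToSpeech.SpeakAsync(text, CreateQuietOptions(), cts.Token);
            }
            catch (OperationCanceledException)
            {
                // Assumption: the platform implementation may surface
                // cancellation as an exception; treat it as "cut short".
            }
        }
    }
}
```

From a page or view model this might be called as await SpeechSketch.SpeakWithTimeoutAsync("Hello World", TimeSpan.FromSeconds(2));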

The second TextToSpeech method is GetLocalesAsync, which allows us to get a list of supported locales from the device. A locale from that list can then be passed to SpeakAsync via SpeechOptions.

var locales = await TextToSpeech.GetLocalesAsync();

Note: It’s fun listening to the attempts at different accents depending upon the locale.
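To try out those accents, something like the following sketch could pick a locale and hand it to SpeakAsync. The names LocaleSketch, PickLocale and SpeakInAsync are hypothetical, and the prefix match is an assumption: the exact format of Locale.Language varies by platform (for example "en" on one and "en-US" on another), so check what your device actually returns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Xamarin.Essentials;

public static class LocaleSketch
{
    // Hypothetical helper: first locale whose language starts with the prefix,
    // or null when nothing matches.
    public static Locale PickLocale(IEnumerable<Locale> locales, string languagePrefix) =>
        locales.FirstOrDefault(l =>
            l.Language != null &&
            l.Language.StartsWith(languagePrefix, StringComparison.OrdinalIgnoreCase));

    // Hypothetical helper: speaks the text using the first matching locale.
    public static async Task SpeakInAsync(string text, string languagePrefix)
    {
        var locales = await TextToSpeech.GetLocalesAsync();
        var options = new SpeechOptions
        {
            // When no locale matches this stays null and the device default is used.
            Locale = PickLocale(locales, languagePrefix)
        };
        await TextToSpeech.SpeakAsync(text, options);
    }
}
```

For example, await LocaleSketch.SpeakInAsync("Hello World", "fr"); should speak the English text with a French voice, if the device has one installed.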

Speech synthesis in Windows with C#

As part of a little side project, I wanted to have Windows speak some text in response to a call to a service. Initially I started to look at Cortana, but this seemed overkill for my initial needs (i.e. you need to write a bot, deploy it to Azure etc.), whereas System.Speech.Synthesis offers a simple API for text to speech.

Getting started

It can be as simple as this: add the System.Speech assembly to your references, then use the following

using System.Speech.Synthesis;

using (var speech = new SpeechSynthesizer())
{
   speech.Volume = 100; // 0 (silent) to 100 (loudest)
   speech.Rate = -2;    // -10 (slowest) to 10 (fastest)
   speech.Speak("Hello World");
}

Volume takes a value in the range [0, 100], and Rate, the speaking rate, takes a value in the range [-10, 10].

Taking things a little further

We can also control other aspects of speech, such as emphasis and whether a word should be spelled out, and we can create “styles” for different sections of speech. For these we use the PromptBuilder. Let’s start by creating a “style” for a section of speech

var promptBuilder = new PromptBuilder();

var promptStyle = new PromptStyle
{
   Volume = PromptVolume.Soft,
   Rate = PromptRate.Slow
};

promptBuilder.StartStyle(promptStyle);
promptBuilder.AppendText("Hello World");
promptBuilder.EndStyle();

using (var speech = new SpeechSynthesizer())
{
   speech.Speak(promptBuilder);
}

We can build up our speech from different styles and include emphasis using

promptBuilder.AppendText("Hello ", PromptEmphasis.Strong);

We can also spell out words using

promptBuilder.AppendTextWithHint("Hello", SayAs.SpellOut);
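Putting those pieces together, a self-contained sketch might look like the following. The PromptSketch class and BuildGreeting method are names invented for this example; the PromptBuilder calls are the same ones used above.

```csharp
using System.Speech.Synthesis;

public static class PromptSketch
{
    // Hypothetical helper: builds one prompt that mixes a style,
    // emphasis and a spelled-out word.
    public static PromptBuilder BuildGreeting()
    {
        var promptBuilder = new PromptBuilder();

        // Everything between StartStyle and EndStyle is spoken softly and slowly.
        promptBuilder.StartStyle(new PromptStyle
        {
            Volume = PromptVolume.Soft,
            Rate = PromptRate.Slow
        });
        promptBuilder.AppendText("Hello ", PromptEmphasis.Strong);
        promptBuilder.AppendText("World");
        promptBuilder.EndStyle();

        // Outside the style, spell the word out letter by letter.
        promptBuilder.AppendTextWithHint("Hello", SayAs.SpellOut);
        return promptBuilder;
    }

    public static void Main()
    {
        using (var speech = new SpeechSynthesizer())
        {
            speech.Speak(BuildGreeting());
        }
    }
}
```

If the result doesn't sound as expected, calling ToXml() on the PromptBuilder shows the SSML markup it has generated, which can help with debugging.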

Speech Recognition

The System.Speech assembly also includes the ability to recognize speech. This requires the Speech Recognition software within Windows to be running (see Control Panel | Ease of Access | Speech Recognition).

To enable your application to use speech recognition you need to execute the following

SpeechRecognizer recognizer = new SpeechRecognizer();

Just executing this will allow the speech recognition code (on the focused application) to do things like execute button code as if the button was pressed. You can also hook into events to use recognized speech within your application, for example using

SpeechRecognizer recognizer = new SpeechRecognizer();
recognizer.SpeechRecognized += (sender, args) => 
{ 
   input.Text = args.Result.Text; 
};
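As a rough console sketch of the same idea, the following loads a small grammar before hooking the event. This is an assumption on my part rather than something shown above: as far as I can tell, the recognizer only raises SpeechRecognized for phrases matched by a grammar you have loaded. The RecognitionSketch class, the BuildColourGrammar helper and the list of colours are purely illustrative.

```csharp
using System;
using System.Speech.Recognition;

public static class RecognitionSketch
{
    // Hypothetical helper: a tiny grammar that only matches three colour names.
    public static Grammar BuildColourGrammar()
    {
        var choices = new Choices("red", "green", "blue");
        return new Grammar(new GrammarBuilder(choices)) { Name = "colours" };
    }

    public static void Main()
    {
        // Uses the shared Windows recognizer, so Windows Speech Recognition
        // needs to be set up and listening.
        using (var recognizer = new SpeechRecognizer())
        {
            recognizer.LoadGrammar(BuildColourGrammar());
            recognizer.SpeechRecognized += (sender, args) =>
                Console.WriteLine($"{args.Result.Text} ({args.Result.Confidence:F2})");

            Console.WriteLine("Say a colour, or press Enter to quit.");
            Console.ReadLine();
        }
    }
}
```

For free-form speech rather than a fixed list, a DictationGrammar can be loaded in place of the colour grammar.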