Bridge Communications

Thursday, April 16, 2015

UCMA Text to Speech Library

In my last article I alluded to another function I have written called SaySomething.  The idea here was a reusable function so you don't need to reinvent the wheel every time you want to speak text or play a sound to a given caller.

The header

So to get started it is helpful to see what the header of the class looks like because I will use some of that data in the functions listed a bit later.

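    using System;
    using System.Threading;                         // Thread.Sleep
    using Microsoft.Rtc.Collaboration;              // MediaFlowState
    using Microsoft.Rtc.Collaboration.AudioVideo;   // AudioVideoCall, SpeechSynthesisConnector
    using Microsoft.Speech.AudioFormat;             // SpeechAudioFormatInfo
    using Microsoft.Speech.Synthesis;               // SpeechSynthesizer, PromptBuilder

    internal class SaySomething
    {
        AudioVideoCall _call;                    // the call the audio is played on
        SpeechSynthesizer _speechSynthesizer;
        string _whattosay;                       // text spoken by Start()
        // ILogger and ConsoleLogger are logging helpers not shown in this post.
        ILogger _logger = new ConsoleLogger();

        internal SaySomething(AudioVideoCall call, string whattosay)
        {
            _call = call;
            _whattosay = whattosay;
        }

        // The class continues with the functions shown below.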

I will be passing the _call I want the audio to play on as part of the setup in whatever UCMA application I am going to call this function from.

    SaySomething says = new SaySomething(_call, "Hi, thank you for calling.  Dial 44 pound to enter the back to back conference.");
    says.Start();

or

    SaySomething says3 = new SaySomething(_call, "Playing alarm sound.");
    says3.Start();
    says3.StartWAV(@"C:\windows\media\alarm01.wav");

The functions

Basically there are 2 functions I needed when I wrote this little library, and they are as follows:

1.  The ability to play a sound file to a caller.  (StartWAV)
2.  The ability to harness Microsoft's text to speech and play a given string to a caller. (Start)

Sound File

I accomplish this first goal with a StartWAV(string wav) function where the string is the path to the sound file I want to play.  It looks like this.

        internal void StartWAV(string wav)
        {
            SpeechSynthesisConnector speechSynthesisConnector = new SpeechSynthesisConnector();

            try
            {
                // Wait up to roughly 5 seconds (500 x 10 ms) for the media flow to become active.
                // The short-circuit || means Flow.State is never read while Flow is still null.
                int x = 0;
                while (_call.Flow == null || _call.Flow.State != MediaFlowState.Active)
                {
                    Thread.Sleep(10);
                    x++;
                    if (x > 500)
                        break;
                }
                speechSynthesisConnector.AttachFlow(_call.Flow);

                // Send the synthesizer's output into the call as 16 kHz, 16-bit, mono audio.
                _speechSynthesizer = new SpeechSynthesizer();
                SpeechAudioFormatInfo audioformat = new SpeechAudioFormatInfo(16000, AudioBitsPerSample.Sixteen, Microsoft.Speech.AudioFormat.AudioChannel.Mono);
                _speechSynthesizer.SetOutputToAudioStream(speechSynthesisConnector, audioformat);

                speechSynthesisConnector.Start();

                // Build a prompt containing the sound file; Speak blocks until it has finished playing.
                PromptBuilder prompt = new PromptBuilder();
                prompt.AppendAudio(wav);
                _speechSynthesizer.Speak(prompt);
                prompt.ClearContent();

                speechSynthesisConnector.Stop();

                speechSynthesisConnector.DetachFlow();
            }
            catch (Exception ex)
            {
                _logger.Log(ex.Message);
            }
            finally
            {
                // Make sure the flow is detached even if something above threw;
                // any error from this extra detach is ignored.
                try
                {
                    speechSynthesisConnector.DetachFlow();
                }
                catch { }
            }
        }
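
The wait loop at the top of StartWAV is a poll-with-timeout: check a condition, sleep briefly, and give up after a limit.  As a minimal sketch, that pattern could be pulled into a small helper.  The names Wait and Until are hypothetical and are not part of UCMA or of the library in this post.  The helper returns whether the condition was met, so the caller can decide what to do on a timeout instead of carrying on regardless.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, for illustration only.
internal static class Wait
{
    // Calls condition repeatedly, sleeping pollInterval between checks.
    // Returns true as soon as condition() is true, or false once timeout has elapsed.
    internal static bool Until(Func<bool> condition, TimeSpan timeout, TimeSpan pollInterval)
    {
        Stopwatch clock = Stopwatch.StartNew();
        while (!condition())
        {
            if (clock.Elapsed >= timeout)
                return false;
            Thread.Sleep(pollInterval);
        }
        return true;
    }
}

// Possible use inside StartWAV or Start (about 5 seconds, checking every 10 ms):
//
//     bool ready = Wait.Until(
//         () => _call.Flow != null && _call.Flow.State == MediaFlowState.Active,
//         TimeSpan.FromSeconds(5),
//         TimeSpan.FromMilliseconds(10));
//     if (!ready)
//     {
//         _logger.Log("Media flow did not become active in time.");
//         return;
//     }
```

Returning a bool here is a design choice: the loops in this post fall through to AttachFlow after a timeout, whereas a caller of this helper can log and return early.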


You can see that StartWAV attaches to the call flow, does its business and then detaches.  The TTS function is very similar and looks like this.

Text to Speech

        internal void Start()
        {
            SpeechSynthesisConnector speechSynthesisConnector = new SpeechSynthesisConnector();

            try
            {
                // Wait up to roughly 5 seconds (500 x 10 ms) for the media flow to become active.
                // The short-circuit || means Flow.State is never read while Flow is still null.
                int x = 0;
                while (_call.Flow == null || _call.Flow.State != MediaFlowState.Active)
                {
                    Thread.Sleep(10);
                    x++;
                    if (x > 500)
                        break;
                }
                speechSynthesisConnector.AttachFlow(_call.Flow);

                // Send the synthesizer's output into the call as 16 kHz, 16-bit, mono audio.
                _speechSynthesizer = new SpeechSynthesizer();
                SpeechAudioFormatInfo audioformat = new SpeechAudioFormatInfo(16000, AudioBitsPerSample.Sixteen, Microsoft.Speech.AudioFormat.AudioChannel.Mono);
                _speechSynthesizer.SetOutputToAudioStream(speechSynthesisConnector, audioformat);

                speechSynthesisConnector.Start();

                // Build a prompt from the text passed to the constructor; Speak blocks until it has been spoken.
                PromptBuilder prompt = new PromptBuilder();
                prompt.AppendText(_whattosay);
                _speechSynthesizer.Speak(prompt);
                prompt.ClearContent();

                speechSynthesisConnector.Stop();

                speechSynthesisConnector.DetachFlow();
            }
            catch (Exception ex)
            {
                _logger.Log(ex.Message);
            }
            finally
            {
                // Make sure the flow is detached even if something above threw;
                // any error from this extra detach is ignored.
                try
                {
                    speechSynthesisConnector.DetachFlow();
                }
                catch { }
            }
        }

Start is almost identical to StartWAV, except the prompt appends text instead of a sound file.  If you have any questions, reach out to me and I'll be happy to answer them.  This concludes this week's blog post.

Doug Routledge, C# Lync, Skype for Business, SQL, Exchange, UC Developer  BridgeOC
Twitter - @droutledge @ndbridge
