How to use Microsoft Project Oxford Speech Recognition

Project Oxford (projectoxford.ai) is a set of web APIs from Microsoft that makes developers’ lives easier when working with artificial intelligence features such as face detection and speech recognition. Today I’m going to show you how to write your first speech recognition application in less than 5 minutes.

Requirements:

  1. To use the Project Oxford library, you need a Microsoft Azure account (azure.com). You don’t need to purchase any services, but you still need to accept the terms and log in with your Live ID.
  2. Click the “NEW” button and then choose “MARKETPLACE” to add our new service.

  3. Choose “Speech APIs” from the list shown and click “Next”.
  4. On this page you can name the service and choose a region for the API (at the time of writing, “West US” is the only region available). After choosing a name you like, click “Next”.
  5. You can safely click the “PURCHASE” button: as you’ll notice, it says “Plan: FREE” and “Price: 0.00 USD / Month”.
  6. After purchasing the service, click the “Go to the Microsoft website for the next steps” link to get your key for authentication.
  7. Click the “Show” button next to “Primary key” to reveal your key. Copy this key, as we’ll use it in our application.
  8. The last requirement is to download the DLLs that Microsoft provides to speed up development. You can find them on the projectoxford.ai/sdk page by choosing the appropriate library. For our example we need to download “STT-Windows”.

Let’s have some fun:

  1. Let’s start by creating our C# project. I called mine “Talk To Me”.
  2. Copy “SpeechClient.dll” from the downloaded “STT-Windows” package (it’s inside SpeechSDK -> x86).
  3. Go back to Visual Studio and, in the “Solution Explorer”, find your project and add a reference to SpeechClient.dll.
  4. Create a “Button” to start listening and a “TextBox” to show the results.

  5. The following code will do the trick:

    [csharp]
    public static void StartListening(TextBox textBox)
    {
        const string PrimaryKey = "XXXXXxxxXXXXxxXXXXxxxXXX"; //TODO: Your primary key here
        const int TimeOutInMS = 30000;

        MicrophoneRecognitionClient micClient =
            SpeechRecognitionServiceFactory.CreateMicrophoneClient(SpeechRecognitionMode.ShortPhrase, "en-us", PrimaryKey);

        // Append each recognized phrase to the text box. BeginInvoke hands the
        // update to the UI thread, since the event may fire on another thread.
        micClient.OnResponseReceived +=
            (sender, args) =>
            {
                // Skip responses without results so that Results[0] cannot throw.
                if (args.PhraseResponse.Results.Length == 0)
                    return;

                textBox.BeginInvoke(new Action(() => { textBox.Text += args.PhraseResponse.Results[0].DisplayText + Environment.NewLine; }));
            };

        try
        {
            micClient.StartMicAndRecognition();

            // Wait for the final response and report a timeout if none arrives.
            if (!micClient.WaitForFinalResponse(TimeOutInMS))
                textBox.BeginInvoke(new Action(() => { textBox.Text += "Timed out waiting for conversation response after " + TimeOutInMS + "ms" + Environment.NewLine; }));
        }
        finally
        {
            // Always release the microphone, even if recognition fails.
            micClient.EndMicAndRecognition();
        }
    }
    [/csharp]
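
To try it out, the button from step 4 has to call StartListening. Here is a minimal sketch of one way to wire that up, assuming a Windows Forms project on .NET 4.5 or later. The names TalkToMeForm, resultTextBox and listenButton are made up for this example and are not part of the SDK. Because StartListening waits for the final response (up to 30 seconds with the timeout above), the sketch runs it on a background thread so the window should stay responsive. The form takes the listening method as a delegate, so it compiles on its own without the Speech SDK.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

// Hypothetical form for this example: a read-only TextBox for the results
// and a Button that starts listening.
public class TalkToMeForm : Form
{
    private readonly TextBox resultTextBox =
        new TextBox { Multiline = true, ReadOnly = true, Dock = DockStyle.Fill };
    private readonly Button listenButton =
        new Button { Text = "Listen", Dock = DockStyle.Bottom };
    private readonly Action<TextBox> startListening;

    // startListening is expected to be the StartListening method from step 5.
    public TalkToMeForm(Action<TextBox> startListening)
    {
        this.startListening = startListening;
        Text = "Talk To Me";
        Controls.Add(resultTextBox);
        Controls.Add(listenButton);
        listenButton.Click += OnListenClick;
    }

    private async void OnListenClick(object sender, EventArgs e)
    {
        // Disable the button so a second recognition cannot start while one is running.
        listenButton.Enabled = false;
        try
        {
            // Run the waiting call off the UI thread.
            await Task.Run(() => startListening(resultTextBox));
        }
        finally
        {
            listenButton.Enabled = true;
        }
    }
}
```

If StartListening lives in your Program class, you could start the form from Main (marked with [STAThread]) with Application.Run(new TalkToMeForm(StartListening));. If you built the form in the Visual Studio designer instead, the same idea applies: call Task.Run from the button’s Click handler.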
