Converting or Transcribing audio to text using C# and .NET System.Speech

Recently, I had a project where I needed to convert some audio to text. It took a bit more googling than I was used to in order to find the code, so I went ahead and whipped up a project that demonstrates its usage, so people can more easily find it.

This code uses the .NET System.Speech namespace and demonstrates how to transcribe audio in C# from either a microphone or a previously recorded .wav file.

The code can be divided into two main parts:

1. Configuring the SpeechRecognitionEngine object (and its required elements).
2. Handling the SpeechRecognized and SpeechHypothesized events.

Step 1: Configuring the SpeechRecognitionEngine

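// Requires a reference to the System.Speech assembly and: using System.Speech.Recognition;
// Create the engine and listen on the default microphone.
_speechRecognitionEngine = new SpeechRecognitionEngine();
_speechRecognitionEngine.SetInputToDefaultAudioDevice();
// A DictationGrammar allows free-form dictation instead of a fixed list of phrases.
_dictationGrammar = new DictationGrammar();
_speechRecognitionEngine.LoadGrammar(_dictationGrammar);
// RecognizeMode.Multiple keeps recognizing until recognition is stopped or cancelled.
_speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);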

At this point, your object is ready to start transcribing audio from the microphone. You need to handle some events though, in order to actually get access to the results.

Step 2: Handling the SpeechRecognitionEngine Events

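// Unsubscribe first so the handlers are never attached twice.
_speechRecognitionEngine.SpeechRecognized -= new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized -= new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);
_speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);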

private void SpeechHypothesizing(object sender, SpeechHypothesizedEventArgs e)
{
     // real-time (interim) results from the engine
     string realTimeResults = e.Result.Text;
}

private void SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
     // final answer from the engine
     string finalAnswer = e.Result.Text;
}
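
Putting the two steps together, a minimal console sketch might look like the following. The class name MicrophoneDictation and the console output are my own assumptions for illustration, not part of the original project. It also assumes a Windows machine with a speech recognizer installed, a working microphone, and a reference to the System.Speech assembly.

```csharp
using System;
using System.Speech.Recognition; // requires a reference to the System.Speech assembly

// Hypothetical class name, used only for this sketch.
class MicrophoneDictation
{
    static void Main()
    {
        using (var engine = new SpeechRecognitionEngine())
        {
            // Step 1: configure the engine for free-form dictation from the microphone.
            engine.SetInputToDefaultAudioDevice();
            engine.LoadGrammar(new DictationGrammar());

            // Step 2: attach the handlers before starting, so no early results are missed.
            engine.SpeechHypothesized += (sender, e) =>
                Console.WriteLine("Hypothesis: " + e.Result.Text);
            engine.SpeechRecognized += (sender, e) =>
                Console.WriteLine("Recognized: " + e.Result.Text);

            // Keep recognizing phrase after phrase until told to stop.
            engine.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Speak into the microphone. Press Enter to stop.");
            Console.ReadLine();

            // Let the current phrase finish, then stop recognizing.
            engine.RecognizeAsyncStop();
        }
    }
}
```

Attaching the handlers before calling RecognizeAsync is a small deviation from the order shown above; it should avoid losing results that arrive right after recognition starts.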

That’s it. If you want to use a pre-recorded .wav file instead of a microphone, you would use _speechRecognitionEngine.SetInputToWaveFile(pathToTargetWavFile); instead of _speechRecognitionEngine.SetInputToDefaultAudioDevice();.
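
For the .wav case, here is a rough sketch of a command-line transcriber. The class name WavFileTranscriber, the command-line argument, and the use of the RecognizeCompleted event to detect the end of the file are my assumptions, not code from the original project. As far as I know, SetInputToWaveFile expects an actual WAV file, so other formats such as .mp3 would need to be converted first.

```csharp
using System;
using System.Text;
using System.Threading;
using System.Speech.Recognition; // requires a reference to the System.Speech assembly

// Hypothetical class name, used only for this sketch.
class WavFileTranscriber
{
    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("Usage: WavFileTranscriber <path-to-wav-file>");
            return;
        }

        var transcript = new StringBuilder();

        using (var finished = new ManualResetEvent(false))
        using (var engine = new SpeechRecognitionEngine())
        {
            engine.SetInputToWaveFile(args[0]);
            engine.LoadGrammar(new DictationGrammar());

            // Collect each final result as it arrives.
            engine.SpeechRecognized += (sender, e) =>
                transcript.Append(e.Result.Text).Append(' ');

            // Raised when the asynchronous operation ends, for example at end of file.
            engine.RecognizeCompleted += (sender, e) =>
            {
                if (e.Error != null)
                {
                    Console.Error.WriteLine("Recognition failed: " + e.Error.Message);
                }
                finished.Set();
            };

            engine.RecognizeAsync(RecognizeMode.Multiple);

            // Block until the whole file has been processed.
            finished.WaitOne();
        }

        Console.WriteLine(transcript.ToString().Trim());
    }
}
```

Waiting on RecognizeCompleted rather than sleeping for a fixed time means the program should exit as soon as the file has been read, whatever its length.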

There are a bunch of different options in these classes and they are worth exploring in more detail. This covers the bare essentials for a prototype.
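
As one example of those options, the sketch below lists the recognizers installed on the machine and ignores low-confidence results. The class name RecognizerOptions and the 0.5 threshold are arbitrary assumptions of mine; which languages show up depends entirely on what is installed on your system.

```csharp
using System;
using System.Speech.Recognition; // requires a reference to the System.Speech assembly

// Hypothetical class name, used only for this sketch.
class RecognizerOptions
{
    // Arbitrary cut-off chosen for illustration; tune it for your own audio.
    const float MinimumConfidence = 0.5f;

    static void Main()
    {
        // Show every installed recognizer and the culture (language) it supports.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            Console.WriteLine(info.Culture.Name + ": " + info.Name);
        }

        using (var engine = new SpeechRecognitionEngine())
        {
            engine.SetInputToDefaultAudioDevice();
            engine.LoadGrammar(new DictationGrammar());

            // Result.Confidence is a value between 0 and 1 assigned by the engine.
            engine.SpeechRecognized += (sender, e) =>
            {
                if (e.Result.Confidence >= MinimumConfidence)
                {
                    Console.WriteLine(e.Result.Text);
                }
            };

            engine.RecognizeAsync(RecognizeMode.Multiple);
            Console.WriteLine("Press Enter to stop.");
            Console.ReadLine();
            engine.RecognizeAsyncStop();
        }
    }
}
```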

25 thoughts on “Converting or Transcribing audio to text using C# and .NET System.Speech”

Nick April 28, 2012 at 2:21 AM

Awesome! I was looking for exactly this and having the code project with it made it even sweeter! Have you planned on doing anything more with the speech recognition? I’m just starting out using it and C# for that matter (coming from Java) for home automation. Would be interested in any other projects related to speech recognition. Thanks again!
1. admin May 6, 2012 at 8:09 PM
  
  Hey Nicholas,
  
  I really do not have any more plans for working with speech recognition. I found the Microsoft solution and a Google solution. I would like to get my hands on the Dragon NaturallySpeaking SDK and see how good that is, but I have no plans in the immediate future for working on it.
  
  Good luck with your home automation and thanks for responding to my blog!
2. 1. Maham July 5, 2014 at 12:29 PM
    
    Hi Micheal, thanks for the help. I am working on a speech recognition project in which I have to convert video files into text. I tried to convert an audio file (.mp3) into text with your code, but it is not working properly. Could you share the code so I can proceed with my work? Please reply as soon as possible.
  2. 1. Lalmuhammad May 27, 2017 at 9:07 PM
      
      I am working on a speech recognition project in which I have to convert video or audio files into text. If you have done this work, please send me the code or project.
      email: kanlalmuhammad@yahoo.com
    2. Sudarshan jha April 20, 2022 at 5:08 AM
      
      Hey Maham, can you please do Hindi speech recognition that reads Hindi audio into a Hindi text file? If you can do it, please send me the code at sudarshanjha11781@gmail.com
  3. Sudarshan jha April 20, 2022 at 5:05 AM
    
    Hey admin, can you please do speech recognition in Hindi that writes the result to a text file in Hindi? Please do it if you can.
3. Samia July 5, 2014 at 12:39 PM
  
  Hey, can you please send me the code? I am also looking for this audio-to-text conversion in C# in Visual Studio. I have tried the above code, but I am getting many errors. Please help; I will be very grateful to you.
4. 1. Lalmuhammad May 27, 2017 at 9:08 PM
    
    I am working on a speech recognition project in which I have to convert video or audio files into text. If you have done this work, please send me the code or project.
    email: kanlalmuhammad@yahoo.com
chandan May 18, 2012 at 8:11 PM

Hey, can you guide me on how to build and debug this project?
Jean-Philippe Encausse September 25, 2012 at 11:48 AM

Hi,

Any ideas on how to handle wildcard/garbage from XML grammar ?

http://stackoverflow.com/questions/12101120/matching-wildcard-dictation-in-microsoft-speech-grammar/12535235
John Nelson September 29, 2012 at 12:28 PM

What was the Google solution?
Julian December 3, 2012 at 10:14 PM

Like the article. Do you know if it's possible to have a voice recognition system which identifies a certain word or words when someone speaks them and then converts them to text with the time when each word was spoken? Can it be incorporated into an iPad or tablet? For example, if I have a salesperson and I want them to emphasise the brand name, it would display as text every time they mentioned the brand name.
Ram Raksha Mishra January 7, 2013 at 11:13 AM

Hey, this is awesome, but there are some problems with this code:
1. Long audio files (more than 1 minute) are not converted properly.
2. The converted text does not match the audio.
Please give me some solutions as soon as possible.
Thanks
jhonas February 3, 2013 at 9:07 PM

Where could I read about how to build a command-line application in Visual Studio 2010 that takes microphone input or a .wav file and writes out the text?
Govind Dhawale June 27, 2013 at 11:08 AM

Thanks
willian July 26, 2013 at 9:41 PM

I would like more explanation. Would there be a way for me to get in touch with you?
Rhee keun young June 9, 2014 at 7:43 PM

I want to get involved in producing apps with TTS and STT functions, because deaf and hearing people need such an app in order to communicate with each other. Sometimes I thought text alone could solve the problem, but the trend is toward apps that understand sound or text and convert it into the counterpart format, i.e. text or sound. If a hearing person likes to speak, the app should adapt to that; likewise, if a deaf person likes to send a text message, the app should do so. All of this depends on the trends of this era. Anyway, I would like some help with compilers and SDKs. I need detailed links and a kind explanation, as I am a novice in the STT/TTS area. God bless you all! What if I download from this link? visualstudio
sanju December 26, 2014 at 9:27 AM

Hi,
I would like to know about building the grammar. How many words can be added to the grammar so that it still works well? Also, the accuracy of the recognized words is close to 30%. Is there a way to increase this accuracy?
Any help is appreciated.
Thank you
Emanuel September 10, 2015 at 11:31 PM

CAN YOU PLEASE TELL ME IF THIS CODE ENABLES THE SPEECH OR AUDIO TO BE CONVERTED INTO TEXT IN ANY TEXT FIELD OR JUST FOR A TEXT BOX IN THE IDE??
Emanuel September 10, 2015 at 11:53 PM

CAN YOU PLEASE TELL ME IF THIS CODE IS ABLE TO CONVERT THE VOICE (SPEECH OR SOUND) INTO TEXT IN ANY TEXT FIELD, SUCH AS MS WORD, MS POWERPOINT, NOTEPAD, ETC., OR DO I HAVE TO CREATE A TEXT FIELD LIKE A TEXT BOX?
James December 30, 2015 at 2:22 AM

Hi,
Can you give a sample of an existing program for better reference?
Anjali February 2, 2017 at 5:38 PM

I am working on a project in Visual Studio with C# where I need to convert speech into text. I am finding code for Windows-based applications, but I need it for a web-based application. Kindly help.
Got April 26, 2017 at 10:15 AM

What about http://www.speech.cs.cmu.edu/ ?
Is it better than your solution?
1. admin May 19, 2017 at 1:49 PM
  
  Yeah, you could get more accurate results with that, but it would be a lot more complicated.
kiryus April 13, 2018 at 10:43 PM

Hi, I'm working on a project and need to convert audio to text in real time, i.e. capture the RTP packets, convert them to WAV, and run them through a tool. Thinking of sales-level volume, i.e. 50 calls, can this be done using your example?
