Mac OS X speech to text API. Howto?
I have a program that receives an audio (mono) stream of bits from TCP/IP. I am wondering whether the speech (speech-recognition) API in Mac OS X would be able to do a speech-to-text transform for me.
(I don’t mind saving the audio to a .wav file first and reading it, as opposed to doing the transform on the fly.)
I have read the official docs online, but they are a bit confusing, and I couldn’t find any good examples on this topic.
Also, should I do it in Cocoa/Carbon/Java or Objective-C?
Can someone please shed some light?
4 Answers
There are a number of examples that get copied into /Developer/Examples/Speech/Recognition when you install Xcode.
The Cocoa class for speech recognition is NSSpeechRecognizer.
I’ve not used it, but as far as I know speech recognition requires you to build a grammar to help the engine choose from a number of choices, rather than allowing you to pass free-form input. This is all explained in the examples referred to above.
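To make the command-list model concrete, here is a minimal sketch of NSSpeechRecognizer usage in Objective-C. The command strings and the CommandListener class name are illustrative, not from the original answer; the recognizer only matches against the fixed list you give it, which is the limitation described above.

    #import <AppKit/AppKit.h>

    @interface CommandListener : NSObject <NSSpeechRecognizerDelegate>
    @property (strong) NSSpeechRecognizer *recognizer;
    @end

    @implementation CommandListener
    - (instancetype)init {
        if ((self = [super init])) {
            _recognizer = [[NSSpeechRecognizer alloc] init];
            // The engine only picks from this fixed command list --
            // it will not transcribe arbitrary speech.
            _recognizer.commands = @[@"Play", @"Stop", @"Next track"];
            _recognizer.delegate = self;
            [_recognizer startListening];
        }
        return self;
    }

    // Called when one of the registered commands is heard.
    - (void)speechRecognizer:(NSSpeechRecognizer *)sender
         didRecognizeCommand:(NSString *)command {
        NSLog(@"Recognized command: %@", command);
    }
    @end

Note that this listens to the system’s speech input device; it does not by itself accept an arbitrary audio stream, which is where the Carbon API mentioned below differs.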
This comes a bit late perhaps, but I’ll chime in anyway.
The speech recognition facilities in OS X (on both the Carbon and Cocoa side of things) are for speech command recognition, which means that they will recognize words (or phrases, commands) that have been loaded into the speech system language model. I’ve done some stuff with small dictionaries and it works pretty well, but if you want to recognize arbitrary speech things may turn hairier.
Something else to keep in mind is that the Carbon and Cocoa speech APIs in OS X do not map to each other one-to-one. The Carbon stuff provides functionality that has not made it into NSSpeechRecognizer (the docs make some mention of this).
I don’t know about Cocoa, but the Carbon Speech Recognition Manager does allow you to specify inputs other than a microphone so a sound stream would work just fine.
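As a rough illustration of selecting a non-microphone input with the Carbon Speech Recognition Manager: the sketch below is reconstructed from memory of the old SpeechRecognition.h header, and the constant names (in particular kSRCanned22kHzSpeechSource for pre-recorded audio) should be verified against the Speech Recognition Manager reference before use.

    #include <Carbon/Carbon.h>

    // Open a recognition system and attach a recognizer that reads
    // from pre-recorded ("canned") audio instead of the microphone.
    static OSErr OpenCannedRecognizer(SRRecognitionSystem *outSystem,
                                      SRRecognizer *outRecognizer) {
        OSErr err = SROpenRecognitionSystem(outSystem,
                                            kSRDefaultRecognitionSystemID);
        if (err == noErr) {
            // kSRCanned22kHzSpeechSource selects recorded audio as input
            err = SRNewRecognizer(*outSystem, outRecognizer,
                                  kSRCanned22kHzSpeechSource);
        }
        return err;
    }

You would then build a language model of the commands to recognize and feed the recorded sound to the recognizer, as described in the Speech Recognition Manager documentation.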
Here’s a good O’Reilly article to get you started.
You can use either ApplicationServices’ SpeechSynthesis API (10.0+):
    CFStringRef cfstr = CFStringCreateWithCString(NULL, "Hello World!",
                                                  kCFStringEncodingMacRoman);
    Str255 pstr;
    CFStringGetPascalString(cfstr, pstr, 255, kCFStringEncodingMacRoman);
    SpeakString(pstr);
or AppKit’s NSSpeechSynthesizer (10.3+):
    NSSpeechSynthesizer *synth = [[NSSpeechSynthesizer alloc]
        initWithVoice:@"com.apple.speech.synthesis.voice.Alex"];
    [synth startSpeakingString:@"Hello world!"];