Is it possible to build “HTML to speech” the same way as “Text to speech”?

I have worked with HTML parsing and text-to-speech, and you can do this in 2 steps:
1. Get an attributed string from the HTML with the code below (works on iOS 7+).
2. Pass the text of that attributed string to AVSpeechUtterance.

From your client's perspective: if there is an HTML2Speech API on the market, it may be paid, and you would depend on it if you used it. The native frameworks do the same thing that you and the client want.

Step 1:

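    // Run this on the main thread; the HTML importer should not be called from a background thread.
    NSAttributedString *aStrWithHTMLAttributes =
        [[NSAttributedString alloc] initWithData:[htmlString dataUsingEncoding:NSUTF8StringEncoding]
                                         options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                                                   NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)}
                              documentAttributes:nil
                                           error:nil];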

Then you can pass the plain text of this attributed string (its string property) to AVSpeechUtterance.

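Step 2:
Use the method below to speak the string extracted from the HTML:

    /**
     *  "ConvertHTMLtoStrAndPlay" : This method will convert the HTML to String
     *  @param aURLHtmlFilePath : "object of html file path"
     */
    // The signature is reconstructed from the comment above; the NSURL parameter type is assumed.
    // synthesizer, speechPaused, and IS_ARABIC are defined elsewhere in the class.
    - (void)ConvertHTMLtoStrAndPlay:(NSURL *)aURLHtmlFilePath {
        // aStrWithHTMLAttributes is the attributed string built as in Step 1
        // from the HTML at aURLHtmlFilePath.
        if (synthesizer.speaking == NO && speechPaused == NO) {
            AVSpeechUtterance *utterance = [[AVSpeechUtterance alloc] initWithString:aStrWithHTMLAttributes.string];
            //utterance.rate = AVSpeechUtteranceMinimumSpeechRate;

            if (IS_ARABIC) {
                // The original used "ar-au", which is not an Arabic locale. "ar-SA" is assumed here;
                // voiceWithLanguage: returns nil when no voice matches, so check
                // [AVSpeechSynthesisVoice speechVoices] for the codes available on the device.
                utterance.voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"ar-SA"];
            } else {
                utterance.voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"en-AU"];
            }

            [synthesizer speakUtterance:utterance];
        } else {
            [synthesizer pauseSpeakingAtBoundary:AVSpeechBoundaryImmediate];
        }

        if (speechPaused == NO) {
            [synthesizer continueSpeaking];
        } else {
            [synthesizer pauseSpeakingAtBoundary:AVSpeechBoundaryImmediate];
        }
    }
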

And, as usual, use the code below when you need to stop the speech.

    /**
     *  "StopPlayWithAVSpeechSynthesizer" : this method will stop the playing of audio on the application.
     */
    - (void)StopPlayWithAVSpeechSynthesizer {
        [synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate];
    }

Hope this helps you build the HTML2Speech feature.

There are two parts to a solution here:

  1. Presumably you don’t care about the formatting in the HTML. After all, by the time it gets to the speech synthesizer, this text is to be spoken, not viewed. AVSpeechSynthesizer takes plain text, so you just need to get rid of the HTML markup. One easy way to do that is to create an NSAttributedString from the HTML, then ask that attributed string for its underlying plain-text string (its string property) and pass that text to the synthesizer.

  2. In iOS 10 and later you don’t even have to extract the string from the attributed string: you can pass an attributed string directly to AVSpeechUtterance.
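
The two parts above can be sketched as follows. This is only an illustration, not a definitive implementation: the function name UtteranceFromHTML is hypothetical, and the sketch assumes it is called on the main thread (the HTML importer should not be used from a background thread) with UTF-8 input.

```objective-c
#import <UIKit/UIKit.h>
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper (not part of any framework): builds an utterance
// from an HTML string, or returns nil if the HTML cannot be parsed.
static AVSpeechUtterance *UtteranceFromHTML(NSString *html) {
    NSData *data = [html dataUsingEncoding:NSUTF8StringEncoding];
    NSError *error = nil;

    // Part 1: let NSAttributedString parse the markup.
    NSAttributedString *attributed =
        [[NSAttributedString alloc] initWithData:data
                                         options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                                                   NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)}
                              documentAttributes:nil
                                           error:&error];
    if (attributed == nil) {
        NSLog(@"Could not parse HTML: %@", error);
        return nil;
    }

    if (@available(iOS 10.0, *)) {
        // Part 2: on iOS 10+ the attributed string can be passed directly.
        return [[AVSpeechUtterance alloc] initWithAttributedString:attributed];
    }
    // Earlier systems: fall back to the underlying plain-text string.
    return [[AVSpeechUtterance alloc] initWithString:attributed.string];
}
```

A caller would then hand the result to a synthesizer it keeps a strong reference to, for example `[synthesizer speakUtterance:UtteranceFromHTML(html)]` after checking for nil, since speech can stop if the synthesizer is deallocated.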

One way or another, you will always end up parsing the HTML into something else if you don’t want the raw file contents read out. If the client wants a direct HTML2Speech solution, you can provide a method that takes an HTML file as an argument and reads it. What happens to this file under the hood should not bother the client much, as long as it is clean and does not cause problems.

What happens when the client asks for Markdown2Speech or XML2Speech? From what I see in your description, it is better for now to have one framework with two public methods, Text2Speech and HTML2Speech, each taking a link to a file or an NSString as its argument.

So, as @rickster suggests, it can be an NSAttributedString or an NSString. There are a lot of parsers out there, or, if you want your own solution, you can remove everything inside < and > and change the encoding.
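
As a rough sketch of that design, the class below exposes two public methods and uses the "remove everything inside < and >" approach for HTML. The class name SpeechReader and its method names are hypothetical, chosen only for illustration. The tag stripping is deliberately naive: it does not decode entities such as &amp;amp;, does not skip script or style bodies, and does not change the encoding.

```objective-c
#import <Foundation/Foundation.h>
#import <AVFoundation/AVFoundation.h>

// Hypothetical facade; names are illustrative, not from any framework.
@interface SpeechReader : NSObject
- (void)speakText:(NSString *)text;   // the "Text2Speech" entry point
- (void)speakHTML:(NSString *)html;   // the "HTML2Speech" entry point
+ (NSString *)plainTextFromHTML:(NSString *)html;
@end

@implementation SpeechReader {
    AVSpeechSynthesizer *_synthesizer; // kept alive while speech is playing
}

- (instancetype)init {
    if ((self = [super init])) {
        _synthesizer = [[AVSpeechSynthesizer alloc] init];
    }
    return self;
}

+ (NSString *)plainTextFromHTML:(NSString *)html {
    // Replace every <...> run with a space so adjacent words stay separated.
    NSRegularExpression *tags =
        [NSRegularExpression regularExpressionWithPattern:@"<[^>]*>" options:0 error:NULL];
    NSString *stripped = [tags stringByReplacingMatchesInString:html
                                                        options:0
                                                          range:NSMakeRange(0, html.length)
                                                   withTemplate:@" "];
    // Collapse the whitespace runs left behind by the removed tags.
    NSRegularExpression *spaces =
        [NSRegularExpression regularExpressionWithPattern:@"\\s+" options:0 error:NULL];
    NSString *collapsed = [spaces stringByReplacingMatchesInString:stripped
                                                           options:0
                                                             range:NSMakeRange(0, stripped.length)
                                                      withTemplate:@" "];
    return [collapsed stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

- (void)speakText:(NSString *)text {
    [_synthesizer speakUtterance:[AVSpeechUtterance speechUtteranceWithString:text]];
}

- (void)speakHTML:(NSString *)html {
    // HTML2Speech reduces to Text2Speech once the markup is gone.
    [self speakText:[SpeechReader plainTextFromHTML:html]];
}
@end
```

Keeping the conversion in its own method means a later Markdown2Speech or XML2Speech would only need another converter in front of speakText:. One known trade-off of replacing tags with spaces is that inline tags inside a word split it in two.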

The safest method is to extract the text and use an existing text2speech API.

If you are sure that the browser will be Chrome, though, the Speech Synthesis API may be helpful. This API is still not fully adopted by all browsers, so it would be a risky solution.

You can find necessary info regarding this API at

There is no direct API for HTML to speech except the Speech Synthesis API mentioned above. Though you can try But I think this one is also based on the browser’s Speech Synthesis or on speech generation at the server. So to use this one, you would have to extract the text and pass it to the API to get the speech.