Read long text with voice on Android using TextToSpeech


I am implementing text reading with voice in

asked by A. Cedano 11.10.2017 at 16:32

2 answers


I managed to solve the problem using UtteranceProgressListener, which lets you attach an utterance id (a sound id, so to speak) to each text sent to the reader.

Once it is implemented, each time an utterance finishes playing, onDone(String utteranceId) is called with the id of the utterance that was just read.

This, combined with TextToSpeech.QUEUE_ADD in the leerTexto method, lets you add to the playback queue the different texts you obtain inside or outside a loop.
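The chaining described above can be modeled without any Android classes. The sketch below is only an illustration: UtteranceChain and nextPartIndex are hypothetical names, and the id scheme ("0" for the introductory text, "1" to "n" for the n parts) is an assumption. It shows how the id received in onDone can decide which part to queue next.

```java
import java.util.ArrayList;
import java.util.List;

/** Pure-Java model of the chaining done in onDone; no Android classes involved. */
public class UtteranceChain {

    /**
     * Returns the index of the part to queue once the utterance with id
     * doneId has finished, or -1 if no part should be queued.
     * Assumed ids: "0" for the intro, "1".."n" for the n parts.
     */
    static int nextPartIndex(String doneId, int partCount) {
        int done;
        try {
            done = Integer.parseInt(doneId);
        } catch (NumberFormatException e) {
            return -1; // non-numeric ids such as "fin" never trigger a part
        }
        return (done >= 0 && done < partCount) ? done : -1;
    }

    public static void main(String[] args) {
        String[] parts = {"first part", "second part", "third part"};
        List<String> spoken = new ArrayList<>();
        String doneId = "0"; // the intro utterance has just finished
        int next;
        while ((next = nextPartIndex(doneId, parts.length)) != -1) {
            spoken.add(parts[next]);           // stands in for leerTexto(part, id)
            doneId = String.valueOf(next + 1); // id given to the part just queued
        }
        System.out.println(spoken);
    }
}
```

In a real listener, the body of the while loop would instead run once per onDone call, with the engine reporting each finished id.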

This is the code. It can certainly be improved in some respects (for example, it still needs to check that none of the already split strings exceeds the maximum number of characters allowed). Still, it works as expected.

Method that creates the reader and passes it the texts

public void ttsFunction() {
    tts = new TextToSpeech(getApplicationContext(), new TextToSpeech.OnInitListener() {
        @Override
        public void onInit(int status) {
            if (status == TextToSpeech.SUCCESS) {
                final Locale locSpanish = new Locale("spa", "ESP");
                int result = tts.setLanguage(locSpanish);
                if (result == TextToSpeech.LANG_MISSING_DATA
                        || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                    Toast.makeText(getApplicationContext(), "Lenguaje no soportado", Toast.LENGTH_SHORT).show();
                } else {
                    // Register the listener so that onDone is called after each utterance
                    tts.setOnUtteranceProgressListener(
                            new UtteranceProgressListener() {
                                @Override
                                public void onStart(String s) {
                                    //Log.i(TAG,"Start: "+s);
                                    final TextView mTextView = (TextView) findViewById(/* view id missing in the source */);
                                    String strTexto = mTextView.getText().toString();
                                    String PARAGRAPH_SPLIT_REGEX = "…";
                                    String[] strPrimera = strTexto.split(PARAGRAPH_SPLIT_REGEX);
                                    strPartes = strPrimera[0].split("_");
                                    strOracion = strPrimera[1];
                                }

                                @Override
                                public void onDone(String s) {
                                    Log.i(TAG, "Done: " + s);

                                    // The lines that build the ids are incomplete in the source. Assumed
                                    // scheme: part i gets id "i" and is queued when utterance "i-1" is done
                                    // (the initial text uses id "0").
                                    i = 0;
                                    for (String textos : strPartes) {
                                        i++;
                                        x = i - 1;
                                        strActual = String.valueOf(i);
                                        strPrevia = String.valueOf(x);
                                        if (s.equals(strPrevia)) {
                                            leerTexto(textos, strActual);
                                        }
                                    }

                                    // Assumed guard: queue the closing texts only once, after the last part
                                    if (s.equals(String.valueOf(strPartes.length))) {
                                        leerTexto(strOracion, "Oracion");
                                        leerTexto("Fin del Oficio", "fin");
                                    }
                                    if (s.equals("fin")) {
                                        //Log.i(TAG, "Cerramos...");
                                    }
                                }

                                @Override
                                public void onError(String s) {
                                    Log.i(TAG, "Error: " + s);
                                    //Toast.makeText(OficioActivity.this, "OnError mensaje", Toast.LENGTH_SHORT).show();
                                }
                            });

                    Log.i(TAG, "onInit exitoso");
                    // Initial text
                    leerTexto("Iniciando oración...", "0");
                }
            } else {
                Toast.makeText(getApplicationContext(), "Falló la inicialización", Toast.LENGTH_SHORT).show();
            }
        }
    });
}

Method that reads the text

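void leerTexto(String strTexto, String strId) {
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP) {
        // API 21+
        Bundle bundle = new Bundle();
        bundle.putInt(TextToSpeech.Engine.KEY_PARAM_STREAM, AudioManager.STREAM_MUSIC);
        tts.speak(strTexto, TextToSpeech.QUEUE_ADD, bundle, strId);
    } else {
        // API 15 to 20. This branch is cut off in the source; the deprecated
        // speak(String, int, HashMap) overload is assumed here.
        HashMap<String, String> param = new HashMap<>();
        param.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, strId);
        tts.speak(strTexto, TextToSpeech.QUEUE_ADD, param);
    }
}

answered on 17.10.2017 at 15:04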

The actual limit when using TextToSpeech is the value returned by getMaxSpeechInputLength(), which is currently defined as 4000 characters.

public static int getMaxSpeechInputLength() {
    return 4000;
}

The option here is to split the text and play it block by block; however, this produces a pause each time a new block starts playing.
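As a rough illustration of splitting by blocks, the sketch below cuts a text into chunks no longer than a given limit, preferring sentence boundaries so that the pauses between blocks fall where a pause sounds natural. SpeechChunker and split are hypothetical names, not part of the Android API, and the use of java.text.BreakIterator is an assumption about how to find sentence ends. On Android, the limit passed in could be the value of TextToSpeech.getMaxSpeechInputLength().

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SpeechChunker {

    /** Splits text into chunks of at most maxLen characters, preferring sentence boundaries. */
    static List<String> split(String text, int maxLen) {
        if (maxLen <= 0) {
            throw new IllegalArgumentException("maxLen must be positive");
        }
        List<String> chunks = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.forLanguageTag("es-ES"));
        it.setText(text);
        StringBuilder current = new StringBuilder();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end);
            // Flush the current chunk if adding this sentence would overflow it.
            if (current.length() > 0 && current.length() + sentence.length() > maxLen) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            // A single sentence longer than maxLen is cut at the limit.
            while (sentence.length() > maxLen) {
                chunks.add(sentence.substring(0, maxLen));
                sentence = sentence.substring(maxLen);
            }
            current.append(sentence);
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "Primera frase. Segunda frase. Tercera frase.";
        for (String chunk : split(text, 30)) {
            System.out.println(chunk.length() + ": " + chunk);
        }
    }
}
```

Each chunk could then be sent to the reader with QUEUE_ADD and its own utterance id. Concatenating the chunks gives back the original text, so nothing is lost in the split.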

Because of this limitation, in my own experience I had to run a batch process to generate audio files from the texts using another option; the resulting mp3 URLs are then played with MediaPlayer.

answered on 11.10.2017 at 17:39