Read Etexts now supports Text To Speech

Tue Jun 10 06:07:42 EDT 2008

On Mon, Jun 9, 2008 at 9:09 AM, James Simmons <jim.simmons at walgreens.com> wrote:
> I know there are several people interested in having Text to Speech with
> Karaoke highlighting be a built-in part of the Sugar environment.  Also,
> when I originally requested a Git repository for the Read Etexts
> activity Ed asked if text to speech with highlighting would be
> supported.  I was reluctant to commit to that at the time, thinking it
> would be too difficult.  It turned out to be both easier and more
> difficult than I thought it would be, but I have released version 4 of
> the activity which now supports TTS with the words highlighted as they
> are spoken.

Wonderful news. Now we have to talk to management about getting a
project created to support more languages. Perhaps organized like
Pootle, but with entirely different software. The idea is that we need:

* a linguistic analysis of the sound system of the language, or of any
particular dialect
* a script containing all of the sounds of the language for informants
to read for recording
* a process to create the files for the speech engine in the appropriate format
* a dictionary and an orthography engine to convert from the written
language to the required sound sequence. There will still be
ambiguities that would require strong AI to resolve. "He wound the
bandage around the wound" is a simple example of the problem.
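
To make the ambiguity concrete, here is a minimal sketch of the dictionary
lookup step. The lexicon, its ARPAbet-style phoneme strings, and the function
names are all made up for illustration; a real speech engine would use its own
dictionary format. It only shows that a plain lookup cannot choose between the
two readings of "wound".

```python
# Hypothetical toy lexicon: each spelling maps to one or more pronunciations,
# written as ARPAbet-style phoneme strings (illustrative, not from any engine).
LEXICON = {
    "he": ["HH IY"],
    "wound": ["W AW N D", "W UW N D"],  # past tense of "wind" vs. an injury
    "the": ["DH AH"],
    "bandage": ["B AE N D IH JH"],
    "around": ["ER AW N D"],
}


def candidate_pronunciations(sentence):
    """Return (word, [pronunciations]) pairs for a whitespace-separated sentence."""
    result = []
    for token in sentence.lower().split():
        word = token.strip('.,;:!?"')  # drop surrounding punctuation
        result.append((word, LEXICON.get(word, [])))
    return result


def ambiguous_words(sentence):
    """List the words that a dictionary lookup alone cannot resolve."""
    return [word for word, prons in candidate_pronunciations(sentence)
            if len(prons) > 1]


print(ambiguous_words("He wound the bandage around the wound."))
```

Both occurrences of "wound" come back as ambiguous, and picking the right
reading for each needs context that the dictionary does not have.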

> The code could be improved, no doubt.  I am fairly new to Python
> programming.  But I think trying out this Activity could give you some
> idea of what to expect if you attempt to incorporate TTS as part of the
> Sugar interface.
>
> 1).  Speech-dispatcher needs to run in a separate thread from the GTK
> event loop, otherwise the callbacks needed to highlight words won't be
> received.
> 2).  To get the callbacks as each word is spoken you need to format the
> text to be spoken as an XML document with tags *before* each word.  My
> code assumes that words are separated by whitespace, which works for
> many languages but not all of them.  I know Sanskrit doesn't work that
> way, for instance.

Nor Chinese, nor Thai, nor a number of others.
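
For anyone who wants to see what point 2 amounts to, here is a rough sketch.
It assumes the XML is SSML-style, with a <speak> root and an empty
<mark name="..."/> element before each word; James's message only says "an XML
document with tags", so the element names here are my assumption, and
words_to_ssml is a made-up helper name. It also splits on whitespace, so it
has exactly the limitation discussed above.

```python
from xml.sax.saxutils import escape


def words_to_ssml(text):
    """Wrap text in <speak> and put a numbered <mark/> before each word.

    Words are whatever str.split() yields, so this only suits languages
    that separate words with whitespace.
    """
    words = text.split()
    parts = ['<mark name="%d"/>%s' % (number, escape(word))
             for number, word in enumerate(words)]
    return "<speak>" + " ".join(parts) + "</speak>"


print(words_to_ssml("Call me Ishmael."))
```

The escape() call matters because etexts can contain "&" or "<", which would
otherwise make the document invalid XML.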

> 3).  Espeak does not always do a callback for each word, and there is
> no obvious reason why any given word would be skipped.  I understand
> that Festival works better, but I haven't tried it.  At the suggestion
> of Hynek Hanke of the speech-dispatcher project I made the tag ids for
> each tag correspond to the word number in the document.  In this way I
> can get the tag id in the callback and always highlight the correct word
> even if occasionally words are skipped over by espeak.
> 4).  Pausing and resuming speech doesn't work.  No idea why.
> 5).  The instructions for setting up speech-dispatcher on the wiki are
> obsolete.  You cannot use espeak-generic module with speech-dispatcher
> and get callbacks.  You need to use the normal espeak module.  When you
> try to use the normal espeak module with the current RPMs
> speech-dispatcher complains of a missing library.  So if you want to try
> my Activity you'll need to use sugar-jhbuild with speech-dispatcher
> installed and configured to use espeak.
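
Points 1 and 3 can be sketched together without any speech software at all.
In the sketch below, fake_speaker is a made-up stand-in for the thread that
talks to speech-dispatcher; it does not call the real library. It reports word
numbers through a queue, and the other side looks each number up in a table of
character offsets, so a skipped callback only means one word goes
unhighlighted, not that every later highlight is off by one.

```python
import queue
import re
import threading


def word_spans(text):
    """Return (start, end) character offsets of each whitespace-separated word."""
    return [match.span() for match in re.finditer(r"\S+", text)]


def fake_speaker(marks, events):
    """Hypothetical stand-in for the speech thread: report each word number
    as it is 'spoken', then None when finished."""
    for mark in marks:
        events.put(mark)
    events.put(None)


def highlights(text, reported_marks):
    """Return the words that would be highlighted, in order."""
    spans = word_spans(text)
    events = queue.Queue()
    worker = threading.Thread(target=fake_speaker,
                              args=(reported_marks, events))
    worker.start()
    result = []
    while True:
        mark = events.get()  # a real UI would poll this from its event loop
        if mark is None:
            break
        start, end = spans[int(mark)]  # the tag id is the word number
        result.append(text[start:end])
    worker.join()
    return result


# Word 2 ("brown") is never reported, as if espeak had skipped its callback.
print(highlights("the quick brown fox", ["0", "1", "3"]))
```

Because each callback carries an absolute word number and not a running
count, "fox" is still highlighted correctly after "brown" is skipped.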
>
> Hemant Goyal is working on creating RPMs for speech-dispatcher and will
> be updating the instructions on the wiki.

Is anybody interested in making the Debian/Ubuntu packages? This would
be one of my favorite demos.

> The Activity page is: http://wiki.laptop.org/go/Read_Etexts
>
> James Simmons

-- 
Edward Cherlin
End Poverty at a Profit by teaching children business
http://www.EarthTreasury.org/
"The best way to predict the future is to invent it."--Alan Kay