[Bookreader] [IAEP] Text to Speech readers for XO

Thu Nov 5 06:46:31 EST 2009

Hi Mike,
Do you know of any open source application that can handle DAISY
books on Linux?
Thanks,
Sayamindu

On Wed, Nov 4, 2009 at 4:20 AM, Mike McCabe <mccabe at archive.org> wrote:
> Hi all -
>
> I'm working on creating DAISY (and epub) books for the Archive, and I'm very
> happy to answer any questions.  DAISY and epub books have many similarities.
>
> Mike
>
>
> Sayamindu Dasgupta wrote:
>>
>> The Internet Archive has started to distribute books as DAISY
>> (http://en.wikipedia.org/wiki/DAISY_Digital_Talking_Book), something
>> we should definitely take a look at. We might also consider leveraging
>> the GNOME accessibility framework to provide book-reading features for
>> Epubs and PDFs in Read - it may be tricky, but the end results would
>> be worth it.
>> Thanks,
>> Sayamindu
>>
>>
>> On Fri, Oct 30, 2009 at 5:34 AM, Samuel Klein <meta.sj at gmail.com> wrote:
>>>
>>> Bumping up this recent thread on the bookreader list about
>>> text-to-speech.
>>> Mike and Gregor, in case you haven't seen what's currently possible:
>>>
>>> I believe James S's Read Etexts uses speech-dispatcher to read selected
>>> text. Aleksey and others may have done further work with espeak...  I've
>>> included some old threads from the Sugar list this past spring below.
>>>
>>> SJ
>>>
>>>
>>> On Thu, Oct 29, Mike McCabe <mccabe at archive.org> wrote:
>>>
>>> I also think this is a great idea.  I've worked with several
>>> text-to-speech readers recently, as part of my effort to make the
>>> Internet Archive books available to print disabled people.
>>>
>>> They're very useful, and I think that this mode of reading could be of
>>> use to a very broad range of users.  I suspect we'll see more of it soon.
>>>
>>> I'm also curious to hear about specific experiences with
>>> linux-compatible free TTS, as we may be producing audio books with this
>>> to work with the new Library of Congress audio players.
>>>
>>> Best regards -
>>> Mike
>>>
>>>
>>>
>>>
>>> == [1] old note from James Simmons ==
>>> ( in response to this speech-synthesis summer of code proposal:
>>> http://wiki.sugarlabs.org/go/speech-synthesis )
>>>
>>> Chirag,
>>>
>>> Since you have been working with Aleksey Lim you probably know about
>>> text to speech with highlighting in Read Etexts.  I wrote the original
>>> TTS code that used speech-dispatcher with some assistance from Hemant
>>> Goyal and the folks on the speech-dispatcher project.  Aleksey
>>> refactored my code so it could work with either speech-dispatcher or his
>>> own gstreamer espeak plugin.  Not only does his plugin need no
>>> configuration to work, it also does a LOT better in producing timely
>>> callbacks as it reads each word.
>>>
>>> As you point out in your proposal, highlighting the word as it is spoken
>>> is a big part of the benefit of what you're proposing.  If all you
>>> wanted to do was capture some highlighted text in the clipboard and have
>>> it spoken in a voice you can configure in a control panel, that would be
>>> easy, even trivial.  It's the highlighting that's difficult.  When I
>>> added speech to Read Etexts I deliberately tried for the simplest
>>> approach that would get the job done.  It reads only the current page.
>>> It always starts either at the first word on the page, or if speech has
>>> been paused, it resumes with the last word spoken.  You can't choose the
>>> word to start on.  The Activity itself receives the callbacks as each
>>> word is spoken and takes care of doing the highlight and scrolling the
>>> textarea so the highlighted word stays on the screen.
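
A rough sketch of the page-reading behaviour James describes above, in
Python. It is not the Read Etexts code: the class name `PageReader`, the
`play` method and its `stop_after` argument are hypothetical, and the speech
engine is replaced by a plain loop that fires one callback per word.

```python
class PageReader:
    """Tracks which word on the current page is being spoken (illustrative only)."""

    def __init__(self, page_text, on_word):
        self.words = page_text.split()   # only the current page is read
        self.on_word = on_word           # called with (index, word), e.g. to highlight
        self.resume_index = 0            # 0 means "start at the first word on the page"

    def word_spoken(self, index):
        """Callback a speech engine would fire as each word is spoken."""
        self.resume_index = index        # a later resume restarts at the last word spoken
        self.on_word(index, self.words[index])

    def play(self, stop_after=None):
        """Simulate speech from the resume point; stop_after mimics pressing pause."""
        spoken = 0
        for i in range(self.resume_index, len(self.words)):
            if stop_after is not None and spoken == stop_after:
                return                   # paused part-way through the page
            self.word_spoken(i)
            spoken += 1
        self.resume_index = 0            # page finished: next play starts at the top


if __name__ == "__main__":
    reader = PageReader("one two three four", lambda i, w: print(i, w))
    reader.play(stop_after=2)            # speaks "one", "two", then pauses
    reader.play()                        # resumes with "two", the last word spoken
```

In this sketch, resuming after a pause repeats the last word spoken, which is
how the description above reads; the reader cannot choose a starting word.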
>>>
>>> If I had to write a facility that did what Read Etexts does outside of
>>> the Activity I wouldn't know how to do it.  It seems to me that
>>> highlighting is best done by the Activity itself.  I can't deny that it
>>> would be useful to have all this work done as you have described without
>>> the Activity knowing anything about it, but it doesn't seem feasible.
>>> You'd have to have something that could work with gtk textareas, the
>>> evince component Read uses, Abiword, and everything else that came along.
>>>
>>> Another thing you'd have to deal with is PDFs composed of scanned in
>>> book pages.  There are a lot of these around (the Internet Archive is
>>> full of them) and somehow the kid trying to select words on a scanned in
>>> page would have to be clued in that these words are not selectable.
>>>
>>> I suppose you could make an Activity that grabbed whatever text was in
>>> the clipboard, displayed it in a textarea, and highlighted the words in
>>> that textarea as it spoke them.  I'm pretty sure that wasn't what you
>>> had in mind.
>>>
>>> Splitting sentences into separate words will be a challenge.  I just use
>>> spaces as delimiters and filter out characters like asterisks, vertical
>>> bars, etc.  That works OK for English but not for other languages.  If I
>>> wanted Read Etexts to do highlighting on the Bhagavad-Gita in the original
>>> Sanskrit it wouldn't work.  Even in English I get tripped up by double
>>> hyphens (--).  It would be nice if Gutenberg etexts put spaces around
>>> double hyphens but they don't.
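
A minimal sketch of the splitting approach described above, assuming Python.
The function name `split_words`, the `NOISE_CHARS` set and the
`pad_double_hyphens` option are illustrative assumptions, not taken from
Read Etexts.

```python
# Characters dropped before splitting; the exact set is an assumption
# based on "asterisks, vertical bars, etc." in the note above.
NOISE_CHARS = "*|"


def split_words(text, pad_double_hyphens=False):
    """Split text into words on whitespace after removing noise characters."""
    if pad_double_hyphens:
        # Put spaces around "--" so the words on either side come apart.
        text = text.replace("--", " -- ")
    cleaned = text.translate({ord(c): None for c in NOISE_CHARS})
    return cleaned.split()


if __name__ == "__main__":
    sample = "It was *late*--too late."
    print(split_words(sample))                           # "late--too" stays fused
    print(split_words(sample, pad_double_hyphens=True))  # "--" becomes its own token
```

With padding turned on, "--" is emitted as a separate token, so a highlighter
would need to decide whether to skip it. Whitespace splitting of this kind
still does not address the non-English case raised above.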
>>>
>>> It looks like you've picked a challenging project, and I would love to be
>>> proven wrong about everything I've mentioned here.  Good luck with this,
>>>
>>> James Simmons
>>>
>>>
>>> == 2: SynPhony and reading assistance ==
>>>
>>> On Tue, Feb 17, 2009 at 12:48 PM, Carol Farlow Lerche <cafl at msbit.com> wrote:
>>>>
>>>> I'd like to call your attention again to SynPhony.  We are close to a base
>>>> release (probably this week) of a 44,000 word English word database that
>>>> has a very rich array of information helpful to the teaching of English,
>>>> especially reading.  A 10,000 word Spanish lexicon and a 50,000 word
>>>> German one will follow.  Norbert Rennert, who compiled these, would like
>>>> very much to work with other language experts to extend this effort to
>>>> other languages.
>>>> Some highlights of the English lexicon: it is screened from the CMU Sphinx
>>>> corpus for accessibility to children; each word entry has frequency data
>>>> from analysis with respect to a large corpus of text merged in, phoneme
>>>> breakdown (used by reading curricula to decide the order in which words
>>>> should be introduced or deemed decodable), etymology, semantic domain
>>>> (categorization), IPA coding, syllabification and stress marking.
>>>>
>>>> The second release will merge in many images, though we don't expect to
>>>> have a complete image-to-word mapping without a volunteer effort.   We plan
>>>> to create an API and a way to define a curriculum sequence for word groups
>>>> once the basic database is released, to allow integration of the word bank
>>>> across all the activities that are literacy related, as well as create
>>>> more.  We also hope to use the word bank to score texts for reading level
>>>> and assist in creation of simplified versions of extant texts suitable for
>>>> use by emergent readers.  Please read our design documents at the above
>>>> site.
>>>>
>>>> On Tue, Feb 17, 2009 at 2:02 AM, Tomeu Vizoso <tomeu at sugarlabs.org> wrote:
>>>>>
>>>>> Aleksey has started a very interesting new path:
>>>>>
>>>>>
>>>>> http://lists.sugarlabs.org/archive/sugar-devel/2009-February/011470.html
>>>>>
>>>
>>>
>>>
>>>
>>>> Gregor Kervina wrote:
>>>>>
>>>>> Hi Sayamindu,
>>>>> thanks for quick reply!
>>>>> There is a lot of text to speech software out there - I use
>>>>> http://www.bytecool.com/coolspch.htm (you can try the trial and
>>>>> download additional voices, just to get a feeling), but it is not free
>>>>> and not for linux. Many other programs are more complex and complicated
>>>>> and some of them use very complex voice engines that in my opinion don't
>>>>> sound very good. (I use Mary voice with cool speech)
>>>>>
>>>>> OK I spent some time to find all TTS software that is free for linux
>>>>> and here are some links:
>>>>>
>>>>> http://linux-sound.org/speech.html
>>>>>
>>>>>
>>>>> http://linuxhelp.blogspot.com/2006/01/festival-text-to-speech-synthesis.html
>>>>> http://larswiki.atrc.utoronto.ca/wiki/Software  - see the links under Speech section
>>>>> http://www.xenocafe.com/tutorials/php/festival_text_to_speech/index.php
>>>>> http://www.wikihow.com/Convert-Text-to-Speech-on-Linux
>>>>> http://www.cstr.ed.ac.uk/projects/festival/
>>>>> http://www.cstr.ed.ac.uk/projects/festival/onlinedemo.html - listen to some demo voices
>>>>> http://sourceforge.net/projects/dhvani/ - this one not english
>>>>> http://sourceforge.net/projects/tts-cubed/
>>>>> http://www.speech.cs.cmu.edu/hephaestus.html - click the links in Speech Synthesis section
>>>>> http://www.speech.cs.cmu.edu/comp.speech/Section5/Synth/rsynth.html
>>>>> http://www.linux.com/archive/feature/122197 - two readers - plug-ins for firefox.
>>>>>
>>>>> I can not test them because I'm not a linux user. Maybe you can modify
>>>>> some of this software (probably Festival) for more user friendly
>>>>> reading and maybe program a specific button on the XO keyboard that will
>>>>> automatically read the selected text no matter what program is used for
>>>>> opening the text.
>>>>>
>>>>> Judging from google search results for DTBooks, this technology is not
>>>>> widespread at all. The other problem is that it sometimes uses recorded
>>>>> audio and the size of that is too large for XO... I think the most
>>>>> important thing is that TTS works with the reader that will open the 1.6M
>>>>> e-books from the Internet Archive
>>>>> <http://www.xconomy.com/boston/2009/10/24/internet-archive-opens-1-6-million-e-books-to-olpc-laptops/>
>>>>> (are you in this team?).
>>>>>
>>>>> Also one important thing is to add cheap headphones with the laptop so
>>>>> children could listen to reading without disturbing others and in
>>>>> noisy environments ... another advantage of audio reading is much longer
>>>>> battery life because you can turn off the LCD monitor and audio alone does
>>>>> not consume much energy.
>>>>>
>>>>> Let me know what you think.
>>>>> All the best,
>>>>> Gregor
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Oct 26, 2009 at 4:08 PM, Sayamindu Dasgupta <sayamindu at gmail.com> wrote:
>>>>>
>>>>>    Hi Gregor,
>>>>>    Thanks a lot for jumping in :-)
>>>>>
>>>>>    On Mon, Oct 26, 2009 at 2:38 AM, Gregor Kervina <gregor.kervina at gmail.com> wrote:
>>>>>     > Dear Sayamindu Dasgupta, SJ Klein and other members of this list,
>>>>>     >
>>>>>     > I'm a student of electrical engineering from Europe and would like to
>>>>>     > share with you my very positive experience with text to speech
>>>>>     > technology that can in my opinion significantly increase the
>>>>>     > educational potential of XO if used in the right way.
>>>>>     >
>>>>>     > For the past 12 years (since I was 15 years old) I have been learning
>>>>>     > daily from e-books and the internet using text to speech software. I
>>>>>     > know this software is unpopular in the developed world; many people
>>>>>     > don't even know that it exists. On the other hand many people
>>>>>     > (including me) don't like reading long texts on LCD screens - that's
>>>>>     > why e-books are also not very popular.
>>>>>     >
>>>>>     > But unlike my friends I read 50+ e-books every year and also daily
>>>>>     > news on the internet - I just select the text, copy it, and CoolSpeech
>>>>>     > software (using Mary voice) reads me all the text at speeds of 300 to
>>>>>     > 500 words per minute. In this way I can browse other sites or look at
>>>>>     > photos or just lie down and listen while my laptop is reading to me.
>>>>>     > Other people don't understand what I'm reading because it is too fast
>>>>>     > for them but it can be learned quickly with slower speeds at the
>>>>>     > beginning.
>>>>>     >
>>>>>     > I think XO laptops should definitely have such software pre-installed
>>>>>     > and a video introduction on how to use it and what reading speeds
>>>>>     > they can expect after some time of practicing.
>>>>>     > It is also ideal for children with poor eyesight.
>>>>>     >
>>>>>
>>>>>    This sounds awesome. Could you let us know if the text to speech
>>>>>    software you have in mind is free/opensource and if it works on Linux?
>>>>>    I am also looking at DTBooks specifications for digital talking books
>>>>>    - do you know how useful/widespread this technology is?
>>>>>
>>>>>    Thanks,
>>>>>    Sayamindu
>>>>>
>>>>>
>>>>>    --
>>>>>    Sayamindu Dasgupta
>>>>>    [http://sayamindu.randomink.org/ramblings]
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>>
>>>>> _______________________________________________
>>>>> Bookreader mailing list
>>>>> Bookreader at lists.laptop.org
>>>>> http://lists.laptop.org/listinfo/bookreader
>>>>
>>>
>>> _______________________________________________
>>> IAEP -- It's An Education Project (not a laptop project!)
>>> IAEP at lists.sugarlabs.org
>>> http://lists.sugarlabs.org/listinfo/iaep
>>>
>>
>>
>>
>

-- 
Sayamindu Dasgupta
[http://sayamindu.randomink.org/ramblings]