[Bookreader] gnubook/Read-activity update

Tue Feb 17 11:19:11 EST 2009

It sounds like good progress is being made.

2009/2/14 Samuel Klein <sj at laptop.org>:

> Sayamindu is close to having gnubook support ready to test...

Testers and hackers, please take a look at:
https://bugs.launchpad.net/gnubook/ to help this along.

> Goals:
>  1) start with a list of 5000 titles : draw from the Archive,
> Gutenberg, children's collection, CK-12 (15 textbooks), & a mobile
> classics list.  Make sure all are available as html, pdf, and flipbook
> [there's some instance-to-work association needed here, and
> potentially pdf-to-html conversion with image placement].

I think the goals statement, now or in a later revision, should say
something about language.  These repositories hold far less content in
Spanish than in English, and the count for each further language drops
steadily after that.

Committing to X number of Spanish children's books would be a valuable
statement.

>  3) write a bundling script that understands the Open Library (or
> other) metadata api and can generate an .xol collection index for a
> list of books. A single work's metadata should include a link to the
> original, and a link to each of its formats online.
>

I believe I have written a description of what this little coding
project would look like somewhere on the wiki.  I will look it up.
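
In the meantime, here is a rough sketch of the index-generating half of
such a script.  It is only an illustration under assumptions: the shape
of the metadata (a dict with "title", "original", and "formats" keys) is
hypothetical, it does not call the Open Library API, and it does not
attempt the .xol packaging itself, only an HTML index page that links
each work to its original and to each of its formats.

```python
from html import escape


def build_index(collection_title, works):
    """Return an HTML index page for a list of works.

    Each work is a dict (hypothetical shape, not an Open Library schema):
      'title'    -> display title
      'original' -> URL of the original record
      'formats'  -> dict mapping a format name to its URL
    """
    title = escape(collection_title)
    lines = [
        f"<html><head><title>{title}</title></head><body>",
        f"<h1>{title}</h1>",
        "<ul>",
    ]
    for work in works:
        # Link to the original first, then to every available format.
        links = [f'<a href="{escape(work["original"], quote=True)}">original</a>']
        for fmt, url in sorted(work["formats"].items()):
            links.append(f'<a href="{escape(url, quote=True)}">{escape(fmt)}</a>')
        lines.append(f"<li>{escape(work['title'])}: {', '.join(links)}</li>")
    lines += ["</ul>", "</body></html>"]
    return "\n".join(lines)
```

Fetching the metadata and wrapping the result into a collection bundle
would sit on either side of this function.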

>  4) related projects :
>  a. define a file extension? for flipbook books, since they take
> their own reader / find another way to auto launch the reader from
> clicking on a link to a book file

I would suggest an acronym that's as multi-national as possible.

>  b. make each of the formats smaller -- shrink Flipbook images, use
> text-only pdf's, compress a shelf of html books and only unpack the
> one being read (all in the reader)

I've done this kind of optimization before.  If someone could help me
automate the process, I could help fine-tune the settings.
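
For the "compress a shelf and only unpack the one being read" part, a
minimal sketch with Python's standard zipfile module could look like
this.  The function names and the one-file-per-book layout are made up
for illustration; this is not how Read currently works.

```python
import zipfile


def pack_shelf(shelf_path, books):
    """Write books (dict: member name -> HTML text) into one compressed zip."""
    with zipfile.ZipFile(shelf_path, "w", compression=zipfile.ZIP_DEFLATED) as shelf:
        for name, text in books.items():
            # str data is stored UTF-8 encoded by writestr().
            shelf.writestr(name, text)


def read_book(shelf_path, name):
    """Decompress and return only the named book; the rest stay packed."""
    with zipfile.ZipFile(shelf_path) as shelf:
        return shelf.read(name).decode("utf-8")
```

Because a zip archive compresses each member separately, opening one
book does not require unpacking the whole shelf.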

>  c. publish a toolchain for converting from each format to the others
>  d. use wikisource to publish and correct OCR-html from pristine PDFs
> for a wikibook version of each... rate limited and on demand, so as
> not to flood ws.

What's the difference between using wikisource and Distributed
Proofreaders (Project Gutenberg)?  I've personally had more experience
with DPPG than with ws.  If DP / Gutenberg were affiliated with
archive.org, would that help us at all with organization?

>  e. extend Read testing to epub and djvu formats

Isn't the current Read (based on Evince?) already able to read .djvu files?

>  f. figure out a flexible way to let Read as well as other tools read
> multiple formats : txt, html, zip. [is it necessary to load all of
> Browse to read a simple  html page?]

Browse isn't needed to render html.  You will need *most* of Firefox,
however.  The Help activity includes a stripped-down version of a
browser, packaged and maintained as HulaHop.  I don't know if this is
the best route; it is still rather large.

> Now that people will have a choice of formats to use, let them give
> real feedback.  Define a 30-min test suite for picking a collection
> and a book in it and testing various reader options.  Find heavy
> reader already addicted to reading longer works on their laptops or
> mobiles, get input.

Jennifer, would this kind of focus group fall within your expertise?