[Bookreader] gnubook/Read-activity update
seth at laptop.org
Tue Feb 17 11:19:11 EST 2009
It sounds like good progress is being made.
2009/2/14 Samuel Klein <sj at laptop.org>:
> Sayamindu is close to having gnubook support ready to test...
Testers and hackers, please take a look at:
https://bugs.launchpad.net/gnubook/ to help this along.
> 1) start with a list of 5000 titles : draw from the Archive,
> Gutenberg, children's collection, CK-12 (15 textbooks), & a mobile
> classics list. Make sure all are available as html, pdf, and flipbook
> [there's some instance-to-work association needed here, and
> potentially pdf-to-html conversion with image placement].
I think that some description of language should be in this or an
eventual goals statement. There is a great deal less content in
Spanish than in English in these repos, and the number of titles in
each language drops steadily after that.
Attempting X number of Spanish children's books would be a valuable statement.
> 3) write a bundling script that understands the Open Library (or
> other) metadata api and can generate an .xol collection index for a
> list of books. A single work's metadata should include a link to the
> original, and a link to each of its formats online.
I believe that I wrote a description of what this little coding
project would look like somewhere on the wiki. I will look it up.
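
As a rough starting point for item 3, here is a minimal sketch of the index-generation half of such a script. It makes no calls to the Open Library API: the record keys ('title', 'original', 'formats') and the build_index name are hypothetical, and the output is a plain HTML page, not the actual .xol bundle layout.

```python
from html import escape


def _link(url, label):
    # Build one anchor tag, escaping both the URL and the visible label.
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(label))


def build_index(title, books):
    """Return an HTML index page for a list of book records.

    Each record is a dict with hypothetical keys (not a real metadata schema):
      'title'    - display title of the work
      'original' - URL of the original work
      'formats'  - mapping of format name (e.g. 'pdf') to its online URL
    """
    rows = []
    for book in books:
        links = [_link(book["original"], "original")]
        for name, url in sorted(book["formats"].items()):
            links.append(_link(url, name))
        rows.append("<li>{}: {}</li>".format(escape(book["title"]), " | ".join(links)))
    return (
        "<html><head><title>{0}</title></head>\n"
        "<body><h1>{0}</h1>\n<ul>\n{1}\n</ul></body></html>\n"
    ).format(escape(title), "\n".join(rows))
```

A real script would fill the records from whichever metadata API is chosen and then wrap the page in the bundle format; both steps are left out here.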
> 4) related projects :
> a. define a file extension? for flipbook books, since they take
> their own reader / find another way to auto launch the reader from
> clicking on a link to a book file
I would suggest an acronym that's as multi-national as possible.
> b. make each of the formats smaller -- shrink Flipbook images, use
> text-only pdf's, compress a shelf of html books and only unpack the
> one being read (all in the reader)
I've done this kind of optimization before. If someone could help me
automate the process, I could help fine-tune the settings.
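
The "compress a shelf of html books and only unpack the one being read" idea in item b can be sketched with Python's standard zipfile module. This is only an illustration under assumptions: the function names and the one-archive-per-shelf layout are made up here, and it does not touch Flipbook images or PDFs.

```python
import zipfile


def pack_shelf(shelf_path, books):
    """Write a mapping of member name -> HTML text into one compressed archive."""
    with zipfile.ZipFile(shelf_path, "w", compression=zipfile.ZIP_DEFLATED) as shelf:
        for name, text in books.items():
            # str data is stored UTF-8 encoded by writestr().
            shelf.writestr(name, text)


def read_book(shelf_path, name):
    """Decompress and return a single book; the other members stay packed."""
    with zipfile.ZipFile(shelf_path) as shelf:
        return shelf.read(name).decode("utf-8")
```

Because a zip archive compresses each member separately, reading one book does not require unpacking the rest of the shelf, at the cost of a somewhat worse overall compression ratio than a solid archive.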
> c. publish a toolchain for converting from each format to the others
> d. use wikisource to publish and correct OCR-html from pristine PDFs
> for a wikibook version of each... rate limited and on demand, so as
> not to flood ws.
What's the difference between using wikisource and Distributed
Proofreaders (Project Gutenberg)? I've had more experience with DPPG
than ws personally. If DP / Gutenberg were affiliated with archive.org,
would this help us at all for organization?
> e. extend Read testing to epub and djvu formats
Isn't the current Read (based on Evince?) able to read .djvu formats already?
> f. figure out a flexible way to let Read as well as other tools read
> multiple formats : txt, html, zip. [is it necessary to load all of
> Browse to read a simple html page?]
Browse isn't needed to render html. You will need *most* of Firefox,
however. The Help activity includes a stripped-down version of a
browser, packaged and maintained as HulaHop. I don't know if this is
the best route; it is still rather large.
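
For the format-dispatch part of item f, one possible shape is a table of loaders keyed by file extension. The sketch below uses only the Python standard library, and all names in it are hypothetical. Note that it does not render html: it merely strips tags to plain text, so it is no substitute for a browser component when layout or images matter.

```python
import os
import zipfile
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect the text content of an HTML document, discarding the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def load_txt(path):
    with open(path, encoding="utf-8") as f:
        return f.read()


def load_html(path):
    parser = _TextOnly()
    parser.feed(load_txt(path))
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


def load_zip(path):
    # For an archive, return a listing of its members, one per line.
    with zipfile.ZipFile(path) as archive:
        return "\n".join(archive.namelist())


# Hypothetical registry: new formats are added by registering a loader.
LOADERS = {".txt": load_txt, ".html": load_html, ".zip": load_zip}


def open_book(path):
    """Pick a loader from the file extension and return plain text."""
    ext = os.path.splitext(path)[1].lower()
    try:
        loader = LOADERS[ext]
    except KeyError:
        raise ValueError("unsupported format: " + ext) from None
    return loader(path)
```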
> Now that people will have a choice of formats to use, let them give
> real feedback. Define a 30-min test suite for picking a collection
> and a book in it and testing various reader options. Find heavy
> reader already addicted to reading longer works on their laptops or
> mobiles, get input.
Jennifer, would this kind of focus group be your expertise?