[OLPC library] Preferred font size for ebooks on the XO?

Peter Hollings phollings at ipt.org
Fri Jul 11 13:38:46 EDT 2008



On Thu, Jul 10, 2008 at 11:49 AM, Seth Woodworth <seth at laptop.org> wrote:
> If the material currently exists in a PDF that's fine.  But if at all 
> possible an ebook should be in html, and a pdf should be converted 
> into html.

On Thu, 10 Jul 2008 14:17:50 -0700, "Edward Cherlin" <echerlin at gmail.com>
wrote:
> Is that a documented decision or an opinion? I know of a number of ebooks
> in PDF format in Sugar distributions, and none in HTML.


I'm curious about the rationale for HTML. Granted, it's re-flowable and it
can be read in a web browser, which also supports the broadest range of media
types. But there are better tools for reading ebooks than web browsers, with
features like searching (within a single document or across a collection of
documents), bookmarking, document annotation, etc. (See FBReader or Adobe
Digital Editions as examples.)

The ANSI-recognized National Information Standards Organization sets
standards in this area and their "A Framework of Guidance for Building Good
Digital Collections" (available at http://www.niso.org/publications/rp/ )
provides guidance for digital libraries. This framework is widely observed
(see the Digital Library Federation,
http://www.diglib.org/standards/imlsframe.htm). The Recommendation makes
distinctions between "born digital" materials and non-digital source
materials such as printed matter, and it includes recommended formats for a
broad range of media types. For born digital textual materials the
recommendations are basically either PDF/A or XML-based standards like the
Open epub format (see http://www.openebook.org/). For non-digital source
printed materials they recommend TIFF or JPEG2000 formats. See the
discussion beginning on page 26 for recommended formats.

One advantage of following the NISO standards is the interoperability
provided with other libraries. For example, the Digital Library of India
(http://dli.iiit.ac.in/) has embarked on building a million-book collection.
OLPC would thus have access to a large store of materials without the
up-front trouble of digitizing them. The Digital Library of India uses the
TIFF format with an OCR'd text version to provide searchability. A
TIFF-rendering browser plugin is used for display. I'm not sure why the Indian
project gave such priority to TIFF, but I suspect that most of the material
that is either out of copyright, or out of print and freely licensable, was
available only as printed pages that had to be scanned. Conversely, I suspect
that most of the born digital material is so recent that it is still
protected by copyright.

Another set of considerations that relates to file formats might be the
choice of library software, e.g., Greenstone (open source project funded by
the United Nations) or DSpace (MIT/Hewlett Packard), to support functions
such as a searchable catalog, subject classification, rights management, etc.


Peter Hollings


