[Wikireader] wikipedia-iphone on XO.

Chris Ball cjb at laptop.org
Wed Apr 30 17:39:58 EDT 2008


Hi,

I checked out the wikipedia-iphone¹ project, and it seems pretty
suitable for making XO snapshots.  Their goal is to serve Wikipedia
content out of a compressed copy of the XML dump file after indexing
it.  The architecture is that there are C functions to pull out
articles, and several interfaces to those C functions:  the main
interface is the iPhone app, but there's also a web server (written
in Ruby with Mongrel) that runs locally and serves up pages from
the compressed archive.
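
To make the "index, then serve out of the compressed file" idea
concrete, here is a minimal Python sketch.  It is not the
wikipedia-iphone on-disk format (which I have not reproduced here); it
assumes a made-up layout in which small groups of articles are
compressed as independent bz2 streams and an in-memory index maps each
title to its stream.  The names build_archive and fetch_article, and
the NUL separator, are hypothetical.

```python
import bz2


def build_archive(articles, per_block=2):
    """Pack (title, text) pairs into concatenated bz2 streams plus an index.

    Returns (archive_bytes, index), where index maps
    title -> (block_offset, block_length, position_in_block).
    """
    archive = bytearray()
    index = {}
    items = list(articles)
    for start in range(0, len(items), per_block):
        group = items[start:start + per_block]
        # Assumption: NUL never appears in article text, so it can
        # separate the articles that share one compressed block.
        raw = "\0".join(text for _, text in group).encode("utf-8")
        block = bz2.compress(raw)
        for pos, (title, _) in enumerate(group):
            # len(archive) is the offset at which this block will start.
            index[title] = (len(archive), len(block), pos)
        archive += block
    return bytes(archive), index


def fetch_article(archive, index, title):
    """Decompress only the block that holds `title` and return its text."""
    offset, length, pos = index[title]
    raw = bz2.decompress(archive[offset:offset + length])
    return raw.decode("utf-8").split("\0")[pos]


sample = [
    ("Madrid", "Madrid es la capital de España."),
    ("Lima", "Lima es la capital del Perú."),
    ("Bogotá", "Bogotá es la capital de Colombia."),
]
archive, index = build_archive(sample)
print(fetch_article(archive, index, "Lima"))
```

Only one small block is decompressed per lookup, which is what keeps
a large archive usable on a machine with little RAM; larger blocks
compress better but cost more time per page.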

I have this (the Ruby web server) working on a local machine here,
serving up the entirety of the Spanish Wikipedia text (the compressed
.bz2 file that pages are served out of is 400M, and the index is 10M).

Here are some possible next steps, perhaps even in priority order:

* Port the straightforward Ruby/Mongrel code to Python/BaseHTTPServer
  so that it can run as a standalone activity on the XO.

* Do some article selection.  Since it serves files out of the
  .xml.bz2, we can accomplish this by choosing what goes into the
  .xml.bz2 (perhaps there are already tools for doing this?  I don't
  know much about it), as long as we deal with the links we break as
  a result.

* Add a subset of images.

* Serve up unparsed text with a JavaScript parser rather than using
  the (incomplete) wikitext->HTML parser inside wikipedia-iphone.

* Use 7zip instead of bz2 -- for the Spanish Wikipedia text with
  history, this is the difference between 1.6GB (7zip) and 8.3GB
  (bz2).  With this, we could expect our 400M snapshot to drop below
  100M, which is in the realm of being able to include it on every XO
  in a deployment.  I'm most interested in Spanish because it's the
  language that most of OLPC's current deployments use.
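
The size estimate in the last bullet is just a ratio, and a rough
sketch of the arithmetic follows.  It assumes the 1.6GB/8.3GB ratio
measured on the with-history dump carries over to the 400M
current-text snapshot, which is optimistic, since revision history is
far more repetitive than current text.  compare_codecs is a
hypothetical helper for measuring a real sample; it uses Python's lzma
module (the LZMA family that 7zip uses by default, here in an .xz
container rather than .7z).

```python
import bz2
import lzma


def projected_size_mb(current_mb, small_gb=1.6, large_gb=8.3):
    """Scale a bz2 size by the 7zip/bz2 ratio seen on the history dump."""
    return current_mb * small_gb / large_gb


def compare_codecs(data):
    """Return raw, bz2 and lzma sizes in bytes for one sample of data."""
    return {
        "raw": len(data),
        "bz2": len(bz2.compress(data)),
        "lzma": len(lzma.compress(data)),
    }


print(round(projected_size_mb(400)))  # prints 77
```

Running compare_codecs over a slice of the real dump would give a
better figure than the projection, since it measures the text we
would actually ship.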

You can temporarily see the Ruby system serving up the Spanish Wikipedia
text at http://pullcord.laptop.org:9090/ .
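
For the first bullet above, a Python port of a server like this one
might start from the sketch below.  It is not a translation of the
Ruby/Mongrel code: ARTICLES and lookup_article are hypothetical
stand-ins for the C article-lookup functions, and render_page is a
made-up helper.  It uses http.server, which is the Python 3 name for
the BaseHTTPServer module, and binds to localhost only.

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote

# Hypothetical stand-in for the indexed, compressed archive.
ARTICLES = {"Madrid": "Madrid es la capital de España."}


def lookup_article(title):
    """Return the wikitext for `title`, or None if it is not present."""
    return ARTICLES.get(title)


def render_page(title):
    """Return (status_code, body_bytes) for a requested title."""
    text = lookup_article(title)
    if text is None:
        return 404, b"<html><body><h1>Not found</h1></body></html>"
    page = (
        "<html><head><meta charset='utf-8'><title>{t}</title></head>"
        "<body><h1>{t}</h1><pre>{b}</pre></body></html>"
    ).format(t=html.escape(title), b=html.escape(text))
    return 200, page.encode("utf-8")


class WikiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Map "/Some_Title" to the article title "Some Title".
        title = unquote(self.path.lstrip("/")).replace("_", " ")
        status, body = render_page(title)
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main(port=9090):
    """Serve on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), WikiHandler).serve_forever()
```

Calling main() and browsing to http://127.0.0.1:9090/Madrid should
show the sample page.  The text is served escaped inside a pre
element, so this sidesteps the wikitext->HTML question entirely.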

Is anyone interested in volunteering for any of these bullet points, or
adding more?  I'll also get in touch with the wikipedia-iphone community
and ask if they're interested in working on any of these with us.

Thanks!

- Chris.

¹:  http://collison.ie/wikipedia-iphone/
-- 
Chris Ball   <cjb at laptop.org>

