[OLPC library] wikimedia parser

David Farning dfarning at gmail.com
Thu Jan 3 18:36:36 EST 2008


Thanks for the feedback.
I have followed the same workflow as wikisplice with the
following goals:
1. More generalized - should work with any content that is stored on
any MediaWiki site, dictionaries included.
2. Use mwlib - calls to special:expand have been replaced with calls
to mwlib.
3. Generalize fetcher - should be able to work with live wiki or
static dumps.
4. Extend workflow - create the xol.
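
To make goal 3 concrete, here is a rough sketch of what a generalized
fetcher could look like. It is only an illustration, not the actual
code: the class names, the fetch() method and the one-file-per-page
dump layout are all hypothetical (real MediaWiki dumps are XML and
would need a proper reader). The live fetcher relies on the standard
index.php?title=...&action=raw request to get wikitext.

```python
import os
import urllib.parse
import urllib.request


class LiveWikiFetcher:
    """Fetch raw wikitext from a live MediaWiki (hypothetical helper)."""

    def __init__(self, index_url):
        # index_url is the site's index.php endpoint, for example
        # "http://en.wikipedia.org/w/index.php?"
        self.index_url = index_url.rstrip("?")

    def url_for(self, title):
        # action=raw asks MediaWiki for the unrendered wikitext.
        query = urllib.parse.urlencode({"title": title, "action": "raw"})
        return self.index_url + "?" + query

    def fetch(self, title):
        with urllib.request.urlopen(self.url_for(title)) as response:
            return response.read().decode("utf-8")


class StaticDumpFetcher:
    """Read wikitext from a directory of <title>.wiki files.

    Assumption: this on-disk layout is invented for the sketch and
    stands in for whatever static dump format is really used.
    """

    def __init__(self, dump_dir):
        self.dump_dir = dump_dir

    def fetch(self, title):
        path = os.path.join(self.dump_dir, title + ".wiki")
        with open(path, encoding="utf-8") as handle:
            return handle.read()


def fetch_items(fetcher, titles):
    """Return {title: wikitext} using whichever fetcher was supplied."""
    return {title: fetcher.fetch(title) for title in titles}
```

The rest of the workflow would only call fetch_items(), so switching
between a live wiki and a dump would mean passing a different fetcher
object and nothing else.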

Some interesting challenges:
1. Creating the item list.  I am following your lead of reading the
collection configuration information from a wiki page.  As a starting
point I am using standard Python ConfigParser syntax.

[main]
base_directory:  ./collections/

[collection]
name:       test
type:       wikimedia
language:   en
items:      A B OLPC

[site]
name:       wikipedia
index:      http://en.wikipedia.org/w/index.php?
wrapper:    wrapper.html
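
For illustration, a configuration like the one above could be read
roughly as follows. This is a minimal sketch, assuming Python 3's
configparser module (the successor to ConfigParser); load_collection()
and the dictionary it returns are hypothetical names, and treating
"items" as a whitespace-separated list of page titles is my reading of
the example rather than a settled format.

```python
import configparser

SAMPLE = """\
[main]
base_directory:  ./collections/

[collection]
name:       test
type:       wikimedia
language:   en
items:      A B OLPC

[site]
name:       wikipedia
index:      http://en.wikipedia.org/w/index.php?
wrapper:    wrapper.html
"""


def load_collection(text):
    """Parse the collection description into a plain dictionary."""
    # Interpolation is disabled so that a '%' in a URL is kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        "base_directory": parser.get("main", "base_directory"),
        "name": parser.get("collection", "name"),
        "language": parser.get("collection", "language"),
        # Assumption: items are page titles separated by whitespace.
        "items": parser.get("collection", "items").split(),
        "index": parser.get("site", "index"),
    }


config = load_collection(SAMPLE)
```

One limitation of splitting on whitespace is that page titles
containing spaces would need some other separator or quoting.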

2. Consistent style.  I have not figured this part out yet.  Is there a
web guru out there who could create a style sheet so that all of the
content would be rendered consistently?

3. Navigation. This is proving to be one of the bigger challenges.
a.  Client-side search.  Has any work been done on developing a
client-side search for dictionaries and collections of articles?

b.  Directed navigation.  Has any work been done on creating
consistent index.html files between collections?

4.  Collections of collections.  How do you handle the integration of
collections?  For example, if a child installs content bundles that
contain wikipedia articles on both physics and math, are the links
between the two collections recreated?


Thanks,
David Farning



