Cutting a slice of wikipedia - CDPedia

Martin Langhoff martin.langhoff at gmail.com
Wed Apr 9 17:48:58 EDT 2008


On Wed, Apr 9, 2008 at 2:53 PM, Samuel Klein <meta.sj at gmail.com> wrote:
> It's nice to see a python toolchain for this (though I don't see any code at
> that url?)  They exist in other languages as well.  We've been working with
> Linterweb's Kiwix (kiwix.org) and the Schools-Wikipedia, which use their own
> toolchains.

Hi SJ

I suspected that there would be something out there - Alecu's
implementation has some interesting smarts in that it does an
auto-selection of the pages to include. I'll let him explain that. The
wikislice page talks about the user providing the list of urls, which
means you need to auto-generate that somehow.

Maybe we can integrate CDPedia's scoring scheme?

[I did an svn checkout of kiwix, this thing has an embedded gecko.]

> ps - I don't see code at the google-code url... and "cdpedia" is a name used
> by a few existing projects, some commercial; you might want to choose
> another name.

Go to the code page, and click on the svn browse thingy...

> pps - Martin: simple: is nice, but not of uniform quality

Good to know! -- I wasn't expecting too much uniform-ness out of
wikipedia anyway ;-)

cheers,


m
-- 
 martin.langhoff at gmail.com
 martin at laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


