[Server-devel] Cutting a slice of wikipedia - CDPedia

Alejandro J. Cura alecu at vortech.com.ar
Thu Apr 10 10:13:44 EDT 2008


Hi SJ,

On Wed, Apr 9, 2008 at 7:31 PM, Samuel Klein <meta.sj at gmail.com> wrote:
> I'd like to see the auto-selection code; I don't find it in the trunk atm.
We played with some sample code and have a bunch of ideas in the
design docs, but the auto-selection is not finished yet.

> I do see hints of using mwlib, which is good; it is well-maintained.
We didn't have such a good experience using mwlib. The problem I
remember most clearly was that some strings were hard-coded for the
English and German Wikipedias, but there were a few other problems too.
Facundo Batista was working on this, and does a great job of
explaining it (again, in Spanish) here:
http://www.taniquetil.com.ar/plog/post/1/328

> For live slices, using MediaWiki's API rather than a dump, there's mwclient.
>
> http://fisheye.ts.wikimedia.org/browse/bryan/mwclient/trunk/README.txt?r=HEAD
>
> More scoring schemes are welcome.  See also wikiosity's simple
> relevance-scoring code, which takes in a few keywords and considers 1st &
> 2nd-order links.
>   http://dev.laptop.org/git?p=projects/wikiosity;a=tree

Hey, you surely have your eye set on this. Do you keep a list of all
these related projects?
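
To make the scoring idea quoted above concrete, here is a minimal
sketch of keyword-seeded relevance scoring over 1st- and 2nd-order
links. It is not wikiosity's actual code: the function name, the
weights, and the dict-of-lists link graph are all assumptions made up
for illustration.

```python
from collections import defaultdict


def score_articles(links, keywords, seed_weight=1.0,
                   first_weight=0.5, second_weight=0.25):
    """Score articles by link distance from keyword-matching seeds.

    `links` maps an article title to the titles it links to.
    All names and weights here are hypothetical.
    """
    lowered = [k.lower() for k in keywords]
    # Seeds: articles whose title contains any keyword (case-insensitive).
    seeds = [t for t in links if any(k in t.lower() for k in lowered)]

    scores = defaultdict(float)
    for seed in seeds:
        # 1st-order: articles the seed links to directly.
        first = set(links.get(seed, ()))
        first.discard(seed)
        # 2nd-order: articles linked from the 1st-order ones,
        # excluding anything already counted at a closer distance.
        second = set()
        for title in first:
            second.update(links.get(title, ()))
        second -= first
        second.discard(seed)

        scores[seed] += seed_weight
        for title in first:
            scores[title] += first_weight
        for title in second:
            scores[title] += second_weight
    return dict(scores)


# Tiny hand-made link graph, only for demonstration.
sample_links = {
    "Solar System": ["Sun", "Earth"],
    "Sun": ["Star"],
    "Earth": ["Moon", "Sun"],
    "Star": [],
    "Moon": [],
}
ranked = score_articles(sample_links, ["solar"])
```

Sorting `ranked` by score and cutting at a size budget would be one way
to pick a slice; articles never reached from a seed simply get no score.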

thanks a lot,
-- 
alecu

