Edit/audit wikipedia activity

Martin Langhoff martin.langhoff at gmail.com
Tue Oct 26 16:51:15 EDT 2010


On Thu, Oct 21, 2010 at 12:06 PM, Martin Langhoff
<martin.langhoff at gmail.com> wrote:
> Unfortunately, there is a clear need to organise a facility to
> audit/edit the wikipedia snapshots we have and "repack" the archive.

I've made some simple, rough mods to server.py to allow local edits:
start server.py with an additional argument (a path to an existing
directory) and it'll save the edits there.

Start it like this:

  ./server.py 8080 /home/martin/wikiedits

The server shows the changed files, which are saved into a 'wiki'
subdirectory of that directory.
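
For anyone curious how the extra argument might be handled, here is a
minimal sketch. It is not the actual server.py code: parse_args and
wiki_dir are hypothetical names, and the only behaviour taken from the
description above is that the directory must already exist and that
edits land in a 'wiki' subdirectory of it.

```python
import os


def parse_args(argv):
    """Return (port, edit_dir); edit_dir is None when no directory is given."""
    if len(argv) < 2:
        raise SystemExit("usage: server.py PORT [EDIT_DIR]")
    port = int(argv[1])
    edit_dir = None
    if len(argv) > 2:
        edit_dir = argv[2]
        # The directory has to exist already; this sketch does not create it.
        if not os.path.isdir(edit_dir):
            raise SystemExit("edit directory does not exist: %s" % edit_dir)
    return port, edit_dir


def wiki_dir(edit_dir):
    """Path of the 'wiki' subdirectory where edited files would be saved."""
    return os.path.join(edit_dir, "wiki")
```

With the invocation above, parse_args would be handed
["./server.py", "8080", "/home/martin/wikiedits"].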

You can check the edits thus:

  diff -ur /home/martin/wikiedits/wiki.orig /home/martin/wikiedits/wiki
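
If you only want a list of which files were touched before reading the
full diff, a rough sketch like the following would do. It is not part
of the wikiserver repository; changed_files and relative_files are
hypothetical helper names, and it assumes both trees exist side by side
as in the diff command above.

```python
import filecmp
import os


def relative_files(top):
    """Collect every file under top as a path relative to top."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, top))
    return found


def changed_files(orig_dir, edited_dir):
    """Return sorted relative paths that were added, removed, or modified."""
    orig = relative_files(orig_dir)
    edited = relative_files(edited_dir)
    # Files present in only one of the two trees.
    changed = orig ^ edited
    # Files present in both trees: compare contents byte for byte.
    for rel in orig & edited:
        same = filecmp.cmp(os.path.join(orig_dir, rel),
                           os.path.join(edited_dir, rel),
                           shallow=False)
        if not same:
            changed.add(rel)
    return sorted(changed)
```

Usage would be along the lines of
changed_files("/home/martin/wikiedits/wiki.orig", "/home/martin/wikiedits/wiki").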

And there's mergeupdates.py to... um, merge those updates:

  bzcat es_PE.xml.bz2.processed | tools/mergeupdates.py //wiki \
    | bzip2 > es_PE.xml.bz2.processed.changed

You'll have to re-create the indexes (look at what woip/sh/process
does right after processing the file).

To get the code:

  git clone git://dev.laptop.org/users/martin/wikiserver

cheers,



m
-- 
 martin.langhoff at gmail.com
 martin at laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


