Edit/audit wikipedia activity
Martin Langhoff
martin.langhoff at gmail.com
Tue Oct 26 16:51:15 EDT 2010
On Thu, Oct 21, 2010 at 12:06 PM, Martin Langhoff
<martin.langhoff at gmail.com> wrote:
> Unfortunately, there is a clear need to organise a facility to
> audit/edit the wikipedia snapshots we have and "repack" the archive.
I've made some simple, rough mods to server.py to allow local edits.
Start server.py with an additional argument (a path to an existing
directory) and it will save the edits there.
Start it like this:
./server.py 8080 /home/martin/wikiedits
The server shows the changed files, which will go into a 'wiki'
subdirectory there.
You can check the edits thus:
diff -ur /home/martin/wikiedits/wiki.orig /home/martin/wikiedits/wiki
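If you only want a list of which files were touched, rather than the full diff, a minimal Python sketch along these lines might do. It assumes the layout above (a 'wiki.orig' directory next to 'wiki'). The function name changed_files is made up for illustration and is not part of the wikiserver code.

```python
import filecmp
import os


def changed_files(orig_dir, edited_dir):
    """Return sorted relative paths that differ or exist on only one side.

    Hypothetical helper, not part of wikiserver: it only mirrors what
    `diff -ur` would flag, without showing the actual differences.
    """
    changed = []

    def walk(cmp, prefix):
        # Entries present on only one side (added or removed).
        for name in cmp.left_only + cmp.right_only:
            changed.append(os.path.join(prefix, name))
        # Files present on both sides: compare contents byte for byte,
        # since dircmp alone only compares size and mtime.
        for name in cmp.common_files:
            left = os.path.join(cmp.left, name)
            right = os.path.join(cmp.right, name)
            if not filecmp.cmp(left, right, shallow=False):
                changed.append(os.path.join(prefix, name))
        # Recurse into directories present on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(orig_dir, edited_dir), "")
    return sorted(changed)
```

Calling changed_files("/home/martin/wikiedits/wiki.orig", "/home/martin/wikiedits/wiki") should return the relative paths of the edited articles; diff -ur remains the way to see what actually changed in each.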
Then use mergeupdates.py to, um, merge those updates:
bzcat es_PE.xml.bz2.processed | tools/mergeupdates.py //wiki | bzip2 > es_PE.xml.bz2.processed.changed
You'll have to re-create the indexes (look at what woip/sh/process
does right after processing the file).
The code is available here:
git clone git://dev.laptop.org/users/martin/wikiserver
cheers,
m
--
martin.langhoff at gmail.com
martin at laptop.org -- School Server Architect
- ask interesting questions
- don't get distracted with shiny stuff - working code first
- http://wiki.laptop.org/go/User:Martinlanghoff