[Wikireader] [Techteam] Fw: Reducir tamaño Wikipedia en XS

Fabien Coulon fabien.coulon at gmail.com
Tue Feb 10 08:13:45 EST 2009


Thanks, we will work on a test version.
Samuel, about cross-linking between corpora: the zeno:// URLs make it
possible, though there is no fallback yet for a missing corpus. A page in a
missing corpus is shown as a blank page :) It's something to work on.
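
A minimal sketch of what such a fallback could look like, in Python. It
assumes links of the form zeno://<corpus>/<article title>, which is only a
guess at the URL layout; resolve_zeno and the installed dictionary are
hypothetical names used for illustration, not part of any existing reader.

```python
from urllib.parse import urlsplit, unquote


def resolve_zeno(url, corpora):
    """Return the page text for a zeno:// link, or a placeholder message.

    `corpora` maps a corpus name to a dict of {article title: page text}.
    The zeno://<corpus>/<title> layout is an assumption made for this sketch.
    """
    parts = urlsplit(url)
    if parts.scheme != "zeno":
        raise ValueError(f"not a zeno URL: {url!r}")
    corpus = parts.netloc
    title = unquote(parts.path.lstrip("/"))
    if corpus not in corpora:
        # Fallback for a missing corpus: say so instead of showing a blank page.
        return f"[The corpus '{corpus}' is not installed; cannot show '{title}'.]"
    return corpora[corpus].get(
        title, f"[No article '{title}' in corpus '{corpus}'.]"
    )


# Hypothetical data: only one corpus is installed.
installed = {"eswiki": {"Peru": "Perú es un país de América del Sur."}}

print(resolve_zeno("zeno://eswiki/Peru", installed))
print(resolve_zeno("zeno://enwiki/Peru", installed))
```

The second call targets a corpus that is not installed, so the reader gets
an explanatory message rather than an empty page.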

On Mon, Feb 9, 2009 at 9:12 PM, Chris Ball <cjb at laptop.org> wrote:

> Hi Fabien, SJ,
>
> > Ok. But the zeno with all articles from wikipedia 'es' takes about 1GB,
> > just for texts. Does the Peruvian selection consist of all articles
> > after removing those in http://dev.laptop.org/~cjb/eswiki/blacklist3?
>
> The status of our eswiki builds is:
>
> build 1: for XO, 80M for most popular 30k articles, plus 20M for 3000 images
> build 2: for server, 622M for all 1M articles, plus 230M for 220k images
>
> (the images in both cases are downsampled to lower quality so that we
> can include more of them)
>
> Fabien, if you'd like to try out the full build 2, here are instructions
> that should work on a 32-bit x86 Linux machine:
>
> * wget http://dev.laptop.org/~cjb/spanish_wikiserver_full.tgz
> * tar zxf spanish_wikiserver_full.tgz
> * cd wikiserver/es_PE
> * wget http://dev.laptop.org/~cjb/eswiki/images.tar.gz
> * tar zxf images.tar.gz
> * cd ..
> * python server.py es_PE/eswiki-20090124-pages-articles.xml.bz2 8000
> * browse to http://<IP address>:8000/
>
> Thanks,
>
> - Chris.
> --
> Chris Ball   <cjb at laptop.org>
>
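
To compare the two builds quoted above, here is a rough calculation of the
total size and the average size per article and per image. It takes the
figures as given in the message, which are rounded, and assumes "M" means
MiB; summarize is just an illustrative helper name.

```python
# Sizes and counts as quoted in the message; "M" is assumed to mean MiB.
builds = {
    "build 1 (XO)": {
        "text_mib": 80, "articles": 30_000,
        "image_mib": 20, "images": 3_000,
    },
    "build 2 (server)": {
        "text_mib": 622, "articles": 1_000_000,
        "image_mib": 230, "images": 220_000,
    },
}


def summarize(build):
    """Return (total MiB, average KiB per article, average KiB per image)."""
    total_mib = build["text_mib"] + build["image_mib"]
    kib_per_article = build["text_mib"] * 1024 / build["articles"]
    kib_per_image = build["image_mib"] * 1024 / build["images"]
    return total_mib, kib_per_article, kib_per_image


for name, build in builds.items():
    total, per_article, per_image = summarize(build)
    print(f"{name}: {total} MiB total, "
          f"{per_article:.2f} KiB/article, {per_image:.2f} KiB/image")
```

Since the article and image counts are approximate, the averages are only
indicative.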