<div class="gmail_quote">Hi everyone. We need your help again.<br><br>We finally have a working mirror for generating the static HTML version of ESWIKI that we need for Cd-Pedia, using the DumpHTML extension.<br>But it seems the process will take about 3000 hours on our little Sempron server (roughly 4 months!).<br>
<br>How long would it take on Wikimedia's servers?<br><br><br>Thanks<br><br>(This is intentional top-posting, to give a quick update on the situation.)<br><br><br>On 1 June 2010 at 18:53, Ángel González <span dir="ltr"><<a href="mailto:keisial@gmail.com">keisial@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">On 30/04/10 17:43, Alejandro J. Cura wrote:<br>
<div><div></div><div class="h5">> Hi everyone, we need your help.<br>
><br>
> We are from Python Argentina, and we are working on adapting our<br>
> cdpedia project to make a DVD together with <a href="http://educ.ar" target="_blank">educ.ar</a> and Wikimedia<br>
> Foundation, holding the entire Spanish Wikipedia that will be sent<br>
> soon to Argentinian schools.<br>
><br>
> Hernán and Diego are the two interns tasked with updating the data<br>
> that cdpedia uses to make the cd (it currently uses a static html dump<br>
> dated June 2008), but they are encountering some problems while trying<br>
> to make an up to date static html es-wikipedia dump.<br>
><br>
> I'm ccing this list of people, because I'm sure you've faced similar<br>
> issues when making your offline wikipedias, or because maybe you know<br>
> someone who can help us.<br>
><br>
> Following is an email from Hernán describing the problems he's found.<br>
><br>
> thanks!<br>
> -- alecu - Python Argentina<br>
><br>
> 2010/4/30 Hernan Olivera <<a href="mailto:lholivera@gmail.com">lholivera@gmail.com</a>>:<br>
><br>
> Hi everybody,<br>
><br>
> I've been working on making an up-to-date static HTML dump of the<br>
> Spanish Wikipedia, to use as a basis for the DVD. I've followed the<br>
> procedures detailed in the pages below, which were used to generate the<br>
> current (and out-of-date) static HTML dumps:<br>
><br>
> 1) installing and setting up a MediaWiki instance<br>
> 2) importing the XML from [6] with mwdumper<br>
> 3) exporting the static HTML with MediaWiki's tool<br>
><br>
> The procedure finishes without throwing any errors, but the XML import<br>
> produces malformed HTML pages that have visible wikimarkup.<br>
><br>
> We really need a successful import of the Spanish XML dumps into a<br>
> MediaWiki instance so we can produce the up-to-date static HTML dump.<br>
><br>
> Links to the info I used:<br>
><br>
> [0] <a href="http://www.mediawiki.org/wiki/Manual:Installation_guide/es" target="_blank">http://www.mediawiki.org/wiki/Manual:Installation_guide/es</a><br>
> [1] <a href="http://www.mediawiki.org/wiki/Manual:Running_MediaWiki_on_Ubuntu" target="_blank">http://www.mediawiki.org/wiki/Manual:Running_MediaWiki_on_Ubuntu</a><br>
> [2] <a href="http://en.wikipedia.org/wiki/Wikipedia_database" target="_blank">http://en.wikipedia.org/wiki/Wikipedia_database</a><br>
> [3] <a href="http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps" target="_blank">http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps</a><br>
> [4] <a href="http://meta.wikimedia.org/wiki/Importing_a_Wikipedia_database_dump_into_MediaWiki" target="_blank">http://meta.wikimedia.org/wiki/Importing_a_Wikipedia_database_dump_into_MediaWiki</a><br>
> [5] <a href="http://meta.wikimedia.org/wiki/Data_dumps" target="_blank">http://meta.wikimedia.org/wiki/Data_dumps</a><br>
> [6] <a href="http://dumps.wikimedia.org/eswiki/20100331/" target="_blank">http://dumps.wikimedia.org/eswiki/20100331/</a><br>
> [7] <a href="http://www.mediawiki.org/wiki/Alternative_parsers" target="_blank">http://www.mediawiki.org/wiki/Alternative_parsers</a><br>
> (among others)<br>
><br>
> Cheers, --<br>
</div></div>Hola Hernán,<br>
<br>
You may have used one of the corrupted dumps. See<br>
<a href="https://bugzilla.wikimedia.org/show_bug.cgi?id=18694" target="_blank">https://bugzilla.wikimedia.org/show_bug.cgi?id=18694</a><br>
<a href="https://bugzilla.wikimedia.org/show_bug.cgi?id=23264" target="_blank">https://bugzilla.wikimedia.org/show_bug.cgi?id=23264</a><br>
<br>
Otherwise, did you install ParserFunctions and the other extensions needed?<br>
<br>
</blockquote></div><br><br clear="all"><br>-- <br>Hernan Olivera<br><br>