<div dir="ltr">@martin -- How about having a Friday afternoon wikireader meeting?<br>For this week, whether or not we meet, a pressing question is generating the main page. For the Spanish WP, Madeleine did most of the main page by hand with a bit of help. We may have to do the same here until better scripts are set up.<br>
<br>A couple of people built the main page for our Spanish-language bundle more or less by hand from a portal template.<br><br>Metadata:<br><br>1. The metadata that is currently most useful for us is:<br> - a blacklist of article titles and a blacklist of images, for the very few that we explicitly leave out despite other metadata<br>
 - a whitelist of both, again to ensure inclusion.<br><br>2. In a general system, I'd like to see these lists tagged with the name of the associated group; say olpc-peru-blacklist and olpc-peru-whitelist.<br><br>@cfabian -- testing this on bee units sounds like a fun test of the metadata slimming!<br>
<br>SJ<br><br>ps - any news from the offline Spanish WP project that got started a while back?<br><br><br><div class="gmail_quote">On Sun, Aug 24, 2008 at 6:12 PM, Martin Walker <span dir="ltr">&lt;<a href="mailto:walkerma@potsdam.edu">walkerma@potsdam.edu</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Things are looking very promising for the Version 0.7 selection - we should have a complete article list within a week or so, containing about 30,000 articles organized by a combination of quality and importance. With our basic system of compression (using, I think, probably Zeno format), I believe we should be able to include 30,000 long-ish articles with thumbnails on one DVD, along with Kiwix and some index pages. I'd be interested to see how it would work with your compression system - we could get a few people to test that, I think.<br>
<br>
I know how you love metadata, SJ, and we now have loads of it (from 1.4 million articles) - so we can customize the selection for you at will using quality, wikiproject, or the four importance parameters. Since this is for kids in specific places, we can emphasize dinosaurs or birds, exclude serial killers, or include all articles from (say) Uganda, all as requested. Let me know if this feature is useful. We don't have an equivalent ranking for images, I'm afraid - for V0.7 we just include all legal images (as thumbnails). As for a "main page", the plan is to have a set of index pages generated by bot and then corrected by a manual "reality check", but that will take another month or two.<br>
<br>
I'd really like to make sure we work together in the coming months, because I think we can avoid a lot of duplicate work if we share our best resources, scripts, etc. Once the selection is done (~ 1st Sept), should we hold an IRC discussion on how we can best collaborate?<br>
<font color="#888888">
<br>
Martin</font><div><div></div><div class="Wj3C7c"><br>
<br>
Samuel Klein wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
There's lots of motivation to get an English wikireader, say, taking advantage of the article selection and processing of 0.7.<br>
OLPC could include this in the upcoming G1G1 machines this winter / early next year. Other users could test wikireaders that read this zipped format on their own machines, which would flesh out the reader code.<br>
<br>
Martin -- what's the status on the 0.7 articlelist? Do you have a similar imagelist that ranks images by importance to that set of articles?<br>
How is work on a 0.7 main page? I'd love to see how large a snapshot is with our current wikireader code (without even moving to 7z, or trimming the list).<br>
<br>
SJ<br>
<br>
</blockquote>
<br>
<br>
<br>
</div></div></blockquote></div><br></div>