[Wikireader] english wikireaders and 0.7
Madeleine Ball
meprice at gmail.com
Fri Sep 5 14:04:23 EDT 2008
AIUI they've given us the list here, but I haven't had a chance to
look at it yet. I think it would be best to meet sometime after we've
played with it -- which has to wait until this weekend. Maybe on Monday?
On Sep 5, 2008, at 1:46 PM, Samuel Klein wrote:
> Thanks for the update. bozmo, it's great to hear your group is
> working on assessments as well... we won't be able to wait another two
> weeks for a revised version list, but may be able to recompile once
> next week. However, I think for olpc's coming release we want a final
> draft bundle this weekend.
>
> Warmly,
> SJ
>
> On Thu, Sep 4, 2008 at 5:01 PM, Martin Walker
> <walkerma at potsdam.edu> wrote:
>> We found a bug in the SelectionBot script that was affecting some
>> unassessed articles. That has now been fixed, and there is now an
>> updated set of results, with about 28,000 articles selected.
>>
>> http://toolserver.org/~cbm/release-data/2008-9-4/HTML/index.html
>>
>>
>> As for the small detailed fixes, we'll have to work on those at
>> the weekend.
>>
>> Martin
>> Walkerma on Wikipedia
>>
>> Samuel Klein wrote:
>>>
>>> ok, let's meet friday at 1500 EST on #kiwix on freenode,
>>> for those who can make it, to discuss making a main page for an
>>> english 0.7 wikipedia bundle.
>>>
>>> SJ
>>>
>>> On Thu, Aug 28, 2008 at 12:20 PM, Martin Pascal
>>> <pmartin at linterweb.com> wrote:
>>>
>>> Yes Sj ,
>>>
>>> you could join #kiwix on irc.freenode.net
>>> Cordialement
>>> Martin Pascal
>>> tel : 02 32 40 23 69, fax : 02 32 61 45 26
>>> gsm : 06 13 89 77 32
>>> ----- Original Message -----
>>> From: "Martin Walker" <walkerma at potsdam.edu>
>>> To: "Samuel Klein" <sj at laptop.org>
>>> Cc: "Madeleine Ball" <mad at printf.net>;
>>> "Offline Wikireaders" <wikireader at lists.laptop.org>
>>> Sent: Thursday, August 28, 2008 6:16 PM
>>>
>>> Subject: Re: [Wikireader] english wikireaders and 0.7
>>>
>>>
>>> SJ,
>>>
>>> I can manage an IRC meeting on Friday - say at 3pm EDT (1900h UTC)?
>>> If this is difficult for others, I will be around next week. We have
>>> the #wikipedia-1.0 channel ( irc://irc.freenode.net/#wikipedia-1.0 )
>>> if you wish, but perhaps you have a wikireader channel that may be
>>> more appropriate?
>>>
>>> Martin
>>>
>>>
>>> Samuel Klein wrote:
>>>
>>> @martin -- How about having a Friday afternoon wikireader meeting?
>>> For this week, whether or not we meet, a pressing question is:
>>> Generating the main page. For the spanish WP, Madeleine did most of
>>> the main page by hand with a bit of help. We may have to do the same
>>> here until better scripts are set up.
>>>
>>> A couple people built the main page for our spanish-language bundle
>>> more or less by hand from a portal template.
>>>
>>> Metadata :
>>>
>>> 1. metadata that is currently particularly useful for us is:
>>> - a blacklist of article titles, and a blacklist of images, for the
>>> very few that we explicitly leave out despite other metadata
>>> - a whitelist of both, again to ensure inclusion.
>>>
>>> 2. In a general system, I'd like to see this tagged with the name of
>>> the group associated; say olpc-peru-blacklist and olpc-peru-whitelist.
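
A minimal sketch of how such per-group lists might be applied to an article selection. Only the olpc-peru-blacklist / olpc-peru-whitelist tag names come from the message above; the function name, the sample titles, and the rule that the blacklist wins when a title appears on both lists are assumptions made for illustration.

```python
def apply_lists(selected, blacklist, whitelist):
    """Return the final, sorted title list for a bundle.

    Whitelisted titles are added even if the selection missed them.
    Assumption: a title on both lists is excluded (blacklist wins).
    """
    final = (set(selected) | set(whitelist)) - set(blacklist)
    return sorted(final)


# Hypothetical per-group lists, keyed by the tag names suggested above.
lists = {
    "olpc-peru-blacklist": {"Example excluded article"},
    "olpc-peru-whitelist": {"Peru", "Lima"},
}

# Hypothetical output of the automated selection.
selected = ["Dinosaur", "Example excluded article", "Peru"]

bundle = apply_lists(
    selected,
    lists["olpc-peru-blacklist"],
    lists["olpc-peru-whitelist"],
)
print(bundle)
```

The same function would work for image lists, since it only compares names.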
>>>
>>> @cfabian -- testing this on bee units sounds like a fun test of the
>>> metadata slimming!
>>>
>>> SJ
>>>
>>> ps - any news from the offline spanish wp project that got started a
>>> while back?
>>>
>>>
>>> On Sun, Aug 24, 2008 at 6:12 PM, Martin Walker
>>> <walkerma at potsdam.edu> wrote:
>>>
>>> Things are looking very promising for the Version 0.7 selection -
>>> we should have a complete article list within a week or so,
>>> containing about 30,000 articles organized by a combination of
>>> quality and importance. With our basic system of compression
>>> (using, I think, probably Zeno format), I believe we should be able
>>> to include 30,000 long-ish articles with thumbnails on one DVD,
>>> along with Kiwix and some index pages. I'd be interested to see
>>> how it would work with your compression system - we could get a
>>> few people to test that, I think.
>>>
>>> I know how you love metadata, SJ, and we now have loads of it
>>> (from 1.4 million articles) - so we can customize the selection
>>> for you at will using quality, wikiproject, or the four importance
>>> parameters. Since this is for kids in specific places, we can
>>> emphasize dinosaurs or birds, exclude serial killers, or include
>>> all articles from (say) Uganda, all as requested. Let me know if
>>> this feature is useful. We don't have an equivalent ranking for
>>> images, I'm afraid - for V0.7 we just include all legal images (as
>>> thumbnails). As for a "main page", the plan is to have a set of
>>> index pages generated by bot and then corrected by a manual
>>> "reality check", but that will take another month or two.
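
The kind of customization described above could look roughly like the sketch below. It is not the SelectionBot logic: the record fields, the importance labels, the WikiProject names, and the selection rules are all assumptions chosen to mirror the dinosaurs / serial killers / Uganda example.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    title: str
    quality: str       # assumed label such as "FA" or "Start"; unused by the rules below
    importance: str    # assumed labels: "Top", "High", "Mid", "Low"
    wikiprojects: set = field(default_factory=set)


def customize(articles, emphasize=(), exclude=(), include_all=()):
    """Pick titles by WikiProject tags, falling back to importance.

    Hypothetical rules: excluded projects are always dropped, include_all
    projects are always kept, emphasized projects are kept unless
    importance is "Low", and everything else needs "Top" or "High".
    """
    emphasize, exclude, include_all = set(emphasize), set(exclude), set(include_all)
    chosen = []
    for a in articles:
        if a.wikiprojects & exclude:
            continue
        if a.wikiprojects & include_all:
            chosen.append(a.title)
        elif a.wikiprojects & emphasize and a.importance != "Low":
            chosen.append(a.title)
        elif a.importance in ("Top", "High"):
            chosen.append(a.title)
    return chosen


# Hypothetical sample records.
articles = [
    Article("Tyrannosaurus", "FA", "Mid", {"Dinosaurs"}),
    Article("Kampala", "Start", "Low", {"Uganda"}),
    Article("Example killer", "B", "High", {"Serial killers"}),
    Article("Water", "GA", "Top", {"Chemistry"}),
    Article("Obscure mineral", "Start", "Low", {"Geology"}),
]

titles = customize(
    articles,
    emphasize={"Dinosaurs"},
    exclude={"Serial killers"},
    include_all={"Uganda"},
)
print(titles)
```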
>>>
>>> I'd really like to make sure that we work together in the coming
>>> months, because I think we can avoid a lot of duplicate work if we
>>> share our best resources, scripts, etc. Once the selection is done
>>> (~ 1st Sept), should we hold an IRC discussion on how we can best
>>> collaborate?
>>>
>>> Martin
>>>
>>>
>>> Samuel Klein wrote:
>>>
>>> There's lots of motivation to get an english wikireader, say,
>>> taking advantage of the article selection and processing of 0.7.
>>> OLPC could include this in the upcoming G1G1 machines this
>>> winter / early next year. Other users could test wikireaders
>>> that read this zipped format on their own machines, which
>>> would flesh out the reader code.
>>>
>>> Martin -- what's the status on the 0.7 articlelist? Do you
>>> have a similar imagelist that ranks images by importance to
>>> that set of articles?
>>> How is work on a 0.7 main page? I'd love to see how large a
>>> snapshot is with our current wikireader code (without even
>>> moving to 7z, or trimming the list).
>>>
>>> SJ
>>>
>>> _______________________________________________
>>> Wikireader mailing list
>>> Wikireader at lists.laptop.org
>>> http://lists.laptop.org/listinfo/wikireader
>>>