[Wikireader] english wikireaders and 0.7

Samuel Klein sj at laptop.org
Fri Sep 5 13:46:26 EDT 2008


Thanks for the update.  bozmo, it's great to hear your group is
working on assessments as well... we won't be able to wait another two
weeks for a revised version list, but may be able to recompile once
next week.  However, I think for olpc's coming release we want a final
draft bundle this weekend.

Warmly,
SJ

On Thu, Sep 4, 2008 at 5:01 PM, Martin Walker <walkerma at potsdam.edu> wrote:
> We found a bug in the SelectionBot script that was affecting some unassessed
> articles.  That has now been fixed, and there is now an updated set of
> results, with about 28,000 articles selected.
>
> http://toolserver.org/~cbm/release-data/2008-9-4/HTML/index.html
>
>
> As for the small detailed fixes, we'll have to work on those at the weekend.
>
> Martin
> Walkerma on Wikipedia
>
> Samuel Klein wrote:
>>
>> ok, let's meet friday at 1500 EST  on #kiwix on freenode,
>> for those who can make it, to discuss making a main page for an english
>> 0.7 wikipedia bundle.
>>
>> SJ
>>
>> On Thu, Aug 28, 2008 at 12:20 PM, Martin Pascal <pmartin at linterweb.com> wrote:
>>
>>    Yes SJ,
>>
>>    you could join #kiwix on irc.freenode.net
>>    Regards,
>>    Martin Pascal
>>    tel : 02 32 40 23 69, fax : 02 32 61 45 26
>>    gsm : 06 13 89 77 32
>>    ----- Original Message -----
>>    From: "Martin Walker" <walkerma at potsdam.edu>
>>    To: "Samuel Klein" <sj at laptop.org>
>>    Cc: "Madeleine Ball" <mad at printf.net>;
>>    "Offline Wikireaders" <wikireader at lists.laptop.org>
>>    Sent: Thursday, August 28, 2008 6:16 PM
>>    Subject: Re: [Wikireader] english wikireaders and 0.7
>>
>>
>>        SJ,
>>
>>        I can manage an IRC meeting on Friday - say at 3pm EDT (1900h UTC)?
>>        If this is difficult for others, I will be around next week.  We
>>        have the #wikipedia-1.0 channel
>>        ( irc://irc.freenode.net/#wikipedia-1.0 ) if you wish, but perhaps
>>        you have a wikireader channel that may be more appropriate?
>>
>>        Martin
>>
>>
>>        Samuel Klein wrote:
>>
>>            @martin -- How about having a Friday afternoon wikireader
>>            meeting?  For this week, whether or not we meet, a pressing
>>            question is: generating the main page.  For the Spanish WP,
>>            Madeleine did most of the main page by hand with a bit of
>>            help.  We may have to do the same here until better scripts
>>            are set up.
>>
>>            A couple of people built the main page for our
>>            Spanish-language bundle more or less by hand from a portal
>>            template.
>>
>>            Metadata:
>>
>>            1. Metadata that is currently particularly useful for us is:
>>             - a blacklist of article titles, and a blacklist of images,
>>               for the very few that we explicitly leave out despite
>>               other metadata
>>             - a whitelist of both, again to ensure inclusion.
>>
>>            2. In a general system, I'd like to see this tagged with the
>>            name of the group associated; say olpc-peru-blacklist and
>>            olpc-peru-whitelist.
>>
>>            @cfabian -- testing this on bee units sounds like a fun test
>>            of the metadata slimming!
>>
>>            SJ
>>
>>            ps - any news from the offline Spanish WP project that got
>>            started a while back?
>>
>>
>>            On Sun, Aug 24, 2008 at 6:12 PM, Martin Walker
>>            <walkerma at potsdam.edu> wrote:
>>
>>               Things are looking very promising for the Version 0.7
>>               selection - we should have a complete article list within
>>               a week or so, containing about 30,000 articles organized
>>               by a combination of quality and importance.  With our
>>               basic system of compression (using, I think, probably Zeno
>>               format), I believe we should be able to include 30,000
>>               long-ish articles with thumbnails on one DVD, along with
>>               Kiwix and some index pages.  I'd be interested to see how
>>               it would work with your compression system - we could get
>>               a few people to test that, I think.
>>
>>               I know how you love metadata, SJ, and we now have loads of
>>               it (from 1.4 million articles) - so we can customize the
>>               selection for you at will using quality, wikiproject, or
>>               the four importance parameters.  Since this is for kids in
>>               specific places, we can emphasize dinosaurs or birds,
>>               exclude serial killers, or include all articles from (say)
>>               Uganda, all as requested.  Let me know if this feature is
>>               useful.  We don't have an equivalent ranking for images,
>>               I'm afraid - for V0.7 we just include all legal images (as
>>               thumbnails).  As for a "main page", the plan is to have a
>>               set of index pages generated by bot and then corrected by
>>               a manual "reality check", but that will take another month
>>               or two.
>>
>>               I'd really like to make sure that we work together in the
>>               coming months, because I think we can avoid a lot of
>>               duplicate work if we share our best resources, scripts,
>>               etc.  Once the selection is done (~ 1st Sept), should we
>>               hold an IRC discussion on how we can best collaborate?
>>
>>               Martin
>>
>>
>>               Samuel Klein wrote:
>>
>>                   There's lots of motivation to get an English
>>                   wikireader, say, taking advantage of the article
>>                   selection and processing of 0.7.  OLPC could include
>>                   this in the upcoming G1G1 machines this winter / early
>>                   next year.  Other users could test wikireaders that
>>                   read this zipped format on their own machines, which
>>                   would flesh out the reader code.
>>
>>                   Martin -- what's the status on the 0.7 articlelist?
>>                   Do you have a similar imagelist that ranks images by
>>                   importance to that set of articles?
>>                   How is work on a 0.7 main page?  I'd love to see how
>>                   large a snapshot is with our current wikireader code
>>                   (without even moving to 7z, or trimming the list).
>>
>>                   SJ
>>
>>
>>
>>
>>
>>
>>
>>        _______________________________________________
>>        Wikireader mailing list
>>        Wikireader at lists.laptop.org
>>        http://lists.laptop.org/listinfo/wikireader
>>
>>
>>
>
>
>
>
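
For reference, the group-tagged blacklist/whitelist idea from the quoted thread (olpc-peru-blacklist, olpc-peru-whitelist) could be sketched roughly as below.  This is only an illustration, not the SelectionBot or wikireader code: the function names, the dict-of-sets layout, and the rule that the whitelist is applied last (so it wins if a title is on both lists) are all assumptions.  The same function would work for image names.

```python
from typing import Dict, Iterable, List, Set


def list_name(group: str, kind: str) -> str:
    """Build a group-tagged list name such as 'olpc-peru-blacklist'."""
    return f"{group}-{kind}"


def apply_group_lists(
    selected: Iterable[str],
    lists: Dict[str, Set[str]],
    group: str,
) -> List[str]:
    """Filter a selection through one group's blacklist and whitelist.

    Hypothetical helper: `lists` maps list names to sets of titles.
    """
    blacklist = lists.get(list_name(group, "blacklist"), set())
    whitelist = lists.get(list_name(group, "whitelist"), set())

    kept: List[str] = []
    seen: Set[str] = set()

    # Keep the original order, dropping blacklisted titles and duplicates.
    for title in selected:
        if title not in blacklist and title not in seen:
            kept.append(title)
            seen.add(title)

    # Assumption: the whitelist is applied last, so it overrides the blacklist.
    for title in sorted(whitelist):
        if title not in seen:
            kept.append(title)
            seen.add(title)

    return kept
```

With lists = {"olpc-peru-blacklist": {"B"}, "olpc-peru-whitelist": {"D"}}, calling apply_group_lists(["A", "B", "C"], lists, "olpc-peru") drops "B" and appends "D".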

