[Wikireader] english wikireaders and 0.7

Martin Walker walkerma at potsdam.edu
Thu Sep 4 17:01:51 EDT 2008


We found a bug in the SelectionBot script that was affecting some 
unassessed articles.  That has been fixed, and there is now an 
updated set of results, with about 28,000 articles selected.

 http://toolserver.org/~cbm/release-data/2008-9-4/HTML/index.html


As for the small detailed fixes, we'll have to work on those at the weekend.

Martin
Walkerma on Wikipedia

Samuel Klein wrote:
> ok, let's meet Friday at 1500 EST on #kiwix on freenode,
> for those who can make it, to discuss making a main page for an 
> English 0.7 Wikipedia bundle.
>
> SJ
>
> On Thu, Aug 28, 2008 at 12:20 PM, Martin Pascal <pmartin at linterweb.com 
> <mailto:pmartin at linterweb.com>> wrote:
>
>     Yes SJ,
>
>     you could join #kiwix on irc.freenode.net <http://irc.freenode.net>
>     Cordialement
>     Martin Pascal
>     tel : 02 32 40 23 69, fax : 02 32 61 45 26
>     gsm : 06 13 89 77 32
>     ----- Original Message ----- From: "Martin Walker"
>     <walkerma at potsdam.edu <mailto:walkerma at potsdam.edu>>
>
>     To: "Samuel Klein" <sj at laptop.org <mailto:sj at laptop.org>>
>     Cc: "Madeleine Ball" <mad at printf.net <mailto:mad at printf.net>>;
>     "Offline Wikireaders" <wikireader at lists.laptop.org
>     <mailto:wikireader at lists.laptop.org>>
>     Sent: Thursday, August 28, 2008 6:16 PM
>
>     Subject: Re: [Wikireader] english wikireaders and 0.7
>
>
>         SJ,
>
>         I can manage an IRC meeting on Friday - say at 3pm EDT (1900h UTC)?
>         If this is difficult for others, I will be around next week.  We
>         have the #wikipedia-1.0 channel
>         ( irc://irc.freenode.net/#wikipedia-1.0
>         <http://irc.freenode.net/#wikipedia-1.0> ) if you wish, but
>         perhaps you have a wikireader channel that may be more appropriate?
>
>         Martin
>
>
>         Samuel Klein wrote:
>
>             @martin -- How about having a Friday afternoon wikireader
>             meeting?  For this week, whether or not we meet, a pressing
>             question is generating the main page.  For the Spanish WP,
>             Madeleine did most of the main page by hand with a bit of
>             help.  We may have to do the same here until better scripts
>             are set up.
>
>             A couple of people built the main page for our
>             Spanish-language bundle more or less by hand from a portal
>             template.
>
>             Metadata:
>
>             1. Metadata that is currently particularly useful for us is:
>              - a blacklist of article titles, and a blacklist of images,
>                for the very few that we explicitly leave out despite
>                other metadata
>              - a whitelist of both, again to ensure inclusion.
>
>             2. In a general system, I'd like to see this tagged with the
>             name of the group associated; say olpc-peru-blacklist and
>             olpc-peru-whitelist.
>
>             @cfabian -- testing this on bee units sounds like a fun test
>             of the metadata slimming!
>
>             SJ
>
>             ps - any news from the offline Spanish WP project that got
>             started a while back?
>
>
>             On Sun, Aug 24, 2008 at 6:12 PM, Martin Walker
>             <walkerma at potsdam.edu <mailto:walkerma at potsdam.edu>
>             <mailto:walkerma at potsdam.edu
>             <mailto:walkerma at potsdam.edu>>> wrote:
>
>                Things are looking very promising for the Version 0.7
>                selection - we should have a complete article list within a
>                week or so, containing about 30,000 articles organized by a
>                combination of quality and importance.  With our basic
>                system of compression (using, I think, probably Zeno
>                format), I believe we should be able to include 30,000
>                long-ish articles with thumbnails on one DVD, along with
>                Kiwix and some index pages.  I'd be interested to see how
>                it would work with your compression system - we could get a
>                few people to test that, I think.
>
>                I know how you love metadata, SJ, and we now have loads of
>                it (from 1.4 million articles) - so we can customize the
>                selection for you at will using quality, wikiproject, or
>                the four importance parameters.  Since this is for kids in
>                specific places, we can emphasize dinosaurs or birds,
>                exclude serial killers, or include all articles from (say)
>                Uganda, all as requested.  Let me know if this feature is
>                useful.  We don't have an equivalent ranking for images,
>                I'm afraid - for V0.7 we just include all legal images (as
>                thumbnails).  As for a "main page", the plan is to have a
>                set of index pages generated by bot and then corrected by a
>                manual "reality check", but that will take another month or
>                two.
>
>                I'd really like to make sure that we work together in the
>                coming months, because I think we can avoid a lot of
>                duplicate work if we share our best resources, scripts,
>                etc.  Once the selection is done (~ 1st Sept), should we
>                hold an IRC discussion on how we can best collaborate?
>
>                Martin
>
>
>                Samuel Klein wrote:
>
>                    There's lots of motivation to get an English
>                    wikireader, say, taking advantage of the article
>                    selection and processing of 0.7.  OLPC could include
>                    this in the upcoming G1G1 machines this winter / early
>                    next year.  Other users could test wikireaders that
>                    read this zipped format on their own machines, which
>                    would flesh out the reader code.
>
>                    Martin -- what's the status on the 0.7 articlelist?
>                    Do you have a similar imagelist that ranks images by
>                    importance to that set of articles?
>                    How is work on a 0.7 main page?  I'd love to see how
>                    large a snapshot is with our current wikireader code
>                    (without even moving to 7z, or trimming the list).
>
>                    SJ
>
>
>
>
>
>
>
>         _______________________________________________
>         Wikireader mailing list
>         Wikireader at lists.laptop.org <mailto:Wikireader at lists.laptop.org>
>         http://lists.laptop.org/listinfo/wikireader
>
>
>




