[Wikireader] english wikireaders and 0.7

Samuel Klein sj at laptop.org
Fri Sep 5 18:09:48 EDT 2008


longest. 5. min. meeting. ever.  and awesome :-)  I am hopeful for
snapshots by monday, and agreed to do any filtering/blacklisting.
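
For reference, the filtering/blacklisting step could look roughly like the sketch below. It is only an illustration, not the actual wikireader or SelectionBot code. It assumes the selection is a plain list of article titles and that the blacklist and whitelist described further down the thread are simple collections of titles. The function name `apply_lists` is made up, and letting the whitelist win when a title is on both lists is an assumption.

```python
def apply_lists(selected, blacklist=(), whitelist=()):
    """Drop blacklisted titles from a selection and force-include whitelisted ones.

    Assumption: a title that appears on both lists is kept (whitelist wins).
    Order of the original selection is preserved; whitelist-only titles are
    appended at the end, and duplicates are dropped.
    """
    blocked = set(blacklist) - set(whitelist)
    result = []
    seen = set()
    for title in list(selected) + list(whitelist):
        if title in blocked or title in seen:
            continue
        seen.add(title)
        result.append(title)
    return result


if __name__ == "__main__":
    # Hypothetical titles, for illustration only.
    selection = ["Dinosaur", "Bird", "Serial killer"]
    print(apply_lists(selection,
                      blacklist=["Serial killer"],
                      whitelist=["Uganda"]))
```

Per-group lists (say olpc-peru-blacklist and olpc-peru-whitelist) would just be different inputs to the same function.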

On Fri, Sep 5, 2008 at 2:31 PM, Samuel Klein <sj at laptop.org> wrote:
> This will be a 5-minute meeting then, to review who is doing what by
> when, and exchange cheer :)  SJ
>
> On Fri, Sep 5, 2008 at 2:04 PM, Madeleine Ball <meprice at gmail.com> wrote:
>> AIUI they've given us the list here, but I haven't had a chance to look at
>> it yet. I think it would be best to meet sometime after we've played with it
>> -- which has to wait until this weekend. Maybe on Monday?
>>
>> On Sep 5, 2008, at 1:46 PM, Samuel Klein wrote:
>>
>>> Thanks for the update.  bozmo, it's great to hear your group is
>>> working on assessments as well... we won't be able to wait another two
>>> weeks for a revised version list, but may be able to recompile once
>>> next week.  However, I think for olpc's coming release we want a final
>>> draft bundle this weekend.
>>>
>>> Warmly,
>>> SJ
>>>
>>> On Thu, Sep 4, 2008 at 5:01 PM, Martin Walker <walkerma at potsdam.edu>
>>> wrote:
>>>>
>>>> We found a bug in the SelectionBot script that was affecting some
>>>> unassessed articles.  That has now been fixed, and there is now an
>>>> updated set of results, with about 28,000 articles selected.
>>>>
>>>> http://toolserver.org/~cbm/release-data/2008-9-4/HTML/index.html
>>>>
>>>>
>>>> As for the small detailed fixes, we'll have to work on those at the
>>>> weekend.
>>>>
>>>> Martin
>>>> Walkerma on Wikipedia
>>>>
>>>> Samuel Klein wrote:
>>>>>
>>>>> ok, let's meet friday at 1500 EST  on #kiwix on freenode,
>>>>> for those who can make it, to discuss making a main page for an english
>>>>> 0.7 wikipedia bundle.
>>>>>
>>>>> SJ
>>>>>
>>>>> On Thu, Aug 28, 2008 at 12:20 PM, Martin Pascal <pmartin at linterweb.com
>>>>> <mailto:pmartin at linterweb.com>> wrote:
>>>>>
>>>>>   Yes Sj ,
>>>>>
>>>>>   you could join #kiwix on irc.freenode.net <http://irc.freenode.net>
>>>>>   Cordialement
>>>>>   Martin Pascal
>>>>>   tel : 02 32 40 23 69, fax : 02 32 61 45 26
>>>>>   gsm : 06 13 89 77 32
>>>>>   ----- Original Message ----- From: "Martin Walker"
>>>>>   <walkerma at potsdam.edu <mailto:walkerma at potsdam.edu>>
>>>>>
>>>>>   To: "Samuel Klein" <sj at laptop.org <mailto:sj at laptop.org>>
>>>>>   Cc: "Madeleine Ball" <mad at printf.net <mailto:mad at printf.net>>;
>>>>>   "Offline Wikireaders" <wikireader at lists.laptop.org
>>>>>   <mailto:wikireader at lists.laptop.org>>
>>>>>   Sent: Thursday, August 28, 2008 6:16 PM
>>>>>
>>>>>   Subject: Re: [Wikireader] english wikireaders and 0.7
>>>>>
>>>>>
>>>>>       SJ,
>>>>>
>>>>>       I can manage an IRC meeting on Friday - say at 3pm EDT (1900h UTC)?
>>>>>       If this is difficult for others, I will be around next week.  We
>>>>>       have the #wikipedia-1.0 channel
>>>>>       ( irc://irc.freenode.net/#wikipedia-1.0
>>>>>       <http://irc.freenode.net/#wikipedia-1.0> ) if you wish, but perhaps
>>>>>       you have a wikireader channel that may be more appropriate?
>>>>>
>>>>>       Martin
>>>>>
>>>>>
>>>>>       Samuel Klein wrote:
>>>>>
>>>>>           @martin -- How about having a Friday afternoon wikireader
>>>>>           meeting?  For this week, whether or not we meet, a pressing
>>>>>           question is: Generating the main page.  For the spanish WP,
>>>>>           Madeleine did most of the main page by hand with a bit of
>>>>>           help.  We may have to do the same here until better scripts
>>>>>           are set up.
>>>>>
>>>>>           A couple people built the main page for our spanish-language
>>>>>           bundle more or less by hand from a portal template.
>>>>>
>>>>>           Metadata :
>>>>>
>>>>>           1. metadata that is currently particularly useful for us is:
>>>>>            - a blacklist of article titles, and a blacklist of images,
>>>>>              for the very few that we explicitly leave out despite
>>>>>              other metadata
>>>>>            - a whitelist of both, again to ensure inclusion.
>>>>>
>>>>>           2. In a general system, I'd like to see this tagged with the
>>>>>              name of the group associated; say olpc-peru-blacklist and
>>>>>              olpc-peru-whitelist.
>>>>>
>>>>>           @cfabian -- testing this on bee units sounds like a fun test
>>>>>           of the metadata slimming!
>>>>>
>>>>>           SJ
>>>>>
>>>>>           ps - any news from the offline spanish wp project that got
>>>>>           started a while back?
>>>>>
>>>>>
>>>>>           On Sun, Aug 24, 2008 at 6:12 PM, Martin Walker
>>>>>           <walkerma at potsdam.edu> wrote:
>>>>>
>>>>>              Things are looking very promising for the Version 0.7
>>>>>              selection - we should have a complete article list within
>>>>>              a week or so, containing about 30,000 articles organized
>>>>>              by a combination of quality and importance.  With our
>>>>>              basic system of compression (using, I think, probably Zeno
>>>>>              format), I believe we should be able to include 30,000
>>>>>              long-ish articles with thumbnails on one DVD, along with
>>>>>              Kiwix and some index pages.  I'd be interested to see how
>>>>>              it would work with your compression system - we could get
>>>>>              a few people to test that, I think.
>>>>>
>>>>>              I know how you love metadata, SJ, and we now have loads of
>>>>>              it (from 1.4 million articles) - so we can customize the
>>>>>              selection for you at will using quality, wikiproject, or
>>>>>              the four importance parameters.  Since this is for kids in
>>>>>              specific places, we can emphasize dinosaurs or birds,
>>>>>              exclude serial killers, or include all articles from (say)
>>>>>              Uganda, all as requested.  Let me know if this feature is
>>>>>              useful.  We don't have an equivalent ranking for images,
>>>>>              I'm afraid - for V0.7 we just include all legal images (as
>>>>>              thumbnails).  As for a "main page", the plan is to have a
>>>>>              set of index pages generated by bot and then corrected by
>>>>>              a manual "reality check", but that will take another month
>>>>>              or two.
>>>>>
>>>>>              I'd really like to make sure that we work together in the
>>>>>              coming months, because I think we can avoid a lot of
>>>>>              duplicate work if we share our best resources, scripts,
>>>>>              etc.  Once the selection is done (~ 1st Sept), should we
>>>>>              hold an IRC discussion on how we can best collaborate?
>>>>>
>>>>>              Martin
>>>>>
>>>>>
>>>>>              Samuel Klein wrote:
>>>>>
>>>>>                  There's lots of motivation to get an english
>>>>>                  wikireader, say, taking advantage of the article
>>>>>                  selection and processing of 0.7 .  OLPC could include
>>>>>                  this in the upcoming G1G1 machines this winter / early
>>>>>                  next year.  Other users could test wikireaders that
>>>>>                  read this zipped format on their own machines, which
>>>>>                  would flesh out the reader code.
>>>>>
>>>>>                  Martin -- what's the status on the 0.7 articlelist?
>>>>>                  Do you have a similar imagelist that ranks images by
>>>>>                  importance to that set of articles?
>>>>>                  How is work on a 0.7 main page?  I'd love to see how
>>>>>                  large a snapshot is with our current wikireader code
>>>>>                  (without even moving to 7z, or trimming the list).
>>>>>
>>>>>                  SJ
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>       _______________________________________________
>>>>>       Wikireader mailing list
>>>>>       Wikireader at lists.laptop.org <mailto:Wikireader at lists.laptop.org>
>>>>>       http://lists.laptop.org/listinfo/wikireader
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>
>>
>

