[Wikireader] english wikireaders and 0.7
Samuel Klein
sj at laptop.org
Fri Sep 5 18:09:48 EDT 2008
longest. 5. min. meeting. ever. and awesome :-) I am hopeful for
snapshots by Monday, and agreed to do any filtering/blacklisting.
On Fri, Sep 5, 2008 at 2:31 PM, Samuel Klein <sj at laptop.org> wrote:
> This will be a 5-minute meeting then, to review who is doing what by
> when, and exchange cheer :) SJ
>
> On Fri, Sep 5, 2008 at 2:04 PM, Madeleine Ball <meprice at gmail.com> wrote:
>> AIUI they've given us the list here, but I haven't had a chance to look at
>> it yet. I think it would be best to meet sometime after we've played with it
>> -- which has to wait until this weekend. Maybe on Monday?
>>
>> On Sep 5, 2008, at 1:46 PM, Samuel Klein wrote:
>>
>>> Thanks for the update. bozmo, it's great to hear your group is
>>> working on assessments as well... we won't be able to wait another two
>>> weeks for a revised version list, but may be able to recompile once
>>> next week. However, I think for OLPC's coming release we want a final
>>> draft bundle this weekend.
>>>
>>> Warmly,
>>> SJ
>>>
>>> On Thu, Sep 4, 2008 at 5:01 PM, Martin Walker <walkerma at potsdam.edu>
>>> wrote:
>>>>
>>>> We found a bug in the SelectionBot script that was affecting some
>>>> unassessed articles. That has now been fixed, and there is now an
>>>> updated set of results, with about 28,000 articles selected.
>>>>
>>>> http://toolserver.org/~cbm/release-data/2008-9-4/HTML/index.html
>>>>
>>>>
>>>> As for the small detailed fixes, we'll have to work on those at the
>>>> weekend.
>>>>
>>>> Martin
>>>> Walkerma on Wikipedia
>>>>
>>>> Samuel Klein wrote:
>>>>>
>>>>> ok, let's meet Friday at 1500 EDT on #kiwix on freenode,
>>>>> for those who can make it, to discuss making a main page for an English
>>>>> 0.7 Wikipedia bundle.
>>>>>
>>>>> SJ
>>>>>
>>>>> On Thu, Aug 28, 2008 at 12:20 PM, Martin Pascal
>>>>> <pmartin at linterweb.com> wrote:
>>>>>
>>>>> Yes SJ,
>>>>>
>>>>> You could join #kiwix on irc.freenode.net
>>>>> Regards,
>>>>> Martin Pascal
>>>>> tel : 02 32 40 23 69, fax : 02 32 61 45 26
>>>>> gsm : 06 13 89 77 32
>>>>> ----- Original Message -----
>>>>> From: "Martin Walker" <walkerma at potsdam.edu>
>>>>> To: "Samuel Klein" <sj at laptop.org>
>>>>> Cc: "Madeleine Ball" <mad at printf.net>;
>>>>> "Offline Wikireaders" <wikireader at lists.laptop.org>
>>>>> Sent: Thursday, August 28, 2008 6:16 PM
>>>>> Subject: Re: [Wikireader] english wikireaders and 0.7
>>>>>
>>>>>
>>>>> SJ,
>>>>>
>>>>> I can manage an IRC meeting on Friday - say at 3pm EDT (1900h UTC)? If
>>>>> this is difficult for others, I will be around next week. We have the
>>>>> #wikipedia-1.0 channel ( irc://irc.freenode.net/#wikipedia-1.0 ) if you
>>>>> wish, but perhaps you have a wikireader channel that may be more
>>>>> appropriate?
>>>>>
>>>>> Martin
>>>>>
>>>>>
>>>>> Samuel Klein wrote:
>>>>>
>>>>> @martin -- How about having a Friday afternoon wikireader meeting?
>>>>> For this week, whether or not we meet, a pressing question is
>>>>> generating the main page. For the Spanish WP, Madeleine did most of
>>>>> the main page by hand with a bit of help. We may have to do the same
>>>>> here until better scripts are set up.
>>>>>
>>>>> A couple people built the main page for our Spanish-language bundle
>>>>> more or less by hand from a portal template.
>>>>>
>>>>> Metadata:
>>>>>
>>>>> 1. Metadata that is currently particularly useful for us is:
>>>>> - a blacklist of article titles, and a blacklist of images, for the
>>>>> very few that we explicitly leave out despite other metadata
>>>>> - a whitelist of both, again to ensure inclusion.
>>>>>
>>>>> 2. In a general system, I'd like to see this tagged with the name of
>>>>> the group associated; say olpc-peru-blacklist and olpc-peru-whitelist.
>>>>>
>>>>> @cfabian -- testing this on bee units sounds like a fun test of the
>>>>> metadata slimming!
>>>>>
>>>>> SJ
>>>>>
>>>>> ps - any news from the offline Spanish WP project that got started a
>>>>> while back?
>>>>>
>>>>>
>>>>> On Sun, Aug 24, 2008 at 6:12 PM, Martin Walker
>>>>> <walkerma at potsdam.edu> wrote:
>>>>>
>>>>> Things are looking very promising for the Version 0.7 selection -
>>>>> we should have a complete article list within a week or so,
>>>>> containing about 30,000 articles organized by a combination of
>>>>> quality and importance. With our basic system of compression
>>>>> (using, I think, probably Zeno format), I believe we should be able
>>>>> to include 30,000 long-ish articles with thumbnails on one DVD,
>>>>> along with Kiwix and some index pages. I'd be interested to see
>>>>> how it would work with your compression system - we could get a
>>>>> few people to test that, I think.
>>>>>
>>>>> I know how you love metadata, SJ, and we now have loads of it
>>>>> (from 1.4 million articles) - so we can customize the selection
>>>>> for you at will using quality, wikiproject, or the four importance
>>>>> parameters. Since this is for kids in specific places, we can
>>>>> emphasize dinosaurs or birds, exclude serial killers, or include
>>>>> all articles from (say) Uganda, all as requested. Let me know if
>>>>> this feature is useful. We don't have an equivalent ranking for
>>>>> images, I'm afraid - for V0.7 we just include all legal images (as
>>>>> thumbnails). As for a "main page", the plan is to have a set of
>>>>> index pages generated by bot and then corrected by a manual
>>>>> "reality check", but that will take another month or two.
>>>>>
>>>>> I'd really like to make sure we work together in the coming months,
>>>>> because I think we can avoid a lot of duplicate work if we share our
>>>>> best resources, scripts, etc. Once the selection is done (~ 1st Sept),
>>>>> should we hold an IRC discussion on how we can best collaborate?
>>>>>
>>>>> Martin
>>>>>
>>>>>
>>>>> Samuel Klein wrote:
>>>>>
>>>>> There's lots of motivation to get an English wikireader, say,
>>>>> taking advantage of the article selection and processing of 0.7.
>>>>> OLPC could include this in the upcoming G1G1 machines this
>>>>> winter / early next year. Other users could test wikireaders
>>>>> that read this zipped format on their own machines, which
>>>>> would flesh out the reader code.
>>>>>
>>>>> Martin -- what's the status on the 0.7 articlelist? Do you
>>>>> have a similar imagelist that ranks images by importance to
>>>>> that set of articles?
>>>>> How is work on a 0.7 main page? I'd love to see how large a
>>>>> snapshot is with our current wikireader code (without even
>>>>> moving to 7z, or trimming the list).
>>>>>
>>>>> SJ
>>>>>
>>>>> _______________________________________________
>>>>> Wikireader mailing list
>>>>> Wikireader at lists.laptop.org <mailto:Wikireader at lists.laptop.org>
>>>>> http://lists.laptop.org/listinfo/wikireader