[Wikireader] english wikireaders and 0.7
Samuel Klein
sj at laptop.org
Sat Sep 6 16:21:23 EDT 2008
That's great, thank you Andrew. Do you post these changes back to wp
proper? I'd like every article revision we include in our
bundle to have a permalink online. (And it makes sense to me that
some other people who currently only read wp might like your versions
as well...)
I will certainly support you in running an SOS-bot that publishes its
preferred cleaner revisions to articles, with an edit summary
indicating that it is posting the version from the latest
childrens-wikipedia, and a bot option to self-revert and leave a
message on the talk page if editors start to get annoyed with it.
That way the regulars on any given article can choose whether to
include its changes, while the bot neither changes the current
version nor feeds edit wars that may already be under way.
SJ
(You know the content-review is overseen by a Wikipedian when... it
includes cleaning out 'births' since 1980 and 'trivia' sections in
bios. :-)
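
As a rough illustration of the section cleanup mentioned above (for
example, dropping 'trivia' sections in bios), here is a minimal Python
sketch. It is not the actual SOS cleanup script: the function name
strip_sections, the default title set, and the simplified heading
pattern are all assumptions made for illustration, and real wikitext
has edge cases this ignores.

```python
import re

# Hypothetical default; the thread mentions 'trivia' sections in bios.
UNWANTED_SECTIONS = frozenset({"trivia"})

# Simplified match for a wikitext heading such as "== Trivia ==".
HEADING = re.compile(r"^(=+)\s*(.*?)\s*(=+)\s*$")


def strip_sections(wikitext, unwanted=UNWANTED_SECTIONS):
    """Return wikitext with unwanted sections (and their subsections) removed."""
    kept = []
    skip_level = None  # heading level of the section currently being dropped
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        # Only treat the line as a heading if both "=" runs have equal length.
        if match and len(match.group(1)) == len(match.group(3)):
            level = len(match.group(1))
            title = match.group(2).strip().lower()
            # A heading at the same or a higher level ends the dropped section.
            if skip_level is not None and level <= skip_level:
                skip_level = None
            if skip_level is None and title in unwanted:
                skip_level = level
        if skip_level is None:
            kept.append(line)
    return "\n".join(kept)
```

Passing a different set of lowercase titles as the second argument
would let each bundle keep its own list of sections to drop.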
On Sat, Sep 6, 2008 at 2:22 AM, Andrew Cates <Andrew at soschildren.org> wrote:
> Hi Samuel
>
> Just to be clear, we have finished checking our 5400 articles for
> vandalism etc. and have this list. But as well as choosing versions, we
> have a cleanup script which removes unsuitable paragraphs and
> editorial notices within articles (e.g. empty sections, "see also"
> links to articles not on the list, the sections labelled "personal
> life" in biographies, which tend to be full of speculation about
> sexual orientation, the "births" sections in years after 1980, which
> are full of rubbish, topic boxes where most of the listed articles
> are not included, category lists from portal pages, editorial notices
> where the issue is minor, etc.). The remaining two weeks' work is on
> the script, not on finding the versions.
>
> The "near current" state of play is at
> http://schools-wikipedia-test.soschildren.org/wp/index/subject.htm
> which is only a week old.
>
> Andrew
>
> On Fri, Sep 5, 2008 at 6:46 PM, Samuel Klein <sj at laptop.org> wrote:
>> Thanks for the update. bozmo, it's great to hear your group is
>> working on assessments as well... We won't be able to wait another two
>> weeks for a revised version list, but may be able to recompile once
>> next week. However, I think for OLPC's coming release we want a final
>> draft bundle this weekend.
>>
>> Warmly,
>> SJ
>>
>> On Thu, Sep 4, 2008 at 5:01 PM, Martin Walker <walkerma at potsdam.edu> wrote:
>>> We found a bug in the SelectionBot script that was affecting some unassessed
>>> articles. That has now been fixed, and there is now an updated set of
>>> results, with about 28,000 articles selected.
>>>
>>> http://toolserver.org/~cbm/release-data/2008-9-4/HTML/index.html
>>>
>>>
>>> As for the small detailed fixes, we'll have to work on those at the weekend.
>>>
>>> Martin
>>> Walkerma on Wikipedia
>>>
>>> Samuel Klein wrote:
>>>>
>>>> OK, let's meet Friday at 1500 EST on #kiwix on freenode,
>>>> for those who can make it, to discuss making a main page for an English
>>>> 0.7 Wikipedia bundle.
>>>>
>>>> SJ
>>>>
>>>> On Thu, Aug 28, 2008 at 12:20 PM, Martin Pascal <pmartin at linterweb.com> wrote:
>>>>
>>>> Yes SJ,
>>>>
>>>> you could join #kiwix on irc.freenode.net
>>>> Regards
>>>> Martin Pascal
>>>> tel : 02 32 40 23 69, fax : 02 32 61 45 26
>>>> gsm : 06 13 89 77 32
>>>> ----- Original Message -----
>>>> From: "Martin Walker" <walkerma at potsdam.edu>
>>>> To: "Samuel Klein" <sj at laptop.org>
>>>> Cc: "Madeleine Ball" <mad at printf.net>;
>>>> "Offline Wikireaders" <wikireader at lists.laptop.org>
>>>> Sent: Thursday, August 28, 2008 6:16 PM
>>>> Subject: Re: [Wikireader] english wikireaders and 0.7
>>>>
>>>>
>>>> SJ,
>>>>
>>>> I can manage an IRC meeting on Friday - say at 3pm EDT (1900h UTC)?
>>>> If this is difficult for others, I will be around next week. We have
>>>> the #wikipedia-1.0 channel ( irc://irc.freenode.net/#wikipedia-1.0 )
>>>> if you wish, but perhaps you have a wikireader channel that may be
>>>> more appropriate?
>>>>
>>>> Martin
>>>>
>>>>
>>>> Samuel Klein wrote:
>>>>
>>>> @martin -- How about having a Friday afternoon wikireader meeting?
>>>> For this week, whether or not we meet, a pressing question is
>>>> generating the main page. For the Spanish WP, Madeleine did most of
>>>> the main page by hand with a bit of help. We may have to do the same
>>>> here until better scripts are set up.
>>>>
>>>> A couple people built the main page for our Spanish-language bundle
>>>> more or less by hand from a portal template.
>>>>
>>>> Metadata:
>>>>
>>>> 1. Metadata that is currently particularly useful for us:
>>>> - a blacklist of article titles, and a blacklist of images, for the
>>>> very few that we explicitly leave out despite other metadata
>>>> - a whitelist of both, again to ensure inclusion.
>>>>
>>>> 2. In a general system, I'd like to see this tagged with the name of
>>>> the associated group; say olpc-peru-blacklist and
>>>> olpc-peru-whitelist.
>>>>
>>>> @cfabian -- testing this on bee units sounds like a fun test of the
>>>> metadata slimming!
>>>>
>>>> SJ
>>>>
>>>> ps - any news from the offline Spanish wp project that got started a
>>>> while back?
>>>>
>>>>
>>>> On Sun, Aug 24, 2008 at 6:12 PM, Martin Walker <walkerma at potsdam.edu> wrote:
>>>>
>>>> Things are looking very promising for the Version 0.7 selection -
>>>> we should have a complete article list within a week or so,
>>>> containing about 30,000 articles organized by a combination of
>>>> quality and importance. With our basic system of compression
>>>> (probably Zeno format, I think), I believe we should be able
>>>> to include 30,000 long-ish articles with thumbnails on one DVD,
>>>> along with Kiwix and some index pages. I'd be interested to see
>>>> how it would work with your compression system - we could get a
>>>> few people to test that, I think.
>>>>
>>>> I know how you love metadata, SJ, and we now have loads of it
>>>> (from 1.4 million articles) - so we can customize the selection
>>>> for you at will using quality, wikiproject, or the four importance
>>>> parameters. Since this is for kids in specific places, we can
>>>> emphasize dinosaurs or birds, exclude serial killers, or include
>>>> all articles from (say) Uganda, all as requested. Let me know if
>>>> this feature is useful. We don't have an equivalent ranking for
>>>> images, I'm afraid - for V0.7 we just include all legal images (as
>>>> thumbnails). As for a "main page", the plan is to have a set of
>>>> index pages generated by bot and then corrected by a manual
>>>> "reality check", but that will take another month or two.
>>>>
>>>> I'd really like to make sure we work together in the coming
>>>> months, because I think we can avoid a lot of duplicate work if we
>>>> share our best resources, scripts, etc. Once the selection is done
>>>> (~ 1st Sept), should we hold an IRC discussion on how we can best
>>>> collaborate?
>>>>
>>>> Martin
>>>>
>>>>
>>>> Samuel Klein wrote:
>>>>
>>>> There's lots of motivation to get an English wikireader, say,
>>>> taking advantage of the article selection and processing of 0.7.
>>>> OLPC could include this in the upcoming G1G1 machines this
>>>> winter / early next year. Other users could test wikireaders
>>>> that read this zipped format on their own machines, which
>>>> would flesh out the reader code.
>>>>
>>>> Martin -- what's the status on the 0.7 articlelist? Do you
>>>> have a similar imagelist that ranks images by importance to
>>>> that set of articles?
>>>> How is work on a 0.7 main page? I'd love to see how large a
>>>> snapshot is with our current wikireader code (without even
>>>> moving to 7z, or trimming the list).
>>>>
>>>> SJ
>>>>
>>>> _______________________________________________
>>>> Wikireader mailing list
>>>> Wikireader at lists.laptop.org
>>>> http://lists.laptop.org/listinfo/wikireader
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>