[sugar] Report on OLPC in Ethiopia

Gary C Martin gary at garycmartin.com
Tue May 20 09:59:03 EDT 2008


On 20 May 2008, at 08:46, Tomeu Vizoso wrote:

> what about stemming the words? You may be able to use an english
> stemmer from xapian using the python bindings (not sure though).

Thanks Tomeu,

Yes, I was doing some very rudimentary stemming in my text parsing for
specific cases, but I was still undecided, as some of the other texts
I've mapped show interesting usage patterns with the stems left as is.
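
To illustrate, here is a minimal sketch of what rudimentary, case-by-case
stemming of this kind could look like, assuming plain suffix stripping
before counting words. The suffix list and the names crude_stem and
stem_counts are hypothetical, chosen only for illustration; this is not
the actual parser and it is not the Porter algorithm.

```python
import re
from collections import Counter

# Hypothetical suffix list for illustration only; order matters because
# the first matching suffix wins.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[:-len(suffix)]
    return word


def stem_counts(text):
    """Lower-case the text, split it into ASCII words, and count the stems."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(crude_stem(w) for w in words)


if __name__ == "__main__":
    print(stem_counts("Texts and text, mapped maps"))
```

Stripping like this merges "Texts" with "text" but leaves "mapped" as
"mapp" and "maps" as "map", which is the sort of inconsistency a proper
stemmer is meant to avoid.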

I've just had a quick Google and found the standard Porter Stemming
Algorithm written in Python by Vivake Gupta. I'll plug it in and see
how it goes; mailing list and wiki-type texts are probably less 'edited'
and noisier than the other texts I've been experimenting with :-)

--Gary

