<div dir="ltr">Nirav,<br>  Great ideas...  here are some thoughts and notes from when IsForInsects and I discussed this and did some research, then ran out of time to actually program it. :)<br><br>   <a href="http://wiki.laptop.org/go/User:Ixo/Project/Webcam">http://wiki.laptop.org/go/User:Ixo/Project/Webcam</a><br>

<br>-iXo<br><br><div class="gmail_quote">On Tue, Jul 22, 2008 at 19:43, Nirav Patel &lt;<a href="mailto:olpc@spongezone.net">olpc@spongezone.net</a>&gt; wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

I'm writing computer vision functions for Pygame (available at<br>

<a href="http://git.n0r.org/?p=pygame-nrp;a=summary" target="_blank">http://git.n0r.org/?p=pygame-nrp;a=summary</a> ), and I've gotten to the<br>

point where I very much need community input on where to go next.<br>

Basically, I would like to know what you want to be able to do with<br>

the camera on the XO, whether it's related to gaming, input,<br>

accessibility, education, or anything else.<br>

<br>

What it can currently do, in overly simplistic terms:<br>

<br>

1. Capture images:  This is the basis for everything else, but it can<br>

be useful on its own too.  For example, letting someone take a picture<br>

of themselves, select their face in the picture, and then crop that<br>

and use it as a character in a game.<br>

<br>

2. Get the average color:  This is useful for picking values to<br>

threshold, but you can also switch to YUV or HSV colorspace and find<br>

the average brightness of the area.<br>

<br>

3. Threshold images:  Thresholding is pretty flexible.  You can have<br>

it select everything within a threshold of a color or everything<br>

outside the threshold of a color.  You can also threshold between two<br>

images.  This lets you get a "green screen" effect, so you can have a<br>

person being displayed in real time over a virtual background.<br>

<br>

4. Track an object:  After thresholding, you can turn the remaining<br>

object into a bitmask and get various properties about it like its<br>

bounding box, centroid, size in pixels, and angle with respect to the<br>

x-axis.  You can also test collisions between the mask and masks of<br>

virtual objects.<br>
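<br>
A minimal sketch of items 2 and 3 above, for illustration only. It assumes an image is a plain list of rows of (r, g, b) tuples rather than a Pygame surface, and the names average_color and threshold are hypothetical stand-ins, not the functions in the repository.<br>
<pre>
```python
# Illustrative only: images are lists of rows of (r, g, b) tuples,
# and both function names are hypothetical.


def average_color(image, box):
    """Mean (r, g, b) inside box = (left, top, width, height)."""
    left, top, width, height = box
    totals = [0, 0, 0]
    for y in range(top, top + height):
        for x in range(left, left + width):
            for channel in range(3):
                totals[channel] += image[y][x][channel]
    count = width * height
    return tuple(total / count for total in totals)


def threshold(image, color, tolerance, inside=True):
    """Return a bitmask (rows of 0/1) for the image.

    With inside=True a pixel is set when every channel is within
    `tolerance` of `color`; with inside=False the test is inverted.
    """
    mask = []
    for row in image:
        mask_row = []
        for pixel in row:
            close = all(tolerance >= abs(p - c) for p, c in zip(pixel, color))
            mask_row.append(1 if close == inside else 0)
        mask.append(mask_row)
    return mask
```
</pre>
Averaging over a small box and then passing that color to threshold() is the "pick a value, then select it" pattern described above; the tolerance would need tuning per camera and lighting.<br>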

<br>
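The mask properties in item 4 can all be derived from low-order image moments. The sketch below is one way that might look, assuming a bitmask stored as rows of 0/1; mask_properties is a hypothetical name, and the angle is measured in image coordinates (y pointing down).<br>
<pre>
```python
import math


def mask_properties(mask):
    """Size, bounding box, centroid and orientation of a 0/1 bitmask.

    Hypothetical helper for illustration; returns None for an empty mask.
    """
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, bit in enumerate(row):
            if bit:
                xs.append(x)
                ys.append(y)
    size = len(xs)
    if size == 0:
        return None

    # First-order moments give the centroid.
    cx = sum(xs) / size
    cy = sum(ys) / size

    # Second-order central moments give the orientation.
    mu20 = sum((x - cx) ** 2 for x in xs)
    mu02 = sum((y - cy) ** 2 for y in ys)
    mu11 = sum((x - cx) * (y - cy) for x, y in zip(xs, ys))
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)

    bbox = (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)
    return {"size": size, "bbox": bbox, "centroid": (cx, cy), "angle": angle}
```
</pre>
The angle is in radians relative to the x-axis; a horizontal bar gives 0 and a bar along the main diagonal gives pi/4.<br>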

They may seem pretty basic and simple, but you can do a lot of things<br>

by combining them.  For example:<br>

<br>

1.  Drawing with a real life object:  Have the user pick up an object<br>

and hold it up so it fills a box being displayed on screen and hit a<br>

button.  Save the average color within the box.  Threshold out just<br>

that color, turn it into a bitmask, get the largest connected<br>

component, and find the centroid of it.  Use that centroid as the<br>

coordinates for the on screen paint brush, perhaps also using the<br>

saved average color.  The user now has the illusion of using the<br>

object in hand as a paint brush.<br>

<br>

2.  Play pong with your hand:  Have the user step out of the field of<br>

view of the camera, then save the image.  Now threshold between the saved<br>

image and the images currently being captured.  This results in just<br>

showing the differences between the two (like the user, who has now<br>

stepped back into the field of view).  Turn the image into a bitmask<br>

and check it for collisions with the bitmask of the ball in pong.<br>

Actually, now that I'm thinking about it, I'll probably write this<br>

game when I get home.<br>
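<br>
For the drawing example, the "largest connected component, then centroid" step could be sketched roughly as follows. This assumes a 0/1 bitmask stored as rows of integers and 4-connectivity; largest_component and brush_position are hypothetical names, not functions from the repository.<br>
<pre>
```python
from collections import deque


def largest_component(mask):
    """Return the (x, y) pixels of the largest 4-connected blob in a 0/1 mask."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    best = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited set pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            blob = []
            while queue:
                px, py = queue.popleft()
                blob.append((px, py))
                for nx, ny in ((px + 1, py), (px - 1, py), (px, py + 1), (px, py - 1)):
                    in_bounds = nx in range(width) and ny in range(height)
                    if in_bounds and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if len(blob) > len(best):
                best = blob
    return best


def brush_position(mask):
    """Centroid of the largest blob, or None if the mask is empty."""
    blob = largest_component(mask)
    if not blob:
        return None
    count = len(blob)
    return (sum(x for x, _ in blob) / count, sum(y for _, y in blob) / count)
```
</pre>
Keeping only the largest blob is what stops small patches of similar color elsewhere in the frame from dragging the brush around.<br>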

<br>
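The pong example combines thresholding between two images with a mask collision test. A rough sketch, assuming images are lists of rows of (r, g, b) tuples and masks are rows of 0/1; difference_mask and masks_collide are hypothetical names used only for illustration.<br>
<pre>
```python
def difference_mask(background, frame, tolerance):
    """Set pixels where any channel differs from the saved background
    by more than `tolerance`."""
    return [
        [
            1 if any(abs(a - b) > tolerance for a, b in zip(bg_pixel, pixel)) else 0
            for bg_pixel, pixel in zip(bg_row, row)
        ]
        for bg_row, row in zip(background, frame)
    ]


def masks_collide(mask, other, offset):
    """True if `other`, placed with its top-left corner at offset=(ox, oy)
    in the coordinates of `mask`, overlaps any set pixel of `mask`."""
    ox, oy = offset
    height = len(mask)
    width = len(mask[0]) if height else 0
    for y, row in enumerate(other):
        for x, bit in enumerate(row):
            tx, ty = x + ox, y + oy
            if bit and tx in range(width) and ty in range(height) and mask[ty][tx]:
                return True
    return False
```
</pre>
In a game loop, the ball's mask and current position would be passed to masks_collide() against the difference mask of each new frame, and the ball would bounce whenever it returns True.<br>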

Other examples: <a href="http://eclecti.cc/olpc" target="_blank">http://eclecti.cc/olpc</a><br>

<br>

What it doesn't do yet, but could depending on whether there is interest:<br>

<br>

1. Generic motion detection:  You can do stuff like thresholding<br>

between a background and current images, or tracking whether blobs of<br>

colors have moved, but neither is a great way of detecting total<br>

motion in an image.  There are many optical flow algorithms, but<br>

finding one that'll run in real time on an XO will be tricky.  The other<br>

issue is how to present motion to the developer.  Perhaps have the<br>

function request two images and a list of points on the first image,<br>

and return a list of where it guesses the points are now on the second<br>

image.<br>

<br>

2. Object recognition:  You could guess based on the size or color of<br>

an object, but there isn't really a way of detecting if what you're<br>

holding up is a lime or a pear.  There are a lot of really<br>

computationally heavy ways of doing object recognition, but there are<br>

also some lightweight ways.  The one I was considering was using image<br>

moments (which I am currently doing to find the centroid and angle of<br>

an object), to get basic parameters like the eccentricity and skewness<br>

of an object.  There is also the Hu set of invariant moments that will<br>

give more information about an object, though not really in a<br>

human-friendly form.  Thus, while I could fairly easily write<br>

functions that would return a dozen of these numbers, I'm not sure<br>

anyone would be able to make use of them in an Activity.<br>
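<br>
To make the proposed motion interface in item 1 concrete (two images and a list of points in, guessed new positions out), here is a deliberately naive sketch. It swaps in exhaustive block matching on grayscale images (rows of integers) instead of a real optical flow algorithm, and track_points with its radius and search parameters is hypothetical.<br>
<pre>
```python
def track_points(first, second, points, radius=1, search=2):
    """Guess where each (x, y) in `points` moved to between two frames.

    Naive block matching: compare the patch around each point in `first`
    with every patch within `search` pixels in `second`, and keep the
    position with the smallest sum of absolute differences.
    """
    height = len(first)
    width = len(first[0])

    def pixel(image, x, y):
        # Clamp coordinates so patches near the border stay valid.
        x = min(max(x, 0), width - 1)
        y = min(max(y, 0), height - 1)
        return image[y][x]

    def cost(x0, y0, x1, y1):
        return sum(
            abs(pixel(first, x0 + dx, y0 + dy) - pixel(second, x1 + dx, y1 + dy))
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
        )

    tracked = []
    for x, y in points:
        candidates = [
            (x + dx, y + dy)
            for dy in range(-search, search + 1)
            for dx in range(-search, search + 1)
            if x + dx in range(width) and y + dy in range(height)
        ]
        tracked.append(min(candidates, key=lambda c: cost(x, y, c[0], c[1])))
    return tracked
```
</pre>
The cost grows with the square of both radius and search, so this only illustrates the shape of the interface; whether anything like it would run in real time on an XO is a separate question.<br>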

<br>
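As an illustration of the moment-based descriptors in item 2, the sketch below computes eccentricity and the first two Hu invariants from a 0/1 bitmask (rows of integers). shape_descriptors is a hypothetical name; the formulas are the standard ones based on second-order central moments.<br>
<pre>
```python
import math


def shape_descriptors(mask):
    """Eccentricity and the first two Hu invariants of a 0/1 bitmask.

    Hypothetical helper for illustration; returns None for an empty mask.
    """
    points = [(x, y) for y, row in enumerate(mask) for x, bit in enumerate(row) if bit]
    m00 = len(points)
    if m00 == 0:
        return None
    cx = sum(x for x, _ in points) / m00
    cy = sum(y for _, y in points) / m00

    # Second-order central moments.
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    mu11 = sum((x - cx) * (y - cy) for x, y in points)

    # Eccentricity from the two eigenvalues of the second-moment matrix.
    spread = math.sqrt(4 * mu11 ** 2 + (mu20 - mu02) ** 2)
    major = (mu20 + mu02 + spread) / 2
    minor = (mu20 + mu02 - spread) / 2
    eccentricity = math.sqrt(max(0.0, 1 - minor / major)) if major > 0 else 0.0

    # Scale-normalized moments: eta_pq = mu_pq / m00 ** 2 when p + q == 2.
    eta20, eta02, eta11 = mu20 / m00 ** 2, mu02 / m00 ** 2, mu11 / m00 ** 2
    hu1 = eta20 + eta02
    hu2 = (eta20 - eta02) ** 2 + 4 * eta11 ** 2
    return {"eccentricity": eccentricity, "hu1": hu1, "hu2": hu2}
```
</pre>
An eccentricity near 0 means a roughly round or square blob and a value near 1 means a long, thin one, which is about as human-friendly as these numbers get; the Hu values are mainly useful for comparing one blob against another.<br>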

So far, I've just been writing functions based on use cases I could<br>

think of in games.  There are probably many that I just haven't<br>

thought of for which the current functions are inadequate.  So, any<br>

ideas would be appreciated.<br>

<br>

Nirav<br>


</blockquote></div><br></div>