GSoC Status Report: Vision Processing

Thu Jun 26 04:07:00 EDT 2008

As you may know, OLPC got GSoC students again this summer.  I am one
of them, and my project is Vision Processing: a library for using the
webcam for more than just capturing images.  I am implementing this by
adding v4l2 and computer vision functions to pygame.

My code is available at http://git.n0r.org/?p=pygame-nrp;a=summary.
It is currently pygame 1.8.1 plus a camera module that supports v4l2
cameras that use MMAP and have a pixel format of RGB24, RGB444, YUYV,
SBGGR8, or YUV420.  Basic usage is as follows:

import pygame
from pygame import camera

# the third argument can be "YUV" or "HSV" too
cam = camera.Camera("/dev/video0", (640, 480), "RGB")
cam.start()
frame = cam.get_image() # the frame returned is a 24bit pygame Surface

You can also do fun stuff like:
http://eclecti.cc/bytes/living-pointillism-a-pygame-webcam-script
or more practical stuff like having it track the centroid of a
specific hue (green in this case): http://eclecti.cc/files/centroid.py
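
To make the centroid idea concrete, here is a minimal pure-Python
sketch.  It is not the code in centroid.py; the function name
hue_centroid, its thresholds, and the flat list-of-RGB-tuples input are
all assumptions made for illustration, and it does not touch the camera
module at all.

```python
import colorsys


def hue_centroid(pixels, width, height, target_hue=1.0 / 3.0,
                 tolerance=0.08, min_sat=0.4, min_val=0.2):
    """Return the (x, y) centroid of pixels close to target_hue, or None.

    pixels is a flat, row-major list of (r, g, b) tuples in 0..255.
    Hue is on the 0..1 scale used by colorsys (1/3 is pure green).
    """
    sum_x = sum_y = count = 0
    for y in range(height):
        for x in range(width):
            r, g, b = pixels[y * width + x]
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            # Hue wraps around, so measure the distance on a circle.
            diff = abs(h - target_hue)
            diff = min(diff, 1.0 - diff)
            # Ignore dark or washed-out pixels, whose hue is unreliable.
            if diff <= tolerance and s >= min_sat and v >= min_val:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)
```

A per-pixel Python loop like this would be far too slow for live video
on the XO; it only shows what is being computed.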

I plan to add functions for finding the largest connected component,
computing optical flow, and other things useful for computer vision.
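
For anyone unfamiliar with the term, the largest connected component of
a thresholded image can be found with a simple flood fill.  The sketch
below is only an illustration in plain Python, assuming a binary mask
given as a list of rows and 4-connectivity; the name largest_component
is hypothetical and is not a function in the camera module.

```python
from collections import deque


def largest_component(mask):
    """Return the (x, y) pixels of the largest 4-connected region.

    mask is a list of rows; any truthy value marks a foreground pixel.
    Returns an empty list if there are no foreground pixels.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    best = []
    for start_y in range(height):
        for start_x in range(width):
            if not mask[start_y][start_x] or seen[start_y][start_x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            component = []
            queue = deque([(start_x, start_y)])
            seen[start_y][start_x] = True
            while queue:
                x, y = queue.popleft()
                component.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if len(component) > len(best):
                best = component
    return best
```

Combined with a hue threshold, this would let a tracker follow the
biggest green blob rather than the average of every green pixel.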

Currently, performance is pretty poor on the XO, due to a combination
of the Geode being slow and having to convert from 24bit to 16bit
surfaces to display any captured frames.  The XO is fast enough to
capture and blit a 320x240 RGB frame at 30fps, but not a 640x480 frame
or a frame that is being converted to HSV.  I'm not sure how, or if,
I'm going to be able to overcome those performance problems.
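
To show what the 24bit to 16bit conversion involves per pixel, here is
a small sketch.  It assumes the 16bit surface uses the common RGB565
layout (5 bits red, 6 bits green, 5 bits blue), which is my assumption
here, and the function name is hypothetical.

```python
def rgb888_to_rgb565(r, g, b):
    """Pack 8-bit r, g, b values into one 16-bit RGB565 integer."""
    # Keep the top 5 bits of red, 6 of green, and 5 of blue.
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)
```

At 640x480 this has to happen 307,200 times per frame, four times as
often as at 320x240.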

I'd appreciate any comments, suggestions, or reality checks on
improving performance or anything else, or any requests for vision
functions to add.  Also, I only have the camera in the XO, vivi, and a
poorly supported USB webcam, so if anyone could test it on other
webcams, that would be great.

Thanks,

Nirav Patel