[DPRG] webcam robot vision limitations

Subject: [DPRG] webcam robot vision limitations
From: David Murphy dfm7 at earthlink.net
Date: Wed Jun 27 13:51:05 CDT 2007

Hi Chris,

I want to share some work with you that you might find interesting  
and relevant.

It will involve some background, so bear with me.

About 18 months to 2 years ago, a group of folks in the Home Brew
Robotics Club (S.F. Bay area: Ingolf Sander, John Slater, Brandon
Blodget, Dave Wyland) became interested in object recognition for
robotics, and in particular the SIFT algorithm. Allow me some
'artistic license' in the description, as I'm not familiar enough
with it to be precise; it works something like this.

Scan a scene looking for 'key-points'. A key-point is basically an  
intersection of lines, or a place where there is a sharp curve in a  
line, plus the gradient of illumination in the immediate vicinity of  
the intersection.
An object is known by the set of key-points extracted for it during  
training.
After the key-points are extracted from the scene, they are compared
against the database, looking for the highest number of matches, and
this gives you the object(s) in the scene.
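
The matching step described above can be sketched roughly as follows. This is only an illustration of "count the matching key-points per object and take the best", not the real SIFT algorithm. The function names, the toy 3-element descriptors, and the distance threshold are all made up for the example (real SIFT descriptors have 128 elements).

```python
import math

def distance(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def count_matches(scene, trained, max_dist):
    """Count scene descriptors lying within max_dist of some trained descriptor."""
    return sum(
        1 for s in scene
        if any(distance(s, t) <= max_dist for t in trained)
    )

def best_object(scene, database, max_dist=0.5):
    """Return (name, match_count) for the database object with the most matches."""
    scores = {name: count_matches(scene, descs, max_dist)
              for name, descs in database.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]

# Hypothetical toy database: each object is a list of key-point descriptors
# recorded during training.
database = {
    "mug":  [(0.1, 0.9, 0.3), (0.8, 0.2, 0.5), (0.4, 0.4, 0.9)],
    "book": [(0.9, 0.9, 0.1), (0.2, 0.1, 0.7)],
}

# Descriptors extracted from the current scene (also made up).
scene = [(0.12, 0.88, 0.31), (0.79, 0.22, 0.48), (0.5, 0.5, 0.5)]

result = best_object(scene, database, max_dist=0.1)
print(result)
```

With this toy data, two of the three scene descriptors fall close to trained "mug" descriptors and none fall close to "book", so "mug" wins with two matches.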

This algorithm is somewhat tolerant of the rotation of an object and  
changes in illumination from the training position and apparently  
mimics the activity of some cells in the visual cortex of mammals.

One of the folks working on this project has worked with FPGAs, and a
lot of what goes on in the early stages is very amenable to
implementation in hardware (extracting lines and intersections,
computing gradients, and the like). So their approach was to put all
of this initial processing into hardware and let the CPU worry about
database matching and decision making.
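
To give a feel for why this early-stage work suits hardware, here is a small sketch of a per-pixel gradient computation. The same tiny operation is repeated at every interior pixel, with no data-dependent branching. The function name and the central-difference formula are assumptions chosen for illustration; they are not taken from the group's design.

```python
def gradients(image):
    """Return per-pixel (gx, gy) central differences.

    image is a list of rows of grayscale values. Border pixels are left
    at (0, 0) because they lack a neighbour on one side.
    """
    h, w = len(image), len(image[0])
    out = [[(0, 0)] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = image[y][x + 1] - image[y][x - 1]   # horizontal change
            gy = image[y + 1][x] - image[y - 1][x]   # vertical change
            out[y][x] = (gx, gy)
    return out

# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

g = gradients(image)
print(g[1][1], g[1][2])
```

In an FPGA the two subtractions could be done for each pixel as it streams in from the camera, which is the kind of uniform work a CPU would otherwise spend its cycles on.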

The FPGA guy in this group, along with another fellow in the club,
had built a board for robotics projects a few years ago that had a
Xilinx Spartan FPGA; they programmed the FPGA with a MicroBlaze CPU
(a soft core available for Xilinx parts) and ran a version of Linux
on it. I think they intended to commercialize it, but for whatever
reason did not.

Ok, now these two groups have gotten together and formed a company  
called Roboticore to make this stuff available.

I saw the demo about a year ago before they formed the company. They  
had much of the hardware running and in real time could extract the  
key-points from an image without stressing the FPGA. They demo'd it
by outputting the results to a frame buffer and displaying them on a
monitor. They could wave the camera at the audience and you could see  
the results on the monitor in real time with no lag.

I did not see their recent presentation at the HBRC, but you can find
it on the HBRC website: http://www.hbrobotics.org/HBRC_Presentations.htm
(look for FPGA vision).

I guess the point is that a lot of the low-level vision work is
repetitive and uniform, and hence amenable to hardware. Putting it
into an FPGA (you can get a development board for a Spartan
FPGA for about $100.00) would free up a lot of CPU cycles for other
things.

Cheers,
David

On Jun 26, 2007, at 9:16 PM, Chris Jang wrote:

> Hello, I'm not sure if anyone is interested in this...
>
> But it's something different to discuss.
>
> I have a small robot with a VGA webcam and ARM9 PC104 SBC. Up until
> last night, the performance out of this combination has been
> embarrassing - frame rates of around 1.2 fps with a low percentage
> of corrupted images.
>
> After lots of experimentation, the webcam now runs at 15 fps with
> long sequences of around 10 seconds without any image corruption.
> This includes V4L capture, JPEG decode, and saving to SD flash. I
> hope that with some more twiddling, there will be no more corruption.
>
> Here's the trick (I believe) - cameras have a native frame rate.
> If whatever consumes and processes the video does not operate at
> pretty much exactly this speed, then output is prone to corruption
> due to device/driver sync issues. So I had to put an adaptive
> spinning delay loop which adjusts depending on the measured time
> between each frame capture. This adaptive delay tries to hold the
> capture rate at 15 fps which matches the webcam.
>
> Ok, so boring...here's the interesting part.
>
> Now that the basic stuff is worked out, the amount of CPU time
> available for processing each frame is known. On the 200 MHz ARM9
> based computer board I'm using, roughly half the time is spent
> capturing images, decoding and saving them to SD flash (some time
> could be recovered by not saving images - but then debugging is
> impossible as we can't know what the robot saw). The other half of
> the time is available. That's roughly 30 milliseconds, 15 times
> each second, once for each image frame from the webcam.
>
> I think this is enough time for pixel based image segmentation
> (fancy term for color blobs). There is enough time for some
> statistics and simple morphological filters. But there is not
> enough time for any feature based techniques (no convolution).
>
> Stanford's DARPA Grand Challenge vehicle dedicated one 1.6 GHz
> Pentium M computer to 320x240 monocular RGB video. They did color
> based image segmentation with some morphological filtering. So even
> with over 10x the power of a 200 MHz ARM9, they were still limited
> to very sophisticated color blob detection.
> _______________________________________________
> DPRGlist mailing list
> DPRGlist at dprg.org
> http://list.dprg.org/mailman/listinfo/dprglist
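
For reference, the adaptive delay described in the quoted message might look something like the sketch below. It is a guess at the idea, not Chris's actual code: the class name, the gain value, and the `grab_frame` / `process` callables are hypothetical, and it uses `time.sleep` where the original used a spinning delay loop. At 15 fps the frame period is about 66.7 ms, so half of it is about 33 ms, in line with the "roughly 30 milliseconds" quoted above.

```python
import time

TARGET_FPS = 15
FRAME_PERIOD = 1.0 / TARGET_FPS   # about 66.7 ms per frame

class FramePacer:
    """Adjusts a per-frame delay so the measured interval tracks the period."""

    def __init__(self, period=FRAME_PERIOD, gain=0.5):
        self.period = period
        self.gain = gain      # fraction of the timing error corrected per frame (assumed)
        self.delay = 0.0      # current delay in seconds
        self.last = None      # timestamp of the previous frame

    def update(self, now):
        """Call once per captured frame; returns the delay to apply next."""
        if self.last is not None:
            # Positive error: frames arrive faster than the target, so wait longer.
            error = self.period - (now - self.last)
            self.delay = max(0.0, self.delay + self.gain * error)
        self.last = now
        return self.delay

def capture_loop(grab_frame, process, n_frames):
    """grab_frame and process are hypothetical callables supplied by the caller."""
    pacer = FramePacer()
    for _ in range(n_frames):
        frame = grab_frame()
        process(frame)
        time.sleep(pacer.update(time.monotonic()))

# Simulated timestamps with a 100 ms target period.
pacer = FramePacer(period=0.1, gain=0.5)
d0 = pacer.update(0.00)   # first frame: nothing to compare against yet
d1 = pacer.update(0.06)   # 60 ms interval, too fast, so the delay grows
d2 = pacer.update(0.20)   # 140 ms interval, too slow, so the delay shrinks
print(d0, d1, d2)
```

Correcting only part of the error each frame (the gain) keeps the delay from oscillating when individual frame timings are noisy.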
