[DPRG] webcam robot vision limitations

Subject: [DPRG] webcam robot vision limitations
From: Chris Jang cjang at ix.netcom.com
Date: Fri Jun 29 05:11:45 CDT 2007

>The motion estimation in the typical MPEG encoder is just looking for
>the best match, which is not necessarily the strongest correlation.
>The encoder doesn't care if the motion vector represents the real
>motion of the object, only that the motion vector results in the
>smallest residual. Also, the motion estimation is typically done on a
>reconstructed image, not the original pixels, which further reduces how
>well the vectors map to physical motion. In addition, the
>hardware/software may not be doing a full search and the search may be
>heavily short circuited (early termination).

This is great. Here's how I interpret this from school (15 years ago...)

Most iterative methods for solving linear systems (and the DCT is really
a very clever linear basis transformation of 8x8 image blocks) do not
minimize the error. They minimize the residual. So we try to do some
analysis to bound the norm of the error in terms of the norm of the
residual. If this is possible, then we have proven that as the residual
converges to zero, so does the error.

Why do we minimize residual instead of error? Because it is often
impractical to invert operators, even linear ones.

Ax = b

We are looking for the x that, when operator A is applied to it, gives
b. But we don't want to, or can't, compute this directly: we can't
compute the inverse of A and apply it to b. So all we know is what we
get when we pick an x and apply A to it.

It could be that there are many possible values of x for which applying
operator A gives a result very near b. They all look good. But these
values may be very far away from the one exact solution. Residual (how
far Ax is from b) and error (how far x is from the exact solution) are
not the same.
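
To make that concrete, here is a small sketch in plain Python. The 2x2
matrix, the exact solution and the guess are made-up numbers chosen so
that A is nearly singular; they have nothing to do with any real
encoder. The sketch also checks the standard bound
||error|| <= ||inverse(A)|| * ||residual||, which is only affordable
here because a 2x2 inverse is trivial to write down.

```python
import math

def matvec(A, x):
    """Apply matrix A (list of rows) to vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def norm(v):
    """Euclidean norm of a flat list of numbers."""
    return math.sqrt(sum(vi * vi for vi in v))

# Hypothetical, nearly singular system (illustrative numbers only).
A = [[1.0, 1.0],
     [1.0, 1.0001]]
x_exact = [1.0, 1.0]
b = matvec(A, x_exact)

# A guess that is far from x_exact but still "looks good" through A.
x_guess = [2.0, 0.0]

residual = [bi - yi for bi, yi in zip(b, matvec(A, x_guess))]
error = [xe - xg for xe, xg in zip(x_exact, x_guess)]
residual_norm = norm(residual)
error_norm = norm(error)

# Explicit 2x2 inverse, used only to evaluate the bound.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[A[1][1] / det, -A[0][1] / det],
         [-A[1][0] / det, A[0][0] / det]]
# The Frobenius norm is an upper bound on the operator 2-norm.
inv_norm = norm([a for row in A_inv for a in row])
bound = inv_norm * residual_norm

print("residual norm:", residual_norm)
print("error norm:   ", error_norm)
print("error bound:  ", bound)
```

The residual comes out tiny while the error is larger than 1, and the
bound still holds because the norm of the inverse is huge. When that
norm is large, a small residual says very little about the error.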

>> The arrows look amazingly similar to those shown on Yaw right and move
>> forward samples on the optic flow web page above.  So if you have a camera
>> that can send an MPEG video stream, you could just analyze the motion
>> vectors for optical flow.
>
>Sadly, doing this type of stuff in the compressed domain is rarely of
>any value, especially with the type of MPEG encoder you might find in
>a low end USB camera. You might be able to do basic motion detection
>using the motion vectors, but I wouldn't expect much more than that.

This is also great. I wondered about doing some computer vision at the
DCT block level as images are decoded. Perhaps more optimization is
possible? But experience is usually the best indicator of what works in
the real world. So it probably does not work in practice.
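
The same residual-versus-error gap shows up in the quoted point about
motion vectors. Below is a rough sketch of full-search block matching
using the sum of absolute differences (SAD) as the residual. The frame
size, block position, search radius and the "shortest vector wins ties"
rule are all my assumptions for illustration, not how any particular
MPEG encoder works. The scene is a vertical edge that moves 2 pixels
right and 1 pixel down between frames.

```python
def vertical_edge(width, height, x0):
    """Frame that is 0 left of column x0 and 255 from x0 onward."""
    return [[255 if x >= x0 else 0 for x in range(width)]
            for _ in range(height)]

def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of cur at (bx, by) and the block of
    ref displaced by (dx, dy)."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))

def best_vector(cur, ref, bx, by, n, radius):
    """Full search over a square window. Ties go to the shortest
    vector (an assumed tie-break, since min() keeps the first minimum)."""
    candidates = [(dx, dy)
                  for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)]
    candidates.sort(key=lambda v: (abs(v[0]) + abs(v[1]), v))
    return min(candidates,
               key=lambda v: sad(cur, ref, bx, by, v[0], v[1], n))

# Reference frame: edge at column 8. Current frame: the edge has moved
# 2 right and 1 down, but a vertical edge looks the same after moving
# down, so only the horizontal shift is visible.
ref = vertical_edge(16, 16, 8)
cur = vertical_edge(16, 16, 10)

# Vector from the current block back to its true source in ref.
true_vector = (-2, -1)

best = best_vector(cur, ref, bx=8, by=6, n=4, radius=3)
print("chosen vector:", best, "SAD:", sad(cur, ref, 8, 6, best[0], best[1], 4))
print("true vector:  ", true_vector, "SAD:", sad(cur, ref, 8, 6, -2, -1, 4))
```

Both vectors give a zero residual, so the search has no reason to
prefer the physically correct one. That is the trouble with reading
optical flow straight out of an encoder's motion vectors.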

Copyright © 1984 - 2006 Dallas Personal Robotics Group. All rights reserved.