This project will implement a small, cheap, powerful and flexible vision processor, for use in on-board vision systems. The standard approach for vision processing is to use a stationary camera coupled with some powerful digital signal processing (DSP) hardware. This hardware is expensive and usually too bulky to be practical in an on-board application. Some systems such as the EyeBot have incorporated cheap ‘internet camera’ products into small autonomous robots. These systems rely on the system CPU to handle all aspects of image acquisition as well as processing and control. Silicon retinas and other analogue vision chips have been used to implement parallelised image processing algorithms with much success.
The ideal vision sensing tool would perform on-board image acquisition and feature extraction, with the majority of the image processing done independently of the system CPU. Our design will incorporate an image sensing device with its output fed directly into an FPGA. This will enable real-time acquisition of the image, with no intervening encoding or modulation stages required. The features or processed images will be output onto a bus using a standardised communication protocol.
The resulting processor will be small enough to be incorporated into small robotic and vision systems. The processor will be a feasible replacement for other, more standard sensors such as infra-red and ultrasonic spectrum sensors.
Introduction
This project will implement a small, cheap, powerful and flexible vision
processor, for use in on-board vision systems. The processor will be based on
recent technological advances and computationally inexpensive parallel vision
processing algorithms.
DSP Techniques
The standard approach for vision processing is to use a stationary camera
coupled with some powerful digital signal processing (DSP) hardware. Most
vision processing is done off-board, using powerful PC solutions. On-board
processing power in autonomous systems is usually not good enough to enable
vision processing, as well as control of actuators. Off-board DSP systems such
as in Figure 1 require a camera, encoding and decoding of the image, wireless
transmission and a frame grabber housed in a PC with DSP capabilities. This
hardware is expensive and, of course, too bulky to be practical in an on-board
application.

![]()
The Imputer
The development of the Imputer was an attempt to create a flexible, modular
image processing device. It consisted of a single system bus, onto which
components such as CPU cards, frame grabber cards and I/O cards could be added.
The configuration of these cards was limited; only one CPU per system was
allowed, although multiple frame grabbers were supported. The architecture of
the system was fundamentally monolithic; the single von-Neumann processor was
used to perform all the image processing.
Additionally, the Imputer was housed in an external case and so could not be used in an on-board autonomous system for miniature robotics. The Imputer is no longer being manufactured.
Current On-board Autonomous Systems
Some systems such as the EyeBot have incorporated
cheap 'internet camera' products into small autonomous robots. These systems
rely on the system CPU to handle all aspects of image acquisition as well as
processing and control, as shown in Figure 2. Because of the high bandwidth
requirements of real-time video, a compromise is usually made either in terms
of frame rate or in terms of resolution. The EyeBot
team have recently produced their own (low-resolution) camera for use with
their robot controller, which is capable of acquiring full-colour images at a
rate of 3.7 frames per second with a 25Mhz CPU clock.

Analog Vision Chips
Silicon retinas & other analog vision chips have
been used to implement parallelised image processing algorithms with much
success. These chips are usually not very flexible, since they implement a fixed
set of algorithms in hardware, which cannot be modified. They perform image
acquisition and processing on the one die, and therefore do not require the
power of a fully-fledged DSP system. Because the processing is done locally,
they leave the system CPU free to perform other tasks.
The ideal vision sensing tool would perform on-board image acquisition and feature extraction, with the majority of the image processing done independently of the system CPU. The tool would be cheap both to manufacture and purchase, using mostly off-the-shelf components. The processing algorithms used and the output of the tool would be extremely flexible and customisable. The tool would enable powerful real-time control based on real-time feature extraction.
CCDs and Image Sensors
Image acquisition requires a light-sensitive structure to detect the image.
Charge-coupled devices, or CCDs, have been the most
popular choice for many years. The silicon substrate of the chip is sensitive
to incident photons, and generates a charge corresponding to the amount of
light falling upon the sensitive area. This charge is transferred sequentially
off the device, the voltages are measured at the edge of the chip and are
converted to digital via an Analogue to Digital Converter (ADC). CCDs require intricate clocking arrangements, off-chip ADCs
and high-voltage supplies.
The recent availability of entire 'image sensors' on a chip means smaller footprints and low-voltage applications are now possible. An image sensor incorporates a CCD, parallel clock drivers, voltage regulators and a fast ADC into a single chip. The image is read sequentially as a digital signal. These chips are low-voltage, require much more simple clock driver circuits and have a footprint only slightly larger than a CCD alone.
Colour images are obtained by placing a mask of silicon embedded with colour dyes above an ordinary CCD. This results in an array of pixels that are sensitive to blue, green and red light. Adjacent pixels are interpolated on-chip to give a colour image. Because of this interpolation there is a loss of colour resolution, but this is perhaps comparable to the human retina, which provides high-resolution contrast data, but only low-resolution colour.
Kodak provides a range of image sensors and CCDs with varying resolutions and frame rates.
Resolution
Chip type
Frame rate
1280x1024
Image sensor
14fps at 18Mhz clock
640x480
Image sensor
30fps at 10Mhz clock
2048x2048
Interline CCD
20fps at 20Mhz clock
640x480
Interline CCD
85fps at 20Mhz clock
Resolutions of up to 4096x4096 are available, although with correspondingly reduced frame rates. These higher resolution chips may not be suitable for real-time control.
FPGAs
To provide acquisition control and processing, we envisage using Field-Programmable
Gate Arrays (FPGAs), which are essentially
fully-customisable hardware. FPGAs can be programmed
via logic circuit schematics or through VHDL (a hardware description language).
FPGAs are capable of encapsulating an entire
traditional von-Neumann microprocessor, or alternatively, of implementing
massively-parallel processing structures. Because their hardware is
re-configurable, parallel algorithms execute much faster than on a typical
fetch-and-execute architecture. They are re-programmable, and can be modified
without being removed from the circuit board.
In this project, FPGAs will be
used to control the vision input device (including exposure and frame
acquisition) without needing separate frame-grabber hardware. FPGAs will also be used for filtering and processing the
acquired image, as well as handling communication with other system components.
The project will involve the design and implementation of a small, cheap, flexible and powerful real-time image processor, able to be used in a diverse range of imaging applications. Part of the project will be to identify and implement a range of useful image filters and feature extraction methods. The image processor will output feature information onto a standardised communication bus. The bus will be flexible enough that these features can be used directly for real-time control, or can be processed further for other applications.
The design will incorporate an image sensing device with its output fed directly into an FPGA. This will enable real-time acquisition of the image, with no intervening encoding or modulation stages required, as shown in Figure 3. The FPGA will perform various filtering and feature extraction operations, such as region segmentation, blob and edge detection, noise reduction, optical flow, etc. The implementation of these algorithms will be as parallel as possible, to make full use of the concurrency available in the FPGA. The features or processed images will be output onto a bus using a standardised communication protocol.

The resulting processor will be small enough to be incorporated into small robotic and vision systems. Anticipated applications include on-board vision sensors for mobile robotics, intelligent vision processors for visual inspection and control, smart security and identification systems, and a host of other applications. The processor will be a feasible replacement for other, more standard sensors such as infra-red and ultrasonic spectrum sensors.
In other words, this processor will elevate vision into an off-the-shelf sensor option; as easy to incorporate into applied systems as any other basic component.
The primary objective of the vision processor is that is will be able to perform real-time filtering and feature extraction, such that the output of the processor can be used in a real-time control system. This also means that the communications interface between the processor and external components should be simple but comprehensive.
The filters and feature extraction algorithms will be able to be combined in a flexible manner, such that the output of the processor is useful for control and identification applications. Additionally, the lens used for acquisition will be changeable, to accommodate different applications.
The processor will use off-the-shelf components as much as possible (excepting the image sensor and FPGAs) to keep the total cost of producing the system to a minimum.
The processor will be as compact as possible, to ensure it is useable on-board in small applications. The QUTy robot, developed in the Smart Devices Lab at QUT, is approximately 10 centimetres wide. The vision processor will have a similar form factor.
Researchers
Associate Professor Joaquin Sitte
Dr Shlomo Geva
Research Assistant
Dylan Muir