How do we see? The answer to this age-old riddle of mankind has not yet been found...
Is it nevertheless possible to construct machines that see? We think: yes!
What do we need for technical vision systems?
Vision research is a truly interdisciplinary field; it employs models and methods from very different areas:
- Applied mathematics (statistics and estimation theory)
- Physics (optics and dynamics, as well as statistical physics)
- Computer science (software design and software systems engineering) and signal processing (information theory, pattern recognition)
- Electronics engineering (for the hardware substrate)
- Photogrammetry (when it comes to measuring, and dealing with errors)
- and a small quantum of philosophical reasoning, as well.
'Seeing is a process that produces a description from the pictures which we obtain from the outside. This description is useful to the person who sees and is not loaded with irrelevant information...'
(David Marr, MIT)
Hermann von Helmholtz, one of the most famous German physicists of the 19th century, worked intensively on the process of seeing. He wrote:
'Any psychical activity which leads us to the perception that there is a certain object of a certain kind at a certain place outside ourselves generally is not a conscious but an unconscious one. As far as the result is concerned, such an activity seems to be a conclusion... So I will denote the psychological processes of ordinary perception as "unconscious conclusions", since this name distinguishes the latter sufficiently from conscious conclusions.'
The goal of our current research work is the automatic generation of a physically meaningful description of scenes from image sequences. This description covers the geometrical structure, the position relations, and the trajectories of objects (including possible ego-motion).
Imagine your car is equipped with eyes in order to assist you in driving, or to navigate autonomously in heavy traffic --- even at the Arc de Triomphe during rush hour...
A description of that kind permits compact storage and efficient transfer of the underlying image signals. On the other hand, it makes it possible to use camera and computer as a measuring instrument for dynamic processes and as an environmental sensor for autonomous systems (e.g. robots or vehicles).
Current work deals with the statistical modelling of object views and the corresponding motion trajectories, and with the robust estimation of 2D and 3D motion parameters. All of this contributes to the continuous, incremental learning of an unknown environment from long image sequences: visual exploration.
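To give a flavour of what "robust estimation" means here, the following is a minimal sketch (not the actual method used in this research): estimating a 2D translation between two sets of matched image points via the coordinate-wise median of the displacement vectors. The median is a classical robust estimator that tolerates mismatched points (outliers), where a plain average would be pulled far off; real systems use richer motion models and estimators such as M-estimators or RANSAC.

```python
def robust_translation(points_a, points_b):
    """Robustly estimate a 2D translation from matched point pairs.

    Illustrative sketch only: takes the coordinate-wise median of the
    displacement vectors (b - a), which resists up to ~50% outlier
    matches. Real vision systems estimate full 2D/3D motion models.
    """
    # Displacement of each matched pair, per coordinate.
    dx = sorted(b[0] - a[0] for a, b in zip(points_a, points_b))
    dy = sorted(b[1] - a[1] for a, b in zip(points_a, points_b))
    mid = len(dx) // 2  # median index (odd-length case)
    return dx[mid], dy[mid]

# Four points shifted by (3, -2), plus one gross outlier match:
a = [(0, 0), (1, 0), (0, 1), (2, 2), (5, 5)]
b = [(3, -2), (4, -2), (3, -1), (5, 0), (100, 100)]
print(robust_translation(a, b))  # the outlier does not corrupt the estimate
```

A simple mean of the displacements would be dragged toward the outlier; the median ignores it, which is exactly the behaviour one wants when feature matches in real image sequences are unreliable.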