2021
Authors: Henry Bradler, Adrian Kretz, Rudolf Mester
Conference: IEEE Intelligent Vehicles Symposium (IV), Nagoya, Japan, July 2021
Paper: https://arxiv.org/abs/2105.14993
Abstract: Urban Traffic Surveillance (UTS) is a surveillance system based on a monocular and calibrated video camera that detects vehicles in an urban traffic scenario with dense traffic on multiple lanes and vehicles performing sharp turning maneuvers. UTS then tracks the vehicles using a 3D bounding box representation and a physically reasonable 3D motion model relying on an unscented Kalman filter based approach. Since UTS recovers positions, shape and motion information in a three-dimensional world coordinate system, it can be employed to recognize diverse traffic violations or to supply intelligent vehicles with valuable traffic information. We build on YOLOv3 as a detector yielding 2D bounding boxes and class labels for each vehicle. A 2D detector renders our system much more independent of different camera perspectives, as a variety of labeled training data is available. This allows for good generalization while also being more hardware efficient. The task of 3D tracking based on 2D detections is supported by integrating class-specific prior knowledge about the vehicle shape. We quantitatively evaluate UTS using self-generated synthetic data and ground truth from the CARLA simulator, due to the non-existence of datasets with an urban vehicle surveillance setting and labeled 3D bounding boxes. Additionally, we give a qualitative impression of how UTS performs on real-world data. Our implementation is capable of operating in real time on a reasonably modern workstation. To the best of our knowledge, UTS is to date the only 3D vehicle tracking system in a surveillance scenario (static camera observing moving targets).
2020
Authors: Adrian Kretz, Rudolf Mester
Conference: IEEE Southwest Symposium on Image Analysis and Interpretation SSIAI 2020
Abstract: Recently, Siamese neural networks have been employed to build several high-performance object trackers capable of operating in real time. To further improve the tracking performance, one can train one network on the tracking task and another network on the task of object classification. One can then use the feature representations of both networks to obtain a tracker which performs better than each network on its own. This approach, however, has the downside that two networks have to be evaluated instead of one, resulting in runtime degradation. We demonstrate that it is feasible to train one Siamese network on the tracking and the classification tasks simultaneously. Specifically, we achieve a tracking performance similar to the performance of two networks trained on tracking and classification separately. Since our approach does not depend on two separate networks, though, it allows one to improve the performance of a Siamese network tracker without any runtime penalty.
Authors: Jan Fabian Schmid, Stephan Simon, Rudolf Mester
Conference: IEEE Intern. Conf. on Robotics and Automation ICRA 2020
Abstract: Ground texture based localization is a promising approach to achieve high-accuracy positioning of vehicles. We present a self-contained method that can be used for global localization as well as for subsequent local localization updates, i.e. it allows a robot to localize without any knowledge of its current whereabouts, but it can also take advantage of an available localization prior to reduce computation time significantly. Our method is based on a novel matching strategy, which we call identity matching, that relies on compact binary feature descriptors. Identity matching treats pairs of features as matches only if their descriptors are identical. While other methods for global localization are faster to compute, our method reaches higher localization success rates, and can switch to local localization after initial localization. Compared to state-of-the-art methods for local localization, our method performs similarly, while being significantly faster to compute.
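The identity-matching strategy described in this abstract can be sketched in a few lines: because a pair of features only matches when their binary descriptors are bit-for-bit identical, matching reduces to a hash-table lookup instead of a nearest-neighbour search. The descriptor width and helper names below are illustrative, not taken from the paper.

```python
# Sketch of identity matching: exact-equality descriptor lookup via a dict.
# Descriptors are compact binary codes; here 8-bit codes encoded as ints.

def build_index(map_features):
    """Index reference features by their exact binary descriptor."""
    index = {}
    for feature_id, descriptor in map_features:
        index.setdefault(descriptor, []).append(feature_id)
    return index

def identity_match(query_features, index):
    """Return (query_id, map_id) pairs whose descriptors are identical."""
    matches = []
    for query_id, descriptor in query_features:
        for map_id in index.get(descriptor, []):
            matches.append((query_id, map_id))
    return matches

# Toy example: reference map features and two query features.
reference = [(0, 0b10110010), (1, 0b01110001), (2, 0b10110010)]
query = [(10, 0b10110010), (11, 0b11111111)]
idx = build_index(reference)
print(identity_match(query, idx))  # query 10 matches map features 0 and 2
```

A localization prior would simply shrink the set of reference features indexed, which is how the method can trade global coverage for speed in local updates.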
2019
Authors: Fabian Brickwedde, Steffen Abraham, and Rudolf Mester
Conference: Intern. Conf. on Computer Vision (ICCV), Seoul, Oct 2019
Paper: https://arxiv.org/abs/1908.06316
Abstract: Existing 3D scene flow estimation methods provide the 3D geometry and 3D motion of a scene and have gained a lot of interest, for example in the context of autonomous driving. These methods are traditionally based on a temporal series of stereo images. In this paper, we propose a novel monocular 3D scene flow estimation method, called Mono-SF. Mono-SF jointly estimates the 3D structure and motion of the scene by combining multi-view geometry and single-view depth information. Mono-SF considers that the scene flow should be consistent in terms of warping the reference image into the consecutive image based on the principles of multi-view geometry. For integrating single-view depth in a statistical manner, a convolutional neural network, called ProbDepthNet, is proposed. ProbDepthNet estimates pixel-wise depth distributions from a single image rather than single depth values. Additionally, as part of ProbDepthNet, a novel recalibration technique for regression problems is proposed to ensure well-calibrated distributions. Our experiments show that Mono-SF outperforms state-of-the-art monocular baselines, and ablation studies support the Mono-SF approach and the ProbDepthNet design.
Authors: Håkon Hukkelås, Rudolf Mester, Frank Lindseth
Conference: International Symposium on Visual Computing, Oct 7-9, 2019, Lake Tahoe (NV), USA
Abstract: We propose a novel architecture which is able to automatically anonymize faces in images while retaining the original data distribution. We ensure total anonymization of all faces in an image by generating images based exclusively on privacy-safe information. Our model is based on a conditional generative adversarial network, generating images considering the original pose and image background. The conditional information enables us to generate highly realistic faces with a seamless transition between the generated face and the existing background. Furthermore, we introduce a diverse dataset of human faces including unconventional poses, occluded faces, and a vast variability in backgrounds. Finally, we present experimental results reflecting the capability of our model to anonymize images while preserving the data distribution, making the data suitable for further training of deep learning models. As far as we know, no other solution has been proposed that guarantees the anonymization of faces while generating realistic images.
Authors: Matthias Ochs, Adrian Kretz and Rudolf Mester
Conference: German Conference on Pattern Recognition (GCPR), Dortmund, Germany, September 2019
Paper: https://arxiv.org/abs/1907.10659
Abstract: Autonomous vehicles and robots require a full scene understanding of the environment to interact with it. Such a perception typically incorporates pixel-wise knowledge of the depths and semantic labels for each image from a video sensor. Recent learning-based methods estimate both types of information independently using two separate CNNs. In this paper, we propose a model that is able to predict both outputs simultaneously, which leads to improved results and even reduced computational costs compared to independent estimation of depth and semantics. We also empirically prove that the CNN is capable of learning more meaningful and semantically richer features. Furthermore, our SDNet estimates the depth based on ordinal classification. On the basis of these two enhancements, our proposed method achieves state-of-the-art results in semantic segmentation and depth estimation from single monocular input images on two challenging datasets.
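The "depth based on ordinal classification" mentioned in this abstract can be illustrated with a small decoding sketch: depth is discretized into ordered bins, the network outputs per pixel the probability that depth exceeds each bin threshold, and decoding counts the passed thresholds. The log-spaced bin layout and the 0.5 decision rule are common choices for ordinal depth estimation, not necessarily SDNet's exact design.

```python
# Illustrative sketch (not the authors' code) of ordinal depth decoding.

def make_bins(d_min, d_max, k):
    """K depth thresholds, spaced uniformly in log depth (a common choice)."""
    ratio = (d_max / d_min) ** (1.0 / k)
    return [d_min * ratio ** i for i in range(1, k + 1)]

def decode_depth(exceed_probs, bins, d_min):
    """Ordinal decoding: depth index = number of P(depth > t_k) above 0.5."""
    index = sum(1 for p in exceed_probs if p > 0.5)
    if index == 0:
        return d_min
    return bins[index - 1]

bins = make_bins(1.0, 80.0, 8)
# A pixel whose network output passes the first three of eight thresholds:
probs = [0.97, 0.91, 0.73, 0.41, 0.2, 0.1, 0.05, 0.01]
print(round(decode_depth(probs, bins, 1.0), 2))
```

The appeal of the ordinal formulation is that a one-bin decoding error costs little, whereas a plain classification loss would penalize all wrong bins equally.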
Authors: Ricardo M. Sánchez, Rudolf Mester and Mikhail Kudryashev
Conference: European Signal Processing Conference (EUSIPCO), A Coruña, Spain, September 2019
Abstract: Volume alignment is a computationally intensive task. In Subtomogram Averaging (StA) from electron cryo-tomograms (CryoET), thousands of subtomograms are aligned to a reference, which may take hours to days of computational time. CryoET datasets contain a limited number of noisy projections with a very low signal-to-noise ratio (SNR). The noisy subtomograms are aligned to a reference using cross-correlation, an operation that can be optimized when working with limited angle tomograms (LAT), as they are sparse in Fourier space. We propose a projected cross-correlation (pCC) algorithm, a faster approach to compute the cross-correlation between a limited angle (sub)-tomogram and a given reference, and we use pCC to design a new procedure for volume alignment. pCC employs the projections to calculate the cross-correlation with lower computational complexity, as it works with a set of 2D projections instead of volumes. With this, we propose the Substacks Averaging (SsA) method as an alternative to the conventional Subtomogram Averaging (StA) used in CryoET. Finally, our tests show that our implementation of SsA is considerably faster than the reference StA implementation: for 41 projections (k=41) and N=200, SsA is 35 times faster, and for N=320, it is 150 times faster.
Authors: Robin Kreuzig, Matthias Ochs and Rudolf Mester
Conference: Conference on Computer Vision and Pattern Recognition Workshop (CVPR-WAD), Long Beach, USA, June 2019
Paper: https://arxiv.org/abs/1904.08105
Abstract: Classical monocular vSLAM/VO methods suffer from the scale ambiguity problem. Hybrid approaches solve this problem by adding deep learning methods, for example by using depth maps which are predicted by a CNN. We suggest that it is better to base scale estimation on estimating the traveled distance for a set of subsequent images. In this paper, we propose a novel end-to-end many-to-one traveled distance estimator. By using a deep recurrent convolutional neural network (RCNN), the traveled distance between the first and last image of a set of consecutive frames is estimated by our DistanceNet. Geometric features are learned in the CNN part of our model, which are subsequently used by the RNN to learn dynamics and temporal information. Moreover, we exploit the natural order of distances by using ordinal regression to predict the distance. The evaluation on the KITTI dataset shows that our approach outperforms current state-of-the-art deep learning pose estimators and classical mono vSLAM/VO methods in terms of distance prediction. Thus, our DistanceNet can be used as a component to solve the scale problem and help improve current and future classical mono vSLAM/VO methods.
Authors: Ricardo M. Sánchez, Rudolf Mester and Mikhail Kudryashev
Conference: Scandinavian Conference on Image Analysis (SCIA), Norrköping, Sweden, June 2019
Abstract: The cross-correlation is a fundamental operation in signal processing, as it is used as a metric of similarity and to estimate translations between signals. Due to computational efficiency, it is calculated in Fourier space. Limited angle tomograms are reconstructed from a reduced number of projections and have a sparse Fourier transform. We propose a new method to calculate the cross-correlation between a signal and a limited angle tomogram. The method is called projected Cross Correlation (pCC) and takes advantage of the mentioned sparsity: it calculates the cross-correlation only where the limited angle tomogram has valid Fourier information. pCC, instead of following the traditional reconstruct-then-cross-correlate process, follows a project, cross-correlate and reconstruct process. Both methods are equivalent, but the proposed one has lower computational complexity and provides a significant speedup for larger tomograms, as we confirm with our experiments. Additionally, we add the l1 penalty function to the cross-correlation, which highlights the peaks and improves its reliability.
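The Fourier-space observation underlying both pCC papers can be illustrated with a toy sketch (not the authors' implementation): cross-correlation is a pointwise product in Fourier space, so for a limited-angle tomogram that product only needs evaluating on the tomogram's valid Fourier support. The 2D wedge mask below is a crude stand-in for the missing wedge of a real limited-angle reconstruction.

```python
import numpy as np

# Sketch: cross-correlation restricted to the valid Fourier support of a
# simulated limited-angle "tomogram" (a wedge-masked version of the reference).

rng = np.random.default_rng(0)
reference = rng.standard_normal((32, 32))

# Simulated limited-angle data: zero out a wedge of Fourier coefficients.
F_ref = np.fft.fft2(reference)
fy = np.fft.fftfreq(32)[:, None]
fx = np.fft.fftfreq(32)[None, :]
mask = np.abs(fy) <= np.abs(fx) + 1e-12      # crude "valid wedge" support
tomogram = np.real(np.fft.ifft2(F_ref * mask))

# Masked Fourier-space cross-correlation: multiply only on the valid support.
F_tomo = np.fft.fft2(tomogram)
cc = np.real(np.fft.ifft2(np.conj(F_tomo) * (np.fft.fft2(reference) * mask)))

peak = np.unravel_index(np.argmax(cc), cc.shape)
print(peak)  # the un-shifted pair correlates best at zero displacement
```

Coefficients outside the wedge contribute nothing to the product, which is the source of the computational savings when the correlation is organized around projections rather than full volumes.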
Authors: Patrick Klose and Rudolf Mester
Conference: Applications of Intelligent Systems (APPIS), Gran Canaria, Spain, January 2019
Abstract: In the field of Autonomous Driving, the system controlling the vehicle can be seen as an agent acting in a complex environment and thus naturally fits into the modern framework of Reinforcement Learning. However, learning to drive can be a challenging task and current results are often restricted to simplified driving environments. To advance the field, we present a method to adaptively restrict the action space of the agent according to its current driving situation and show that it can be used to swiftly learn to drive in a realistic environment based on the Deep Q-Network algorithm.
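The action-space restriction described in this abstract can be sketched minimally: before the greedy argmax over Q-values, actions that the current driving situation rules out are masked away. The action set and the "leftmost lane" situation below are made-up placeholders; in the paper the restriction adapts to the agent's actual driving state.

```python
# Sketch of adaptive action restriction for a DQN-style agent: the greedy
# policy only considers actions permitted in the current situation.

NEG_INF = float("-inf")

def restricted_argmax(q_values, allowed):
    """Greedy action over Q-values with disallowed actions masked out."""
    best_action, best_q = None, NEG_INF
    for action, q in enumerate(q_values):
        if allowed[action] and q > best_q:
            best_action, best_q = action, q
    return best_action

# Hypothetical actions: 0 = steer left, 1 = keep lane, 2 = steer right, 3 = brake.
q_values = [0.9, 0.4, 0.7, 0.1]
# Hypothetical situation: on the leftmost lane, steering left is forbidden.
allowed = [False, True, True, True]
print(restricted_argmax(q_values, allowed))  # picks action 2, not the masked 0
```

Shrinking the action set this way reduces the exploration the agent has to do, which is one way such a restriction can speed up learning.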
2018
Authors: Nolang Fanani, Matthias Ochs and Rudolf Mester
Conference: European Conference on Computer Vision Workshop (ECCVW-GMDL), Munich, Germany, September 2018
Abstract: This paper presents a method for detecting independently moving objects (IMOs) from a monocular camera mounted on a moving car. We use an existing state-of-the-art monocular sparse visual odometry/SLAM framework, and specifically attack the notorious problem of identifying those IMOs which move parallel to the ego-car motion, that is, in an 'epipolar-conformant' way. IMO candidate patches are obtained from an existing CNN-based car instance detector. While crossing IMOs can be identified as such by epipolar consistency checks, IMOs that move parallel to the camera motion are much harder to detect, as their epipolar conformity allows them to be misinterpreted as static objects at a wrong distance. We employ a CNN to provide an appearance-based depth estimate, and the ambiguity problem can be solved through depth verification. The obtained motion labels (IMO/static) are then propagated over time using the combination of motion cues and appearance-based information of the IMO candidate patches. We evaluate the performance of our method on the KITTI dataset. The manually labeled IMO-aware label data for KITTI will be made available to the public.
Authors: Fabian Brickwedde, Steffen Abraham and Rudolf Mester
Conference: European Conference on Computer Vision Workshop (ECCVW-CVRSUAD), Munich, Germany, September 2018
Abstract: The stixel-world is a compact and detailed environment representation specially designed for street scenes and automotive vision applications. A recent work proposes a monocamera-based stixel estimation method that builds on the structure-from-motion principle and a scene model to predict the depth and translational motion of the static and dynamic parts of the scene. In this paper we propose to exploit the recent advances in deep learning based single image depth prediction for mono-stixel estimation. In our approach the mono-stixels are estimated based on the single image depth predictions, a dense optical flow field and semantic segmentation, supported by prior knowledge about the characteristics of typical street scenes. To provide a meaningful estimation, it is crucial to model the statistical distribution of all measurements, which is especially challenging for the single image depth predictions. Therefore, we present a semantic class dependent measurement model of the single image depth prediction derived from the empirical error distribution on the KITTI dataset. Our experiments on the KITTI Stereo 2015 dataset show that we can significantly improve the quality of mono-stixel estimation by exploiting the single image depth prediction. Furthermore, our proposed approach is able to handle partly occluded moving objects as well as scenarios without translational motion of the camera.
Authors: Nolang Fanani, Matthias Ochs, Alina Stürck and Rudolf Mester
Conference: Intelligent Vehicles Symposium (IV), Changshu, China, June 2018
Acknowledgement: Paper has won the second prize of the Best Paper Award
Abstract: This paper presents a method for detecting independently moving objects (IMOs) from a monocular camera mounted on a moving car. A CNN-based classifier is employed to generate IMO candidate patches; independent motion is detected by geometric criteria on keypoint trajectories in these patches. Instead of looking only at two consecutive frames, we analyze keypoints inside the IMO candidate patches through multi-frame epipolar consistency checks. The obtained motion labels (IMO/static) are then propagated over time using the combination of motion cues and appearance-based information of the IMO candidate patches. We evaluate the performance of our method on the KITTI dataset, focusing on sub-sequences containing IMOs.
Authors: Matthias Ochs, Henry Bradler and Rudolf Mester
Conference: Intelligent Vehicles Symposium (IV), Changshu, China, June 2018
Abstract: In the area of autonomous driving, sensing the environment is most important for self-localization and egomotion estimation. Visual odometry/SLAM methods have proven capable of achieving good results, even in real-time applications, by operating in a sparse mode. Running on a sequence, these methods need to continuously incorporate new features well distributed over the image. Therefore, the performance of these methods can be further improved if they are supplied with coarse but dense initial depth information that can be utilized at arbitrary sparse image positions. Previously triangulated depths and even high quality depth measurements of a LIDAR sensor are not suitable for this task, since they only provide a sparse depth map. To solve this issue, we introduce a novel interpolation method called Spatio-Temporal Depth Interpolation (STDI), which exploits spatial and temporal correlations of the data (e.g. sequences of sparse depth maps) to give a consistent dense output including associated uncertainties. STDI is a fused approach, which makes use of the most important components of a principal component analysis (PCA) (spatial information) and is additionally capable of re-using information of previously interpolated depth maps in a regression based approach (temporal information). We evaluate the quality of STDI on the KITTI visual odometry benchmark, where a sequence of extremely sparsely sampled depth maps (around 40 depth values) is densified, and on the KITTI depth completion benchmark. The latter deals with the densification of sparse LIDAR input. Of course, our method is not limited to these applications and can be used for any densification of sparse sequential data which is expected to contain spatial and/or temporal correlations (e.g. initialization for dense optical flow methods based on a sparse measurement).
Authors: Nolang Fanani and Rudolf Mester
Conference: Southwest Symposium on Image Analysis and Interpretation (SSIAI), Las Vegas, USA, April 2018
Abstract: We analyze the depth reconstruction precision and sensitivity of two-frame triangulation for the case of general motion, and focus on the case of monocular visual odometry, that is: a single camera looking mostly in the direction of motion. The results confirm intuitive assumptions about the limited triangulation precision close to the focus of expansion.
Authors: Fabian Brickwedde, Steffen Abraham and Rudolf Mester
Conference: International Conference on Robotics and Automation (ICRA), Australia, May 2018
Abstract: This paper presents Mono-Stixels, a method to estimate stixels from a monocular camera sequence instead of the traditionally used stereo depth measurements. The main contribution is to provide a reliable depth reconstruction of both the static and moving parts of the scene. In comparison with existing works, the proposed method gives better results. Experiments show that Mono-Stixels could be the enabler to use the Stixel World in a monocamera setup for driver assistance systems or autonomous vehicles.
Authors: Patrick Klose and Rudolf Mester
Conference: Applications of Intelligent Systems (APPIS), Spain, January 2018
Paper: https://arxiv.org/abs/1712.04363
Abstract: Using Deep Reinforcement Learning (DRL) is a promising approach to handle tasks in the field of (simulated) autonomous driving, whereby recent publications only consider learning in unusual driving environments. This paper outlines software we developed which can instead be used to evaluate DRL algorithms based on realistic road networks and therefore in more usual driving environments. Furthermore, we identify difficulties that arise when DRL algorithms are applied to tasks in which it is not only important to reach a goal, but also how this goal is reached. We conclude this paper by outlining a new DRL algorithm which can solve these problems.
2017
Authors: Nolang Fanani, Alina Stürck, Matthias Ochs, Henry Bradler, Rudolf Mester
Journal: Image and Vision Computing, September 2017
Paper: http://www.sciencedirect.com
Abstract: Visual odometry using only a monocular camera faces more algorithmic challenges than stereo odometry. We present a robust monocular visual odometry framework for automotive applications. An extended propagation-based tracking framework is proposed which yields highly accurate (unscaled) pose estimates. Scale is supplied by ground plane pose estimation employing street pixel labeling using a convolutional neural network (CNN). The proposed framework has been extensively tested on the KITTI dataset and achieves a higher rank than currently published state-of-the-art monocular methods in the KITTI odometry benchmark. Unlike other VO/SLAM methods, this result is achieved without a loop closing mechanism, without RANSAC and also without multiframe bundle adjustment. Thus, we challenge the common belief that robust systems can only be built using iterative robustification tools like RANSAC.
Authors: Christian Conrad and Rudolf Mester
Conference: Workshop on Biological Inspired Computer Vision, Catania, Sicily, Italy, September 2017
Abstract: In this work we study unsupervised learning of correspondence relations over extended image sequences. We are specifically interested in learning the correspondence relations 'from scratch' and only consider the temporal signal of single pixels. We build on the Temporal Coincidence Analysis (TCA) approach, which we apply to motion estimation. Experimental results showcase the approach for learning average motion maps and for the estimation of yaw rates in a visual odometry setting. Our approach is not meant as a direct competitor to state-of-the-art dense motion algorithms but rather shows that valuable information for various vision tasks can be learnt by a simple statistical analysis on the pixel level. Primarily, the approach unveils principles on which biological or 'deep' learning techniques may build architectures for motion perception; so TCA formulates a hypothesis for a fundamental perception mechanism. Motion or correspondence distributions as they are determined here may associate conventional methods with a confidence measure, which allows the detection of implausible, and thus probably incorrect, correspondences. The approach does not need any kind of ground truth information, but rather learns over long image sequences and may thus be seen as a continuous learning method. The method is not restricted to a specific camera model and works even with strong geometric distortions. Results are presented for standard as well as fisheye cameras.
Authors: Nolang Fanani, Alina Stürck and Rudolf Mester
Conference: Intelligent Vehicles Symposium (IV), Los Angeles, USA, June 2017
Abstract: Monocular visual odometry / SLAM requires the ability to deal with the scale ambiguity problem, or equivalently to transform the estimated unscaled poses into correctly scaled poses. While propagating the scale is possible, it is very prone to the scale drift effect. We address the problem of monocular scale estimation by proposing a multimodal mechanism of prediction, classification, and correction. Our scale correction scheme combines cues from both dense and sparse ground plane estimation; this makes the proposed method robust toward varying availability and distribution of trackable ground structure. Instead of optimizing the parameters of the ground plane related homography, we parametrize and optimize the underlying motion parameters directly. Furthermore, we employ classifiers to detect scale outliers based on various features (e.g. moments on residuals). We test our method on the challenging KITTI dataset and show that the proposed method is capable of providing significantly better scale estimates than current state-of-the-art monocular methods.
Authors: Matthias Ochs, Henry Bradler and Rudolf Mester
Conference: Intelligent Vehicles Symposium (IV), Los Angeles, USA, June 2017
Paper: https://arxiv.org/abs/1703.05061
Abstract: Most iterative optimization algorithms for motion, depth estimation or scene reconstruction, both sparse and dense, rely on a coarse but reliable dense initialization to bootstrap their optimization procedure. This creates a need for techniques that obtain a dense, but still approximate, representation of a desired 2D structure (e.g., depth maps, optical flow, disparity maps) from a very sparse measurement of this structure. The method presented here exploits the complete information given by the principal component analysis (PCA): the principal basis and its prior distribution. The method is able to determine a dense reconstruction even if only a very sparse measurement is available. When facing such situations, typically the number of principal components is further reduced, which results in a loss of expressiveness of the basis. We overcome this problem and inject prior knowledge in a maximum a posteriori (MAP) approach. We test our approach on the KITTI and the Virtual KITTI datasets and focus on the interpolation of depth maps for driving scenes. The evaluation of the results shows good agreement with the ground truth and is clearly superior to the results of an interpolation by the nearest neighbor method, which disregards statistical information.
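The MAP idea in this abstract can be demonstrated with a toy numerical sketch (dimensions, prior and noise level are invented, not taken from the paper): a dense field is modeled as mean + basis @ coeffs, only a handful of entries are observed, and a Gaussian prior on the PCA coefficients regularizes the otherwise underdetermined reconstruction.

```python
import numpy as np

# Toy MAP reconstruction of a dense field from very sparse samples,
# using a low-dimensional basis with a Gaussian prior on its coefficients.

rng = np.random.default_rng(1)
n, k = 200, 5                       # dense field size, number of components

basis = np.linalg.qr(rng.standard_normal((n, k)))[0]   # orthonormal "PCA" basis
prior_var = np.array([5.0, 4.0, 3.0, 2.0, 1.0])        # per-component prior
mean = np.zeros(n)

# Ground-truth field drawn from the model, observed at 12 random positions.
true_coeffs = rng.standard_normal(k) * np.sqrt(prior_var)
dense_true = mean + basis @ true_coeffs
obs_idx = rng.choice(n, size=12, replace=False)
noise_var = 1e-4
y = dense_true[obs_idx] + rng.standard_normal(12) * np.sqrt(noise_var)

# MAP estimate of the coefficients given only the sparse observations.
B_obs = basis[obs_idx]
A = B_obs.T @ B_obs / noise_var + np.diag(1.0 / prior_var)
coeffs_map = np.linalg.solve(A, B_obs.T @ (y - mean[obs_idx]) / noise_var)
dense_map = mean + basis @ coeffs_map

rmse = np.sqrt(np.mean((dense_map - dense_true) ** 2))
print(f"dense RMSE from 12 of {n} samples: {rmse:.4f}")
```

The prior term `np.diag(1.0 / prior_var)` is what keeps the solve well-posed even when the number of observations approaches the number of components, which is the situation the abstract highlights.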
Authors: Henry Bradler, Matthias Ochs, Nolang Fanani and Rudolf Mester
Conference: IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, California, USA, March 2017
Paper: https://arxiv.org/abs/1703.05065
Abstract: Traditionally, pose estimation is considered as a two-step problem. First, feature correspondences are determined by direct comparison of image patches, or by associating feature descriptors. In a second step, the relative pose and the coordinates of corresponding points are estimated, most often by minimizing the reprojection error (RPE). RPE optimization is based on a loss function that is merely aware of the feature pixel positions but not of the underlying image intensities. In this paper, we propose a sparse direct method which introduces a loss function that allows us to simultaneously optimize the unscaled relative pose as well as the set of feature correspondences, directly considering the image intensity values. Furthermore, we show how to integrate statistical prior information on the motion into the optimization process. This constructive inclusion of a Bayesian bias term is particularly efficient in application cases with strongly predictable (short term) dynamics, e.g. in a driving scenario. In our experiments, we demonstrate that the 'JET' algorithm we propose outperforms the classical reprojection error optimization on two synthetic datasets and on the KITTI dataset. The JET algorithm runs in real-time on a single CPU thread.
Authors: Christian Conrad
Dissertation
Brief summary of the thesis by the author:
In this thesis, I investigate learning of correspondence relations in long streams of visual data. In the first part of the thesis, I regard learning of local correspondence relations, i.e., pixel correspondences, which are at the core of stereo vision and motion estimation. My primary goal is not to determine pixel-to-pixel correspondences at a specific time instant, but to learn the distribution of the correspondence relation by only regarding the temporal course of single pixels. In contrast to classic approaches, I show that stereo and motion perception may be learnt without explicitly computing stereo disparity or motion vectors. In the second part of the thesis, I regard learning of global correspondence relations, i.e., global mappings or transformations between pairs of image streams. In contrast to classic approaches, I do not fit the parameters of a parametric model based on spatial image features. Instead, I determine sets of new basis vectors, which implicitly encode the underlying transformation. This is done without the need of computing spatial features, but requires pairs of long image streams in which the global transformation is held fixed.
2016
Authors: Jan van den Brand, Matthias Ochs and Rudolf Mester
Conference: Asian Conference on Computer Vision (ACCV) – Workshop on Computer Vision Technologies for Smart Vehicle, Taipei, Taiwan, November 2016
Abstract: The recognition of individual object instances in single monocular images is still an incompletely solved task. In this work, we propose a new approach for detecting and separating vehicles in the context of autonomous driving. Our method uses a fully convolutional network (FCN) for semantic labeling and for estimating the boundary of each vehicle. Even though a contour is in general a one-pixel-wide structure which cannot be directly learned by a CNN, our network addresses this by providing areas around the contours. Based on these areas, we separate the individual vehicle instances. In our experiments, we show on two challenging datasets (Cityscapes and KITTI) that we achieve state-of-the-art performance, despite using a subsampling rate of two. Our approach even outperforms all recent works w.r.t. several rating scores.
Authors: Peter Pinggera, Sebastian Ramos, Stefan Gehrig, Uwe Franke, Carsten Rother and Rudolf Mester
Conference: International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea, October 2016
Paper: https://arxiv.org/abs/1609.04653
Abstract: Detecting small obstacles on the road ahead is a critical part of the driving task which has to be mastered by fully autonomous cars. In this paper, we present a method based on stereo vision to reliably detect such obstacles from a moving vehicle. The proposed algorithm performs statistical hypothesis tests in disparity space directly on stereo image data, assessing freespace and obstacle hypotheses on independent local patches. This detection approach does not depend on a global road model and handles both static and moving obstacles. For evaluation, we employ a novel lost-cargo image sequence dataset comprising more than two thousand frames with pixelwise annotations of obstacle and free-space and provide a thorough comparison to several stereo-based baseline methods. The dataset will be made available to the community to foster further research on this important topic. The proposed approach outperforms all considered baselines in our evaluations on both pixel and object level and runs at frame rates of up to 20 Hz on 2 mega-pixel stereo imagery. Small obstacles down to the height of 5 cm can successfully be detected at 20 m distance at low false positive rates.
Authors: Christian Conrad and Rudolf Mester
Conference: Statistical Signal Processing Workshop (SSP), Palma de Mallorca, Spain, June 2016
Abstract: Correspondence relations between different views of the same scene can be learnt in an unsupervised manner. We address autonomous learning of arbitrary fixed spatial (point-to-point) mappings. Since any such transformation can be represented by a permutation matrix, the signal model is a linear one, whereas the proposed analysis method, mainly based on Canonical Correlation Analysis (CCA), rests on a generalized eigensystem problem, i.e. a nonlinear operation. The learnt transformation is represented implicitly in terms of pairs of learned basis vectors and neither uses nor requires an analytic / parametric expression for the latent mapping. We show how the rank of the signal that is shared among views may be determined from canonical correlations and how the overlapping (=shared) dimensions among the views may be inferred.
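The rank-from-canonical-correlations idea in this abstract can be shown with a small numerical sketch (data sizes and the latent model are invented): two views share a 2-dimensional latent signal plus independent noise, and the canonical correlations, computed here as singular values of the whitened cross-covariance, reveal that shared rank.

```python
import numpy as np

# Sketch of CCA recovering the shared rank between two noisy views.

rng = np.random.default_rng(2)
n = 20000
latent = rng.standard_normal((2, n))              # shared signal, rank 2

A = rng.standard_normal((5, 2))                   # view-specific mixings
B = rng.standard_normal((5, 2))
x = A @ latent + 0.1 * rng.standard_normal((5, n))
y = B @ latent + 0.1 * rng.standard_normal((5, n))

def inv_sqrt(c):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, v = np.linalg.eigh(c)
    return v @ np.diag(1.0 / np.sqrt(w)) @ v.T

cxx = np.cov(x)
cyy = np.cov(y)
cxy = (x - x.mean(1, keepdims=True)) @ (y - y.mean(1, keepdims=True)).T / (n - 1)

# Canonical correlations = singular values of the whitened cross-covariance.
corr = np.linalg.svd(inv_sqrt(cxx) @ cxy @ inv_sqrt(cyy), compute_uv=False)
print(np.round(corr, 3))  # two correlations near 1, the rest near 0
```

The gap between the large and small canonical correlations is what lets the shared rank be read off, as the abstract describes.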
Authors: Daniel Biedermann, Matthias Ochs and Rudolf Mester
Conference: Intelligent Vehicles Symposium (IV), Gothenburg, Sweden, June 2016
Abstract: To aid in the development and evaluation of vision algorithms in the context of driver assistance applications and traffic surveillance, we created a framework that allows for the continuous creation of highly realistic image sequences featuring traffic scenarios. The sequences are created with a realistic, state-of-the-art vehicle physics model, and different kinds of environments are featured, thus providing a wide range of testing scenarios. Due to the physically-based rendering technique and camera model employed for the image rendering process, we can simulate different sensor setups and provide appropriate and fully accurate ground truth data for them.
Authors: Nolang Fanani and Rudolf Mester
Conference: Intelligent Vehicles Symposium (IV), Gothenburg, Sweden, June 2016
Abstract: Driver assistance has been a major application field in recent decades. One of the major steps of structure-from-motion approaches is to track surrounding keypoints and to recognize the trajectories of the keypoints. This paper presents a method to obtain the trajectories of keypoints from a sequence of images. The keypoint trajectories are accumulated by implementing keypoint tracking through the propagation of the predicted 3D position of the keypoint. Experiments on the KITTI dataset as well as on a synthetic dataset show that accurate keypoint trajectories are attainable.
Authors: Nolang Fanani and Rudolf Mester
Conference: Southwest Symposium on Image Analysis and Interpretation (SSIAI), Santa Fe, New Mexico, USA, March 2016
Abstract: One of the major steps in visual environment perception for automotive applications is to track keypoints and to subsequently estimate egomotion and environment structure from the trajectories of these keypoints. This paper presents a propagation-based tracking method to obtain the 2D trajectories of keypoints from a sequence of images in a monocular camera setup. Instead of relying on the classical RANSAC to obtain accurate keypoint correspondences, we steer the search for keypoint matches by propagating the estimated 3D position of the keypoint into the next frame and verifying the photometric consistency. In this process, we continuously predict, estimate and refine the frame-to-frame relative pose which induces the epipolar relation. Experiments on the KITTI dataset as well as on the synthetic COnGRATS dataset show promising results on the estimated courses and accurate keypoint trajectories.
2015
Authors: Nolang Fanani, Marc Barnada and Rudolf Mester
Conference: International Symposium on Visual Computing (ISVC), Las Vegas, Nevada, USA, December 2015
Abstract: Tracking keypoints through a video sequence is a crucial first step in the processing chain of many visual SLAM approaches. This paper presents a robust initialization method to provide the initial match for a keypoint tracker, from the 1st frame where a keypoint is detected to the 2nd frame, that is: when no depth information is available. We deal explicitly with the case of long displacements. The starting position is obtained through an optimization that employs a distribution of motion priors based on pyramidal phase correlation, and epipolar geometry constraints. Experiments on the KITTI dataset demonstrate the significant impact of applying a motion prior to the matching. We provide detailed comparisons to the state-of-the-art methods.
Authors: Henry Bradler, Birthe Wiegand and Rudolf Mester
Conference: International Conference on Computer Vision (ICCV) – Workshop on Computer VIsion for Road Scene Understanding and Autonomous Driving, Santiago de Chile, Chile, December 2015
Abstract: The motion of a driving car is highly constrained and we claim that powerful predictors can be built that 'learn' the typical egomotion statistics, and support the typical tasks of feature matching, tracking, and egomotion estimation. We analyze the statistics of the 'ground truth' data given in the KITTI odometry benchmark sequences and confirm that a coordinated turn motion model, overlaid by moderate vibrations, is a very realistic model. We develop a predictor that is able to significantly reduce the uncertainty about the relative motion when a new image frame comes in. Such predictors can be used to steer the matching process from frame n to frame n + 1. We show that they can also be employed to detect outliers in the temporal sequence of egomotion parameters.
Authors: Daniel Biedermann, Matthias Ochs and Rudolf Mester
Conference: Image and Vision Computing New Zealand (IVCNZ), Auckland, New Zealand, November 2015
Acknowledgement: Paper has won Best Student Paper Award
Abstract: For evaluating or training different kinds of vision algorithms, a large amount of precise and reliable data is needed. In this paper we present a system to create extended synthetic sequences of traffic environment scenarios, associated with several types of ground truth data. By integrating vehicle dynamics in a configuration tool, and by using path-tracing in an external rendering engine to render the scenes, a system is created that allows ongoing and flexible creation of highly realistic traffic images. For all images, ground truth data is provided for depth, optical flow, surface normals and semantic scene labeling. Sequences produced with this system are more varied and closer to natural images than those of previous synthetic datasets.
Authors: Matthias Ochs, Henry Bradler and Rudolf Mester
Conference: Pacific Rim Symposium on Image and Video Technology (PSIVT), Auckland, New Zealand, November 2015
Abstract: Phase correlation is one of the classic methods for sparse motion or displacement estimation. It is renowned in the literature for high precision and insensitivity against illumination variations. We propose several important enhancements to the phase correlation (PhC) method which render it more robust against those situations where a motion measurement is not possible (low structure, too much noise, too different image content in the corresponding measurement windows). This allows the method to perform self-diagnosis in adverse situations. Furthermore, we extend the PhC method by a robust scheme for detecting and classifying the presence of multiple motions and estimating their uncertainties. Experimental results on the Middlebury Stereo Dataset and on the KITTI Optical Flow Dataset show the potential offered by the enhanced method in contrast to the PhC implementation of OpenCV.
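For readers unfamiliar with the baseline being enhanced here, the classic phase correlation method for a single integer displacement can be sketched as follows. This is an illustrative stand-in, not the paper's robustified multi-motion version, and the function name is invented for the sketch.

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the integer shift (dy, dx) such that b ~ np.roll(a, (dy, dx), axis=(0, 1)).

    Classic PhC: normalize the cross-power spectrum to unit magnitude so that
    only phase information remains; its inverse FFT is (ideally) a delta peak
    located at the displacement.
    """
    A = np.fft.fft2(a)
    B = np.fft.fft2(b)
    cross = np.conj(A) * B
    cross /= np.maximum(np.abs(cross), 1e-12)    # keep phase only
    corr = np.real(np.fft.ifft2(cross))          # correlation surface, peak at shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # unwrap the cyclic peak position to signed displacements
    if dy > a.shape[0] // 2:
        dy -= a.shape[0]
    if dx > a.shape[1] // 2:
        dx -= a.shape[1]
    return int(dy), int(dx)
```

Because the cross-power spectrum is normalized to unit magnitude, gain and offset changes between the two patches leave the correlation peak essentially untouched, which is the illumination insensitivity the abstract mentions; the height of that peak is also the natural starting point for the self-diagnosis and multi-motion extensions the paper proposes.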
Authors: Peter Pinggera, Uwe Franke and Rudolf Mester
Conference: International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, September 2015
Abstract: Reliable detection of obstacles at long range is crucial for the timely response to hazards by fast-moving safety-critical platforms like autonomous cars. We present a novel method for the joint detection and localization of distant obstacles using a stereo vision system on a moving platform. The approach is applicable to both static and moving obstacles and pushes the limits of detection performance as well as localization accuracy. The proposed detection algorithm is based on sound statistical tests using local geometric criteria which implicitly consider non-flat ground surfaces. To achieve maximum performance, it operates directly on image data instead of precomputed stereo disparity maps. A careful experimental evaluation on several datasets shows excellent detection performance and localization accuracy up to very large distances, even for small obstacles. We demonstrate a parallel implementation of the proposed system on a GPU that executes at real-time speeds.
Authors: Christian Conrad and Rudolf Mester
Conference: International Conference on Advanced Video and Signal based Surveillance (AVSS), Karlsruhe, Germany, August 2015
Abstract: We present an approach to learn relative photometric differences between pairs of cameras which have partially overlapping fields of view. This is an important problem, especially in appearance-based methods for correspondence estimation or object identification in multi-camera systems, where grey values observed by different cameras are processed. We model intensity differences among pairs of cameras by means of a low-order polynomial (Gray Value Transfer Function, GVTF) which represents the characteristic curve of the mapping of grey values s_i produced by camera C_i to the corresponding grey values s_j acquired with camera C_j. While the estimation of the GVTF parameters is straightforward once a set of truly corresponding pairs of grey values is available, the non-trivial task in the GVTF estimation process solved in this paper is the extraction of corresponding grey value pairs in the presence of geometric and photometric errors. We also present a temporal GVTF update scheme to adapt to gradual global illumination changes, e.g., due to the change of daylight.
Authors: Mikael Persson, Tommaso Piccini, Michael Felsberg and Rudolf Mester
Conference: Intelligent Vehicles Symposium (IV), Seoul, South Korea, June 2015
Abstract: Visual odometry is one of the most active topics in computer vision. The automotive industry is particularly interested in this field due to the appeal of achieving a high degree of accuracy with inexpensive sensors such as cameras. The best results on this task are currently achieved by systems based on a calibrated stereo camera rig, whereas monocular systems are generally lagging behind in terms of performance. We hypothesise that this is due to stereo visual odometry being an inherently easier problem, rather than due to a higher quality of the state-of-the-art stereo-based algorithms. Under this hypothesis, techniques developed for monocular visual odometry systems would be, in general, more refined and robust since they have to deal with an intrinsically more difficult problem. In this work we present a novel stereo visual odometry system for automotive applications based on advanced monocular techniques. We show that the generalization of these techniques to the stereo case results in a significant improvement of the robustness and accuracy of stereo-based visual odometry. We support our claims with the system's results on the well-known KITTI benchmark, achieving the top rank among visual-only systems.
Authors: Marc Barnada, Christian Conrad, Henry Bradler, Matthias Ochs and Rudolf Mester
Conference: Intelligent Vehicles Symposium (IV), Seoul, South Korea, June 2015
Abstract: The online estimation of yaw, pitch, and roll of a moving vehicle is an important ingredient for systems which estimate egomotion and the 3D structure of the environment from video information. We present an approach to estimate these angular changes from monocular visual data, based on the fact that the motion of far distant points does not depend on translation, but only on the current rotation of the camera. The presented approach does not require features (corners, edges, …) to be extracted. It also allows estimating, in parallel, the illumination changes from frame to frame, and thus largely stabilizes the estimation of image correspondences and motion vectors, which are most often the central entities needed for computing scene structure, distances, etc. The method is significantly less complex and much faster than a full egomotion computation from features, such as PTAM, but it can be used to provide motion priors and reduce search spaces for more complex methods which perform a complete analysis of egomotion and the dynamic 3D structure of the scene in which a vehicle moves.
2014
Authors: Tommaso Piccini, Mikael Persson, Klas Nordberg, Michael Felsberg and Rudolf Mester
Conference: European Conference on Computer Vision (ECCV) – Workshop for Road Scene Understanding and Autonomous Driving, Zurich, Switzerland, September 2014
Abstract: An open issue in multiple view geometry and structure from motion, applied to real life scenarios, is the sparsity of the matched key-points and of the reconstructed point cloud. We present an approach that can significantly improve the density of measured displacement vectors in a sparse matching or tracking setting, exploiting the partial information of the motion field provided by linear oriented image patches (edgels). Our approach assumes that the epipolar geometry of an image pair already has been computed, either in an earlier feature-based matching step, or by a robustified differential tracker. We exploit key-points of a lower order, edgels, which cannot provide a unique 2D matching, but can be employed if a constraint on the motion is already given. We present a method to extract edgels, which can be effectively tracked given a known camera motion scenario, and show how a constrained version of the Lucas-Kanade tracking procedure can efficiently exploit epipolar geometry to reduce the classical KLT optimization to a 1D search problem. The potential of the proposed methods is shown by experiments performed on real driving sequences.
Authors: Peter Pinggera, David Pfeiffer, Uwe Franke, and Rudolf Mester
Conference: European Conference on Computer Vision (ECCV), Zurich, Switzerland, September 2014
Abstract: Modern applications of stereo vision, such as advanced driver assistance systems and autonomous vehicles, require highest precision when determining the location and velocity of potential obstacles. Subpixel disparity accuracy in selected image regions is therefore essential. Evaluation benchmarks for stereo correspondence algorithms, such as the popular Middlebury and KITTI frameworks, provide important reference values regarding dense matching performance, but do not sufficiently treat local sub-pixel matching accuracy. In this paper, we explore this important aspect in detail. We present a comprehensive statistical evaluation of selected state-of-the-art stereo matching approaches on an extensive dataset and establish reference values for the precision limits actually achievable in practice. For a carefully calibrated camera setup under real-world imaging conditions, a consistent error limit of 1/10 pixel is determined. We present guidelines on algorithmic choices derived from theory which turn out to be relevant to achieving this accuracy limit in practice.
Authors: Friedrich Erbs, Andreas Witte, Timo Scharwaechter, Rudolf Mester and Uwe Franke
Conference: Intelligent Vehicles Symposium (IV), Dearborn, Michigan, USA, June 2014
Abstract: Stereo vision has become established in the field of driver assistance and vehicular safety systems. The next steps along the road towards accident-free driving aim to assist the driver in increasingly complex situations such as inner-city traffic. In order to achieve these goals, it is desirable to incorporate higher-order object knowledge in the stereo vision-based understanding of traffic scenes. In particular, object shape and dimension information can help to achieve correct interpretations. However, this kind of higher-order information typically results in a difficult energy minimization problem, since large areas of the input image have to be constrained. In this contribution, an efficient global optimization approach based on dynamic programming is proposed that is able to take such higher-order object knowledge into account. The approach is built upon a simple tree representation of the Dynamic Stixel World, an efficient super-pixel object representation. Experiments show that object segmentation can be improved significantly by means of the higher-order object information.
Authors: Rudolf Mester and Christian Conrad
Conference: International Conference on Pattern Recognition (ICPR), Stockholm, Sweden, August 2014
Abstract: We discuss matching measures (scores and residuals) for comparing image patches under unknown affine photometric (=intensity) transformations. In contrast to existing methods, we derive a fully symmetric matching measure which reflects the fact that both copies of the signal are affected by measurement errors ('noise'), not only one. As it turns out, this evolves into an eigensystem problem; however, a simple direct solution for all entities of interest can be given. We strongly advocate for constraining the estimated gain ratio and the estimated mean value offset to realistic ranges, thus preventing the matching scheme from locking into unrealistic correspondences.
Authors: Rudolf Mester
Conference: Southwest Symposium on Image Analysis and Interpretation (SSIAI), San Diego, California, USA, April 2014
Abstract: The present paper analyzes some previously unexplored aspects of motion estimation that are fundamental both for discrete block matching as well as for differential 'optical flow' approaches à la Lucas-Kanade. It aims at providing a complete estimation-theoretic approach that makes the assumptions about noisy observations of samples from a continuous signal of a certain class explicit. It turns out that motion estimation is a combination of simultaneously estimating the true underlying continuous signal and optimizing the displacement between two hypothetical copies of this unknown signal. Practical schemes such as the current variants of Lucas-Kanade are just approximations to the fundamental estimation problem identified in the present paper. Derivatives appear as derivatives of the continuous signal representation kernels, not as ad hoc discrete derivative masks. The formulation via an explicit signal space defined by kernels is a precondition for analyzing e.g. the convergence range of iterative displacement estimation procedures, and for systematically choosing preconditioning filters. The paper sets the stage for further in-depth analysis of some fundamental issues that have so far been overlooked or ignored in motion analysis.
2013
Authors: Peter Pinggera, Uwe Franke and Rudolf Mester
Conference: German Conference on Pattern Recognition (GCPR), Saarbrücken, Germany, September 2013
Abstract: Precise stereo-based depth estimation at large distances is challenging: objects become very small, often exhibit low contrast in the image, and can hardly be separated from the background based on disparity due to measurement noise. In this paper we present an approach that overcomes these problems by combining robust object segmentation and highly accurate depth and motion estimation. The segmentation criterion is formulated as a probabilistic combination of disparity, optical flow and image intensity that is optimized
Authors: Christian Conrad, Matthias Mertz and Rudolf Mester
Conference: Energy Minimization Methods in Computer Vision and Pattern Recognition, Lund, Sweden, August 2013
Abstract: We propose and evaluate a versatile scheme for image pre-segmentation that generates a partition of the image into a selectable number of patches ('superpixels'), under the constraint of obtaining maximum homogeneity of the 'texture' inside each patch, and maximum accordance of the contours with both the image content as well as a Gibbs-Markov random field model. In contrast to current state-of-the-art approaches to superpixel segmentation, 'homogeneity' does not limit itself to smooth region-internal signals and high feature value similarity between neighboring pixels, but is applicable also to highly textured scenes. The energy functional that is to be maximized for this purpose has only a very small number of design parameters, depending on the particular statistical model used for the images. The capability of the resulting partitions to deform according to the image content can be controlled by a single parameter. We show by means of an extensive comparative experimental evaluation that the compactness-controlled contour-relaxed superpixel method outperforms the state-of-the-art superpixel algorithms with respect to boundary recall and undersegmentation error while being faster or on a par with respect to runtime.
Authors: Jens Eisenbach, Matthias Mertz, Christian Conrad and Rudolf Mester
Conference: International Conference on Advanced Video and Signal-Based Surveillance (AVSS), Kraków, Poland, August 2013
Abstract: We analyze the consequences of instabilities and fluctuations, such as camera shaking and illumination/exposure changes, on typical surveillance video material and devise a systematic way to compensate for these changes as far as possible. The phase correlation method plays a decisive role in the proposed scheme, since it is inherently insensitive to gain and offset changes, as well as to different linear degradations (due to time-variant motion blur) in subsequent images. We show that the listed variations can be compensated effectively, and that the image data can be equilibrated significantly before a temporal change detection and/or a background-based detection is performed. We verify the usefulness of the method by comparative tests with and without stabilization, using the changedetection.net benchmark and several state-of-the-art detection methods.
Authors: Jens Eisenbach, Christian Conrad and Rudolf Mester
Conference: Conference on Computer Vision and Pattern Recognition (CVPR) – Workshop on Camera Networks and Wide Area Scene Analysis, Portland, Oregon, USA, June 2013
Abstract: This paper addresses the problem of finding corresponding image patches in multi-camera video streams by means of an unsupervised learning method. We determine patch-to-patch correspondence relations ('correspondence priors') merely using information from a temporal change detection. Correspondence priors are essential for geometric multi-camera calibration, but are also useful for further vision tasks such as object tracking and recognition. Since any change detection method with reasonable performance can be applied, our method can be used as an encapsulated processing module and be integrated into existing systems without major structural changes. The only assumption made is that the relative orientation of pairs of cameras may be arbitrary, but fixed, and that the observed scene shows visual activity. Experimental results show the applicability of the presented approach in real-world scenarios where the camera views show large differences in orientation and position. Furthermore, we show that a classic spatial matching pipeline, e.g., based on SIFT, will typically fail to determine correspondences in these kinds of scenarios.
Authors: Philipp Koschorrek, Tommaso Piccini, Per Öberg, Michael Felsberg, Lars Nielsen and Rudolf Mester
Conference: Conference on Computer Vision and Pattern Recognition (CVPR) – Workshop on Ground Truth: What is a good dataset, Portland, Oregon, USA, June 2013
Abstract: The development of vehicles that perceive their environment, in particular those using computer vision, indispensably requires large databases of sensor recordings obtained from real cars driven in realistic traffic situations. These datasets should be time-stamped to enable synchronization of sensor data from different sources. Furthermore, full surround environment perception requires high frame rates of synchronized omnidirectional video data to prevent information loss at any speed. This paper describes an experimental setup and software environment for recording such synchronized multi-sensor data streams and storing them in a new open source format. The dataset consists of sequences recorded in various environments from a car equipped with an omnidirectional multi-camera, height sensors, an IMU, a velocity sensor, and a GPS. The software environment for reading these datasets will be provided to the public, together with a collection of long multi-sensor and multi-camera data streams stored in the developed format.
Authors: Vasileios Zografos, Liam Ellis and Rudolf Mester
Conference: Conference on Computer Vision and Pattern Recognition (CVPR), Portland, Oregon, USA, June 2013
Abstract: We present a novel method for clustering data drawn from a union of arbitrary dimensional subspaces, called Discriminative Subspace Clustering (DiSC). DiSC solves the subspace clustering problem by using a quadratic classifier trained from unlabeled data (clustering by classification). We generate labels by exploiting the locality of points from the same subspace and a basic affinity criterion. A number of classifiers are then diversely trained from different partitions of the data, and their results are combined together in an ensemble, in order to obtain the final clustering result. We have tested our method with 4 challenging datasets and compared against 8 state-of-the-art methods from literature. Our results show that DiSC is a very strong performer in both accuracy and robustness, and also of low computational complexity.
Authors: Christian Conrad and Rudolf Mester
Conference: Scandinavian Conferences on Image Analysis, Espoo, Finland, June 2013
Abstract: In this work we present an approach to automatically learn pixel correspondences between pairs of cameras. We build on the method of Temporal Coincidence Analysis (TCA) and extend it from the purely temporal (i.e. single-pixel) to the spatiotemporal domain. Our approach is based on learning a statistical model for local spatiotemporal image patches, determining rare and expressive events from this model, and matching these events across multiple views. Accumulating multi-image coincidences of such events over time allows learning the desired geometric and photometric relations. The presented method also works for strongly different viewpoints and camera settings, including substantial rotation and translation. The only assumption made is that the relative orientation of pairs of cameras may be arbitrary, but fixed, and that the observed scene shows visual activity. We show that the proposed method outperforms the single-pixel approach to TCA both in terms of learning speed and accuracy.
2012
Authors: Christian Conrad and Rudolf Mester
Conference: British Machine Vision Conference (BMVC), Guildford, United Kingdom, September 2012
Abstract: In this work, we study unsupervised learning of correspondence relations (point-to-point, or point-to-point-set) in binocular video streams. This is useful for low-level vision tasks in stereo vision or motion estimation as well as in high-level applications like object tracking. In contrast to popular probabilistic methods for unsupervised (feature) learning, often involving rather sophisticated machinery and optimization schemes, we present a sampling-free algorithm based on Canonical Correlation Analysis (CCA), and show how 'correspondence priors' can be determined in closed form. Specifically, given video streams of two views of a scene, our algorithm first determines pixel correspondences on a coarse scale. Subsequently it projects those correspondences to the original resolution. For each point in video channel A, regions of high probability containing the corresponding point in image channel B are determined, thus forming correspondence priors. Such correspondence priors may then be plugged into probabilistic and energy-based formulations of specific vision applications. Experimental results show the applicability of the proposed method in very different real-world scenarios where the binocular views may be subject to substantial spatial transformations.
Authors: David Dederscheck, Thomas Müller and Rudolf Mester
Conference: Intelligent Vehicles Symposium (IV), Alcalá de Henares, Spain, June 2012
Abstract: In recent years, advanced video sensors have become common in driver assistance, coping with the highly dynamic lighting conditions by nonlinear exposure adjustments. However, many computer vision algorithms are still highly sensitive to the resulting sudden brightness changes. We present a method that is able to estimate the relative intensity transfer function (RITF) between images in a sequence, even for moving cameras. The corresponding compensation of the input images can improve the performance of further vision tasks significantly, demonstrated here by results from optical flow. Our method identifies corresponding intensity values from areas in the images where no apparent motion is present. The RITF is then estimated from that data and regularized based on its curvature. Finally, built-in tests reliably flag image pairs with 'adverse conditions' where no compensation could be performed.
Authors: Alvaro Guevara, Christian Conrad and Rudolf Mester
Conference: Southwest Symposium on Image Analysis and Interpretation (SSIAI), Santa Fe, New Mexico, USA, April 2012
Abstract: We present an approach to unveil the underlying structure of dynamic scenes from a sparse set of local flow measurements. We first estimate those measurements at carefully selected locations, and subsequently group them into a finite set of different dense flow field hypotheses. These flow fields are represented as parametric functional models, and the number of flow models (=clusters) is determined by an information-theory based approach. Methodically, the grouping task is a two-step clustering scheme, whose intra-cluster modeling step exploits prior knowledge on real flow fields by enforcing low curvature, and the individual covariance matrices of the sparse local flow measurements are introduced in a principled way. The method has been tested successfully on both stereo and general motion sequences from the standard Middlebury database.
Authors: Rudolf Mester
Conference: Southwest Symposium on Image Analysis and Interpretation (SSIAI), Santa Fe, New Mexico, USA, April 2012
Abstract: This paper formulates the problem of estimating motion or geometric transforms between images in a Bayesian manner, stressing the relation between continuous and discrete formulations and emphasizing the necessity to employ stochastic distributions on function spaces.
Before 2012
Rudolf Mester: Recursive live dense reconstruction: some comments on established and imaginable new approaches. In Live Dense Reconstruction Workshop, Proc. IEEE ICCV 2011, Barcelona, Spain, November 2011.
Alvaro Guevara, Christian Conrad and Rudolf Mester: Boosting segmentation results by contour relaxation. IEEE Intern. Conf. on Image Processing (ICIP 2011), pages 1405-1408, Brussels, September 2011.
Thomas Müller, Clemens Rabe, Jens Rannacher, Uwe Franke, Rudolf Mester: Illumination-Robust Dense Optical Flow Using Census Signatures. German Conference of Pattern Recognition (GCPR), Frankfurt, Germany, September 2011.
Christian Conrad, Alvaro Guevara, Rudolf Mester: Learning multi-view correspondences from temporal coincidences. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Colorado Springs, CO, USA, June 20-25, 2011.
Christian Conrad, Alvaro Guevara, Rudolf Mester: Rare Events as a Powerful Cue for Finding Multi-View Correspondences. Swedish Symposium on Image Analysis, SSBA 2011, Linköping, March 2011.
Rudolf Mester, Christian Conrad, Alvaro Guevara: Multichannel Segmentation Using CR: Fast Super-Pixels and Temporal Propagation. Scandinavian Conference on Image Analysis, SCIA 2011, Ystad, May 2011.
Alvaro Guevara, Rudolf Mester: Kernels for Reconstructing Nonideally Sampled Nonbandlimited Signals. IEEE Workshop on Statistical Signal Processing, Nice (F), 28-30 June 2011.
Alvaro Guevara, Rudolf Mester: Wiener crosses borders: interpolation based on second order models. Proc. SPIE Electronic Imaging 2011 San Francisco, USA January 2011.
Guevara, Alvaro; Wolenski, Peter: Convergence results for a self-dual regularization of convex problems. International Conference on Parametric Optimization and Related Topics (paraoptX) 2010, Karlsruhe.
Guevara, Alvaro; Conrad, Christian; Mester, Rudolf: Grouping visual objects based on flow structure. Bernstein Conference on Computational Neuroscience (BCCN) 2010.
Conrad, Christian; Guevara, Alvaro; Mester, Rudolf: Learning Temporal Coincidences. Bernstein Conference on Computational Neuroscience (BCCN) 2010.
Mester, Rudolf; Guevara, Alvaro; Conrad, Christian; Friedrich, Holger: Learning visual motion and structure as latent explanations from huge motion data sets. Bernstein Conference on Computational Neuroscience (BCCN) 2010.
Guevara, Alvaro; Mester, Rudolf: Minimum Variance Image Interpolation from Noisy and Aliased Samples. IEEE Southwest Symposium on Image Analysis and Interpretation (SSIAI), Austin, Texas, USA, May 23-25, 2010.
Dederscheck, David; Zahn, Martin; Friedrich, Holger; Mester, Rudolf: Optical Rails: Slicing the View. Omnivis Workshop 2010 in conjunction with Robotics: Science and Systems, Zaragoza, Spain, June 27, 2010.
Dederscheck, David; Zahn, Martin; Friedrich, Holger; Mester, Rudolf: Optical Rails: View-Based Track Following with Hemispherical Environment Model and Orientation View Descriptors. 20th International Conference on Pattern Recognition (ICPR 2010), Istanbul, Turkey, August 23-26, 2010.
Guevara, Alvaro; Mester, Rudolf: Optimal reconstruction from samples for continuous and discrete signals: a statistical approach. DAGStat 2010, Technische Universität Dortmund, Germany, March 24, 2010.
Fürtig, Andreas; Friedrich, Holger; Mester, Rudolf: Robust Pixel Classification for RoboCup. 16th Workshop Farbbildverarbeitung, Applikationszentrum (APZ), Ilmenau, Germany, October 7-8, 2010.
Guevara, Alvaro; Mester, Rudolf: Signal Reconstruction from Noisy, Aliased, and Nonideal Samples: What Linear MMSE Approaches Can Achieve. European Signal Processing Conference EUSIPCO-2010, Aalborg, Denmark, August 23-27, 2010.
Dederscheck, David; Zahn, Martin; Friedrich, Holger; Mester, Rudolf: Slicing the View: Occlusion-Aware View-Based Robot Navigation. 32nd DAGM Symposium, Darmstadt, Germany, September 22-24, 2010.
Dederscheck, David; Friedrich, Holger; Lenhart, Christine; Zahn, Martin; Mester, Rudolf: 'Featuring' Optical Rails: View-Based Robot Guidance Using Orientation Features on the Sphere. IEEE 12th International Conference on Computer Vision Workshops, IEEE Workshop on Omnidirectional Vision, Camera Networks and Non-classical Cameras, Kyoto, Japan, October 4, 2009.
Kondermann, Claudia; Mester, Rudolf; Garbe, Christoph: A Statistical Confidence Measure for Optical Flows. 10th European Conference on Computer Vision (ECCV 2008), Marseille, October 12-18, 2008.
Wedel, Andreas; Franke, Uwe; Badino, Hernan; Cremers, Daniel: B-Spline Modeling of Road Surfaces for Freespace Estimation. IEEE Intelligent Vehicles Symposium, Eindhoven, The Netherlands, June 4-6, 2008.
Preusser, T.; Scharr, Hanno; Krajsek, Kai; Kirby, Robert M.: Building blocks for computer vision with stochastic partial differential equations. International Journal of Computer Vision (IJCV) 80 (2008), no. 3, pp. 375-405.
Vaudrey, Tobi; Badino, Hernan; Gehrig, Stefan: Integrating Disparity Images by Incorporating Disparity Rate. Second Workshop Robot Vision, Auckland, New Zealand, February 2008.
Garbe, Christoph S.; Krajsek, Kai; Pavlov, Pavel; Andres, Björn; Mühlich, Matthias; Stuke, Ingo; Mota, Cicero; Böhme, Martin; Haker, Martin; Schuchert, Tobias; Scharr, Hanno; Aach, Til; Barth, Erhardt; Mester, Rudolf; Jähne, Bernd: Nonlinear analysis of multi-dimensional signals: local adaptive estimation of complex motion and orientation patterns. Mathematical Methods in Time Series Analysis and Digital Image Processing.
Friedrich, Holger; Dederscheck, David; Rosert, Eduard; Mester, Rudolf: Optical Rails: View-based Point-To-Point Navigation using Spherical Harmonics. 30th DAGM Symposium, Munich, June 10-13, 2008.
Krajsek, Kai; Menzel, M.; Zwanger, M.; Scharr, Hanno: Riemannian Anisotropic Diffusion for Tensor Valued Images. Computer Vision – ECCV 2008, Marseille, France October 12-18, 2008.
Dederscheck, David; Friedrich, Holger; Lenhart, Christine; Penc, Joachim; Rosert, Eduard; Scherer, Maximilian; Mester, Rudolf: Running on Optical Rails: Theory, Implementation and Testing of Omnidirectional View-based Point-To-Point Navigation. 8th Workshop on Omnidirectional Vision, Camera Networks and Non-classical Cameras (OMNIVIS), 2008.
Krajsek, Kai; Mester, Rudolf; Scharr, Hanno: Statistically Optimal Averaging for Image Restoration and Optical Flow Estimation. 30th DAGM Symposium, Munich, June 10-13, 2008.
Badino, Hernan; Vaudrey, Tobi; Franke, Uwe; Mester, Rudolf: Stereo-based Free Space Computation in Complex Traffic Scenarios. Southwest Symposium on Image Analysis and Interpretation, Santa Fe, New Mexico, USA, March 2008.
Franke, Uwe; Gehrig, Stefan; Badino, Hernan; Rabe, Clemens: Towards Optimal Stereo Analysis of Image Sequences. Second Workshop Robot Vision, Auckland, New Zealand, February 2008.
Friedrich, Holger; Dederscheck, David; Rosert, Eduard; Mester, Rudolf: View-based navigation using spherical harmonics and 'optical rails'. Symposium on Image Analysis, Lund, Sweden, March 12-14, 2008.
Friedrich, Holger; Dederscheck, David; Mutz, Martin; Mester, Rudolf: View-based Robot Localization Using Illumination-invariant Spherical Harmonics Descriptors. 3rd International Conference on Computer Vision Theory and Applications (VISAPP/VISGRAPP), Funchal, Madeira, Portugal, January 22-25, 2008.
Badino, Hernan: A Robust Approach for Ego-Motion Estimation Using a Mobile Stereo Platform. First International Workshop on Complex Motion (IWCM), Günzburg, Germany, October 12-14, 2004.
Gehrig, Stefan; Badino, Hernan; Gall, Jürgen: Accurate and Model-Free Pose Estimation of Crash Test Dummies. Human Motion – Understanding, Modeling, Capture and Animation, 2007, pp. 443-466.
Krajsek, Kai; Mester, Rudolf: Bayesian Model Selection for Optical Flow Estimation. 29th DAGM Symposium, Heidelberg, September 12-14, 2007.
Jähne, B.; Mester, Rudolf; Barth, E.; Scharr, Hanno: Complex Motion. First International Workshop on Complex Motion (IWCM), Günzburg, October 12-14, 2004.
Badino, Hernan; Franke, Uwe; Mester, Rudolf: Free Space Computation Using Stochastic Occupancy Grids and Dynamic Programming. Workshop on Dynamical Vision, International Conference on Computer Vision (ICCV), Rio de Janeiro, October 2007.
Kominiarczuk, Jakub K.; Krajsek, Kai; Mester, Rudolf: Highly Accurate Orientation Estimation Using Steerable Filters. IEEE International Conference on Image Processing (ICIP 2007), San Antonio, USA, September 16-19, 2007.
Mester, Rudolf: Statistical certainty models in image processing. IEEE/SP 14th Workshop on Statistical Signal Processing, Madison, Wisconsin, USA, August 26-29, 2007.
Friedrich, Holger; Dederscheck, David; Krajsek, Kai; Mester, Rudolf: View-based Robot Localization Using Spherical Harmonics: Concept and First Experimental Results. 29th DAGM Symposium, Heidelberg, September 12-14, 2007.
Krajsek, Kai; Mester, Rudolf: Wiener-optimized discrete filters for differential motion estimation. 1st International Workshop on Complex Motion (IWCM04), Reisensburg/Günzburg, Germany, October 12-14, 2004.
Krajsek, Kai; Mester, Rudolf: A maximum likelihood estimator for choosing the regularization parameters in global optical flow methods. IEEE International Conference on Image Processing (ICIP), Atlanta, October 8-11, 2006.
Mühlich, Matthias; Aach, Til: A Theory for Multiple Orientation Estimation. European Conference on Computer Vision (ECCV), Graz, 2006.
Krajsek, Kai; Mester, Rudolf: A Unified Theory for Steerable and Quadrature Filters. VISAPP, Setúbal, Portugal, February 25-28, 2006.
Gehrig, Stefan; Badino, Hernan; Paysan, Pascal: Accurate and Model-Free Pose Estimation of Small Objects for Crash Video Analysis. British Machine Vision Conference (BMVC06), Edinburgh, United Kingdom, September 2006.
Krajsek, Kai; Mester, Rudolf: Marginalized maximum a posteriori hyper-parameter estimation for global optical flow techniques. 26th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Paris, France, July 8-13, 2006.
Krajsek, Kai; Mester, Rudolf: On the equivalence of variational and statistical differential motion estimation. Southwest Symposium on Image Analysis and Interpretation, Denver, Colorado, March 26-28, 2006.
Badino, Hernan; Franke, Uwe; Rabe, Clemens; Gehrig, Stefan: Stereo-vision based detection of moving objects under strong camera motion. International Conference on Computer Vision Theory and Applications, Setúbal, Portugal, February 2006.
Krajsek, Kai; Mester, Rudolf: The Edge Preserving Wiener Filter for Scalar and Tensor Valued Images. 28th DAGM Symposium, Berlin, September 12-14, 2006.
Franke, Uwe; Rabe, Clemens; Badino, Hernan; Gehrig, Stefan: 6D-Vision: Fusion of Stereo and Motion for Robust Environment Perception. 27th DAGM Symposium, Vienna, Austria, August 31 – September 2, 2005.
Mühlich, Matthias; Mester, Rudolf: A fast algorithm for statistically optimized orientation estimation. DAGM 2005 – 27th Annual Meeting of the German Association for Pattern Recognition, Vienna, Austria, August 31 – September 2, 2005.
Mühlich, Matthias; Mester, Rudolf: Optimal Estimation of Homogeneous Vectors. Scandinavian Conference on Image Analysis 2005, Joensuu, Finland, July 2005.
Krajsek, Kai; Mester, Rudolf: Signal and noise adapted filters for differential motion estimation. DAGM 2005 – 27th Annual Meeting of the German Association for Pattern Recognition, Vienna, Austria, August 30 – September 2, 2005.
Mühlich, Matthias: Subspace Estimation with Uncertain and Correlated Data. Geometric Properties from Incomplete Data, Dagstuhl, March 21-26, 2004.
Mühlich, Matthias; Mester, Rudolf: A Statistical Extension of Normalized Convolution and its Usage for Image Interpolation and Filtering. 12th European Signal Processing Conference (Eusipco), Vienna, Austria, September 6-10, 2004.
Mühlich, Matthias; Mester, Rudolf: A statistical unification of image interpolation, error concealment, and source-adapted filter design. 6th IEEE Southwest Symposium on Image Analysis and Interpretation (SSIAI '04), Lake Tahoe, USA, March 28-30, 2004.
Mester, Rudolf: Special Session on the Convergence of Computer Vision and Visual Communication. Picture Coding Symposium 2004, San Francisco, December 2004.
Comaniciu, D.; Kanatani, K.; Mester, Rudolf: Statistical Methods in Video Processing. ECCV 2004 Workshop SMVP 2004, Prague, Czech Republic, May 16, 2004.
Mester, Rudolf: Towards a unified theory of motion estimation: bridging the gap between differential and matching approaches. Picture Coding Symposium 2004, San Francisco, December 2004.
Mühlich, Matthias; Mester, Rudolf: Unbiased Errors-In-Variables Estimation Using Generalized Eigensystem Analysis. 2nd Workshop on Statistical Methods in Video Processing (SMVP) during the 8th European Conference on Computer Vision (ECCV), Prague, Czech Republic, May 11-14, 2004.
Mester, Rudolf: A new view at differential and tensor-based motion estimation schemes. 25th Annual Conference of the Deutsche Arbeitsgemeinschaft für Mustererkennung (DAGM), Magdeburg, September 10-12, 2003.
Aach, T.; Toth, D.; Mester, Rudolf: Motion estimation in varying illumination using a Total Least Squares distance measure. Picture Coding Symposium 2003, Saint-Malo, France, April 23-25, 2003.
Mester, Rudolf: On the mathematical structure of direction and motion estimation. 3rd International Symposium on Physics in Signal and Image Processing (PSIP), Grenoble, France, January 29-31, 2003.
Mester, Rudolf: Special session on Advanced Methods for Motion Analysis. International Conference on Image Processing (ICIP 2003), Barcelona, Spain, September 2003.
Mester, Rudolf: The generalization, optimization and information-theoretic justification of filter-based and autocovariance-based motion estimation. IEEE International Conference on Image Processing (ICIP), Barcelona, Spain, September 14-17, 2003.
Mester, Rudolf: A system-theoretical view on local motion estimation. IEEE Southwest Symposium on Image Analysis and Interpretation, Santa Fe, USA, April 2002.
Mester, Rudolf: Some steps towards a unified motion estimation procedure. 45th IEEE MidWest Symposium on Circuits and Systems (MWSCAS), Tulsa, Oklahoma, August 4-7, 2002.
Mühlich, Matthias; Mester, Rudolf: A considerable improvement in non-iterative homography estimation using TLS and equilibration. Pattern Recognition Letters 22 (2001), no. 11, pp. 1181-1189.
Aach, T.; Mester, Rudolf: Bayesian Illumination-invariant Change Detection using a Total Least Squares Test Statistic. Colloque GRETSI 2001, Toulouse, France, September 2001.
Aach, T.; Dümbgen, L.; Toth, D.; Mester, Rudolf: Bayesian Illumination-invariant Motion Detection. IEEE Signal Processing Society 2001 International Conference on Image Processing, Thessaloniki, Greece, October 7-10, 2001.
Mester, Rudolf; Aach, T.; Dümbgen, L.: Illumination-invariant change detection using a statistical colinearity criterion. Pattern Recognition: Proceedings of the 23rd DAGM Symposium, Munich, September 12-14, 2001.
Mester, Rudolf; Mühlich, Matthias: Improving Motion and Orientation Estimation Using an Equilibrated Total Least Squares Approach. IEEE International Conference on Image Processing (ICIP), Thessaloniki, Greece, October 7-10, 2001.
Mühlich, Matthias; Mester, Rudolf: Subspace Methods and Equilibration in Computer Vision. Scandinavian Conference on Image Analysis, Bergen, Norway, June 2001.
Mester, Rudolf: Orientation estimation: conventional techniques and a new non-differential method. European Signal Processing Conference (EUSIPCO'2000), Tampere, Finland, September 2000.
Mühlich, Matthias; Mester, Rudolf: A Note on Error Metrics and Optimization Criteria in 3D Vision. IEEE Intern. Workshop on Vision, Modeling, and Visualization, Erlangen, Germany, November 1999.
Trautwein, S.; Mühlich, Matthias; Mester, Rudolf: Estimating Consistent Motion From Three Views: An Alternative To Trifocal Analysis. Computer Analysis of Images and Patterns (CAIP'99), Ljubljana, Slovenia, September 1999.
Feiden, Dirk; Mühlich, Matthias; Mester, Rudolf: Robuste Bewegungsschätzung aus monokularen Bildsequenzen von planaren Welten. 21st DAGM Symposium, Bonn, September 1999.
Mühlich, Matthias; Mester, Rudolf: Ein verallgemeinerter Total Least Squares Ansatz zur Schätzung der Epipolargeometrie. Annual Meeting of the Deutsche Arbeitsgemeinschaft für Mustererkennung (DAGM'98), Stuttgart, September 1998.
Mühlich, Matthias; Mester, Rudolf: The Role of Total Least Squares in Motion Analysis. European Conference on Computer Vision (ECCV), Freiburg, June 1998.
Mester, Rudolf: Stabilitäts- und Zuverlässigkeitsaspekte im Zusammenhang mit dem Structure from Motion Problem. Annual Meeting of the Deutsche Gesellschaft für Photogrammetrie und Fernerkundung (DGPF), Frankfurt am Main, September 1997.
Mester, Rudolf: Stochastische Modelle und Methoden in der Bildsequenzanalyse. Aachener Symposium zur Signaltheorie, Aachen, March 1997.
Hötter, M.; Mester, Rudolf; Müller, F.: Detection and description of moving objects by stochastic modelling and analysis of complex scenes. Signal Processing: Image Communication 8 (1996), pp. 281-293.
Hötter, M.; Mester, Rudolf; Meyer, M.: Detection of Moving Objects Using a Robust Displacement Estimation Including a Statistical Error Analysis. 13th International Conference on Pattern Recognition, Vienna, August 1996.
Hötter, M.; Mester, Rudolf; Müller, F.: Moving Object Detection in Image Sequences using Texture Features. International Workshop on Time-varying Image Processing and Moving Object Recognition, Florence, September 1996.
Mester, Rudolf; Hötter, M.; Pöchmüller, W.: Umwelterfassung mit bewegten Kameras. Aktives Sehen in technischen und biologischen Systemen, Workshop der GI-Fachgruppe 1.0.4 Bildverstehen, Hamburg, December 1996.
Hötter, M.; Mester, Rudolf; Meyer, M.: Detection of Moving Objects in Natural Scenes by a Stochastic Multi-Feature Analysis of Video Sequences. 29th Annual 1995 International IEEE Carnahan Conference on Security Technology, Sanderstead, Surrey, England, October 1995.
Aach, T.; Kaup, A.; Mester, Rudolf: On texture analysis: Local energy transforms versus quadrature filters. Signal Processing 45 (1995), no. 2, pp. 173-182.
Mester, Rudolf; Hötter, M.: Robust displacement vector estimation including a statistical error analysis. 5th International Conference on Image Processing and its Applications, Edinburgh, UK, 1995, pp. 168-172.
Mester, Rudolf; Hötter, M.: Zuverlässigkeit und Effizienz von Verfahren zur Verschiebungsvektorschätzung. 17th DAGM Symposium, Bielefeld, September 1995.
Aach, T.; Kaup, A.; Mester, Rudolf: Change detection in image sequences using Gibbs random fields: a Bayesian approach. IEEE Workshop Intelligent Signal Processing and Communications Systems, Sendai, Japan, October 1993.
Aach, T.; Kaup, A.; Mester, Rudolf: Statistical model-based change detection in moving video. Signal Processing 31 (1993), no. 2, pp. 165-180.
Mester, Rudolf; Franke, U.: Spectral entropy-activity classification in adaptive transform coding. IEEE Journal on Selected Areas in Communications 10 (1992), no. 5, pp. 913-917.
Aach, T.; Kaup, A.; Mester, Rudolf: A statistical framework for change detection in image sequences. Symposium on signal and image processing, Juan-les-Pins, September 1991.
Mester, Rudolf; Aach, T.; Franke, U.: Image segmentation experiments using the contour relaxation algorithm. Zeitschrift für Photogrammetrie und Fernerkundung (1991), no. 4, pp. 127-132.
Aach, T.; Kaup, A.; Mester, Rudolf: Combined displacement estimation and segmentation of stereo image pairs based on Gibbs random fields. International Conference on Acoustics, Speech, and Signal Processing (ICASSP'90), Albuquerque, April 1990.
Aach, T.; Dawid, H.; Mester, Rudolf: 3D-Segmentierung von Kernspintomogrammen unter Verwendung eines stochastischen Objektformmodells. Jahrestagung der Gesellschaft für Medizinische Datenverarbeitung und Statistik (GMDS), Aachen, September 1989.
Mester, Rudolf; Franke, U.; Aach, T.: Fortschritte in der Modellierung natürlicher Bilder. ITG Fachbericht 107 "Stochastische Modelle und Methoden in der Informationstechnik" (1989), pp. 29-34.
Aach, T.; Mester, Rudolf; Franke, U.: Top-down image segmentation using object detection and contour relaxation. International Conference on Acoustics, Speech, and Signal Processing (ICASSP'89), Glasgow, May 1989.
Franke, U.; Mester, Rudolf; Aach, T.: Constrained iterative restoration techniques: A powerful tool in region based texture coding. European Signal Processing Conference EUSIPCO-88, Grenoble, France, September 1988.
Aach, T.; Mester, Rudolf; Franke, U.: From texture energy measures to quadrature filter pairs – A system theoretical view of texture feature extraction. European Signal Processing Conference EUSIPCO-88, Grenoble, France, September 1988.
Mester, Rudolf; Franke, U.: Image segmentation on the basis of statistical models for region oriented image coding. Picture Coding Symposium, Turin, September 1988.
Mester, Rudolf; Franke, U.; Aach, T.: Image segmentation using likelihood ratio tests and Markov region shape models. European Signal Processing Conference EUSIPCO-88, Grenoble, France, September 1988.
Franke, U.; Mester, Rudolf: Region based image representation with variable contour and texture reconstruction quality. Cambridge Symposium on Visual Communications and Image Processing 88, Cambridge, Mass., November 1988.
Franke, U.; Mester, Rudolf: Representation of the texture signals in region-based image coding schemes: A comparative study. Picture Coding Symposium, Turin, September 1988.
Mester, Rudolf; Franke, U.; Aach, T.: Segmentation of image pairs and sequences by contour relaxation. DAGM Symposium Mustererkennung, Zürich, September 1988.
Mester, Rudolf; Franke, U.: Statistical model based image segmentation using region growing, contour relaxation and classification. Cambridge Symposium on Visual Communications and Image Processing, Cambridge, Mass., November 1988.