Get Complete Project Material File(s) Now! »
Region-Based ‘Bayesian’ Formulation
Paragios and Deriche (2002) presented a unified approach, namely the geodesic active re-gion model to deal with frame partition problems. This model combines the edge-based functional from the geodesic active contour model, and the region-based functional derived from Maximum a Posteriori (MAP). The boundary term is similar to geodesic active con-tour proposed by Caselles et al. (1997). The boundary is regular, of minimal length, and attracted by the real region boundary. The region term aims at finding a contour C that maximizes two posterior probabilities of image intensities inside C and outside C. To sim-plify the problem, Paragios and Deriche (2002) made two assumptions: 1) all the contours are equally possible, i.e. P(C) = 1/Z, where Z is the total number of possible contours; 2) there is no correlation between the regions labeling, and all the pixels within each region are identically and independently distributed. Taking these into account, based on the Bayes’ theorem, the a posteriori segmentation probability for a contour C given the observed image I is determined by P(C|I) = Y Y x∈Ωin x∈Ωout Pin(I(x)) Pout(I(x)) . (1.16).
Maximization of the a posteriori segmentation probability is equivalent to minimizing its negative logarithm. The geodesic active region objective function is defined as Z Z E(C) = − log Pin(I(x)) dx − log Pout(I(x)) dx + µ L(C) . (1.17).
Active Contours with Shape Constraints
In the previous subsections, all the presented active contours, whether edge-based or region-based, drive the curve evolution based on the information from the image, except perhaps for having some regularization term to smooth boundary. In this case, little prior know-ledge about the object exists in the model, and hence the data quality has a large impact on the segmentation accuracy. Unfortunately, in many real applications, the image data is often damaged by noise or partial occlusion; or the contrast between the object and the background is poor. These factors may lead to unsuccessful segmentation. Recently, a large amount of work on the inclusion of shape knowledge in active contour models has been pre-sented. Having prior information about the expected shape of the object can significantly increase the robustness of the segmentation algorithm. In general, in a variational frame-work, a prior energy EP related to the shape knowledge is integrated together with a data energy ED describing the image information. A constant weight is normally used to bal-ance the contributions of these two terms. Note that Bailloeul (2005) converted the constant weight into a space function, to relax the influence of the constraints near the zero level set of the prior shape, while keeping a strong and uniform constraint far from the prior shape. This makes the active contours resemble the prior shape globally, but with some variations locally. Chen et al. (2001, 2002) considered a non-probabilistic model that minimizes an energy functional depending on the image gradient and the shape of interest. The basic idea is to influence the geometric active contour with a vector field that depends on the shape prior. The shape prior energy is formulated as Z 1 EP(C, T ) = d2(T (C(p))) |C0(p)| d p , (1.23).
where d(x) = d(C∗, x) is the distance of the point x from the shape reference C∗. Here T represents scale, rotation and translation transform. Therefore, this term evaluates the similarity of the shape of the contour C to C∗, but making use of a single shape reference (though with diﬀerent size, orientation and translation), this model cannot handle a large variation of shapes. To solve this limit, the authors suggested grouping the sampled curves into k clusters for some k, and finding the average shape C∗i(i = 1, . . . , k) in each group. However, this model is still sensitive to the initialization because of the inclusion of an edge-based data term.
Leventon et al. (2000) extended geometric active contours (Caselles et al., 1997; Kichenas-samy et al., 1995) by incorporating shape information into the evolution process. They computed a statistical shape model over a training set of curves, in which principal compo-nent analysis (PCA) is used to derive the principal deformation modes. The segmentation process is implemented in the level set framework, and the level set surface is driven solely by a data term in the first step. At each step of the evolution, the MAP position and shape of the object in the image are estimated, based on the prior shape information and the image information. In the second step, a correction of the previous segmentation is performed locally based on image gradients and curvatures, and globally towards the MAP estimate. Thus, an additional shape influence term is introduced to the original evolution equation of geometric active contours.
Multiscale and Multiresolution Analysis
A multiscale approach oﬀers an appropriate framework to fuse the information from diﬀe-rent scales. This is particularly important for high resolution images, because the road extraction task can be greatly facilitated with the aid of road detection results at reduced resolution, which needs less eﬀort at the cost of probable mistakes.
Baumgartner et al. (1997) presented a multiresolution approach for automatic extraction of roads from aerial images. For diﬀerent context regions, i.e. rural, forest, and urban areas, the model describes relations between background objects and road objects. The approach to road detection is based on the extraction of edges in a high resolution image and the extraction of lines in a reduced resolution image. Using both resolution levels and explicit prior knowledge about roads, hypotheses for road borders are generated. Quadrilaterals and polygons are constructed to represent road-parts and intersections. Neighboring road-parts are chained to form road-segments. Road-links are built by grouping road-segments and closing gaps.
For high resolution images, such as airborne images, Benharrosh (1998) aimed to obtain a multiresolution image with some parts at full resolution and others at coarser resolution. They proposed three methods of adaptive filtering that smooth an image while preserving cartographic structuring features or interesting regions of the image. The first one relies on an anisotropic diﬀusion method introduced by Saint-Marc et al. (1991). The second is based on wavelet theory. And the third method is based on an analogy between the image and an electrical network. These methods are applied to the generation of an adaptive quick-look and adaptive compression. Couloigner and Ranchin (2000) defined a hierarchical method to extract urban road networks semi-automatically from 2m resolution space-borne images. This method is per-formed in two steps. First, given two endpoints on both sides of the road, the two sides of each road are located from the original image and the scaling coeﬃcients of a wavelet trans-form at diﬀerent spatial resolutions. In the second step, the strip(s), if existing, is extracted using the detected road borders and the wavelet coeﬃcients at characteristic scales.
Prinet et al. (2000) addressed scale selection for curvilinear structure detection. This approach is based on the study of the existence and stability properties of the crest-lines that localize the center axis of the structures to be retrieved. The authors showed that the crest-lines exist and are stable over a band of width scale values. The selection of diﬀerent scales according to diﬀerent local resolution of the structure is followed by a fusion scheme that enables the merging of crest-lines. Zhang and Couloigner (2003) also made use of an analysis in the wavelet domain to detect road features; they aimed at providing an automatic method. In their framework for road network change detection, road sides, centerlines and junctions are extracted based on the observation of wavelet-based road features. Along the road gradient direction, the road centerline and the road borders correspond to respectively a local maximum and two zeros of wavelet coeﬃcients; a road junction is localized by a local maximum of wavelet coeﬃcients in a 7 × 7 neighborhood; the linked road branches are identified as the local maximum pixels along the border of the above neighborhood. However, this wavelet-based method is restricted to simple scenes such as countryside. Note that the work of Mayer et al. (1998), Laptev et al. (2000) and Peteri´ and Ranchin (2003), introduced in subsection 1.2.2 on active contours, also used a multiscale or multire-solution strategy.
Dynamic programming formulates an extraction problem as the minimization of a cost func-tion defined on a graph. Given knowledge about the presence of roads in the form of end-points, usually fixed manually, the algorithm is forced to track a connecting path between them through the image which best fits the model. Since this method restricts the topology of networks, dynamic programming is potentially semi-automatic. In this sense, a more reliable handling of obstacles is possible.
Fischler et al. (1981) aimed to extract roads from low-resolution aerial images. Diﬀerent low level operators are classified into two types: Type I operators identify roads with a high accuracy, but some roads may be missing; Type II operators extract all roads completely, but may yield some false detections. By combining them, a cost array is defined. Between two given endpoints, the best path, which gives the lowest cost, is chosen as the road using the F∗ algorithm.
Merlet and Zerubia (1996) extended the F∗ algorithm of Fischler et al. (1981) to cliques and to neighborhoods larger than one. By means of the cliques of more than two points, contrast information is introduced into the calculation of the minimum cost path, and the larger neighborhoods allow for the consideration of the curvature of the final path. All the needed information are synthesized in a unique cost function. Thanks to the curvature constraint, even if the initial endpoints are incorrect, or are located on diﬀerent roads, a correct path may still be obtained. The method is applied to detect roads and valleys from low-resolution SPOT satellite images. Barzohar and Cooper (1996) presented an automatic approach to find main roads in aerial images. Geometric-stochastic models are built for representing road images, and then MAP estimation is used for estimating the road boundaries in an image. First, the image is partitioned into windows. They searched the small windows to obtain initial road candidates by dynamic programming. Then, starting with the windows containing high confidence estimates, they obtained optimal global estimates of the roads using dynamic programming again. The algorithm produces two boundaries for each road.
Bonnefon et al. (2002) proposed a semi-automatic tracking method using the F∗ algo-rithm. From a starting point, they used a small search window in which the algorithm tries to find the optimal path. Costs for the F∗ algorithm depend on the diﬀerence of radiometry values between the original SPOT image and the detection image computed from it. The last point of the optimal path in the search window becomes the new starting point. The algo-rithm identifies the detected linear features using texture information from high-resolution images.
Dal Poz and do Vale (2003) presented a dynamic programming approach for road cen-terline extraction from medium- and high-resolution images. This approach is a modifica-tion of a pre-existing dynamic programming approach, proposed by Gruen and Li (1997), for road extraction from low-resolution images. A constraint that the gradients at road edges are antiparallel is incorporated into the cost function. This allows the approach to treat the road as a ribbon feature. Bucha et al. (2006) also used dynamic programming to extract road centerlines, but the weights of the edges in the graph are provided by a force field, drawn such that at each pixel, a two-dimensional vector defines interactions between pixels. In such an image representation, the vector is oriented to the center of the region composed of pixels having the same qualitative property, such as color and gray-scale level.
Mathematical morphology is a theory for the analysis and processing of signals in terms of geometrical structures, developed by Matheron (1975) and Serra (1982). Mathematical morphology can characterize topological and geometrical concepts such as size, shape, con-vexity, connectivity, and geodesic distance, on both continuous and discrete spaces. Mor-phological image processing is based on this theory. It consists of a set of operators, e.g. union, intersection and complementation, as well as dilation, erosion, openings, closings, thinning and other derived operators. These operators transform the image according to the above characterizations. Mathematical morphology was originally proposed to process binary images, and was later extended to gray-scale images and multi-band images.
Zhang et al. (1999) proposed an approach for detecting road networks from high re-solution images using a combination of mathematical morphology operations. In the pre-processing stage, the image is segmented and road network regions are separated from their surroundings. Morphological trivial opening is then adopted. A criterion T is defined as the long axis of the minimal ellipse which encloses an object. Only the connected compo-nents that satisfy the criterion T are retained after morphological reconstruction. Thus, this process preserves the elongated road areas, and filters out almost all the houses and small clusters of noise as well. The result is further refined by filling holes, removing small paths and recovering shadowed areas. In the high resolution simulation images and aerial photos, the approximate road centerline network is finally extracted. However, road gaps may still exist, if road surfaces are completely broken and there is no other information supporting the linkage.
Chanussot and Lambert (1998) introduced a simple and fast unsupervised method for the automatic extraction of road networks in SAR images. A series of morphological ope-rators is used in order to retain elongated structures with a specific width. The sequence of morphological filtering consists of an opening by reconstruction, a directional closing in 40 successive directions, an opening, and a closing top-hat operator. At every step, flat struc-turing elements are used, and their size is specified according to a priori information about the road’s maximum width and curvature. In the final stage, the roads are extracted by a simple thresholding applied to the response of the morphological operators. The drawback of this method is that the lack of contextual knowledge results in partial detection of the road network together with several spurious detections. Katartzis et al. (2001) combined and extended two earlier approaches for road detection in SAR satellite images, and described a model based method for the automatic extraction of linear features from aerial imagery. In fact, the authors made use of both bottom-up and top-down processing, i.e. a combined strategy. During its first local analysis step, to detect elongated structures, a set of morphological operators proposed by Chanussot and Lambert (1998) is modified to enhance the performance of the morphological filtering in the case of heavily noisy environments and partially disconnected roads. In its second global analysis step, a segment linking process is performed with some prior information, based on the Bayesian framework of Tupin et al. (1998) (which will be introduced in subsection 1.2.6).
In his survey of previously existing applications of mathematical morphology in geo-science and remote sensing, Soille and Pesaresi (2002) analyzed the suitability of morpho-logical operators for the processing of Earth observation data, detailed some new advances in the theory of mathematical morphology, and demonstrated their eﬃciency for extracting structural information from Earth observation data.
Markov Random Fields and Marked Point Processes
Contextual constraints are a general and powerful way to model spatial properties. Markov random fields (MRFs) provide a convenient and consistent way to model context-dependent entities such as image pixels and correlated features. MRF based models have been widely used to identify road networks.
Tupin et al. (1998) used both local and global techniques for linear feature extraction. First, they performed a local detection of linear structures based on the fusion of the results from two line detectors, both taking into account the statistical properties of speckle in SAR images. The masks of the line detectors have widths ranging from one to a maximal number of five pixels. The produced candidate road segments are then organized as a graph, together with an additional set of segments that correspond to all possible connections between them. The road identification is solved by the extraction of the best graph labeling based on an MRF model for road like structures and a MAP criterion. As a result, this algorithm allows complex topology and a priori detection; but its drawback is that the road network has to be represented as a graph, and the number of nodes is fixed.
Stoica et al. (2004) modeled thin networks as ensembles of line segments embedded in the image domain. A point process controls network parameters such as connectivity and curvature. The road network is approximated by connected line segments under constraints enforced by the interaction model. The specific properties of the road network in the image are described by the data term. The probabilistic model is solved by simulated annealing based on a Reversible Jump Markov Chain Monte Carlo algorithm. The main advantage is that during the optimization process, new segments can be created, and their location and orientation can be changed. Unlike the method of Tupin et al. (1998), the dimension of the space is not fixed. Results are shown on SPOT, ERS and aerial images. Lacoste et al. (2005) developed an extension of this model.
Negri et al. (2006) proposed a general processing framework for urban road network extraction in high-resolution SAR images. First, road candidates are extracted with two detectors. Then, road network topology is optimized with an MRF model, incorporating the prior knowledge about road junctions. A final network regularization step is based on perceptual grouping concepts.
Support Vector Machines
Support Vector Machine (SVM) is a powerful classification technique based on the princi-ples of statistical learning theory. SVM works by finding the hyperplane with the largest margin in the feature space that separates the positive and negative training samples.
Yager and Sowmya (2003) applied SVM to road extraction from rural areas using edge-based features. In the first level, the edges are found using a Canny edge detector. The edge length and the edge gradient are the features used to classify edges as road edges or non-road edges. After a learning stage from a training data set, an SVM is used to classify all the edges. In the second level, opposite road edges are paired to create road segments. They are classified as positive and negative edge pairs, using the pair width and the enclosed intensity. Another SVM is trained from the pair samples to classify all possible edge pairs. An extension of this work can be found in (Lai et al., 2005).
Song and Civco (2004) performed a two-step approach for road extraction from rural and urban areas. In the first step, SVM is employed to classify the input image into a road group and a non-road group. In the second step, the road group image is segmented into geometrically homogeneous objects using a region growing technique based on a similarity criterion, with higher weighting on shape factors over spectral criteria. Finally, a threshold-ing on the shape index and density features derived from these objects is applied to extract road features, which are further processed by thinning and vectorization to obtain road cen-terlines.
Utilization of New Sensor Data
The utilization of new sensor data is a new tendency for research in road extraction to overcome the diﬃculties from panchromatic satellite and aerial images.
Zhu et al. (2004) proposed an automatic road extraction technique that combines in-formation from LIDAR data and aerial optical images. Firstly, the method obtains height and edges of high objects from LIDAR data. Secondly, digital images are analyzed at these edges for road detection. Finally, shadowed parts are reconstructed by a spline-approximation algorithm. Hu et al. (2004) also combined information from LIDAR data and aerial optical images, taking advantage of deriving multiple clues and constraints to significantly minimize the uncertainty in the extraction process. To extract roads from LIDAR, Rottensteiner et al. (2005) used a two step process. First, roads are detected by a hierarchical technique classifying the LIDAR points into road or non-road. Each step in the classification hierarchy addresses the appearance of roads in the sensor with respect to spectral, geometric, topological, and contextual characteristics. Secondly, road vectorization is performed. The road centerline, orientation and width are detected.
Table of contents :
1.1 Active Contours
1.1.1 Edge-Based Active Contours
1.1.2 Region-Based Active Contours
1.1.3 Active Contours with Shape Constraints
1.2 Road Extraction
1.2.1 Road Characteristics
1.2.2 Active Contours
1.2.3 Multiscale and Multiresolution Analysis
1.2.4 Dynamic Programming
1.2.5 Morphological Methods
1.2.6 Markov Random Fields and Marked Point Processes
1.2.7 Recursive Filtering
1.2.8 Support Vector Machines
1.2.9 Utilization of New Sensor Data
1.2.10 Other Methods
2 Higher-Order Active Contours and Phase Fields
2.1 A General Framework for Image Segmentation
2.2 Summary of Higher-Order Active Contours and Phase Fields
2.2.1 Higher-Order Active Contours
2.2.2 Phase Field Models
2.3 Stability Analysis of the Standard HOAC Total Prior Model
2.3.1 Definition of a Bar
2.3.2 Basic Phase Field Term of a Bar
2.3.3 Standard Phase Field HOAC Term of a Bar
2.3.4 Standard HOAC Total Prior Model of a Bar
2.4 Overall Model for Linear Network Extraction
2.4.1 Data Energy
2.4.2 Optimization and Parameter Setting
3 Multi resolution Analysis of the Primary Model
3.1 Multiresolution Analysis and Wavelets
3.1.1 Definition of Wavelet Based Multiresolution Analysis
3.1.2 Haar Multiresolution Analysis
3.2 Model Definition at Multiple Resolutions
3.2.1 Multiresolution Data Model
3.2.2 Multiresolution Framework
3.3 Experimental Results and Comparisons
3.3.1 Results Using the Single-Resolution Model
3.3.2 Results Using the Multiresolution Model
3.3.3 Results Using the Multiresolution Framework and Comparisons
4 GIS Specific Prior for Map Updating
4.2 Specific Prior Energy
4.3 Experimental Results and Comparisons
4.3.1 Results Using the GIS-Based HOAC Model
4.3.2 Evaluation and Comparison
5 Modeling Shape for Network Extraction
5.2 Nonlinear Nonlocal HOAC Prior Energy
5.2.1 Contour Model Definition
5.2.2 Phase Field Model Definition
5.2.3 Stability Analysis of the Nonlinear Nonlocal HOAC Total Prior Model
5.3 Linear Nonlocal HOAC Prior Energy
5.3.1 Contour Model Definition
5.3.2 Phase Field Model Definition
5.3.3 Stability Analysis of the Linear Nonlocal HOAC Total Prior Model
5.4 Experimental Results and Comparisons
5.4.1 Nonlinear Nonlocal Overall Model
5.4.2 Linear Nonlocal Overall Model
A Stability Calculations
A.1 Energy Terms of a Bar
A.1.1 Basic Phase Field Term
A.1.2 Standard Phase Field HOAC Term
A.1.3 Nonlinear Nonlocal Phase Field Term
A.1.4 Linear Nonlocal Phase Field Term
A.2 Model Energy per Unit Length
A.2.1 Standard HOAC Total Prior Model
A.2.2 Nonlinear Nonlocal HOAC Total Prior Model
A.2.3 Linear Nonlocal HOAC Total Prior Model
B Evolution Equations of New HOAC Prior Energies
B.1 Nonlinear Nonlocal HOAC Prior Energy
B.2 Linear Nonlocal HOAC Prior Energy
C Another Nonlinear Nonlocal HOAC Prior Term
C.1 Definition of ˜ENL
C.2 Derivative of ˜E NL
D Another Type of Interaction Function
E Summary of Other Methods Used in Our Comparisons
E.1 Method by Wang
E.2 Method by Yu
E.3 Method by Bailloeul
F Publications and Scientific Activities of the Author