MODEL-BASED RECOGNITION
OF OCCLUDED CURVED OBJECTS
RACHID BENLAMRI
Department of Computer Engineering,
Etisalat College of Engineering
P.O. Box 980, Sharjah
UNITED ARAB EMIRATES
http://www.ece.ac.ae
Abstract: - This paper describes a new model-based approach to recognizing occluded curved objects from single range images. Image segmentation is based on surface-curvature detection. Object representation is expressed in terms of an edge-junction graph and a surface-adjacency graph, which are used to infer the surface patches in the scene and their spatial relationships, respectively. The matching strategy is based on Local Geometrical Patterns (LGPs) extracted from adjacent surfaces. The idea is to generate, for each similar pair of object and model LGPs, a location and orientation transformation called a model-base, expressed in terms of a Translation Rotation Matrix (TRM), that allows a model (defined in its model coordinate system) to be aligned onto its corresponding object (defined in the viewer-centered coordinate system). Unlike existing methods, only one LGP is used to check for matching. Furthermore, once the two coordinate systems are aligned, no further geometrical transformations are involved in the matching process. This significantly reduces the matching search space and its time complexity. The system has been tested on real range images and synthetic data. The experimental results show that the system can reliably recognize such scenes with acceptable accuracy.
Keywords: Range image segmentation and representation, Curvature operators, Model-based recognition.
1 Introduction
Although much progress has been made in the field of 3D object recognition using range images, it is still widely accepted that the recognition of real-world scenes from a single image view is a difficult task [1]. This is mainly due to the nature of the scene, which may include objects with irregular shapes and no clearly defined geometry, objects viewed from arbitrary directions, and objects that are self-occluded or partially occluded by other objects [1]. Many model-based systems [2-4] addressing these recognition problems have been proposed. The main design issue in such systems is object representation, which often dictates the matching strategy; hence, success in recognition depends on the quality of image segmentation and object representation. Moreover, most existing systems use time-consuming matching algorithms. In this paper, a new model-based approach investigating some of these recognition problems is proposed. The aim is to match arbitrarily viewed occluded curved objects extracted from a single range image against a set of CAD models in a database. A typical application would be the assembly or automated inspection of parts in a manufacturing environment.
The remainder of the paper is organized as follows: section 2 describes the proposed edge detection technique. In section 3, a detailed description of the object representation and modeling scheme is given. The matching algorithm is presented in section 4, and experimental results are discussed in section 5. Finally, conclusions are drawn and further research is suggested.
2 Edge Detection
Many approaches for range image segmentation have been developed [1, 5-9]. Most of these use local surface properties, such as surface curvature and surface normals, to describe 3D shape. Most existing methods [1, 7-9] proceed in several stages, searching mainly for jump and crease boundaries by computing, at every point, zero-crossings and extrema of surface curvature along chosen directions. In contrast, the approach adopted here localizes all types of edges in a single step: two partial derivatives of the surface curvature, along the x and y directions, signal the presence of edge points in a simpler and computationally more efficient way.
In this study, to alleviate the effects of any noise present, the image is smoothed prior to computation of surface-curvature properties. The smoothing step is very important here, since the calculation of curvature involves second-order partial derivatives and hence may significantly magnify the effects of noise. Moreover, curvature, being a local measure, is highly noise-sensitive. Consequently, noise is removed by convolving the image with a rotationally invariant Gaussian mask of standard deviation σ. In order to remove noise while preserving significant features in the image, different values of σ in the range [0.5-2.5] were tried; a value of 1.3 was found to be the most appropriate for the range images used in this study. Once smoothing is performed, two newly developed directional curvature operators are applied to determine the edge points forming surface contours. The following section summarizes the segmentation approach adopted in this study. More details can be found in [10].
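As an illustration, the smoothing step can be sketched as a separable Gaussian convolution. This is a minimal sketch assuming the range image is a 2-D array of depth values; the kernel radius of roughly 3σ and the edge-replicating border handling are assumptions, not details from the paper:

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius=None):
    """Sampled 1-D Gaussian, normalized to sum to 1."""
    if radius is None:
        radius = int(3 * sigma + 0.5)  # assumed truncation radius
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def smooth_range_image(z, sigma=1.3):
    """Rotationally invariant Gaussian smoothing, applied separably
    along rows and then columns (border pixels replicated)."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(z, r, mode='edge')
    # convolve each row, then each column, keeping the original shape
    tmp = np.apply_along_axis(lambda row: np.convolve(row, k, mode='valid'), 1, padded)
    return np.apply_along_axis(lambda col: np.convolve(col, k, mode='valid'), 0, tmp)
```

Because the 2-D Gaussian is separable, two 1-D convolutions give the same result as one 2-D convolution at a fraction of the cost.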
Let k(p) be the curvature of the surface function f at the image coordinates (x, y):

k = f'' / (1 + f'^2)^(3/2)    {1}
Based on {1}, Fan [11] developed the directional surface curvature along a direction φ. This is given by:

k(φ) = f''(φ) / (1 + f'(φ)^2)^(3/2)    {2}
where:
f'(φ): the first-order directional derivative of f along φ
f''(φ): the second-order directional derivative of f along φ
A = f'(0°)
B = f'(90°)
Unlike existing methods, the proposed segmentation approach adopts a one-step process to determine whether an edge point belonging to a jump, smooth, or crease edge exists at a point p(x, y, z). For this, two partial derivatives, kx and ky, of the main curvature k are developed along the two main directions x and y, respectively.
kx = ∂k/∂x    {3}
ky = ∂k/∂y    {4}
Since k is relatively constant over a small neighborhood of points belonging to the same surface, whether that surface is cylindrical or spherical, the derivative operators kx and ky will both be zero at such points. Crossing a contour point between two neighboring or occluding surfaces, however, produces a sudden change in the curvature k, and hence a nonzero value of kx, ky, or both. The curvature also changes when passing a surface-orientation discontinuity (crease). Based on this observation, the two derivatives are used to signal the presence of an edge point. For the discrete range image, kx and ky are approximated by finite differences of the curvature values computed at neighboring pixels.
In terms of processing, since the goal is only to check whether kx and ky vanish, only the numerators of their analytical expressions need be computed. This significantly reduces the amount of computation in the convolution process. Fig.1 shows some of the edge detection results on real range images. In order to give a more realistic view of the treated objects, range images are displayed using a computer-graphics rendering technique rather than by encoding depth in gray levels.
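The edge test can be illustrated with a short sketch. It assumes the curvature map k has already been computed at every pixel; central differences stand in for the paper's derivative operators, and the exact zero test is replaced by a small threshold tau (a hypothetical parameter) to tolerate residual noise:

```python
import numpy as np

def curvature_edges(k, tau=0.05):
    """Flag edge points where the curvature map k changes abruptly.
    kx, ky approximate dk/dx and dk/dy by central differences; a pixel
    is marked as an edge when either magnitude exceeds tau (testing
    for exactly zero is too brittle on real sensor data)."""
    kx = np.zeros_like(k)
    ky = np.zeros_like(k)
    kx[:, 1:-1] = (k[:, 2:] - k[:, :-2]) / 2.0   # difference along x
    ky[1:-1, :] = (k[2:, :] - k[:-2, :]) / 2.0   # difference along y
    return (np.abs(kx) > tau) | (np.abs(ky) > tau)
```

On a region of constant curvature (planar, cylindrical, or spherical patch) both maps vanish, so only genuine contour crossings are flagged.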
1.a Real range image of scene 1
1.b Edge detection results of scene 1
1.c Real range image of scene 2
1.d Edge detection results of scene 2
Fig.1 Edge Detection Results of Real Range Images
3 Object Representation & Matching
The features most often used to build surface-based descriptions of 3D complex shapes are surface contours [12, 13]. In this study, these are determined by grouping edge points into three geometrical primitives (linear segments, cylindrical arcs, and spherical arcs). A recursive circular search scheme is used to link edge points that belong to a linear segment, characterized by zero curvature and forming a connected chain (fig.2.a). Cylindrical and spherical arcs, in contrast, are characterized by a nearly constant curvature over a small neighborhood. For example, starting from a point A, an arc is progressively constructed by approximating the curvature over a certain neighborhood of that point. For each determined arc, the direction and the radius of curvature are computed (fig.2.b). The next stage fuses/links small edge segments to form larger segments. Edge grouping is based on parallelism, approximate continuity in curvature, and collinearity criteria (fig.3). Experimentation shows, however, that edge grouping based on parallelism is more reliable than grouping based on co-curvilinearity, due to the presence of noisy curve segments. A good strategy is therefore to fuse edge segments using the parallelism criterion first, and then to apply co-curvilinearity linking, repeating until no further grouping is possible. Details of the edge grouping process can be found in [10]. Once all primitives have been determined, cylindrical arcs are separated from spherical arcs by computing the Z-value of the center of gravity G of the arc. If the computed Z-value of G corresponds to that in the image buffer, the arc is cylindrical; otherwise, it is spherical. Spherical arcs are further classified as convex or concave according to whether the computed Z-value of G is less than, or greater than, its corresponding value in the image buffer.
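The cylindrical/spherical test can be sketched as follows, assuming each arc is given as an (N, 3) array of (x, y, z) edge points, the range image is indexed as depth[y, x], and a tolerance tol (an assumed noise margin) replaces the exact equality test:

```python
import numpy as np

def classify_arc(arc_pts, depth, tol=1e-3):
    """Classify a fitted arc from its center of gravity G.
    arc_pts: (N, 3) array of (x, y, z) points on one arc.
    depth:   2-D range image, indexed as depth[y, x].
    Returns 'cylindrical', 'spherical-convex', or 'spherical-concave'.
    The sign convention (convex when computed Z is smaller than the
    buffered Z) follows the description above."""
    g = arc_pts.mean(axis=0)                       # center of gravity G
    zg = g[2]                                      # computed Z-value of G
    zbuf = depth[int(round(g[1])), int(round(g[0]))]
    if abs(zg - zbuf) <= tol:
        return 'cylindrical'
    return 'spherical-convex' if zg < zbuf else 'spherical-concave'
```

The intuition: on a cylinder the arc's centroid still lies on the surface, whereas on a sphere the centroid of a circular arc lies off the surface, in front of it (convex) or behind it (concave).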
a. Search for linear segments b. Arc construction
Fig.2 Linear-segment and arc construction
Fig.3 Edge linking / fusion
Objects are defined using a surface-based approach enforced by a boundary representation scheme. In such a scheme, an object is represented by its constituent parts and their spatial relationships. This is achieved by constructing an edge-junction graph from the extracted primitives. Edge-junction characteristics are then used to infer the visible surfaces and to determine their types. The surface generation process is based on the following assumptions:
· A planar surface may be bounded by a closed polygon, an open polygon (occluded or concave surface), cylindrical arcs (a cylinder's base), or a combination of open polygons and cylindrical arcs (fig.4.a). Based on this assumption, the edge-junction graph is used to generate such surfaces and to compute their equations.
· To determine a cylindrical surface, we first use a single cylindrical arc to generate a cylinder of infinite height. Then, all cylindrical arcs and linear segments belonging to that cylinder are considered to construct the corresponding cylindrical surface(s). This process is repeated until all cylindrical surfaces in the scene have been determined (fig.4.b).
· In order to generate a spherical surface, the preprocessed information (radius of curvature and convexity) associated with a single spherical arc is used to compute the radius, center, and convexity of the spherical surface. Once the surface equation is computed, all arcs belonging to that sphere are considered to construct the corresponding spherical surface(s). As shown in fig.4.c, a generated spherical surface may be partitioned into two or more spherical surfaces based on an arc-connectivity criterion. For example, the arcs (A, A', B) and (A, A'', B) are used to infer one part of the surface, whereas the other part is inferred by the arc (C, C', D). This process is repeated until all spherical surfaces are determined.
4.a Example of surface construction
4.b Construction of cylindrical surfaces
4.c Decomposition of spherical surfaces
Fig.4 Surface Construction & Decomposition
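As an illustration of the spherical-surface rule above, the sketch below recovers a full sphere from one arc. It assumes the arc's supporting circle (center, radius, unit plane normal) and the sphere's radius of curvature are already available from edge grouping; the sign convention tying convexity to the side of the circle's plane is an assumption for illustration:

```python
import numpy as np

def sphere_from_arc(circle_center, circle_radius, plane_normal,
                    sphere_radius, convex=True):
    """Recover the sphere supporting a spherical arc.
    The sphere center lies on the axis of the arc's supporting circle,
    at distance sqrt(R^2 - r^2) from the circle center; the convexity
    flag picks which side of the circle's plane (assumed convention)."""
    n = np.asarray(plane_normal, dtype=float)
    n = n / np.linalg.norm(n)
    h = np.sqrt(max(sphere_radius**2 - circle_radius**2, 0.0))
    sign = 1.0 if convex else -1.0
    center = np.asarray(circle_center, dtype=float) + sign * h * n
    return center, sphere_radius
```

Every point of the original arc then lies at distance R from the recovered center, which is what lets the remaining arcs of the same sphere be collected by a simple distance test.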
Objects in the scene are then inferred using a surface-adjacency graph. In such a graph, a node represents a surface from the scene, and an edge represents an adjacency relation between two surfaces. The objective is to determine connected sub-graphs that correspond either to parts of objects (in case of occlusion) or to full objects.
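A minimal sketch of this sub-graph extraction, assuming surfaces are labelled by ids and adjacency is given as a set of unordered pairs (names are illustrative):

```python
def connex_subgraphs(surfaces, adjacency):
    """Group surfaces into connected sub-graphs (object hypotheses).
    surfaces:  iterable of surface ids (graph nodes).
    adjacency: set of (s1, s2) pairs (graph edges).
    A plain traversal over the adjacency relation."""
    neighbours = {s: set() for s in surfaces}
    for a, b in adjacency:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for s in surfaces:
        if s in seen:
            continue
        stack, comp = [s], set()
        while stack:                      # flood-fill one component
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(neighbours[u] - comp)
        seen |= comp
        groups.append(comp)
    return groups
```

Each returned component is then handed to the matcher as one object (or object-part) hypothesis.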
A similar representation scheme is used to model the different objects in the database. A Computer-Aided Design tool is used to generate the models as well as configurations of complex objects. A configuration consists of a set of instances of primitive models, one of which is taken as a reference model. Several instances of the same model may appear in a single configuration. The spatial relationship between the models forming a configuration is defined by a set of model-bases: geometrical transformations that align each primitive model, defined in its model coordinate system, onto its corresponding model-instance defined in the configuration coordinate system.
Let C1 be the configuration formed by M1, M2, M3, M4, and let M1 be the reference model. As shown in fig.5, C1 is defined by:
C1 = {M1, (M1, B1), (M2, B2), (M3, B3), (M4, B4)}
where:
Bi: the TRM that places instance Mi in configuration C1
Fig.5 Matching configurations
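The configuration machinery can be illustrated as follows: a model-base Bi is a 4x4 TRM in homogeneous coordinates, and placing a model instance amounts to a single matrix product (function names are illustrative):

```python
import numpy as np

def trm(rotation, translation):
    """Build a 4x4 Translation Rotation Matrix from a 3x3 rotation
    matrix and a 3-vector translation (homogeneous coordinates)."""
    m = np.eye(4)
    m[:3, :3] = rotation
    m[:3, 3] = translation
    return m

def place_instance(model_points, base):
    """Map model points (N, 3), defined in the model coordinate system,
    into the configuration coordinate system via the model-base TRM."""
    pts = np.hstack([model_points, np.ones((len(model_points), 1))])
    return (base @ pts.T).T[:, :3]
```

Because rotation and translation are folded into one matrix, chaining transformations (model to configuration to viewer) is just matrix multiplication.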
4 Object-Model Matching
It is now widely accepted that 3D recognition of real-world scenes with possible occlusion is a difficult problem [1] that cannot be solved without a consistent representation scheme whose invariant features reduce the search space and confirm object hypotheses. To achieve these goals, the proposed matching strategy uses just one LGP (a pair of adjacent surfaces) from the scene to match an object against the different models in the database. In particular, for each sub-graph of the surface-adjacency graph, only one LGP is chosen for matching against similar LGPs of a candidate model. The idea is to generate, for each similar pair of object and model surfaces, a model-base (P, r, u, v) expressed in terms of a TRM that allows a model defined in its model coordinate system to be aligned onto its corresponding object defined in the viewer-centered coordinate system (fig.6). If the matching fails, no other LGP from that sub-graph is considered. Moreover, no successive geometrical transformations are applied while matching a model to the object: it suffices to project the model through the generated model-base and compare the projection with the visible part of the object in the scene. The matching process is repeated until all objects in the scene have been recognized; matching of possible configurations can then begin.
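A sketch of turning a model-base (P, r, u, v) into a TRM. The frame construction below, a Gram-Schmidt step on the two surface normals of an LGP anchored at a shared boundary point P, is an illustrative assumption rather than the paper's exact recipe; it only shows how one adjacent-surface pair suffices to fix all six pose parameters:

```python
import numpy as np

def model_base(P, n1, n2):
    """Build a model-base as a 4x4 TRM from one LGP.
    P:  reference point on the shared boundary of the two surfaces.
    n1, n2: the two surface normals at P (need not be orthogonal).
    r, u, v: orthonormal axes derived by Gram-Schmidt (assumed recipe);
    the columns of the rotation part are r, u, v, translation is P."""
    r = np.asarray(n1, dtype=float) / np.linalg.norm(n1)
    u = np.asarray(n2, dtype=float) - np.dot(n2, r) * r
    u = u / np.linalg.norm(u)
    v = np.cross(r, u)
    m = np.eye(4)
    m[:3, 0], m[:3, 1], m[:3, 2], m[:3, 3] = r, u, v, P
    return m
```

Once such a base is computed from matching object and model LGPs, projecting the model through it places the model directly in the viewer-centered frame, so no further transformations are needed during verification.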