Shape Indexing using Active Contours and Curvature Motion

Apma 990 Project Report

Ross Tulloch

Primary References:

Chan, T., L. Vese, ``Active Contours Without Edges,'' IEEE Transactions on Image Processing, Vol. 10, No. 2, pp. 266-277, 2001.

Mokhtarian, F., S. Abbasi and J. Kittler, ``Efficient and Robust Retrieval by Shape Content through Curvature Scale Space,'' Proc. International Workshop on Image DataBases and MultiMedia Search, pp. 35-42, Amsterdam, The Netherlands, 1996.

Mokhtarian, F., S. Abbasi and J. Kittler, ``Robust and Efficient Shape Indexing through Curvature Scale Space,'' Proc. British Machine Vision Conference, pp. 53-62, Edinburgh, UK, 1996.

1. Introduction

In the context of my project, shape indexing means cataloging images based on the curvature properties of the objects within the image. Classically, image properties such as color, texture and shape would be manually input to databases, which is certainly tedious and prone to error. The computer vision approach is to apply image processing techniques in order to catalogue and retrieve similar images automatically.

An example of a computer vision database is the Shape Queries Using Image Database (SQUID), which catalogues images solely by shape, and is based on their curvature properties. A demonstration of SQUID is available online at SQUID was developed in the mid-1990’s at the University of Surrey, England, by Farzin Mokhtarian’s computer vision group. Mokhtarian appears to be the authority on the application of Curvature Scale Space (CSS) to image processing problems because it is the topic of most of his papers since the late 1980’s.

The key idea behind SQUID is Curvature Scale Space (CSS), which enables it to quantify curvature conveniently and compactly. That is, by applying successively stronger smoothing operations on an image, one can characterize the curvature of the objects in the image by examining the evolution of zero curvature points.

SQUID uses Gaussian convolution on an explicitly defined Lagrangian boundary to create a CSS plot and then attempts to match images via a “cost” calculating algorithm. The goal for my project is to implement SQUID’s image segmentation and CSS generation parts using level set methods. I use Chan and Vese’s “Active Contours Without Edges” algorithm for image segmentation and then simple curvature motion to create the CSS plot. The algorithm gets “emotional” as soon as the implicit boundary is extracted to form an explicit curve.

2. Shape Indexing through Curvature Scale Space (Mokhtarian’s Method)

The following method is heralded as a fast and robust indexing method that can retrieve similar looking images. The method has been tested on a prototype database of 450 images of fish-like animals and the results were compared favorably with the responses of volunteers, who made their own subjective similarity assessments.

2.1Image Segmentation

Mokhtarian’s paper does not describe the specific image segmentation algorithm used to detect object boundaries within the image. The SQUID system takes boundary data points as input which form an explicit closed curve around one object in the image that is parameterized by normalized arc length:

.

2.2Curvature Scale Space Representation

Next, SQUID computes the curvature of the image:

and notes values of arc length s which have zero curvature. The key step is to convolve the boundary with a Gaussian kernel of width , so that the boundary is smoothed out. The Gaussian kernel is defined as:

.

Since the kernel is positive everywhere and monotonically decreasing away from s=0, it acts as a low pass filter on the image. Initially, for small , the kernel smoothes out only noise, but then for larger , it smoothes deep cavities and convexities. Convolving the boundary by g(s,) in the x and y directions independently gives the following for the updated x and y boundary points:

where the derivatives Xs and Ys follow from the definition of convolution since the s dependence can be given to either x(s) or g(s,). Thus we obtain the following for smoothed curvature:

.

The CSS convolution method is robust to noise, scale and orientation changes, and has been extended by Mokhtarian in recent papers to be relatively invariant to affine transformations by parameterizing the boundary by “affine length” instead of arc length. Figure 1 illustrates the algorithms robustness to such defects, and illustrates how a proper CSS looks.

Figure 1: a) A boundary and its CSS image, b) Change in orientation causes a circular shift in CSS image (Mokhtarian, 1996)

Figure 2 shows the effect on zero curvature points when applying successively larger smoothing ( on the boundary.

Figure 2: Shrinking and smoothing of the curve and decreasing the number of curvature zero crossings from left: =1,4,7,10,12,14 (Mokhtarian, 1996)

2.3Maxima extraction and CSS matching

As the smoothing increases, zero crossings converge on each other and eventually peak at maxima which are extracted and stored as the shape vector for the image. Typically only about 3 to 7 maxima need to be stored, and any small maxima (<1) are ignored (as noise) by thresholding. Figure 3 shows the extracted maxima for an example fish.

Figure 3: CSS image and its maxima for two orientations (Mokhtarian, 1996)

Notice that the maxima in the lower plot are close to being just a circular shift of the maxima in the top plot (ie. it would be just a circular shift if they were the same fish).

The first stage of CSS matching is checking the aspect ratio, which is a comparison between the largest maxima of each image. If max(new)/max(old) is not within some range (say 0.8 to 1.2) then they rule that the images are not similar.

Next, compare the set of maxima in the new “image” with the set from the old “model” by evaluating a cost function as follows. Create a “node” consisiting of the largest maximum of the image and the model, and then circular shift the image by =Sm-Si., where Sm and Si are the arc length coordinates of the largest maxima in the model and image. The initial cost is the absolute difference in  coordinates between the image and the model. Also create “nodes” for any maxima within 80% of the largest maximum of the image (the overall cost will be the minimum of the individual costs).

For each node, the cost of matching the model to the image is associated with the sum of the minimum “straight line” distances between remaining maxima in the image and model. Since the image and model may have a different number of maxima, the height  may be used as the distance when there are no maxima left in the image or the model. The lowest cost node is selected then the process is repeated by matching the image to the model. Since mirrored reflections do not change the shape of images, the cost of matching the mirror of the model to the image (and vice versa) is also computed. The final cost is the minimum of all computed costs.

The response of the SQUID system to a series of queries is shown in Figure 4. The top image in each column is the input and the lower ones are the ones judged most similar.

Figure 4: CSS image and its maxima for two orientations (Mokhtarian, 1996)

3. Level Set Approach to the Shape Indexing Problem

The key image processing steps in SQUID’s algorithm, image segmentation and the generation of the CSS plot via Gaussian smoothing, can easily be handled using level set methods. Chan and Vese’s “Active Contours without Edges” image segmentation algorithm is well suited for determining the fish boundaries, and the Gaussian smoothing can be replaced by simple curvature motion.

3.1Preprocessing

The source data for my experiments was the SQUID database of 1000 scanned fishes from the following books:

  • Marine animals of the Indo-Pacific V1 (1986) by Shirai
  • Fishes of the sea, The North Atkantic and Mediterranean (1991) by Lythgoe
  • The illustrated book of Fishes (1987), by Bristow.

An example of the original scanned images is shown in Figure 5. Due to copyright reasons, the SQUID database contains only boundary data for these images in the form of “.c” files, each of which contain between 400 and 1600 (x,y) points on the fish boundary, as illustrated in Figure 6.

Figure 5: Original kk100 fish scan (see SQUID website)

Figure 6: Plot of boundary points for kk100

Since the core component of my project was to implement an image segmentation algorithm, I didn’t want to start with the fish boundaries explicitly defined so I filled them using Matlab’s “fill” command as shown in Figure 7.

Figure 7: Examples of filled fish boundaries

Note that “filling” the fish was cosmetic, meaning that the active contour model would probably have no trouble converging to the image of just the boundary. However the “filled” fish do serve as a more realistic database query. To speed up my algorithm, particularly the implicit solve in the segmentation algorithm, I scaled down the individual fish images in Figure 7 to a set size of 107 by 143 pixel JPEG’s. The scaled fish are shown in Figure 8. As noted by Ben, JPEG may not have been the best format to write these scaled images.

Figure 8: Scaled down JPEG images

3.2Image Segmentation using “Active Contours without Edges” (Snakes)

This section follows directly from Chan and Vese’s 2001 paper. Many image segmentation algorithms use the concept of energy descent, in which a function (in this case the level set function ) is evolved until a given “fitting energy” is minimized. Consider a scalar image u0 and a closed curve C within the image. Define F1(C) and F2(C) as

where c1 is the average value of u0 inside the curve C and c2 is the average outside. Figure 9 illustrates that F1 is zero as long as the C surrounds a homogeneous region of the image and F2 is zero as long as C is surrounded by a homogeneous region.

F1>0F1>0F10F10

F20F2>0F2>0F20

Figure 8: Possible cases for the curve C, the fitting energy F1+F2 is minimized only in the orientation on the right

In this model, the energy functional F(c1,c2,C) is defined as

where the length of C is added as a regularization term. Applying the associated Euler-Lagrange equation (minimizing with respect to ) a level set formulation can be derived

where () is a smoothed out delta function which is defined as the derivative of the following smoothed out Heaviside function

.

Given this Heaviside function constants c1 and c2 are calculated at each time step using

.

To calculate c1 and c2 numerically, I used the simplest numerical integration that I thought I could reliably implement which was 2D Simpson’s rule. The evolution equation for  is discretized as

where the derivative operators D+ and D- are defined as

.

The implicit function  is evolved implicitly to steady state and I solved the linear system using BiConjugate Gradients with an incomplete LU factorization preconditioner since it is sparse and nonsymmetric. The boundary condition in the segmentation algorithm is

but since none of my images contained objects that touched the boundary I assumed that boundary conditions were not important since  would be zero on the domain boundary.

Due to my impatience, and desire for a quick solve time, I used a crude incremental time step t  nt0 since as the fitting energy F(c1,c2,C) gets small, the amount of motion in  near the interface decreases accordingly. Figure 9 shows an example of the evolution of  to the object boundaries.

Figure 9: Example of  evolving to the object boundary

The initial guess for  is the largest circle at the center of the image which penetrates both the donut and the object at the bottom right. Notice that the error is largest in the corners of the image which were the last parts of each object to be segmented (so the time step was largest there). Another example of the robustness of this algorithm is shown in Figure 10. A poor initial guess for  is chosen and the algorithm converges equally well.

Figure 9: Example of convergence with a poor initial guess for 

4. Reinitialization of the level set function 

During the evolution of the image segmentation algorithm, and many other level set algorithms, the level set function will gradually lose its initial signed distance property. If  becomes too flat or steep in areas near the zero contour then numerical errors can increase. The common reinitialization equation comes from Sussman’s (1994) paper for two phase flow, and is given by

,

where  is a fictitious time that is evolved until 0 (steady state) so that  is renitialized to ||=1 near the zero contour. I implemented the reinitialization scheme given in Sussman’s paper using 2nd order ENO interpolation, 2nd order Runge-Kutta timestepping, and the following sign function:

.

For time stepping I took =x/10 and let =x. The results of an example reinitialization are shown in Figure 10. The initial signed distance function’s zero contour was a circle in a rectangular domain, and was subsequently smoothed using 30 steps of curvature motion (left column of Figure 10). The results of re-initialization are shown on the right.

Figure 10: Example of reinitialization. (Left) Smothed implicit function , (right) reinitialized 

I had a few difficulties employing reinitialization during image segmentation. First the stopping criterion given in Sussman’s paper,

,

where =3x/2, appeared to be too strict, so I used E<1.5x2. Then as I increased the time step in the segmentation algorithm the reinitialization had difficulty converging. Fortunately reinitialization is optional in the segmentation algorithm, and Fedkiw noted that it could even be detrimental because it prevents interior contours from growing (although I suspect this would not be an issue with my fish images).

The implicit function  after image segmentation is shown for fish “kk100” without reinitialization in Figure 11. The spiked region near the tail of the fish was the last to be segmented when the time step was quite large. The black line denotes the zero contour (the artifacts on this line are due to MATLAB’s contour function). I suspect reinitialization would not be too helpful in this situation because the steep regions of  are precisely where t in the segmentation algorithm is going to zero.

Figure 11: Implicit function  for “kk100” after segmentation

The results of the segmentation algorithm on the six fish shown in Figure 8 are shown in Figure 12 (without reinitialization).

Figure 12: Fish boundaries after segmentation

5. Creating the CSS Plot using Curvature Motion

Mokhtarian creates curvature scale space plots by extracting the points of zero curvature on a successively smoothed object boundary. The smoothing operation is performed using convolution of Gaussians (of increasing widths) on an explicitly defined set of boundary points. Such smoothing can easily be implemented with an implicit level function  evolving under curvature motion.

5.1Curvature Motion

The evolution of a curve under curvature motion in a level set sense is , where , the curvature, is the divergence of the vector normal to the zero contour of ,

.

Discretization for this equation is also easy at first order using centered differences:

,

.

The equivalence of curvature motion to smoothing Gaussian convolution can be seen by noticing that the equation for curvature motion becomes the heat equation when  is a signed distance function such that ||=1. The solution of the heat equation in two dimensions is the initial value of  convolved with the heat kernel, which happens to be a Gaussian:

,

where time becomes the smoothing parameter, replacing the width of the Gaussian (2).

5.2Creating the CSS plot

After the object in the image has been segmented, the first step is to smooth the boundary until it is a single closed curve since all of my images contain a single fish. Pre-smoothing to a single curve simplifies the explicit capturing of the boundary data, so that it can be done using MATLAB’s “contour” function. The algorithm for creating the CSS plot contains the following 3 steps which are repeated until there are no more points of zero curvature on the zero contour of . First perform a curvature motion step, then capture the points of zero curvature on the zero contour of  and finally parameterize these points with respect to arc length. Since the total arc length of the zero contour shrinks as it becomes more smooth in both Gaussian convolution and curvature motion, the arc length is normalized to the interval [0,1).

An example plot of the zero contour of  and the zero contours of  is shown in Figure 13.

Figure 13: Zero contours of  and  at t=40 for kk100

Points on the zero contour of  and the zero contour of  were obtained using “contour” in MATLAB, which gave points on either the x-grid (xj=jx,yj) or on the y-grid (xj,yj=jx). Figure 14 illustrates the zero curvature crossings in the zero contour of  for Figure 13.

Figure 14: Zero curvature crossings (for  in Figure 13)

The zero contour of  is traversed clockwise, and points of zero curvature are noted whenever the sign between the nearest two grid points is different (lower plots and upper left plot in Figure 14). For consecutive points that satisfy this condition, the zero crossing is determined by interpolating to the middle these points. Zero crossings are also noted when the closest grid points to current point on the  contour have the same sign and the closest grid points to the next contour point both have a different sign (as illustrated in the top right plot of Figure 14).

5.3CSS maxima extraction

In order to extract the CSS maxima, I chose to search for CSS data points starting from the top of the plot and then filter, or mask, the rectangle below and +/- some distance to the sides of that point so that no other maxima could be within that region. Mokhtarian’s paper allowed for maxima directly below other maxima but to keep the algorithm as simple as possible, my scheme did not. The mask must extend circularly from 1 around to 0 since those are essentially the same points on the zero contour. Also any potential maxima below a nominal threshold of time=25 were rejected for simplicity, even though small maxima above the noise level would probably be important to keep.