Boundary Decomposition and Matching

Seckeita Taylor and Adrian Vern

December 4, 2000

EE547 – Dr. Reeves

Abstract:

This project has two parts. First, given a closed image boundary, we use Gaussian filtering to find a representation of the curvature at every point on the boundary. This is done for varying values of the filter width, and the resulting information is combined to form a unique representation of the boundary. Next, to test the accuracy of our decomposition, we develop a method of comparing two independent representations of this kind to determine whether or not they describe the same boundary. We perform two types of testing: functionality and robustness.

Research:

There has been a great deal of work in the area of boundary decomposition, and there are several ways to take the boundary of a shape apart and develop a classification for it. However, a consistent set of criteria is used to define a reliable representation. First of all, the representation must be invariant under transformations (rotation, scale). Computation time is also important. In addition, varying levels of detail must be contained within the representation. One of the largest problems, though, is handling open curves caused by the frame boundary or by loss of information.

In order to remove the inherent noise along a boundary, a good procedure is to apply a Gaussian filter. However, doing so can also remove fine details along the boundary that are crucial to forming an appropriate representation. There are several methods of dealing with this problem. First, one can apply a non-linear filter to a boundary, then find an optimal sigma for a subsequent Gaussian filter [2]. One can also minimize the third derivative of the Gaussian convolution to develop a smoothness measure [3]; this measure is used to identify discontinuities of curve tangents as well as to select the appropriate scale of smoothing. Finally, one can use several values of sigma and perform multiple Gaussian smoothings [1]; this results in a scale-space image. A common method of defining curvature along a shape boundary is to first parametrize the x and y coordinates of the boundary, then convolve each with derivatives of the Gaussian. Once the curvature is found for every point, a scale-space image can be created.

To match one scale-space image with another, a node-matching algorithm is proposed by Mokhtarian and Mackworth; it is based on the Uniform Cost algorithm and an A* search. The idea is to avoid comparing every single scale-space loop: one finds an important match with one of the loops and continues from there, so one need not match everything. The purpose of this method is to compensate for boundaries which have lost information and are incomplete; however, this approach goes beyond the scope of our goals for the project, and we will limit our testing to closed-boundary images.

Overview:

The remainder of this paper is divided into three main sections. First we will explain the experiments we conducted, both for functionality and for robustness. Afterwards, we will explain how our programs work and why we chose to make them the way they are. Finally we will summarize our results and go on to suggest the next steps to take to investigate this procedure in more depth.

Experiment Description:

  1. Functionality (initial experiment):

The first thing to test is how well our program can recognize that two different object boundaries are in fact different, or that identical object boundaries are the same. We chose to use images of Africa and Illinois. The general shape of these regions is similar but different enough for a casual observer to be able to tell them apart (these shapes were manually drawn on graph paper and then edited in emacs).

For two dissimilar images, the desired output is for our boundary matching algorithm to tell us that the error is too large for these shapes to be the same. Conversely, the desired output for two identical images of Africa is an error of zero.

As our testing shows, our algorithm is able to recognize how similar or different two given shapes are. The next step is to find out just how well the algorithm performs on images that gradually change from being the same, to mostly the same, to different.

  2. Robustness:

Once we knew that our algorithm worked on simple inputs, we decided to see how much noise we could add to the test image (identical to the original image) before it would no longer be recognized. To add noise to the test image, we used the gauss function in VisX. This spread noise throughout the image, both on the boundary pixels and the background.

We were unable to use these images as test images because boundary following is impossible. When one attempts to use boundary following on one of these test images, several regions are found, and a single output boundary vector is impossible to achieve.

The solution to this problem was to manually alter the original Africa image. Using a noise image as a reference, and depending on the amount of noise present, we altered the boundary in such a way that the total number of pixels remained the same while a complete closed boundary was maintained. Noise deviations of 20, 40, 60, and 80 were used.

Program Descriptions:

  1. vcurvature:

This program takes a 2D vector as input; the output is also a 2D vector. The length of the output vector is a multiple of the length of the input vector because it contains several curvature representations of the input; the number of representations determines the multiple.

(One important note: in order to use a vector with vcurvature, one first needs to perform some type of boundary following on a thresholded shape. vtrace, a built-in VisX function, is useful for doing this.)

vcurvature separates the input vector into an x-vector and a y-vector; these represent the x- and y-values of the boundary pixels. The next step is to find the first and second derivatives of each of these new vectors. To do this, one can filter the x- and y-vectors with a 1st-derivative Gaussian to get the dX- and dY-vectors and with a 2nd-derivative Gaussian to get the d2X- and d2Y-vectors.

The Gaussian derivatives are simply calculated from their respective equations. However, one must be careful to ensure two crucial characteristics. First, the Gaussian derivatives must have enough samples, and span a broad enough range, to resemble the desired shapes: the 1st derivative should look similar to one period of a sine wave, and the 2nd derivative should look like the classic inverted “Mexican Hat”. Second, when deciding how many samples the filter needs, it is imperative that the Gaussian length be LESS than the length of the input vector, because of the way the convolution is performed.
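As a concrete sketch (ours, not vcurvature's actual code), the two derivative kernels can be sampled directly from their equations; the 3-sigma half-width used here is our assumption for "enough samples":

```python
import math

# Sampled Gaussian-derivative kernels. The 3*sigma half-width is an
# assumption: wide enough that the tails decay to near zero.
def gauss_d1(sigma):
    """First derivative of a Gaussian: -t/sigma^2 * exp(-t^2 / (2 sigma^2))."""
    w = int(math.ceil(3 * sigma))
    return [-t / sigma**2 * math.exp(-t**2 / (2 * sigma**2))
            for t in range(-w, w + 1)]

def gauss_d2(sigma):
    """Second derivative of a Gaussian (the inverted 'Mexican Hat')."""
    w = int(math.ceil(3 * sigma))
    return [(t**2 / sigma**4 - 1 / sigma**2) * math.exp(-t**2 / (2 * sigma**2))
            for t in range(-w, w + 1)]

d1, d2 = gauss_d1(2.0), gauss_d2(2.0)
assert len(d1) % 2 == 1                 # odd length, as required below
assert abs(d1[0] + d1[-1]) < 1e-12      # 1st derivative is antisymmetric
assert d2[len(d2) // 2] < 0             # central lobe of the "hat" points down
```

The assertions check the two expected shapes: an antisymmetric, sine-like 1st derivative and a 2nd derivative whose central lobe points down.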

The convolution of the Gaussians with the x- and y-vectors is the same in every case. However, the classic convolution cannot be performed in this program.

When one separates the input vector into x- and y-components, the perimeter of the boundary is obtained. Since this boundary is discrete and we plan to track exact positions throughout the project, we must maintain the same boundary length. In a standard convolution of two vectors, say u of length M and v of length N, the length of the result is N+M-1. Because the vector size changes, we must use some form of convolution where, if our input vector is of length L, our output is also of length L.

The approach used in this project involved using a filter of odd-length W, and then zero-padding the input vector with W/2 zeros on each end. The filter is centered on the first non-zero input pixel; in this position, the first element of the filter is aligned with the first zero. The convolution is then performed L times, so the last calculation has the filter centered on the last non-zero input pixel. Doing this filtering procedure is a good approximation as long as the input vector (without the zeros) is longer than the Gaussian.
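In a pure-Python sketch (the function name is ours, not vcurvature's), this same-length, zero-padded convolution looks like:

```python
def conv_same(x, h):
    """Convolve x with an odd-length filter h; the output has len(x) samples."""
    assert len(h) % 2 == 1 and len(h) < len(x)   # filter must be shorter
    half = len(h) // 2
    padded = [0.0] * half + list(x) + [0.0] * half   # W//2 zeros on each end
    out = []
    for i in range(len(x)):                          # center h on sample i
        out.append(sum(h[len(h) - 1 - k] * padded[i + k] for k in range(len(h))))
    return out

y = conv_same([1.0, 2.0, 3.0, 4.0, 5.0], [0.25, 0.5, 0.25])
assert len(y) == 5    # length preserved, unlike a full convolution (5+3-1 = 7)
```

The first output sample is computed with the filter centered on the first non-zero input pixel, and the last with the filter centered on the last, exactly as described above.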

Once the x- and y-derivatives are calculated, all the pieces are in place to compute the curvature. The curvature equation is defined as:

(dX*d2Y-dY*d2X)/(dX^2+dY^2)^(3/2)
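As a sanity check of this equation (a sketch of ours; the circle parametrization is not from the report), a circle of radius r traversed at unit speed should give constant curvature 1/r:

```python
import math

# Direct transcription of the curvature equation above.
def curvature(dX, dY, d2X, d2Y):
    return [(dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5
            for dx, dy, ddx, ddy in zip(dX, dY, d2X, d2Y)]

# Unit-speed circle of radius r: x = r*cos(s/r), y = r*sin(s/r), so
# dX = -sin(s/r), dY = cos(s/r), d2X = -cos(s/r)/r, d2Y = -sin(s/r)/r.
r, n = 10.0, 64
s = [2 * math.pi * r * k / n for k in range(n)]      # arc-length samples
dX  = [-math.sin(u / r) for u in s]
dY  = [ math.cos(u / r) for u in s]
d2X = [-math.cos(u / r) / r for u in s]
d2Y = [-math.sin(u / r) / r for u in s]
kappa = curvature(dX, dY, d2X, d2Y)
assert all(abs(v - 1.0 / r) < 1e-9 for v in kappa)   # constant curvature 1/r
```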

When one plots this equation, one can see that with increasing sigma the number of zero-crossings is reduced. The zero-crossings are the inflection points of the input vector; these special points are discussed further below.

Once a curvature representation is calculated, one knows where the inflection points on the boundary are.

Fig.1    Fig.2

Fig.1: Curvature vs. Perimeter (sigma=6,12,24,48): The zero-crossings represent inflection points. The smaller sigma is, the more activity on this plot; the larger sigma is, the more fine details in an image are filtered out.

Fig.2: Scale-space image of Africa (from [1]): Every point on this graph corresponds to a zero-crossing from Fig.1. The tops of the loops are where pairs of inflection points meet and then disappear.

One also has the option of calculating several curvature representations and storing them in one large output file. The last step of vcurvature is to take the curvature vector and threshold the values. Ideally, we would want all locations on the vector where zeros are found to be labeled 255 (or some other indicator of an inflection point) and all other locations labeled zero (no inflection point). However, the program does not find exact zeros; it must make some approximations. Whenever the program finds a transition between a positive and a negative value, it knows that a zero lies somewhere in between. The program then assigns the point closest to zero to be the inflection point. This approximation is a little crude, but with more input boundary pixels the error is minimized.
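A minimal sketch of this sign-change test (the wraparound between the last and first boundary samples is our assumption for a closed boundary):

```python
def mark_inflections(curv):
    """Label the sample nearest each zero-crossing 255; everything else 0."""
    marks = [0] * len(curv)
    for i in range(len(curv)):
        j = (i + 1) % len(curv)            # closed boundary: wrap at the end
        if curv[i] * curv[j] < 0:          # sign change => a zero lies between
            nearest = i if abs(curv[i]) <= abs(curv[j]) else j
            marks[nearest] = 255           # point closest to zero wins
    return marks

m = mark_inflections([0.5, 0.2, -0.1, -0.4, 0.3])
assert m == [0, 0, 255, 0, 255]
```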

So now, the output of vcurvature contains several elements, each of which contains a curvature representation for a given sigma. To create a true scale-space image, one can find many of these curvature representations for many sigmas and then plot the result. This plot has sigma for its y-axis and perimeter location for its x-axis. At sigma ~ 0, the most inflection points are found. If one were to keep plotting where inflection points appeared with respect to sigma and perimeter location, the inflection points would gradually extend up in the graph and curve to a side. Eventually, each curve would meet with a second curve, forming a semi-ellipse. The peak of each of these ellipses is where the inflection point disappears due to filtering.
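Putting the pieces together, the scale-space table can be sketched as below (our code, not vcurvature's; for brevity it uses circular convolution, which is natural for a closed boundary, instead of the zero-padding scheme described earlier):

```python
import math

def smooth_derivs(v, sigma):
    """1st and 2nd derivatives of closed-boundary samples v, via circular
    convolution with Gaussian-derivative kernels (3*sigma half-width)."""
    w, n = int(math.ceil(3 * sigma)), len(v)
    g1 = [-t / sigma**2 * math.exp(-t**2 / (2 * sigma**2))
          for t in range(-w, w + 1)]
    g2 = [(t**2 / sigma**4 - 1 / sigma**2) * math.exp(-t**2 / (2 * sigma**2))
          for t in range(-w, w + 1)]
    def cconv(h):
        return [sum(h[w - m] * v[(i + m) % n] for m in range(-w, w + 1))
                for i in range(n)]
    return cconv(g1), cconv(g2)

def scale_space(xs, ys, sigmas):
    """One row of inflection marks (255/0) per sigma: a scale-space table."""
    rows = []
    for s in sigmas:
        dX, d2X = smooth_derivs(xs, s)
        dY, d2Y = smooth_derivs(ys, s)
        curv = [(dX[i] * d2Y[i] - dY[i] * d2X[i]) / (dX[i]**2 + dY[i]**2) ** 1.5
                for i in range(len(xs))]
        rows.append([255 if curv[i] * curv[(i + 1) % len(curv)] < 0 else 0
                     for i in range(len(curv))])
    return rows

# A wobbly circle r = 1 + 0.5*sin(4*theta) has concave stretches, hence
# inflection points; a plain circle would have none.
n = 256
th = [2 * math.pi * i / n for i in range(n)]
xs = [(1 + 0.5 * math.sin(4 * t)) * math.cos(t) for t in th]
ys = [(1 + 0.5 * math.sin(4 * t)) * math.sin(t) for t in th]
rows = scale_space(xs, ys, [1.0, 4.0])
assert rows[0].count(255) > 0                       # inflections at fine scale
assert rows[0].count(255) >= rows[1].count(255)     # filtering removes detail
```

Each row of the returned table corresponds to one horizontal slice of the scale-space image: perimeter location on the x-axis, sigma on the y-axis.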

The output of vcurvature is next sent to a boundary-matching algorithm. This algorithm compares two vcurvature outputs of discrete points, and it calculates the error between them. Depending on the error tolerance specified by the user, the matching algorithm determines if there is a boundary match or not.

  2. vmatch:

Vmatch decides whether a match has been found between two images. It is designed to perform boundary matching on two 2D vectors that represent the scale-space data of an image; this input holds the inflection-point information of the image at each sigma value. (The sigma value is a factor in the smoothing done in the vcurvature program.) Vmatch consists of several parts: the manipulation of the input data, the ordering of that data, and the error comparison of the vectors.

First, vmatch takes two 2D vectors as input. These vectors contain a value of 255 or 0 to denote whether an inflection point is found at each location, along with the sigma value for that point. Vmatch is concerned with the points where there is an inflection point, i.e., where the value 255 is found. The location of this 255 value denotes the t value in the scale-space plot shown in figure *. Vmatch first searches through the input vectors for occurrences of the value 255. When such a point is found, the program calculates its location and logs the corresponding sigma value. The location and sigma value are then stored in the program’s buffer.

The next step of the program is to find the largest sigma value in each vector. Once that value and its location in the vector are found, they are stored in a new buffer, followed by the subsequent location and sigma values. The vector then wraps around (using a mod function) to include the rest of the values in the buffer. This reordering is done to ensure that, once the matching begins, the program will start from the same vantage point in each vector.
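This reordering step can be sketched as a simple rotation (names are ours, not vmatch's):

```python
def reorder(points):
    """points: list of (position, sigma) pairs found in the input vector.
    Rotate the list (mod-function wraparound) so it begins at the entry
    with the largest sigma value."""
    start = max(range(len(points)), key=lambda i: points[i][1])
    return [points[(start + i) % len(points)] for i in range(len(points))]

pts = [(5, 2.0), (12, 6.0), (20, 14.0), (31, 3.0)]
assert reorder(pts) == [(20, 14.0), (31, 3.0), (5, 2.0), (12, 6.0)]
```

Starting both vectors at their largest-sigma entry gives the two boundaries a common vantage point regardless of where boundary following happened to begin.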

The last and most important step in the vmatch program is the calculation of the error. Since the scale space is represented for each sigma value, the output appears as a number of points for each sigma value that follow a straight-line behavior. Therefore the error calculation must be done on the locations of the inflection points rather than on the sigma values, as originally thought. Since the error is the difference in the locations of the inflection points for each sigma, it is calculated by referencing the location of each inflection point in the input file and comparing it to the location of the corresponding inflection point in the test file. This is done for each sigma value; the total error is the summation of the differences between the inflection points. Once this error is calculated, it is compared to the maximum allowable error, which is either supplied by the user or defaults to a value specified in the program. If the total error is less than or equal to the maximum allowable error, the images are a match; if not, they are not. The output of vmatch is the maximum sigma value, the match status (match or no match), the total error, and the maximum allowable error.
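The error comparison can be sketched as follows (our names; pairing entries by index after both lists are rotated to their largest sigma is our simplification, since the report does not spell out how unequal list lengths are handled):

```python
def match_error(ref, test):
    """ref, test: reordered lists of (position, sigma). The error is the
    summed difference in inflection-point locations, compared pairwise."""
    n = min(len(ref), len(test))
    return sum(abs(ref[i][0] - test[i][0]) for i in range(n))

def is_match(ref, test, max_error=99.0):
    """Match if the total error does not exceed the maximum allowable error."""
    return match_error(ref, test) <= max_error

ref  = [(31, 14.0), (40, 9.0), (55, 6.0)]
test = [(53, 13.0), (60, 9.5), (70, 6.1)]
assert match_error(ref, test) == 22 + 20 + 15
assert not is_match(ref, test, max_error=50.0)
assert is_match(ref, test, max_error=99.0)
```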

A sample of the output is as follows:

Input File
- Max sigma value in scale space=14 at position=31
Test File
- Max sigma value in scale space=13 at position=53
Boundary Match Found.
- Error = 91.428642
- Maximum Allowable Error = 99

Results and Observations:

As mentioned earlier, the first goal was to make sure that our program could recognize the case when two boundaries are very different as well as return an error of zero for two identical images. When we tested Africa against Illinois, we initially got zero error. This was due to comparing inflection points at the same sigma values. One of the calculations in the matching algorithm is dependent upon a difference between the sigmas in the two test boundaries. If we sample at the same sigma values throughout, this difference becomes zero. However, when we sample at different sigmas, the difference in the two boundaries becomes much more apparent.

After this, we tried inputting Africa against itself (with different sigma samplings, of course) and received an acceptably small error. So at this stage in testing, we knew that our program could differentiate between what was a match and what was not.

Fig.2: Africa    Fig.3: Illinois

Next, we gradually added noise to the Africa boundary and looked to see if a) the error in matching increased as noise increased and b) just how robust our matching algorithm was. Since the user specifies an error threshold, we decided to use 100 as the maximum allowable error between images for the images to be called the same. We used noise levels of 20, 40, 60, and 80:

Fig.4: Noise=20 Fig.5: Noise=40 Fig.6: Noise=60 Fig.7: Noise=80

As you can see from the images above, as noise is increased, the image becomes more distorted.

The following table illustrates the error found in comparing the original Africa image with Illinois and then the four noise-added images:

         Illinois   Noise20   Noise40   Noise60   Noise80
Africa    0.272      0.155     0.145     0.184     0.254

From this table, we can say that up to an error of 0.26 we will accept an image as looking like Africa. However, once a test image returns a larger error, like Illinois with error = 0.272, we can say that the test image is not the same as the original.

Conclusions:

In summary, we successfully implemented the curvature representation given in [1], and then showed that the decomposition held the relevant boundary information required to recognize the original boundary shape. This method is similar to decomposition of a boundary using Fourier Descriptors in that it uses a hierarchical breakdown of boundary information. In our program, even though a boundary is decomposed, it can be sufficiently reconstructed as long as a well-sampled scale-space image is created.

The next step for this program is to create full scale-space images and use the boundary matching algorithm on them. The difference is that all of the inflection points would be found, giving the most accurate answer as to whether two boundaries are the same or not. Still, it would not be perfect: since sigma is a continuous parameter, only a finite number of values can ever be sampled, and interpolation would be required at some level. If this program were to be used in a display for people to look at, perhaps all that matters is that the resolution between inflection points in a scale-space image is fine enough that a person would not notice quantization effects.