Digital image warping methods

1. Introduction

This paper surveys digital image warping methods and presents geometric transformations: rotation, scaling, translation, shear, warping, affine mapping, bilinear mapping, projective mapping, similarity, and mosaic image warping.

We are interested in modeling and image warping, in its applications for recovering a 3-D image from a 2-D image, and in the most basic mappings: affine, bilinear and projective. Digital image warping is a growing branch of image processing that deals with geometric transformation techniques. Early interest in this area dates back to the mid-1960s, when it was introduced for geometric correction applications in remote sensing. Since that time it has experienced vigorous growth, finding uses in such fields as medical imaging, computer vision, and computer graphics. Although image warping has traditionally been dominated by results from the remote sensing community, it has recently enjoyed a new surge of interest from the computer graphics field. This is largely due to the growing availability of advanced graphics workstations and increasingly powerful computers that make warping a viable tool for image synthesis and special effects. Work in this area has already led to successful market products, such as real-time video effects generators for the television industry and cost-effective warping hardware for geometric correction. Current trends indicate that this area will have a growing impact on desktop video, a new technology that promises to revolutionize the video production market in much the same way as desktop publishing has altered the way in which people prepare documents.

Digital image warping has benefited greatly from several fields, ranging from early work in remote sensing to recent developments in computer graphics. The scope of these contributions, however, often varies widely owing to different operating conditions and assumptions.

morphing = (warping × 2) + blending

The equation above refers to the fact that morphing is a two-stage process which couples image warping with color interpolation. As the morphing proceeds, the first image (source) is gradually warped towards the second image (target) while fading out. At the same time, the second image starts warping towards the first image and is faded in. Thus, the early images in the sequence are much like the first image. The middle image of the sequence is the average of the first image distorted halfway towards the second one and the second image distorted halfway back towards the first one. The last images in the sequence are similar to the second one. The whole process therefore consists of warping two images so that they have the same "shape" and then cross-dissolving the resulting images.
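As a concrete illustration, here is a minimal sketch of computing one morph frame, assuming both images are same-size NumPy float arrays and that warp_toward is a hypothetical user-supplied warping routine (e.g., driven by corresponding feature points); it is a sketch of the two-stage scheme, not a full morphing implementation:

```python
import numpy as np

def morph_frame(src, dst, warp_toward, t):
    """One morph frame at time t in [0, 1].

    src, dst:     same-size float arrays (H x W or H x W x 3).
    warp_toward:  hypothetical user-supplied function that warps the
                  first image a fraction t of the way toward the
                  second image's shape.
    """
    # Stage 1 (warping): distort each image partway toward the other.
    src_warped = warp_toward(src, dst, t)        # source warped t toward target
    dst_warped = warp_toward(dst, src, 1.0 - t)  # target warped (1 - t) back toward source
    # Stage 2 (blending): cross-dissolve the two warped images.
    return (1.0 - t) * src_warped + t * dst_warped
```

At t = 0 the frame equals the source, at t = 1 the target, and at t = 0.5 it is the average of the two half-warped images, matching the description above.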

Geometric transformations

Geometric transformations permit elimination of the geometric distortion that occurs when an image is captured. Geometric distortion may arise because of the lens or because of the irregular movement of the sensor during image capture.

Geometric transformation processing is also essential in situations where there are distortions inherent in the imaging process, such as remote sensing from aircraft or spacecraft. One example is an attempt to match remotely sensed images of the same area taken one year apart, when the more recent image was probably not taken from precisely the same position. To inspect changes over the year, it is necessary first to execute a geometric transformation and then subtract one image from the other. We might also need to register two or more images of the same scene, obtained from different viewpoints or acquired with different instruments. Image registration matches up the features that are common to two or more images.

A geometric transform consists of two basic steps:

  1. Pixel coordinate transformation, which maps the coordinates of the input image pixel to a point in the output image. The output point coordinates should be computed as continuous values (real numbers), since the position does not necessarily match the digital grid after the transform.
  2. Finding the point in the digital raster which matches the transformed point and determining its brightness value; the brightness is usually computed as an interpolation of the brightnesses of several points in the neighborhood (a sketch of both steps follows this list).
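In practice the transform is usually applied in reverse: we scan the output grid and use the inverse coordinate mapping to find where each output pixel comes from (step 1), then interpolate the brightness from the four nearest source pixels (step 2). A minimal sketch, assuming a grayscale NumPy image and a hypothetical inv_map function implementing the inverse coordinate transformation:

```python
import numpy as np

def warp_inverse(src, inv_map, out_shape):
    """Warp `src` by scanning the output grid, pulling each pixel's
    source position via the inverse mapping (step 1), and bilinearly
    interpolating the brightness from the four neighbors (step 2).

    inv_map: hypothetical function (x, y) -> (u, v) giving, for each
             output pixel, its real-valued position in the source.
    """
    H, W = out_shape
    out = np.zeros((H, W), dtype=float)
    for y in range(H):
        for x in range(W):
            u, v = inv_map(x, y)                      # continuous source coords
            u0, v0 = int(np.floor(u)), int(np.floor(v))
            if 0 <= u0 < src.shape[1] - 1 and 0 <= v0 < src.shape[0] - 1:
                a, b = u - u0, v - v0                 # fractional offsets
                out[y, x] = ((1-a)*(1-b)*src[v0, u0]   + a*(1-b)*src[v0, u0+1]
                           + (1-a)*b    *src[v0+1, u0] + a*b    *src[v0+1, u0+1])
    return out
```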

Definitions:

–Object space: the coordinate system in which each component is defined.

–World space: all components put together via affine transformations (camera and lighting are defined in this space).

–Eye space: the camera is at the origin and the view direction coincides with the z axis; the hither and yon planes are perpendicular to the z axis.

–Clipping space: all points are in homogeneous coordinates; perspective division gets everything into 3-D image space.

Warping

Image warping is the act of distorting a source image into a destination image according to a mapping between source space (u, v) and destination space (x, y). The mapping is usually specified by the functions x(u, v) and y(u, v). (We use the term "warp" instead of its synonyms "distortion" and "transformation" because of its conciseness and specificity: "warp" specifically suggests a mapping of the domain of an image, while "transformation" can mean a mapping of the image range as well.)

The basic idea behind the warping technique is to transform a quadrilateral to a rectangle. A quadrilateral is a four-cornered region bounded by straight lines. Transforming a quadrilateral to a rectangle warps the objects inside the quadrilateral.

There are two ways to warp. The first is control point warping, illustrated in Figure 1. Divide a section of an image into four smaller squares 1, 2, 3 and 4. Pick a control point anywhere inside the square section. This control point divides the square section into four quadrilaterals, as shown in the top part of Figure 1; the control point dictates the warping.

Fig. 1. The Control Point Warping Process

A second form of warping is what I call object warping. Instead of picking a control point inside the image array, the user picks the four corners of a quadrilateral, as shown in Figure 2. Object warping transforms this quadrilateral to a square. The four corners of the quadrilateral can be almost anywhere inside or outside the square; allowing corners outside the square is a capability that control point warping does not have. Object warping is similar to, but simpler than, control point warping: control point warping transforms four quadrilaterals to four squares inside an image array, whereas object warping transforms one quadrilateral to one square, as sketched after Figure 2.

Fig. 2. The Object Warping Process
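A minimal sketch of object warping under stated assumptions: a grayscale NumPy image, corners given in top-left, top-right, bottom-right, bottom-left order, and nearest-neighbor sampling for brevity (bilinear brightness interpolation could be substituted). The inverse map from the unit square to the quadrilateral is the bilinear blend of the four corners:

```python
import numpy as np

def quad_to_square(src, corners, size):
    """Object-warping sketch: resample the quadrilateral `corners`
    (four (x, y) points in TL, TR, BR, BL order) into a
    size x size square image (size >= 2).
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
    out = np.zeros((size, size), dtype=float)
    for j in range(size):
        for i in range(size):
            a, b = i / (size - 1), j / (size - 1)   # unit-square coords
            # Bilinear blend of the four corners gives the source position.
            u = (1-a)*(1-b)*x0 + a*(1-b)*x1 + a*b*x2 + (1-a)*b*x3
            v = (1-a)*(1-b)*y0 + a*(1-b)*y1 + a*b*y2 + (1-a)*b*y3
            ui = min(max(int(round(u)), 0), src.shape[1] - 1)
            vi = min(max(int(round(v)), 0), src.shape[0] - 1)
            out[j, i] = src[vi, ui]                 # pull nearest source pixel
    return out
```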

Image warping is a transformation which maps all positions in one image plane to positions in a second plane. Warping of images is an important stage in many applications of image analysis. Image warping is used in image processing primarily for the correction of geometric distortions introduced by imperfect imaging systems. It may be needed to remove optical distortions introduced by a camera or viewing perspective (Tang and Suen, 1993; Heikkila and Silven, 1997), to register an image with a reference grid such as a map, or to align two or more images (Brown, 1992). For example, matching is important in reconstructing three-dimensional shape from either a series of two-dimensional sections or stereoscopic pairs of images. Camera lenses sometimes introduce pincushion or barrel distortions, perspective views introduce a projective distortion, and other nonlinear optical components can create more complex distortions. Much effort has been expended in developing algorithms for registering satellite images both with geographic information systems and with other forms of remote sensing system, such as optical sensors and synthetic aperture radar. Recently there has been considerable interest in registering images produced by medical sensing systems with body atlas information (Colchester and Hawkes, 1991). Combined images produced using different imaging modalities also have great potential. For example, X-ray images reveal structure, whereas magnetic resonance images reveal functionality, so their synthesis generates more informative images (see, for example, Hurn et al., 1996; Mardia and Little, 1994).

A related problem is the warping of one-dimensional signals to bring them into alignment. This is sometimes referred to as dynamic time warping. Dynamic programming methods have been applied to speech processing (Sakoe and Chiba, 1978), handwriting analysis (Burr, 1983), alignment of boundaries of ice floes (McConnell et al., 1991) and of tracks in electrophoresis gels (Skovgaard et al., 1995). Wang and Gasser (1996) considered theoretical issues. Where features on two curves are already matched, the problem simplifies to one of monotone regression (see, for example, Ramsay, 1988).

A warping is a pair of two-dimensional functions, u(x, y) and v(x, y), which map a position (x, y) in one image, where x denotes column number and y denotes row number, to position (u, v) in another image (see Figure 3).

There have been many approaches to finding an appropriate warp, but a common theme is the compromise between insisting that the distortion is smooth and achieving a good match. In some recently published cases the warp seems unnecessarily rough (see, for example, Conradsen and Pedersen, 1992; Grenander and Miller, 1994). In image processing, we do image warping typically to remove distortion from an image, while in computer graphics we are usually introducing one. Image warps are also used for artistic purposes and special effects in interactive paint programs. For image processing applications, the mapping may be derived given a model of the geometric distortions of a system, but more typically the mapping is inferred from a set of corresponding points in the source and destination images. The point correspondence can be automatic, as in stereo matching, or manual, as in paint programs. Most geometric correction systems support a limited set of mapping types, such as piecewise affine, bilinear, biquadratic, or bicubic mappings.


We have discussed the most basic mappings: affine, bilinear and projective, together with mosaic warping and similarity. Affine mappings are very simple and efficient and may be constructed from any three-point correspondence. The two generalizations of affine mappings, bilinear and projective mappings, can be constructed from any four-point correspondence. When generality beyond an affine mapping is needed, the projective mapping is preferable in most respects. Geometrically, the projective mapping is line preserving. Computationally, it is closed under composition and inversion, so it adapts easily to any scanning order.
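For instance, the six affine coefficients follow from three point correspondences by solving a small linear system; a sketch in NumPy, assuming the three source points are not collinear:

```python
import numpy as np

def affine_from_points(src_pts, dst_pts):
    """Solve for the affine map [x, y]^T = A @ [u, v]^T + t from three
    point correspondences (six linear equations, six unknowns).

    src_pts, dst_pts: arrays of shape (3, 2).
    Raises numpy.linalg.LinAlgError if the source points are collinear.
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    # Each correspondence contributes one row [u, v, 1] per output coordinate.
    M = np.hstack([src, np.ones((3, 1))])   # 3 x 3 system matrix
    C = np.linalg.solve(M, dst)             # 3 x 2 coefficient matrix
    A, t = C[:2].T, C[2]                    # 2 x 2 linear part, 2-vector translation
    return A, t
```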

Image mosaicking is the process of seamlessly stitching together or blending a set of overlapping images of a scene into one large image. This process is needed in various remote sensing, computer vision, and computer graphics applications. It is used in map building by piecing together georectified images; in object tracking by subtracting registered images and tracking image changes; in the creation of panoramic images by integrating a sequence of overlapping images; and in 3-D scene reconstruction by integrating multiple-view range images.

Image mosaicking is the process of finding a global transformation that resamples a set of images of a scene into a common coordinate system. Depending on the scene content, the distance of the camera to the scene, and the position and orientation of the camera capturing the images, different transformations may be needed to register and integrate the images. For instance, if a sequence of images is obtained by a camera with a fixed lens center and a horizontal optical axis while rotating the camera about a vertical axis passing through the lens center, and assuming the camera is sufficiently far from the scene, adjacent images in the sequence will have only translational differences. Such images can be registered by shifting one over the other and locating the position where maximum similarity is obtained between the images. If the camera is not very far from the scene, the images obtained during the rotation of the camera can be mapped to a cylindrical surface whose axis is parallel to the axis of rotation. The mapped images will then have only translational differences and can be registered easily. If the camera is not fixed and is not far from the scene, but the scene is flat, images of the scene can be registered by the projective transformation. This requires a minimum of four corresponding points in the images. If the scene is not flat and the camera is not very far from the scene, the images will have local geometric differences, requiring a nonlinear transformation to register them.

If the camera parameters in each image acquisition are known, the camera models may be used to relate the captured images and align them. Often, however, either the camera parameters are not known or the provided parameters are not very accurate. Therefore, it is required to find, from information within the images, a transformation that seamlessly combines the images.
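As an illustration of the four-point case, here is a sketch of the standard direct linear transform (DLT) for estimating a projective transformation (homography) from corresponding points, using NumPy; this is one common estimation method, not the only one:

```python
import numpy as np

def homography_from_points(src_pts, dst_pts):
    """Estimate the 3x3 projective transformation (homography) mapping
    src_pts to dst_pts from N >= 4 point correspondences via the DLT.

    src_pts, dst_pts: arrays of shape (N, 2).
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    rows = []
    for (u, v), (x, y) in zip(src, dst):
        # Two linear constraints per correspondence on the 9 entries of H.
        rows.append([u, v, 1, 0, 0, 0, -x*u, -x*v, -x])
        rows.append([0, 0, 0, u, v, 1, -y*u, -y*v, -y])
    A = np.asarray(rows)
    # H is the null vector of A: the right singular vector with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]   # normalize so the bottom-right entry is 1
```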

- Similarity transformation

A transformation that changes the distance between points by a fixed factor is called a similarity transformation or a dilation. The fixed factor of the change is called the scale factor.

Each similarity transformation requires an initial set, a scale factor k (a real number), and a point of dilation P. A single similarity transformation will be denoted as follows:

S(k, P)(initial set).
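In coordinates, the dilation with scale factor k about the point P acts on each point X of the initial set as

$$S(k, P)(X) = P + k\,(X - P),$$

so, for example, S(2, (0, 0)) maps the point (1, 3) to (2, 6).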

An affine transformation x ↦ Ax + b is called a similarity transformation if A is a non-zero scalar multiple of an orthogonal matrix. A similarity transformation is always invertible, and its inverse is also a similarity transformation. Two subsets that are mapped to one another by similarity transformations are called similar.

In general, the parameters of coordinate transformations are determined from the coordinates of sets of points that belong to the realization of both coordinate reference systems. These parameters are approximate, and their validity may be restricted to a particular region. The most widely used coordinate transformations are similarity transformations, in which the two coordinate reference systems differ only by their position and orientation in space and by their scale. The similarity transformation is conformal. It can be performed on Cartesian coordinates as well as on ellipsoidal coordinates. Although some similarity transformations from source datum S to target datum T use only three parameters (T1, T2, T3), the generic seven-parameter formula takes the following form:
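$$
\begin{pmatrix} X \\ Y \\ Z \end{pmatrix}_{T}
=
\begin{pmatrix} T_1 \\ T_2 \\ T_3 \end{pmatrix}
+ (1 + D)
\begin{pmatrix} 1 & -R_3 & R_2 \\ R_3 & 1 & -R_1 \\ -R_2 & R_1 & 1 \end{pmatrix}
\begin{pmatrix} X \\ Y \\ Z \end{pmatrix}_{S}
$$

(This is the standard Bursa-Wolf form of the seven-parameter Helmert transformation, with the rotation matrix written in the small-angle approximation; sign conventions for R1, R2, R3 vary between geodetic authorities.)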

As the coordinate transformation equation is the result of an approximation of a strict formula, the rotations R1, R2, R3 must be small.

The values of the three translations T1, T2, T3 along the coordinate axes x, y, z, the three rotations R1, R2, R3, and the scale correction D are established by minimizing the residuals between the coordinates of identical points in the coordinate reference systems (S) and (T). Identical points means identical in space and time. Within the similarity transformation, movements of points are neglected; these movements are reflected in the residual deviations after the adjustment.

- Similarity measures

Various similarity measures have been used in template matching: the sum of absolute differences, the cross-correlation coefficient, moments, Fourier transform coefficients, Mellin transform coefficients, Haar transform coefficients, Walsh-Hadamard transform coefficients, the Kolmogorov-Smirnov (K-S) test and, most recently, mutual information.

Similarity using raw image intensities includes the sum of absolute differences and the cross-correlation coefficient. The sum of absolute differences is the Minkowski metric of order one [90] and is defined by

$$D[x, y] = \sum_{i=1}^{m} \sum_{j=1}^{n} \bigl| T[i, j] - W_{xy}[i, j] \bigr|$$

where the template T is of size m × n pixels, the image is of size M × N pixels, and W_{xy} is the window in the sensed image being matched with the template. Position [x, y] shows the position of the template in the sensed image. By changing x and y, the template is shifted in the image, and at each shift position the sum of absolute differences between corresponding pixels in the template and the window is determined. This metric actually measures the dissimilarity between the template and the window: the smaller D is, the more similar the template and the window are. A study carried out by Svedlow et al. found that, instead of raw image intensities, gradients of intensity produce a more reliable matching.
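A brute-force sketch of template matching with this measure (NumPy, grayscale arrays; an exhaustive search, without the speed-ups discussed below):

```python
import numpy as np

def sad_match(image, template):
    """Exhaustive template matching with the sum of absolute
    differences: slide the m x n template over the M x N image and
    return the shift [x, y] with the smallest dissimilarity D.
    """
    M, N = image.shape
    m, n = template.shape
    T = template.astype(float)
    best, best_pos = np.inf, (0, 0)
    for y in range(M - m + 1):
        for x in range(N - n + 1):
            window = image[y:y+m, x:x+n].astype(float)
            D = np.abs(T - window).sum()
            if D < best:                 # smaller D means more similar
                best, best_pos = D, (x, y)
    return best_pos, best
```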

At each shift position, mn additions and subtractions are needed; therefore, the computational complexity of the sum of absolute differences is mn. A sequential similarity measure proposed by Barnea and Silverman reduces the computation time by stopping the process when the sum reaches a value larger than a previously reached value, since the search is for the smallest sum. To speed up the search, Dewdney used a steepest-descent approach to guide the search for the best-match position without searching the entire image. To speed up the template matching process, Vanderburg and Rosenfeld used a two-stage process where a part of the template was first used to find the candidate match positions and then the entire template was used to find the best-match position among the candidates. These methods increase speed at the cost of increasing the mismatch probability.

The cross-correlation coefficient is a distance metric of order two and is defined by

$$S[x, y] = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} \bigl(T[i, j] - \bar{T}\bigr)\bigl(W_{xy}[i, j] - \bar{W}_{xy}\bigr)}{\sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} \bigl(T[i, j] - \bar{T}\bigr)^{2} \,\sum_{i=1}^{m} \sum_{j=1}^{n} \bigl(W_{xy}[i, j] - \bar{W}_{xy}\bigr)^{2}}}$$

where $\bar{T}$ and $\bar{W}_{xy}$ are the template and window averages, respectively. The more closely the intensities in the window vary with those in the template, the higher the value of S will be. The cross-correlation coefficient varies between −1 and 1; the larger the coefficient, the more similar the template and the window. Computation of the cross-correlation coefficient involves on the order of mn additions and multiplications, but the coefficient of mn is much larger than that for the sum of absolute differences. Note that as [x, y] is changed, the numerator becomes a convolution operation, and the fast Fourier transform may be used to speed up the computation. A speed-up is also achievable by using a two-step search: first a subset of pixels in the template is used to find the candidate match positions, and then the entire template is used to find the best-match position from among the candidates.
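A corresponding sketch for the cross-correlation coefficient (NumPy; the exhaustive spatial search shown here could be accelerated with the FFT as noted above):

```python
import numpy as np

def correlation_match(image, template):
    """Exhaustive template matching with the cross-correlation
    coefficient S in [-1, 1]; larger S means a better match.
    """
    M, N = image.shape
    m, n = template.shape
    T = template.astype(float)
    T -= T.mean()                        # subtract the template average
    T_norm = np.sqrt((T ** 2).sum())
    best, best_pos = -np.inf, (0, 0)
    for y in range(M - m + 1):
        for x in range(N - n + 1):
            W = image[y:y+m, x:x+n].astype(float)
            W -= W.mean()                # subtract the window average
            denom = T_norm * np.sqrt((W ** 2).sum())
            if denom > 0:                # skip constant windows
                S = (T * W).sum() / denom
                if S > best:
                    best, best_pos = S, (x, y)
    return best_pos, best
```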