MORPHOMETRIC ANALYSIS IN EVOLUTIONARY MORPHOLOGY USING GEODESICS BETWEEN CONTINUOUS CURVES

INTRODUCTION

The quantitative study of the form of biological structures has been pervasive in evolutionary biology since the early twentieth century (Thompson, 1917), becoming particularly important after the early 1990s (Adams et al., 2004). Thus, a wide variety of morphometric techniques have been applied to questions involving the understanding of shape variation in research areas such as comparative anatomy (Harvati, 2003; Marugán-Lobón and Buscalioni, 2003; 2009), taxonomy (Dodson, 1975; Chapman, 1990; Naylor and Marcus, 1994), ontogeny (Dodson, 1976; Fink and Zelditch, 1995; Rohlf, 1998), phylogenetic inference (Zelditch et al., 1995; 2000; MacLeod, 2001), and functional morphology (Andrews, 1974; Nickel et al., 2006; Kulemeyer et al., 2009).

Two major categories of methodologies were developed: traditional and geometric morphometrics (Adams et al., 2004). Traditional morphometrics consists of multivariate statistical analysis of morphological variables, the values of which are taken from linear measurements, angles, areas, and ratios (Blackith and Reyment, 1971; Marcus, 1990). The second set of techniques, geometric morphometrics (Corti, 1993), includes quantification of the geometry of structures by discrete sampling of their contours; this is accomplished by identifying homologous points known as landmarks (Bookstein et al., 1985; Small, 1996; Dryden and Mardia, 1998). However, landmark-based methods depend heavily on the chosen homologous points. Likewise, in some biological structures it may not be possible to locate such landmarks. This shortcoming was addressed in the sliding semi-landmark method, where a set of sample points are allowed to slide along a given outline while minimizing a criterion such as bending enery or Procrustes distance but restricted to retain their relative positions (Bookstein, 1997; Slice, 2007). In another subset of techniques, outline-based morphometrics, geometric contours are often represented using Fourier Descriptors (Persoon and Fu, 1986; Rohlf 1986; 1990). However, these mathematical functions ignore the non-linear geometry of the shape space of closed curves. While such representations can be used to classify shapes, it is not possible to perform intrinsic statistical analysis with them. Another statistical approach, Active Shape Models (Cootes et al., 1995), uses principal component analysis of landmarks to model shape variability. Despite its simplicity and efficacy, this approach is rather limited because it ignores the non-linear geometry of shape space.

Here, we show the applicability of a recently developed landmark-free method of geometric morphometrics to the study of shape variation in evolutionary biology. We begin by describing the rationale of the method. Then, we demonstrate its efficacy by replicating a well-known morphometric study concerned with the shape variation within a species of planktonic foraminifera, comparing the results of the analysis using the new method with those from previous work. Finally, we implement the new methodology to search for patterns of variation in the pubis of hadrosaurid dinosaurs.

GEODESIC DISTANCE ANALYSIS

We propose the implementation of a recently developed method of geometric morphometrics to the study of the shape in biological structures. This approach was developed by Klassen et al. (2004), who originally referred to it as “analysis of planar shapes using geodesic paths”. However, because other methods of geometric morphometrics deal also with planar shapes, we preferred a more specific denomination and, thus, we here refer to this method as Geodesic Distance Analysis or GDA.

The basic idea in GDA is that dissimilarities between shapes are quantified using geodesic lengths within a Riemannian (non-Euclidean) shape space. The term “geodesic” refers to the shortest path between two points in a curved space. GDA considers the shapes of planar and closed continuous curves in R2, without any need for landmarks, diffeomorphisms, or level sets to model shape variations. The objects of study, such as, for example, fossil bones, are represented by the continuous curves of their boundaries. These contours, or planar curves, are then parameterized by the arc length. The angle, made by the velocity vector with the positive x-axis, is defined as a function of arc length (Srivastava et al., 2006). Thus, angle functions are chosen to represent and analyze shapes.

In GDA, shapes are invariant to rigid motions, such as translation and rotation, and also to uniform scaling. Scale variations are removed by forcing all curves to be of length 2π. Variations due to translation are already removed because the angle function θ is invariant to curve translations in R2. Variations due to rotation are eliminated by imposing a restriction to θ Є { θ 0 + L2} such that, 1/2π θ (s)ds = π. Closed curves are also invariant to the placement of origin (or starting point) of the parameterization.

Once the curves of interest have been transformed into angle functions, geodesic paths (or geodesics) between shapes are constructed. Klassen et al. (2004) approximate geodesics on the shape space by successively drawing infinitesimal line segments in L2(WHAT IS L2?) and projecting them onto such space (FIG XX). For any two shapes S1 and S2 (the target shape) belonging to the shape space, we search for a tangent direction t at S1, such that the geodesic in that direction reaches the target shape in unit time (WHAT IS UNIT TIME?). This search is performed by minimizing the chord length (i.e., L2 distance between the two shapes) using a gradient process (WHAT IS A GRADIENT PROCESS?). In other words, a geodesic between two shapes is the path that minimizes the energy to bend one shape into the other. Given an observed shape, GDA is an efficient tool to decide which family of shapes a given shape belongs to does it belong to.

A further refinement of GDA is the development of an algorithm that allows to model shapes of planar contours on elastic curves (Mio et al., 2007). In eleastic GDA a metric is used to capture the elastic properties of the curves. Geodesics in the Riemannian shape space are still used to quantify shape dissimilarities. While the original approach described in Klassen et al. (2004) modeled shapes by “bending-only”, in the approach of Mio et al. (2007) elastic curves along the geodesic paths are obtained by bending, stretching or compressing segments non-uniformly along their lengths. An advantage of elastic GDA is its capability of working with open curves as well as with closed ones.

For a given collection of shapes, GDA allows for the computation of summary statistics, such as mean and covariance. For example, the mean shape of a group of shapes θ1,..., θ2 contained in the space S and geodesic distance d(θi, θj) between θi and θj, is given by the Karcher mean (μ). This mean is defined as the element μ Є S that minimizes ∑i=1 to n d(θi, θj)2. GDA offers an environment for the performance of shape statistics that are intrinsic to the space of planar continuous curves.

In practice, we perform GDA analysis as follows. We beging by tracing the contour of the subject shapes using a Wacom Graphire graphic tablet with Illustrator CS2. The drawing of the closed outlines were saved as vectors and imported into Matlab version 6.5 or R version 2.9.0, which are the software packages we use to run the GDA algorithms. The GDA output is a matrix of pairwise geodesic distances among all the curves for a given collection of shapes. The GDA distances may be subsequently analyzed by ordination processes, such as multiple dimensional scaling, to give two- and three-dimensional portrayals of shape variation.

GDA TEST CASE: MORPHOMETRIC ANALYSIS OF GLOBOROTALIATRUNCATULINOIDES

Lohmann (1983) and Lohmann and Schweitzer (1990) used a data set of two dimensional outlines of oriented specimens of the planktonic foraminifera Globorotalia truncatulinoides to introduce a new method of landmark-free shape analysis: Eigenshape Analysis (ESA). [Using this analytical technique the authors were able to reduce the subtle changes in shape to correlations with a limited number of principal components-like axes. The process begins by digitizing each two dimensional outline with a number of equally-spaced Cartesian points. For each shape the vector of points is converted into a vector of tangents using the method of Zahn and Roskies (1972; see also Bookstein, 1978), and a singular value decomposition of the matrix of shape correlations or covariances yields eigenvectors and eigenvalues, each “accounting” for a decreasing portion of the original variability among the shapes. Individual shapes are projected onto the eigenshape axes and the shape scores utilized as quantitative measures of morphologic difference. ] omit?

Lohmann (1983) and MacLeod (1999) illustrate the results of applying ESA to a set of 26 digitized outlines of G. truncatulinoides (reproduced in Figure ?). The original shapes are projected onto the first two Eigenshape axes, which together account for a total of 73.59% of the original “correlation” between the tangent functions defining the shape outlines (a three axis solution accounts for 76.84%). Also plotted is the hypothetical shape defining the positive end of the first eigenshape axis. Lohmann (1983) and Lohmann and Schweitzer (1990) demonstrate how the individual sample scores and the idealized first eigenshape of stratigraphic samples can be used to illustrate patterns of climatic/stratigraphic related morphologic change in the species.

Subsequent treatments of the technique (e.g. Full and Ehrlich, 1986; Rohlf, 1986; Bookstein, 1990,1991; MacLeod ,1999; MacLeod and Rose, 1993; Lestrel, 2000) expanded and refined

the methodology as well as highlighting some of the inherent problems. Programs and copies of the G. truncatulinoides data are available from PaleoNet at

In order to provide a basis for comparison of GDA with the better-known ESA, we applied GDA to the same G. truncatuloides data used by Lohmann (1983). Since GDA yields a matrix of distances between the shapes, some other technique must be employed to generate a comparable limited-dimensional description of the shape space. We have used non-metric Multidimensional Scaling (nmMDS), an ordination technique that finds a non-parametric monotonic to relate the differences between shapes in a dissimilarity (distance) matrix to placement of those shapes in a (user-specified) limited-dimensional space (Shepard, 1962; Kruskal and Wish, 1977; Young and Hamer, 1994). Similar to ESA, the “importance” of each axis can be quantified by the correlation between inter-shape distances, as projected onto the limited-dimensional space, versus the original matrix of distances provided by GDA. For comparison to Lohmann’s ESA, we have chosen a two dimensional solution, as shown in Figure ?? . The nmMDs analysis was performed using routine “isoMDS” in package “MASS” of program “R”, version 2.9.0 (R Development Core Team, 2009). Ordination of the shapes onto these two axes accounts for 65.42% of the original distance information present in the GDA results (three axis solution accounts for 77.88%).

Both methods produce ordinations of the shapes that make intuitive sense. In Figure ? forms with an open umbilicus are predominantly on the right while those with a more closed umbilicus are on the left (this distinction is better displayed in Figure 4 of Lohmann, 1983, than this figure which more closely duplicates the analysis of MacLeod, 1999). Similarly, in Figure ??, open umbilicus forms are toward the top and closed toward the bottom. Note also that in Figure ?? compressed and high conical shapes appear better separated than in Figure ?, with compressed, cold-water forms toward the lower left, transitioning through the average shapes in the central to upper right, and ending with the open, inflated forms in the top.

One aspect common to both methods (and largely neglected by previous authors) is a treatment of how well the analytical results describe the individual samples in the data set. This is available in ESA as the sum of squares of the sample loadings for the axes used. In GDA with nmMDS, similar information can be produced as the squared correlations of each sample’s ordinated distances with its distances in the original matrix. Both can be viewed as the percentage of each individual shape’s variation explained. As can bee seen in Table ?, both ESA and GDA-nmMDS analyses have considerable variation in how well the individual shapes are projected onto the first two axes.

Shape number / ES loadings squared / GDA-nmMDS correlation squared
1 / 0.75 / 0.82
2 / 0.68 / 0.81
3 / 0.82 / 0.90
4 / 0.66 / 0.81
5 / 0.81 / 0.40
6 / 0.85 / 0.90
7 / 0.79 / 0.79
8 / 0.67 / 0.77
9 / 0.76 / 0.79
10 / 0.66 / 0.80
11 / 0.75 / 0.69
12 / 0.74 / 0.75
13 / 0.64 / 0.65
Shape number / ES loadings squared / GDA-nmMDS correlation squared
14 / 0.72 / 0.75
15 / 0.58 / 0.88
16 / 0.74 / 0.77
17 / 0.77 / 0.68
18 / 0.75 / 0.68
19 / 0.72 / 0.79
20 / 0.77 / 0.74
21 / 0.58 / 0.74
22 / 0.84 / 0.83
23 / 0.84 / 0.79
24 / 0.71 / 0.81
25 / 0.70 / 0.83
26 / 0.63 / 0.79

In order to qualify the differences between the methods, we have generated some simple idealized shapes (shown in Figures ??? and ????) which have similar features at varying locations around the periphery of the shape with an obvious sequence of variation from shape B1 to B6. Two shapes are inverted or rotated versions of others (indicated with “u”) to determine how each method deals with changing the origin of the digitized sequence. As can be seen in the accompanying plots, GDA-nmMDS produces a more satisfying 2 dimensional portrayal of the shape variability than ESA, with a recognizable gradient of shape change between B1 and B6. The gradient of shape variation between the extremes, rather than being linear, is a 3-dimensional helix (or strongly curved horseshoe, similar to “Kendall’s horseshoe”, in 2 dimensions), reflecting the difficulties of portraying a non-metric shape index in metric space. However, even with the inherent multidimensional distortion, GDA better replicates the trend than ESA. The underlying reason for this is that ESA is a essentially principal components analysis of vectors of tangent transforms in which each equally-spaced ordinal position on the perimeter is considered “equivalent” (“homologous”?) to the same ordinal position in all other shapes. Features which vary significantly in their placement around the periphery of a shape are considered different features. In GDA, the elastic geodesic allows similar but variably located features to be considered “equivalent” through “distortion.”Check this last bit with Joshi, it makes sense to me but may not be correct.

GDA APPLICATION: PATTERNS OF VARIATION IN THE PUBIS OF HADROSAURID DINOSAURS

DISCUSSION AND CONCLUSION

ACKNOWLEDGEMENTS

REFERENCES

AdamsDC, Rohlf FJ, SiceDE. 2004. Geometric morphometrics: ten years of progress following the ‘revolution’. Ital J Zool 71:5-16.

Andrews HE. 1974. Morphometrics and functional morphology of Turritella mortoni. J Paleont 48:1126-1140.

Blackith R, Reyment RA. 1971. Multivariate morphometrics. New York: Academic Press. 412 p

Bookstein, FL,1978. The measurement of biological shape and change.Springer-Verlag, Berlin and New York. 191 p.

Bookstein, FL. 1990. Analytic Methods: Introduction and Overview. Pp. 61–74 inRohlf, F. J. and Bookstein, F. L., eds. Proceedings of the Michigan MorphometricsWorkshop. The University of MichiganMuseum of Zoology, Special Publication2, Ann Arbor, MI.

Bookstein, FL. 1991. Morphometric Tools for Landmark Data: Geometry andBiology. CambridgeUniversity Press. Cambridge.

Bookstein FL. 1997. Landmark methods for forms without landmarks: localizing group differences in outline shape. Med Image Anal 1:225-243.

BooksteinFL, Chernoff B, Elder RL, Humphries JMJ, Smith GR, Strauss RE. 1985. Morphometrics in evolutionary biology: the geometry of size and shape change, with examples form fishes. Philadelphia: Academy of Natural Sciences of Philadelphia, Special Publication. 277 p

Chapman RE. 1990. Shape analysis in the study of dinosaur morphology. In: Carpenter K, Currie, PJ, editors. Dinosaur systematics: approches and perspetives. Cambridge: CambridgeUniversity Press. p 163-178.

Cootes TF, Taylor CJ, Cooper DH, Graham J. 1995. Active shape models – their training and application. Computer Vision and Image Understanding 61:38-59.

Corti M. 1993. Geometric morphometrics: an extension of the revolution. Trends Ecol Evol 8:302-303.

Dodson P. 1975. Taxonomic implications of relative growth in lambeosaurine hadrosaurs. Syst Zool 24:37-54.

Dodson P. 1976. Quantitative aspects of relative growth and sexual dimorphism in Protoceratops. J Paleont 50:929-940.

Dryden IL, Mardia KV. 1998. Statistical shape analysis. New York: John Wiley. 347 p

Fink WL, Zelditch ML. 1995. Phylogenetic analysis of ontogenetic shape transformations: a reasssessment of the piranha genus Pygocentrus (Teleostei). Syst Biol 44:343-360.

Full WE, Ehrlich R. 1986. Fundamental problems associated with “Eigenshape Analysis” and similar “Factor” Analysis procedures. Mathematical Geology 18(5):451-463.

Harvati K. 2003. Quantitative analysis of Neanderthal temporal bone morphology using three-dimensional geometric morphometrics. Am J Phys Anthropol 120:323-338.

Klassen E, Srivastava A, Mio W, Joshi SH. 2004. Analysis of planar shapes using geodesic paths on shape spaces. IEEE Trans Pattern Anal Machine Intell 26:372-383.

Kruskal JB,Wish M. 1977. Multidimensional Scaling. Sage University Publication series on Quantitative Applications in the Social Sciences.Series number 07-011. BeverleyHills and London:Sage Publications.

Kulemeyer C, Asbahr K, Gunz P, Frahnert S, Bairlein F. 2009. Functional morphology and integration of corvid skulls – a 3D geometric morphometric approach. Front Zool 6:2.

LestrelPE. 2000. Morphometrics for the Life Sciences. In Recent Advances in Human Biology, C. Oxnard, Series Ed. Singapore: World Scientific. 261pp.

Lohmann, GP. 1983. Eigenshape analysis of microfossils: A general morphometricmethod for describing changes in shape. Mathematical Geology 15:659-672.

Lohmann, G P.,Schweitzer, PN. 1990. On eigenshape analysis. In Rohlf, FJ.,Bookstein, FL., eds. Proceedings of the Michigan MorphometricsWorkshop. The University of MichiganMuseum of Zoology, Special PublicationNo. 2, Ann Arbor. p. 145-166

MacLeod N. 1999. Generalizing and extending the eigenshapemethod of shape space visualization and analysis. Paleobiology 25(1):107-138.

MacLeod N. 2001. Landmarks, localization, and the use of morphometrics in phylogenetic analysis. In: Edgecombe G, Adrain J, Lieberman B, editors. Fossils, phylogeny, and form: an analytical approach. New York: Kluwer Academic/Plenum. p 197-233.

MacLeod N, Rose KD. 1993. Inferring locomotor behavior in Paleogenemammals via eigenshape analysis. American Journal of Science 293-A:300-355.

McLellan T, Endler JA. 1998. The relative success of some methods for measuring and describing the shape of complex objects. Syst Biol 47:264-281.

Marcus LF. 1990. Traditional morphometrics. In: Rohlf FJ, BooksteinFL, editors. Proceedings of the Michigan morphometrics workshops. Ann Arbor: University of Michigan

Marugán-Lobón J, Buscalioni AD. 2003. Disparity and geometry of the skull in Archosauria (Reptilia: Diapsida). Biol J Linn Soc 80:67-88.

Marugán-Lobón J, Buscalioni AD. 2009. New insight on the anatomy and architecture of the avian neurocranium. Anat Rec 292:364-370.

Mio W, Srivastava A, Joshi SH. 2007. On shape of plane elastic curves. 2006. Int J Comp Vision 73:307-324.

Naylor GJP, Marcus LF. 1994. Identifying isolated shark teeth of the genus Carcharhinus to species: relevance for tracking phyletic change through the fossil record. Am Mus Novit 3109:1-53.

Nickel M, Bullinger E, Beckmann F. 2006. Functional morphology of Tethya species (Porifera): 2. Three-dimensional morphometrics on spicules and skeleton superstructures of T.minuta. Zoomorph 125:225-239.

Persoon E, Fu K. 1986. Shape discrimination using Fourier descriptors. IEEE Trans Pattern Anal Machine Intell 8:388-397.