Evol Biol (2012) 39:536–553
DOI 10.1007/s11692-012-9178-3

SYNTHESIS PAPER

How to Explore Morphological Integration in Human Evolution and Development?

Philipp Mitteroecker • Philipp Gunz • Simon Neubauer • Gerd Müller

P. Mitteroecker (✉) • G. Müller
Department of Theoretical Biology, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria
e-mail: [email protected]

P. Gunz • S. Neubauer
Department of Human Evolution, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany

Received: 6 February 2012 / Accepted: 5 April 2012 / Published online: 28 April 2012
© Springer Science+Business Media, LLC 2012

Abstract Most studies in evolutionary developmental biology focus on large-scale evolutionary processes using experimental or molecular approaches, whereas evolutionary quantitative genetics provides mathematical models of the influence of heritable phenotypic variation on the short-term response to natural selection. Studies of morphological integration typically are situated in-between these two styles of explanation. They are based on the consilience of observed phenotypic covariances with qualitative developmental, functional, or evolutionary models. Here we review different forms of integration along with multiple other sources of phenotypic covariances, such as geometric and spatial dependencies among measurements. We discuss one multivariate method [partial least squares analysis (PLS)] to model phenotypic covariances and demonstrate how it can be applied to study developmental integration using two empirical examples. In the first example we use PLS to study integration between the cranial base and the face in human postnatal development. Because the data are longitudinal, we can model both cross-sectional integration and integration of growth itself, i.e., how cross-sectional variance and covariance are actually generated in the course of ontogeny. We find one factor of developmental integration (connecting facial size and the length of the anterior cranial base) that is highly canalized during postnatal development, leading to decreasing cross-sectional variance and covariance. A second factor (overall cranial length to height ratio) is less canalized and leads to increasing (co)variance. In a second example, we examine the evolutionary significance of these patterns by comparing cranial integration in humans to that in chimpanzees.

Keywords Canalization • Cranial growth • Developmental integration • Modularity • Morphometrics • Partial least squares analysis

Introduction

A central theme in evolutionary developmental biology (EvoDevo) is the influence of the developmental system (the processes by which genotype translates into phenotype) on evolutionary change (e.g., Raff 1996; Wagner 2000; Arthur 2002; Müller 2007). EvoDevo studies often focus on large-scale evolutionary processes, such as the emergence of novel anatomical structures or of entire body plans, and on how development constrains or drives these processes. In parallel, there is a long-standing tradition in evolutionary quantitative genetics to model the influence of heritable phenotypic variation, which largely is determined by the developmental system, on the response to natural selection (e.g., Fisher 1930; Lande 1979; Arnold et al. 2001). While EvoDevo usually aims at qualitative, causal explanations, quantitative genetics provides a set of formal mathematical models. Studies performed under the heading of morphological integration or phenotypic integration typically are situated in-between these two styles of explanation.
Observed phenotypic variances and covariances are interpreted in terms of (qualitative) developmental or functional models, and evolutionary inferences are derived from the observed patterns (e.g., Chernoff and Magwene 1999; Pigliucci and Preston 2004; Mitteroecker and Bookstein 2007; Hallgrimsson et al. 2009). Conclusions mostly are not based on formal models, but on the "consilience" (Wilson 1998; Bookstein in press) of multiple lines of evidence, both quantitative and qualitative.

In the early 20th century, pioneers such as D'Arcy Thompson, Sewall Wright, and Paul Terentjev developed ingenious approaches to study the integration of morphological traits. Thompson (1917) considered inter-species differences of complex anatomical structures as relatively simple (hence structurally integrated) geometric transformations, whereas Terentjev (1931) and Wright (1932) devised hierarchical statistical models to explain phenotypic covariances within a population. In 1958, at a time when most scientists focused on the evolution of single isolated traits, the two paleontologists Everett Olson and Robert Miller published their influential book "Morphological Integration," in which they emphasized developmental and functional dependencies among traits and their resulting coevolution. Olson and Miller's statistical and conceptual approaches, however, were relatively simple, as were those of Berg (1960), who continued Terentjev's work in botany. Based on extensive plant breeding and crossing experiments, Jens Clausen and colleagues (e.g., Clausen and Hiesey 1960) computed an array of phenotypic correlations and interpreted them in a thoughtful genetic context. In the 1980s, Jim Cheverud, Miriam Zelditch, and others (e.g., Cheverud 1982, 1989; Zelditch 1987, 1988; Cheverud et al. 1989) raised renewed interest in morphological integration by applying novel statistical techniques to primate and rodent morphology.
By connecting morphological integration to the emerging concepts of developmental and genetic modularity, it became part of contemporary EvoDevo theory and evolutionary quantitative genetics (e.g., Lande 1980; Bonner 1988; Cheverud 1982, 1996a, b; Raff 1996; Wagner and Altenberg 1996). Advances in geometric morphometrics and multivariate statistics have led to another series of publications on morphological integration in the new millennium (e.g., Rohlf and Corti 2000; Klingenberg and Zaklan 2000; Bookstein et al. 2003; Klingenberg et al. 2003; Hallgrimsson et al. 2004, 2006; Bastir and Rosas 2005; Monteiro et al. 2005; Gunz and Harvati 2007; Mitteroecker and Bookstein 2008). All these approaches share the focus on interdependences between measured traits at various causal and statistical levels, which are interpreted within a developmental, functional, or evolutionary context.

Most work in contemporary EvoDevo is experimental and at the molecular level, whereas empirical quantitative genetic research requires large-scale breeding experiments to reliably estimate genetic variances and covariances. Therefore, studies of morphological integration, which can make use of adult or postnatal individuals and concentrate on phenotypic instead of genetic covariances, are ideal to address EvoDevo questions in anthropology and primatology.

In this paper we aim to place the study of morphological integration in a contemporary biological and biometric context. We describe a multivariate statistical approach to explore patterns of integration and apply it to study morphological integration of cranial growth in humans and chimpanzees. As the line separating an insightful study of morphological integration from an ad hoc story is relatively thin, we start with an outline of the conceptual framework, the biometrics of morphological integration, which determines how statistics and biology must meet in order to arrive at a successful consilience of the two kinds of evidence.
The Biometrics of Morphological Integration

Where do Covariances Come From?

The parts of an organism develop in a coordinated way. Adjacent elements of complex anatomical structures, such as the cranium, physically interact during development to form a tightly integrated adult phenotype. Many growth factors and signaling molecules affect different tissues and body parts, and thus mechanistically link these parts in the course of their development. Likewise, signaling cascades and induction processes interconnect different body parts during development. These processes have been termed developmental integration and referred to as "individual-level integration" by Cheverud (1996a, b); the underlying mechanisms are rooted in individual development and can be studied experimentally (compare also Needham's 1933 approach to "dissociability" in development). Variation of such integrated developmental processes in a sample of different individuals induces a covariance between the phenotypic traits affected by these processes.

A related concept in genetics, exactly a century old, is pleiotropy: the effect of genes (or of mutations of these genes) on multiple traits (e.g., Hodgkin 1998; Wagner and Zhang 2011). Pleiotropy can result from multiple molecular functions of a gene product, from the expression of a gene in multiple tissues, and from the chemical and mechanical integration of developmental processes. Allelic variation of a pleiotropic gene induces covariance between the affected traits. Genotype-phenotype maps are graphical or mathematical representations of the relationship between a set of (pleiotropic) genes and a set of phenotypic traits; they are frequently used in theoretical studies to represent integration due to pleiotropic genes and to compute the induced phenotypic covariances (e.g., Wagner and Altenberg 1996; Stadler and Stadler 2006; Mitteroecker and Bookstein 2007; Pavlicev and Hansen 2011; see also Fig. 1).

Fig. 1 a A path model of a simple genotype-phenotype map, illustrating the linear effects of two local growth factors A and B on the phenotypic traits V1, ..., V6. When both factors have unit variance, the phenotypic covariances are given by the products of the corresponding path coefficients, e.g., Cov(V1, V2) = 0.42 × 0.28 ≈ 0.12. Because of the modular genotype-phenotype map, the two groups of variables V1, ..., V3 and V4, ..., V6 are uncorrelated; they are variational modules. b A genotype-phenotype map with two pleiotropic factors C, D. Phenotypic covariances are given by the sums of the covariances induced by C and D, e.g., Cov(V1, V2) = 0.3 × 0.2 + 0.3 × 0.2 = 0.12. Note that the covariances between V1, ..., V3 and V4, ..., V6 cancel out so that the variables have the same modular covariance structure as in (a), even though the genotype-phenotype map is not modular. c Another simple but slightly more realistic genotype-phenotype map, consisting of one global and two local factors, such as in Wright's 1932 model. Phenotypic covariances reflect the local or modular growth factors only if all path coefficients are approximately equal (Mitteroecker and Bookstein 2007)

A major, often dominating component of phenotypic covariation is related to allometry, the effect of overall size on organismal shape (Huxley 1932; Bookstein 1991; Gould 1977; Klingenberg 1998). Whenever size varies, allometry induces phenotypic covariances (both in ontogenetic samples and in samples of adult specimens). Individual differences in body size are due in large part to differences in the timing of growth and development. In primates, body size is mainly determined by the amount and duration of the expression of growth hormones during postnatal development and by the onset of steroid hormone expression during puberty (e.g., Bogin 1999).
Apart from these highly pleiotropic growth factors, covariances in a population due to allometry do not necessarily reflect developmental integration. Two body parts under completely independent genetic and developmental control would still covary in a population if the amount or duration of overall growth varies (they would be uncorrelated only after statistically controlling for overall size; see below).

In addition to developmental integration and pleiotropy, phenotypic covariances in a population can be due to linkage disequilibrium, the non-random association of alleles at two or more loci affecting different traits. Linkage disequilibrium can result from genetic linkage, the co-inheritance of genes due to their physical proximity on a chromosome. Among other factors such as non-random mating and population structure, linkage disequilibrium can also result from correlational selection. When several anatomical elements are jointly involved in a particular function, their dimensions usually need to fit together tightly (consider, e.g., the bony and cartilaginous elements of a joint such as the knee). This functional integration leads to correlational selection, i.e., selection for particular character combinations, which in turn leads to the covariance of traits within a population, even if they are not linked developmentally. Likewise, a joint function of traits during development can result in prenatal correlational selection (internal selection). However, the contribution of linkage disequilibrium to phenotypic covariances is small compared to that of developmental integration and pleiotropy unless correlational selection is strong and persistent (Lande 1980, 1984; Lynch and Walsh 1998; Sinervo and Svensson 2002).

Following the usual distinction in quantitative genetics between genetic variation and environmental variation, one can contrast genetic integration with environmental integration (Cheverud 1982, 1996a, b).
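The allometry argument made earlier in this section (two independently controlled body parts still covary when overall growth varies, and decorrelate once size is controlled for) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the size factor, the allometric slopes (1.0 and 0.8), and the noise levels are hypothetical values chosen for illustration, and size is treated as directly observed, which is rarely the case with real data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical overall size factor (e.g., log body size); not from the paper.
log_size = rng.normal(0.0, 0.3, n)

# Two parts that respond to size but are otherwise independent.
part1 = 1.0 * log_size + rng.normal(0.0, 0.1, n)
part2 = 0.8 * log_size + rng.normal(0.0, 0.1, n)

def residuals(y, x):
    """Residuals of an ordinary least-squares regression of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# Raw correlation: induced purely by the shared dependence on size.
r_raw = np.corrcoef(part1, part2)[0, 1]

# Correlation after statistically controlling for size.
r_adjusted = np.corrcoef(residuals(part1, log_size),
                         residuals(part2, log_size))[0, 1]

print(f"raw correlation:           {r_raw:.2f}")
print(f"size-adjusted correlation: {r_adjusted:.2f}")
```

In this toy setting the raw correlation is high while the size-adjusted correlation is close to zero, which is the pattern the text describes for parts under independent developmental control.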
Genetic integration is the co-inheritance of traits, resulting from pleiotropy and from linkage disequilibrium. Environmental integration is the integration of phenotypic traits due to environmental, non-heritable influences on development.

Covariances and correlations between traits are not only determined by common developmental and genetic causes, but also by the variance of the underlying growth processes and by the allele frequencies of pleiotropic genes (e.g., Pigliucci 2006; Mitteroecker and Bookstein 2008; Hallgrimsson et al. 2009). If a pleiotropic factor does not vary in a sample, it induces no covariance, even though the traits are developmentally linked. Two species might share the same pleiotropic growth factor (the same developmental integration), but differ in the variance of this factor and hence also in covariance. Reflecting Wagner and Altenberg's (1996) distinction between variation and variability, Hallgrimsson et al. (2009) thus defined integration as the ability to covary, which is determined by the underlying developmental factors; the manifestation as observable phenotypic covariance depends on the variation of these factors in a population.

Phenotypic variances and covariances in samples of adult individuals are the result of variation in a vast array of developmental processes. A covariance between two traits close to zero can result from the absence of any developmental and genetic factors leading to integration (or from the lack of variation in these factors), but two or more pleiotropic factors may also cancel out: some factors inducing a positive covariance and some factors inducing a negative covariance of the same amount would lead to statistically (but not developmentally) independent traits (Fig. 1a, b).
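The cancellation described for Fig. 1a, b can be checked numerically. The sketch below assumes uncorrelated, unit-variance factors, so that the factor-induced covariance matrix is the loading matrix times its transpose. Only the coefficients 0.42 and 0.28 are taken from the figure caption; the third coefficient (0.35) and the exact loadings of the pleiotropic factors are hypothetical, chosen so that the two maps match.

```python
import numpy as np

# Path coefficients for one group of three traits.
# 0.42 and 0.28 come from the Fig. 1a caption; 0.35 is hypothetical.
a = np.array([0.42, 0.28, 0.35])
zeros = np.zeros(3)

# (a) Modular map: factor A affects V1..V3 only, factor B affects V4..V6 only.
modular = np.column_stack([np.concatenate([a, zeros]),
                           np.concatenate([zeros, a])])

# (b) Non-modular map: factor C affects all six traits with the same sign,
# factor D affects V1..V3 positively and V4..V6 negatively.
# Dividing by sqrt(2) gives roughly the 0.3 and 0.2 quoted in the caption.
c = np.concatenate([a, a]) / np.sqrt(2)
d = np.concatenate([a, -a]) / np.sqrt(2)
pleiotropic = np.column_stack([c, d])

def induced_covariance(loadings):
    """Covariance induced by uncorrelated unit-variance factors: L @ L.T."""
    return loadings @ loadings.T

cov_modular = induced_covariance(modular)
cov_pleiotropic = induced_covariance(pleiotropic)

print(np.round(cov_modular, 3))
print("same covariance structure:", np.allclose(cov_modular, cov_pleiotropic))
```

Both maps yield the same block-diagonal covariance matrix, so the covariances alone cannot distinguish a modular genotype-phenotype map from two overlapping pleiotropic factors with opposing effects.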
More often, covariances of opposite sign may not cancel out exactly but lead to a reduced total covariance, even though many developmental or genetic factors may link the traits (Clausen and Hiesey 1960; Houle 1991; Cheverud 1984; Gromko 1995; Pigliucci 2006; Mitteroecker and Bookstein 2007; Mitteroecker 2009; Pavlicev and Hansen 2011).

In addition to all these biological factors, a further (and often neglected) source of phenotypic covariances is the nature of the measurements themselves. For example, in a set of distance measurements between landmarks, distances sharing the same start or end point necessarily correlate, but no biological interpretation of this correlation is warranted. Likewise, size-corrected measurements, such as distance ratios with the same denominator or Procrustes shape coordinates, are geometrically dependent. The spatial distribution of measurements also affects the correlation structure: closely adjacent measurements necessarily correlate more strongly than more distant measurements (e.g., Mitteroecker 2009; Huttegger and Mitteroecker 2011). For example, Sawin et al. (1970) reported an approximately linear decline in the correlation among dimensions of rabbit bones with their spatial distance.

An even more fundamental difficulty is the definition of measurements or phenotypic traits (particularly of distance measurements), as pointed out by Wagner and Zhang (2011). For instance, the length of the upper jaw (LU) and the length of the lower jaw (LL) are highly correlated and developmentally integrated, but the mathematically equivalent variables "upper jaw length plus lower jaw length" (LU + LL) and "difference between upper and lower jaw length" (LU − LL) are uncorrelated (their covariance equals Var(LU) − Var(LL), which vanishes when the two lengths have equal variance). Phenotypic covariances thus cannot be interpreted without reference to the generation of the variables and their spatial and geometric dependencies.

How to Interpret Phenotypic Covariances?

Olson and Miller (1958), like many other authors, interpreted high phenotypic correlations or covariances as evidence of developmental or functional integration between traits, and low correlations as evidence of the absence of integration. Based on this rationale, they defined q-sets as sets of variables with high mutual correlations within one set and low correlations between variables from different q-sets. Terentjev (1931) and Berg (1960) referred to such highly correlated sets of variables as correlation pleiades, whereas in the more recent morphometric and quantitative genetic literature they are called variational modules (Wagner et al. 2007; Mitteroecker 2009; Wagner and Zhang 2011). They are frequently interpreted as indications of developmental modules (e.g., Klingenberg 2008).

Given the many possible origins of covariances listed above, it should be evident that phenotypic covariances and correlations cannot be taken as direct evidence for developmental integration. In particular, this applies to low covariances, which can result from different developmental factors with opposite effects rather than from the absence of any such factors (which is very unlikely in higher animals). Terentjev (1931) and Wright (1932) thus removed estimates of pleiotropic factors from the data before interpreting correlations as the result of local or modular developmental processes (see also Hansen 2003; Mitteroecker and Bookstein 2007). Both arrived at a hierarchical model of factors influencing phenotypic variation and covariation: a nested arrangement of factors with different pleiotropic ranges (Fig. 1c). Mitteroecker and Bookstein (2007) showed that net phenotypic covariances reflect developmental modularity as expected by Olson and Miller only for size measurements (distances, volumes, etc.) and if these factors induce almost isometric growth.

The multivariate estimation of factors that together explain the observed variances and covariances makes much more biometric sense than the interpretation of raw covariances. The factors can be interpreted as regressions of the variables on the (unmeasured) factor scores, that is, as models of how an underlying growth factor affects phenotypic traits (the path coefficients in Fig. 1). They quantify the ability of phenotypic traits to covary, not the actual covariance. All the loadings of one factor can be interpreted and visualized as a single spatial pattern. There is a large body of statistical literature on exploratory and confirmatory factor analysis, but only a few approaches have been applied to morphometric data. Below, we describe one multivariate approach to study morphological integration, two-block partial least squares analysis, which turns out to be closely related to Wright's (1932) method.

How Does Integration Affect Evolution?

Developmental integration due to heritable pleiotropic factors as well as genetic linkage leads to joint inheritance (genetic integration) of trait values. Directional selection of a trait A that is genetically correlated with a trait B will induce an indirect response in trait B to the selection of A (e.g., Lande 1979; Falconer and Mackay 1996). For example, many developmental processes affect both forelimbs and hindlimbs. If individuals with long hindlimbs produce more offspring than those with short hindlimbs, a larger fraction of the offspring will have longer hindlimbs than in the parent generation and, because of the joint inheritance, also longer forelimbs. Forelimb length indirectly responds to the selection on hindlimb length. Interpreting the evolutionary change of forelimb length itself as an adaptation thus would be highly misleading.
If the indirectly affected trait B were neutral with respect to fitness, it might be permanently changed as an indirect response to selection of A (Fig. 2a). If B were itself under stabilizing or conflicting directional selection, the genetic correlation between the two traits would only affect short-term evolution (e.g., Schluter 1996). Eventually, selection would compensate for the indirect response in trait B, leading to a curved instead of a linear "evolutionary trajectory" (Fig. 2b); genetic integration would then have no persisting evolutionary effect. If, for instance, forelimb length affects some relevant function and hence is under stabilizing selection, directional selection of the hindlimbs would initially modify average forelimb length, but after some generations (depending on the genetic correlation and the selection pressures) the forelimbs would again assume their optimal length.

Note that the short-term response to selection is determined by the net genetic variances and covariances, regardless of the underlying genotype-phenotype map or the actual developmental integration (e.g., both genotype-phenotype maps in Fig. 1a, b induce the same covariance structure and hence lead to the same response to selection). Models of long-term evolution (including the model in Fig. 2) often are based on the idealized assumption that the genetic covariance structure remains stable. But genetic variances and covariances are modified both by directional and stabilizing selection, and by the pattern of new variation and covariation produced by mutations, which in turn is largely determined by the developmental system and the genotype-phenotype map (Lande 1979, 1980; Cheverud 1984).

In some cases, when a trait is tightly integrated with another trait that is under very strong stabilizing selection, integration might prevent any evolutionary change (developmental constraint; Cheverud 1984; Maynard Smith et al. 1985). For example, Galis et al.
(2006) explained the highly conserved number of cervical vertebrae in mammals by the deleterious side effects during development that a modification of the number of vertebrae would have. By contrast, integration between functionally related traits that are subject to the same selection regime can facilitate evolution by channeling variation in an adaptive direction.

The response of a population to selection is determined by genetic variance and covariance (quantified by the G matrix), not by phenotypic (co)variance (the P matrix). By contrast, most studies on morphological integration are based on phenotypic covariances (but see, e.g., Leamy 1977; Cheverud 1982; Martinez-Abadias et al. 2009; in press). There is a large and inconclusive body of literature on the question of whether P is a useful substitute for G in evolutionary models (e.g., Cheverud 1988, 1996a, b; Roff 1997; Marroig and Cheverud 2004). Reliable estimates of genetic covariances require large-scale breeding experiments, which are not possible in anthropology and primatology; estimates based on collections of human bones with known genealogies (such as the Hallstatt collection) are associated with large standard errors. In typical studies of morphological integration, however, only the major factors of covariance, which are reflected both by G and P, can be reliably identified and interpreted.

Fig. 2 The ellipses represent the distribution of heritable phenotypic variation (G matrices) for the two traits A, B, and the gray values represent fitness. In (a) only trait A is under directional selection but trait B indirectly responds because of the genetic correlation between traits. In (b) trait A is under directional selection and B under stabilizing selection (it is at the fitness optimum already). Trait B initially responds to the selection of A, but later assumes its original value again
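The indirect response and the curved trajectory of Fig. 2b can be sketched with the multivariate breeder's equation of Lande (1979), in which the per-generation change in the trait means is the G matrix times the selection gradient. The sketch below is an illustration under explicit assumptions, not the model used to draw the figure: the G matrix, the optimum, and the selection strength are hypothetical, G is held constant (the idealization noted above), and the selection gradient is taken to be proportional to the distance from a single fitness optimum.

```python
import numpy as np

# Hypothetical genetic covariance matrix for traits A and B (correlation 0.8).
G = np.array([[1.0, 0.8],
              [0.8, 1.0]])

# Fitness optimum: trait A must increase, trait B is already at its optimum.
optimum = np.array([10.0, 0.0])
strength = 0.1  # hypothetical strength of selection toward the optimum

def simulate(G, optimum, strength, generations=500):
    """Iterate delta_z = G @ beta with beta = -strength * (z - optimum)."""
    z = np.zeros(2)  # population mean starts at (0, 0)
    trajectory = [z.copy()]
    for _ in range(generations):
        beta = -strength * (z - optimum)  # selection gradient (assumed form)
        z = z + G @ beta                  # direct plus indirect response
        trajectory.append(z.copy())
    return np.array(trajectory)

trajectory = simulate(G, optimum, strength)

print("after 1 generation:", trajectory[1])
print("largest excursion of trait B:", trajectory[:, 1].max().round(2))
print("final means:", trajectory[-1].round(3))
```

In the first generation only trait A is under selection, yet trait B changes because of the genetic covariance; over many generations stabilizing selection pulls B back to its optimum while A reaches its new optimum, giving a curved path in the A-B plane.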
Does Integration Evolve?

Evolvability is the capacity for an adaptive response to selection (e.g., Wagner and Altenberg 1996; Hansen and Houle 2008). The evolvability of a trait is determined by the amount of heritable phenotypic variance of this trait and by the genetic covariance with other traits. If one of two genetically correlated traits is under directional selection, the other trait will indirectly respond to selection (Fig. 2). If this other trait is under stabilizing selection, or under directional selection in the opposite direction, the indirect response would have negative effects on fitness. Thus, functionally unrelated traits, which are subject to different selection regimes, should be genetically uncorrelated (variationally modular) in order to maximize evolvability. On the other hand, functionally related traits should vary in a concerted way to increase evolvability (think again of the elements of a joint or of the masticatory apparatus). One would thus expect that integration evolves to reflect functional dependencies among traits; Riedl (1978) called this the "imitatory epigenotype". The contemporary EvoDevo literature and some of the quantitative genetics literature use the term modularity instead: development, or the genotype-phenotype map, should evolve so that most genes mainly affect functionally related traits (a modular genotype-phenotype map with restricted pleiotropy; Raff 1996; Wagner and Altenberg 1996; Mitteroecker 2009; Wagner and Zhang 2011).

However, empirical evidence for the evolution of developmental integration is scarce. Quantitative genetic models predict that genetic correlations evolve to reflect functional dependencies (Lande 1979, 1980; Cheverud 1996a, b; Arnold et al. 2008), but models of the evolution of the underlying genotype-phenotype map (developmental integration) are partly contradictory (reviewed by Pavlicev and Hansen 2011). It has become clear, however, that variational modularity (reduced genetic correlations between functionally unrelated traits) does not require a modular genotype-phenotype map. Multiple pleiotropic genetic factors can partly cancel out so that genetic covariances are reduced (Fig. 1). Hansen (2003) and Pavlicev and Hansen (2011) showed that under most selection scenarios genotype-phenotype maps with multiple overlapping pleiotropic factors even lead to a higher evolvability than purely modular genotype-phenotype maps because of the increased genetic variance. However, for new mutations with large pleiotropic effects and for larger evolutionary changes, involving non-linear genotype-phenotype effects, constrained pleiotropy and modular genotype-phenotype maps are important for increasing evolvability (Mitteroecker 2009; Pavlicev and Hansen 2011).

How to Estimate Factors of Developmental Integration?

In his 1932 paper "General, group and special size factors", Sewall Wright devised a method to estimate general factors that account (in a least-squares sense) for all the pairwise correlations between certain groups of variables. In addition to general factors, Wright estimated group factors that account for the residual correlations within these groups of variables. He selected the actual groups of variables by careful inspection of the residual correlations after removing an initial estimate of a general factor. Based on these groups of variables, he updated the general factor (see also Bookstein 1991; Mitteroecker and Bookstein 2007). Wright arrived at a hierarchy of nested and overlapping general factors and group factors (e.g., Fig. 1c). He did not interpret these factors as single genes but as the "entire array of factors, environmental as well as genetic, which have a general effect on growth" (p. 605). The hierarchy of general factors and group factors corresponds well to our usual biological explanations.
General factors (common factors in Mitteroecker and Bookstein 2007, 2008) reflect genetic factors with wide pleiotropic ranges, such as genes expressed in different body parts or genes with many downstream effects. They also reflect epigenetic interactions of developmental processes, such as tissue inductions or mechanical interactions, as well as common environmental influences, linking the variation of different tissues or body parts. These general or common factors account for the joint variation (the integration) of different morphological traits. Group factors or local factors, by contrast, reflect factors with more local effects on growth. Notice that while these local factors are more "modular" than the general factors, the hierarchical and overlapping group factors do not necessarily induce morphological modularity.

Wright applied his approach (sometimes referred to as Wright-style factor analysis) only to a small number of variables. For more variables, as they occur in modern morphometrics, a visual inspection of correlations or covariances is not possible. Furthermore, in geometric morphometrics not all covariances can be large and positive, not even for isometric growth. Wright's approach cannot be completely extended to a modern multivariate context, even though the algebra remains valid.

A series of recent papers on morphological integration used another technique, two-block partial least squares analysis (PLS), which was invented by Herman Wold in 1966 (for morphometric examples see Bookstein 1991; Rohlf and Corti 2000; Bookstein et al. 2003; Gunz and Harvati 2007; Mitteroecker and Bookstein 2008). Several variants of this technique are used in multivariate biometrics and chemometrics; they are often referred to as "multivariate calibration" techniques (Martens and Naes 1989).
For two groups or blocks of variables, the algorithm seeks a linear combination for each block so that the covariance between these two linear combinations is a maximum. Further components can be extracted after regressing or projecting out these linear combinations separately from each block of variables. The high-dimensional pattern of covariances between the two blocks can thus be represented by a small number of dimensions (linear combinations). Several extensions of the PLS algorithm to multiple blocks have been published. In studies of morphological integration, the groups of variables usually are derived from some developmental or functional models, and PLS is used to explore the multivariate pattern of covariance between these groups. Even though Wright-style factor analysis and two-block PLS originate from different statistical contexts, Mitteroecker and Bookstein (2007) demonstrated that both techniques are numerically identical. Both are least-squares estimates of between-block covariances or correlations, yet differing in their typical applications. Wright inferred the groups of variables from residual correlations, whereas they are defined prior to the analysis in most PLS applications. When both techniques are applied to the same groups of variables, the resulting path coefficients or weightings for the linear combinations differ only by scaling. In PLS, the weightings are standardized to unit sum of squares, whereas in Wright’s approach they are scaled to reflect how much of the pattern in one block corresponds to how much of the pattern in the other block (but both approaches give the same pattern). Mitteroecker and Bookstein (2007) showed how the PLS scores can be scaled in order to reflect this quantitative relationship as in Wright-style factor analysis. Only after such a scaling can PLS vectors (singular warps) be visualized within a single shape configuration (for examples see Mitteroecker and Bookstein 2008 and below). 
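The core of two-block PLS as described above (one unit-length linear combination per block, chosen to maximize the covariance between them) can be sketched with a singular value decomposition of the cross-block covariance matrix. The function name `two_block_pls` and the simulated data are hypothetical. The sketch also does not include the Wright-style rescaling of the loadings discussed in the text, and taking all singular vectors at once is a common shortcut that need not match a procedure that regresses out each pair of scores before extracting the next.

```python
import numpy as np

def two_block_pls(X, Y):
    """Two-block PLS via SVD of the cross-block covariance matrix.

    Returns the weight vectors for each block (columns of U and V),
    the singular values, and the PLS scores of both blocks.
    """
    Xc = X - X.mean(axis=0)            # center each variable
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    cross_cov = Xc.T @ Yc / (n - 1)    # p x q matrix of between-block covariances
    U, s, Vt = np.linalg.svd(cross_cov, full_matrices=False)
    V = Vt.T
    return U, s, V, Xc @ U, Yc @ V

# Simulated example: one common factor links a 3-variable and a 2-variable block.
rng = np.random.default_rng(1)
n = 300
common = rng.normal(size=n)
X = np.outer(common, [0.8, 0.5, 0.3]) + rng.normal(scale=0.5, size=(n, 3))
Y = np.outer(common, [0.6, 0.4]) + rng.normal(scale=0.5, size=(n, 2))

U, s, V, x_scores, y_scores = two_block_pls(X, Y)

# The covariance between the first pair of scores equals the first singular value.
first_cov = np.cov(x_scores[:, 0], y_scores[:, 0])[0, 1]
print("first singular value:       ", round(s[0], 4))
print("covariance of first scores: ", round(first_cov, 4))
print("block 1 weights:", np.round(U[:, 0], 2))
print("block 2 weights:", np.round(V[:, 0], 2))
```

The weight vectors have unit sum of squares, matching the PLS standardization mentioned in the text, and the first singular value is the maximal covariance attainable between any pair of unit-length linear combinations of the two blocks.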
The resemblance of PLS and Wright-style factor analysis allows for a biological interpretation of PLS. When PLS is applied as an exploratory tool to represent the multivariate pattern of covariance between two or more blocks of variables, these blocks need not necessarily represent developmental or variational modules; they may just be selected because the corresponding anatomical units serve different functions or have different evolutionary histories. But the PLS loadings for all blocks, taken together and scaled accordingly, can be interpreted as a pleiotropic factor integrating these blocks. The choice of the blocks of variables and the selected sample determine how well the estimated factors correspond to actual biological models. Several PLS dimensions (pleiotropic factors) can be extracted and removed from the data; variances and covariances of the residuals are then due to local or modular developmental factors, i.e., group factors sensu Wright (for more details see Bookstein 1991; Mitteroecker and Bookstein 2007, 2008).

What Kinds of Samples do We Need to Study Developmental Integration?

Developmental integration is best studied experimentally or in longitudinal growth series. In the latter case, when individuals are measured at multiple age stages, covariances can be computed for individual growth, i.e., for individual differences between the age stages (see the analysis below). But for technical, biological, and ethical reasons, large ontogenetic samples of primates often are cross-sectional (i.e., each individual is measured only once, such as in museum collections) so that individual growth and development cannot be assessed. Yet, patterns of covariance in a cross-sectional sample comprising different age stages do not necessarily reflect developmental integration.
For example, two body parts such as the face and the neurocranium both grow postnatally, but the way the average facial growth coincides with the average neurocranial growth does not necessarily imply any causal relationship; if the face and the neurocranium were under completely independent genetic and epigenetic control, they would still be (spuriously) correlated across different age groups. In a cross-sectional sample, developmental integration should be studied across individuals of the same developmental stage (see also below). Phenotypic covariances within such a sample result from one or more common developmental factors or tissue interactions, or from some of the other sources described above. Covariances in a sample of adult individuals reflect developmental processes and interactions throughout the complete prenatal and postnatal ontogeny (Hallgrimsson et al. 2007; Mitteroecker and Bookstein 2009). When a sample comprises multiple populations or species that differ in average phenotype, overall covariances are dominated by the species differences. Hence these covariances among populations depend not only on developmental integration and linkage, but to a large extent also on the coevolution of traits due to joint selection, drift, and gene flow, as well as on the phylogenetic relations among the populations in the sample (Armbruster and Schwaegerle 1996). Developmental and genetic integration must be assessed from within-population covariances. If it can be expected that the populations have similar integration patterns and are not too different in mean shape (see below), one can use pooled within-population covariances, which are the covariances after subtracting from each individual the corresponding population mean.
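The pooled within-population covariances described above can be illustrated with a short sketch. The function name `pooled_within_cov`, the group labels, and the simulated data are assumptions made for this example; dividing by n minus the number of groups is one common convention for the degrees of freedom, not necessarily the one used in the analyses reported here.

```python
import numpy as np

def pooled_within_cov(X, groups):
    """Pooled within-group covariance matrix.

    Subtracts from each individual the mean of its own group and computes
    the covariance matrix of the residuals, using n - (number of groups)
    degrees of freedom.
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    resid = X.copy()
    for g in labels:
        mask = groups == g
        resid[mask] -= X[mask].mean(axis=0)   # remove the group mean
    dof = X.shape[0] - len(labels)
    return resid.T @ resid / dof

# Simulated example (hypothetical values): two traits that are uncorrelated
# within each population, but both population means differ by 5 units.
rng = np.random.default_rng(1)
group_a = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(300, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=1.0, size=(300, 2))
X = np.vstack([group_a, group_b])
labels = np.array(["a"] * 300 + ["b"] * 300)

total_cov = np.cov(X, rowvar=False)           # dominated by the mean difference
within_cov = pooled_within_cov(X, labels)     # mean difference removed
```

In this simulated case the total covariance between the two traits is large although the traits are independent within each group, whereas the pooled within-group covariance is close to zero.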
To some degree, the same argument also applies to different sexes within a sample, so that sexual dimorphism might be removed from the data by subtracting the corresponding population-specific sex mean from each individual. The ensuing integration pattern may then be compared to the species mean differences or the between-species covariances (the covariances across the species means), i.e., to the pattern of evolutionary integration. As mentioned above, in a cross-sectional sample, integration should be studied across individuals of the same developmental stage. But developmental stages often cannot be clearly identified or even defined. Alternatively, one may use specimens of the same age, or adult individuals (of any age). In such a sample, variability in developmental timing would considerably affect phenotypic variances and covariances. Static allometry or ontogenetic allometry based on the final age period likely captures most of these shape differences and hence should be removed from the data (e.g., by regressing or projecting out size from the variables; Rohlf and Bookstein 1987).

How to Register Landmark Configurations in Studies of Integration?

Geometric morphometric studies require the superimposition (registration) of landmark configurations in order to remove variation in overall position, size, and orientation (e.g., Rohlf and Slice 1990; Bookstein 1996; Mitteroecker and Gunz 2009). When studying the integration between two or more anatomical parts (sets of landmarks), the landmark configurations can either be superimposed by a single Procrustes registration, or each part can be superimposed separately. In the first case, relative size, position, and orientation of the two parts are retained whereas they are lost when superimposing the parts separately. A single superimposition, however, induces covariances between the parts that must not be biologically interpreted.
For example, the standardization of overall size during the Procrustes superimposition induces a negative correlation between the relative sizes of two parts: if one part increases in relative size, the relative size of the other part necessarily decreases. The same applies to size corrections for other kinds of measurements (e.g., linear distances).

Does the Number of Measurements Matter?

The number of measurements used to describe a structure can be interpreted as a form of weighting of this part relative to other parts (Mitteroecker and Huttegger 2009; Huttegger and Mitteroecker 2011): the more measurements (e.g., linear distances, landmarks) per anatomical part, the more influence this part has on multivariate statistical parameters. For example, the covariance between two singular warp scores (which is equal to the singular value) depends on the number of landmarks (Mitteroecker and Bookstein 2007). Likewise, most indices of overall integration (e.g., Pavlicev et al. 2009; Haber 2011) depend on the number and spatial distribution of measurements. However, when sufficiently many measurements are taken, estimates of the pattern of integration (such as singular warps or common factors) usually are unchanged by small modifications of the number and position of measurements (see also the analysis below). Likewise, estimating the position of semilandmarks along curves or surfaces (Bookstein 1997; Gunz et al. 2005) or the position of completely missing landmarks (Gunz et al. 2009) increases the covariance between sets of (semi)landmarks, but usually does not considerably affect the spatial pattern of integration.

How to Compare Integration Across Populations and Species?

Primates, like other groups of closely related species, share the vast majority of genes, organs, bones, and muscles. The physical and chemical conditions affecting development are the same in these species. Environments and life styles vary, but only within a limited range.
Anatomical differences between related primates are mainly quantitative (e.g., bones differ in size and shape across primates, but basically all primates have the same bones). Developmental integration (the way changes in the development of one trait affect the development of other traits) thus is expected to be conserved across primates. Because of size differences, however, patterns of variation (and thus also of covariation) may differ considerably. For these reasons, it is important to separate differences in variance from differences in covariance when comparing integration in multiple populations (see also Mitteroecker and Bookstein 2008; Hallgrimsson et al. 2009). As an example, consider integration between the length of the upper and the lower jaw in humans and chimpanzees. Apparently, upper and lower jaws must be tightly integrated in length to maintain dental occlusion; if the length of the upper jaw increases by 1 cm, the length of the lower jaw will also increase by about 1 cm, both in humans and in chimps. But because average jaw length in chimpanzees is larger than that in humans, jaw length is also more variable in chimps. Thus, despite the same developmental and functional relationship, the covariance (and usually also the correlation) between upper and lower jaws is larger in chimpanzees than in humans. Differences in covariance between two groups do not necessarily indicate differences in developmental integration; a comparison of regression slopes, for instance, would be more useful. Consider further a data set comprised of many cranial measurements on humans and chimpanzees. If applying PLS to both species separately, the first PLS dimension might capture integration between upper and lower jaws in chimpanzees, as it dominates both variance and covariance.
In humans, however, where integration of the jaws contributes less to total variance and covariance, it might be represented by the second or a higher dimension, or might be “smeared” over multiple dimensions. But concluding that integration differs across the two species, just because the first PLS dimensions differ, would be misleading. Integration is the same; only the variance differs. One way to avoid this problem is to compare the statistical association between the same two traits or between the same two linear combinations of traits across different groups. For example, the PLS axes might be computed from only one species or from the pooled within-species distribution, and the scores along these axes are compared across both species (see Mitteroecker and Bookstein 2008 and the analysis below for examples). Another problem is that if average population phenotypes differ substantially, changes in the position of homologous landmarks need not necessarily be comparable across populations. For example, the foramen magnum is approximately horizontally oriented and located below the brain in humans, whereas it is almost vertical and posterior to the brain in mice. The landmarks basion and opisthion (the two borders of the foramen magnum in the midsagittal plane) may be considered as biologically homologous in both species, but an upward shift of these landmarks would indicate a completely different process in humans than in mice. In such a case, integration cannot be compared quantitatively between the two species, but shape changes and integration patterns can be compared qualitatively, e.g., by visual comparison of deformation grids. Primates are more similar than humans and mice, of course, but still differ considerably in certain morphological aspects (e.g., prognathism, brow ridges, cranial crests). Comparative analyses of integration should be carefully interpreted in this regard.
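The jaw-length example from this section can be made concrete with a small simulation. All numbers below (mean lengths, standard deviations, noise level, sample size) are hypothetical values chosen for illustration and are not estimates from any data set; the function name `jaw_statistics` is likewise an assumption. Both groups share the same functional relationship (a regression slope of 1 between upper and lower jaw length) and differ only in the variance of jaw length.

```python
import numpy as np

def jaw_statistics(mean_length, sd_length, n=500, noise_sd=0.2, seed=0):
    """Simulate upper and lower jaw lengths and summarize their association.

    Returns the covariance, the correlation, and the least-squares slope of
    lower jaw length regressed on upper jaw length.
    """
    rng = np.random.default_rng(seed)
    upper = rng.normal(mean_length, sd_length, size=n)
    # Same relationship in every group: slope 1 plus identical noise.
    lower = upper + rng.normal(0.0, noise_sd, size=n)
    cov_matrix = np.cov(upper, lower)
    covariance = cov_matrix[0, 1]
    slope = covariance / cov_matrix[0, 0]     # OLS slope of lower on upper
    correlation = np.corrcoef(upper, lower)[0, 1]
    return covariance, correlation, slope

# Hypothetical groups: the second has longer and more variable jaws.
human = jaw_statistics(mean_length=5.0, sd_length=0.3, seed=1)
chimp = jaw_statistics(mean_length=8.0, sd_length=0.8, seed=2)

for name, (cov, cor, slope) in (("human", human), ("chimp", chimp)):
    print(f"{name}: covariance={cov:.3f} correlation={cor:.3f} slope={slope:.3f}")
```

In this simulated setting the covariance and the correlation are clearly larger in the more variable group, while the regression slopes are both close to 1, which is the sense in which slopes rather than covariances reflect the shared relationship.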
Integration of Postnatal Cranial Growth

We applied the principles described above to study morphological integration between the cranial base and the face during postnatal human development. The cranial base and the face differ in developmental origin, mode of ossification, and postnatal growth pattern (e.g., Lieberman et al. 2000; Sperber 2001; Helms et al. 2005; Mitteroecker and Bookstein 2008), but they physically interact in the course of development. A large number of studies focused on the influence of the cranial base on facial form and orientation during human development and evolution (e.g., Moss and Young 1960; Biegert 1963; Ross and Henneberg 1995; Enlow and Hans 1996; Bookstein et al. 2003; Bastir and Rosas 2005, 2006; Lieberman et al. 2000; Bastir et al. 2010; Lieberman 2011). Most, if not all, of these studies analyzed covariances and correlations in cross-sectional samples or studied average growth patterns. In our first analysis, we study integration of individual postnatal facial growth using longitudinal data in order to investigate how adult integration is generated during ontogeny. In the second analysis, we compare morphological integration in adult humans to that in adult chimpanzees.

Analysis 1: Longitudinal Growth

Our sample consists of 13 male and 13 female untreated Caucasian individuals of the Denver Growth Study, a longitudinal X-ray study carried out between 1931 and 1966. On a total of 500 lateral radiographs, covering the age range from birth to early adulthood, 18 landmarks were digitized by Ekaterina Stansfield. In the present study we used three of these landmarks to represent the cranial base (basion, sella, nasion) and three further landmarks to represent the maxilla (posterior nasal spine, nasospinale, prosthion) (Fig. 3a; Bulygina 2003; Bulygina et al. 2006).
The landmark configurations were superimposed by a Generalized Procrustes Analysis, standardizing for overall size, position, and orientation of the configurations (Rohlf and Slice 1990; Mitteroecker and Gunz 2009). We chose a single superimposition of all landmarks (instead of two separate ones for the face and the cranial base; see above) because position and orientation of the maxilla relative to the cranial base determines facial size and is an important aspect of cranial morphology. Because not all individuals were radiographed at the same ages, we interpolated the shape coordinates for each individual by a local linear regression. We removed sexual dimorphism by subtracting from each configuration the age- and sex-specific average. The resulting shape coordinates of the landmarks were used for further statistical analysis. Because the data are longitudinal (the same 26 individuals were measured at different ages), we could study morphological integration of growth itself. In other words, we did not study covariation between the shape variables $x_t$ at a given age $t$, but covariation between the shape differences $x_{t+1} - x_t$. We started our analysis with the shape differences between 2 and 4 years of age. We computed a two-block PLS analysis for the age-related shape differences between the six shape coordinates (three x- and three y-coordinates) of the cranial base and the six shape coordinates of the maxilla, giving for each extracted dimension one singular vector for the cranial base and one singular vector for the maxilla (the term singular vector is derived from the actual computation, a singular value decomposition). These vectors contain a weighting or loading for each variable, so that the covariance between the linear combinations (weighted sums) specified by these vectors is a maximum. Because in geometric morphometrics the variables are shape coordinates, the vectors can be represented as shape deformations and are called singular warps (Bookstein 1991; Bookstein et al. 2003). The corresponding linear combinations are called singular warp scores; they can be interpreted as coordinates along the singular vectors in shape space. As in a principal component analysis, multiple dimensions (pairs of singular vectors) can be extracted, each one orthogonal to all previous dimensions. By convention, the singular vectors are unit vectors, i.e., the squared elements of each vector sum up to 1. They represent the shape features with maximum covariance in the sample, but they do not specify how much of one pattern relates to how much of the other pattern. This relationship can be estimated by a major axis regression between the singular warp scores, which is equal to the first principal component axis of these scores. When the singular vectors are weighted by the corresponding loadings, they are equal to Wright’s general factors (Mitteroecker and Bookstein 2007).

Fig. 3 a The landmarks on the cranial base (basion, sella, nasion) and the upper jaw (posterior nasal spine, nasospinale, prosthion) used in the geometric morphometric analysis. The landmarks are a subset of the data used in Bulygina et al. (2006). b The first pair of singular warps is visualized as a single shape deformation (in both directions); it can be interpreted as the first general or common factor integrating cranial shape. c The second pair of singular warps (second common factor)

Results

The first pair of singular warps is visualized as a single shape deformation in Fig. 3b and can be interpreted as the first general or common factor integrating facial shape. A relatively long maxilla with an anteriorly positioned prosthion and an inferiorly positioned posterior nasal spine is associated with a flexed cranial base and a relatively short clivus (sella-basion distance).
Conversely, a less flexed cranial base and an elongated clivus are associated with a short upper jaw. Note that these shape changes affect the relative width of the pharynx. Basically, this factor seems to reflect a large face together with a long anterior cranial base relative to the clivus, which is part of the middle cranial fossa. The second common factor reflects the mainly uniform shape differences between short and high faces versus long and low faces: both the cranial base and the jaw are affected in the same way by this pattern. The two common factors account for 95.5 % of the summed squared covariances between the variables of the face and the cranial base; they are both statistically significant with P < 0.01 (the covariances or singular values significantly deviate from a permutation distribution; Rohlf and Corti 2000; Mitteroecker and Bookstein 2008). The two factors are uncorrelated in the growth period from 2 to 4 years (r = 0.04), and they also show very low correlation for the other growth periods as well as for the cross-sectional samples. The common factors thus seem to represent independent growth processes (but not necessarily two single genes), even though average shape change from 2 to 4 years comprises a combination of the two common factors, both an increase of facial height and a relative enlargement of the face and the anterior cranial base (see Bulygina et al. 2006). We estimated and visualized integration of growth from 2 to 4 years of age; the common factors of the subsequent growth periods closely resemble the ones presented in Fig. 3 and hence are not shown. Figure 4a plots the covariance between the first pair of singular warps (between the shape features affected by common factor 1 as estimated from the growth from 2 to 4 years) for all one-year growth intervals within the sampled age range (from 2 to 3 years, 3–4, 4–5, etc.).
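Returning to the significance statement above, a permutation test of a singular value can be sketched as follows. This is only one simple variant and rests on assumptions: it tests the first singular value alone, it permutes the rows of one block relative to the other, and the function names, the number of permutations, and the simulated data are invented for the example. It is not claimed to reproduce the procedure of the cited references.

```python
import numpy as np

def first_singular_value(X, Y):
    """Largest singular value of the between-block covariance matrix."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (X.shape[0] - 1)
    return np.linalg.svd(C, compute_uv=False)[0]

def permutation_p_value(X, Y, n_perm=999, seed=0):
    """Permutation test for the first singular value.

    Rows of Y are randomly reassigned to rows of X, which destroys the
    association between the blocks but keeps the covariances within each
    block. The p-value is the share of permuted singular values that are
    at least as large as the observed one (counting the observed value).
    """
    rng = np.random.default_rng(seed)
    observed = first_singular_value(X, Y)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(X.shape[0])
        if first_singular_value(X, Y[perm]) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# Simulated blocks (hypothetical) sharing one common factor f.
rng = np.random.default_rng(3)
n = 60
f = rng.normal(size=n)
X = np.outer(f, [1.0, 0.5, -0.5]) + 0.3 * rng.normal(size=(n, 3))
Y = np.outer(f, [0.8, -0.4]) + 0.3 * rng.normal(size=(n, 2))

observed, p_value = permutation_p_value(X, Y)
```

With 999 permutations the smallest attainable p-value is 0.001, so the resolution of the test depends directly on the number of permutations chosen.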
The covariance decreases sharply and almost goes to zero at about 8 years of age; it rises again during puberty and decreases thereafter. This plot also shows the variance of common factor 1 during the different growth intervals. Apparently, covariance decreases because variance decreases. The pleiotropic nature of the common factor remains unchanged but it ceases to vary across individual growth, probably because it stops contributing to development. Instead of covariances of growth, we can also compute the usual cross-sectional covariances (covariance across individuals of the same age) between the shape features affected by common factor 1. Interestingly, even though growth of these shape features is integrated and the underlying common factor varies during the first 6–8 years of postnatal growth (Fig. 4a), cross-sectional variance and covariance decrease (Fig. 4b). But how can the cross-sectional variance of the factor decrease, even though it varies during growth? The answer is given in Fig. 5a: Growth (the shape difference $x_{t+1} - x_t$) is negatively correlated with individual morphology ($x_t$) until about 8 years of age. This means that individuals with a high score for common factor 1 (relatively small face, long clivus) experience less than average growth of these features (relative facial size increases, relative clivus length decreases as compared to the average), and individuals with a low score experience more than average growth along this factor. Such a process of variance reduction during growth has been termed targeted growth or developmental canalization (e.g., Waddington 1942; Tanner 1963; Debat and David 2001). From 2 to 8 years of age, cross-sectional variance of common factor 1 is halved; the individuals gain similar relative sizes of the face and the clivus, including a similar relative pharyngeal width. The developmental dynamics of common factor 2 clearly differ from the dynamics of the first factor.
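The link between the sign of the correlation of growth with prior morphology and the change in cross-sectional variance can be illustrated with a toy simulation. Everything in it is an assumption made for illustration: the linear "pull" toward the age-specific mean, the parameter values, and the function names are hypothetical and are not a model fitted to the Denver data. The underlying identity is that the variance after growth equals the variance before growth, plus the variance of growth, plus twice the covariance between growth and prior morphology, so variance can only shrink when that covariance is sufficiently negative.

```python
import numpy as np

def growth_vs_state_correlation(scores):
    """Correlation of growth with morphology before growth, per interval.

    scores is an (n_individuals, n_ages) array of factor scores; growth in
    interval t is scores[:, t + 1] - scores[:, t].
    """
    growth = np.diff(scores, axis=1)
    return np.array([
        np.corrcoef(growth[:, t], scores[:, t])[0, 1]
        for t in range(growth.shape[1])
    ])

def simulate(n=500, n_ages=7, pull=0.0, noise_sd=0.3, seed=0):
    """Simulate longitudinal scores with optional targeted growth.

    pull > 0 moves every individual a fraction of the way toward the
    age-specific mean in each interval; pull = 0 gives growth increments
    that are independent of current morphology.
    """
    rng = np.random.default_rng(seed)
    x = np.empty((n, n_ages))
    x[:, 0] = rng.normal(0.0, 1.0, size=n)
    for t in range(n_ages - 1):
        increment = rng.normal(0.0, noise_sd, size=n)
        x[:, t + 1] = x[:, t] - pull * (x[:, t] - x[:, t].mean()) + increment
    return x

canalized = simulate(pull=0.3)    # growth depends negatively on morphology
free = simulate(pull=0.0)         # growth independent of morphology

cor_canalized = growth_vs_state_correlation(canalized)
cor_free = growth_vs_state_correlation(free)
var_canalized = canalized.var(axis=0, ddof=1)   # cross-sectional variance per age
var_free = free.var(axis=0, ddof=1)
```

In the simulated canalized series the correlations are negative and the cross-sectional variance declines with age, whereas in the series without a pull the correlations are near zero and the variance accumulates.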
Cross-sectional variance and covariance continuously increase (Fig. 4d) even though variance and covariance of growth decrease during the first years of age (Fig. 4c). Cross-sectional (co)variance increases because growth along common factor 2 is uncorrelated with individual age-specific morphology, i.e., growth is not canalized (Fig. 5b). Variances and covariances thus accumulate during ontogeny. The increasing variance of both common factors during early puberty reflects the massive variation in the onset of puberty. After all individuals have reached puberty and experienced a growth spurt, variance decreases again. But such a reduction of variance usually is not interpreted as a canalization process.

Analysis 2: Comparing Integration Between Humans and Chimpanzees

In order to compare integration between the face and the cranial base in humans to that in chimpanzees, we used CT scans of 20 adult human individuals (10 males, 10 females from different human populations) from the sample used in Bookstein et al. (2003) and 22 adult chimpanzees (Pan troglodytes, most specimens of unknown sex) from the sample used in Neubauer et al. (2010). Additionally, we included two fossils with a preserved cranial base from the Bookstein et al. (2003) data set, the early modern human skull Mladeč I from central Europe (Czech Republic) and the South African Australopithecus africanus STS 5. The cranial base is represented by the four landmarks basion, dorsum sellae, sphenobasilare, and nasion, and the face by the five landmarks posterior nasal spine, nasospinale, prosthion, rhinion, and glabella (Fig. 6a). The assignment of nasion to the cranial base instead of to the face (or to both parts) is somewhat arbitrary, especially as glabella is assigned to the face. However, the presented results do not depend on these choices (see also the Discussion). The landmark configurations were superimposed by a single Procrustes registration.
After subtracting from each human individual the corresponding sex average and projecting out allometry, we estimated common factors (scaled singular warps) from the residual shape coordinates as described above. The first common factor reflects variation in overall cranial length relative to cranial height, whereas the second common factor represents the size of the face and the anterior cranial base relative to the length of the clivus (Fig. 6b, c). These two factors clearly resemble the factors estimated from the longitudinal sample above (Fig. 3). They account for 93 % of summed squared covariances in the cross-sectional human sample. Using these common factor estimates from the human sample, we computed scores along the factors (singular warp scores) both for humans and chimps. The scores were computed separately for the face and the cranial base. Note that we use the original human sample here, not the one corrected for sex and allometry, in order to investigate whether average sex and species differences follow the integration pattern. Figure 7a shows that, despite apparent mean shape differences, integration due to common factor 1 is similar in humans and in chimps: the point clouds are similarly oriented, i.e., one unit of shape change in the cranial base is associated with a similar amount of facial shape change in both species. Yet, variance (and thus also covariance) along this factor is much larger in humans than in chimpanzees. Mladeč I falls within the modern human distribution, whereas STS 5 clusters with the chimpanzees. The situation is different for common factor 2: In contrast to humans, chimpanzees are not integrated along the second common factor; both fossils are closer to the human than to the chimpanzee distribution. Human males and females completely overlap for both factors.

Fig. 4 a Covariance between the first pair of singular warp scores (covariance between face and cranial base due to common factor 1) during one-year growth periods (individual shape differences between 2 and 3 years, 3 and 4 years, etc.), shown as a black dashed line. The red line indicates the variance of common factor 1 in these growth intervals (note the different scale at the right vertical axis). b Cross-sectional variance (red line) and covariance (black dashed line) for common factor 1 across individuals of the same age. c Variance and covariance for common factor 2 during one-year growth periods. d Cross-sectional variance and covariance for common factor 2

Fig. 5 Correlation of individual shape changes during one-year growth intervals with the morphology before growth, $\mathrm{Cor}(x_{t+1} - x_t,\, x_t)$, plotted against age ($t$) for both common factors. A negative correlation indicates targeted growth or canalization, because individuals with a low score for this factor grow more than the average and individuals with a high score grow less

Fig. 6 a Midsagittal landmarks on the cranial base (basion, dorsum sellae, nasion, sphenobasilare) and the face (posterior nasal spine, nasospinale, prosthion, rhinion, glabella) measured on CT scans of adult humans, chimpanzees, and two fossils. The first two singular warps estimated from the human sample are visualized as common factors in (b) and (c). They clearly resemble the common factors in Fig. 3; only the order is reversed

Discussion

Olson and Miller coined the term “morphological integration” with their 1958 book and raised broader interest in this topic in the paleontological and biological communities. Yet, they never defined morphological integration in their book (as pointed out, e.g., by Chernoff and Magwene 1999). Like Olson and Miller, many subsequent authors referred to morphological integration either as the underlying developmental and functional causes or as the statistical pattern of phenotypic variances and covariances.
We showed that net phenotypic covariances do not directly reflect the underlying developmental and genetic factors of integration, most importantly because covariances depend on the variance of pleiotropic factors in a sample, and multiple pleiotropic factors with opposite effects may partly cancel out. Like Cheverud (1996a, b), we thus used the term integration to denote the biological processes and properties leading to phenotypic covariance. The definition of integration by Hallgrimsson et al. (2009) as the ability to covary, rather than actual covariance, closely reflects this distinction. The use of covariances to describe the relationship between phenotypic traits originates from early biometrics and from their role in predicting response to selection. But actual scientific models usually are not based on covariances, but on regressions (e.g., Bookstein in press). Regression quantifies the average effect of one variable onto another (how a change in one trait would alter another trait in the course of development or evolution), reflecting the typical reasoning in biology. Multivariate factors, such as Wright’s general factors or singular warps, can be interpreted as regressions of the variables on the (unmeasured) factor score; they are models of how an underlying growth factor affects phenotypic traits (the path coefficients in Fig. 1). When comparing integration between humans and chimps in Fig. 7, we interpreted similar regression slopes between the singular warp scores as an indication of similar integration, regardless of the different covariances. Regressions and factor models describe the ability or propensity of phenotypic traits to covary; the induced covariance depends on the sample variance.

Fig. 7 Scores along the first two common factors, computed separately for the face and the cranial base (first two pairs of singular warp scores). Chimpanzees are shown as filled gray circles, male and female humans as filled and empty black circles, respectively. Along common factor 1, humans and chimpanzees are similarly integrated but differ in variance (a), whereas chimpanzees are not as integrated along common factor 2 as humans are (b)

We reviewed different forms of integration (developmental integration, genetic integration, environmental integration) along with multiple other sources of phenotypic covariances, such as geometric and spatial dependencies between the measurements. Developmental integration is the result of a very large number of developmental processes. The effect of single developmental factors can only be identified experimentally (see, e.g., the work by Hallgrimsson et al. on mice; Hallgrimsson et al. 2004, 2006, 2009). Modern imaging techniques allow for the three-dimensional measurement of morphological features even during organogenesis and fetal development (e.g., Metscher 2009; Metscher and Müller 2011). Based on morphometrics alone, it is impossible to specify how many genes or growth processes actually contribute to an estimated common factor and the induced phenotypic covariance, but it is a well-known phenomenon that the vast array of genetic variation is “funneled” into a smaller set of pathways, which in turn influence a smaller set of developmental processes (Hallgrimsson and Lieberman 2008). Furthermore, it has been argued that structural and organizational integration at the morphological level determines the identity and homology of anatomical elements, regardless of the complex underlying genetic and developmental networks (Müller and Newman 1999; Müller 2003). Careful studies of morphological integration can model these few mediating processes and can compare them across age groups and across different species.
The heritable (additive genetic) part of the induced phenotypic variances and covariances is sufficient to predict short-term response to selection. Because experimental approaches are not possible in humans and other primates, studies of morphological integration are a good alternative to investigate developmental integration in these taxa.

Cranial Integration in Humans and Chimpanzees

We studied morphological integration between the cranial base and the face during postnatal human growth using 26 individuals of the longitudinal Denver Growth Study. This is a relatively small sample for reliably estimating variances and covariances and it is probably not a representative random sample, but we could still estimate and interpret two common factors (general factors in Wright’s terminology, or pleiotropic factors in the quantitative genetic language) underlying the integration between the cranial base and the face. These factors were estimated from individual growth between 2 and 4 years of age, but the same two factors were also detectable from the other growth periods and even from the cross-sectional subsamples. In the second analysis, the same two factors could be estimated from a different and very heterogeneous sample of 20 adult humans, even though number and position of the landmarks differed slightly across the two data sets. The first common factor reflects a large face together with a long anterior cranial base (the “roof” of the face) relative to the clivus (which is part of the middle cranial fossa), thus also determining the relative width of the pharynx. The face is under different developmental control than the brain and the basicranium, but facial length and the length of the anterior cranial base need to fit (“growth counterparts”; Enlow and Hans 1996); they are developmentally integrated.
The anterior cranial base elongates in concert with the frontal lobes of the brain, reaching approximately 95% of its adult length by the end of the neural growth period, but the more inferior portions of the anterior cranial base continue to grow as part of the face after the neural growth phase, forming the ethmomaxillary complex (Lieberman et al. 2000). This common factor is of apparent functional relevance because of its effect on jaw size and on pharyngeal width. This is likely the reason why common factor 1 is canalized during postnatal ontogeny: the sample variance of common factor 1 is halved within six years of postnatal development.

The second common factor is a standard finding in cephalometrics: Relative height and length of anatomical elements are integrated within the cranium (dolichocephalic versus brachycephalic crania), where shorter faces are associated with a more flexed cranial base than longer faces (e.g., Enlow and Hans 1996; Bookstein et al. 2003; Bastir and Rosas 2004; Mitteroecker and Bookstein 2008). Growth processes are integrated in this way throughout full postnatal development, and the cross-sectional covariance between these shape aspects increases during ontogeny. This factor seems to be less developmentally canalized than common factor 1, probably because it is of no obvious functional relevance.

Using longitudinal data, we could study both cross-sectional integration and integration of growth itself, i.e., how cross-sectional variance and covariance are actually generated. The variance of common factor 1 during growth is much higher than the variance of common factor 2, and likewise, common factor 1 has a higher cross-sectional variance at age 2 than the second factor (cross-sectional variances of about 0.4 versus 0.3; Fig. 4). But owing to the different developmental dynamics, the situation is reversed at age 16: Common factor 2 is three times more variable than common factor 1 in the cross-sectional sample of 16-year-olds.
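How a cross-sectional variance can shrink for one factor and grow for another follows from Var(x + g) = Var(x) + Var(g) + 2 Cov(x, g), where x is the score at a given age and g the subsequent growth increment. The sketch below assumes a simple targeted-growth rule in which each increment compensates a fraction c of an individual's current deviation from the mean and adds independent variation e; then Cov(x, g) = −c·Var(x), and the variance evolves as Var(t+1) = (1 − c)²·Var(t) + Var(e). The starting variances 0.4 and 0.3 are the approximate age-2 values quoted above. The compensation rates, the noise variances, the yearly step size, and the function name are hypothetical and were not estimated from the Denver data.

```python
def variance_trajectory(initial_variance, compensation, noise_variance, steps):
    """Cross-sectional variance under growth that compensates earlier deviations.

    Each step applies var_next = (1 - compensation)**2 * var + noise_variance,
    which follows from a growth increment g = -compensation * deviation + e
    with e independent of the current deviation.
    """
    variances = [initial_variance]
    for _ in range(steps):
        previous = variances[-1]
        variances.append((1.0 - compensation) ** 2 * previous + noise_variance)
    return variances


# Hypothetical parameters: 14 yearly steps from age 2 to age 16.
canalized = variance_trajectory(0.4, compensation=0.10, noise_variance=0.01, steps=14)
accumulating = variance_trajectory(0.3, compensation=0.0, noise_variance=0.02, steps=14)

if __name__ == "__main__":
    print(f"canalized factor:    {canalized[0]:.2f} -> {canalized[-1]:.2f}")
    print(f"accumulating factor: {accumulating[0]:.2f} -> {accumulating[-1]:.2f}")
```

With these made-up values, the factor whose growth compensates earlier deviations ends with less variance than it started with, whereas the factor without compensation accumulates variance and ends up the more variable of the two. This reproduces the qualitative reversal described in the text; it is not a fit to the reported variances.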
It is a common finding in cephalometrics that the ratio of overall length to height (our common factor 2) is the most dominant pattern of variance and covariance apart from allometry (e.g., Bookstein et al. 2003; Bastir and Rosas 2004; Mitteroecker and Bookstein 2008).

In our second analysis, we compared integration between the face and the cranial base in cross-sectional samples of adult humans and chimpanzees. Basically the same two common factors result from this cross-sectional human sample as from the longitudinal X-ray sample (just in reversed order). Both humans and chimpanzees are similarly integrated regarding the overall length to height ratio of the face and the cranial base, but differ in the integration between facial size and anterior cranial base length. In humans, the face is positioned below the anterior cranial base and hence both parts are developmentally integrated. In chimpanzees, large parts of the face are more anteriorly positioned than the brain case, so that facial size and the length of the anterior cranial base are less integrated and almost uncorrelated in our sample. This is probably the reason why the average species difference between humans and chimpanzees (the evolutionary integration) along this common factor does not resemble the human integration pattern (Fig. 7b). Evolutionary integration of length to height ratios, by contrast, more closely resembles the common pattern of developmental integration in both species (Fig. 7a).

As we set out in the beginning, these insights are not derived from a formal mathematical model, nor are they based on biological experiments. They are based on the consilience of multiple lines of evidence: spatial statistical patterns (deformation grids), temporal statistical patterns (ontogenetic dynamics of variance and covariance), and qualitative biological models (both of development and function).
The Palimpsest

Most studies on integration and evolutionary quantitative genetics assess covariances in cross-sectional samples of adult individuals. It has been shown that cross-sectional variances and covariances continually change during development (e.g., Zelditch et al. 2006; Hallgrimsson et al. 2007, 2009; Mitteroecker and Bookstein 2009). Hallgrimsson et al. (2007) used the metaphor of a medieval palimpsest to describe the ontogeny of the adult covariance structure: Much like a reused scroll on which the shadows of the various texts accumulate over time, “the covariation structure of an adult skull represents the summed imprint of a succession of effects, each of which leaves a distinctive covariation signal determined by the specific set of developmental interactions involved” (p. 164). In our analysis we were able to show how variation in growth at different age stages contributes to the final pattern. We can thus add a further piece to the palimpsest metaphor: Variation of growth processes can accumulate during ontogeny, whereas other growth processes are canalized so that cross-sectional variances and covariances decrease. Some text fragments on the palimpsest accumulate over time, whereas others get (partly) erased in the course of ontogeny. Many, but not all, growth processes varying in a population might be reflected in the adult covariance structure.

Acknowledgments We thank Mihaela Pavlicev and Fred Bookstein for stimulating discussions and helpful comments on the manuscript. We are grateful to Ekaterina Stansfield for loaning us the digitized Denver growth study data.

References

Armbruster, W. S., & Schwaegerle, K. E. (1996). Causes of covariation of phenotypic traits among populations. Journal of Evolutionary Biology, 9, 261–276.
Arnold, S. J., Bürger, R., Hohenlohe, P. A., Ajie, B. C., & Jones, A. G. (2008). Understanding the evolution and stability of the G-matrix. Evolution, 62, 2451–2461.
Arnold, S. J., Pfrender, M. E., & Jones, A. (2001). The adaptive landscape as a conceptual bridge between micro- and macroevolution. Genetica, 112–113, 9–32.
Arthur, W. (2002). The emerging conceptual framework of evolutionary developmental biology. Nature, 415(14), 757–764.
Bastir, M., & Rosas, A. (2004). Facial heights: Evolutionary relevance of postnatal ontogeny for facial orientation and skull morphology in humans and chimpanzees. American Journal of Physical Anthropology, 47, 359–381.
Bastir, M., & Rosas, A. (2005). Hierarchical nature of morphological integration and modularity in the human posterior face. American Journal of Physical Anthropology, 128(1), 26–34.
Bastir, M., & Rosas, A. (2006). Correlated variation between the lateral basicranium and the face: A geometric morphometric study in different human groups. Archives of Oral Biology, 51, 814–824.
Bastir, M., Rosas, A., Stringer, C., Manuel Cuétara, J., Kruszynski, R., Weber, G. W., et al. (2010). Effects of brain and facial size on basicranial form in human and primate evolution. Journal of Human Evolution, 58(5), 424–431.
Berg, R. L. (1960). The ecological significance of correlation pleiades. Evolution, 14, 171–180.
Bogin, B. (1999). Patterns of human growth. Cambridge: Cambridge University Press.
Bonner, J. T. (1988). The evolution of complexity by means of natural selection. Princeton, NJ: Princeton University Press.
Bookstein, F. (1991). Morphometric tools for landmark data: Geometry and biology. Cambridge, UK: Cambridge University Press.
Bookstein, F. (1996). Biometrics, biomathematics and the morphometric synthesis. Bulletin of Mathematical Biology, 58(2), 313–365.
Bookstein, F. (1997). Landmark methods for forms without landmarks: Morphometrics of group differences in outline shape. Medical Image Analysis, 1(3), 225–243.
Bookstein, F. L. (in press). Reasoning and measuring: Numerical inferences in the sciences. Cambridge: Cambridge University Press.
Bookstein, F. L., Gunz, P., Mitteroecker, P., Prossinger, H., Schaefer, K., & Seidler, H. (2003). Cranial integration in Homo: Singular warps analysis of the midsagittal plane in ontogeny and evolution. Journal of Human Evolution, 44(2), 167–187.
Bulygina, E., Mitteroecker, P., & Aiello, L. C. (2006). Ontogeny of facial dimorphism and patterns of individual development within one human population. American Journal of Physical Anthropology, 131(3), 432–443.
Chernoff, B., & Magwene, P. M. (1999). Morphological integration: Forty years later. In: Morphological integration (pp. 319–354). Chicago: University of Chicago Press.
Cheverud, J. M. (1982). Phenotypic, genetic, and environmental morphological integration in the cranium. Evolution, 36, 499–516.
Cheverud, J. M. (1984). Quantitative genetic and developmental constraints on evolution by selection. Journal of Theoretical Biology, 110, 155–171.
Cheverud, J. M. (1988). A comparison of genetic and phenotypic correlations. Evolution, 42(5), 958–968.
Cheverud, J. M. (1989). A comparative analysis of morphological variation patterns in papionins. Evolution, 43, 1737–1747.
Cheverud, J. M. (1996a). Developmental integration and the evolution of pleiotropy. American Zoologist, 36, 44–50.
Cheverud, J. M. (1996b). Quantitative genetic analysis of cranial morphology in the cotton-top (Saguinus oedipus) and saddleback (S. fuscicollis) tamarins. Journal of Evolutionary Biology, 9, 5–42.
Cheverud, J. M., Wagner, G. P., & Dow, M. M. (1989). Methods for the comparative analysis of variation patterns. Systematic Zoology, 38, 201–213.
Clausen, J., & Hiesey, W. M. (1960). The balance between coherence and variation in evolution. PNAS, 46(4), 494–506.
Debat, V., & David, P. (2001). Mapping phenotypes: Canalization, plasticity and developmental stability. Trends in Ecology & Evolution, 16(10), 555–561.
Enlow, D., & Hans, M. (1996). Essentials of facial growth. Philadelphia, PA: Saunders Company.
Falconer, D. S., & Mackay, T. F. C. (1996). Introduction to quantitative genetics. Essex: Longman.
Fisher, R. A. (1930). The genetical theory of natural selection. Oxford: Clarendon.
Galis, F., Van Dooren, T. J., Feuth, J. D., Metz, J. A., Witkam, A., Ruinard, S., et al. (2006). Extreme selection in humans against homeotic transformations of cervical vertebrae. Evolution, 60(12), 2643–2654.
Gromko, M. H. (1995). Unpredictability of correlated response to selection: Pleiotropy and sampling interact. Evolution, 49, 685–693.
Gould, S. J. (1977). Ontogeny and phylogeny. Cambridge: Harvard University Press.
Gunz, P., & Harvati, K. (2007). The Neanderthal “chignon”: Variation, integration, and homology. Journal of Human Evolution, 52(3), 262–274.
Gunz, P., Mitteroecker, P., & Bookstein, F. L. (2005). Semilandmarks in three dimensions. In: D. E. Slice (Ed.), Modern morphometrics in physical anthropology (pp. 73–98). New York: Kluwer Press.
Gunz, P., Mitteroecker, P., Neubauer, S., Weber, G. W., & Bookstein, F. L. (2009). Principles for the virtual reconstruction of hominin crania. Journal of Human Evolution, 57(1), 48–62.
Haber, A. (2011). A comparative analysis of integration indices. Evolutionary Biology, 38, 476–488.
Hallgrimsson, B., Brown, J. J., Ford-Hutchinson, A. F., Sheets, H. D., Zelditch, M. L., & Jirik, F. R. (2006). The brachymorph mouse and the developmental-genetic basis for canalization and morphological integration. Evolution & Development, 8(1), 61–73.
Hallgrimsson, B., Dorval, C. J., Zelditch, M. L., & German, R. Z. (2004). Craniofacial variability and morphological integration in mice susceptible to cleft lip and palate. Journal of Anatomy, 205(6), 501–517.
Hallgrimsson, B., Jamniczky, H., Young, N. M., Rolian, C., Parson, T. E., Boughner, J. C., et al. (2009). Deciphering the palimpsest: Studying the relationship between morphological integration and phenotypic covariation. Evolutionary Biology, 36(4), 355–376.
Hallgrimsson, B., & Lieberman, D. E. (2008). Mouse models and the evolutionary developmental biology of the skull. Integrative and Comparative Biology, 48, 373–384.
Hallgrimsson, B., Lieberman, D. E., Young, N. M., Parsons, T., & Wat, S. (2007). Evolution of covariance in the mammalian skull. Novartis Foundation Symposium, 284 (Tinkering: The Microevolution of Development), 164–190.
Hansen, T. F. (2003). Is modularity necessary for evolvability? Remarks on the relationship between pleiotropy and evolvability. Biosystems, 69(2–3), 83–94.
Hansen, T. F., & Houle, D. (2008). Measuring and comparing evolvability and constraint in multivariate characters. Journal of Evolutionary Biology, 21(5), 1201–1219.
Helms, J. A., Cordero, D., & Tapadia, M. D. (2005). New insights into craniofacial morphogenesis. Development, 132(5), 851–861.
Hodgkin, J. (1998). Seven types of pleiotropy. The International Journal of Developmental Biology, 42(3), 501–505.
Houle, D. (1991). Genetic covariance of fitness correlates: What genetic correlations are made of and why it matters. Evolution, 45, 630–648.
Huttegger, S., & Mitteroecker, P. (2011). Invariance and meaningfulness in phenotype spaces. Evolutionary Biology, 38, 335–352.
Huxley, J. S. (1932). Problems of relative growth. London: Methuen and Co.
Klingenberg, C. P. (2008). Morphological integration and developmental modularity. Annual Review of Ecology, Evolution and Systematics, 39, 115–132.
Klingenberg, C. P. (1998). Heterochrony and allometry: The analysis of evolutionary change in ontogeny. Biological Reviews, 73, 70–123.
Klingenberg, C. P., Mebus, K., & Auffray, J. C. (2003). Developmental integration in a complex morphological structure: How distinct are the modules in the mouse mandible? Evolution & Development, 5(5), 522–531.
Klingenberg, C. P., & Zaklan, S. D. (2000). Morphological integration between developmental compartments in the Drosophila wing. Evolution, 54(4), 1273–1285.
Lande, R. (1979). Quantitative genetic analysis of multivariate evolution, applied to brain:body size allometry. Evolution, 33, 402–416.
Lande, R. (1980). The genetic covariance between characters maintained by pleiotropic mutations. Genetics, 94, 203–215.
Lande, R. (1984). The genetic correlation between characters maintained by selection, linkage and inbreeding. Genetical Research, 44, 309–320.
Lieberman, D. E. (2011). The evolution of the human head. Cambridge, MA: Belknap Press/Harvard University Press.
Lieberman, D. E., Ross, C., & Ravosa, M. J. (2000). The primate cranial base: Ontogeny, function, and integration. Yearbook of Physical Anthropology, 43, 117–169.
Leamy, L. (1977). Genetic and environmental correlations of morphometric traits in randombred house mice. Evolution, 31(2), 357–369.
Lynch, M., & Walsh, B. (1998). Genetics and analysis of quantitative traits. Sunderland, MA: Sinauer Associates.
Marroig, G., & Cheverud, J. M. (2004). Did natural selection or genetic drift produce the cranial diversification of neotropical monkeys? American Naturalist, 163(3), 417–428.
Martens, H., & Naes, T. (1989). Multivariate calibration. Chichester: Wiley.
Martinez-Abadias, N., Esparza, M., Sjovold, T., Gonzalez-Jose, R., Santos, M., & Hernandez, M. (2009). Heritability of human cranial dimensions: Comparing the evolvability of different cranial regions. Journal of Anatomy, 214(1), 19–35.
Martínez-Abadías, N., Esparza, M., Sjøvold, T., González-José, R., Santos, M., Hernández, M., et al. (in press). Pervasive genetic integration directs the evolution of human skull shape. Evolution.
Maynard Smith, J., Burian, R., Kauffman, S., Alberch, P., Campbell, J., Goodwin, B., et al. (1985). Developmental constraints and evolution: A perspective from the mountain lake conference on development and evolution. The Quarterly Review of Biology, 60(3), 265–287.
Metscher, B. D. (2009). MicroCT for developmental biology: A versatile tool for high-contrast 3D imaging at histological resolutions. Developmental Dynamics, 238(3), 632–640.
Metscher, B. D., & Müller, G. B. (2011). MicroCT for molecular imaging: Quantitative visualization of complete three-dimensional distributions of gene products in embryonic limbs. Developmental Dynamics, 240, 2301–2308.
Mitteroecker, P. (2009). The developmental basis of variational modularity: Insights from quantitative genetics, morphometrics, and developmental biology. Evolutionary Biology, 36, 377–385.
Mitteroecker, P., & Bookstein, F. (2009). The ontogenetic trajectory of the phenotypic covariance matrix, with examples from craniofacial shape in rats and humans. Evolution, 63(3), 727–737.
Mitteroecker, P., & Bookstein, F. L. (2007). The conceptual and statistical relationship between modularity and morphological integration. Systematic Biology, 56(5), 818–836.
Mitteroecker, P., & Bookstein, F. L. (2008). The evolutionary role of modularity and integration in the hominoid cranium. Evolution, 62(4), 943–958.
Mitteroecker, P., & Gunz, P. (2009). Advances in geometric morphometrics. Evolutionary Biology, 36, 235–247.
Mitteroecker, P., & Huttegger, S. (2009). The concept of morphospaces in evolutionary and developmental biology: Mathematics and metaphors. Biological Theory, 4(1), 54–67.
Monteiro, L. R., Bonato, V., & Reis, S. F. (2005). Evolutionary integration and morphological diversification in complex morphological structures: Mandible shape divergence in spiny rats (Rodentia, Echimyidae). Evolution & Development, 7(5), 429–439.
Müller, G. B. (2003). Homology: The evolution of morphological organization. In: G. B. Müller, & S. A. Newman (Eds.), Origination of organismal form: Beyond the gene in developmental and evolutionary biology. Cambridge, MA: MIT Press.
Müller, G. B. (2007). Evo-devo: Extending the evolutionary synthesis. Nature Reviews Genetics, 8(12), 943–949.
Müller, G. B., & Newman, S. A. (1999). Generation, integration, autonomy: Three steps in the evolution of homology. Novartis Foundation Symposium, 222, 65–73.
Needham, J. (1933). On the dissociability of the fundamental processes in ontogenesis. Biological Reviews, 8, 180–223.
Neubauer, S., Gunz, P., & Hublin, J.-J. (2010). Endocranial shape changes during growth in chimpanzees and humans: A morphometric analysis of unique and shared aspects. Journal of Human Evolution, 59, 555–566.
Olson, E. C., & Miller, R. L. (1958). Morphological integration. Chicago: University of Chicago Press.
Pavlicev, M., & Hansen, T. F. (2011). Genotype-phenotype maps maximizing evolvability: Modularity revisited. Evolutionary Biology, 38(4), 371–389.
Pavlicev, M., Wagner, G., & Cheverud, J. M. (2009). Measuring evolutionary constraints through the dimensionality of the phenotype: Adjusted bootstrap method to estimate rank of phenotypic covariance matrices. Evolutionary Biology, 36, 339–353.
Pigliucci, M. (2006). Genetic variance-covariance matrices: A critique of the evolutionary quantitative genetics research program. Biology and Philosophy, 21, 1–23.
Pigliucci, M., & Preston, K. (Eds.) (2004). Phenotypic integration: Studying the ecology and evolution of complex phenotypes. Oxford: Oxford University Press.
Raff, R. (1996). The shape of life: Genes, development, and the evolution of animal form. Chicago: University of Chicago Press.
Riedl, R. J. (1978). Order in living organisms. New York: John Wiley and Sons.
Roff, D. A. (1997). Evolutionary quantitative genetics. New York: Chapman & Hall.
Rohlf, F. J., & Bookstein, F. (1987). A comment on shearing as a method for “size correction”. Systematic Zoology, 36, 356–367.
Rohlf, F. J., & Corti, M. (2000). The use of two-block partial least-squares to study covariation in shape. Systematic Biology, 49, 740–753.
Rohlf, F. J., & Slice, D. E. (1990). Extensions of the Procrustes method for the optimal superimposition of landmarks. Systematic Zoology, 39, 40–59.
Ross, C., & Henneberg, M. (1995). Basicranial flexion, relative brain size and facial kyphosis in Homo sapiens and some fossil hominids. American Journal of Physical Anthropology, 98, 575–593.
Sawin, P. B., Fox, R. R., & Latimer, H. B. (1970). Morphogenetic studies of the rabbit XLI. Gradients of correlation in the architecture of morphology. American Journal of Anatomy, 128(2), 137–145.
Schluter, D. (1996). Adaptive radiation along genetic lines of least resistance. Evolution, 50(5), 1766–1774.
Sinervo, B., & Svensson, E. (2002). Correlational selection and the evolution of genomic architecture. Heredity, 89, 329–338.
Sperber, G. H. (2001). Craniofacial development. Ontario: BC Decker Inc.
Stadler, P. F., & Stadler, B. M. R. (2006). Genotype-phenotype maps. Biological Theory, 1(3), 268–279.
Tanner, J. M. (1963). Regulation of growth in size in mammals. Nature, 199, 845–850.
Terentjev, P. V. (1931). Biometrische Untersuchungen über die morphologischen Merkmale von Rana ridibunda Pall. (Amphibia, Salientia). Biometrika, 23, 23–51.
Thompson, D. A. W. (1917). On growth and form. Cambridge: Cambridge University Press.
Waddington, C. H. (1942). The canalization of development and the inheritance of acquired characters. Nature, 150, 563.
Wagner, G., & Zhang, J. (2011). The pleiotropic structure of the genotype-phenotype map: The evolvability of complex organisms. Nature Reviews Genetics, 12, 204–213.
Wagner, G. P. (2000). What is the promise of developmental evolution? Part I: Why is developmental biology necessary to explain evolutionary innovations? Journal of Experimental Zoology. Part B, Molecular and Developmental Evolution, 288, 95–98.
Wagner, G. P., & Altenberg, L. (1996). Complex adaptations and the evolution of evolvability. Evolution, 50(3), 967–976.
Wagner, G. P., Pavlicev, M., & Cheverud, J. M. (2007). The road to modularity. Nature Reviews Genetics, 8, 921–931.
Wright, S. (1932). General, group and special size factors. Genetics, 15, 603–619.
Zelditch, M. L. (1987). Evaluating models of developmental integration in the laboratory rat using confirmatory factor analysis. Systematic Zoology, 36, 368–380.
Zelditch, M. L. (1988). Ontogenetic variation in patterns of phenotypic integration in the laboratory rat. Evolution, 42(1), 28–41.
Zelditch, M. L., Mezey, J. G., Sheets, H. D., Lundrigan, B. L., & Garland, T. (2006). Developmental regulation of skull morphology II: Ontogenetic dynamics of covariance. Evolution & Development, 8, 46–60.
