Uses of phylogenies Yves Desdevises Université Pierre et Marie Curie (Paris 6) Observatoire Océanologique de Banyuls France
[email protected] http://desdevises.free.fr
References
• Felsenstein J. 1985. Phylogeny and the comparative method. American Naturalist 125: 1-15.
• Harvey P. & Pagel M. 1991. The comparative method in evolutionary biology. Oxford University Press.
• Brooks D. & McLennan D. 1991. Phylogeny, ecology, and behaviour. A research program in comparative biology. The University of Chicago Press.
• Harvey P. et al. 1996. New uses for new phylogenies. Oxford University Press.
• Brooks D. & McLennan D. 2003. The nature of diversity. An evolutionary voyage of discovery. The University of Chicago Press.
• Reconstruction of phylogenies for purposes other than taxonomy/systematics
• Trees: can be built from all character types, but molecular data can generate branch lengths
• Most methods require robust and well-resolved trees: a limitation in many cases
Phylogenies
• Uses
• Discover the origin of organisms
• Understand their evolutionary relationships
• Classification
• But also...
• Inference of ancestral character states
• Study of correlated evolution
• Study of adaptation
• Molecular evolution
• Diversification rates and key innovations
• Estimation of dates of divergence
• Biogeography and phylogeography
• Cospeciation
• etc.
Adaptation
• Adaptation = trait, or set of traits, increasing fitness, thus leading individuals to produce more descendants than those without this trait
• Trait whose origin is associated with an increase in functional efficiency favoured by natural selection
• Product of natural selection
• Difficult to reveal: must be tested
• Acts on design and/or reproduction
• Must be investigated in an evolutionary context
• Adaptation “appears” in a lineage
• ≠ inherited from an ancestor
• Example: adaptation = hydrodynamic shape
• In the cetacean clade, hydrodynamic shape is inherited, and is not an adaptation sensu stricto in each species
Adaptation/Acclimatization/Phenotypic plasticity
• Adaptation: evolution via natural selection, in a species, over many generations. Modification of the genotype. Often not reversible
• Acclimatization: physiological, biochemical, anatomical change, in an individual, due to the exposure to a new environment. Often reversible
• Phenotypic plasticity: ability of a genotype to produce different phenotypes related to an environmental change, in an individual. Plasticity can be adaptive.
[Figure: adaptation, acclimatization and phenotypic plasticity placed along a time axis]
A component of evolution
• Evolution ≠ adaptation
• Adaptation (selection)
• Structure (physics)
• Phylogenetic inertia (traits from ancestor)
• Constraints (not everything is possible)
• Organism = trade-offs
• Gradual (Darwin) or punctuational changes
• Many transitional steps
• Regulatory genes: small genetic modification = big phenotypic change
• Combination of several elements: different structures, genes, organisms (endosymbiosis)
Adaptation ≠ perfection
• Time lag between adaptation and environmental change: “imperfections”, maladaptation
• Constraints
• Genetic: e.g. persistence of deleterious homozygotes if heterozygotes have a superior fitness
• Developmental: development influences the possible variations
• Historical
• Adaptation arises from an existing state: the possible solution may not be optimal
• Do not make unwarranted adaptive interpretations!!
• “Spandrelism”
Study of adaptation • Theoretical approach: models, predictions • Observational approach: in the field • Experimental approach: alteration of potentially adaptive structure
• Comparative approach: correlation between structures or structure/function in several lineages (usually species), taking phylogeny into account. Comparative method/analysis.
Comparative analysis
• Study of correlated character evolution
• Search for adaptations via a cross-species correlative approach
• Link: trait/trait or trait/environment
• Size / longevity
• Metabolism / temperature
• Colour / gregariousness
• Some key dates
• 1985: Felsenstein
• 1991: Harvey & Pagel; Brooks & McLennan
• Closely related species tend to be similar, and are not independent observations (pseudoreplication)
• Phylogenetic constraints (morphology, physiology, genetics, development, ...): inertia
• Phylogeny = confounding variable
Example
Hypothesis: size is an adaptation to altitude

Species   A   B   C   D   E   F   G   H   I   J
Altitude  5   6   8   12  14  24  25  28  29  30
Size      2   6   6   7   12  13  14  14  16  17

[Figure: Size plotted against Altitude]
Correlation: adaptation?
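The naive analysis that this example questions can be sketched as follows: a plain Pearson correlation between altitude and size that treats the ten species as independent observations. The data are the values from the slide; the function name `pearson` is only an illustrative choice.

```python
import math

# Values from the example table (species A to J)
altitude = [5, 6, 8, 12, 14, 24, 25, 28, 29, 30]
size = [2, 6, 6, 7, 12, 13, 14, 14, 16, 17]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Ignores phylogeny: every species counts as an independent data point
r = pearson(altitude, size)
print(f"r = {r:.3f}")
```

A strong raw correlation of this kind is exactly what cannot be interpreted as adaptation before phylogenetic non-independence has been controlled.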
Problem
• Phylogeny makes data non-independent
[Figure: same species table and Size against Altitude plot as in the Example]
Solution
• Control phylogenetic constraints (comparative analysis)
[Figure: Size (corrected) plotted against Altitude (corrected): no correlation]
Comparative analysis
• Several methods exist to take phylogeny into account in analyses
• Best known: independent contrasts (Felsenstein, 1985)
• Trait variation partitioning:
[Figure: 100 % of trait variation split into Environment, Phylogeny and an overlapping fraction marked “?”; Westoby et al. (1995)]
When to use it?
• To look for causality (hypothesis)
• Does X influence Y?
• Not in a predictive framework
• What is the value of Y for a given X?
• Not with any kind of character: the method assumes a link with phylogeny, so the character must be potentially transmittable (heritable)
Quantitative characters • Many methods • Also for semi-quantitative variables • Qualitative variables may be used via recoding
Phylogenetic association matrices
• Assess phylogenetic dependence (structure) of data
• Two basic types
• Variance-covariance matrices
• Distance matrices
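Variance-covariance matrix
[Figure: species × species matrix for species 1 to 5. Diagonal: variances S²1 to S²5. Off-diagonal: covariances S1,2, S1,3, S2,3 and S4,5; 0 for every pair taken between {1, 2, 3} and {4, 5}]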
Patristic distance matrix
[Figure: species × species matrix for species 1 to 5. Diagonal: 0. Off-diagonal: pairwise patristic distances Di,j (e.g. D3,2, D5,1, D5,3)]
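As a minimal sketch of how both matrix types can be derived from a tree: covariance is taken as the branch length two species share from the root (the Brownian-motion expectation), and patristic distance as the path length between two tips. The three-species tree, its branch lengths and all names below are hypothetical.

```python
# Hypothetical tree ((sp1:1,sp2:1)n1:2,sp3:3), stored as child -> (parent, branch length)
parent = {
    "sp1": ("n1", 1.0),
    "sp2": ("n1", 1.0),
    "n1": ("root", 2.0),
    "sp3": ("root", 3.0),
}
tips = ["sp1", "sp2", "sp3"]

def path_to_root(tip):
    """Return {node: length of the branch above it} from a tip up to the root."""
    path = {}
    node = tip
    while node in parent:
        up, length = parent[node]
        path[node] = length
        node = up
    return path

def shared_length(a, b):
    """Branch length common to the root-to-tip paths of a and b."""
    pa, pb = path_to_root(a), path_to_root(b)
    return sum(length for node, length in pa.items() if node in pb)

def vcv_matrix():
    """Variance-covariance matrix: shared branch length for each pair of tips."""
    return [[shared_length(a, b) for b in tips] for a in tips]

def patristic_matrix():
    """Patristic distances: path length between each pair of tips."""
    depth = {t: sum(path_to_root(t).values()) for t in tips}
    return [[depth[a] + depth[b] - 2 * shared_length(a, b) for b in tips]
            for a in tips]

vcv = vcv_matrix()
dist = patristic_matrix()
print(vcv)
print(dist)
```

Species with no shared history below the root get a covariance of 0, which matches the zero entries in the matrix shown on the slide.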
Models of character evolution
• Two basic models
• Brownian motion (BM): variance is a linear function of (proportional to) time = branch lengths on tree
[Figure: Brownian motion, trait value (x) against time]
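The claim that variance grows linearly with time under Brownian motion can be checked with a small simulation. This is only a sketch: the number of lineages, the step count, the unit step variance and the seed are arbitrary choices, not values from the slides.

```python
import random

def simulate_bm(n_lineages, n_steps, sigma=1.0, seed=1):
    """Simulate independent Brownian-motion lineages starting at 0.

    Returns a list holding the trait values of all lineages after each step.
    """
    rng = random.Random(seed)
    values = [0.0] * n_lineages
    history = []
    for _ in range(n_steps):
        values = [v + rng.gauss(0.0, sigma) for v in values]
        history.append(list(values))
    return history

def variance(xs):
    """Sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

history = simulate_bm(n_lineages=5000, n_steps=100)
var_50 = variance(history[49])    # among-lineage variance after 50 steps
var_100 = variance(history[99])   # among-lineage variance after 100 steps
print(var_50, var_100, var_100 / var_50)
```

Doubling the elapsed time should roughly double the variance among lineages, which is why branch lengths can stand in for expected variance.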
• Addition of constraints (adaptive, ...) and estimation of the corresponding parameters: different relationships between branch lengths and character variances (e.g. Ornstein-Uhlenbeck model: OU)
[Figure: divergence against time under genetic drift (BM) and under BM + stabilizing selection]
Methods
• Methods based on evolutionary models
• Independent contrasts (FIC; Felsenstein 1985) (CAIC, COMPARE, PDAP, ...)
• Phylogenetic generalized least squares (PGLS; Martins 1994) (COMPARE)
• Phylogenetic mixed model (PMM; Lynch et al. 2004) (COMPARE)
• Methods with statistical bases
• Autoregressive method (ARM; Cheverud et al. 1985) (AUTOPHY)
• Phylogenetic eigenvector regression (PVR; Diniz-Filho et al. 1998)
• Different assumptions
• Results may be different
• Methods may be difficult to compare
• Interesting approach: use several methods and compare results
Independent contrasts (FIC; PIC)
• Most used method
• Based on a variance-covariance matrix
• Evolutionary model = Brownian motion: genetic drift, some selective regimes
• Time = branch length = variance
• Assumes a well-known phylogeny
• Ideally for quantitative variables, but a variable can be qualitative
• Correlation among traits
[Figure: Brownian-motion trajectories of two traits (x, y) through time, for ρ = 0.9 and ρ = 0.0]
Estimation of ancestral character values (weighted means)
[Figure: worked example of contrasts for two traits X and Y on a small tree, with a plot of Y against X for the raw values and a plot of IC Y against IC X for the independent contrasts]
• Contrasts must be standardised: divided by their standard deviation (√variance)
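The computation can be sketched as below, following Felsenstein's (1985) pruning rule: each node yields one contrast divided by the square root of the summed branch lengths, an ancestral value that is a weighted mean of its descendants, and an extra length added to the branch above it. The four-taxon tree, its unit branch lengths and the trait values are hypothetical, and the tuple encoding of the tree is just a convenience for this sketch.

```python
import math

def contrasts(node, trait):
    """Return (node value, extra branch length, list of standardised contrasts).

    A leaf is a species name; an internal node is a tuple
    (left child, left branch length, right child, right branch length).
    """
    if isinstance(node, str):
        return trait[node], 0.0, []
    left, v_left, right, v_right = node
    x_l, extra_l, c_l = contrasts(left, trait)
    x_r, extra_r, c_r = contrasts(right, trait)
    v_l = v_left + extra_l    # branch lengthened for uncertainty in x_l
    v_r = v_right + extra_r
    contrast = (x_l - x_r) / math.sqrt(v_l + v_r)   # standardised contrast
    x_anc = (x_l / v_l + x_r / v_r) / (1 / v_l + 1 / v_r)  # weighted mean
    extra = v_l * v_r / (v_l + v_r)
    return x_anc, extra, c_l + c_r + [contrast]

# Hypothetical tree ((A:1,B:1):1,(C:1,D:1):1) and trait values
tree = (("A", 1.0, "B", 1.0), 1.0, ("C", 1.0, "D", 1.0), 1.0)
x = {"A": 8.0, "B": 10.0, "C": 14.0, "D": 17.0}

root_value, _, ic_x = contrasts(tree, x)
print(root_value, ic_x)
```

Four species give three contrasts (n - 1), one per internal node; the sign of each contrast is arbitrary as long as the same direction is used for every trait.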
• Contrasts can be used in various kinds of analyses
• Regressions (through the origin, because the computation of contrasts eliminates the constant term)
• Multivariate analyses
• Powerful because few parameters to estimate
• Model testing: contrasts vs SD
• Slope = 0 if BM
• Else: transformation of branch lengths or different model (PGLS, ...)
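A regression through the origin on contrasts reduces to a single ratio of sums, as in the sketch below. The contrast values are invented for illustration and the function name is hypothetical.

```python
def slope_through_origin(cx, cy):
    """Least-squares slope of cy on cx with the intercept fixed at zero."""
    return sum(a * b for a, b in zip(cx, cy)) / sum(a * a for a in cx)

# Hypothetical standardised contrasts for two traits X and Y
ic_x = [-1.0, 2.0, 0.5, -3.0]
ic_y = [-2.1, 3.8, 1.2, -6.0]

b = slope_through_origin(ic_x, ic_y)
print(f"slope = {b:.3f}")
```

Forcing the line through the origin also makes the result independent of the arbitrary sign given to each contrast, since flipping both contrasts of a pair leaves the products unchanged.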
• Improvements exist to take polytomies into account
• Does not consider data variance!
• “Removes phylogeny” (without explicitly quantifying it) from the correlation among traits
• Paradoxical for studying adaptation, because adaptation supposes selection (≠ BM)
MacroCAIC: study of taxonomic diversification
• Derived from CAIC: independent contrasts
• Uses the variable “number of taxa per clade” in a comparative analysis
• Values at the nodes are not means (as in classical IC) but sums of the values of their descendants
[Figure: tree with node values 5, 3 and 2]
• Measures whether the value at a node is correlated with the number of species descending from it
• Can assess diversification rate (if branch lengths = time)
Phylogenetic generalized least squares regression (PGLS)
• Generalization of FIC to other models
• Considers the phylogenetic structure of the error term (non-independent observations, heteroscedasticity) via the variance-covariance matrix
• Estimation of a constraint parameter (adaptation, stabilizing selection...) added to BM
• Considers data variances
• The number of parameters to estimate reduces power
• Allows reconstruction of ancestral character states (like IC)
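As a hedged illustration of what “phylogenetic structure of the error term” means, the standard generalized least squares estimator can be written as follows. The symbols are notation introduced here, not taken from the slides: y is the vector of trait values, X the design matrix, V the phylogenetic variance-covariance matrix among species, and σ² a scaling constant.

```latex
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
\boldsymbol{\varepsilon} \sim \mathcal{N}\!\left(\mathbf{0},\, \sigma^{2}\mathbf{V}\right)

\hat{\boldsymbol{\beta}} =
\left(\mathbf{X}^{\top}\mathbf{V}^{-1}\mathbf{X}\right)^{-1}
\mathbf{X}^{\top}\mathbf{V}^{-1}\mathbf{y}
```

With V equal to the identity matrix (no shared history, equal variances) the estimator reduces to ordinary least squares; a different evolutionary model changes only how V is built from the tree.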
Phylogenetic eigenvector regression (PVR)
• Statistical basis (no explicit model)
• Transforms the distance matrix into principal coordinates (PCs)
• Problem: accounts for only a part of the phylogeny
• The first PCs are the tip of the tree
• Simple solution: patristic distance matrix (genetic distances) between species
[Figure: n × n distance matrix]
Problem: requires the other variables (“environment”) to be expressed as distance matrices too
• Transformation of phylogenetic distances into principal coordinates
[Figure: sequences (ACGTTCGGA, ACTGTCGGA, AGTGTCCGA) or coded characters (010010100, 010110110, 001110110) → n × n distance matrix (raw or patristic distances) → principal coordinate analysis → matrix of n rows by at most n-1 principal coordinates]
• Principal coordinate analysis yields at most n-1 independent variables (principal coordinates) representing phylogenetic distances
• Variance splitting is a function of the phylogeny size and structure
• PC selection: broken stick, backward elimination, ...
• Efficient method with few taxa
• Explicitly quantifies the fraction of trait variation linked to phylogeny
• Y = βX + ε ; T = P + S
• Versatility of distance matrices: trees, reticulograms, raw distances, ...
• Allows variation partitioning and quantification of “phylogenetic niche conservatism”
• Software
• R 4.0 (Mac), DistPCoA (Mac, PC) for PCoA
• Permute! (Mac) for tests
Phylogenetically structured environmental variation
• “Phylogenetic” fraction (inertia)
• “Non-historical” fraction, specific (adaptation, ...)
• We can quantify the common fraction (phylogenetic niche conservatism)
[Figure: 100 % of trait variation split into Environment, Phylogeny and an overlapping fraction marked “?”]
Variation partitioning
• Effect of 2 variables X1 and X2 on a variable Y
• Example: effect of temperature (X1) and humidity (X2) on growth (Y)
• Temperature and humidity each have an influence on growth
• Temperature and humidity are correlated
[Figure: 100 % of Y variation split into fractions a, b, c and d]
Variation explained by X1 = R²1 = a+b
Variation explained by X2 = R²2 = b+c
Variation explained by X1 and X2 = R²1,2 = a+b+c
Unexplained variation = d
With a+b+c+d = 100 %
a, b, c and d are computed by subtraction
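The subtraction step can be sketched in a few lines. The three R² values passed in at the end are hypothetical, and the function name `partition` is only illustrative.

```python
def partition(r2_x1, r2_x2, r2_both):
    """Split the variation of Y into fractions a, b, c, d by subtraction.

    r2_x1:   R² of Y on X1 alone        (= a + b)
    r2_x2:   R² of Y on X2 alone        (= b + c)
    r2_both: R² of Y on X1 and X2       (= a + b + c)
    """
    a = r2_both - r2_x2           # explained by X1 only
    c = r2_both - r2_x1           # explained by X2 only
    b = r2_x1 + r2_x2 - r2_both   # explained jointly by X1 and X2
    d = 1.0 - r2_both             # unexplained
    return a, b, c, d

# Hypothetical coefficients of determination
a, b, c, d = partition(r2_x1=0.50, r2_x2=0.40, r2_both=0.65)
print(a, b, c, d)
```

In a phylogenetic setting X1 and X2 become “environment” and “phylogeny”, and fraction b is the part of the trait variation that the two explain in common.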
Application to a phylogenetic context
• Y = variable we want to explain: “trait” potentially controlled (at least in part) by phylogeny
• X1 = “environment”: regression (e.g. linear)
• X2 = “phylogeny”: we have to express it
Method
• Phylogenetic effect: regression of the studied trait Y on principal coordinates (P): R²P
• Environmental effect: regression of Y on the variable(s) E: R²E
• Phylogeny and environment: multiple regression of Y on variables P and E: R²P+E
[Figure: 100 % of Y variation split into fractions a, b, c and d]
Phylogeny = R²P (= a+b)
Environment = R²E (= b+c)
Phylogeny and environment = R²P+E (= a+b+c)
b = phylogenetically structured environmental variation (Desdevises et al. 2003)
Discrete characters
• Binary characters
• 0/1 (presence/absence, 2 states, ...)
• Characters with multiple states, ordered or not
• 0/1/2/3
• Question: for 2 characters
• Is the presence of a character state associated with (does it lead to) the presence (the appearance) of a given state in the other character?
χ² test on transitions
• Independent species: P < 0.005
• Transitions: P = 0.99
Concentrated Change Test
• Maddison (1990)
• Are changes in character B located at the same places in the tree as those in character A?
• Generation of the null distribution (no association) via random permutations of data (MacClade)
• Does not take branch lengths into account
• Depends on the number of branches
Pairwise comparisons
• Maddison (2000)
• Comparison of pairs of taxa with no phylogenetic connexion (Mesquite)
• Algorithm to find the optimal number of taxon pairings, with various options concerning the contrasts between character states
• Pairwise comparison: flight/no flight
• Separated pairs: independent comparisons
• Pair choice: contrasting one character or the other, or both, or none (maximum number of pairs)
• Only constraint: pairs must be phylogenetically separate
• The more pairs, the more statistical power, but the choice must be relevant
• e.g. contrast pairs for the independent character, and see what happens for the response character
• Several choices
• No model
• No information on direction of change
• No need to infer ancestral states: but few pairs
• Low power if few pairs
ML tests
• Test via ML: estimation of transition rates from one character state to another (BayesTraits, which includes Discrete)
• LRT: is the likelihood different when changes are constrained to be independent?
• Takes branch lengths into account
• Inference of ancestral states
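The likelihood-ratio test itself is simple once the two log-likelihoods are available. The sketch below assumes 4 degrees of freedom, i.e. a dependent model with 8 transition rates against an independent model with 4 for two binary characters; that parameter count and the log-likelihood values are assumptions for illustration, not figures from the slides. The chi-square tail is computed with the closed form that holds for even degrees of freedom only.

```python
import math

def lrt_p_value(lnl_null, lnl_alt, df):
    """Likelihood-ratio statistic and its chi-square p-value (even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("this sketch handles even, positive df only")
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0:
        return stat, 1.0
    half = stat / 2.0
    # Chi-square survival function for df = 2m: exp(-x/2) * sum_{k<m} (x/2)^k / k!
    tail = math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))
    return stat, tail

# Hypothetical log-likelihoods: independent (null) vs dependent (alternative) model
stat, p = lrt_p_value(lnl_null=-50.0, lnl_alt=-44.0, df=4)
print(f"LRT = {stat:.1f}, P = {p:.4f}")
```

A small P suggests that letting the two characters evolve in a correlated way fits the data better than forcing independence.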
Inference of ancestral character states
• Character mapping on tree
• Discrete or continuous characters
• With or without model
• Discrete characters: Markov chain (MC)
• Quantitative characters: Brownian motion (BM), OU, ...
Discrete characters
• Parsimony (MacClade, Mesquite)
• Maximum likelihood (generates confidence intervals; BayesTraits, Mesquite)
• Bayesian inference (BayesTraits, Mesquite, MrBayes, SIMMAP)
Example
• Determination of character polarity: ancestral or derived
• Environmental attribute
• Example: pollination in flowers (bees → birds)
• Adaptation test
• Character evolution
[Figure: bees → birds transition mapped on the tree; supports the hypothesis of adaptation]
[Figure: bees → birds transition mapped on the tree; adaptation unsupported]
• Parsimony
• Simple
• No explicit model
• No measure of uncertainty
• Does not consider branch lengths
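One common parsimony procedure for an unordered discrete character is Fitch's algorithm, sketched below for the bottom-up pass only. The slides do not say which parsimony variant the programs use, and the five-species tree and the pollinator states are hypothetical.

```python
def fitch(node, states):
    """Bottom-up Fitch pass: return (possible states at node, number of changes).

    A leaf is a species name; an internal node is a pair (left, right).
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    s_l, c_l = fitch(left, states)
    s_r, c_r = fitch(right, states)
    common = s_l & s_r
    if common:
        # Children agree on at least one state: no extra change needed
        return common, c_l + c_r
    # Children disagree: keep both options and count one change
    return s_l | s_r, c_l + c_r + 1

# Hypothetical tree and pollinator states
tree = ((("sp1", "sp2"), "sp3"), ("sp4", "sp5"))
pollinator = {"sp1": "birds", "sp2": "birds", "sp3": "bees",
              "sp4": "bees", "sp5": "bees"}

root_states, n_changes = fitch(tree, pollinator)
print(root_states, n_changes)
```

The result is a set of equally parsimonious root states and a change count, with no probability attached and no use of branch lengths, which is the limitation listed above.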
• Example: feeding in echinoderm larvae
• Maximum likelihood
• Explicit model (e.g. MC or BM)
• Estimation of transition rates from one state to another: π
• Assesses uncertainty in reconstruction: probability of the observed data given the model and tree
• Importance of the model (fixed values of parameters, estimated or not) and of tree characteristics (topology and branch lengths)
• Same example
• Bayesian inference
• Needs prior probabilities on transition rates π between states (hypothesis + MCMC)
• Can consider uncertainty on the tree
• Also for continuous characters
[Figure: reconstructions compared for MP, ML and BI]
• Simulation of character evolution
Continuous characters
• Linear parsimony: minimises the quantity of change (absolute or squared (= BM)) between descendants and ancestors (MacClade, Mesquite)
• Generalized linear model: includes a model
• More general (includes previous approaches)
Dating
• How to date speciation events (nodes)?
• Test of evolutionary hypotheses
• Strict molecular clock
[Figure: tree with no clock compared with a clock tree]
Test via LRT
• Need for a calibration: known date (fossil, geological event, ...). Ideally several dates
• If available, use the mutation rate
• Branch lengths are needed to infer the other nodes, which implies linking divergence to time
• Fossils: difficult to place on branches, give only minimum ages
• Differences fossils/molecules: “stem” (fossils)/“crown” (molecules)
• If molecular clock (= constant evolutionary rate for a gene or a protein) and (at least) one calibration point, it is easy
• But... in general substitution rate varies with time or lineage: no global molecular clock
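Under a strict clock with one calibration point, dating reduces to two divisions, as sketched below. The factor 2 assumes that the distance between two species accumulates along both lineages since their split; the distances, the calibration age and the function names are hypothetical.

```python
def clock_rate(distance, calibration_age):
    """Substitution rate per lineage from one calibrated pair of species."""
    return distance / (2.0 * calibration_age)

def divergence_time(distance, rate):
    """Age of a split, assuming the same rate applies everywhere (strict clock)."""
    return distance / (2.0 * rate)

# Hypothetical calibration: 0.10 substitutions/site between two species
# known to have diverged 5 million years ago
rate = clock_rate(distance=0.10, calibration_age=5.0)

# Date another pair separated by 0.04 substitutions/site
age = divergence_time(distance=0.04, rate=rate)
print(rate, age)
```

Any variation of the rate among lineages or through time breaks the second step, which is why relaxed-clock methods are needed in most real cases.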
• Problems with a global clock
[Figure: dating error; evolutionary rate of gene Y different from X; gene Z: highly variable rate]
Solutions
• Remove sites/genes/lineages responsible for rate variation
• OK if variation is rare
• Needs a prior relative rate test
• e.g. if d_AO = d_BO then d = d_AO - d_BO = 0
• Idem with d_AC = d_BC
• d follows a normal distribution under some models of DNA substitution: standard statistical tables
• Low power
[Figure: tree of taxa A, B, C and O]
• Methods incorporating rate variations (relaxed clocks)
• Local molecular clocks: different according to tree regions (few different rates)
• Relaxed clocks (many different rates)
[Figure: trees under a global clock, a local clock and no clock]
Local clocks
• Method based on quartets (Qdate)
• Subgroups of 4 taxa split into 2 monophyletic groups with known divergence dates
• Based on a model of character evolution (ML)
• Takes into account variation in substitution rate (maximum 2 rates)
• Generates confidence intervals for dates (precision depends on sequence length)
• Methods based on full trees (PAML, r8s, BEAST)
• Rates assigned to different tree parts, with tests, or external knowledge
• Good if many fossils with accurate dates
• Different models for estimation
• If model: very important here
Relaxed clocks
• Rate smoothing
• Many different rates
• Need for a relationship to link rates
• Gradual rate change: each rate partly depends on the preceding one (autocorrelation)
• Different mathematical relationships (models) can link rates: penalty for large changes (penalty function)
• Non-parametric rate smoothing (r8s)
• Considers fixed branch lengths (then uses the observed number of substitutions) to infer rates
• Penalized likelihood (r8s)
• Uses a model that may change the number of substitutions for a given branch length
• Bayesian approach (BEAST, Multidivtime)
• Uses prior parameter estimation and MCMC to yield posterior and confidence intervals
• For all approaches, hypothesis of gradual change
Cophylogeny
• Study of the common history of two associated phylogenetic trees
• Host-parasite associations
[Figure: parasite tree and host tree]
[Figure: first page of the review “Trees within trees: phylogeny and historical associations”, Roderic D.M. Page and Michael A. Charleston (1998); its text is reproduced below]

The association between two or more lineages over evolutionary time is a recurrent theme spanning several different fields within biology, from molecular evolution to coevolution and biogeography. In each ‘historical association’, one lineage is associated with another, and can be thought of as tracking the other over evolutionary time with a greater or lesser degree of fidelity. Examples include genes tracking organisms, parasites tracking hosts and organisms tracking geological and geographical changes. Parallels among these problems raise the tantalizing prospect that each is a special case of a more general problem, and that a single analytical tool can be applied to all three kinds of association.

Roderic Page is at the Division of Environmental and Evolutionary Biology, Institute of Biomedical and Life Sciences, University of Glasgow, Glasgow, UK G12 8QQ ([email protected]); Michael Charleston is at the Dept of Zoology, University of Oxford, Oxford, UK OX1 3PS ([email protected]).

Evolutionary associations among genes, organisms and geographical areas have traditionally been studied by biologists from different disciplines, with little interaction between them. Consequently, recognition of the fundamental similarity of the problem faced by molecular systematists, parasitologists and biogeographers has been slow in coming [1–3]. This is particularly true of the parallels between the relationship between gene and organismal phylogeny, and the macroevolutionary associations studied by parasitologists and biogeographers. The analogy between vicariance biogeography (organisms tracking areas) and host–parasite cospeciation (parasites tracking hosts) has been recognized for some time [4]; for a parasite the host can be thought of as an ‘area’, hence host speciation is equivalent to a vicariance event (Fig. 1). The suggestion that these macroevolutionary patterns are analogous to the relationship between gene and species trees is a more recent development [1,3].

Types of historical association
Historical associations can be divided into three basic categories (Table 1): genes and organisms, organisms and organisms, and organisms and areas. At the molecular level, each gene has a phylogenetic history that is intimately connected with, but not necessarily identical to, the history of the organisms in which the gene resides [5,6]. Processes such as gene duplication, lineage sorting and horizontal transfer can produce complex gene trees that differ from organismal trees [3,7,8]. Associations between organisms, such as between hosts and their parasites [9] (including viruses [10]), endosymbionts and their hosts [11], and insects and plants [12,13], can have a long evolutionary history, which is reflected in similarities between their evolutionary trees [14]. At a larger scale still, organisms can track geological history such that sequences of geological events (e.g. continental break-up) are directly reflected in the phylogenies of those organisms [15]. In each association, one entity (the ‘associate’) tracks the other (the ‘host’) with a degree of fidelity that depends on the relative frequency of four categories of events: codivergence, duplication, horizontal transfer and sorting (Box 1). Joint cladogenesis of host and associate is codivergence. If the associates undergo cladogenesis independently and both descendants remain associated with the host then we have a duplication of associate lineages. Cladogenesis accompanied by one descendant colonizing a new host is horizontal transfer. ‘Sorting event’ is a generic term for the apparent absence of an associate from a host. The analogies among the categories of events for the different kinds of association (Table 1) need not imply close analogy among the processes; rather, the analogy is among the patterns these processes produce. For example, although the processes of gene duplication and allele divergence are different, the resulting pattern is the same – more than one gene lineage in the same organismal lineage.

[Figure: panels (a) Organism/Gene, (b) Host/Parasite, (c) Area/Organism, each against Time; Page & Charleston 1998]
Fig. 1. Historical associations among genes, organisms and areas: (a) a gene tree embedded in a species tree; (b) a parasite cospeciating with its host; and (c) a clade of organisms diverging in concert with geological events (vicariance). In each case one entity (the ‘associate’) can be thought of as tracking the other (the ‘host’).

Reconstructing the history of an association
Despite the relative lack of interaction among these different disciplines, strikingly similar concepts have arisen independently from them. Parasitologists [16,17] recognized the problem of multiple parasite lineages decades before Fitch’s [8] analogous distinction between paralogous and orthologous genes [18] (Box 1). Molecular systematists [19] and cladistic biogeographers [20] independently developed similar methods for interpreting the history of gene trees and biogeographic patterns, respectively. One implication of the parallels among the different kinds of association is that they can be studied using the same methods. Reconciled trees (Box 2) originated in molecular systematics [19] but have been applied to both host–parasite coevolution [21] and biogeography [22]. As well as visualizing the relationship between host and associate, reconciled trees provide a quantitative measure of the extent to which the
• Cospeciation; coevolution; cophylogeny; parallel cladogenesis; cocladogenesis; cophylogenetic descent; cophylogenetic maps...
• Here: macroevolutionary context
• How to reconstruct the common evolutionary history of two clades, for example hosts and parasites
• Some key dates
• 1981: Brooks (see Klassen 1992)
• 1994: Page; Hafner et al.
Theoretical prerequisites
• Well-known and fully resolved trees
• Exhaustive sampling
• Monophyletic groups
• Not so easy
Four coevolutionary events
• Cospeciation
• Transfer
• Duplication
• Sorting
Tests
• Global congruence
• Individual events
• Host-parasite links
• Importance of the null hypothesis
• Cospeciation (e.g. Johnson et al. 2001)
• Random associations (Legendre et al. 2002)
Methods
• Brooks parsimony analysis (BPA; Brooks 1981)
• Reconciled trees (Component, TreeMap 1 and 2; Page 1993, 1994; Charleston 1998)
• Generalised parsimony (TreeFitter; Ronquist 1995)
• Probabilistic methods: ML, Bayes (Huelsenbeck et al. 1997, 2000)
• Homogeneity test (Johnson et al. 2001)
• Congruence test (ParaFit; Legendre et al. 2002)
• Most methods work well if
• Widespread cospeciation
• ≈ 1 host / 1 parasite
• Small phylogenies
• Else: computationally intensive, optimal solution not guaranteed
• Different methods: different results
Reconciled trees: TreeMap (Page 1994)
• Goal: fit the parasite tree onto the host tree by adequately mixing the 4 types of events
• Criterion: maximise cospeciations (TM 1)
• Test against a random distribution
• Can take branch lengths into account
• TreeMap 1: problems
• Transfers added a posteriori
• Limited optimality criterion (can generate many optimal solutions)
• Difficulty with widespread parasites
• TreeMap 2 (Charleston & Page, 2002)
• Introduces event costs
• Optimisation of global cost
• Many modifiable parameters
• Tests:
• Global cost
• Cherry Picking Test: influence of individual associations
• Assumes that the parasite’s ancestor is on the host’s ancestor
• Each host must have at least one parasite(!!)
• Needs fully resolved trees
• TreeMap can generate files for BPA and TreeFitter
Example
[Figure: phylogenies (COI) of pocket gophers and chewing lice, with numbered nodes and host-parasite links; tip labels include talpoides, wardi, minor, bottae, thomomyus, bursarius, actuosi, hispidus, ewingi, chapini, cavator, panamensis, underwoodi, setzeri, cherriei, heterodus and costaricensis]
TreeMap 1
[Figure: six reconstructions, panels (a) to (f), of the chewing louse tree mapped onto the pocket gopher tree]
TreeMap 2
• 6 optimal solutions (out of 9)

Panel    Co   Sw (total distance)   Du   Lo   Total cost
1 of 9   8    4 (3.565)             10   0    14
2 of 9   8    4 (3.67)              10   0    14
3 of 9   12   3 (3.009)             6    1    10
4 of 9   12   3 (3.009)             6    1    10
6 of 9   12   2 (1.9345)            6    3    11
6 of 9   12   2 (1.9345)            6    3    11

[Figure: the six corresponding reconstructions of the chewing louse tree on the pocket gopher tree]
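How a global cost follows from event counts can be sketched as a weighted sum. The unit costs below (0 for Co, 1 for each of Sw, Du and Lo) are an assumption made here because they reproduce the totals reported for the six solutions; the slides do not state the costs that were actually used, and the reading of Co, Sw, Du and Lo as codivergence, switch, duplication and loss is likewise assumed.

```python
# Assumed unit costs per event type (not stated in the slides)
COSTS = {"Co": 0, "Sw": 1, "Du": 1, "Lo": 1}

# Event counts of the six solutions reported on the slide
solutions = [
    {"Co": 8, "Sw": 4, "Du": 10, "Lo": 0},
    {"Co": 8, "Sw": 4, "Du": 10, "Lo": 0},
    {"Co": 12, "Sw": 3, "Du": 6, "Lo": 1},
    {"Co": 12, "Sw": 3, "Du": 6, "Lo": 1},
    {"Co": 12, "Sw": 2, "Du": 6, "Lo": 3},
    {"Co": 12, "Sw": 2, "Du": 6, "Lo": 3},
]

def total_cost(events, costs=COSTS):
    """Global cost of a reconstruction: sum of count × unit cost over event types."""
    return sum(costs[name] * count for name, count in events.items())

totals = [total_cost(s) for s in solutions]
print(totals)
```

Changing the unit costs changes which reconstructions are preferred, which is the difficulty of defining costs that event-based methods share.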
Cospeciation: Test
• Test of the inferred number of cospeciations against a random distribution (randomised trees)
• Comparison with the observed value
• P < 0.05: the observed number of cospeciations is higher than in random iterations
[Figure: histogram of frequency (0 to 250) against number of cospeciations (1 to 12); the observed value is marked *]
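The comparison of an observed value with a randomised distribution can be sketched as a one-tailed randomisation p-value. The null values are invented, and counting the observed value among the randomisations (the +1 in numerator and denominator) is a common convention assumed here, not necessarily what TreeMap does.

```python
def randomisation_p_value(observed, null_values):
    """One-tailed P: share of values (randomised + observed) >= the observed one."""
    n_ge = sum(1 for v in null_values if v >= observed)
    return (n_ge + 1) / (len(null_values) + 1)

# Hypothetical numbers of cospeciations inferred on 19 randomised trees
null = [3, 4, 5, 4, 6, 5, 3, 4, 5, 7, 4, 5, 6, 4, 3, 5, 4, 6, 5]

p = randomisation_p_value(observed=9, null_values=null)
print(p)
```

With this convention the smallest attainable P is 1 / (number of randomisations + 1), so the number of randomisations sets the resolution of the test.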
TreeFitter (Ronquist 1995)
• Generalized parsimony
• Can assign costs to 4 types of events (like TreeMap 2)
• Problem: how to define costs
• Tests (randomisation)
• Global congruence: minimisation of global reconstruction cost
• Events: significant contribution to global congruence
• Ronquist says BPA and TreeMap 1 can be found in TreeFitter with some given costs
• Needs fully resolved trees
• No graphical output, and no possibility to individually identify significant events
• Still experimental
ParaFit (Legendre, Desdevises & Bazin, 2002)
• Assesses congruence between distance matrices (potentially) computed from phylogenies of hosts and parasites, via host-parasite associations
• Statistical tests (via permutations) of the congruence between two trees/matrices
• Global (H0: random associations)
• Contribution of each individual association to this congruence (structuring effect)
[Figure: host tree and parasite tree connected by host-parasite associations, with labels A, B and C]
• Matrix A: absence/presence of host-parasite associations (0/1 data)
• Matrix B: principal coordinates (columns) describing the parasite phylogenetic tree
• Matrix C: principal coordinates (rows) describing the host phylogenetic tree
• Matrix D: SSCP parameters to be estimated (host tree principal coordinates by parasite tree principal coordinates)
[Figure: host-parasite links between the pocket gopher tree and the chewing louse tree]
Pocket gophers: T. talpoides, T. bottae, Z. trichopus, P. bulleri, O. hispidus, O. underwoodi, O. cavator, O. cherriei, O. heterodus, C. merriami, C. castanops, G. bursarius majus., G. bursarius halli, G. breviceps, G. personatus
Chewing lice: T. barbarae, T. minor, G. nadleri, G. chapini, G. setzeri, G. panamensis, G. cherriei, G. costaricensis, G. thomomyus, G. perotensis, G. trichopi, G. actuosi, G. expansus, G. geomydis, G. oklahomensis, G. ewingi, G. texanus
• Drawbacks
• Does not consider events
• Does not propose scenarios
• Advantages
• Tests, and tested via simulations
• Adapted to complex problems
• Various numbers of hosts/parasite and parasites/host
• Uses matrices: no problem with polytomies, or multiple trees
Examples: gophers and lice
• Highly studied system
• Many cospeciations
• Transfer in host contact zones: cospeciation explained by the impossibility of colonisation, not by “extreme” adaptation to the host
Monogeneans - Hosts
• A priori: many cospeciations
• High specificity
• Direct cycle
Sparidae and Lamellodiscus spp.
[Figure: host tree and parasite tree; scale bar: 100 µm]
Hosts: S. cantharus, B. boops, S. salpa, L. mormyrus, D. sargus, D. cervinus, O. melanura, D. puntazzo, D. annularis, D. vulgaris, S. aurata, P. acarne, P. bogaraveo, D. dentex, P. pagrus, P. erythrinus
Parasites: L. furcosus, L. coronatus, L. elegans, F. echeneis, L. mormyri, L. verberis, L. drummondi, L. virgula, L. impervius, L. parisi, L. mirandus, L. gracilis, L. bidens, L. hilii, L. ergensi, L. fraternus, L. knoeppfleri, L. ignoratus, L. baeri, L. erythrini
• TreeMap
• Number of cospeciations inferred via TreeMap: P = 0.317
[Figure: TreeMap reconstruction of the parasite tree on the host tree, and histogram of frequencies (0 to 350) against number of cospeciations (1 to 8); the observed value is marked *]
• ParaFit
• P global = 0.243
• 2 significant links (P = 0.028 and P = 0.018)
[Figure: host tree and parasite tree with host-parasite links]
• No cospeciation
• Sympatric hosts (and parasites)
• Other systems with geographically separated hosts (e.g. Polystomes/Amphibians): cospeciation
• Here, transfers are influenced not by phylogeny but by the hosts’ characteristics: association with other species, size