Microbial Diversity and Populations Sylvain BRISSE Lab. Genotyping of Pathogens and Public Health
http://genopole.pasteur.fr/PF8
Questions
• What is the extent of global microbial diversity?
• Do microbial species really exist?
• How much genetic diversity is there within species?
• How can we describe clonal diversity within species?
• How did pathogenic strains evolve?
• Examples of genotyping methods
The two ‘sciences’ of microbial diversity
Species and above: TAXONOMY (Systematics)
• Classification • Phylogeny • Nomenclature • Identification
Evolutionary history; common language
Below species level: POPULATION GENETICS
• Genetic diversity • Phylogenetic groups • Clones, clonal families • Recombination • Natural selection
Microevolution; typing, molecular epidemiology
Phylum-level diversity
• ~ 100 bacterial phyla (only 12 in 1987)
• Many were discovered by direct 16S rRNA sequencing of environmental DNA
[Figure: phyla with cultivated strains vs. phyla with no cultivated strain (Lopez-Garcia & Moreira, 2007)]
Classification of pathogens
‘malaria’
Some examples:
γ-Proteobacteria: Escherichia coli, Salmonella, Yersinia pestis, Vibrio cholerae
Actinobacteria: Mycobacterium tuberculosis
Firmicutes: Staphylococcus aureus, Streptococcus pneumoniae
No known pathogenic Archaea member
Taxonomy (Systematics)
• Classification: circumscribing taxa
• Nomenclature: giving names to taxa
• Identification: assigning strains to taxa
Bacterial nomenclature
• Taxonomic ranks: Domain [Bacteria - Archaea - Eukarya], Phylum (Division), Class, Order, Family, Genus, species, subspecies
• Binomial system (Linnaeus, 1753): Genus + species, e.g. Pasteurella (genus) multocida (species), agent of fowl cholera
‘Biological esperanto’
Updated taxa lists - valid taxonomy
• Jean Euzéby’s web site (Toulouse, France): http://www.bacterio.cict.fr
• 16S rRNA microbial classification: http://www.taxonomicoutline.org/
• Note that NCBI (GenBank) taxonomy has no official value and contains many non-validated names
How many species are there?
• ~ 8,000 described species
• > 1,000,000,000 existing species?
• Many species are very heterogeneous (e.g. Escherichia coli)
• Strains within species can differ by genomic content and properties (pathogenicity, ecology, epidemiology)
Pragmatic definitions of bacterial species…
• Phenotype
• DNA:DNA reassociation > 70%, ΔTm < 5°C
• Phylogeny (16S rRNA)
• ‘Polyphasic’ approach
• Ecotype
• ‘Sequence cluster’ (MLSA)
• Genomic distance (ANI)
• …
Correlation 16S rRNA - DNA:DNA
[Figure: 16S rRNA similarity compared with DNA:DNA reassociation (Rossello-Mora and Amann, 2001)]
Pragmatic (but not absolute!) rules for defining species
• DNA:DNA hybridization: > 70% reassociation and ΔTm < 5 ºC (Wayne et al., 1987)
• 16S rRNA: < 97% similarity: distinct species; > 97% similarity: same or distinct species (one cannot conclude)
• Phenotype matters! (clinical, biochemical properties)
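The thresholds above can be read as a small decision rule. The sketch below is only an illustration of that reading: the function name `compare_strains` and its parameters are hypothetical, a 16S rRNA similarity of exactly 97% is treated as inconclusive (an assumption, since the slide does not cover that case), and phenotype is not modelled at all.

```python
from typing import Optional


def compare_strains(rrna_16s_similarity: Optional[float] = None,
                    dna_dna_reassociation: Optional[float] = None,
                    delta_tm: Optional[float] = None) -> str:
    """Apply the pragmatic thresholds to one pair of strains.

    Similarity and reassociation are percentages; delta_tm is in degrees Celsius.
    """
    # 16S rRNA similarity below 97% is taken as enough to separate two species.
    if rrna_16s_similarity is not None and rrna_16s_similarity < 97.0:
        return "distinct species"
    # When DNA:DNA hybridization data exist, they decide.
    if dna_dna_reassociation is not None:
        same = dna_dna_reassociation > 70.0
        if delta_tm is not None:
            same = same and delta_tm < 5.0
        return "same species" if same else "distinct species"
    # 16S rRNA similarity of 97% or more does not allow a conclusion on its own.
    return "inconclusive"


print(compare_strains(rrna_16s_similarity=95.0))
print(compare_strains(rrna_16s_similarity=99.0))
print(compare_strains(99.0, dna_dna_reassociation=85.0, delta_tm=2.0))
```

The rule is deliberately asymmetric: 16S rRNA can only separate species, never merge them.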
Genome sequencing and species borders: Average Nucleotide Identity (ANI) at orthologous genes
[Figure: ANI compared with DNA-DNA reassociation; 70% reassociation corresponds to about 5% divergence in ANI (Konstantinidis & Tiedje)]
• Strains within a species can differ by up to 5% on average • Remark: distance man - chimpanzee = 1.2%
Phylogeny and recombination: do bacterial species exist?
[Figure: allelic replacement; a gene (ATG to STOP) from the donor replaces the homologous gene in the recipient]
• Single-gene phylogeny: homologous recombination results in phylogenetic misplacement of recipients
[Figure: true phylogeny of A, B, C and D compared with the gene X-based phylogeny after horizontal gene transfer of gene X]
Multilocus Sequence Analysis (MLSA)
[Figure: chromosomal DNA, PCR amplification & sequencing of 5-7 housekeeping genes (Gene 1 to Gene 6 shown)]
• Comparison of individual gene phylogenies: detecting recombination
• Phylogeny based on concatenated sequences: more reliable
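The two MLSA steps, comparing genes one by one and then analysing the concatenation, can be sketched with simple pairwise distances. This is a toy illustration only: the sequences are invented, the helper names `p_distance` and `concatenate` are hypothetical, and a real analysis would build trees from proper alignments rather than raw proportions of differing sites.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def concatenate(gene_alignments):
    """Join the per-gene sequences of each strain, keeping the gene order fixed."""
    strains = sorted(gene_alignments[0])
    return {s: "".join(gene[s] for gene in gene_alignments) for s in strains}


# Invented alignments for three strains. In gene 3, strain B carries a
# variant that makes A look identical to C for that gene alone.
genes = [
    {"A": "ATGCCGTA", "B": "ATGCCGTA", "C": "ATGTCGTT"},
    {"A": "GGCATTAC", "B": "GGCATTAC", "C": "GGTATTAC"},
    {"A": "TTGACCGA", "B": "TAGACCGA", "C": "TTGACCGA"},
]

pairs = list(combinations(["A", "B", "C"], 2))
per_gene = [{(s, t): p_distance(g[s], g[t]) for s, t in pairs} for g in genes]
concatenated = concatenate(genes)
combined = {(s, t): p_distance(concatenated[s], concatenated[t]) for s, t in pairs}

for i, distances in enumerate(per_gene, start=1):
    print("gene", i, distances)
print("concatenated", combined)
```

Gene 3 alone groups A with C, while the concatenation keeps A closest to B, which is the buffering effect the slide refers to.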
Example of the pneumococcus (www.emlsa.net; Hanage, Fraser & Spratt)
[Figure: phylogenies from individual genes and from concatenated genes for S. pneumoniae, S. pseudopneumoniae, S. mitis and S. oralis (BG Spratt & colleagues)]
• Multiple genes buffer against recombination
• Species do exist!
Microbial species: conclusions
• Clearly demarcated sequence clusters are found in most/all bacterial groups, even those where recombination is frequent between closely-related species
• These clusters often, but not always, correspond to previously-defined species (e.g. based on phenotype)
• Microbial diversity is not a genetic continuum
• ‘Microbial species exist’
Do phylogenetic discontinuities really exist or are they due to sampling bias? Still an open question…
Cultivated non-cultivated (?)
Bacterial species are more or less diverse
[Figure: levels of diversity from species to phylogenetic group to clone, spanning ‘taxonomy’ and ‘population genetics’, with axes of recombination and diversity; species shown: B. anthracis, E. coli, S. aureus, Y. pestis, M. tuberculosis]
Monomorphic species: taxonomy is often in disagreement with phylogeny
Example: Shigella and E. coli belong to the same phylogenetic group (species)
Evolution of Shigella from E. coli strains by acquisition of virulence factors (Ochman et al., 2000; Lan and Reeves, 2001)
Monomorphic taxonomic ‘species’
Table 1. Examples of 'taxonomic' species that are in fact highly pathogenic clones derived from 'genomic' species
Species | Infection caused | Ancestral species | Reference
"Salmonella typhi" (a) | Typhoid fever | Salmonella enterica | Selander et al. 1990 Infect. Immun. 58: 2262
Yersinia pestis | Plague | Yersinia pseudotuberculosis | Achtman et al. 1999 PNAS 96: 14043
Shigella flexneri, S. boydii, S. dysenteriae, S. sonnei | Shigellosis | Escherichia coli | Pupo et al. 2000 PNAS 97: 10567
Bacillus anthracis | Anthrax | Bacillus cereus | Priest et al. 2004 J. Bact. 186: 7959
Burkholderia mallei | Glanders | Burkholderia pseudomallei | Godoy et al. 2003 J. Clin. Microbiol. 41: 2068
Bordetella pertussis | Whooping cough | Bordetella bronchiseptica | Diavatopoulos et al. 2005 PLoS Pathog 1(4): e45
Bordetella parapertussis | Whooping cough | Bordetella bronchiseptica | Diavatopoulos et al. 2005 PLoS Pathog 1(4): e45
Mycobacterium ulcerans | Buruli ulcer | Mycobacterium marinum | Stinear et al. 2007 Genome Res. 17: 192
Mycobacterium tuberculosis | Tuberculosis | Mycobacterium prototuberculosis | Gutierrez et al. 2005 PLoS Pathog 1(1): e5
(a) Salmonella serotypes are no longer considered species in the nomenclature.
‘Nomen periculosum’ taxonomic rule: do not change a name if the change carries a risk for health or has medical consequences
Typical bacterial species are highly heterogeneous
Comparison of Escherichia coli with primates
Property | E. coli | Homo sapiens | Primates
Mol% G + C | 48-52 | 42 | 42
16S-18S rRNA variability | > 15 bases | ? | < 16 bases (#)
DNA:DNA reassociation | > 70% | 98.6% ($) | > 70% (&)
Age | 125-160 Myr | < 1 Myr | < 65 Myr
(#) Mouse 18S rRNA differs from humans by 16 bases
($) Comparison between Homo sapiens and chimpanzee
(&) Homo sapiens - lemurs
Source: Staley JT, ASM News 65:681
• 70% DNA-DNA reassociation corresponds to about 5% nucleotide divergence
• Average divergence among humans < 0.1%
• Distance human - chimpanzee: 1.2%
• Homo sapiens = one ‘strain’ within Primates!?
Describing and understanding within-species diversity
• Radial phylogeny (MEGA)
• Network representation (Minimum Spanning Tree, eBURST)
MLST: Multilocus Sequence Typing (Maiden, Spratt, Feil, Achtman et al.)
• Allele: sequence of one gene (fumC, 500 bp)
[Figure: five fumC sequences numbered as alleles 1, 1, 2, 3, 3]
• Sequence type (ST): combination of allele numbers across the loci

Locus:  abcZ fumC icd gyrB recA mdh ureI
Allele: 1    1    1   7    22   39  7

ST | Alleles
1 | 1 1 1 7 22 39 7
2 | 1 2 1 7 22 39 7
3 | 1 3 1 7 22 39 7
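The numbering logic behind alleles and sequence types can be sketched as two lookup tables. This is a minimal illustration, not how the public MLST databases work internally: the class `MLSTScheme`, the three-locus scheme and the short sequences are all hypothetical, and real allele and ST numbers are assigned centrally by database curators.

```python
class MLSTScheme:
    """Toy MLST scheme: numbers alleles per locus and profiles as STs."""

    def __init__(self, loci):
        self.loci = list(loci)
        # One table per locus: sequence -> allele number.
        self.alleles = {locus: {} for locus in self.loci}
        # Allelic profile (tuple of allele numbers) -> sequence type.
        self.profiles = {}

    def allele_number(self, locus, sequence):
        """Return the allele number of a sequence, creating a new one if unseen."""
        table = self.alleles[locus]
        if sequence not in table:
            table[sequence] = len(table) + 1
        return table[sequence]

    def type_strain(self, sequences):
        """Return (ST, profile) for a strain given its sequence at each locus."""
        profile = tuple(self.allele_number(l, sequences[l]) for l in self.loci)
        if profile not in self.profiles:
            self.profiles[profile] = len(self.profiles) + 1
        return self.profiles[profile], profile


scheme = MLSTScheme(["abcZ", "fumC", "icd"])
# Invented short sequences; only fumC varies among the strains.
strains = {
    "s1": {"abcZ": "ACGT", "fumC": "GGTA", "icd": "TTAC"},
    "s2": {"abcZ": "ACGT", "fumC": "GGTA", "icd": "TTAC"},
    "s3": {"abcZ": "ACGT", "fumC": "GGTC", "icd": "TTAC"},
    "s4": {"abcZ": "ACGT", "fumC": "GGTT", "icd": "TTAC"},
}
results = {name: scheme.type_strain(seqs) for name, seqs in strains.items()}
for name, (st, profile) in results.items():
    print(name, "ST", st, profile)
```

Two strains with identical sequences at every locus receive the same ST, and a single changed locus yields a new allele number and therefore a new ST.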
eBURST approach
• Find closely-related genotypes and group them into clonal complexes (clonal families)
• Identify founder genotype of family
Advantages:
• Identifies founder genotype
• Shows genotype frequency
• Links STs according to distance
• Map other characteristics (e.g. serotype) onto graph as colors
Clonal Complex 1
ST1 : 1 - 1 - 1 - 1 - 1 - 1 - 1
ST2 : 1 - 1 - 1 - 5 - 1 - 1 - 1
ST3 : 1 - 1 - 1 - 1 - 1 - 1 - 10
ST4 : 1 - 8 - 1 - 3 - 1 - 1 - 10
Clonal Complex 15
ST15 : 7 - 4 - 3 - 5 - 45 - 4 - 2
ST16 : 7 - 4 - 3 - 5 - 8 - 4 - 2
ST48 : 7 - 4 - 3 - 17 - 45 - 4 - 2
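The grouping idea can be sketched as follows, under simplifying assumptions: STs are linked only when they differ at a single locus (single-locus variants), a clonal complex is a connected set of such links, and the founder is the ST with the most single-locus variants. The function names are hypothetical and this is not the eBURST program itself. ST4 from the slide is left out of the toy data because it differs from ST3 at two loci and would need a looser linking rule.

```python
from itertools import combinations


def n_differences(p, q):
    """Number of loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(p, q))


def clonal_complexes(profiles):
    """Group STs linked by single-locus variants and pick a founder per group."""
    slv = {st: set() for st in profiles}
    for a, b in combinations(profiles, 2):
        if n_differences(profiles[a], profiles[b]) == 1:
            slv[a].add(b)
            slv[b].add(a)
    complexes, seen = [], set()
    for st in profiles:
        if st in seen:
            continue
        # Walk the single-locus-variant links to collect one connected group.
        stack, members = [st], set()
        while stack:
            current = stack.pop()
            if current in members:
                continue
            members.add(current)
            stack.extend(slv[current] - members)
        seen |= members
        # Founder: the member with the largest number of single-locus variants.
        founder = max(sorted(members), key=lambda s: len(slv[s]))
        complexes.append({"founder": founder, "members": sorted(members)})
    return complexes


profiles = {
    "ST1": (1, 1, 1, 1, 1, 1, 1),
    "ST2": (1, 1, 1, 5, 1, 1, 1),
    "ST3": (1, 1, 1, 1, 1, 1, 10),
    "ST15": (7, 4, 3, 5, 45, 4, 2),
    "ST16": (7, 4, 3, 5, 8, 4, 2),
    "ST48": (7, 4, 3, 17, 45, 4, 2),
}
result = clonal_complexes(profiles)
for cc in result:
    print(cc)
```

With these profiles, ST1 and ST15 each have two single-locus variants and come out as the founders of their groups.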
Microbial microevolution: clonal expansion - diversification
[Figure: over time, a founder genotype gives rise to an expanding clone]
Clones (clonal complexes) share a common ancestor
Clonal expansion - diversification: a multilocus view
eBURST analysis
• eburst.mlst.net
• http://goeburst.phyloviz.net/
[Figure: eBURST diagram of S. pneumoniae MLST data, with the founder genotype indicated]
Minimum spanning tree (e.g. software BioNumerics)
• Proposes links between clonal families
• Links may be unreliable
Clonal diversification & recombination
[Figure: variants derived from a founder genotype by single mutation or by homologous recombination (horizontal transfer)]
Phylogeny based on allelic profiles: more reliable than sequence-based phylogeny in case of recombination
Allele sequences at one locus:
1 ATCGATGCCAGGCGTACAGGCGTAGGGTTTACGGGTTTAC
2 ............................A...........
3 ...T.......T.......C...............A....
Allelic profiles:
B 1111111
A 1121111 (mutation: allele 1 to allele 2)
C 1131111 (HGT: allele 1 to allele 3)
[Figure: trees of A, B and C built from allelic profiles and from nucleotides]
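The contrast can be checked numerically with the three alleles shown on the slide. In the sketch below the helper names are hypothetical, and the nucleotide distance is computed on the variable locus only, on the assumption that the six other loci are identical in A, B and C.

```python
REFERENCE = "ATCGATGCCAGGCGTACAGGCGTAGGGTTTACGGGTTTAC"


def expand(dotted, reference):
    """Replace each '.' by the reference base at the same position."""
    return "".join(r if d == "." else d for d, r in zip(dotted, reference))


def hamming(a, b):
    """Number of positions at which two equal-length sequences or profiles differ."""
    return sum(x != y for x, y in zip(a, b))


allele_seq = {
    1: REFERENCE,
    2: expand("............................A...........", REFERENCE),
    3: expand("...T.......T.......C...............A....", REFERENCE),
}
profiles = {
    "B": (1, 1, 1, 1, 1, 1, 1),  # ancestral profile
    "A": (1, 1, 2, 1, 1, 1, 1),  # point mutation at the third locus
    "C": (1, 1, 3, 1, 1, 1, 1),  # recombinational import at the third locus
}
VARIABLE_LOCUS = 2  # zero-based index of the locus that changed


def allelic_distance(s, t):
    return hamming(profiles[s], profiles[t])


def nucleotide_distance(s, t):
    return hamming(allele_seq[profiles[s][VARIABLE_LOCUS]],
                   allele_seq[profiles[t][VARIABLE_LOCUS]])


for s, t in [("B", "A"), ("B", "C"), ("A", "C")]:
    print(s, t, "allelic:", allelic_distance(s, t),
          "nucleotide:", nucleotide_distance(s, t))
```

Each descendant is one genetic event away from B, and the allelic distance reflects that, whereas the nucleotide distance makes the recombinant C look four times more distant than the mutant A.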
Escherichia coli: diversity of pathogenic potential
1. Commensal strains
2. Animal pathogenic strains
3. Intestinal human pathogens
   EPEC: Enteropathogenic (attaching-effacing)
   ETEC: Enterotoxigenic
   EIEC: Enteroinvasive (similar to Shigella)
   EAggEC: Enteroaggregative
   DAEC: Diffusely adherent
   STEC/VTEC: Shiga toxin-producing / verocytotoxin-producing; includes EHEC: Enterohemorrhagic
4. Extraintestinal human pathogens
   ExPEC: Extraintestinal pathogenic
   UPEC: Uropathogenic
   NMEC: Neonatal meningitis
Pathotypes = Clonal families?
[Figure: MLST of 700 Escherichia coli strains labelled by pathotype: commensal, EPEC, ETEC, EHEC, APEC, EIEC, S. flexneri, S. dysenteriae, S. sonnei, S. boydii]
Pathotype ≠ pathogenic clone
Evolution from commensal to pathogen
Acquisition of genomic islands in several phylogenetic lines (clones)
(Ochman et al., 2000; Reid et al., 2000; Medini et al., 2008)
Duality of bacterial genomes
Core genome | Flexible genome
Essential genes | Accessory genes
Neutral evolution | Adaptation, virulence
Stability | Rapid adaptation
Vertical inheritance | Horizontal transfer
Sequence variation | Gene content
MLST, SNPs | DNA arrays
Phylogeny, classification; definition of species, clones, genotypes | Biological understanding: phenotype, ecology, virulence, …
Gene content diversity within E. coli
3 strains completely sequenced: UPEC CFT073, O157:H7 EDL933, commensal K12
[Figure: comparison of gene content among the three strains; labels: CFT073: 21% specific genes; 39%]
Pathogenic clones have acquired genes by lateral gene transfer (Brzuszkiewicz et al., 2006)
Pan-genome: the set of all genes found in at least one strain of a species
E. coli pan-genome
Touchon et al., PLoS Genetics 2009
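In set terms, the pan-genome is the union of the strains' gene sets and the core genome is their intersection. The sketch below uses invented strain and gene names purely for illustration; `pan_and_core` and `strain_specific` are hypothetical helpers, and real analyses first have to cluster genes into orthologous families.

```python
def pan_and_core(gene_content):
    """Return (pan-genome, core genome) from a mapping strain -> set of genes."""
    gene_sets = [set(genes) for genes in gene_content.values()]
    pan = set().union(*gene_sets)        # genes found in at least one strain
    core = set.intersection(*gene_sets)  # genes found in every strain
    return pan, core


def strain_specific(gene_content, strain):
    """Genes present in one strain and in no other."""
    others = set().union(*(set(g) for s, g in gene_content.items() if s != strain))
    return set(gene_content[strain]) - others


# Invented gene content for three strains.
gene_content = {
    "strain1": {"g1", "g2", "g3"},
    "strain2": {"g1", "g2", "g3", "g4", "g5"},
    "strain3": {"g1", "g2", "g3", "g6", "g7"},
}
pan, core = pan_and_core(gene_content)
print("pan-genome size:", len(pan))
print("core genome size:", len(core))
print("specific to strain2:", sorted(strain_specific(gene_content, "strain2")))
```

As more strains are added, the union can only grow and the intersection can only shrink, which is why pan-genome and core genome sizes are usually reported together with the number of strains compared.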
Klebsiella pneumoniae
Opportunistic nosocomial pathogen
• 5 - 8% of nosocomial infections in Europe / USA
• Urinary and respiratory infections, bacteremia
• ESBLs
Community pathogen
• Friedlander’s pneumonia; pyogenic liver abscess (PLA)
• Rhinoscleroma, ozena, granuloma inguinale
• Infections in animals: metritis, mastitis
Ubiquitous
• Gut, throat carriage
• Environment: soil, water, plants; N2 fixation
Population structure of K. pneumoniae
• Genetically compact species (π < 1%)
• Frequent recombination
• Two major clones of K1 & K2
• ‘K. rhinoscleromatis’, ‘K. ozaenae’: clones of K. pneumoniae
[Figure: Minimum Spanning Tree, MLST data, 297 K. pneumoniae strains; labelled groups: K. ozaenae, K7, K4, K2-CC65, K5, K1-CC23, K2-CC14, K1-CC82, K25, K3, K. rhinoscleromatis (Brisse et al., PLoS One, 2009)]
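Virulence depends on clone, not only K-type
[Figure: the same minimum spanning tree (K. ozaenae, K7, K4, K2-CC65, K5, K1-CC23, K2-CC14, K1-CC82, K25, K3, K. rhinoscleromatis) annotated with rmpA+ / rmpA- status and with associated infections: respiratory infections, PLA (Brisse et al., PLoS One, 2009)]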
Conclusions: clinical importance of clonal diversity within bacterial species
• A single species generally harbors clonal families with distinct properties (pathogenicity, ecology, epidemiology, …)
• Virulence factors are distributed unequally among strains
• Study of the core genome (MLST) is complementary to the study of function-associated genes
‘The unit of pathogenicity is the clone’ (see e.g. Musser 1996, Emerg. Inf. Dis. 2:1)
Microbes know no borders
• Transmission of resistant strains among hospitals
• Food-borne infections
• Travel-associated infections
[Figure: strain names differ between countries: UK-1, UK-2, UK-3; DK-A, DK-B, DK-C; SP-I, SP-II, SP-III; α, β, γ; ?]
Need for common language on strain names
MLST databases: standard strain designation
www.pasteur.fr/mlst
pubmlst.org
www.mlst.net
http://mlst.ucc.ie/
Drawbacks of MLST
• No direct information on virulence, resistance, antigens,… • Discriminatory power is not sufficient in monomorphic pathogens
How to subtype monomorphic pathogens?
- MLVA: Multilocus VNTR Analysis (VNTR: Variable Number of Tandem Repeats)
- SNPs (Single Nucleotide Polymorphisms)
- CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)
- PFGE (Pulsed Field Gel Electrophoresis)
Need to distinguish methods suited for local epidemiology (MLVA, CRISPR, PFGE) and global epidemiology/evolutionary studies (SNPs)
Example of Salmonella enterica serotype Typhi
[Figure: clonal complexes CC1 Typhimurium, CC2 Typhi, CC3 Newport]
Typhi (Salmonella enterica ser. Typhi): 105 Typhi strains; 199 genes (88,739 bp); 88 SNPs; dHPLC
Roumagnac et al., Science, 2006
Global history of Typhi epidemiology
• Has existed for ~ 50,000 years
• Several pandemic waves
• Importance of long-term asymptomatic carriage
Roumagnac et al., Science (2006) Roumagnac, Brisse, and Weill, 2007. Médecine-Sciences
New sequencing technologies: 454, SOLiD, Solexa
Characteristics of ‘Next-generation sequencing’
• 100 times quicker, 100 times cheaper (than Sanger)
• Amplification on solid support, novel chemistry
• Short reads: 35/55/72 bp (Solexa), 100/320 bp (454)
• Very high number of reads: 2-4 Gb (Solexa), 200-400 Mb (454) in a few days
• Several bacterial genomes in a few days!
Bibliography: Bacterial Population Genetics
Brisse S. L'espèce bactérienne: nécessaire, mais pas suffisante. Bull. Soc. Franç. Microbiol. 2008;23:164-174.
Feil EJ, Spratt BG. Recombination and the population structures of bacterial pathogens. Annu Rev Microbiol. 2001;55:561-90.
Feil EJ. Small change: keeping pace with microevolution. Nat Rev Microbiol. 2004;2(6):483-95.
Lan R, Reeves PR. When does a clone deserve a name? A perspective on bacterial species based on population genetics. Trends Microbiol. 2001;9:419-424.
Bibliography: Bacterial species
Cohan FM. What are bacterial species? Annu Rev Microbiol. 2002;56:457-87.
Gevers D, Cohan FM, Lawrence JG, Spratt BG, Coenye T, Feil EJ, Stackebrandt E, Van de Peer Y, Vandamme P, Thompson FL, Swings J. Opinion: Re-evaluating prokaryotic species. Nat Rev Microbiol. 2005;3(9):733-9.
Achtman M, Wagner M. Microbial diversity and the genetic nature of microbial species. Nat Rev Microbiol. 2008;6(6):431-40.
Fraser C, Alm EJ, Polz MF, Spratt BG, Hanage WP. The bacterial species challenge: making sense of genetic and ecological diversity. Science. 2009;323:741-6.