Marine Ecological and Evolutionary Genomics
Cophylogeny and lateral gene transfer
Yves Desdevises

Computer Lab

If needed, you can download software, files, and instructions from the following link: http://desdevises.free.fr/MEEG_YD

1. Cophylogeny

Software: CopyCat, Jane, TreeMap 3

- You need 2 trees, such as trees for hosts and their associated symbionts (in Nexus or Phylip format). We will use the host-symbiont system formed by the prasinophyte microalgae and their viruses (input files can be found on the course website).

Global fit method

- For a distance-based analysis (global fit method), start CopyCat.

- You need 3 files, with perfectly matching names:
  o An association file in text format, listing on each line a symbiont (parasite, virus) and its associated host (see the tanglegrams in the Tanglegrams folder in the Cophylogeny folder), separated by a tab. Generalist symbionts, if any, must be repeated on one line per host, as must host species with several symbionts:

    OtV5	RCC745
    OtV3	RCC745
    MicCV1	CCMP1545
    MicCV1	RCC299
    …

12/05/2017

  o The symbiont tree in Newick (Phylip) format: ((((OtV5:0.024,OtV3:0.017):0.004,(OlV158:0.009,BpV132:0.018)…
  o The host tree in the same format

- Go directly to the second tab, 2. Configuration and Execution of AxParafit
  o Make sure that the executables are selected at Step 1
  o Select the number of permutations at Step 2 (999 is a good number)
  o You can correct principal coordinates for negative values. Patristic distances are not expected to produce negative values, but you will be able to test various options in different analyses
  o Select the association file with the corresponding button at Step 3
  o At Step 4, choose Create Distance Matrix From Host Tree (you could also supply a ready-made distance matrix, or compute one from the sequences, but we will use our tree) and select the appropriate file via Select Input File...
  o Do the same for the parasite tree (Step 5)
  o Click Validate! in Step 6. You may get a message warning you that some hosts have no parasites. This is possible (but not the reverse!) and does not require removing these hosts from the analysis
  o Click Start Local Analysis! in Step 7 and wait for the analysis to run
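Before clicking Validate!, it can help to check by hand that the names in the three input files match. The following is a minimal Python sketch of such a check, not CopyCat's own validation: the function names, the extra host HostX, and the toy trees (topology and branch lengths) are made up for illustration, and the Newick handling assumes simple unquoted tip labels.

```python
import re

def newick_tips(newick):
    """Return the tip labels of a Newick string.

    Tip labels directly follow '(' or ','; internal node labels follow ')'
    and are therefore ignored. Assumes unquoted labels without spaces.
    """
    return set(re.findall(r"[(,]\s*([^(),:;\s]+)", newick))

def read_associations(text):
    """Parse 'symbiont<TAB>host' lines, skipping blank lines."""
    pairs = []
    for line in text.splitlines():
        if line.strip():
            symbiont, host = line.split("\t")
            pairs.append((symbiont.strip(), host.strip()))
    return pairs

def check_names(pairs, symbiont_tree, host_tree):
    """Compare names in the association list with the tips of both trees."""
    symbionts = {s for s, _ in pairs}
    hosts = {h for _, h in pairs}
    s_tips = newick_tips(symbiont_tree)
    h_tips = newick_tips(host_tree)
    return {
        "symbionts_missing_from_tree": symbionts - s_tips,
        "hosts_missing_from_tree": hosts - h_tips,
        "symbionts_without_host": s_tips - symbionts,   # should be empty
        "hosts_without_symbiont": h_tips - hosts,       # allowed
    }

# Toy data: names taken from the association example, trees are hypothetical.
assoc_text = "OtV5\tRCC745\nOtV3\tRCC745\nMicCV1\tCCMP1545\nMicCV1\tRCC299\n"
symbiont_tree = "((OtV5:0.02,OtV3:0.02):0.1,MicCV1:0.12);"
host_tree = "((RCC745:0.1,HostX:0.1):0.2,(CCMP1545:0.15,RCC299:0.15):0.15);"

report = check_names(read_associations(assoc_text), symbiont_tree, host_tree)
for problem, names in report.items():
    print(problem, sorted(names))
```

In this toy case only hosts_without_symbiont is non-empty (HostX), which corresponds to the harmless warning described at Step 6.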

- Go to the third tab, 3. Evaluation of AxParafit Results, to see the results
  o Select the parasite distance matrix produced by AxParafit (Step 1): the file you want is generally the default selection, but check that it is located in the newly created ID_[X] directory (generally in the DefaultWorkingDir) and is called [Name].parasites.dist
  o Do the same for the host distance file [Name].hosts.dist (Step 2)
  o Select the AxParafit output HostPara.out (Step 3). Be careful to rename successive output files to avoid overwriting them
  o Click Evaluate Parafit Results at Step 4, and interpret the results
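The distance matrices used here are built from the trees as patristic distances, i.e. the sum of branch lengths along the path between two tips. To make that concrete, here is a small Python sketch that computes them from a Newick string. It is an illustration only, not the code used by CopyCat or AxParafit; the parser assumes unquoted labels and a string ending in ";", and the tree with tips A, B, C is a made-up example.

```python
def parse_newick(s):
    """Parse a Newick string into nested (label, branch_length, children) tuples."""
    pos = 0

    def node():
        nonlocal pos
        children = []
        if s[pos] == "(":
            pos += 1                      # skip '('
            children.append(node())
            while s[pos] == ",":
                pos += 1                  # skip ','
                children.append(node())
            pos += 1                      # skip ')'
        start = pos
        while s[pos] not in ",():;":      # read the (possibly empty) label
            pos += 1
        label = s[start:pos]
        length = 0.0
        if s[pos] == ":":                 # read the branch length, if any
            pos += 1
            start = pos
            while s[pos] not in ",();":
                pos += 1
            length = float(s[start:pos])
        return (label, length, children)

    return node()

def patristic(newick):
    """Return {frozenset({tip_a, tip_b}): distance} for all pairs of tips."""
    tree = parse_newick("".join(newick.split()))  # drop any whitespace
    dist = {}

    def walk(node):
        # Returns {tip: distance from the tip to the parent of this node}.
        label, length, children = node
        if not children:
            return {label: length}
        seen = {}                         # tip -> distance to this node
        for child in children:
            below = walk(child)
            for a, da in seen.items():
                for b, db in below.items():
                    dist[frozenset((a, b))] = da + db
            seen.update(below)
        return {tip: d + length for tip, d in seen.items()}

    walk(tree)
    return dist

# Hypothetical three-tip tree.
d = patristic("((A:1,B:2):0.5,C:3);")
print(d[frozenset(("A", "B"))])  # 1 + 2 = 3.0
print(d[frozenset(("A", "C"))])  # 1 + 0.5 + 3 = 4.5
```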

Event-based methods

- Jane uses special Nexus input files containing the host and parasite trees and the distribution of parasites on hosts. You can create an input file based on the example files given on the website and the files used with CopyCat.

- Open your Nexus file (File ➙ Open Trees). In Solve Mode, define event costs via Settings ➙ Set Costs, and the host switch distance via the same menu (Set Host Switch Parameters). Population Size and Number of Generations are parameters of the search algorithm (higher values give a more thorough search but a longer run).
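To see how event costs drive an event-based analysis, it helps to compute the total cost of a reconstruction by hand: each event type is counted and multiplied by its cost, and the cheapest reconstruction is preferred. The sketch below is only an illustration. The event names, cost values, and event counts are assumptions chosen for the example; use whatever you actually set in Set Costs.

```python
# Assumed cost scheme for illustration; replace with the values set in Jane.
costs = {
    "cospeciation": 0,
    "duplication": 1,
    "host_switch": 2,
    "loss": 1,
    "failure_to_diverge": 1,
}

def total_cost(event_counts, costs):
    """Sum of (number of events x cost per event) over all event types."""
    return sum(costs[event] * n for event, n in event_counts.items())

# Two made-up reconstructions of the same host-parasite system.
solution_a = {"cospeciation": 8, "duplication": 1, "host_switch": 3,
              "loss": 2, "failure_to_diverge": 0}
solution_b = {"cospeciation": 5, "duplication": 2, "host_switch": 5,
              "loss": 0, "failure_to_diverge": 0}

cost_a = total_cost(solution_a, costs)  # 0*8 + 1*1 + 2*3 + 1*2 + 1*0 = 9
cost_b = total_cost(solution_b, costs)  # 0*5 + 1*2 + 2*5 + 1*0 + 1*0 = 12
print(cost_a, cost_b)
```

Changing the cost scheme can change which reconstruction is cheapest, which is why it is worth rerunning the analysis with several sets of costs.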

- Once solutions are computed, open them by clicking on the corresponding line.

- To test the significance of the reconstruction, choose Stat Mode, enter the number of random reconstructions via Sample Size (e.g. 999, then add the observed value by checking Include original problem instance), and set the parameters of the genetic algorithm again. Results can be interpreted using the histogram and the Statistics window.
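The reason for using 999 random reconstructions plus the observed one is that the p-value is then computed over 1000 values. A minimal sketch of this usual permutation-test convention follows; it is not taken from Jane's code, the function name is hypothetical, and the random costs are an artificial, deterministic list used only to make the arithmetic easy to follow.

```python
def permutation_p_value(observed_cost, random_costs, include_original=True):
    """Proportion of reconstructions whose cost is as low as the observed one.

    With include_original=True the observed value is counted as one member
    of the reference distribution, so the p-value can never be 0.
    """
    as_good = sum(1 for c in random_costs if c <= observed_cost)
    if include_original:
        return (as_good + 1) / (len(random_costs) + 1)
    return as_good / len(random_costs)

# Artificial example: 999 random costs (10, 11, ..., 1008), observed cost 14.
random_costs = list(range(10, 1009))
p = permutation_p_value(14, random_costs)
print(len(random_costs), p)  # 5 random costs are <= 14, so p = (5 + 1) / 1000
```

A small p-value means that few random host-parasite associations can be reconstructed as cheaply as the observed one, which supports a cophylogenetic signal.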

- You can also open the same input file with TreeMap 3.0, which is still a beta version written in Java that is supposed to work on all platforms (but may not... Give it a try if you have time). TreeMap 3.0 can at least be used to draw the tanglegram of the association.

2. Lateral Gene Transfer

Software: SeaView (or MEGA, etc.), Blast (via NCBI), Jane, TreeMap 1.0

- The objective is to find out whether the 4 genes contained in the file CandidatesLGT.rtf (in the Sequences_LGT folder) have potentially undergone lateral gene transfer (students can split into 4 groups, each investigating the evolution of one gene). You first need to blast these genes against the nr database in GenBank (phyletic approach): go to http://blast.ncbi.nlm.nih.gov/ and choose protein blast.

- Keep the first best blast hits (BBH) that differ from the query (which will of course be the top hit) in a text file in Fasta format.
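If you download the hit sequences as one Fasta file, the filtering step can also be scripted. The sketch below is one possible way to do it in Python, under the assumption that hits are listed from best to worst and that a hit "identical to the query" means an identical sequence. The function names, headers, and sequences are hypothetical toy data.

```python
def read_fasta(text):
    """Parse Fasta text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def keep_hits(hits, query_seq, n):
    """Keep the first n hits whose sequence differs from the query."""
    kept = []
    for header, seq in hits:
        if seq.upper() != query_seq.upper():
            kept.append((header, seq))
        if len(kept) == n:
            break
    return kept

def to_fasta(records, width=60):
    """Format (header, sequence) tuples as Fasta text."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"

# Toy data: the first hit is the query itself and is dropped.
query = "MKTAYIAKQR"
hits_text = """>hit1 query itself
MKTAYIAKQR
>hit2 hypothetical protein
MKTAYLAKQR
>hit3 hypothetical protein
MRTAYLAKHR
>hit4 hypothetical protein
MRSAYLGKHR
"""
best = keep_hits(read_fasta(hits_text), query, n=2)
print(to_fasta(best))
```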

- Copy these sequences together with the reference sequences potentially homologous to the same gene (from the 3 domains of life), which you will find as Fasta files in the Sequences_LGT folder. If you want, you can look for additional reference sequences in GenBank (http://www.ncbi.nlm.nih.gov/protein/) and add them to this file.

- Open your resulting Fasta file in SeaView and align the sequences (Align ➙ Align all; ClustalW2 or Muscle can be selected in the options). [Of course, you can also align the sequences and build the phylogenetic trees with other software such as MEGA: File ➙ Open A File/Session…, then Align, save the alignment in MEGA format, and use the Phylogeny menu.]

- Make a phylogenetic tree based on your alignment using ML or a distance method in SeaView via Trees (phylogenetic approach). Note that you can easily select sites and/or species via Site ➙ Create set and Species ➙ Create group.

- Looking at this tree, is there any evidence of lateral gene transfer for the candidate sequence? From the host or from another taxon?

- You can use this tree and a reference tree (such as the tree built for the DNA polymerase gene, Ref_dpo_tree.tre, which can be opened in SeaView, from the alignment Ref_dpo.tree.fasta) with Jane or TreeMap to infer an LGT scenario. You may have to use a tree with selected taxa, which you can build from the alignment in SeaView. You can also directly use the input files already prepared for an LGT study using event-based cophylogenetic programs. These files are in the Cophylo_LGT folder.
