MacClade and Maximum Parsimony Trees
Introduction:
Evolutionary relationships between species are often diagrammed as trees. The trees consist of a root that represents a single common ancestor for the whole tree, internodes that represent either real or hypothetical common ancestors for specific lineages within the tree, branches that describe relationships between ancestral species and their descendants, and terminal nodes that represent the taxa being studied.
DNA, RNA, and protein data can all be used to generate such trees (as can more traditional characters, such as the presence or absence of a placenta). The assumption made in molecular analyses is that closely related species will have more similar gene or protein sequences than distantly related species. Furthermore, since mutations are both rare and random, it is assumed that the same mutation is unlikely to arise independently in two different lineages. Thus, if you have four species with the following DNA sequences: GAATTC, GATTTC, GATCTC, and GAGTTC, it would be more parsimonious to arrange the tree as shown in “A”, where the A→T mutation at the third position happens in only one lineage and the tree requires only three mutations, than as shown in “B”, where four mutations are required to generate the tree and the A→T mutation happens in two separate lineages.
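[Figure: two candidate trees, “A” and “B”, each rooted at GAATTC. Tree A (terminal taxa GATTTC, GATCTC, GAGTTC) is labeled with three mutations: A→T, T→C, and A→G. Tree B (terminal taxa GATCTC, GAGTTC, GATTTC) is labeled with four mutations: A→G, T→C, and A→T twice.]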
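The mutation counts for the two trees can be checked by hand or with a few lines of code. The Python sketch below is only an illustration and is not part of MacClade: it writes each tree as a list of (ancestor, descendant) branches and counts the positions that differ along each branch. The sequences assigned to the internodes (GATTTC in tree A, GAATTC in tree B) are assumptions chosen to match the mutations described above, and the names `hamming` and `tree_length` are made up for this example.

```python
def hamming(parent, child):
    """Count the positions at which two equal-length sequences differ."""
    if len(parent) != len(child):
        raise ValueError("sequences must have the same length")
    return sum(p != c for p, c in zip(parent, child))


def tree_length(branches):
    """Total number of changes summed over all (ancestor, descendant) branches."""
    return sum(hamming(parent, child) for parent, child in branches)


ROOT = "GAATTC"

# Tree A: one internode (assumed to be GATTTC) is shared by GATTTC and GATCTC.
tree_a = [
    (ROOT, "GATTTC"),      # root -> internode, A->T
    ("GATTTC", "GATTTC"),  # internode -> terminal, no change
    ("GATTTC", "GATCTC"),  # internode -> terminal, T->C
    (ROOT, "GAGTTC"),      # root -> terminal, A->G
]

# Tree B: GATCTC and GAGTTC share an internode (assumed to be GAATTC).
tree_b = [
    (ROOT, "GAATTC"),      # root -> internode, no change
    ("GAATTC", "GATCTC"),  # internode -> terminal, A->T and T->C
    ("GAATTC", "GAGTTC"),  # internode -> terminal, A->G
    (ROOT, "GATTTC"),      # root -> terminal, A->T (second time)
]

print("Tree A length:", tree_length(tree_a))
print("Tree B length:", tree_length(tree_b))
```

Tree A comes out one step shorter than tree B, which is why it is the more parsimonious arrangement.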
In this exercise you will use MacClade to generate cladograms or phylograms (you can choose) that have the smallest tree length (analogous to the total number of mutations needed to account for the tree) while minimizing the number of times that a single mutation arises in multiple lineages. You will then compare your MacClade tree to the one automatically generated for you by the Phylogenetic analysis in the Rarely Reclusive exercise.
Procedure:
Save the National Biomedical Research Foundation (NBRF) format file from the syllabus (filename mitochondrial.NBRF) to your desktop by control-clicking on the file name.
MacClade Analysis of Mitochondrial Sequences
Open MacClade
1. Double Click on the MacClade Icon.
2. Choose ‘Open File’. (Your computer may open a Finder window instead of allowing you to choose ‘Open File’. This simply takes you directly to step 3, so proceed!)
3. Open ‘mitochondrial.NBRF’ from your desktop.
4. Click ‘OK’ to verify that your file is an NBRF DNA file.
5. You should now see that most of your sequence files (minus a few that had poor sequence quality) have been uploaded into the MacClade program as seen in the picture below:
6. While this window contains all of your sequences (the taxa in this case are the individual students, and the characters are the individual DNA bases at each position of the sequence), the sequences are not yet aligned, so we need to do that right after you save your data.
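If you are curious what MacClade is reading, an NBRF (also called PIR) file is plain text. The Python sketch below shows one way such a file could be read outside of MacClade. It assumes the usual NBRF layout: a header line such as `>DL;name`, one description line, and then sequence lines ending with `*`. The function name `parse_nbrf` and the two short sample records are made up for illustration and are not taken from mitochondrial.NBRF.

```python
def parse_nbrf(text):
    """Return a {name: sequence} dictionary from NBRF/PIR-formatted text."""
    records = {}
    lines = iter(text.splitlines())
    for line in lines:
        if not line.startswith(">"):
            continue  # skip anything that is not a record header
        # Header looks like ">DL;mito124": a two-letter type code, ';', then the name.
        _code, _, name = line[1:].partition(";")
        next(lines, "")  # the line after the header is a free-text description
        chunks = []
        for seq_line in lines:
            seq_line = seq_line.strip()
            if seq_line.endswith("*"):  # '*' marks the end of the sequence
                chunks.append(seq_line[:-1])
                break
            chunks.append(seq_line)
        # Join the lines and remove any spaces used to group the bases.
        records[name.strip()] = "".join("".join(chunks).split()).upper()
    return records


# Made-up sample data in the assumed layout (not real class sequences).
sample = """>DL;mito124
hypothetical student sequence
GATCACAGGT CTATCACCCT
ATTAACCACT*
>DL;mito126
hypothetical student sequence
GATCACAGGT*
"""

for taxon, sequence in parse_nbrf(sample).items():
    print(taxon, len(sequence), sequence)
```

Each record becomes one taxon (one row in the MacClade window), and each base becomes one character (one column).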
Save Data
1. Under FILE choose SAVE FILE AS, and give your file a name and save it to the desktop. This will make sure you do not lose your data if something goes wrong during later manipulations. Don’t forget to throw away all of your files and to empty the trash before you put your computer away for the semester.
Aligning DNA Sequences
1. In order to analyze how closely related these gene sequences are using MacClade, you need to make sure that each sequence is properly lined up.
2. MacClade’s Alignment Tool can be seen in the toolbox in the bottom left of your MacClade window, as indicated by the arrow below:
3. Click and hold on the Alignment Tool. In the popup box, select ‘slow method using less memory’.
4. To align mito124 with mito126, make sure that the alignment tool is selected. Now, click on mito126, and drag up to mito124. Release. The computer will now line up the two sequences so that they match as closely as possible. Save your data. I recommend that you save after every alignment step below, because this program sometimes crashes.
5. Next, align mitoTP10 with mito126 by dragging the alignment tool from mitoTP10 up to mito126 and releasing. Note that the mitochondrial sequences have to shift and insert gaps to find their best matches with the other sequences.
6. Repeat, until you have finished aligning all of the sequences to the sequence above them in a pairwise fashion (yes it’s a pain to only align them pairwise…but that is all this program can do).
7. When you finish, you should be able to see (the bases are each color-coded) that the sequences line up really nicely now. You are ready to generate a phylogenetic tree.
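To see what a pairwise alignment step does, here is a minimal Python sketch of a classic global alignment method (Needleman-Wunsch). This handout does not say which algorithm MacClade's Alignment Tool uses, so treat this purely as an illustration of the idea of inserting gaps to line up matching bases. The scoring values (match +1, mismatch -1, gap -1) and the function name `needleman_wunsch` are assumptions made for this example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning the first i bases of a with the first j of b
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # pair the two bases
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


top, bottom, total = needleman_wunsch("GATTTC", "GATTC")
print(top)
print(bottom)
print("score:", total)
```

The shorter sequence receives one gap character so that the remaining bases line up, which is the same kind of shifting you see in the MacClade data window after each drag.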
Manipulate Data
1. Generate tree
a) Under WINDOWS choose TREE WINDOW
b) Choose DEFAULT LADDER. A tree should now appear! This tree makes no assumptions at all about how your taxa are related. In fact, if you look at the tree, it simply places the taxa in the same order in which the sequences were entered.
c) Under TRACE choose TRACE ALL CHANGES. This will provide a color visualization of the number of nucleotide changes between a common ancestor and the next ancestor or terminal taxon. If you place the cursor arrow on a tree branch, the number of unambiguous nucleotide changes in that branch will be written in the small box on the bottom right.
d) Under DISPLAY choose ‘Tree Shape and Size’. The choices, from left to right below, are an angled-branch cladogram, a square-branch cladogram, or a phylogram.
2. Minimize Tree
a) Find the most parsimonious tree by clicking on branches, and then dragging them to new tree locations, remembering as you do so that you are changing the assumptions about how the species are related. Your goal is to search for the tree that gives the lowest number of “steps” (analogous to the total number of nucleotide changes necessary to account for the proposed evolutionary relationships, so lower is better). If you want to know what the theoretical smallest tree is, you can choose S → S minimum possible from the upper program bar. Both the current tree length and the minimum possible tree length will be displayed in a small box in the lower right-hand corner of your tree. Try to get as close to the minimum as possible.
b) Try manipulating the tree so it looks like the ‘ideal’ tree from your Dolan DNA Center analysis. Have you found the minimum yet?
c) Now that you’ve played around with the tree a bit, it is time to tell you a little secret – the computer will actually help find the smallest tree for you. Simply click on the ‘search above’ icon in the toolbox, and then move the cursor to the root of the tree. When you click on the root, the program will search for, and display, the shortest tree above that spot.
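The idea of "steps" and of a "minimum possible" length can be sketched in Python. The example below is not MacClade's code. It uses Fitch's method, a standard way to count the fewest changes a given tree needs when the ancestral states are not fixed in advance, and it takes the minimum possible length to be the number of different bases at each position minus one, summed over all positions (a common definition for unordered characters, assumed here to correspond to the number MacClade reports). The four taxa `t1` to `t4`, their sequences, and all function names are made up for illustration.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, number of changes) for one position."""
    if isinstance(tree, str):          # a terminal taxon
        return {states[tree]}, 0
    left, right = tree                 # an internode with two descendants
    left_set, left_changes = fitch_site(left, states)
    right_set, right_changes = fitch_site(right, states)
    shared = left_set & right_set
    if shared:                         # descendants can agree: no new change
        return shared, left_changes + right_changes
    return left_set | right_set, left_changes + right_changes + 1


def fitch_tree_length(tree, seqs):
    """Sum the Fitch change counts over every aligned position."""
    n_sites = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {taxon: seq[i] for taxon, seq in seqs.items()})[1]
        for i in range(n_sites)
    )


def minimum_possible_length(seqs):
    """Each position needs at least (number of distinct bases - 1) changes."""
    return sum(len(set(column)) - 1 for column in zip(*seqs.values()))


# Made-up aligned sequences for four hypothetical taxa.
seqs = {"t1": "AACG", "t2": "AACT", "t3": "GTCG", "t4": "GTAT"}

# Two ways of grouping the same taxa, written as nested pairs.
tree_x = (("t1", "t2"), ("t3", "t4"))
tree_y = (("t1", "t3"), ("t2", "t4"))

print("Tree X length:", fitch_tree_length(tree_x, seqs))
print("Tree Y length:", fitch_tree_length(tree_y, seqs))
print("Minimum possible:", minimum_possible_length(seqs))
```

In this made-up data set, tree X is shorter than tree Y, yet it is still longer than the minimum possible value. This is the situation you may run into in MacClade: when the same change has to occur in more than one lineage, no rearrangement of the tree reaches the theoretical minimum.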