XII. Seed Plant Phylogeny

In this laboratory, you will become familiar with the problems facing systematic biologists who want to develop ideas about the evolutionary history of groups of living things. Though you might think that figuring out the evolutionary history of groups like the dinosaurs or the seed plants would be easy, it turns out that it’s not. Both theoretical and practical problems make inferring evolutionary history one of the most challenging of the life science disciplines.

Like other disciplines in biology, the study of evolution proceeds through the experimental cycle, depending on the construction of hypotheses from observations and the rejection or retention of these ideas based on experimental work. What’s unique is that these hypotheses address events in the past, often millions of years ago, but they are based on observations made today - a process known as remote inference. We shall in all likelihood never see major groups of organisms evolving over long periods of time, so we are stuck with guessing at the origin of the major groups through reconstructing history from the fragments of evidence still unobscured by later events.

Perhaps the most important thing to understand about constructing evolutionary hypotheses is that the work is quite simply a comparison of groups of living things. Consequently, all of this work depends on defining what groups of living things you are studying and what characters of those living things you are going to use in your analysis. Thus, at the heart of any evolutionary study is a table of characters for different groups of organisms - you should fix this table in your mind as a central feature of this work - it symbolizes the set of observations from which the hypothesis can be constructed.

Over the weeks since spring break, you have spent a lot of time learning about the morphology of the various groups of seed plants. These observations are the characters that can be used to construct a hypothesis about evolutionary history. You will construct these hypotheses using two programs commonly used by systematists, PAUP and MacClade. Finally, we will consider alternatives to our hypothesis to learn more about the quality of our ideas.

I. Introduction to the Study Group - the Seed Plants

The Seed Plants are a group of 58 orders of plants, including familiar groups like the pine trees and bizarre and unusual plants like Welwitschia, a plant from the Namibian desert in southwestern Africa with an underground stem and two leaves that heap up like giant ribbons as they grow. Most of the diversity is in the flowering plants, which have seeds enclosed in fruits - there are 40 orders and 250,000 species of flowering plants. Of the other eighteen orders, most (eleven) are extinct - leaving seven orders and about 600 species of living seed plants other than flowering plants.

Eleven major lineages of extant seed plants are included in these seven orders and the angiosperms:

Cycadales - the cycads, palm-like plants from tropical regions

1. cycads - Cycadaceae

Ginkgoales - ginkgo, a bizarre tree from Japan and China

2. ginkgo - Ginkgo

Taxales - the yews, evergreen trees and shrubs with red fleshy seeds

3. yew - Taxaceae

Coniferales - the pines, cedars, redwoods, podocarps, Norfolk Island pines, and their allies

4. redwoods and cedars - Taxodiaceae

5. Norfolk Island pines – Araucariaceae

6. pines – Pinaceae

7. podocarps – Podocarpaceae

Ephedrales – leafless desert shrubs

8. Ephedra

Welwitschiales - the bizarre Namibian plant mentioned above

9. Welwitschia

Gnetales - woody vines and a tree from tropical rain forests

10. Gnetum

Angiosperms - 40 orders of flowering plants

11. angiosperms

The big questions have always been: 1) what has been the evolutionary history of these plants? and 2) from what group of seed plants did the angiosperms arise? (Charles Darwin himself called the origin of the flowering plants “an abominable mystery.”) These questions are interesting because it is clear that several seed plant groups have evolved in concert with major groups of animals: the cycads, ginkgoes, and conifers with the plant-eating dinosaurs like Apatosaurus, Triceratops, and Stegosaurus, and the angiosperms with the insects. Understanding the evolution of the major animal groups is directly dependent on understanding the evolution of the seed plant groups.

The goal of this lab is for you to work on the problem of seed plant evolution using data from the recent literature and, in the process, to familiarize yourself with the plants, some of their characters, and the techniques available for constructing hypotheses about evolutionary history.

Building the Evolutionary Tree via Computer and Manipulating It to Test Ideas

One recent analysis of the seed plants, done by Kevin Nixon and coworkers (Nixon et al., 1994) included 103 characters for 14 living groups and 14 extinct groups - a data table of 103 x 28 cells! No one in their right mind would attempt to build a tree by hand from the huge pile of data in this data set! What we do instead is leave it to computers to do the job. You must still study the plants and collect the character states, build the data table, and tell the computer how to do the analysis, but most of the time is in finding the shortest tree, and this process the computer can do.

In this exercise we will use Nixon’s data set, though only a subset of the characters and only for the living seed plants, to infer phylogenies (that is, build trees that hypothesize evolutionary history) using computer programs. Though there is an array of available programs, most do basically the same thing - they build networks based on similarities, then root the trees using the outgroup criterion (that is, a character state shared by the outgroup and some of the members of the study group is primitive; see p. 8). For this exercise, we will use two different programs, because each has different capabilities. The programs are:

-- PAUP (Phylogenetic Analysis Using Parsimony) - to find the tree with the fewest character-state changes (the shortest, or most parsimonious, tree).

-- MacClade - to 1) show the tree, 2) learn about the character distributions on the shortest tree, and 3) consider alternative trees that are longer but assume possibly more appealing evolutionary histories.
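To make “fewest character-state changes” concrete, here is a minimal Python sketch that counts the steps a single unordered character requires on a fixed, rooted tree, using Fitch's counting method. The four tips, their states, the two tree shapes, and the function name are all made up for illustration; this is not PAUP's own code and not the lab's data.

```python
def fitch_steps(tree, states):
    """Return (state_set, steps) for one unordered character on a rooted binary tree.

    tree   -- a tip name (str) or a 2-tuple of subtrees
    states -- dict mapping each tip name to its character state
    """
    if isinstance(tree, str):              # a tip: its state set is its own state
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    shared = left_set & right_set
    if shared:                             # the two branches can agree: no extra change
        return shared, left_steps + right_steps
    # the two branches disagree: one change is needed at or below this node
    return left_set | right_set, left_steps + right_steps + 1


# Hypothetical toy data: one two-state character scored for four made-up tips.
toy_states = {"A": 0, "B": 0, "C": 1, "D": 1}
tree_1 = (("A", "B"), ("C", "D"))   # groups tips that share a state
tree_2 = (("A", "C"), ("B", "D"))   # splits them up

steps_1 = fitch_steps(tree_1, toy_states)[1]
steps_2 = fitch_steps(tree_2, toy_states)[1]
print(steps_1, steps_2)
```

For this toy character the first tree needs one change and the second needs two, so the first is the shorter (more parsimonious) of the two. A real analysis adds up such counts over every included character and compares many candidate trees.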

A. Finding the Shortest Tree using PAUP

For this part of the exercise, you need to get used to Macintosh computers if you have not used them before. All the files you need should be in your folder - but make sure you have them; there are three –

A copy of PAUP* 4.0b10

A copy of MacClade version 4.05

A copy of the seed plant data file, “seedplant.nxs”

Okay, let’s begin.

1. Locate the “seedplant.nxs” file and make a copy. Rename the copy so that it is your own version of the data file: call it “yourname.nxs” and save it on the desktop.

2. Open “yourname.nxs” in Microsoft Word. Now look at the structure of the file. The design of this file is a consequence of the programmer’s approach to inferring phylogenies using a computer and the particular programming language he chose. Let’s take a tour of the file.

a. At the top of the file is the file-type label, “#NEXUS”. The third line, “BEGIN DATA;” is a signal to the computer that the data are about to be fed to it. Then, under “DIMENSIONS”, the file describes the number of groups we are going to provide data for (NTAX=11), then the number of characters provided for each of these groups (NCHAR=103).

b. Then, in the fifth line, comes some language defining characters in the data table. These are useful to understand.

FORMAT MISSING=? --- If you look at the MATRIX (data table) below, you will see a number of question marks. These may mean one of three things:

i. truly missing data (for instance if no one has ever studied that part of the plant)

ii. data missing because a structure has never evolved in a group (for instance, fruit structure for a pine tree makes no sense because pines don’t have fruits)

iii. in the case of our data (from Nixon et al., 1994) a character that

- varies within a group

- is confusing as to its homology (for instance it may be unclear exactly what a leaf is in a group)

GAP=. doesn’t apply to this data set.

SYMBOLS= “0,1,2,3” defines the numbers to be used to represent character states. In this larger data set, some characters have more than two character states. For instance, consider character 27, vein orders. This character represents the number of times that veins branch, and the number varies. Nixon et al. chose to divide the character into three character states - not branched, branched once, and branched twice or more. In the matrix these character states are represented by 0, 1, and 2.

The vein-order number character also introduces the idea of defining primitive versus derived character states. This particular character is considered as “ordered”, that is unbranched veins are primitive, singly branched veins are derived from unbranched veins, and the more complex branching (branched twice or more) is in turn derived from the singly branched veins.

Sometimes, scholars do not want to define character states as primitive or derived, in which case the character is labeled “unordered”.
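The practical difference between ordered and unordered characters is in how changes are counted. The short Python sketch below assumes the usual parsimony convention: an ordered character is charged one step for each state it passes through, while an unordered character is charged one step for any change. The function name is made up for illustration and is not a PAUP command.

```python
def step_cost(from_state, to_state, ordered):
    """Steps charged for one change between two states of a character."""
    if from_state == to_state:
        return 0
    if ordered:
        # ordered: the change must pass through every intermediate state
        return abs(from_state - to_state)
    # unordered: any state can change directly into any other
    return 1


# Vein orders: 0 = not branched, 1 = branched once, 2 = branched twice or more
ordered_cost = step_cost(0, 2, ordered=True)
unordered_cost = step_cost(0, 2, ordered=False)
print(ordered_cost, unordered_cost)
```

Going from unbranched veins straight to veins branched twice or more counts as two steps if the character is ordered, but only one step if it is unordered, so the choice can change which tree comes out shortest.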

c. Next comes the MATRIX command, which tells the computer that the data are next in the file. It ignores the numbers in brackets, i.e. [10 20...], which are there to help you tell which character number you are looking at in the matrix below.

The matrix itself includes a number or a question mark for each character for each of the eleven living groups (evolutionary lineages) for which we are asking the program to build a tree.

d. Down at the bottom of the file is a block of commands under “BEGIN ASSUMPTIONS”. These tell the computer about our particular choices for analyzing these data.

i. TYPESET * mixed distinguishes between the unordered and ordered characters in the data set. These are Nixon et al.’s opinions about which is which, from the original article.

ii. EXSET * exclude is the list of characters we are excluding in this exercise. These are removed so that just enough characters (32) are included to give us the same answer as all 103 characters, and it will make it simpler for you to deal with the trees when you make them.

The final characters are ------{13 17 20 22 33-35 38-40 47 57-61 64 65 70 71 73 74 83-92} --- Here are the characters and their character states:

CHARACTER / KIND / STATE 0 / STATE 1 / STATE 2
13. Vessels / - / absent / present / -
17. Lignin Subunits / - / vanillin / syringal groups / -
20. Resins / - / absent / present / -
22. Leaf Base / - / simple / sheathing / -
33. Stomates / - / haplocheilic / some or all syndetocheilic / -
34. Astrosclereids in leaf / - / absent / present / -
35. Strobili / additive / unisexual / bisexual / functionally unisexual
38. Microsporophylls / - / spiral / whorled/opposite / -
39. Microsporophylls / - / free / basally fused / -
40. Microsporangia per unit / - / many / 1-4 / -
47. Leptomate Aperture / - / absent / present / -
57. Microgametophyte / additive / more than four-nucleate / 4-nucleate / 3-nucleate
58. Pollen Tube / - / suspended / penetrating / -
59. Ramiform Pollen Tubes / - / absent / present / -
60. Stalk Cell / - / present / absent / -
61. Sperm / - / flagellate / non-flagellate / -
64. Woody Cones / - / absent / present / -
65. Compound Cone Units / - / many / few / -
70. Ovules / - / orthotropous / anatropous / -
71. Micropyle / - / normal / tubular / -
73. Ovule Growth / - / pachychalazal / endochalazal / -
74. Outer Seed Envelope / - / absent / present / -
83. Megaspore Tetrad / non-additive / tetrahedral / linear / isobilateral
84. Megaspore Wall / - / thick / thin/absent / -
85. Megagametophyte / - / monosporic / tetrasporic / -
86. Megagametophyte / - / alveolar / nonalveolar / -
87. Archegonia / - / present / absent / -
88. Egg / - / cellular / free-nuclear / -
89. Early Embryogeny / - / free-nuclear / cellular / -
90. Embryo Maturity / - / postshed / preshed / -
91. Embryo Feeder / - / absent / present / -
92. Seed Germination / - / cryptocotylar / phanerocotylar / -

iii. ANCSTATES allzero = 0:ALL. This is the way in which we actually root the tree, in this case by defining all the zero character states in the data table as most primitive. Nixon et al. decided on which character states were most primitive by the outgroup-comparison method we discussed last week, and coded the character states so that 0 is most primitive for each one.
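As a quick check on the bookkeeping, the list of final characters given above can be expanded and counted. The Python sketch below is only an illustration; the helper name is made up, and PAUP does this expansion itself when it reads the file.

```python
def expand_charset(spec):
    """Expand a list such as '13 17 33-35' into individual character numbers."""
    numbers = []
    for token in spec.split():
        if "-" in token:
            # a range like 33-35 stands for 33, 34, 35
            first, last = token.split("-")
            numbers.extend(range(int(first), int(last) + 1))
        else:
            numbers.append(int(token))
    return numbers


included = expand_charset(
    "13 17 20 22 33-35 38-40 47 57-61 64 65 70 71 73 74 83-92"
)
total_characters = 103                     # NCHAR in the data file
n_included = len(included)
n_excluded = total_characters - n_included
print(n_included, n_excluded)
```

The list expands to 32 characters, which leaves 71 of the 103 excluded.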

3. It’s time to run the program. Click on the PAUP icon to launch PAUP, the program that will build your phylogenetic tree for you. Then type in Execute yourname.nxs; to have PAUP read your file.

a. PAUP will first read the data for our 11 groups using the 32 selected characters. PAUP reports on its work in a display buffer. You should see the following messages:

Processing of file “yourname.nxs” begins...

Data matrix has 11 taxa, 103 characters

Valid character-state symbols: 0123

Missing data identified by ‘?’

Gaps identified by ‘-’

Character-exclusion status changed:

71 characters excluded

Total number of characters now excluded = 71

Number of included characters = 32

Processing of “yourname.nxs” completed.

PAUP counts all the characters in the data file to come up with 103, but will only use 32, because of the EXSET * exclude command we included. Taxa are our groups.

b. Next we need to ask PAUP to search for the shortest tree. Type in HSEARCH; to start a heuristic search. This sort of search is the fastest, but it may not find the shortest tree of all - we choose it to save you time.

Notice that the program tried 416 rearrangements and found one shortest tree that is 37 steps long.

c. Next we want to view the tree. Type SHOWTREES; to view a primitive version of the shortest tree in the display buffer window. To manipulate this tree, though, the other program, MacClade, is better. So save the tree - type SAVETREES; The name of your PAUP tree file will be ‘yourname.tre’.

Now quit PAUP by typing END; and clicking the red X at the top left of the terminal screen.

B. Next you’ll work with the MacClade program to understand the structure of your phylogeny (hypothesis of evolutionary history) for the living seed plants. Find the MacClade program and click on it twice to start it running.

1. Choose the File menu and select Open File... Choose the Desktop from the column at left. Now choose the data file you named, “yourname.nxs” (not the trees file). MacClade and PAUP recognize each other’s files, so, when you click Open, the file will open right up. This time, the data are really easy to see. The 11 groups are named in the first column, and the character states for each character are listed in the column under each character number. Scroll to the right to view all 103 characters, but remember, we are only using the 32 listed above.

2. To see how MacClade excludes characters, choose the Display menu and select Shade Character Sets. Then choose Excluded. This command shades the excluded characters in the table so you can easily see which are being used in the analysis and which are not.

3. Now it’s time to open the tree file using MacClade. Choose the Windows menu and select Tree Window. You will get a message that says that there are no tree files stored for this data file - choose the button that says “open tree file”.

A menu appears that allows you to choose your tree file, ‘yourname.tre’. Click on the file and then click the button to choose it (you can also double-click the file name).

MacClade now opens your tree file and a window called Trees. Close the Trees window. The tree itself will appear. Now you can get down to working with the tree.

4. Having gotten your tree ready to manipulate, there are three basic things to do:

First, trace a single character’s history and show the outcome in the character states typical of each group,

Second, customize the tree to show the character changes, and

Third, change the tree around to understand the effects of choosing a different history for the evolutionary groups.

Notice there is also a mini-window showing the tree information. There is also a Tools palette at the lower left of the screen. From the Tools palette, choose the symbol that looks like this:

This tool allows you to rotate the two branches at a node - try it out. Move the cursor on to the tree image and click on a line bearing two branches. You will see the branches rotate. One thing should become clear: the relative position of the groups around a branch point doesn’t matter!
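The same point can be shown with a small Python sketch: if a tree is stored in a way that ignores the left-to-right order of branches at each node, a rotated tree compares equal to the original, while a tree with different groupings does not. The groupings below are invented purely for illustration and are not the result you should expect from your own analysis; the function name is also made up.

```python
def canonical(tree):
    """Represent a rooted tree so that branch order at each node is ignored."""
    if isinstance(tree, str):              # a tip is just its name
        return tree
    # a node is the unordered set of its branches
    return frozenset(canonical(branch) for branch in tree)


# Hypothetical groupings, written as nested pairs of branches.
original = (("Ephedra", ("Welwitschia", "Gnetum")), "angiosperms")
rotated = ("angiosperms", (("Gnetum", "Welwitschia"), "Ephedra"))
regrouped = (("Ephedra", "Welwitschia"), ("Gnetum", "angiosperms"))

same_after_rotation = canonical(original) == canonical(rotated)
same_after_regrouping = canonical(original) == canonical(regrouped)
print(same_after_rotation, same_after_regrouping)
```

Rotating branches leaves the hypothesis unchanged; only moving a group to a different branch point produces a different evolutionary history.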

a. Tracing the History of Single Characters

For this section, here are two characters from the data table for reference.

CHARACTER / STATE 0 / STATE 1
13. Vessels / absent / present
20. Resins / absent / present

i. Choose Trace Character from the Trace menu. The changes in character 13 (vessels) are shown in color, with yellow primitive and blue derived.