RESULTS FROM PRIOR NATIONAL SCIENCE FOUNDATION SUPPORT
James R. Manhart
AWARD: BSR 8906126, $200,000, 15 July 89-31 Mar 93, "Molecular Systematics of Green Algae and Primitive Land Plants".
Project: The main research emphasis was the elucidation of phylogenetic relationships between green algae and primitive land plants using molecular comparisons of chloroplast genomes. These comparisons are being done at several levels, including: (1) genome gross structure, (2) genome fine structure, and (3) comparative gene sequencing.
Results: In contrast to land plants, green algal cpDNAs show tremendous variation: there is more variation in genome size in five charophyte genera than in most land plant cpDNAs examined. The gene that encodes polypeptide elongation factor EF-Tu (tufA) is absent from the chloroplast genomes of land plants, Spirogyra, and Sirogonium. It is present in the cpDNAs of all the other green algae surveyed. Introns are present in the tRNA Ala and Ile genes in both Coleochaete and Nitella cpDNA, whereas in Spirogyra the intron is present in the tRNA Ala gene only. This indicates that these green algae are members of the sister group to land plants and that Coleochaete and Nitella are closer to the land plant lineage than Spirogyra. The rps12 gene in Spirogyra maxima is organized in the same manner as in land plants. A 1813 bp group II intron is present in the rbcL gene in Codium. It has been demonstrated that the rbcL stem loops in non-flowering plants and the algae are highly divergent from those of angiosperms and from each other at the primary sequence level. DNA and derived amino acid sequences of the rbcL gene from selected photosynthetic bacteria, green algae, bryophytes, "fern allies", ferns, and seed plants were used to construct phylogenetic trees. Some clades on all the trees are supported both internally and externally, but all of the trees are characterized by unsupported relationships of the bryophytes and "fern allies". These results indicate that rbcL sequences alone are not adequate to test phylogenetic relationships among major groups of green plants.
PUBLICATIONS:
Manhart, J.R., K. Kelly, B.S. Dudock, and J.D. Palmer. 1989. Unusual characteristics of Codium fragile chloroplast DNA revealed by physical and gene mapping. Mol. Gen. Genet. 216: 417-421.
Manhart, J.R., R.W. Hoshaw, and J.D. Palmer. 1990. Physical and gene mapping of Spirogyra cpDNA reveals a unique chloroplast genome. Journal of Phycology 26:490-494.
Baldauf, S., J.R. Manhart, and J.D. Palmer. 1990. Different fates of the chloroplast tufA gene following its transfer to the nucleus in green algae. Proc. Natl. Acad. Sci. USA 87:5317-5321.
Manhart, J.R. and J.D. Palmer. 1990. The gain of two chloroplast tRNA introns marks the green algal ancestors of land plants. Nature 345:268-270.
Manhart, J.R. and R.A. VonderHaar. 1991. Intron revealed by nucleotide sequence of large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase from Codium fragile (Chlorophyta). J. Phycol. 27:613-617.
Lew, K.A. and Manhart, J.R. 1993. The rps12 gene in Spirogyra maxima (Chlorophyta): its evolutionary significance. Jour. Phycol. 29: 500-505.
Calie, P.C. and Manhart, J.R. 1994. Extensive sequence divergence in the 3' inverted repeat of the chloroplast rbcL gene in non-flowering land plants and algae. Gene 146: 251-256.
Albert, V.A., Bremer, K., Chase, M.W., Manhart, J., Mishler, B.D. and Nixon, K.C. 1994. rbcL gene sequences and phylogenetic studies of vascular plants. Ann Mo Bot Gard. 81: 534-567.
Manhart, J.R. 1994. Phylogenetic analysis of green plant rbcL sequences. Molec. Phylo. and Evol. 3: 114-127.
RESULTS FROM PRIOR NATIONAL SCIENCE FOUNDATION SUPPORT
Hugh D. Wilson
AWARD: BSR-8818018, $150,000, 1 June 1989 to 31 May 1994, "Domesticated Chenopodium of Mexico: Genetic Variation and Systematic Relationships", with James R. Manhart as co P.I. and ROA supplement (BSR-8818018-2, BSR-9144947, $24,244) for M. J. Warnock, Sam Houston State University.
Results: Project field work produced specimens and germplasm from throughout species ranges from Mexico to Alaska. Information gathered from populations examined during this field work provided the foundation for a general discussion of C. quinoa and its North American allies (Wilson, 1990). A rare instance of potential genetic contact between C. quinoa and C. berlandieri was encountered during field work in Washington State. Analysis of progeny from both cultivated C. quinoa plants and wild C. berlandieri plants from this area revealed high levels of interspecific hybridization at the site and the potential for crop/weed gene flow when C. quinoa is cultivated within the North American range of C. berlandieri (Wilson and Manhart, 1993). Phylogenetic analysis, using restriction site mutations of the chloroplast genome, of samples representing C. quinoa, C. berlandieri, and a suite of other Chenopodium species provided an overview of relationships among elements of the genus that refined our understanding of biological connections between the two species (Wilson and Manhart, in revision).
PUBLICATIONS:
Wilson, H. D. 1990. Quinua and relatives (Chenopodium sect. Chenopodium subsect. Cellulata). Economic Botany - Symposium Issue 44: 92-110.
Wilson, H. D. and James R. Manhart. 1990. Origin and dispersal of domesticated Chenopodium - evidence from chloroplast DNA restriction site mutations. Annual Meeting, Botanical Society of America, Richmond, VA. American Journal of Botany 77[6]:166. (abstract)
Wilson, H. D. 1991. Chenopodium quinoa/C. berlandieri - Crop/Weed Gene Flow. Annual Meeting, Botanical Society of America, San Antonio, TX. American Journal of Botany 78[6]:229. (abstract)
Wilson, H. D. and J. R. Manhart. 1992. Origin and systematic relationships of Chenopodium oahuense (Meyen) Aellen. Annual Meeting, Botanical Society of America, Honolulu, Hawaii. American Journal of Botany 79[6]:167. (abstract)
Wilson, H. D. and J. R. Manhart. 1993. Crop weed gene flow: Chenopodium quinoa Willd. and C. berlandieri Moq. Theoretical and Applied Genetics 86: 642-648.
Wilson, H. D. and J. R. Manhart. Chloroplast DNA restriction site variation and infrageneric classification of Chenopodium. (accepted - Systematic Botany - in revision)
PROJECT DESCRIPTION
OVERVIEW
Development of Internet information services including the World Wide Web (WWW), Hypertext Markup Language (HTML) and associated Web browser software is producing revolutionary, massive change in the global arena of information development and dissemination. The levels and rates of change in this area present the scientific community with both opportunity and challenge. We not only have the obvious opportunity for on-line electronic publication and dissemination of scientific materials but, indeed, the opportunity and challenge to fundamentally change the conduct of science.
We now have the capability to provide scientific data for immediate use by everyone who has access to the Internet. Emerging technologies allow this information to be presented using combined and/or linked media (visual, audio, text). As a result, information can be presented at different levels to enhance understanding by a broader base of clients. This is a significant opportunity in applying this new technology to Plant Systematics: a larger user base, greater exposure and, as a result, increased relevance and greater appreciation for the data and the discipline. The new medium also provides an opportunity to tap a broad base of taxonomic expertise via on-line input from individual specialists and data centers. This unique aspect of WWW information management could produce data systems that, once established electronically, evolve via input from the world community of plant taxonomists. This opportunity opens the potential to enhance communication within the community, perhaps even resulting in a move from localized or centralized 'authority' to consensus-based data systems that are constantly current and developing.
Plant Systematics is an established scientific discipline with strong functional links to the past and 'pre-electronic' traditions that will persist well into the future. If the discipline is to establish a presence in this new informatics arena of enhanced information flow, and thereby make full use of the opportunities offered, fundamental, traditional perspectives on the nature of information management and presentation will have to change. While traditional methods and approaches for systematic data presentation will, through cultural inertia, extend into the Web as 'electronic publication', this new medium offers unique opportunities that are relatively unexplored.
Full, efficient development of this new technology as a medium for the presentation of plant systematic data cannot be accomplished by either botanists or computer scientists working in isolation. The complex nature of both systematic botany and networked informatics, accentuated by the rapidly changing and shifting substrate of the Web, requires close, collaborative interaction involving individuals who are firmly established in both areas. In addition, those involved in this type of collaborative research, an activity that involves the disparate academic worlds of engineering and life sciences, must communicate and interact to reach an intellectual 'common ground' and establish working goals and products that are mutually satisfying. As an example, the computer scientists must have a basic understanding of nomenclatural systems and synonymy, whereas the botanists must be familiar with HTML and database manipulation. Indeed, both botanist and computer scientist must be willing to go beyond the 'traditional wisdom' in achieving results; an understanding of the practices, assumptions, and capabilities of each of the involved areas is needed to formulate effective solutions.
SPECIFIC PROBLEMS TO BE ADDRESSED
We assume the eventual development of an electronic national flora that will provide relevant biological information to a broad base of users. It will not be an electronic replication of a traditional flora, nor will it be a 'centralized', 'one-schema-fits-all' database. While it will eventually function as a 'live' system, in that qualified specialists will update/enhance the database on-line, its initial development will evolve from a structured base of information. In terms of priorities, visualization of taxon distributions is, in our view, a primary objective. While plant taxon distribution maps will eventually be produced from firm data (herbarium specimens), the initial, prototypic product will, again, be based on an extant base of information that will be enhanced when an on-line community input system is in place.
The proposed research concentrates on the three following areas/questions:
1. Mapped visualization (taxon geographic distributions with ancillary data color-coded) allows species distribution ranges and centers of diversity to be readily assimilated by the user. Can a system be devised that allows mapped visualization quickly, flexibly and efficiently using current WWW-based systems? Can this system be structured to allow adaptive change as new Internet technologies emerge? How can we maximize the information content of mapped taxon distributions?
2. There can be no standard 'accepted' nomenclatural system at the national level. This is precluded by differing taxonomic opinion and constant nomenclatural change resulting from on-going research. Can we devise systems that preserve local autonomy in this realm and also express national taxon distributions/data in a standard yet malleable form?
3. Extant herbarium data are taken within closed regional/state/local systems using various standards. How can we establish an electronic data system that develops through input from the national/global scientific community? How can 'keepers' of community data interact with those providing data to ensure data integrity and credit to the source?
HISTORY OF COLLABORATIONS AND RATIONALE
Collaborative work with WWW-based presentation of plant systematic data started at Texas A&M in August of 1994 with initial contacts between Hugh Wilson, Stephan Hatch, and Leland Ellis. Subsequent interaction involved the conversion of extant electronic files for WWW access via the 'Texas Plant Diversity Information Center' (URL 1). This initial effort, which included checklist, specimen, and critical taxa data, was completed in October of 1994. This was followed by collaborative work with the Texas A&M Institute for Scientific Computation (ISC) and Biology plant systematics faculty (Hugh Wilson and James Manhart) to establish WWW-based teaching/information systems on the ISC server (URL 2, URL 3). A Texas A&M Interdisciplinary Research Program project, funded in June, 1995 ($25,000), led to the formation of the current working group (Hatch, Manhart, Wilson, Leggett, and Furuta). This project focused on the Texas A&M contribution, as a charter member, to the Flora of Texas Consortium (FTC) (URL 4) and associated computerization of campus herbarium specimens. Development of plant name look-up tables for this enterprise produced an association with John Kartesz and the Biota of North America Project (BONAP) database (URL 5). Subsequent interaction between BONAP and the Texas A&M group produced research targets and broader collaboration that would be supported by this proposal.
Collaborative research of the Texas A&M working group (URL 6 - this page carries all URLs referenced in this proposal), supported by two grants (Wilson/Hatch, $144,635; Furuta/Leggett, $173,410) to three elements of the Texas A&M University System (Science, Agriculture and Life Sciences, and Engineering) from the Texas Higher Education Coordinating Board, will further develop the informatics/data contribution from Texas A&M to the Flora of Texas Project during 1996/97. We believe that the work supported by this funding, directed to issues surrounding the Texas flora, provides excellent leverage for expansion of our efforts to tackle similar problems at the national level. Doing so will permit coordination of both state-level and continent-level data and perspectives, with a corresponding increase in significance. Thus, support requested here would allow work with North American data, and full collaboration with BONAP, as a coordinated element within on-going WWW-based informatics research at Texas A&M.
CURRENT RESEARCH
MAPPED VISUALIZATION
The general problem to solve here is how to maximize taxon distribution mapping in terms of expanded data display, speed, efficiency, and operability with regard to emerging Internet systems.
Mapping applications that we have developed include: state/province-level mapping of distributions of species, genera, and families in the families Cactaceae (URL 7) and Chenopodiaceae (URL 8) in North America, with vegetation zone maps for Texas taxa; county-level mapping of plant specimens in the Texas A&M Department of Biology Herbarium (URL 9); county-level mapping of the coverage by various Texas herbaria specimen collections for the genera Bouteloua (URL 10) and Helianthus (URL 11); and county-level mapping of Arkansas taxa (URL 12).
The software components of the current mapping system are divisible into three classes. The components in one class take a bilevel image of a map and produce intermediate data files that drive the other two component classes. Components in the second class take these intermediate files and produce maps with the same borders as the bilevel original but with the connected regions shaded in arbitrary ways. Components in the third class allow one to use WWW browsing programs to trigger actions based on the direct selection of a region from a map.
Components in the first class take as their input a bilevel image in the public-domain Portable Bitmap (PBM) format. One component 'explores' this PBM file to discover the four-connected regions (sets of pixels adjacent in the horizontal and vertical directions) of the image. The output of this component drives another, which produces a run-length encoding of the image's regions and borders. This run-length encoded file is used to drive both the image display and region selection components. Other components of the first class are Common Gateway Interface (CGI) programs, invokable by HTML documents displayed in WWW browsers, which allow a user to provide a numerical designation of their own choosing to the regions of the image as opposed to the arbitrary numbering generated by the exploration program. These tools also allow the user to designate that a group of disconnected regions (e.g., the disjoint areas of the state of Michigan on a map of the U.S.) are part of the same logical region for display purposes.
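The exploration and encoding steps described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the in-memory list-of-lists bitmap (1 = border pixel, 0 = interior pixel), the use of label 0 for borders, and the (label, run length) output layout are all assumptions made for illustration.

```python
from collections import deque


def label_regions(bitmap):
    """Assign a positive label to each four-connected region of 0-pixels.

    Border pixels (value 1) keep label 0. Labels are handed out in scan
    order, which mirrors the 'arbitrary numbering' of an exploration pass.
    """
    height, width = len(bitmap), len(bitmap[0])
    labels = [[0] * width for _ in range(height)]
    next_label = 0
    for y in range(height):
        for x in range(width):
            if bitmap[y][x] == 1 or labels[y][x] != 0:
                continue  # border pixel, or already part of a region
            next_label += 1
            labels[y][x] = next_label
            queue = deque([(x, y)])
            while queue:  # breadth-first flood fill
                cx, cy = queue.popleft()
                # Only horizontal and vertical neighbours: four-connectivity.
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and bitmap[ny][nx] == 0 and labels[ny][nx] == 0):
                        labels[ny][nx] = next_label
                        queue.append((nx, ny))
    return labels


def run_length_encode(labels):
    """Flatten the label grid row by row into (label, run_length) pairs."""
    runs = []
    for row in labels:
        for label in row:
            if runs and runs[-1][0] == label:
                runs[-1][1] += 1  # extend the current run
            else:
                runs.append([label, 1])  # start a new run
    return [tuple(run) for run in runs]
```

In this sketch a run may continue across the end of a row, because the image is treated as one flat list of pixels. Renumbering regions, or merging disconnected regions into one logical region, would amount to applying a user-supplied mapping from the arbitrary labels to chosen designations before encoding.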
Components in the second class take a run-length encoded file of a map and output an image of the same map with certain regions colored according to program direction. A description of the regions to be colored and the colors to be assigned to them are passed to a program which reads in the run-length encoded file and, for each run, outputs a same-sized run of red-green-blue triples, either colored (if the run belongs to a colored region) or in a background color (if not). Conveniently, this list of red-green-blue triples is in the representation scheme used by the PBM format. This constructed PBM image is passed to the public domain 'pbmtogif' program to convert it into a Graphics Interchange Format (GIF) image, which is much compressed over the original (saving transmission time when viewing takes place over a network) and is displayable by graphical WWW browsers.
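A minimal sketch of this run-by-run coloring step follows. The function name, the (label, run length) input layout, the use of label 0 for border pixels, and the default colors are assumptions for illustration, not details taken from the project's software. The sketch emits the red-green-blue triples as a binary 'P6' pixmap, the color member of the PBM family of formats; conversion to GIF is left to an external tool.

```python
def render_pixmap(runs, width, height, region_colors,
                  background=(255, 255, 255), border=(0, 0, 0)):
    """Expand (label, run_length) pairs into a binary P6 pixmap.

    region_colors maps a region label to an (r, g, b) tuple. Regions that
    are not listed are drawn in the background color; label 0 is assumed
    to mark border pixels and is drawn in the border color.
    """
    pixels = bytearray()
    for label, length in runs:
        if label == 0:
            color = border
        else:
            color = region_colors.get(label, background)
        # One RGB triple per pixel, repeated for the whole run.
        pixels += bytes(color) * length
    if len(pixels) != width * height * 3:
        raise ValueError("runs do not cover exactly width * height pixels")
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(pixels)
```

Because each run is expanded with a single repeat operation, the cost of rendering is proportional to the number of runs plus the number of output bytes, and no region geometry has to be recomputed per request.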
Components in the third class take an (X,Y) pair, representing a selected pixel from the displayed map, and convert it into a single numeric value corresponding to that pixel's location in the image considered as a left-to-right, top-to-bottom list of pixels. The components can then, with little computational effort, read in the corresponding run-length encoded file, identify which region the selected pixel is in, and take appropriate action. Currently these components are CGI programs which are passed the results of mouse clicks in a WWW browser and output additional HTML documents for the browser.
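The pixel-to-region lookup can be sketched as follows. The sketch assumes the conventional row-major raster order (pixels listed left to right within a row, rows listed top to bottom) and the hypothetical (label, run length) layout; the function names are illustrative. A binary search over cumulative run ends is used here as one convenient choice; a simple linear scan of the runs would serve equally well for small maps.

```python
from bisect import bisect_right
from itertools import accumulate


def build_run_ends(runs):
    """Return the cumulative pixel offset at which each run ends (exclusive)."""
    return list(accumulate(length for _, length in runs))


def region_at(x, y, width, runs, run_ends):
    """Return the label of the region containing pixel (x, y).

    The pixel is first converted to a single offset into the flat,
    row-major list of pixels; the run covering that offset gives the label.
    """
    offset = y * width + x
    if not (0 <= x < width and 0 <= offset < run_ends[-1]):
        raise ValueError("pixel lies outside the image")
    # First run whose end offset is strictly greater than the pixel offset.
    index = bisect_right(run_ends, offset)
    return runs[index][0]
```

A CGI program built along these lines would parse the click coordinates supplied by the browser, call a lookup of this kind, and emit an HTML document for the selected region.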