Physarum Genome Project Update by Jonatha Gott; December, 2004

Background:

On August 4th, 2004, the NIH announced that Physarum polycephalum was one of 18 new model organisms whose genomes will be sequenced as part of the Comparative Genome Evolution project of the National Human Genome Research Institute (NHGRI). The anticipated end-product of this whole genome shotgun sequencing project is an assembled genome sequence with ~6x coverage. The press release can be viewed at http://www.genome.gov/12511858. NIH has designated Jonatha Gott (Case Western Reserve University, Cleveland, OH USA) as its contact person within the Physarum community.

An initial working group composed of many of the people who participated in gathering information for the NHGRI panel met in Innsbruck, Austria on July 12th, 2004 to discuss which strain should be used for DNA preparation, how the data should be handled, and how additional funding needed for mining the genome could be acquired. Participants were Sandie Baldauf (University of York, UK), Jonatha Gott (Case Western Reserve University, Cleveland, USA), Eggehard Holler (University of Regensburg, Germany), Wolfgang Marwan (University of Hertfordshire, UK), Gérard Pierron (Institut André Lwoff, Villejuif, France), Pauline Schaap (University of Dundee, UK), and Georg Golderer, Stefan Leitner, Ernst R. Werner, Gabriele Werner-Felmayer (University of Innsbruck, Austria). Since that time the email discussion list has been expanded significantly; please contact Jonatha Gott (see the membership list) if you would like to be added to the email list.

Strain to be used as the source of DNA for genomic sequencing:

The working group initially decided that DNA should be prepared from diploid nuclei of LU897 x LU898 plasmodia or from haploid nuclei of LU897 amoebae, with flow cytometry testing of the strains in two different laboratories (Villejuif and Innsbruck) to assess the uniformity of each DNA preparation. The initial choice of LU897 x LU898 was made, in part, because this strain was used for the construction of a normalized cDNA library that is being used to identify ESTs, which would make the gene annotation easier. However, upon further reflection and subsequent discussion via email, a consensus emerged from the working group that the axenic haploid amoebal strain LU352 should be used. Use of a haploid strain will simplify genome assembly, while use of an axenic strain will avoid problems caused by contaminating bacterial DNA. Gérard Pierron is currently growing LU352 obtained from Roger Anderson and will be carrying out flow cytometry on that strain soon. If the DNA content appears consistent, Gérard will isolate DNA from this strain for sequencing.

Information from Roger Anderson regarding LU352: LU352 was derived as a progeny of the cross CLd-AXE x LU213 and is largely isogenic with all of the Colonia-inbred LU strains. It is haploid and contains mutations in at least 2 genes that allow it to grow in liquid culture. The CLd-AXE parent was the result of prolonged axenic culture, so LU352 may contain additional mutations. LU352 has the genotype matA2 gadAh npfC5; the gad mutation promotes apogamic development, but the npfC5 mutation prevents it.

Physarum database:

In order to maximise the benefits of the Physarum genome project, the Physarum research community needs a fully functional database that is accessible to all. The construction and maintenance of such a database is a major undertaking, requiring a significant amount of effort and funding, and Pauline Schaap suggested partnering with the Dictyostelium group. Discussions with Rex Chisholm, one of the lead developers of the Dictyostelium database (http://dictybase.org), were extremely helpful and led to this suggested course of events:

1. Partner with the Dicty database, using their servers/hardware and "database schema". Rex suggests making cosmetic changes to the "look" of our site to clearly delineate the Physarum section of the database, but this is apparently trivial to do if we give them an idea of what we want.

2. Once the assembled sequence is available, have the sequencing center run an automated annotation to localize likely genes and send to the Dictybase group. Rex estimates that it would take them about one month to go from an empty database to an annotated genome coupled to known Genbank and PubMed entries and any available EST sequences!!! There will also be automatic links set up with Genbank and PubMed to link relevant new material to the Physarum site. This they could do largely without input or funding from us. At this point the site is fully functional, but it would be more accurate and useful if active curating kicks in, so

3. Hire a curator to compile data from available databases and the literature regarding ESTs, in situ hybridizations, phenotypes, available mutants, microarray data, etc. and information from the Physarum community for manual annotation of each gene. The curator is responsible for quality control and will be the contact person for the entire Physarum community. All data to be added to the site will be sent to the curator to provide uniformity and continuity. The Dictybase group have offered to train the curator we hire, most likely a post-doc level person with computer savvy and knowledge/understanding of the Physarum literature. We will have to come up with funding to pay our curator and part of a computer developer's salary (see below).

This isn't entirely altruistic on the part of the Dictybase group, as they would get some support for personnel and hardware updates, plus they become even more indispensable in terms of their own grant renewal. In return, we get a functional database and a means of disseminating the genome data, two key elements to getting downstream funding for Physarum genome mining. This appears to be a win-win situation.

Funding:

As the NIH contact person, Jonatha Gott would likely write a grant (with heavy input and letters of support from the Dictybase group and the Physarum community) asking NIH and/or NSF for roughly $100,000/year for salaries, fringe benefits, hardware updates, etc. to fund the Physarum database. Rex thought that we would have a very good shot at getting funding since the costs are VERY minimal relative to starting from scratch (i.e. buying the hardware, hiring the computer people to develop the software, etc.).

The initial working group also agreed that a world-wide and concerted effort to raise additional funding for various projects will be necessary in order to provide the background information for annotating the genes (such as sequencing EST libraries) as well as for retrieving functional information once the assembled sequence is available. Those of the Physarum community who are interested in participating in one of these “mining” projects should contact Jonatha Gott, who has agreed to co-ordinate these joint research efforts: Center for RNA Molecular Biology, Case Western Reserve University, School of Medicine, Cleveland, Ohio, 44106 USA.

Thoughts on resource sharing:

We are requesting that sequence data from the Physarum genome project be released to the public as soon as it is generated and hope that everyone in the Physarum community will freely share all of the resources that have been generated in their respective laboratories. Although a number of valuable resources exist in the Physarum community, including cDNA and genomic libraries, strain collections, plasmids, antibodies, etc., there is currently no single place that anyone interested in accessing these resources can go to find out what is available. The Physarum genome project would seem to provide the perfect context for organizing such a database. Would anyone be willing to undertake a survey of members of the Physarum community to find out what is available?