Supplementary Information

Rat History.

The rat remains a major pest contributing to famine (rodents eat about one-fifth of the world’s food supply annually), yet its contribution to human health cannot be overestimated, from new drugs (most are tested in rats), to understanding essential nutrients, to increasing knowledge of the pathobiology of human disease. Humans invest billions of dollars each year to exterminate rats, yet raising rats for research is a $1 billion per year industry. In many parts of the world the rat remains a source of meat. We loathe seeing them, but are fascinated by their feats: they survived nuclear testing in Engebi atoll, their bite exerts 24,000 lbs per square inch, a single pair could produce 15,000 descendants in a year, and an adult can squeeze through a hole the size of a quarter.

The laboratory rat (Rattus norvegicus) originates from central Asia and its success at conquering the world can be directly attributed to its relationship with humans1. The rat eats what we do and travels well with us. The opening of trade routes spread the various species of rat, and their scourge, around the world. Yet today’s plethora of rat clubs, breeders, newsletters, and commercial goods with logos of rats indicates that many people are fans of rats.

The history of the rat is obscured by confusing nomenclature. For many years no distinction was drawn between mice and rats. In Latin, the word for ‘rat’ is mus; modern Chinese similarly does not differentiate between the two. Samuel Johnson’s 18th century dictionary2 defines rat as “an animal of the mouse kind that infests houses and ships”. Clearly, rats and mice were considered virtually synonymous. Even the history of the laboratory rat is confusing. “Norways have norvegicus as their species name because, in his ‘Outline of the Natural History of Great Britain’ of 1769, J. Berkenhout used it in the first formal, Linnaean description of the species. The full, formal name of the species is therefore: Rattus norvegicus Berkenhout.” Berkenhout was mistaken in thinking the species came from Norway, but this does not invalidate the name. While the black rat (Rattus rattus) was part of the European landscape from at least the third century and is the species associated with the spread of the bubonic plague, Rattus norvegicus probably originated in northern China and migrated to Europe somewhere around the mid-1500s3. They may have entered Europe as hordes crossing the Volga River, a phenomenon observed by the naturalist Pallas in 1727. The species is a strong swimmer and exhibits horde migration when its density outstrips its food supply.

The rat in research.

The use of animal models for research into human disease is a recent innovation in the history of medicine4. The first recorded breeding colony for rats was established in 1856 (Hedrich chapter). Historically, rat genetics had a surprisingly early start. Hugo De Vries, Karl Correns and Erich Tschermak rediscovered Mendel’s laws at the turn of the twentieth century, and Bateson used these concepts in 1903 to demonstrate that rat coat color is a Mendelian trait. Inbred strains of rats are used for research in numerous areas (http://RGD.mcw.edu). Current rat research is prolific, with 28,049 manuscripts published in 2003 (search terms: rat NOT mouse AND 2003), exceeding the 27,059 publications using mouse in 2003 (search terms: mouse NOT rat AND 2003).

The Rat Genome Project

Prior to the decision to sequence the rat genome, there was much discussion about the value of having the rat genome sequence, as well as the utility of the rat as a major model organism. A major obstacle was the naïve belief that the rat and mouse were so similar morphologically, and so close evolutionarily, that it was redundant to sequence both rodents. Nevertheless, wisdom prevailed, affording the first three-way whole-genome comparison as well as the first whole-genome comparison of near evolutionary neighbors.

The Rat Genome Sequencing Project Consortium (RGSPC) was formed in response to this commitment of resources. A network of centers took responsibility for data and resource generation, led by the Baylor College of Medicine Human Genome Sequencing Center (BCM-HGSC) and including Celera Genomics, Genome Therapeutics Corporation, British Columbia Cancer Agency Genome Sciences Centre, The Institute for Genomic Research, University of Utah, Medical College of Wisconsin, The Children’s Hospital of Oakland Research Institute, and Max Delbrück Center for Molecular Medicine (Berlin). After assembly of the genome at the BCM-HGSC, analysis was performed by an international team drawn from over 20 groups in 6 countries and relying largely on gene and protein predictions produced at Ensembl. In this paper and the companion papers, investigators have taken an initial slice through the analysis of the rat genome sequence, both by itself and in context with the mouse and human genome sequences. Although the rat is not a member of the Security Council of Model Genetic Organisms (Fink ref), this publication of the draft sequence and the companion papers (Genome Research volume) show the value of this organism.

Overview of sequencing strategy

The goal of the RGSP was to produce a draft sequence of the rat genome without the intention of moving to a final higher quality ‘finished’ sequence5. The quality of the draft rat sequence was thus more critical than in the comparable human and mouse genome projects, where errors were ultimately corrected in a finished sequence. Despite the considerable progress in assembling draft sequences of large genomes6-14, the question of which method produced the highest quality draft sequence was unresolved. The most significant issue was the choice between logistically simpler whole genome shotgun (WGS) approaches and more complex approaches employing BAC clones. At the extreme is the BAC-by-BAC approach, as used for the NIH Human Genome Project6, which requires individual sequencing of a set of BACs comprising a tiling path covering the whole genome.
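The notion of a tiling path can be made concrete with a small sketch. The Python below is an illustration only, under stated assumptions: each BAC is treated as a half-open interval with known genome coordinates, and a greedy pass picks a small set of overlapping clones that covers the whole target. The clone names, the coordinates, and the function `greedy_tiling_path` are hypothetical and are not taken from any of the projects cited here; in practice clone positions are not known in advance and must be inferred from mapping data.

```python
def greedy_tiling_path(clones, genome_length):
    """Pick a small set of clones whose union covers [0, genome_length).

    `clones` is a list of (name, start, end) tuples with half-open
    coordinates. Raises ValueError if the clones leave a gap.
    """
    ordered = sorted(clones, key=lambda clone: clone[1])
    path = []
    covered = 0  # everything in [0, covered) is already tiled
    i = 0
    while covered < genome_length:
        best = None
        # Among clones that start inside the covered region,
        # keep the one that reaches furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"no clone covers position {covered}")
        path.append(best[0])
        covered = best[2]
    return path


# Hypothetical clone coordinates (arbitrary units), for illustration only.
clones = [
    ("A", 0, 150), ("B", 100, 260), ("C", 120, 300),
    ("D", 250, 420), ("E", 290, 500), ("F", 400, 500),
]
path = greedy_tiling_path(clones, 500)
print(path)  # ['A', 'C', 'E']
```

Here clones B, D and F are redundant: each is overlapped by a clone that extends further, so only three of the six clones would need to be sequenced individually.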

The principal challenge in assembling large genomes is correctly dealing with repeated sequences, which comprise up to half of the genome. In pure WGS assembly this is confounded by having to correctly align reads from all over the genome, whereas in a BAC-based assembly the problem is confined to dealing with only those reads in the region covered by a BAC. Thus BACs reduce the assembly problem to a local one, simplifying the repeat problem. Although the point is still debated15-18, the sense from the human genome project was that there was considerable benefit from either sequencing individual BAC clones or including BAC clones in a mixed assembly with WGS sequences. Although the draft mouse genome sequence was produced by a pure WGS approach7, the project planned full use of BAC clones in constructing the final finished sequence. The loss of segmental duplication regions due to ‘collapses’ in the draft mouse genome assembly7,19-21 suggested serious limitations on the quality of a draft sequence based only on WGS sequences, and this type of defect could not be tolerated in a draft sequence that would not be taken to the higher finished grade.
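A toy example may help show why confining reads to a BAC simplifies the repeat problem. The sketch below is not an assembler: it assumes error-free reads and exact string matching, and the sequences, the repeat, and the function `find_placements` are invented for illustration. A read lying wholly inside a repeat that occurs twice in the toy genome has two equally good placements genome-wide, but only one within a BAC-sized window that contains a single copy.

```python
def find_placements(read, reference):
    """Return every start offset at which `read` matches `reference` exactly."""
    hits = []
    start = reference.find(read)
    while start != -1:
        hits.append(start)
        start = reference.find(read, start + 1)
    return hits


# Hypothetical toy genome: the same 12-base repeat sits at two distant loci.
REPEAT = "ACGTACGTTAGC"
genome = "GGATCCTTGA" + REPEAT + "CCATGGAATTCAGGCT" + REPEAT + "TTGACCAGTA"

# A read that falls entirely inside the repeat.
read = REPEAT[2:10]

# A BAC-sized window that contains only the first copy of the repeat.
bac = genome[0:30]

genome_hits = find_placements(read, genome)
bac_hits = find_placements(read, bac)

print(genome_hits)  # [12, 40]: ambiguous in a genome-wide comparison
print(bac_hits)     # [12]: unambiguous within the BAC
```

If an assembler merges reads from both copies, the two loci are stacked into one, which is the kind of ‘collapse’ described above.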

Mammalian X chromosome evolution

The assignment of the accelerated activity to the rodent branch, following the primate-rodent divergence, is consistent with previous studies at significantly lower resolution, which showed complete conservation of marker order between the X chromosomes of human and cat22, human and dog23, and human and lemur24, as well as similar karyotypes of the X chromosomes in human, chimpanzee, gorilla, and orangutan25 (the karyotypes have deletions in telomeres, but no rearrangements). Other studies showed only small rearrangements between the X chromosomes of human and pig26, and human and horse27. All of these species except the primates serve as evolutionary outgroups28 to human, mouse, and rat, and all the primates29 have a consistent order in the X chromosome, thus suggesting, independently of the current report, that the marker order on the human X chromosome is that of the primate-rodent ancestor. Indeed, Lahn and Page30 showed evidence that the marker order on the human X chromosome has not changed since the ancestor 240-320 million years ago.
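The outgroup reasoning can be illustrated with a simple breakpoint count. The sketch below assumes each chromosome is reduced to an ordered list of shared markers and counts adjacent marker pairs in one order that are no longer adjacent in the other. The marker names, the three orders, and the function `count_breakpoints` are hypothetical and are not data from the cited studies; they only show the shape of the argument that an order shared with outgroups is most simply explained as ancestral, with the changes placed on the lineage that differs.

```python
def count_breakpoints(reference_order, other_order):
    """Count adjacent pairs in `reference_order` that are not adjacent in `other_order`.

    Orientation is ignored, so reversing a whole chromosome gives zero.
    Both lists must contain the same markers.
    """
    position = {marker: i for i, marker in enumerate(other_order)}
    breaks = 0
    for left, right in zip(reference_order, reference_order[1:]):
        if abs(position[left] - position[right]) != 1:
            breaks += 1
    return breaks


# Hypothetical marker orders, for illustration only.
human = ["m1", "m2", "m3", "m4", "m5", "m6"]
outgroup = ["m1", "m2", "m3", "m4", "m5", "m6"]     # same order as human
rearranged = ["m1", "m4", "m3", "m2", "m5", "m6"]   # m2..m4 inverted

print(count_breakpoints(human, outgroup))    # 0
print(count_breakpoints(human, rearranged))  # 2
```

A single inversion produces two breakpoints, one at each end of the inverted segment.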

Rat single nucleotide polymorphisms

The rat cSNP pilot is based upon sequences generated from randomly chosen clones from cDNA libraries from three different rat strains: stroke-prone spontaneously hypertensive rat (SHRsp), Wistar-Kyoto (WKY), and Sprague-Dawley (SD). These data so far show that the average density of cDNA-derived SNPs between the Brown Norway (BN) reference sequence and each of the three strains is approximately 1 SNP/1,100 bp31. To date, over 10,000 unique SNPs have been identified and will be publicly available from the Ensembl web site. This collection is expected to grow over the coming year to include cSNPs in the majority of rat genes.

The value of this dataset is illustrated in an ongoing study that is searching for genes involved in blood pressure regulation in the SHRsp.127,32,33, an important model for identifying genetic factors responsible for predisposition to cardiovascular disease. We screened all transcripts containing non-synonymous changes to identify potential candidate genes for blood pressure regulation. Among these was a variant (R300H) in the transcript for kallistatin, a protease inhibitor that acts as a potent arterial vasodilator via endothelium-independent mechanisms34. Subsequent genetic analysis showed a strong correlation (p<0.0002) with diastolic blood pressure in response to dietary sodium loading. This observation was further linked to biochemical observations of reduced binding of kallistatin to its target protease kallikrein35. This variant therefore likely represents the underlying cause of this diminished protein-protein interaction, and may thus be causally related to enhanced sodium sensitivity and elevated blood pressure in genetically hypertensive rats.

Further studies of kallistatin through transgenic approaches36,37 are underway. In the meantime these data demonstrate the general power of a comprehensive genetic discovery approach that couples a genomic reference sequence and subsequent sequence-based discovery of genetic variation to precise phenotyping of an important disorder. The analysis of other genetic determinants for blood pressure control in rats is an ongoing effort.

1. Robinson, R. Genetics of the Norway Rat (Pergamon Press, Oxford, 1965).

2. Johnson, S. Dictionary of the English Language (1755).

3. Barnett, S. A. The story of Rats. Their impact on us, and our impact on them (Allen and Unwin, Crows Nest, Australia, 2002).

4. Rust, J. H. Animal models for human diseases. Perspect Biol Med 25, 662-72 (1982).

5. National Institutes of Health. (http://grants2.nih.gov/grants/guide/rfa-files/RFA-HG-00-002.html, 2000).

6. Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860-921 (2001).

7. Waterston, R. H. et al. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520-62 (2002).

8. Aparicio, S. et al. Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science 297, 1301-10 (2002).

9. Yu, J. et al. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296, 79-92 (2002).

10. Goff, S. A. et al. A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296, 92-100 (2002).

11. Adams, M. D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185-95 (2000).

12. Dehal, P. et al. The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science 298, 2157-67 (2002).

13. Myers, E. W. et al. A whole-genome assembly of Drosophila. Science 287, 2196-204 (2000).

14. Holt, R. A. et al. The genome sequence of the malaria mosquito Anopheles gambiae. Science 298, 129-49 (2002).

15. Myers, E. W., Sutton, G. G., Smith, H. O., Adams, M. D. & Venter, J. C. On the sequencing and assembly of the human genome. Proc Natl Acad Sci U S A 99, 4145-6 (2002).

16. Waterston, R. H., Lander, E. S. & Sulston, J. E. On the sequencing of the human genome. Proc Natl Acad Sci U S A 99, 3712-6 (2002).

17. Waterston, R. H., Lander, E. S. & Sulston, J. E. More on the sequencing of the human genome. Proc Natl Acad Sci U S A 100, 3022-4; author reply 3025-6 (2003).

18. Green, P. Whole-genome disassembly. Proc Natl Acad Sci U S A 99, 4143-4 (2002).

19. Cheung, J. et al. Recent segmental and gene duplications in the mouse genome. Genome Biol 4, R47 (2003).

20. Eichler, E. E. Masquerading repeats: paralogous pitfalls of the human genome. Genome Res 8, 758-62 (1998).

21. Eichler, E. E. Segmental duplications: what's missing, misassigned, and misassembled- and should we care? Genome Res 11, 653-6 (2001).

22. Murphy, W. J., Sun, S., Chen, Z. Q., Pecon-Slattery, J. & O'Brien, S. J. Extensive conservation of sex chromosome organization between cat and human revealed by parallel radiation hybrid mapping. Genome Res 9, 1223-30 (1999).

23. Kirkness, E. F. et al. The dog genome: survey sequencing and comparative analysis. Science 301, 1898-903 (2003).

24. Ventura, M., Archidiacono, N. & Rocchi, M. Centromere emergence in evolution. Genome Res 11, 595-9 (2001).

25. Yunis, J. J. & Prakash, O. The origin of man: a chromosomal pictorial legacy. Science 215, 1525-30 (1982).

26. McCoard, S. A. et al. An integrated comparative map of the porcine X chromosome. Anim Genet 33, 178-85 (2002).

27. Raudsepp, T. et al. Conservation of gene order between horse and human X chromosomes as evidenced through radiation hybrid mapping. Genomics 79, 451-7 (2002).

28. Springer, M. S., Murphy, W. J., Eizirik, E. & O'Brien, S. J. Placental mammal diversification and the Cretaceous-Tertiary boundary. Proc Natl Acad Sci U S A 100, 1056-61 (2003).

29. Thomas, J. W. et al. Comparative analyses of multi-species sequences from targeted genomic regions. Nature 424, 788-93 (2003).

30. Lahn, B. T. & Page, D. C. Four evolutionary strata on the human X chromosome. Science 286, 964-7 (1999).

31. Zimdahl, H. et al. A SNP map of the rat genome generated from transcribed sequences. Science 303, 807 (2004).

32. Hilbert, P. et al. Chromosomal mapping of two genetic loci associated with blood-pressure regulation in hereditary hypertensive rats. Nature 353, 521-9 (1991).

33. Jacob, H. J. et al. Genetic mapping of a gene causing hypertension in the stroke-prone spontaneously hypertensive rat. Cell 67, 213-24 (1991).

34. Chao, J. et al. Kallistatin is a potent new vasodilator. J Clin Invest 100, 11-7 (1997).

35. Chao, J. et al. Tissue kallikrein-binding protein is a serpin. I. Purification, characterization, and distribution in normotensive and spontaneously hypertensive rats. J Biol Chem 265, 16394-401 (1990).

36. Chen, L. M., Ma, J., Liang, Y. M., Chao, L. & Chao, J. Tissue kallikrein-binding protein reduces blood pressure in transgenic mice. J Biol Chem 271, 27590-4 (1996).

37. Chen, L. M., Chao, L. & Chao, J. Adenovirus-mediated delivery of human kallistatin gene reduces blood pressure of spontaneously hypertensive rats. Hum Gene Ther 8, 341-7 (1997).