Supplemental Text for Polytene Chromosomal Maps of 11 Drosophila species: The order of genomic scaffolds inferred from genetic and physical maps

Table of Contents

1. Supplemental Materials and Methods
2. Supplemental Results
3. Notes on Chimeric Assembly Scaffolds
4. File Formats for Supplemental Tables
5. List of Supplemental Tables
6. Supplemental Literature Cited

1. Supplemental Materials and Methods.

General Strategy for Mapping Scaffolds to Polytene Chromosomes. A variety of approaches were used to map genome scaffolds to polytene chromosomes. The simplest approach involved the four members of the melanogaster subgroup (D. simulans, D. sechellia, D. erecta and D. yakuba). Prior investigation of these species revealed that the banding pattern of the polytene chromosomes was sufficiently conserved that the inversion complexes that characterize differences in gene order among these species were apparent even when it was not possible to recover F1 hybrids (Ashburner and Lemeunier 1976; Lemeunier and Ashburner 1976; Lemeunier et al. 1978). Thus, a simple alignment of the D. melanogaster orthologs in the scaffolds of these species allows one not only to assign a majority of scaffolds unequivocally to a chromosome arm but also to order and orient the scaffolds within the arm. Interestingly, these alignments were largely congruent with the older, purely cytological observations, confirming the basic accuracy of the earlier observations (ibid).

For the more distantly related species, this was not possible: while the syntenic relationships of the Muller elements could be used to assign scaffolds to arms, the order and orientation of the scaffolds could not be determined as easily because of the number of overlapping inversions that have rearranged genes within the arms. In some cases, this difficulty was overcome by prior mapping of clones to the polytene chromosomes of the non-melanogaster group species. Examples are the mapping of P1 clones in D. virilis (Lozovskaya et al. 1993; Vieira et al. 1997) and the positions of transposon-induced mutations in D. ananassae (Matsubayashi et al. 1992). When this was not possible, for example in D. willistoni and D. mojavensis, probes were synthesized based on the sequence of the scaffolds and in situ hybridizations were performed. A unique approach was adopted for the Hawaiian species D. grimshawi. In this case, there were several prior localizations of individual genes, but these were done on Hawaiian species other than D. grimshawi (Davis et al. 1998). However, the phylogeny of the Drosophila species endemic to the Hawaiian chain is well established (see Powell 1997). Moreover, one of the data sets used to establish the phylogeny is the set of inversion polymorphisms within and between species that are associated with their evolution (Carson 1992; Carson et al. 1992). Using the various localizations in non-D. grimshawi species and the inversion genealogy from those species to D. grimshawi, it was possible to deduce the positions of the scaffolds in this species. These varied approaches have informed the organization of the major scaffolds on the chromosome maps of all 11 species, and these analyses of the assembled genome sequences have provided novel insights into possible underlying causes and mechanisms of chromosomal evolution in the genus.

D. melanogaster Species Group Chromosome Map Preparation. All of the analyses reported used the CAF1 assemblies of D. simulans, D. sechellia, D. erecta and D. yakuba and version 4.3 of the D. melanogaster assembled and annotated genome. The annotated versions of the four melanogaster group species were retrieved from FlyBase. These were derived from the community assemblies and annotations posted on the AAA web site. The orthology calls made by V. Iyer and M. Eisen, which were also posted at the AAA site, were also used in this analysis. The FlyBase inferred cytological map locations were assigned to all of the orthologs called in the four species. These associations were then sorted by scaffold assignment and molecular coordinate for each species. These simple alignments, based on a correlation of D. melanogaster cytology, scaffold linkage and gene order by molecular coordinate, proved remarkably congruent and allowed ready alignment of the major scaffolds to the polytene chromosome maps. An added benefit of the alignments was that the known inversion constellations that differentiate the four species from D. melanogaster were easily discerned and mapped to the sequence of the assembled scaffolds.
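The sorting step described above can be illustrated with a minimal sketch. It is not the procedure actually used: it assumes each ortholog is reduced to a hypothetical tuple of scaffold name, start coordinate on the scaffold, D. melanogaster arm, and a numeric D. melanogaster map position, and it infers orientation from whether the D. melanogaster positions mostly rise or fall along the scaffold. The function names are illustrative only.

```python
from collections import Counter, defaultdict


def summarize_scaffolds(orthologs):
    """Assign each scaffold an arm, an orientation and an anchor position.

    orthologs: iterable of (scaffold, start_bp, mel_arm, mel_position),
    a hypothetical input layout where mel_position is numeric.
    Returns {scaffold: (arm, orientation, lowest_mel_position)}.
    """
    by_scaffold = defaultdict(list)
    for scaffold, start, arm, mel_pos in orthologs:
        by_scaffold[scaffold].append((start, arm, mel_pos))

    summary = {}
    for scaffold, genes in by_scaffold.items():
        genes.sort()  # order genes by molecular coordinate on the scaffold
        # Majority arm among the scaffold's orthologs.
        major_arm = Counter(arm for _, arm, _ in genes).most_common(1)[0][0]
        positions = [pos for _, arm, pos in genes if arm == major_arm]
        # Count adjacent pairs whose melanogaster positions rise or fall.
        pairs = list(zip(positions, positions[1:]))
        ups = sum(b > a for a, b in pairs)
        downs = sum(b < a for a, b in pairs)
        orientation = "+" if ups >= downs else "-"
        summary[scaffold] = (major_arm, orientation, min(positions))
    return summary


def order_on_arm(summary, arm):
    """Scaffolds on one arm, sorted by their lowest melanogaster position."""
    hits = [(pos, name) for name, (a, _, pos) in summary.items() if a == arm]
    return [name for _, name in sorted(hits)]
```

In this sketch an inversion inside a scaffold would show up as a run of adjacent pairs whose direction disagrees with the scaffold's overall orientation; detecting such runs is left out for brevity.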

D. ananassae Chromosome Map Preparation. The stock of D. ananassae (AABBg1) was maintained at 23°C in a corn meal-yeast-glucose and agar medium. Well-fed larvae ready for pupation were dissected in a solution containing lactic acid: distilled water: acetic acid (1:2:3). Salivary glands were immediately transferred to the same solution and were squashed after 10 minutes for chromosome preparation. The photographic maps (Tobari et al. 1993) were used to anchor assembled scaffolds to the cytological map. The positions of the genetic and physical markers within the assembled scaffolds were obtained from the Synpipe output (Bhutkar et al. 2006). The cytological position of each molecular marker was determined by in situ hybridization to the polytene chromosomes with minor modification of the procedures described in Biemont et al. (2004). PCR-amplified DNA fragments were labeled with digoxigenin-11-dUTP (PCR DIG Labeling Mix, Roche) and used as probes for the hybridization. The linkage maps of morphological mutants constructed by Hinton (unpublished in 1991 in Tobari 1993) were used for the analysis.

D. pseudoobscura Species Group Chromosome Map Preparation. The stocks of D. pseudoobscura (MV2-25, 14011-0121.94) and D. persimilis (MSH3, 14011-0111.49) were maintained at 18°C in a corn meal molasses agar culture medium. Third instar larvae were collected and placed in Drosophila Ringer's solution for 5 minutes. Salivary glands were dissected from larvae and squashed according to the procedure developed by Harshman (1977). This technique helped to obtain chromosomes that tended to be linearized. A 700 gram weight was set on the coverslip to aid in flattening the chromosomes (Ballard and Bedo 1991). The chromosomes were viewed at 1,000x and digital images of linear chromosomes were collected for the six chromosomal arms.

Adobe Photoshop was used to build mosaic images of each of the six chromosomes. We minimized the number of different polytene chromosomes that were used to build the mosaic images, and in all cases we made sure that the different chromosomal sections blended in the mosaic were of similar scale. Section designations were available for each of the six chromosomes except for XR (Muller A•D). Sub-sections were assigned using the approach of Bridges (1935). We made every effort to begin each sub-section at an easily recognizable landmark such as a band or the boundary of a puff. New ideograms for the six chromosomes of the two species were drawn by tracing the original maps developed by Tan (1936; 1937) in Adobe Illustrator. These new ideograms are superior to the old reproductions because they are drawn as vector graphics and can be scaled without loss of resolution in web-based applications. No ideogram was available for Muller A•D, so the photomicrograph image of this chromosome was used to develop the representation for this map.

We used the locations of previously mapped genetic markers (Anderson 1993; Beckenbach 1981; Donald 1936; Kovacevic and Schaeffer 2000; Lancefield 1922; Levine and Levine 1955; Noor et al. 2000; Orr 1995; Ortiz-Barrientos et al. 2006; Prakash 1974; Sturtevant and Novitski 1941; Sturtevant and Tan 1937; Tan 1936; Yardley 1974) and physical markers (Aquadro et al. 1991; Babcock and Anderson 1996; Bondinas et al. 2002; Dobzhansky and Sturtevant 1938; Hamblin and Aquadro 1999; Machado et al. 2002; Moore and Taylor 1986; Papaceit et al. 2006; Schaeffer and Aquadro 1987; Schaeffer et al. 2003; Segarra and Aguade 1992; Segarra et al. 1996) to anchor the assembled scaffolds to the cytological map. The positions of the genetic and physical markers within the assembled scaffolds were obtained from the Synpipe output (Bhutkar et al. 2006).

D. willistoni Chromosome Map Preparation. The stocks of D. willistoni were maintained in 25°C incubators on the culture medium of Marques et al. (1966). Salivary gland cells of well-nourished third instar larvae were prepared using a modification of the technique of Ashburner (1967); the glands were fixed with acetic acid (45%) and stained with acetic orcein (2%).

The Gd-H4-1 strain of D. willistoni from Guadeloupe (16° 15’ N 61° 35’ W) was chosen for genomic sequencing from a set of six strains because it was chromosomally monomorphic. Gd-H4-1 contains the standard chromosomal order, except for two small fixed inversions in arm IIL and one inversion in XL.

The genetic maps of morphological mutant alleles (Spassky and Dobzhansky 1950) and allozyme variants (Lakovaara and Saura 1972) were used to anchor assembled scaffolds to the cytological map. These maps were developed using a strain of D. willistoni that was chromosomally monomorphic for standard gene arrangements. The method of Engels et al. (1986) was used for the in situ hybridization assays to physically map genes to the polytene chromosomes of D. willistoni. Probes were labeled with biotin-7-dATP by nick translation with the GIBCO BRL kit; the hybridizations were detected using BCIP, SAP and NBT, and chromosomes were stained with 0.1% lacto-aceto-orcein. The positions of the genetic and physical markers within the assembled scaffolds were obtained from the Synpipe output (Bhutkar et al. 2006).

To construct the new photomap, we used as the standard X-chromosome order those patterns of the XL and XR arms with the widest geographical distribution within all the populations analyzed (according to Rohde 2000). These standard gene arrangements (designated XL-A and XR-A) are fixed in an old laboratory population, WIP4, collected by A. Cordeiro and H. Winge in the 1960s in Ipitanga, Bahia State, northeastern Brazil, and have the closest phylogenetic relationship with the remaining arrangements. Microscopic analysis of X chromosomes in offspring from crosses between the WIP4 population and southern Brazilian wild populations was used to confirm the gene arrangement on the X chromosome. The X-chromosome pattern of the WIP4 population probably corresponds to the Standard arrangement from the Belém population described by Dobzhansky (1950).

Until recently, the status of the dot chromosome was unclear in D. willistoni (Sturtevant and Novitski 1941). Many authors considered that this small chromosome might be fused to another chromosome. Papaceit and Juan (1998) solved this mystery when they used probes for genes on the fourth chromosome of D. melanogaster and found that they hybridized to the most basal section of the third chromosome. Thus, the dot chromosome, or Muller element F, has apparently fused to the E element in D. willistoni.

D. virilis Chromosome Map Preparation. The sequenced strain of D. virilis contains visible mutations at loci on each of the large autosomal elements. The original strain appears to have been constructed at The Institute for Developmental Biology (Moscow, USSR), and it was subsequently placed in the species collection currently maintained at the Tucson Drosophila Stock Center. A derivative of this original strain was inbred for more than 14 generations by single pair sib-mating in the laboratories of Brian Charlesworth and Bryant McAllister, and it underwent two additional generations of sib-mating immediately prior to expansion for isolating genomic DNA for sequencing.

Physically mapped positions in the genome of D. virilis were identified from FlyBase records, published literature, and unpublished data. Cytological map positions determined by in situ hybridization are relative to the nomenclature of the standard photographic chromosome map (Gubenko and Evgen'ev 1984) and the corresponding graphic map (Kress 1993). Loci on the linkage map, where a clear relationship exists between the locus and a reference DNA sequence of either D. virilis or D. melanogaster, were also identified (Alexander 1976; Gubenko and Evgen'ev 1984). Reported positions of these markers within the physical and/or linkage maps, coupled with an associated DNA sequence, provided reference points to anchor the assembled genome sequence along the chromosomal arms. Most of the reported mapped positions within the genome of D. virilis were obtained through a series of analyses that used in situ hybridization to localize large-insert P1 clones along the chromosomes (Lozovskaya et al. 1993; Vieira et al. 1997). End sequences from 593 of these P1 clones that map to unique sites within the genome were generated to anchor the assembly onto the polytene chromosome map.

In cases where a reference sequence of D. virilis was available for the in situ localized probe, the position of the sequence in the CAF1 assembly was determined using local alignment. Large sequences were localized using MEGABLAST (e < 1E-50), and small microsatellite loci were localized using the best score obtained from BLASTN searches. Otherwise, the transcript of the putative ortholog of D. melanogaster was used as a query in a BLASTN search (e < 1E-10). The identification of each mapped position, its associated reference sequence, and its position in the CAF1 assembly is given in Supplemental Table 24.
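The thresholding described above can be sketched as a small filter over BLAST results. The sketch assumes the hits are available in the standard 12-column tab-delimited BLAST output (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score); the source does not say which output format was used, and the function name is hypothetical.

```python
def best_hits(tabular_lines, max_evalue):
    """Return the highest-scoring hit per query with e-value below max_evalue.

    tabular_lines: lines of 12-column tab-delimited BLAST output (assumed).
    Returns {query: {"scaffold", "start", "end", "strand", "evalue", "bitscore"}}.
    """
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        query, scaffold = fields[0], fields[1]
        sstart, send = int(fields[8]), int(fields[9])
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= max_evalue:
            continue  # fails the e-value cutoff
        if query not in best or bitscore > best[query]["bitscore"]:
            best[query] = {
                "scaffold": scaffold,
                "start": min(sstart, send),
                "end": max(sstart, send),
                # In nucleotide searches, subject start > end marks the minus strand.
                "strand": "+" if sstart <= send else "-",
                "evalue": evalue,
                "bitscore": bitscore,
            }
    return best
```

Calling `best_hits(lines, 1e-50)` would mirror the cutoff quoted for large sequences and `best_hits(lines, 1e-10)` the cutoff quoted for ortholog transcripts; the strand of the retained hit gives a first indication of scaffold orientation relative to the probe.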

The overall organization of the assembled genome sequence on the chromosomal arms of D. virilis was inferred by comparing the order and orientation of scaffolds indicated by the anchored positions in the physical and linkage maps with the scaffold joins identified from the Synpipe analysis of conserved syntenic blocks of orthologous genes. The position and orientation of most scaffolds were supported by both approaches, thus providing a high level of confidence in the inferred order and orientation of these scaffolds. A greater degree of uncertainty exists for the placement of the sequence in regions where only one of the approaches supports a particular arrangement, and this uncertainty is clearly demarcated.

A set of “orphan” scaffolds was identified as being present on particular chromosomal elements based on their content of putative orthologs. However, these scaffolds could not be placed within the polytene chromosome map with marker data. These orphan scaffolds potentially represent “islands” of assembled genome sequence localized within the pericentromeric heterochromatin of D. virilis. Heterochromatic regions of the Drosophila genome are known to be enriched with repetitive sequences (Smith et al. 2007). To determine the proportion of each orphan scaffold that is represented by repetitive elements, each sequence was analyzed with RepeatMasker to identify interspersed repeats using the “Drosophila fruit fly genus” repeat library. Three scaffolds of similar size (~1 Mb each) that mapped to medial positions of the X chromosome were used for comparison of element content.
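The repeat proportion of a scaffold can be computed from the annotated repeat intervals once overlapping annotations are merged so that no base is counted twice. The following is a minimal sketch under assumed inputs: repeat intervals are given as 1-based inclusive (start, end) pairs already extracted from the repeat annotation, and the function name is hypothetical. It does not parse any particular RepeatMasker output file.

```python
def repeat_fraction(intervals, scaffold_length):
    """Fraction of a scaffold covered by repeats.

    intervals: iterable of 1-based inclusive (start, end) pairs (assumed layout).
    Overlapping or directly adjacent intervals are merged before counting.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            # Close the previous merged interval, if any, and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlaps or abuts the current merged interval: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / scaffold_length
```

For example, intervals 1-100, 50-150 and 301-400 on a 1,000 bp scaffold merge into 1-150 and 301-400, covering 250 bp, so the function returns 0.25. Comparing this value between orphan scaffolds and the euchromatic comparison scaffolds is the kind of contrast the paragraph above describes.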

Physically mapped positions in the genome of D. virilis were localized within the assembled genome sequence and used to anchor the scaffolds to specific chromosomal positions. The orientation of these scaffolds was compared with the syntenic blocks identified through computational analysis to support scaffold joins based on both approaches, and to identify unique scaffold positions and orientations revealed by a single approach.

D. mojavensis Chromosome Map Preparation. The D. mojavensis genome stock (15081-1352.22) was maintained at 17°C. Third instar larvae were dissected in a 45% acetic acid solution and their salivary glands removed. Salivary glands were then placed in fixative (1 volume lactic acid, 2 volumes water, 3 volumes acetic acid) for 5 minutes. A siliconized cover slip was placed on the salivary glands and compressed. Slides were then stored at -80°C for 24 hours. After freezing, the cover slips were removed and slides were placed in a -80°C ethanol bath and allowed to come to room temperature. At this point slides were removed from the ethanol, air dried and stored at 4°C until hybridization.

Immediately prior to hybridization, slides were incubated for 30 minutes at 65°C in 2X SSC, pH 7, cooled to room temperature, denatured with 0.14 M sodium hydroxide in 2X SSC, washed multiple times in 2X SSC and dehydrated in successive washes of 70% and 95% ethanol, then air dried. Amplicons of 800-1100 bp in length were produced by standard PCR methods using appropriate primers (Supplemental Table 1). Amplified DNA was purified using Qiagen QIAquick PCR Purification kits (Qiagen Inc., Valencia, CA), then biotinylated using Roche Biotin-Nick Translation Mix (Roche Applied Science, Mannheim, Germany). Briefly, 500g of full-length amplicon was nick translated at 15°C for 30-40 minutes to produce biotinylated probes of 200-500 bp in length.

Probes were precipitated in ethanol and sodium acetate in the presence of denatured salmon sperm DNA, washed once in 70% ethanol and dried. Probes were reconstituted in a hybridization buffer consisting of 2X SSC, 10% dextran sulfate, 50% formamide, pH 7, denatured at 65°C, and hybridized to slides containing prepared chromosomes for 15-20 hours at 37°C. Following a brief wash in 1X PBS, slides were incubated in 50 mM Tris, pH 7.6 containing 4% BSA and a 1/100th volume of Vectastain (Vectastain Elite ABC, Vector Laboratories, Burlingame, CA) for 30 minutes at room temperature. Slides were washed 3 times with 1X PBS and developed in a solution of 50 mM Tris, pH 7.6 containing 0.15 mg/ml diaminobenzidine and 0.08% hydrogen peroxide.

The linkage map was determined from microsatellite markers identified by Ross and Markow (2006), Staten et al. (2004), and Reed et al. (unpublished). Linkage relationships were determined by genotyping 304 members of an F2 population at 25 informative microsatellite markers (Reed et al. unpublished). Mapmaker 3.0 from the Whitehead Institute was used to calculate the linkage map (Lander et al. 1987). We assigned markers to chromosome-based groups and then performed multipoint analysis on each chromosome group. The maximum likelihood marker order was calculated for each chromosome, and the Haldane mapping function was used to assign linkage distances between markers in centiMorgans (cM).
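For reference, the Haldane mapping function converts a recombination fraction r between two markers into a map distance d = -50 ln(1 - 2r) cM, under the assumption of no crossover interference. The short sketch below shows the conversion in both directions; the function names are illustrative and are not part of Mapmaker.

```python
import math


def haldane_cm(r):
    """Map distance in centiMorgans for recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


def haldane_r(d_cm):
    """Recombination fraction expected for a map distance of d_cm centiMorgans."""
    if d_cm < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-d_cm / 50.0))
```

For small r the distance is close to 100r cM (r = 0.01 gives about 1.01 cM), while larger fractions are stretched to account for undetected double crossovers (r = 0.10 gives about 11.16 cM).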

Markers were assigned to chromosomes using one or more of the following methods: 1) using BLAST (Altschul et al. 1994) of the flanking sequence against the D. melanogaster genome sequence to identify the Muller element, and thereby the D. mojavensis chromosome, to which the marker belonged (Staten et al. 2004); 2) using allozyme markers assigned to chromosomes by Zouros (1981; 1991) that differ between D. mojavensis and D. arizonae and looking for linkage between microsatellite and allozyme genotypes using non-recombinant male intermediaries; 3) looking for close linkage between assigned markers and unassigned markers. Chromosome assignments were then confirmed with BLAST against the D. mojavensis sequence.

D. grimshawi Chromosome Map Preparation. Polytene chromosome maps (Carson 1992) of the G1 line of Drosophila grimshawi served as the basis of our anchored scaffold figures. This strain was established in 1963 from an isofemale line and was recently sequenced (Drosophila 12 Genomes Consortium 2007). A total of 31 genes were localized via in situ hybridization to the polytene chromosomes of D. grimshawi, D. silvestris, D. nigribasis, or D. heteroneura using standard techniques (Edwards et al. 2001). Polytene band assignments (Supplemental Table 26) for non-homosequential species were made using the known inversion differences between species (see Results and Carson et al. 1992). BLAST (Altschul et al. 1997) was used to determine the genomic location of each gene localized by in situ hybridization: the D. grimshawi genome was searched with the D. melanogaster homolog of each localized gene. Genome locations were then equated with polytene band assignments to place scaffolds on Muller elements.
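The transfer of band assignments between non-homosequential species can be pictured as replaying the known inversions on an ordered list of bands. The sketch below is only an illustration of that idea under simplified assumptions: bands are represented by hypothetical labels, each inversion is given by the labels of its two outermost inverted bands, and inversions are applied in the order supplied. It is not the procedure used to produce Supplemental Table 26.

```python
def apply_inversion(order, left, right):
    """Return a new band order with the segment from left to right reversed.

    order: list of unique band labels; left and right are the outermost
    bands inside the inverted segment, given in either order.
    """
    i, j = order.index(left), order.index(right)
    if i > j:
        i, j = j, i
    return order[:i] + order[i:j + 1][::-1] + order[j + 1:]


def transfer_position(order, inversions, band):
    """Index of a band after applying a list of (left, right) inversions in turn."""
    for left, right in inversions:
        order = apply_inversion(order, left, right)
    return order.index(band)
```

For example, inverting the segment B to D in the order A B C D E F gives A D C B E F, so a gene localized to band B in the first species would be expected at the fourth position of the second. Applying the same inversion twice restores the original order.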