
Supporting Text: Detailed Methods

Sample collections. For the functional metagenomic library, we used a pooled sample of guts of 150 workers from a single colony (MD 374) from the USDA bee laboratory in Beltsville, Maryland. Another pooled sample of gut bacteria from 150 workers from a colony (AZ 176) from the USDA bee laboratory in Tucson, Arizona (AZ) was used for a separate metagenomic sequencing study (13), and the scaffolds from that assembly were used for sequence-based analyses in our study. In addition, we sampled individual adult workers representing colonies in these two locations and in five additional locations in the USA: West Haven, Connecticut (CT); Titusville, Florida (FL); Whidbey Island, Washington (WA); southern Utah (UT); and southeastern Arizona (D-AZ). CT and WA colonies were sampled during the first and second seasons following establishment using commercial bees. The UT samples were from established feral colonies. The D-AZ samples were from managed colonies in remote desert locations and were selected as representing colonies with known histories of no antibiotic treatment and no input of bees from treated colonies for more than 25 years. We also sampled one location each in Switzerland (SUI), the Czech Republic (CZ), and New Zealand (NZ) (Table S1). For each location, 5 individual bees were collected from each of 1-4 bee colonies for screening of tetracycline resistance genes. The 5 bumblebee females were captured at flowers in West Haven, CT and were identified as Bombus bimaculatus, Bombus impatiens and Bombus perplexus using a barcoding protocol based on mitochondrial DNA sequences (Hebert et al. 2004). Collected bee samples were either used immediately for preparing large-insert metagenomic libraries (MD 374) or stored in 95% ethanol until use. Several CT samples were used for isolation of strains. A summary of samples is given in Table S1.

Functional metagenomic library and screen. For construction of a fosmid clone library, 3.7 µg of metagenomic DNA was isolated from bacterial cells recovered from 150 MD bee guts. DNA preparation followed Martinson’s protocol with minor modifications (10). The resulting DNA was analyzed for quality, quantified, and checked to verify high representation of bacterial DNA. The library was then constructed using the CopyControl™ HTP Fosmid Library Production Kit with pCC2FOS™ Vector and Phage T1-Resistant EPI300™-T1R E. coli Plating Strain (Epicentre, Madison, WI, USA) according to the manufacturer’s instructions. The packaged library was titered by plating serial dilutions onto LB agar plates containing 12.5 µg/mL chloramphenicol to determine the number of fosmid clones. The total initial library was estimated to contain 0.8-1.0 × 10^5 independent inserts, with an average size of 34 kb, as determined by gel electrophoresis of a sample of inserts. The rest of the packaged fosmid-infected E. coli EPI300 strain was plated on LB agar containing 12.5 µg/mL chloramphenicol. To enable functional selections for resistance to multiple antibiotics, plates were cultured at 37 °C overnight; colonies were then scraped and collected into 10 mL LB medium with 12.5 µg/mL chloramphenicol and 15-20% glycerol and kept at -80 °C for subsequent screening.

To evaluate which bacterial species were sampled during library construction, 220 clones were selected randomly, and inserts were sequenced at both ends using Sanger sequencing services at Yale University. After trimming, 305 high-quality end sequences (mean 1254 bp, range 394-1401 bp) were retained from 162 clones. These end-reads were used as queries in blastx searches against the NCBI non-redundant protein database. Reads corresponding to bacterial genes with top hits to close relatives of species known from honey bee guts (9) were binned accordingly. Reads corresponding to transposases, plasmid-associated sequences, or antibiotic resistance genes were not used for species binning, since these highly mobile sequences are not reliable indicators of species. The species assignments were further confirmed by using end-sequences as queries in local blastn searches against a bee gut metagenomic database previously binned into species (13).
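The binning rule described above (discard end-reads whose top hit is a mobile element, then assign the rest by the taxon of the top hit) can be sketched as follows. This is only an illustration: the keyword lists, the bin labels, and the example hit descriptions are hypothetical, and simple keyword matching stands in for whatever curation was actually applied to the blastx results.

```python
# Hypothetical keywords marking hits that are unreliable indicators of species.
MOBILE_KEYWORDS = ("transposase", "plasmid", "resistance")


def bin_end_reads(top_hits, species_keywords):
    """Assign reads to species bins from the description of their top hit.

    top_hits: dict mapping read id -> top-hit description (free text).
    species_keywords: dict mapping bin label -> list of taxon keywords.
    Reads whose top hit looks like a mobile element are left unbinned.
    """
    bins = {}
    for read_id, description in top_hits.items():
        text = description.lower()
        if any(keyword in text for keyword in MOBILE_KEYWORDS):
            continue  # mobile sequences are not used for species binning
        for label, keywords in species_keywords.items():
            if any(keyword.lower() in text for keyword in keywords):
                bins.setdefault(label, []).append(read_id)
                break  # a read is placed in at most one bin
    return bins


# Hypothetical example input, not data from the study.
example_hits = {
    "read_1": "transposase [Neisseria meningitidis]",
    "read_2": "DNA gyrase subunit B [Neisseria sp.]",
    "read_3": "hypothetical protein [Lactobacillus sp.]",
}
example_keywords = {"bin_A": ["Neisseria"], "bin_B": ["Lactobacillus"]}
example_bins = bin_end_reads(example_hits, example_keywords)
```

In this toy input, the first read is skipped because its top hit is a transposase, even though the hit's source organism matches one of the bins.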

To screen for resistance to a panel of antibiotics, 100 µL of a 10^-5 serial dilution of 100 µL frozen library stock was plated on LB agar plates containing binary combinations of 12.5 µg/mL chloramphenicol and one of 7 different antibiotics, then incubated at 37 °C for 16 hours. The concentration of each antibiotic used for the screen was determined by a gradient dilution or from a reference: 50 µg/mL carbenicillin, 50 mg/mL ampicillin, 12.5 µg/mL tetracycline, 12.5 µg/mL oxytetracycline (Terramycin), 5 µg/mL ceftazidime, 10 µg/mL gentamycin, or 100 µg/mL rifampicin.

Numbers of resistant clones were counted and compared to total numbers of clones, to obtain frequencies of resistance among fosmid inserts. For example, at least 10^3 clones for tetracycline/oxytetracycline were obtained per plate of 10^6-10^7 total clones, giving a frequency of about 0.1%. For ampicillin/carbenicillin, ~2 × 10^2 clones per plate were observed. Approximate calculations of frequencies of sampled bacterial genomes displaying resistance were made, based on the assumption of one resistance determinant per genome and an average genome size of 3 megabases, implying that a single fosmid insert has an approximate 1% chance of including a chromosomally encoded resistance locus from a sampled genome.
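The arithmetic behind these approximations can be laid out as a short sketch. It uses the figures quoted in the text (about 10^3 resistant clones per 10^6 plated clones, a 34 kb average insert, a 3 Mb average genome); the final division, which converts a per-insert frequency into a per-genome frequency, is our reading of the stated assumption of one resistance determinant per genome, not a formula given in the text. Function names are illustrative.

```python
def resistance_frequency(resistant_clones, total_clones):
    """Fraction of plated fosmid clones that grow on the antibiotic."""
    return resistant_clones / total_clones


def insert_coverage_fraction(insert_bp, genome_bp):
    """Approximate chance that one random insert contains a given single-copy locus."""
    return insert_bp / genome_bp


# Values taken from the text; 10^6 is the lower end of the quoted plate density.
tet_frequency = resistance_frequency(1e3, 1e6)          # about 0.1% of inserts
locus_chance = insert_coverage_fraction(34_000, 3_000_000)  # about 1% per insert

# Assumption: one determinant per resistant genome, so the fraction of sampled
# genomes carrying it is roughly the insert frequency divided by the chance
# that an insert from such a genome happens to include the locus.
genome_fraction = tet_frequency / locus_chance

print(f"frequency among inserts: {tet_frequency:.2%}")
print(f"chance an insert covers the locus: {locus_chance:.2%}")
print(f"implied fraction of genomes with the locus: {genome_fraction:.1%}")
```

Using the upper end of the plate density (10^7 clones) would lower both the insert frequency and the implied genome fraction tenfold, so these values are order-of-magnitude estimates only.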

The taxonomic sources of fosmid inserts from presumptive tetracycline-resistant clones and ampicillin-resistant clones were identified by end-sequencing, as described above for randomly picked colonies. Clones in which one or both end-sequences were identical were considered as duplicates.
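A minimal sketch of the duplicate rule (clones sharing at least one identical end-sequence are treated as the same insert) is given below. It assumes exact string identity between trimmed reads, which is a simplification: real end-reads would first need trimming to a common window and possibly alignment. The clone names and sequences are hypothetical.

```python
from collections import defaultdict


def group_duplicate_clones(end_reads):
    """Group clones that share at least one identical end-sequence.

    end_reads: dict mapping clone name -> tuple of end-sequence strings.
    Returns a list of sets of clone names; each set is one duplicate group.
    """
    parent = {clone: clone for clone in end_reads}

    def find(clone):
        # Union-find lookup with path halving.
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    first_clone_with = {}
    for clone, sequences in end_reads.items():
        for sequence in sequences:
            key = sequence.upper()
            if key in first_clone_with:
                # Same end-sequence seen before: merge the two clones' groups.
                parent[find(clone)] = find(first_clone_with[key])
            else:
                first_clone_with[key] = clone

    groups = defaultdict(set)
    for clone in end_reads:
        groups[find(clone)].add(clone)
    return list(groups.values())


# Hypothetical clones: c1 and c2 share one end-sequence, c3 is unique.
example_reads = {
    "c1": ("ACGTACGT", "TTGACCAA"),
    "c2": ("ACGTACGT", "GGCCGGCC"),
    "c3": ("CCCCAAAA", "AAAATTTT"),
}
example_groups = group_duplicate_clones(example_reads)
```

Grouping is transitive in this sketch: if a third clone shared an end-sequence with c2 only, it would still join the group containing c1.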

Tetracycline resistance loci for all the selected resistance patterns except A3-11 were identified through diagnostic PCR as described below. An ampicillin resistance gene was also amplified using primers designed based on characterized ampicillin resistance genes.

Sequencing and assembly of fosmid inserts. Several strategies were used to sequence inserts for fosmids selected to represent different resistance loci and different bacterial sources (Table S2). First, the ampicillin and tetracycline double-resistant clone A3-16 (S. alvi, TetB/Amp) was sequenced using transposon mutagenesis, with the EZ-Tn5™ <KAN-2> Insertion Kit (Epicentre, Madison, WI, USA) and Sanger sequencing, followed by targeted PCR to close the three remaining gaps. A finished 38,823 bp circular fosmid containing a 30,642 bp insert was assembled (Table S3). Second, the fosmid insert from ampicillin and tetracycline double-resistant clone A3-15 (Beta, TetD/Amp) was partially sequenced using primer walking, to yield a 7,699 bp fosmid fragment containing a 5,169 bp core region and flanking regions on both ends (1,231 bp and 1,299 bp) (Table S3). Third, fosmid inserts from the six representative tetracycline-resistant clones T4, TA1, TA7, and T3_2 (S. alvi/G. apicola, TetC), T3_18 (Alpha, TetC), and T3_7 (Alpha, TetL) were fully sequenced as part of a multiplexed Illumina library on the paired-end module of the Genome Analyzer II following the Illumina protocols for 2 × 75 bp reads at the Yale Center for Genome Analysis. Following data filtering and assembly, full-length sequences of inserts (34,144 bp for T4, 35,043 bp for T3_2, 30,860 bp for TA1, 29,584 bp for T3_18, 42,578 bp for T3_7, and 36,335 bp for TA7) were obtained (Table S3).

Retrieval of scaffolds containing resistance genes from AZ bee gut microbial metagenomic sequence data. An assembled metagenomic dataset from DNA of the gut bacteria of honeybees sampled from one AZ colony was available (13). This dataset, for which scaffolds represented near-complete genomes for most of the typical species of the honeybee gut microbiota, was queried for resistance genes using sequences obtained from PCR products and from fosmids, as described above (Table S4).

Annotation and analysis of fosmid sequences and metagenomic sequence scaffolds. The fosmid insert/scaffold sequences were annotated using the online RAST prokaryotic genome annotation server (Aziz et al. 2008). The predicted ORFs and annotated genes were used as queries in blastx searches against the NCBI nr protein database. The sequences of top-scoring hits with annotated features were used as references to identify transfer elements in fosmid inserts/scaffolds with tetracycline resistance genes. Global and partial sequence comparisons at the amino acid or nucleotide level were carried out with blast, and the relationships between sequences were visualized using Mauve (Darling et al. 2008). Taxonomic assignments of each fosmid insert or scaffold (not including transferring units) were determined as described above.

Comparison and diversity analysis of transfer elements. To investigate diversity and distribution of transfer elements for tetB, tetC and tetL, primers were designed to amplify specific transposon regions (Table S8). The transposon Tn10 containing the tetB gene was identified by amplifying a gltS gene fragment (1105 bp) with the primers glts_F and glts_R. The fragments were amplified using the Expand Long Template PCR System (Roche Applied Science, Indianapolis, IN, USA). The presence of a Tn3 insert within Tn10 (~7 kb) was determined from the size of the PCR product. For tetC, two primer pairs were designed to distinguish T3_2-type and T3_18-type transfer elements. TetC_F and TetC3_2R_4338 were used to amplify a 1198 bp fragment for the transfer element identified in fosmid insert T3_2. TetC_F and TetC3_18F_429 were used to amplify a 1216 bp fragment for the transfer element in T3_18. The tetL gene was identified in diagnostic PCR analysis.

The reactions were performed using the conditions described above for full length ORFs of resistance genes and annealing temperatures as in Table S8. To determine types of mobility elements for tetB, tetC and tetL in bee samples from MD, AZ and WA (USA), and from Europe, DNA samples for 3 or 4 bees per colony were used as templates in PCR diagnostic screening.

Diagnostic PCR screening for tetracycline resistance determinants. To obtain DNA for screening resistance genes from gut microbiota of individual bees, whole guts from ventriculus to rectum were aseptically dissected from randomly selected workers for each colony. The dissected guts were placed in a sterile 1.5 mL pestle tube with 750 µL buffer AG (200 mM NaCl, 200 mM Tris, 20 mM EDTA, plus 6% SDS), homogenized by maceration with scissors, and then crushed with a disposable sterile pestle (Bel-Art Products). The homogenate was then added to a sterile bead tube containing 500 µL of phenol/chloroform/isoamyl pH 7.9 (Ambion) along with ~500 µL of 0.1 mm silica zirconia beads (BioSpec Products, Bartlesville, OK). The bead tubes were placed in a BioSpec high-speed bead beater, beaten at the maximum setting for 3 min, then spun at 11,000 RPM for 2 min. The resulting aqueous phase was extracted with a second 400 µL phenol/chloroform/isoamyl preparation in a Light Phase Lock Gel tube (5 Prime). The aqueous phase of this extraction was collected and combined with 1/10 volume sodium acetate pH 5.5 (American Bioanalytical) and an equal volume of isopropyl alcohol (American Bioanalytical). The samples were incubated at -20 °C overnight and then spun at 14,000 RPM for 30 min in a 4 °C microcentrifuge. The pellets were washed with 70% ethanol and dried for 5 min in an unheated vacuum evaporator. The pellets were resuspended in 100 µL TE (10 mM Tris pH 8 and 1 mM EDTA) and incubated for 30 min at 37 °C with 2 µL RNase A (Qiagen). These extracts were then further purified with a Qiagen QIAquick column and eluted in 30 µL Buffer EB (Qiagen).

To identify tetracycline resistance genes present in the bee gut microbiota, diagnostic PCR primers for detection of 21 tetracycline resistance determinants, including 12 tetracycline efflux pump genes, 8 ribosomal protection protein genes and 1 inactivating enzyme gene, were designed or obtained from published references (Table S8). These primers amplified partial sequences of these genes and were designed to be diagnostic of each gene category. Extracted genomic DNA from each bee gut was screened using these diagnostic PCR primers to determine the resistance gene content. For each colony, two bees were screened initially, and, if these were inconsistent, a third or fourth was screened. PCR screens were performed using protocols previously described (Aminov et al. 2001, 2002, 2004; Szczepanowski 2009); PCR assays included positive (bacterial 16S rDNA with primers 926f and 1492r) and negative (ddH2O) controls. Examples of screening results are shown in Fig. S2. For tetL, published primers were unsuccessful, and a new primer pair was designed based on our fosmid sequence containing tetL, which amplifies fragments of either 1.5 kb or 3 kb depending on whether the locus is chromosomal or plasmid-borne. All positive amplicons were purified using the QIAquick PCR purification kit (Qiagen Inc., Valencia, CA, USA) and were sent for sequencing at the DNA Analysis Facility of Yale University.

To confirm screening results for single bee DNA samples, 8 pooled DNA samples, each from guts of 150 workers from one MD bee colony, were used to detect tetracycline efflux genes using diagnostic PCR and sequencing as described above for individual bee samples (Fig. S1).

Quantitative PCR (qPCR) for tetracycline resistance determinants. Quantitative PCR assays were performed on samples consisting of DNA pooled from guts of 5 individual bees, except for CT samples, for which guts of individual bees were used. Diagnostic PCR primers previously mentioned (Table S8) were used for 6 identified tetracycline resistance loci, including genes encoding efflux pump proteins (tetC, tetD, tetH, and tetY) and genes encoding ribosomal protection proteins (tetM and tetW); a diagnostic primer pair for qPCR was redesigned for tetB. Reliable diagnostic qPCR primers for tetL were not identified; thus, tetL assays were not performed. PCR amplicons were cloned into plasmids and quantified, then used in quantitative PCR in dilution series to make a standard curve for each primer set, allowing estimation of absolute copy numbers within each sample. Quantitative PCR was performed with the LightCycler FastStart DNA MasterPLUS SYBR Green I Kit (Roche Applied Science, Indianapolis, IN, USA). Each assay was performed with a standard dilution of the plasmid prep (1x) and a negative control (ddH2O), in a touchdown program with an annealing temperature descending from 68 °C. In addition to quantitative analysis for tetracycline resistance determinants, the number of bacterial 16S rRNA copies was also quantified for each sample, to enable estimation of the number of tetracycline resistance gene copies relative to the number of bacterial genomes present in different samples.
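The standard-curve step can be illustrated with a generic sketch: a straight line is fitted to threshold cycle against log10 of the known plasmid copy number, and the fit is inverted to estimate copies in an unknown sample. This is the textbook form of absolute quantification, not the instrument software's exact procedure; the dilution series and cycle values below are hypothetical, and the function names are ours.

```python
import math


def fit_standard_curve(copies, cycles):
    """Least-squares fit of cycle = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cycles) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cycles))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cycle(cycle, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((cycle - intercept) / slope)


def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_per_16s(gene_copies, rrna_copies):
    """Resistance gene copies relative to 16S rRNA gene copies."""
    return gene_copies / rrna_copies


# Hypothetical tenfold dilution series of a cloned amplicon.
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
standard_cycles = [31.4, 28.1, 24.8, 21.5, 18.2]

slope, intercept = fit_standard_curve(standard_copies, standard_cycles)
estimated_copies = copies_from_cycle(24.8, slope, intercept)
efficiency = amplification_efficiency(slope)
```

A separate curve would be fitted for each primer set, including the 16S rRNA primers, before the ratio in `copies_per_16s` is taken.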

Susceptibility testing and identification of resistance genes in bacterial isolates. Several bacterial species typical of the bee gut microbiota, including Alpha1, Gilliamella apicola (previously called Gamma1), Gamma2, Snodgrassella alvi (previously called Beta), Bifido and Firm5, were recovered from a culture collection isolated from bee guts from a CT colony, using culturing methods described in Engel et al. (13). Bacterial DNA was extracted as described above, using a standard phenol/chloroform protocol. Tetracycline resistance genes in isolates were identified, and near-complete ORFs were amplified and sequenced using the protocols described below.

Isolates were screened for tetracycline resistance by plating on TSA blood agar (tryptic soy agar + 5% sheep’s blood) containing 12 μg/mL oxytetracycline. Minimum inhibitory concentrations (MICs) were determined via the Etest system (bioMérieux) from cells grown on Mueller-Hinton agar + 5% sheep’s blood (BD BBL) or their optimal culture media (Table S5).

Retrieval of full-length ORFs of resistance genes from multiple sample types. To confirm the identification and sequencing results from diagnostic PCR and to compare the tetracycline genes from the AZ metagenomic sequencing data, primers were designed for amplification of the near-complete ORFs of the eight identified tetracycline genes (tetB, tetC, tetD, tetH, tetL, tetM, tetW, tetY) (Table S8). These were used to amplify from representative fosmid clones corresponding to each gene family (tetC for TA2; tetB for A1, A2_5 and T3_24), from the MD metagenomic sample MD216, and from cultures of the major bacterial species isolated in culture from the bee gut microbiota (tetB for wkB1; tetC for wkB2, wkB4, and wkB5). The tetL ORF was also amplified from fosmid clone A3_7 (Table S3).

Phylogenetic analysis of tetracycline resistance genes. Phylogenetic analyses were performed for inferred amino acid sequences from tetracycline resistance genes retrieved from PCR amplification and sequencing for bacterial isolates from the bee gut microbiota (Table S5), from fosmid inserts and from the AZ metagenomic sequencing assembly. In addition, sequences of tetracycline efflux protein-branched major facilitator superfamily (MFS) genes that were identified in the metagenomic sequence analysis were included in phylogenetic analyses. The reference tetracycline resistance gene families were recovered from the NCBI and ARDB databases (Liu and Pop 2009). The amino acid sequences for each resistance gene were aligned in ClustalX2 (Larkin et al. 2007) with the default parameters and edited manually to remove regions of ambiguous alignment. Positions that contained gaps were retained except when a single base insertion was present only in a limited number of taxa. A phylogenetic tree was generated from the alignment profile, using the neighbor-joining algorithm of Saitou and Nei based on the principle of minimum evolution, along with bootstrap analysis (1000 replicates) in MEGA 5 (Tamura et al. 2011).

A more detailed phylogenetic analysis was performed with nucleotide sequences for the tetracycline resistance gene tetL, reported to confer resistance to the bee pathogen Paenibacillus larvae (25). The best-hit sequences from blastn with the tetL sequence from fosmid insert A3_7 were recovered from the NCBI nr database and aligned in ClustalX2. The tree was constructed using the GTR model along with bootstrap analysis (100 replicates) in the PhyML program (Guindon and Gascuel 2003).

Submission of sequences to GenBank. Nucleotide sequences from this study have the following GenBank accession numbers: JQ966977-JQ966984 for fosmid inserts; JQ966985-JQ966992 for full-length ORFs of genes from PCR amplicons; and JS807327-JS807645 for fosmid insert end-sequences.

Supplementary Detailed Results

As described in the main text, eight types of tetracycline loci were found in the bee gut microbiota. Detailed results and discussion of each follow.

tetB. For tetB, sequences identical to those found in many species in the database, including an E. coli plasmid and Neisseria meningitidis Q8, were obtained from a metagenomic scaffold and from amplicons derived from pooled guts, from isolates and from fosmids; these were assigned to both S. alvi and G. apicola and originated from AZ, CT, and MD (Fig. 3C).

Suggestive evidence of transfer between species of the bee gut microbiota, or recent transfer from the same source species, was found for tetB. Fosmid A3_16, representing a chromosomal fragment of S. alvi, exhibited resistance to both tetracycline and ampicillin; sequencing revealed a full-length tetB and the ampicillin resistance gene blaTEM-1. The tetB occurs on a Tn10 transposon, and blaTEM is associated with a Tn3 transposon inserted into gltS within the Tn10 transposon, near tetB (Fig. 3C). A unit containing tetB and Tn10 is found in a metagenomic scaffold (NODE_608118) that also encodes plasmid replication protein RepC and an intact transposon Tn10, suggesting derivation from a conjugative plasmid. This scaffold, which had uncertain taxonomic assignment, contains intact gltS with no Tn3 insert. Intact gltS with identical sequence was amplified from a G. apicola isolate (strain wkB1) and from USA and European samples with tetB; in contrast, the gltS fragment with the Tn3 insert could not be detected in most samples, suggesting that the chromosomal arrangement in fosmid A3_16 is rare. Of 7 tetracycline-resistant fosmids carrying tetB, 2 were chromosomal loci associated with Tn3-insert transposons, and 5 were associated with plasmid loci other than Tn3. No Tn3-insert Tn10 conjugative transposon of this type was retrieved in blastn searches of the NCBI database, suggesting that this is a new type of conjugative transposon.