Electronic Supplementary Material 1

Material and methods

DNA amplifications

PCR amplifications were conducted in 25 µl reaction mixtures containing 1X enzyme buffer (Qiagen®; containing 1.5 mM MgCl2), 0.6 units of Taq polymerase, 17.5 pmol of each primer, 25 nM of each dNTP and 4 µl of DNA extract. After an initial denaturation step at 94 °C for 3 min, samples were subjected to 35 cycles of 30 s at 94 °C, 1 min at between 50 °C and 56 °C, depending on the fragment amplified, and 1 min at 72 °C. PCR products were then sent to a sequencing service (Macrogen, South Korea). The amplification primers were also used as sequencing primers. Sequences were cleaned and assembled using Seqscape v2.5 software (Applied Biosystems).

Obtaining ultrametric trees for the species delimitation method: Multidivtime procedure

Parameters of the substitution model used by Estbranches (F84 + Γ) were estimated with the baseml program of the PAML package (Yang 1997) for each locus separately. The output from baseml was then used for the first step of the multidistribute package: paml2modelinf was run to convert these outputs into data usable by Estbranches. Estbranches produces ML estimates of branch lengths within the optimal tree topology estimated from the combined data, together with a variance-covariance matrix for each locus. These output files were then used in Multidivtime to estimate divergence times. We used the default settings for the Markov chain Monte Carlo analyses (100000 cycles in which the Markov chain was sampled 10000 times every 100th cycle following burn-in).

Although several aphid fossils have been described (Heie 1967; Heie 2004), none of them is recent enough to calibrate the Brachycaudus phylogenetic tree. There are obviously no fossils for Buchnera. As our aim was simply to obtain ultrametric trees for the species delimitation analysis, we arbitrarily assigned prior ages of 1.0 (SD = 1) to both lineages (see Hughes et al. 2007 for a similar approach). Following the recommendation of the manual, rtrate (the mean of the rate of molecular evolution at the ingroup root node) was estimated by calculating the median of the branch lengths from the root to the ingroup tips.
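The rtrate calculation described above can be sketched as follows. This is only an illustration and not part of the multidistribute package: the nested-list tree encoding, the function names and the branch lengths are hypothetical. Dividing the median by the prior root age is an assumption; with the prior age of 1.0 used here, the result equals the median root-to-tip length.

```python
from statistics import median

# Hypothetical tree encoding: a node is either a tip name (str) or a list of
# (child, branch_length) pairs. The branch lengths below are made-up values.
def root_to_tip_lengths(node, depth=0.0):
    """Return the summed branch length from the root to every tip."""
    if isinstance(node, str):  # reached a tip
        return [depth]
    lengths = []
    for child, branch_length in node:
        lengths.extend(root_to_tip_lengths(child, depth + branch_length))
    return lengths

def rtrate_prior(ingroup_root, root_age=1.0):
    """Median root-to-tip length divided by the prior age of the ingroup root
    (the division is an assumption; it has no effect when root_age is 1.0)."""
    return median(root_to_tip_lengths(ingroup_root)) / root_age

example_tree = [
    ([("tipA", 0.10), ("tipB", 0.14)], 0.05),  # internal node on a 0.05 branch
    ("tipC", 0.21),
]
rate = rtrate_prior(example_tree, root_age=1.0)
```

In this toy tree the three root-to-tip lengths are 0.15, 0.19 and 0.21, so the median is taken from the path leading to tipB.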

Buchnera phylogenetic reconstruction

The results of the maximum parsimony (MP) analysis were used to determine the most suitable evolutionary model for the maximum likelihood (ML) analysis and Bayesian inference (BI).

We first performed MP analyses with PAUP* v. 4.0b10 (Swofford 2003), on the combined DNA dataset. We conducted heuristic searches with the tree bisection–reconnection branch swapping algorithm, 500 random addition sequences and a Maxtrees value of 10000. Gaps were treated as missing data. Character congruence between the three DNA partitions was then tested using the incongruence length difference test (ILD; Farris et al. 1995), by performing 500 replicate MP searches on the randomly partitioned dataset with all invariant characters excluded (Cunningham 1997).

For ML reconstructions, the model of nucleotide substitution was selected in Modeltest v. 3.7 (Posada & Crandall 1998). The MP tree with the highest likelihood (ln L) score was used to estimate the model parameters (gamma shape, base frequencies and substitution matrix). An ML heuristic search, using a starting tree obtained by MP, was then conducted in PhyML (Guindon & Gascuel 2003), using the selected model.

For both MP and ML analyses, node support was assessed with the bootstrap technique, using 500 replicates.

Bayesian phylogenetic analyses were conducted in MrBayes v. 3.1.2 (Ronquist & Huelsenbeck 2003). Different partition schemes were compared to optimize the fit of evolutionary models to the sequence data (Nylander et al. 2004; see Table S3 of the electronic supplementary material). We used the GTR+I+G model, which was identified as the best-fit model for all DNA fragments. The parameters of the model were treated as unknown variables with uniform prior probabilities and were estimated during the analysis; they were allowed to vary across partitions. Two replicate analyses were run for three million generations. We ran one cold chain and three heated chains of the Markov chain Monte Carlo simulation, using a random starting point and sampling trees every 100 generations. The point of stationarity was determined as the point at which the distribution of likelihoods reached a plateau, and trees preceding this point (2000–3000 trees, depending on the DNA partition) were discarded. The remaining trees were used to generate 50 per cent majority rule consensus trees. Posterior probabilities (pp) were summarized accordingly.
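The burn-in and consensus step can be illustrated with a minimal sketch. It is not the MrBayes implementation: each sampled tree is reduced to a hypothetical set of clades (frozensets of tip names), and the toy sample and function names are assumptions made for illustration. The posterior probability of a clade is taken as its frequency among the trees retained after burn-in.

```python
from collections import Counter

def clade_posteriors(sampled_trees, burnin):
    """Discard the first `burnin` sampled trees, then return the frequency of
    each clade among the remaining trees."""
    kept = sampled_trees[burnin:]
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()}

def majority_rule(posteriors, threshold=0.5):
    """Keep the clades found in more than `threshold` of the retained trees."""
    return {clade: pp for clade, pp in posteriors.items() if pp > threshold}

# Toy sample: each tree is a set of clades, each clade a frozenset of tips.
AB, CD, AC = frozenset("AB"), frozenset("CD"), frozenset("AC")
sample = [{AC}, {AC}, {AB, CD}, {AB, CD}, {AB}, {AC}]

pp = clade_posteriors(sample, burnin=2)  # the first two trees are discarded
consensus = majority_rule(pp)            # clades with pp > 0.5
```

With this toy sample, only the clade grouping A and B is retained in the 50 per cent majority rule consensus.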

(i) Reconciliation analyses (Page 1994)

This topology-based method, implemented in TreeMap v. 1 and TreeMap v. 2.02b, aims to identify optimal reconstructions of the history of a host–parasite association by mapping the parasite tree onto the host tree and maximizing cospeciation events. Heuristic searches are generally used to find optimal solutions in TreeMap v. 1, whereas TreeMap v. 2.02b uses the Jungle algorithm (Charleston 1998). This algorithm explores all possible mappings of one tree onto another, assigning different costs to diversification events (cospeciation, host switching, lineage sorting and duplication), and finds optimal (i.e. minimal cost) solutions. We used the default cost settings for analyses. The probability of obtaining the observed number of cospeciation events is then estimated by randomizing the parasite trees and generating a null distribution of the number of cospeciation events.

(ii) ParaFit (Legendre et al. 2002)

This distance-based method tests the null hypothesis that the diversification of hosts and parasites has been independent, using distance matrices rather than tree topologies. The null hypothesis is tested by permuting a host–parasite association matrix. Each individual host–parasite association can also be tested. ParaFit tests were carried out with ML trees, using Copycat (Meier-Kolthoff et al. 2007). Tests of random association were performed with 9999 permutations.

(iii) Likelihood ratio tests

This method tests the null hypothesis that the likelihoods of host and symbiont datasets do not differ significantly under the same model (including tree topology). If the null hypothesis is rejected, it is assumed that diversification events, such as host switching in the symbiont, caused the observed incongruence.

We first used the Shimodaira–Hasegawa (SH) test, as described by Peek et al. (1998) and Clark et al. (2000), to compare the likelihood score of the best ML topology for the Buchnera combined dataset with the score for the best ML topology obtained with the Brachycaudus dataset. Similarly, the score of the best host tree was compared with that of the alternative Buchnera tree based on the aphid dataset. The trees and datasets compared excluded specimens for which sequences were not obtained for all aphid and all Buchnera DNA fragments, and a single outgroup sequence was kept. The SH tests were conducted in PAUP* v. 4.0b10 with resampling of estimated log-likelihoods (RELL) optimization and 10000 bootstrap replicates. We optimized the model parameters for each dataset constrained to each alternative tree.

We then used the likelihood ratio test (LRT) proposed by Huelsenbeck & Bull (1996) to test for heterogeneity of trees obtained with different data partitions, to assess the conflict between Buchnera loci and the combined Brachycaudus dataset (see Huelsenbeck et al. 1997; Clark et al. 2000; Hughes et al. 2007 for applications of the LRT to cospeciation studies). Again, the best topology for a given dataset was compared with the alternative topologies obtained with other datasets. We used the SH test to conduct pairwise comparisons between scores for alternative Buchnera topologies obtained with alternative loci and the combined Brachycaudus dataset. We then calculated the statistic Δ = 2(ln L1 − ln L0), which measures the likelihood difference between each dataset being allowed to have a different topology (L1) and all datasets being constrained to have the same topology (L0). Under the null hypothesis of a common topology underlying all datasets, the topology chosen to establish L0 is that with the highest summed likelihood across datasets. As the tested hypotheses were not nested, the significance of Δ was assessed by generating a distribution of Δ under the null hypothesis that datasets have the same topology. Likelihood parameters and branch lengths for each Buchnera locus and the Brachycaudus combined dataset were optimized under the assumption of shared topology (that with the highest summed likelihood across datasets). One hundred sequence datasets were simulated for each Buchnera locus and the aphid combined dataset using SEQGEN v. 1.3.2 (Rambaut & Grassly 1997) with the graphical interface SG Runner v. 2.0 (T. P. Wilcox), with these new parameter estimates, the length and nucleotide composition of the original dataset and the constrained topology and branch lengths. The statistic Δ was calculated for each of the 100 simulated datasets.
We also examined the contribution of individual Buchnera loci to the heterogeneity of the observed dataset, by excluding individual genes from the calculation of Δ.
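The calculation of Δ and of its significance from simulated datasets can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure as run in PAUP* or SEQGEN: the log-likelihood values, dataset names, topology labels and function names are hypothetical, and the candidate shared topologies are limited to those supplied in the table of scores.

```python
def lrt_delta(lnl):
    """lnl[dataset][topology] is the log-likelihood of a dataset on a topology.
    Returns (delta, shared_topology)."""
    topologies = next(iter(lnl.values())).keys()
    # ln L1: every dataset is allowed its own best topology.
    ln_l1 = sum(max(scores.values()) for scores in lnl.values())
    # ln L0: one shared topology, the one with the highest summed likelihood.
    shared = max(topologies, key=lambda t: sum(s[t] for s in lnl.values()))
    ln_l0 = sum(scores[shared] for scores in lnl.values())
    return 2.0 * (ln_l1 - ln_l0), shared

def simulated_p_value(observed, simulated):
    """Fraction of simulated deltas at least as large as the observed one."""
    return sum(d >= observed for d in simulated) / len(simulated)

# Made-up log-likelihoods for two symbiont loci and the host dataset.
scores = {
    "locus1": {"T1": -1000.0, "T2": -1010.0},
    "locus2": {"T1": -2020.0, "T2": -2000.0},
    "host": {"T1": -3000.0, "T2": -3005.0},
}
delta, shared_topology = lrt_delta(scores)

# Deltas recomputed on datasets simulated under the shared topology
# (made-up values; the study used 100 simulated datasets).
null_deltas = [2.0, 5.5, 31.0, 8.0]
p_value = simulated_p_value(delta, null_deltas)
```

Excluding one locus from the dictionary of scores before calling lrt_delta gives the kind of per-locus contribution check described above.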

Reconciliation analyses and ParaFit analyses were conducted on both specimen-based phylogenies (including 56 samples) and the different species-based phylogenies obtained with species delineation methods. The LRT method was used only for specimen-based phylogenies. The main advantage of this method is that it makes it possible to detect heterogeneity between data partitions, and this property should not be affected by the use of phylogenies including fewer sequences.

References

Charleston, M. A. 1998 Jungles: a new solution to the host/parasite phylogeny reconciliation problem. Math. Biosci. 149, 191–223. (doi:10.1016/S0025-5564(97))

Cunningham, C. W. 1997 Can three incongruence tests predict when data should be combined? Mol. Biol. Evol. 14, 733–740.

Farris, J. S., Källersjö, M., Kluge, A. G. & Bult, C. 1995 Constructing a significance test for incongruence. Syst. Biol. 44, 570–572. (doi:10.2307/2413663)

Gomez-Valero, L., Silva, F. J., Simon, J. C. & Latorre, A. 2007 Genome reduction of the aphid endosymbiont Buchnera aphidicola in a recent evolutionary time scale. Gene 389, 87–95.

Guindon, S. & Gascuel, O. 2003 A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704. (doi:10.1080/10635150390235520)

Heie, O. E. 1967 Studies on fossil aphids (Homoptera: Aphidoidea). Spolia Zool. Musei Hauniensis 26, 1–273.

Heie, O. E. 2004 The history of the studies on aphid palaeontology and their bearing on the evolutionary history of aphids. In Aphids in a new millennium (ed. J.-C. Simon, C.-A. Dedryver, C. Rispe & M. Hullé), pp. 151–158. Paris: INRA Editions.

Huelsenbeck, J. P. & Bull, J. J. 1996 A likelihood ratio test to detect conflicting phylogenetic signals. Syst. Biol. 45, 92–98. (doi:10.2307/2413514)

Huelsenbeck, J. P., Rannala, B. & Yang, Z. 1997 Statistical tests of host–parasite cospeciation. Evolution 51, 410–419. (doi:10.2307/2411113)

Hughes, J., Kennedy, M., Johnson, K. P., Palma, R. L. & Page, R. D. M. 2007 Multiple cophylogenetic analyses reveal frequent cospeciation between pelecaniform birds and Pectinopygus lice. Syst. Biol. 56, 232–251.

Johnson, K. P. & Clayton, D. H. 2004 Untangling coevolutionary history. Syst. Biol. 53, 92–94. (doi:10.1080/10635150490264824)

Kergoat, G. J., Silvain, J. F., Delobel, A., Tuda, M. & Anton, K. W. 2007 Defining the limits of taxonomic conservatism in host-plant use for phytophagous insects: molecular systematics and evolution of host-plant associations in the seed-beetle genus Bruchus Linnaeus (Coleoptera: Chrysomelidae: Bruchinae). Mol. Phyl. Evol. 43, 251–269.

Meier-Kolthoff, J. P., Auch, A. F., Huson, D. H. & Göker, M. 2007 Copycat: cophylogenetic analysis tool. Bioinformatics 23, 898–900. (doi:10.1093/bioinformatics/btm027)

Nylander, J. A. A., Ronquist, F., Huelsenbeck, J. P. & Nieves-Aldrey, J.-L. 2004 Bayesian phylogenetic analysis of combined data. Syst. Biol. 53, 47–67. (doi:10.1080/10635150490264699)

Page, R. D. M. Tangled trees: phylogeny, cospeciation and coevolution. Chicago, IL: University of Chicago Press.

Peek, A. S., Feldman, R. A., Lutz, R. A. & Vrijenhoek, R. C. 1998 Cospeciation of chemoautotrophic bacteria and deep sea clams. Proc. Natl Acad. Sci. USA 95, 9962–9966. (doi:10.1073/pnas.95.17.9962)

Posada, D. & Crandall, K. A. 1998 ModelTest: testing the model of DNA substitution. Bioinformatics 14, 817–818. (doi:10.1093/bioinformatics/14.9.817)

Rambaut, A. & Grassly, N. C. 1997 Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput. Appl. Biosci. 13, 235–238.

Ronquist, F. & Huelsenbeck, J. P. 2003 MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574. (doi:10.1093/bioinformatics/btg180)

Swofford, D. L. 2003 PAUP*. Phylogenetic analysis using parsimony (*and Other

Yang, Z. 1997 PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13, 555–556.

Table S1: Sample information

Species / Voucher / Collectors / Collection site / Host plant
B. aconiti (Mordvilko, 1928) / 1790 / Coeur.& Jous. / France, Ariège (09), Mijanes, Col de Pailhères / Aconitum sp.
B. amygdalinus (Schouteden, 1905) / 1688 / Coeur.& Jous. / France, Var (83), Fayence / Prunus dulcis
1694 / Coeur.& Jous. / France, Bouches-du-Rhône (13), St-Martin-de-Crau / Prunus dulcis
1710 / Coeur.& Jous. / France, Gers (32), Saint-Clar / Prunus dulcis
B. ballotae (Passerini, 1860) / s338 / G. Cocuzza / Germany, Berlin / Ballota nigra
B. bicolor (Nevsky, 1929) / 1458 / Coeur d'Acier / Greece, Lakonia, Lagada / Boraginaceae
B. cardui / 1709 / Coeur.& Jous. / France, Haute-Garonne (31), Grenade / Asteraceae sp.
1746 / Coeur.& Jous. / France, Haut-Rhin (68), Colmar / Prunus domestica
1765 / Coeur.& Jous. / France, Gard (30), Le Vigan / Arctium sp.
B. cerinthis / 1772 / Coeur.& Jous. / France, Hautes-Alpes (05), Villar-d'Arene / Cerinthe glabra
B. divaricatae (Shaposhnikov, 1956) / s242 / G. Cocuzza / Lithuania, Vilnius, Bratoskies / Prunus divaricata
B. helichrysi (Kaltenbach, 1843) / 1608 / Coeur d'Acier / Greece, Ahaia, Kalavrita / Achillea sp.
1600 / Coeur.& Jous. / Greece
1716 / Coeur.& Jous. / France, Tarn-et-Garonne (82), Gramont / Prunus domestica
1809 / Coeur.& Jous. / Australia, Western Australia, Denison / Helianthus annuus
1681 / Coeur.& Jous. / France
1749 / Coeur.& Jous. / France
B. jacobi Stroyan, 1957 / s145 / G. Cocuzza / Italy, Sicily, Itala / Myosotis sylvatica
B. klugkisti (Börner, 1942) / 1290 / Coeur d'Acier / France, Creuse (23), Peyrat-la-Noniere / Silene sp.
1747 / Coeur.& Jous. / France, Haut-Rhin (68), Ste-Marie-Aux-Mines / Silene dioica
2063 / Jousselin / France, Haute-Savoie (74), La Roche sur Foron / Silene dioica
2064 / Coeur.& Jous. / France, Pyrénées-Orientales / Silene dioica
B. lamii (Koch, 1854) / s328 / G. Cocuzza / Italy, Sicily, Montalbano Elicona / Lamium flexuosum
B. lateralis (Walker, 1848) / 1027 / Coeur d'Acier / France, Finistère (29), Cleden-Cap-Sizun / Senecio jacobaea
1741 / Coeur.& Jous. / France, Drôme (26), St-Marcel-les-Valence / Senecio sp.
1751 / Coeur.& Jous. / France, Haut-Rhin (68), Colmar / Arctium sp.
s117 / G. Cocuzza / Italy, Sicily, Salina / Chrysanthemum coronarium
1794 / Coeur.& Jous. / France, Lozère (48), La Bastide-Puylaurent / Senecio sp.
B. linariae (Stroyan, 1950) / 1938 / Coeur.& Jous. / France, Hérault (34) Prades le lez, CBGP / Linaria repens
2047 / Coeur.& Jous. / Italy, Sicily, Zafferana / Linaria purpurea
B. lucifugus (Müller, 1955) / s249 / G. Cocuzza / Italy, Trentino Alto Adige, Ala / Plantago lanceolata
B. lychnicola (Hille Ris Lambers, 1966) / s317 / G. Cocuzza / Czech Republic, South Bohemia, Lužanská Udolí / Silene flos-cuculi
B. lychnidis (Linnaeus, 1758) / 1324 / Coeur d'Acier / France, Morbihan (56), Saint-Pierre-Quiberon / Silene sp.
1698 / Coeur.& Jous. / France, Hérault (34) St-Guilhem-le-Desert / Silene latifolia
1752 / Coeur.& Jous. / France, Haut-Rhin (68), Colmar / Silene latifolia
1762 / Coeur.& Jous. / France, Gard (30), Le Vigan, Col de Faubel / Silene dioica
B. malvae (Shaposhnikov, 1964) / s125 / G. Cocuzza / Italy, Lazio, Roma / Malva sylvestris
B. mordvilkoi (Hille Ris Lambers, 1931) / s248 / G. Cocuzza / Italy, Trentino Alto Adige, Ala / Echium vulgare
B. napelli (Schrank, 1801) / s316 / G. Cocuzza / Czech Republic, South Bohemia, Lužanská Udolí / Aconitum callybotrium
B. persicae (Passerini, 1860) / 1077 / Coeur d'Acier / France, Aude (11), Quillan, La Forge / Prunus spinosa
1696 / Coeur.& Jous. / France / Prunus sp.
1736 / Coeur.& Jous. / France, Drôme (26), St-Marcel-les-Valence / Prunus sp.
B. populi (del Guercio, 1911) / 1483 / Coeur d'Acier / Greece, Lakonia, Mystra / Silene vulgaris
1760 / Coeur.& Jous. / France, Gard (30), Le Vigan, Col de Faubel / Silene vulgaris
B. prunicola (Kaltenbach, 1843) / 1267 / Coeur d'Acier / France, Creuse (23), Vallieres, La Prades / Prunus sp.
B. rumexicolens (Patch, 1917) / 1764 / Coeur.& Jous. / France, Gard (30), Le Vigan, Col de Faubel / Rumex acetosella
1982 / Coeur.& Jous. / Italy, Sicily, Linguaglossa / Rumex acetosella
B. salicinae (Börner, 1939) / s307 / G. Cocuzza / Czech Republic, South Bohemia, Český Krumlov / Inula salicina
B. schwartzi (Börner, 1931) / 1717 / Coeur.& Jous. / France, Tarn-et-Garonne (82), Gramont, Hameau de Géran / Prunus persica
1730 / Jousselin / France, Centre, Loiret (45), Germigny-Des-Pres / Prunus persica
1738 / Coeur.& Jous. / France, Drôme (26), St-Marcel-les-Valence / Prunus persica
B. spiraeae (Börner, 1932) / 1775 / Coeur.& Jous. / France, Hautes-Alpes (05), La Grave / Spiraea sp.
2143 / Coeur.& Jous. / Scotland, Kinlochewe / Spiraea salicifolia
B. tragopogonis (Kaltenbach, 1843) / 1378 / Coeur d'Acier / Greece, Korinthia, Némea / Tragopogon sp.
1715 / Coeur.& Jous. / France, Tarn-et-Garonne (82), Gramont, Hameau de Géran / Tragopogon sp.
1773 / Coeur.& Jous. / France, Hautes-Alpes (05), Villar-d’Arene / Tragopogon sp.
Outgroups
Myzus persicae (Sulzer, 1776) / 1948 / Coeur.& Jous. / France
Myzus persicae (Sulzer, 1776) / 1956 / Coeur.& Jous. / France

Table S2: Names, sequences and references of the primers used for Buchnera PCR and sequencing.

DNA fragment / Name of primer / Sequence of primer / References
TrpB / TrpBF / ACWGGHGCTGGWCAACATGGWGT / This study
TrpBRlg / CAACCAAGCATGTTCAGGACCA / This study
HupA rpoC intron / hupAF / DTTAATTAATTGAGTTTTATTCAT / Gomez-Valero et al. 2007
rpoC / ACWGGATATGCATATCAYAAARAACG / Gomez-Valero et al. 2007
Sbb-dnaB intron / sbbF / CGAACWTCVGGATCTTGWC / Carletto et al. unpublished
dnaB R / ATCCCATTGTTCATTATCTAACAT / Carletto et al. unpublished

Table S3: Comparison of partitioning strategies. The combined dataset was partitioned according to DNA fragment identity, coding (i.e. TrpB) versus non-coding regions, and codon position within the coding region. Partitioning strategies were compared using Bayes factors (Kergoat et al. 2007). Bayes factors (2 ln (Bp)) are shown in the lower left half of the matrix. Critical values of the χ2 distribution (P < 0.001) are given in the upper right half of the matrix; df refers to the number of additional parameters required by the more complex of the two strategies being compared.

Partitioning strategy / Harmonic mean / P1 / P2 / P3 / P4 / P5
P1. Non-partitioned dataset / 13994.34 / - / 36.123 (df = 14) / 55.476 (df = 27) / 73.402 (df = 40) / 90.573 (df = 53)
P2. TrpB + (introns) / 13812.27 / 364.14 / - / 34.528 (df = 13) / 54.052 (df = 26) / 72.055 (df = 39)
P3. TrpB + intron 1 + intron 2 / 13636.43 / 715.82 / 351.68 / - / 34.528 (df = 13) / 54.052 (df = 26)
P4. TrpB codon 1, 2, 3 + intron / 13641.26 / 706.16 / 342.02 / -9.66 / - / 34.528 (df = 13)
P5. TrpB codon 1, 2, 3 + intron 1 + intron 2 / 13587.00 / 814.68 / 450.54 / 98.86 / 108.52 / -
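The Bayes factors in Table S3 can be recomputed from the harmonic means as twice the difference between the values for the two strategies compared. The sketch below assumes that the tabulated harmonic means are reported as positive values of −ln L (an assumption inferred from the tabulated Bayes factors); the function and variable names are hypothetical.

```python
# Harmonic means as listed in Table S3, keyed by the column labels P1 to P5.
# Assumption: they are reported as positive values, i.e. as -ln L.
harmonic_mean = {
    "P1": 13994.34,
    "P2": 13812.27,
    "P3": 13636.43,
    "P4": 13641.26,
    "P5": 13587.00,
}

def two_ln_bayes_factor(simple, complex_):
    """2 ln(B) for a more complex strategy against a simpler one."""
    return round(2.0 * (harmonic_mean[simple] - harmonic_mean[complex_]), 2)

def favours_complex(simple, complex_, critical_value):
    """Compare 2 ln(B) with the chi-square critical value for the extra df."""
    return two_ln_bayes_factor(simple, complex_) > critical_value

bf_p1_p2 = two_ln_bayes_factor("P1", "P2")         # first row against second
p2_better = favours_complex("P1", "P2", 36.123)    # critical value for df = 14
p4_better = favours_complex("P3", "P4", 34.528)    # critical value for df = 13
```

The one negative Bayes factor in the table (−9.66) arises because the harmonic mean for P4 is slightly higher than that for P3.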

Table S4: Results of cospeciation tests between aphid and Buchnera species trees. Maximum numbers of cospeciation events are given for TreeMap analyses and numbers of significant links are given for ParaFit analyses.

Brachycaudus tree (no. of species) / Buchnera tree (no. of species) / TreeMap 1 / TreeMap 2.02b / ParaFit
taxonomic species (27) / phylogenetic species, clustering method (21) / 13 P < 0.001 / 30 P < 0.01 / 28 (all) P < 0.001
phylogenetic species, clustering method (21) / phylogenetic species, clustering method (21) / 14 P < 0.001 / 34 P < 0.01 / 22 (all) P < 0.001
phylogenetic species, Pons et al. method (22) / phylogenetic species, Pons et al. method (24) / 16 P < 0.001 / 34 P < 0.01 / 24 (all) P < 0.001