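Cells / MEFs / MEFs / MEFs / LEDGF+ human cells / LEDGF+ human cells / LEDGF+ human cells / LEDGF+ human cells / LEDGF+ human cells / LEDGF+ human cells / LEDGF+ human cells / LEDGF+ human cells
Dataset / HIV-1, E1 (f/+) / HIV-1, E2 (-/-) / Random² / HIV-1 / MLV / SFV / ASLV / AAV / Random_s² / Random_M² / Random_ANS²
Total sites¹ / 326 / 408 / 857 / 1269 / 1231 / 3088 / 577 / 432 / 10,000 / 1299 / 884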
RefSeq TUs (%) / 62.3 / 44.1 / 32.3 / 72.0 / 47.1 / 30.4 / 47.7 / 47.9 / 34.1 / 37.1 / 36.2
Ensembl TUs (%) / 68.7 / 47.3 / 39.6 / 74.0 / 49.6 / (33.7) / 49.6 / 50.9 / 35.7 / 38.7 / 38.0
all TUs (all_mRNA) (%) / 81.9 / 60.5 / 52.4 / 79.5 / 59.2 / (45.2) / 58.4 / 61.6 / 44.0 / 47.0 / 47.9
±2.5 kb of transcription start site (Refseq) (%) / (0.9) / 9.8 / 3.6 / (2.7) / 20.1 / 11.4 / 6.1 / (4.9) / 3.2 / 2.7 / 3.1
±5.0 kb of transcription start site (Refseq) (%) / (6.4) / 16.8 / 8.1 / (7.4) / 25.7 / 17.6 / 11.6 / (8.8) / 6.0 / 5.2 / 6.8
±2.5 kb of CpG island (%) / (3.4)³ / 9.6 / 4.1 / (4.5)³ / 23.8 / 15.3 / 9.7 / (7.2) / 4.7 / 2.9 / 5.2
±5.0 kb of CpG island (%) / (9.2)³ / 15.9 / 8.1 / (10.0)³ / 28.8 / 21.6 / 15.9 / (11.3) / 8.2 / 6.5 / 9.2

Supplementary Table S1. Frequencies of viral integration into genomic features. All differences in the observed integration frequencies in E1 (f/+) versus E2 (-/-), and in E2 (-/-) versus Random, were statistically significant (P < 0.01). Frequencies given in parentheses were not statistically different from those in the random datasets (P > 0.01). HIV-1 integration sites from MEFs were generated in this work. Retroviral (HIV-1, MLV, SFV, ASLV) and parvoviral (AAV) integration sites from normal (LEDGF+) human cells were taken from Schroder et al. 2002; Mitchell et al. 2004; Lewinski et al. 2006; Wu et al. 2003; Narezkina et al. 2004; Trobridge et al. 2006; Nowrouzi et al. 2006; Miller et al. 2005. GenBank accession numbers: BH609398-BH609980, BH610075-BH610086, CL529240-CL529767 (HIV-1); DX598305-DX598906, AY515855-AY516880 (MLV); DU796690-DU799518, DQ192669-DQ193287 (SFV); CL528303-CL528772, AY653309-AY653534 (ASLV); DU709854-DU711025 (AAV).
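The pairwise comparisons described in the caption can be illustrated with a short sketch. The statistical test behind the reported P values is not stated here, so a Pearson chi-square test on a 2x2 table is assumed purely for illustration. Site counts are reconstructed from the rounded percentages and are therefore approximate, and the function names are hypothetical.

```python
import math


def counts_from_percent(percent, total):
    """Approximate the number of sites in a feature from a rounded percentage."""
    return round(percent / 100.0 * total)


def chi_square_2x2(hits_a, total_a, hits_b, total_b):
    """Pearson chi-square test on a 2x2 table (1 degree of freedom,
    no continuity correction). ASSUMPTION: the source does not name its test."""
    a, b = hits_a, total_a - hits_a
    c, d = hits_b, total_b - hits_b
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the 2x2 table is empty")
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# RefSeq TU row: E1 (f/+) 62.3% of 326 sites versus E2 (-/-) 44.1% of 408 sites.
e1_hits = counts_from_percent(62.3, 326)
e2_hits = counts_from_percent(44.1, 408)
stat, p_value = chi_square_2x2(e1_hits, 326, e2_hits, 408)
print(f"chi-square = {stat:.2f}, P = {p_value:.2g}")
```

Under these assumptions the E1 versus E2 difference for RefSeq TUs falls well below the P < 0.01 threshold, in line with the caption.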

¹Total number of integration sites from each dataset used in the analyses. All sites (except those in the Random_s set) were selected to unambiguously match a single location in the host genomic sequence based on Blat searches.

²Random integration sites were computer-generated. The MEF Random dataset simulated the cloning/matching procedure used in this work. Human Random_s sites were selected indiscriminately along the human genome. Human Random_M and Random_ANS sites simulated LM-PCR using MseI or AvrII/NheI/SpeI digestion, respectively. Only sites that could be unambiguously located within the human genome using Blat were used for the analysis.
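A minimal sketch of how restriction-matched random sites of this kind might be generated is given below. It is not the procedure used for the datasets above. The fragment-length window (`min_len`, `max_len`), the choice of the nearest downstream recognition site, and all function names are assumptions made for illustration, and the subsequent Blat uniqueness filter is not modelled. The recognition sequences are the standard ones for these enzymes.

```python
import bisect
import random
import re

# Standard recognition sequences (all palindromic, so one strand suffices).
RECOGNITION_SITES = {
    "MseI": "TTAA",
    "AvrII": "CCTAGG",
    "NheI": "GCTAGC",
    "SpeI": "ACTAGT",
}


def cut_positions(genome, enzymes):
    """Return sorted start positions of every recognition site in the sequence."""
    pattern = "|".join(RECOGNITION_SITES[name] for name in enzymes)
    # A lookahead is used so that overlapping sites are also reported.
    return sorted(m.start() for m in re.finditer(f"(?=(?:{pattern}))", genome.upper()))


def simulate_sites(genome, enzymes, n_sites, min_len=20, max_len=2000,
                   seed=0, max_tries=100_000):
    """Draw random positions and keep those whose distance to the nearest
    downstream recognition site lies within [min_len, max_len].
    ASSUMPTION: the length window stands in for what LM-PCR could recover."""
    rng = random.Random(seed)
    cuts = cut_positions(genome, enzymes)
    sites = []
    tries = 0
    while len(sites) < n_sites and tries < max_tries:
        tries += 1
        pos = rng.randrange(len(genome))
        i = bisect.bisect_right(cuts, pos)
        if i == len(cuts):
            continue  # no recognition site downstream of this position
        if min_len <= cuts[i] - pos <= max_len:
            sites.append(pos)
    return sites


# Toy sequence with a single MseI site at position 40.
toy_genome = "ACGT" * 10 + "TTAA" + "ACGT" * 10
print(simulate_sites(toy_genome, ["MseI"], n_sites=5, min_len=5, max_len=20))
```

A Random_ANS-like set would be drawn the same way with `["AvrII", "NheI", "SpeI"]` as the enzyme list.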

³The negative effect of CpG-rich regions on HIV-1 integration becomes clearer after discounting the islands located within TUs, which account for ~50% of the CpG islands in current genomic annotations.
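The window-based frequencies in the table, and the effect of discounting CpG islands located within TUs, can be sketched as follows. This is an illustrative sketch only: coordinates are toy values on a single chromosome, an island is treated as "within a TU" if it overlaps one (an assumption), and the function names are hypothetical.

```python
def overlaps(a, b):
    """True if closed intervals a = (start, end) and b = (start, end) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


def islands_outside_tus(islands, tus):
    """Discard CpG islands that overlap any transcription unit.
    ASSUMPTION: overlap is used as the criterion for 'located within TUs'."""
    return [isl for isl in islands if not any(overlaps(isl, tu) for tu in tus)]


def percent_near(sites, features, window):
    """Percentage of integration sites lying within +/- window bp of any feature."""
    if not sites:
        raise ValueError("no integration sites supplied")
    near = sum(
        1
        for site in sites
        if any(start - window <= site <= end + window for start, end in features)
    )
    return 100.0 * near / len(sites)


# Toy data: four integration sites, two CpG islands, one transcription unit.
sites = [100, 5000, 9000, 20000]
islands = [(4000, 4500), (15000, 15200)]
tus = [(3000, 6000)]

all_islands = percent_near(sites, islands, window=2500)
intergenic_only = percent_near(sites, islands_outside_tus(islands, tus), window=2500)
print(all_islands, intergenic_only)
```

The nested scan is quadratic in the worst case; for genome-scale annotations, sorted intervals with binary search or an interval tree would be the more practical choice.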


Supplementary Table S2. Oligonucleotides used in this study

Oligo / Sequence 5'-3' / Usage / Reference
AE2270 / gcactgaaacgccaagcatccag / pCP75KO construction / This study
AE2271 / atacagctaggaacaagattagg / pCP75KO construction / This study
AE2336 / tgggatccaacttactatatagacccagttg / pCP75KO construction / This study
AE2338 / cttcccgcggagcctgtgtttatatactctaatc / pCP75KO construction / This study
AE2382 / CCCTGATATCTCTGAGTTCACTCAGGTGTAC / pCP75KO construction / This study
AE2290 / ctgaattcaatacagaatgacattctatctc / pCP75KO construction / This study
AE2285 / ctgcgtcgaccaacaccaaacagtgctctattc / pCP75KO construction / This study
AE2329 / gaatcgatctagagtaaagagccccaaatatgtc / pCP75KO construction / This study
AE2341 / CGATGCATGCATAACTTCGTATAATGTATGCTATACGAAGTTATGAT / pCP75KO construction / This study
AE2342 / ATCATAACTTCGTATAGCATACATTATACGAAGTTATGCATGCAT / pCP75KO construction / This study
AE2722 / ATGGGCCCACTGGAAGGGCTAATTCACTCCG / pCR2.1-U3 construction / This study
AE2723 / CGAATTCAAGCTTTATTGAGGCTTAAGCAG / pCR2.1-U3 construction / This study
AE2724 / CGAATTCAAGCTTGCCTTGAGTGCTCAAAG / pCR2.1-U5 construction / This study
AE2725 / ATGGGCCCTACTGCTAGAGATTTTCCACACTG / pCR2.1-U5 construction / This study
AE2817 / GCGCGGCCACAACCATGGTGAGCAAGGGCGAGGAGC / pIRES2-eGFP construction / This study
AE2828 / CCGTCGCGGCCGCTACTTGTACAGCTCGTCCATGCCG / pIRES2-eGFP construction / This study
AE2647 / cgtgaattccaccatgactcgcgatttcaaacctggagac / Ledgf/p52 construction / This study
AE2680 / taatcgatgcggccgcctaagcgtaatctggaacatcgtatgggtagaagccaccctggattaatgctttgagctgctc / Ledgf/p52 construction / This study
AE2829 / ACTCACTTTTAGATTAACAGATGC / DATh construction / This study
AE2830 / GCATCTGTTAATCTAAAAGTGAGTATGGTAAAACAGCCCTGTCC / DATh construction / This study
AE2822 / GAGAAGAAGCGAGAAACATCAGGTGAAGGAGATTCCGTGATC / DIBD construction / This study
AE2823 / TGATGTTTCTCGCTTCTTCTCCAC / DIBD construction / This study
AE2331 / gagatatcgaggcagaaagaagactgggatag / Genomic flox site detection / This study
AE2334 / tggaattctatctcaaacaaaccaaagagc / Genomic flox site detection / This study
AE2802 / gcatggtggcacaatggcaactgggtc / Exon 3 deletion / This study
AE2772 / tattcttaatctttgctgtagag / Psip1 Southern blot probe / This study
AE2773 / tctggagaagtacaaaggttatc / Psip1 Southern blot probe / This study
AE2507 / GGGATAACAGCGCAATCCTATTC / Mitochondrial PCR / This study
AE2508 / ACGTAGGACTTTAATCGTTGAAC / Mitochondrial PCR / This study
AE2697 / TGTTAGGCATGTTGGAGACCTGG / Sod1 PCR / This study
AE2698 / AATGCTCTCCTGAGAGTGAGATC / Sod1 PCR / This study
MH531 / TGTGTGCCCGTCTGTTGTGT / LRT PCR / Butler et al. 2001
MH532 / GAGTCCTGCGTCGAGAGAGC / LRT PCR / Butler et al. 2001
LRT-P / FAM-CAGTGGCGCCCGAACAGGGA-TAMRA / LRT PCR / Butler et al. 2001
AE2621 / TAACTAGAGATCCCTCAGACC / 2-LTR PCR / This study
AE2622 / CAGGCTCAGATCTGGTCTAAC / 2-LTR PCR / This study
AE2623 / FAM-AAAATCTCTAGCAGTACTGGAAGGGCTAAT-TAMRA / 2-LTR PCR / This study
AE2257 / tttcaggtccctgttcgggcgccac / BBL-PCR; PIC integration assay / Lu et al. 2005
AE2604 / agtgagttccaggacagccagg / BBL-PCR / This study
AE2605 / ggctggcctcgaactcagaaatc / BBL-PCR / This study
AE2606 / gtctgaagacagctacagtgtac / BBL-PCR / This study
AE2607 / gtgagccaccatgtggttgctgg / BBL-PCR / This study
AE2608 / ctcaccatcatcagaacgcagcact / BBL-PCR / This study
AE2609 / tcgccatctggtaatctctgaag / BBL-PCR / This study
AE989 / TCTGGCTAACTAGGGAACCCA / BBL-PCR; PIC integration assay / Julias et al. 2001
AE990 / CTGACTAAAAGGGTCTGAGG / BBL-PCR; PIC integration assay / Julias et al. 2001
AE995 / FAM-TTAAGCCTCAATAAAGCTTGCCTTGAGTGC-TAMRA / BBL-PCR; PIC integration assay / Julias et al. 2001
AE2624 / CCTCAAACATGACTCGCGATTTC / Psip1 RT-PCR / This study
AE2625 / GCTCCATCAGGAACTTCATCTAC / Psip1 RT-PCR / This study
AE2844 / GTAATACGACTCACTATAGGGCACGCGTGGTCGACGGCCCGGGCTGC / LM-PCR linker / This study
AE2845 / PO4-CTAGGCAGCCCG-AmC7Q / LM-PCR linker / This study
AE2814 / GACTCACTATAGGGCACGCGT / Integration site PCR / This study
SB-76 / GAGGGATCTCTAGTTACCAGAGTCACA / Integration site PCR / Schroder et al. 2002
AE2815 / GTCGACGGCCCGGGCTGCCTA / Integration site PCR / This study
ASB-1 / AGCCAGAGAGCTCCCAGGCTCAGATC / Integration site PCR / Schroder et al. 2002
AE2685 / CCAAAGGTCAGTGGATATCTG / Integration site sequencing / This study
AE2413 / gttgttccagtttggaacaagagtc / PIC integration assay / Lu et al. 2005
AE2414 / actcaaccctatctcggtctattc / PIC integration assay / Lu et al. 2005