Supplementary Experimental Procedures

Genomic DNA Isolation

Rice leaves were ground in liquid N2 with a mortar and pestle, transferred to a 2-ml Eppendorf tube, suspended in 1 ml of urea extraction buffer (7 M urea, 0.3 M NaCl, 50 mM Tris-HCl, pH 8.0, 20 mM EDTA, and 1% sarkosine), and extracted with 1 ml of phenol:chloroform:isoamyl alcohol (25:24:1) at room temperature for 15 min. The mixture was centrifuged at 8000 rpm and 4°C for 10 min. The aqueous phase was mixed with 200 µl of 3 M sodium acetate (pH 5.2) and 1 ml of isopropanol. DNA was spooled out, placed in 70 % ethanol, and centrifuged at 8000 rpm for 10 s. The DNA pellet was serially washed with 70 % and 100 % ethanol and air-dried. The genomic DNA was then resuspended in TE buffer and stored at 4°C.

PCR and RT-PCR Analysis

Total RNA was purified from rice suspension cells and leaves using Trizol reagent (Gibco BRL). Twenty µg of RNA was treated with 1 unit of RNase-free DNase I (Promega, Madison, WI) in a 20 µl solution containing 5 mM DTT and 40 units of RNasin (Promega) and incubated at 37°C for 15 min. The RNA sample was then incubated at 80°C for 3 min and placed on ice. cDNA synthesis was performed in 20 µl of reverse transcriptase reaction buffer (50 mM Tris-HCl, pH 8.3, 75 mM KCl, and 3 mM MgCl2) containing 10 µg of the RNA preparation described above, 50 pmole oligo(dT)18 primer, 5 mM DTT, 0.5 mM each dNTP, 40 units of RNasin, and 200 units of Moloney murine leukaemia virus reverse transcriptase (MMLV-RTase, Promega). The reaction was carried out at room temperature for 10 min, then at 37°C for 1 h, and finally terminated by heating at 94°C for 5 min. The sample served as a cDNA stock for PCR analysis and was stored at -70°C.

PCR was carried out in a 50 µl solution containing 2.5 µl of cDNA prepared as described above, 10 mM Tris-HCl, pH 8.3, 50 mM KCl, 3 mM MgCl2, 0.1 mM each dNTP, 100 pmole each primer, 10% DMSO, and 3 units of super Taq DNA polymerase (Promega). Cycling was controlled by a programmable thermal cycler (Biometra, Göttingen, Germany). Each cycle consisted of DNA denaturation at 94°C for 0.5 min, primer annealing at 56°C for 0.5 min, and primer extension at 72°C for 1 min. The reaction was run for 30 cycles, followed by a final 10 min primer extension at 72°C. The RT-PCR products were fractionated in a 2 % agarose gel and visualized by ethidium bromide staining.

Bioinformatics Analysis and Establishment of the FST Database

Identification of T-DNA and binary vector sequences, as well as assignment of T-DNA insertion sites on rice BAC/PAC sequences, was carried out using BLASTN (Basic Local Alignment Search Tool). Rice genomic sequences were obtained from the Rice Genome Automated Annotation System (RiceGAAS), the Rice Genome Research Program, and TIGR. T-DNA-inserted genes in the rice genome were identified as follows: FSTs were matched to BAC/PAC sequences using BLASTN, with expected values of < e-1, HSP length > 30 bp, and identity > 90%. The integration site of a given FST in the rice genome was defined as the highest hit in a BLASTN search having an identity to the rice genome sequence higher than 96 %. The description of each insertion event was retrieved from our annotation database, itself obtained from NCBI and RiceGAAS and updated bi-weekly using PHP scripts. Tos17 FSTs (Miyao et al., 2003), Oryza TagLine FSTs (Sallaud et al., 2004), and rice callus ESTs were downloaded from GenBank at NCBI, and their similarities to BAC/PAC sequences were obtained following the same procedure as that for T-DNA FSTs. Data analysis and storage were performed using two IBM eServer xSeries 335 computers running a Linux operating system. Information on mutant lines, plant phenotype data of T2 mutants, BAC/PAC sequences surrounding each T-DNA insertion site, results of homology searches, and annotations of T-DNA tagged genes are stored in the relational database management system MySQL. Data can be retrieved through the World Wide Web server program Apache and a PHP script. Users may search for genes of interest through the TRIM website by three methods: BLAST search, keyword search, or mutant line number.
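The matching and site-assignment rules described above can be sketched in Python. This is an illustrative sketch only, not the scripts used for the TRIM database. It assumes that BLASTN hits are available in the standard 12-column tabular format, that the threshold "< e-1" means an expected value below 0.1, and that "highest hit" means the hit with the highest bit score. The function names are hypothetical.

```python
def parse_hit(line):
    """Parse one line of 12-column tabular BLAST output into a dict."""
    f = line.rstrip("\n").split("\t")
    return {
        "fst": f[0],              # query (FST) identifier
        "subject": f[1],          # BAC/PAC identifier
        "identity": float(f[2]),  # percent identity
        "length": int(f[3]),      # alignment (HSP) length in bp
        "sstart": int(f[8]),      # start of the hit on the BAC/PAC
        "send": int(f[9]),        # end of the hit on the BAC/PAC
        "evalue": float(f[10]),
        "bitscore": float(f[11]),
    }


def passes_match_filter(hit, max_evalue=1e-1, min_length=30, min_identity=90.0):
    """Matching thresholds from the text (the e-value reading is an assumption)."""
    return (hit["evalue"] < max_evalue
            and hit["length"] > min_length
            and hit["identity"] > min_identity)


def assign_integration_sites(lines, site_identity=96.0):
    """Return the best-scoring hit per FST among hits with identity > 96 %."""
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = parse_hit(line)
        if not passes_match_filter(hit) or hit["identity"] <= site_identity:
            continue
        current = best.get(hit["fst"])
        # "Highest hit" is interpreted here as highest bit score (assumption).
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["fst"]] = hit
    return best


if __name__ == "__main__":
    # Made-up example hits, for illustration only.
    example = [
        "FST1\tBAC_A\t98.5\t200\t3\t0\t1\t200\t5000\t5199\t1e-90\t350",
        "FST1\tBAC_B\t97.0\t150\t4\t0\t1\t150\t100\t249\t1e-60\t250",
        "FST2\tBAC_A\t92.0\t120\t9\t0\t1\t120\t900\t1019\t1e-30\t140",
    ]
    for fst, hit in assign_integration_sites(example).items():
        print(fst, hit["subject"], hit["sstart"])
```

In this sketch an FST whose best alignment has 90-96 % identity counts as a match but is not assigned an integration site, which is one possible reading of the two thresholds given in the text.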


Table S1 T-DNA integration loci in T0 Tainung 67 transgenic rice plants

No. of loci / No. of lines / Ratio (%) / Total integration loci
1 / 116 / 58 / 116
2 / 46 / 23 / 92
3 / 22 / 11 / 66
4 / 8 / 4 / 32
≧5 / 8 / 4 / 40
Total / 200 / 100 / 346
Average / 1.73

Ratio = number of lines / total number of lines analyzed (200).

Total integration loci = number of loci x number of lines.

Average = total integration loci (346) / total number of lines (200)
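The arithmetic behind Table S1 can be checked with a short Python sketch. The variable names are illustrative. As in the table, the "≧5" class is counted as exactly 5 loci per line, so the computed average is a lower bound if any line carries more than five loci.

```python
# Number of lines observed for each locus count (Table S1).
# The ">=5" class is counted as exactly 5 loci, as the table does (8 x 5 = 40).
lines_by_loci = {1: 116, 2: 46, 3: 22, 4: 8, 5: 8}

total_lines = sum(lines_by_loci.values())
total_loci = sum(loci * n for loci, n in lines_by_loci.items())
average_loci = total_loci / total_lines

# Ratio (%) = number of lines / total number of lines analyzed.
ratios = {loci: 100.0 * n / total_lines for loci, n in lines_by_loci.items()}

print(total_lines, total_loci, round(average_loci, 2))
for loci, ratio in ratios.items():
    print(loci, ratio)
```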


Table S2 Tos17 copy number in T0 Tainung 67 transgenic rice plants

Copy no. / No. of lines / Ratio (%) / Total copy no.
3 / 91 / 91 / 273
4 / 6 / 6 / 24
5 / 3 / 3 / 15
Total / 100 / 100 / 312
Average / 3.12

Ratio = number of lines / total number of lines (100).

Total copy number = copy number x number of lines.

Average = total copy number (312) / total number of lines (100).


Table S3 Summary of characterization of T-DNA flanking sequences

Type of sequences / No. of lines / No. of read sequences / Ratio (%)
High quality
With T-DNA border sequence / 17764 / 20497 / 96.02
Rice genomic sequence / 10774 / 11992 / 56.18
T-DNA tandem / 6588 / 7244 / 33.93
Vector only / 614 / 659 / 3.09
No-hit / 562 / 602 / 2.82
Without T-DNA border sequence / 805 / 849 / 3.98
Subtotal / 18569 / 21346 / 100.00
Low quality / 140 / 187
Total / 18709 / 21533

Text for Table S3

By December 31, 2005, 20,497 high quality FSTs were obtained from 18,709 transgenic lines. Among them, 11,992 FSTs (56.18 %), with an average length of 272 bp (range 30 to 729 bp, median 236 bp, mode 73 bp), were assigned to exact positions on the rice genome, and 33.93 % and 3.09 % belonged to T-DNA tandem repeats and binary vectors, respectively.


Table S4 Distribution of T-DNA insertions over 12 rice chromosomes

Chromosome no. / Predicted chromosome size (Mb) / No. of T-DNA inserts / Insertion density (per Mb) / Frequency (%)
1 / 43.2 / 1610 / 37.27 / 13.4
2 / 35.9 / 1487 / 41.42 / 12.4
3 / 36.1 / 1686 / 46.70 / 14.1
4 / 35.4 / 1126 / 31.81 / 9.4
5 / 29.7 / 920 / 30.98 / 7.7
6 / 30.7 / 893 / 29.09 / 7.4
7 / 29.6 / 768 / 25.95 / 6.4
8 / 28.4 / 907 / 31.94 / 7.6
9 / 22.6 / 768 / 33.98 / 6.4
10 / 22.6 / 617 / 27.30 / 5.1
11 / 28.3 / 594 / 20.99 / 5.0
12 / 27.5 / 616 / 22.40 / 5.1
Total / 370 / 11992 / 379.83 / 100
Average / 31.65

Predicted chromosome sizes are from the IRGSP database.

Insertion density = number of T-DNA inserts / total length of each chromosome.

Frequency = number of T-DNA inserts / total number of T-DNA inserts (11992).

Average = total insertion density / total number of chromosomes (12).
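The definitions above can be reproduced from the chromosome sizes and insert counts in Table S4. The following Python sketch is illustrative and uses hypothetical variable names. It also computes the genome-wide density (total inserts / total size), which is not reported in the table and is shown only to make clear that it differs from the mean of the per-chromosome densities.

```python
# Chromosome number -> (predicted size in Mb, number of T-DNA inserts), from Table S4.
table_s4 = {
    1: (43.2, 1610), 2: (35.9, 1487), 3: (36.1, 1686), 4: (35.4, 1126),
    5: (29.7, 920), 6: (30.7, 893), 7: (29.6, 768), 8: (28.4, 907),
    9: (22.6, 768), 10: (22.6, 617), 11: (28.3, 594), 12: (27.5, 616),
}

total_inserts = sum(n for _, n in table_s4.values())
total_size_mb = sum(size for size, _ in table_s4.values())

# Insertion density = number of T-DNA inserts / chromosome length (per Mb).
density = {c: n / size for c, (size, n) in table_s4.items()}
# Frequency = number of T-DNA inserts / total number of T-DNA inserts (%).
frequency = {c: 100.0 * n / total_inserts for c, (_, n) in table_s4.items()}

# "Average" in Table S4: mean of the 12 per-chromosome densities.
mean_density = sum(density.values()) / len(density)
# Genome-wide density, weighting each chromosome by its length (not in the table).
genome_density = total_inserts / total_size_mb

for c in sorted(table_s4):
    print(c, round(density[c], 2), round(frequency[c], 1))
print(round(mean_density, 2), round(genome_density, 2))
```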


Table S5 Comparison of integration frequencies of T-DNA and Tos17 over 12 rice chromosomes

Chr Sec / Chr 1 / Chr 2 / Chr 3
Density (%) / Frequency (%) / Density (%) / Frequency (%) / Density (%) / Frequency (%)
Gene / nonTE / TE / T-DNA / Tos17 / Gene / nonTE / TE / T-DNA / Tos17 / Gene / nonTE / TE / T-DNA / Tos17
1 / 5.35 / 5.47 / 4.78 / 5.16 / 6.98 / 5.69 / 6.15 / 3.57 / 6.47 / 27.14 / 5.01 / 5.30 / 3.51 / 7.44 / 10.19
2 / 5.59 / 6.00 / 3.60 / 5.23 / 15.41 / 5.29 / 5.33 / 5.10 / 5.37 / 13.11 / 5.13 / 5.42 / 3.62 / 6.23 / 5.39
3 / 4.60 / 4.89 / 3.19 / 6.87 / 5.45 / 5.01 / 5.46 / 2.96 / 4.15 / 3.00 / 5.09 / 5.75 / 1.65 / 8.05 / 8.34
4 / 5.01 / 5.44 / 2.93 / 4.46 / 3.65 / 4.89 / 5.46 / 2.24 / 4.78 / 5.13 / 4.58 / 5.00 / 2.38 / 4.67 / 6.64
5 / 4.57 / 4.27 / 6.04 / 4.06 / 1.85 / 4.41 / 4.64 / 3.37 / 3.68 / 0.82 / 5.03 / 5.06 / 4.86 / 7.29 / 6.09
6 / 4.93 / 4.51 / 6.96 / 3.47 / 0.59 / 4.79 / 3.93 / 8.78 / 3.30 / 2.66 / 4.80 / 5.28 / 2.27 / 8.01 / 7.69
7 / 5.11 / 4.34 / 8.89 / 2.27 / 2.93 / 4.87 / 4.28 / 7.55 / 1.86 / 1.40 / 4.61 / 4.98 / 2.69 / 5.24 / 2.65
8 / 4.95 / 3.79 / 10.65 / 1.50 / 0.36 / 4.92 / 3.97 / 9.29 / 2.28 / 0.39 / 4.58 / 4.23 / 6.40 / 2.77 / 1.35
9 / 5.07 / 4.46 / 8.05 / 3.33 / 1.80 / 4.78 / 3.80 / 9.29 / 3.38 / 0.24 / 4.88 / 4.69 / 5.89 / 3.53 / 0.90
10 / 4.74 / 3.64 / 10.14 / 2.01 / 1.76 / 5.09 / 4.31 / 8.67 / 2.58 / 1.64 / 4.88 / 4.43 / 7.23 / 2.92 / 1.70
11 / 4.91 / 4.99 / 4.53 / 3.25 / 2.21 / 4.67 / 4.57 / 5.10 / 5.08 / 1.45 / 5.41 / 3.49 / 15.50 / 3.15 / 0.50
12 / 4.65 / 4.94 / 3.27 / 5.34 / 4.77 / 4.78 / 4.64 / 5.41 / 4.31 / 3.14 / 5.31 / 4.63 / 8.88 / 4.10 / 2.00
13 / 4.63 / 4.63 / 4.61 / 4.86 / 3.74 / 5.01 / 4.97 / 5.20 / 4.78 / 0.87 / 5.31 / 4.86 / 7.64 / 1.97 / 3.60
14 / 5.27 / 5.80 / 2.68 / 6.87 / 7.84 / 4.94 / 5.11 / 4.18 / 7.45 / 6.77 / 4.32 / 4.21 / 4.86 / 1.94 / 2.55
15 / 5.17 / 5.50 / 3.52 / 7.35 / 3.29 / 4.83 / 5.06 / 3.78 / 6.09 / 3.87 / 4.78 / 4.63 / 5.58 / 3.65 / 1.90
16 / 5.02 / 5.59 / 2.26 / 5.27 / 10.54 / 4.72 / 4.97 / 3.57 / 7.95 / 1.74 / 4.86 / 5.12 / 3.51 / 5.13 / 3.85
17 / 5.15 / 5.32 / 4.36 / 7.68 / 2.25 / 5.25 / 5.73 / 3.06 / 5.41 / 3.58 / 4.95 / 5.16 / 3.82 / 7.71 / 7.09
18 / 5.14 / 5.42 / 3.77 / 7.64 / 9.50 / 5.29 / 5.59 / 3.88 / 9.98 / 4.69 / 5.01 / 5.38 / 3.10 / 6.04 / 5.34
19 / 4.91 / 5.52 / 1.93 / 7.97 / 9.05 / 5.21 / 5.77 / 2.65 / 5.67 / 3.97 / 5.59 / 6.10 / 2.89 / 5.70 / 10.74
20 / 5.22 / 5.50 / 3.86 / 5.41 / 6.04 / 5.56 / 6.26 / 2.35 / 5.41 / 14.37 / 5.87 / 6.28 / 3.72 / 4.44 / 11.49
Chr Sec / Chr 4 / Chr 5 / Chr 6
Density (%) / Frequency (%) / Density (%) / Frequency (%) / Density (%) / Frequency (%)
Gene / Non-TE / TE / T-DNA / Tos17 / Gene / Non-TE / TE / T-DNA / Tos17 / Gene / Non-TE / TE / T-DNA / Tos17

1 / 5.09 / 4.66 / 6.23 / 3.60 / 2.61 / 5.45 / 6.25 / 3.09 / 5.50 / 4.51 / 5.85 / 6.20 / 4.70 / 6.80 / 11.15
2 / 5.04 / 3.51 / 9.09 / 1.48 / 0.52 / 5.13 / 5.65 / 3.59 / 5.16 / 12.98 / 5.64 / 6.98 / 1.37 / 7.56 / 7.31
3 / 5.13 / 4.54 / 6.69 / 1.94 / 1.31 / 4.75 / 4.91 / 4.26 / 4.75 / 3.96 / 5.44 / 6.18 / 3.08 / 5.72 / 18.89
4 / 5.29 / 4.78 / 6.62 / 2.28 / 1.44 / 4.60 / 4.40 / 5.18 / 4.41 / 3.35 / 5.03 / 5.86 / 2.39 / 7.18 / 6.81
5 / 5.18 / 3.70 / 9.09 / 1.48 / 1.31 / 4.70 / 5.28 / 3.01 / 3.19 / 1.50 / 4.58 / 5.48 / 1.71 / 6.16 / 1.49
6 / 4.95 / 3.02 / 10.06 / 1.71 / 0.07 / 5.06 / 4.68 / 6.18 / 1.90 / 1.02 / 4.81 / 4.15 / 6.92 / 5.53 / 1.36
7 / 4.88 / 3.38 / 8.83 / 1.88 / 0.33 / 4.83 / 2.75 / 10.94 / 2.85 / 0.00 / 4.69 / 3.85 / 7.35 / 3.94 / 4.40
8 / 4.88 / 4.05 / 7.08 / 3.60 / 0.26 / 4.60 / 2.75 / 10.03 / 2.17 / 0.55 / 4.99 / 4.65 / 6.07 / 4.96 / 0.87
9 / 4.56 / 4.19 / 5.52 / 2.00 / 0.39 / 4.62 / 2.95 / 9.52 / 1.49 / 0.41 / 4.89 / 4.92 / 4.79 / 2.80 / 0.19
10 / 4.77 / 4.49 / 5.52 / 2.74 / 0.39 / 4.66 / 4.17 / 6.10 / 2.85 / 1.50 / 5.15 / 3.69 / 9.83 / 1.97 / 0.62
11 / 4.70 / 5.52 / 2.53 / 3.48 / 2.22 / 5.06 / 4.26 / 7.44 / 2.44 / 0.89 / 4.58 / 3.99 / 6.50 / 2.41 / 0.37
12 / 4.57 / 5.37 / 2.47 / 6.34 / 3.72 / 4.87 / 5.00 / 4.51 / 3.60 / 3.01 / 4.77 / 3.26 / 9.57 / 2.92 / 0.12
13 / 5.04 / 6.01 / 2.47 / 6.45 / 8.88 / 4.77 / 5.22 / 3.43 / 6.24 / 3.14 / 5.56 / 4.95 / 7.52 / 2.10 / 0.68
14 / 5.46 / 6.72 / 2.14 / 7.36 / 12.21 / 4.43 / 5.08 / 2.51 / 6.92 / 1.23 / 4.36 / 4.25 / 4.70 / 4.07 / 1.42
15 / 5.38 / 6.35 / 2.79 / 7.02 / 5.75 / 5.28 / 5.79 / 3.76 / 8.89 / 4.03 / 5.03 / 4.87 / 5.56 / 3.49 / 2.29
16 / 4.98 / 5.81 / 2.79 / 6.91 / 4.51 / 5.32 / 6.10 / 3.01 / 10.04 / 16.67 / 4.54 / 4.81 / 3.68 / 4.83 / 6.75
17 / 4.81 / 5.62 / 2.66 / 11.19 / 9.21 / 5.34 / 5.85 / 3.84 / 6.65 / 3.07 / 4.75 / 4.95 / 4.10 / 5.65 / 3.16
18 / 4.59 / 5.52 / 2.14 / 11.64 / 27.11 / 5.87 / 6.64 / 3.59 / 5.63 / 8.06 / 4.71 / 5.24 / 2.99 / 7.94 / 3.28
19 / 5.20 / 5.93 / 3.25 / 6.91 / 8.82 / 5.36 / 6.36 / 2.42 / 8.55 / 23.70 / 5.38 / 5.80 / 4.02 / 8.77 / 20.43
20 / 5.52 / 6.84 / 2.01 / 9.99 / 8.95 / 5.30 / 5.88 / 3.59 / 6.78 / 6.42 / 5.26 / 5.91 / 3.16 / 5.21 / 8.42
Chr Sec / Chr 7 / Chr 8 / Chr 9
Density (%) / Frequency (%) / Density (%) / Frequency (%) / Density (%) / Frequency (%)
Gene / nonTE / TE / T-DNA / Tos17 / Gene / nonTE / TE / T-DNA / Tos17 / Gene / nonTE / TE / T-DNA / Tos17
1 / 5.67 / 6.25 / 3.75 / 6.79 / 15.09 / 4.84 / 5.92 / 1.70 / 6.20 / 5.03 / 4.82 / 3.74 / 7.88 / 3.75 / 1.46
2 / 5.21 / 5.98 / 2.66 / 5.27 / 3.60 / 5.14 / 5.15 / 5.09 / 4.90 / 3.02 / 4.94 / 4.02 / 7.55 / 2.41 / 0.27
3 / 5.35 / 6.45 / 1.74 / 4.81 / 5.86 / 5.00 / 5.61 / 3.22 / 5.22 / 11.76 / 5.02 / 3.82 / 8.44 / 5.00 / 0.37
4 / 4.72 / 5.53 / 2.01 / 5.80 / 9.38 / 5.12 / 5.58 / 3.75 / 2.68 / 7.97 / 4.62 / 3.90 / 6.66 / 2.23 / 0.64
5 / 4.40 / 4.10 / 5.40 / 2.29 / 0.55 / 5.09 / 3.74 / 9.03 / 3.52 / 10.98 / 5.08 / 3.82 / 8.66 / 1.70 / 0.37
6 / 5.27 / 4.98 / 6.23 / 2.21 / 0.94 / 5.00 / 4.33 / 6.97 / 2.74 / 0.31 / 4.79 / 4.33 / 6.10 / 1.97 / 3.02
7 / 4.63 / 3.46 / 8.52 / 1.30 / 0.70 / 4.64 / 4.45 / 5.18 / 3.46 / 1.08 / 4.44 / 3.67 / 6.66 / 2.32 / 1.55
8 / 5.25 / 4.12 / 8.97 / 2.37 / 0.23 / 5.12 / 4.72 / 6.26 / 3.07 / 0.62 / 4.04 / 2.42 / 8.66 / 1.34 / 1.01
9 / 4.48 / 3.65 / 7.23 / 2.21 / 0.16 / 5.44 / 4.08 / 9.38 / 2.61 / 0.39 / 4.91 / 4.91 / 4.88 / 1.61 / 2.10
10 / 4.67 / 3.27 / 9.34 / 1.60 / 0.94 / 5.09 / 3.40 / 10.01 / 2.42 / 0.08 / 5.25 / 4.84 / 6.44 / 2.32 / 1.28
11 / 4.55 / 4.10 / 6.04 / 3.74 / 1.49 / 5.07 / 4.42 / 6.97 / 2.15 / 0.93 / 4.94 / 5.38 / 3.66 / 4.47 / 2.47
12 / 4.59 / 3.98 / 6.59 / 2.52 / 0.39 / 5.16 / 4.42 / 7.33 / 3.33 / 1.08 / 5.25 / 5.23 / 5.33 / 5.36 / 6.40
13 / 4.84 / 4.29 / 6.68 / 3.74 / 2.11 / 4.70 / 4.66 / 4.83 / 2.74 / 1.08 / 4.94 / 5.30 / 3.88 / 11.62 / 0.64
14 / 4.76 / 5.01 / 3.94 / 5.04 / 5.00 / 4.61 / 5.03 / 3.40 / 4.11 / 2.78 / 4.50 / 5.03 / 3.00 / 8.94 / 2.19
15 / 5.33 / 5.78 / 3.85 / 5.57 / 5.00 / 4.38 / 4.72 / 3.40 / 4.05 / 2.17 / 5.02 / 5.93 / 2.44 / 8.04 / 2.10
16 / 5.18 / 5.95 / 2.66 / 8.55 / 2.66 / 4.75 / 5.03 / 3.93 / 7.70 / 2.17 / 5.22 / 6.20 / 2.44 / 8.22 / 3.84
17 / 5.42 / 5.92 / 3.75 / 8.93 / 7.74 / 4.64 / 5.34 / 2.59 / 10.70 / 0.77 / 5.51 / 6.86 / 1.66 / 7.24 / 5.58
18 / 5.52 / 6.14 / 3.48 / 10.00 / 10.56 / 5.28 / 6.20 / 2.59 / 5.94 / 7.42 / 5.14 / 6.51 / 1.22 / 8.40 / 9.14
19 / 5.10 / 5.23 / 4.67 / 7.48 / 14.54 / 5.07 / 6.04 / 2.23 / 15.21 / 21.42 / 5.57 / 6.90 / 1.78 / 6.43 / 40.04
20 / 5.04 / 5.81 / 2.47 / 9.77 / 13.06 / 5.87 / 7.15 / 2.14 / 7.25 / 18.95 / 6.00 / 7.18 / 2.66 / 6.61 / 15.54


Chr Sec / Chr 10 / Chr 11 / Chr 12
Density (%) / Frequency (%) / Density (%) / Frequency (%) / Density (%) / Frequency (%)
Gene / Non-TE / TE / T-DNA / Tos17 / Gene / Non-TE / TE / T-DNA / Tos17 / Gene / Non-TE / TE / T-DNA / Tos17

1 / 4.88 / 4.55 / 5.87 / 3.26 / 2.02 / 6.07 / 7.20 / 2.67 / 10.19 / 11.27 / 5.85 / 7.27 / 2.47 / 8.41 / 23.47
2 / 5.01 / 5.14 / 4.61 / 3.56 / 6.15 / 5.45 / 6.59 / 2.02 / 10.83 / 15.15 / 5.87 / 7.17 / 2.78 / 7.19 / 3.27
3 / 4.82 / 4.62 / 5.45 / 3.56 / 1.44 / 4.79 / 5.49 / 2.67 / 4.82 / 15.32 / 5.00 / 6.20 / 2.15 / 5.43 / 8.89
4 / 5.06 / 4.62 / 6.39 / 2.34 / 1.35 / 5.29 / 5.46 / 4.78 / 5.64 / 0.69 / 5.42 / 5.60 / 5.01 / 5.70 / 2.43
5 / 4.82 / 4.30 / 6.39 / 1.42 / 0.29 / 4.99 / 5.06 / 4.78 / 5.64 / 5.42 / 4.91 / 4.36 / 6.21 / 3.24 / 5.20
6 / 5.11 / 3.71 / 9.33 / 2.14 / 0.96 / 4.63 / 4.51 / 4.96 / 5.19 / 3.27 / 4.67 / 4.59 / 4.85 / 4.47 / 0.75
7 / 5.48 / 3.85 / 10.38 / 3.36 / 0.29 / 4.99 / 4.73 / 5.79 / 2.91 / 0.86 / 5.52 / 4.79 / 7.24 / 3.86 / 1.17
8 / 4.98 / 3.95 / 8.07 / 1.83 / 0.67 / 4.42 / 3.96 / 5.79 / 2.91 / 0.34 / 5.09 / 4.32 / 6.92 / 3.16 / 1.01
9 / 4.80 / 4.02 / 7.13 / 3.76 / 0.58 / 5.31 / 3.69 / 10.20 / 3.00 / 0.34 / 4.83 / 3.72 / 7.48 / 2.89 / 0.00
10 / 4.48 / 5.00 / 2.94 / 6.00 / 0.38 / 5.08 / 4.30 / 7.44 / 2.91 / 0.09 / 5.09 / 4.02 / 7.64 / 2.80 / 1.01
11 / 4.90 / 4.34 / 6.60 / 2.24 / 4.04 / 5.18 / 4.09 / 8.46 / 2.09 / 0.34 / 4.43 / 3.59 / 6.44 / 1.75 / 0.17
12 / 4.46 / 4.72 / 3.67 / 5.60 / 1.92 / 4.90 / 4.73 / 5.42 / 4.82 / 1.46 / 4.60 / 3.35 / 7.56 / 2.10 / 0.42
13 / 5.19 / 5.73 / 3.56 / 3.66 / 3.75 / 4.53 / 4.36 / 5.06 / 2.91 / 1.55 / 4.65 / 3.52 / 7.32 / 3.42 / 1.34
14 / 5.01 / 5.45 / 3.67 / 6.61 / 26.54 / 5.06 / 5.55 / 3.58 / 5.00 / 1.20 / 4.79 / 4.59 / 5.25 / 4.29 / 1.84
15 / 5.45 / 5.80 / 4.40 / 7.73 / 3.75 / 4.92 / 4.82 / 5.24 / 4.09 / 7.66 / 5.02 / 5.26 / 4.46 / 6.22 / 1.59
16 / 4.33 / 5.28 / 1.47 / 4.78 / 5.67 / 4.72 / 5.22 / 3.22 / 8.92 / 9.12 / 4.46 / 4.46 / 4.46 / 3.86 / 0.92
17 / 5.17 / 6.29 / 1.78 / 8.34 / 14.71 / 4.88 / 5.34 / 3.49 / 5.64 / 3.70 / 4.98 / 5.50 / 3.74 / 7.27 / 4.69
18 / 5.93 / 6.50 / 4.19 / 8.85 / 1.83 / 4.76 / 4.91 / 4.32 / 3.18 / 3.70 / 4.65 / 5.46 / 2.70 / 7.54 / 13.91
19 / 4.96 / 6.22 / 1.15 / 9.05 / 17.21 / 4.99 / 4.88 / 5.33 / 4.28 / 9.12 / 4.98 / 6.00 / 2.55 / 6.05 / 16.85
20 / 5.17 / 5.91 / 2.94 / 11.90 / 6.44 / 5.04 / 5.12 / 4.78 / 5.00 / 9.38 / 5.19 / 6.20 / 2.78 / 10.34 / 11.06

Each chromosome is divided equally into 20 sections.

T-DNA FSTs were collected from TRIM and OTL libraries.

Frequency = number of T-DNA or Tos17 inserts in each section / total number of T-DNA or Tos17 inserts in each chromosome.

Grey bar indicates the centromeric region of each chromosome.

Densities or integration frequencies: green, < 1%; black, 1-10%; blue, 10-20%; red, > 20%.
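The per-section frequency defined in the notes above can be sketched as follows. This is a minimal Python illustration, assuming that insertion positions are given as 0-based integer coordinates along a pseudomolecule of known length. The function name and the example coordinates are hypothetical.

```python
def section_frequencies(positions, chrom_length, n_sections=20):
    """Return the percentage of inserts falling in each of n equal sections."""
    counts = [0] * n_sections
    for pos in positions:
        if not 0 <= pos < chrom_length:
            raise ValueError(f"position {pos} lies outside the chromosome")
        # Integer arithmetic maps positions 0..chrom_length-1 to sections 0..n-1.
        counts[pos * n_sections // chrom_length] += 1
    total = len(positions)
    if total == 0:
        return [0.0] * n_sections
    # Frequency = inserts in the section / total inserts on the chromosome.
    return [100.0 * c / total for c in counts]


if __name__ == "__main__":
    # Made-up insert coordinates on a 1,000-bp toy chromosome.
    toy_positions = [5, 40, 49, 51, 510, 990, 999]
    for section, freq in enumerate(section_frequencies(toy_positions, 1000), start=1):
        if freq:
            print(section, round(freq, 2))
```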


Table S6 Correlations of integration frequency of T-DNA and Tos17 and transposable elements and non-transposable elements over 12 rice chromosomes

Chromosome / T-DNA vs. non-TE / T-DNA vs. TE / Tos17 vs. non-TE / Tos17 vs. TE
1 / 0.723 / ** / -0.815 / *** / 0.755 / ** / -0.626 / **
2 / 0.607 / ** / -0.695 / ** / 0.663 / ** / -0.365
3 / 0.573 / ** / -0.491 / * / 0.871 / *** / -0.622 / **
4 / 0.787 / *** / -0.816 / *** / 0.595 / ** / -0.638 / **
5 / 0.741 / ** / -0.714 / ** / 0.616 / ** / -0.518 / *
6 / 0.685 / ** / -0.714 / ** / 0.649 / ** / -0.453 / *
7 / 0.807 / *** / -0.773 / *** / 0.691 / ** / -0.623 / **
8 / 0.568 / ** / -0.659 / ** / 0.664 / ** / -0.453 / *
9 / 0.694 / ** / -0.778 / *** / 0.594 / ** / -0.502 / *
10 / 0.817 / *** / -0.673 / ** / 0.537 / * / -0.537 / *
11 / 0.873 / *** / -0.654 / ** / 0.687 / ** / -0.673 / **
12 / 0.835 / *** / -0.830 / *** / 0.710 / ** / -0.725 / **
Overall / 0.730 / *** / -0.702 / *** / 0.638 / *** / -0.527 / ***

TE: transposable element; non-TE: non-transposable element.

Values are Pearson correlation coefficients. ***: p < 0.0001, **: p < 0.01, *: p < 0.05.

Each chromosome is divided equally into 20 sections.
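For reference, a Pearson correlation coefficient between two per-section profiles (for example, T-DNA frequency and non-TE gene density over the 20 sections of one chromosome) can be computed as in the Python sketch below. The function name is hypothetical, the example values are made up and are not taken from Table S5, and the sketch does not compute p-values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up per-section values, for illustration only.
    non_te_density = [5.0, 6.0, 4.0, 3.0, 5.5]
    t_dna_frequency = [5.2, 7.1, 3.9, 2.0, 6.0]
    print(round(pearson_r(non_te_density, t_dna_frequency), 3))
```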


Table S7 Examples of putative T-DNA tagged genes in TRIM database

Gene / Knockout gene^a (mutant no.) / Activated gene^b (mutant no.) / Gene / Knockout gene^a (mutant no.) / Activated gene^b (mutant no.)
Auxin / / / Transcription factors
Synthesis-related / 5 / 3 / MADS / 10 / 5
Influx-related / 4 / 2 / Myb / 42 / 38
Responsive / 14 / 12 / WRKY / 9 / 20
GA / / / Zinc finger / 54 / 92
Synthesis-related / 8 / 6 / bZIP / 3 / 5
Responsive / 6 / 6 / bHLH / 6 / 7
Ethylene / / / Ribosomal proteins
Synthesis-related / 1 / 2 / Cytosol / 36 / 58
Receptor / 1 / 0 / Chloroplast / 5 / 5
Responsive / 0 / 9 / Mitochondria / 4 / 2

a T-DNA insertion within 1000 bp upstream from the start codon and 200 bp downstream from the stop codon of putative genes.

b Within 10 kb upstream or downstream of the start codon.
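The two footnote criteria can be expressed as a small classifier. The Python sketch below is one possible reading, not the procedure used to build the TRIM database. It assumes that coordinates are given on the forward strand of the pseudomolecule, that "upstream" and "downstream" are relative to the direction of transcription, and that the knockout window runs continuously from 1000 bp upstream of the start codon to 200 bp downstream of the stop codon. All names are hypothetical.

```python
def classify_insertion(pos, start_codon, stop_codon, strand="+"):
    """Return (is_knockout_candidate, is_activation_candidate) for one insertion."""
    if strand == "+":
        rel_start = pos - start_codon  # negative values lie upstream of the start codon
        rel_stop = pos - stop_codon    # positive values lie downstream of the stop codon
    elif strand == "-":
        rel_start = start_codon - pos
        rel_stop = stop_codon - pos
    else:
        raise ValueError("strand must be '+' or '-'")
    # Footnote a: from 1000 bp upstream of the start codon
    # to 200 bp downstream of the stop codon.
    knockout = rel_start >= -1000 and rel_stop <= 200
    # Footnote b: within 10 kb upstream or downstream of the start codon.
    activation = abs(rel_start) <= 10000
    return knockout, activation


if __name__ == "__main__":
    # Made-up gene on the forward strand: start codon at 1000, stop codon at 3000.
    for pos in (500, 3100, 9000, 12000):
        print(pos, classify_insertion(pos, 1000, 3000))
```

Under this reading a single insertion can satisfy both criteria. Whether it is relevant as an activation candidate also depends on the vector, since only pTag8 carries the CaMV35S enhancers.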


Figure S1. Schematic diagrams of constructs pTag4 and pTag8 for gene knockout and activation tagging.

RB and LB, right and left border of T-DNA, respectively; D/A, three putative splicing donor and acceptor sites; GUS, β-glucuronidase cDNA; pBS, backbone of plasmid pBluescript; p35S, promoter of cauliflower mosaic virus 35S RNA (CaMV35S) gene; Hph, hygromycin phosphotransferase gene; Nos 3’, terminator of nopaline synthase gene; 35S8xE, eight tandem repeats of the CaMV35S enhancer (total 2.4 kb). Distance between RB and LB of T-DNA is approximately 7.5 kb in Tag4 and 10 kb in Tag8.

Text for Figure S1

Two binary vectors were constructed for T-DNA insertional mutagenesis of the rice genome. Plasmid pTag4 contains multiple splicing donors and acceptors and a promoterless GUS gene located immediately downstream of the right border, and a CaMV35S promoter-hygromycin phosphotransferase (Hph) chimeric gene as a selection marker (Supplementary Figure 1a). pTag4 is a bi-functional vector designed for promoter trapping and gene knockout tagging. Plasmid pTag8 has a structure similar to pTag4, except that 8 tandem copies of the CaMV35S enhancer were placed upstream of the left border (Supplementary Figure 1b). pTag8 is a tri-functional vector designed for promoter trapping, gene knockout, and gene activation tagging.


Figure S2. Distribution profiles of gene density, callus-expressed genes and T-DNA integrations are correlated over 12 rice chromosomes.

Graphs for 12 rice chromosomes show from left to right the distance along the pseudomolecule (in Mb), the density of callus EST sites (each peak represents the number of EST counts on a 100-kb scale), the gene density (white to black color boxes indicate non-transposable element annotated gene density on a 100-kb scale), and the density of T-DNA and Tos17 integrations (each peak represents the number of T-DNA or Tos17 integration sites on a 100-kb scale). Numbers of ESTs higher than 100 or numbers of T-DNA and Tos17 integrations higher than 50 are indicated at the top of their corresponding peaks. The centromere region, as calculated by the location of CentO clusters (155-bp long CentO repeating hundreds of times) extending 200 kb toward both sides, is indicated as a vertical bar at the left side of the pseudomolecule. The data plotted for T-DNA FSTs are from the TRIM and OTL (Sallaud et al., 2004) databases, those for callus ESTs are from the NCBI database, those for Tos17 FSTs are from Miyao et al. (2003), and those for the 12 rice chromosomes are from the TIGR version 3.0 pseudomolecules of rice cv. Nipponbare.

Text for Figure S2

To determine whether T-DNA insertion into the rice genome was facilitated by transcription during callus culture, a total of 24,438 ESTs derived from cultured rice calli were retrieved from the NCBI database and analyzed. TIGR version 3 rice pseudo-molecules were used to represent 12 chromosomes, and densities of non-TE annotated genes were plotted on all chromosomes. A density graph of callus ESTs at every 100-kb interval was also plotted. A density graph of T-DNA insertions was established by plotting FSTs from the OTL and TRIM T-DNA tagged libraries at 100-kb intervals. Distributions of gene density and T-DNA FSTs were mostly non-uniform across the 12 chromosomes (Supplementary data Figure 1). The pericentromeric regions of the 12 rice chromosomes are highly heterochromatinized (Cheng et al., 2001). Density profiles of T-DNA integration sites followed the distribution of gene density, with frequencies higher at subtelomeric regions and lower at pericentromeric regions of each chromosome. However, significant insertion events at pericentromeric regions were also observed. Regions with high T-DNA integration events showed a high density of callus EST distribution, indicating a high correlation between T-DNA integration and callus gene expression.


Figure S3. T-DNA integration is less prone to hot and cold spots in the rice genome than Tos17.

Plot of integration events of Tos17 (triangles) and T-DNA (arrowheads) in a BAC clone (accession no. AP005862, 118,635 bp in length) located at 85 cM of chromosome 9. The upper panel illustrates the entire BAC region and the lower panel shows a zoomed-in view of Tos17 integration events in the 29-44-kb region of this BAC clone. Every vertical line in the upper and lower panels represents a 1-kb and a 0.1-kb region, respectively. An upright triangle or arrowhead indicates forward orientation of integration, and an inverted triangle indicates reverse orientation of integration.


Figure S4. T-DNA is integrated into regions with broader GC contents as compared with Tos17.

The GC content within 120 bp of T-DNA and Tos17 integration points and of 120-bp randomly selected rice genomic DNA fragments was calculated, and the frequency was plotted at 5 % intervals. The frequency distributions of FSTs from the TRIM (11,992 entries) and OTL (7,480 entries) databases are shown as red and yellow lines, respectively; FSTs from the Tos17 database (18,024 entries) are shown as a blue line, and genomic sequences are shown as a green line.

Text for Figure S4

To determine whether there is any difference in the GC content of the integration regions of these two elements, GC contents in 120-bp regions centered on T-DNA or Tos17 integration sites were determined. FSTs from the TRIM, OTL, and Tos17 databases were analyzed. The frequency of GC content (y axis) was plotted against 5 % GC content intervals in the rice genome (x axis). Tos17 integration (blue line) mainly occurred at regions of 35-45 % GC content, with almost no integration in regions of low (< 20 %) or high (> 70 %) GC content. In contrast, T-DNA integration (red and yellow lines) centered at regions of 35-50 % GC, extending to regions with low (15-20 %) or high (75-80 %) GC content. There was almost no integration of T-DNA at regions with GC content < 10 % or > 85 %. These analyses indicated a narrower range of GC content at Tos17 integration sites than at T-DNA integration sites.

For comparison with the GC frequency distributions of T-DNA and Tos17, a similar analysis of rice genomic sequences was also performed. Twenty thousand 120-bp rice genomic DNA fragments were randomly selected from the 12 rice chromosomes and the GC content in each fragment was calculated. The GC frequency distribution of genomic sequences (green line) centered at 35-45 % GC content regions. The average percentage GC content and standard deviation in integration regions of Tos17, T-DNA in the OTL library, T-DNA in the TRIM library, and random rice genomic DNA sequences were 41.33±6.94, 43.96±9.74, 47.53±10.55 and 43.53±13.02, respectively. These analyses indicated that, although the average GC contents of Tos17 and T-DNA integration regions were similar, the frequency distribution pattern of GC content for Tos17 FSTs was narrower and more symmetrical, while those for the rice genomic DNA and the two sets of T-DNA FSTs were broader and shifted toward higher GC content regions.
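The GC-content analysis described above can be sketched in Python as follows. This is an illustrative sketch rather than the original analysis code. It assumes that each 120-bp region is already available as a sequence string, that ambiguous bases such as N are excluded from the denominator, and that the reported standard deviation is the sample standard deviation. The function names and the example sequences are hypothetical.

```python
import statistics


def gc_percent(seq):
    """Percentage of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / (gc + at)


def gc_histogram(gc_values, bin_width=5):
    """Frequency (%) of GC values per bin: [0, 5), [5, 10), ..., [95, 100]."""
    n_bins = 100 // bin_width
    counts = [0] * n_bins
    for gc in gc_values:
        # A GC content of exactly 100 % is placed in the last bin.
        counts[min(int(gc // bin_width), n_bins - 1)] += 1
    total = len(gc_values)
    return [100.0 * c / total for c in counts] if total else [0.0] * n_bins


if __name__ == "__main__":
    # Made-up short sequences standing in for 120-bp flanking regions.
    regions = ["ATGCGCAT", "GGGCCCAT", "ATATATGC", "ATGGCCTA"]
    values = [gc_percent(r) for r in regions]
    print(statistics.mean(values), statistics.stdev(values))
    for i, freq in enumerate(gc_histogram(values)):
        if freq:
            print(f"{5 * i}-{5 * i + 5} %", freq)
```

Plotting the histogram values for each data set against the bin midpoints would give frequency curves of the kind shown in Figure S4.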
