Proofreading and Secondary Structure Processing Determine the Orientation Dependence of CAG·CTG Trinucleotide Repeat Instability in Escherichia coli (Supplementary Information)

Logistic regression analysis of CAG·CTG repeat instability
CAG repeats

A logistic regression model was fitted to the CAG repeat groups. Each group is compared directly with DL1995 (CAG)75. The t-probabilities indicate whether a group has a significantly higher or lower instability proportion than (CAG)75. The final column is the antilog of the estimate, i.e. the odds ratio comparing each group with (CAG)75 (for the Constant row, the odds for (CAG)75 itself), and is a measure of relative instability. Values greater than 1 indicate more instability than (CAG)75, and values less than 1 indicate less.
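
The model described above can be written out explicitly. What follows is a sketch of the usual logistic regression parameterisation with a reference level, not a formula taken from the source; the symbols p_g, \beta_0 and \beta_g are introduced here for illustration only. Here p_g is the proportion modelled for group g, \beta_0 is the Constant and \beta_g is the tabulated estimate for group g.

```latex
\[
\log\frac{p_g}{1-p_g} = \beta_0 + \beta_g ,
\qquad \beta_{\mathrm{ref}} = 0 \quad \text{for the reference group DL1995 (CAG)75},
\]
\[
\mathrm{OR}_g
  = \frac{p_g/(1-p_g)}{p_{\mathrm{ref}}/(1-p_{\mathrm{ref}})}
  = e^{\beta_g} .
\]
```

Under this parameterisation the "antilog of estimate" column is e^{\beta_g}, which is why it is compared against 1 rather than against 0.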

Summary of analysis

Source       d.f.   deviance   mean deviance   deviance ratio   approx F pr.
Regression      6      160.2          26.692            15.23          <.001
Residual      305      534.6           1.753
Total         311      694.8           2.234

Dispersion parameter is estimated to be 1.75 from the residual deviance.
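
The quantities in the summary of analysis are related by simple arithmetic, which the following minimal sketch reproduces from the tabulated CAG deviances and degrees of freedom. The variable names are illustrative and not part of the original analysis. Because the inputs are the rounded values printed in the table, the results agree with the tabulated figures only to rounding.

```python
# Values copied from the CAG "Summary of analysis" table (rounded as printed).
regression_df, regression_deviance = 6, 160.2
residual_df, residual_deviance = 305, 534.6

# Mean deviance = deviance / degrees of freedom.
regression_mean_deviance = regression_deviance / regression_df
residual_mean_deviance = residual_deviance / residual_df

# The dispersion parameter estimated from the residual deviance
# is the residual mean deviance.
dispersion = residual_mean_deviance

# Deviance ratio: regression mean deviance over residual mean deviance,
# referred approximately to an F distribution on (6, 305) d.f.
deviance_ratio = regression_mean_deviance / residual_mean_deviance

print(f"regression mean deviance: {regression_mean_deviance:.3f}")
print(f"dispersion:               {dispersion:.3f}")
print(f"deviance ratio:           {deviance_ratio:.2f}")
```

A dispersion estimate above 1 suggests more variation than a pure binomial model would give, which is presumably why the deviance ratio is assessed against F rather than chi-squared.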

Estimates of parameters

Parameter             estimate    s.e.   t(305)   t pr.   antilog of estimate
Constant                 1.890   0.179    10.56   <.001                 6.619
DL2301 (dnaQ)           -0.617   0.231    -2.67   0.008                0.5396
DL2302 (mutS)           -0.464   0.235    -1.97   0.050                0.6287
DL2303 (sbcCD)           0.653   0.293     2.23   0.027                 1.921
DL2976 (dnaQ sbcCD)     -0.846   0.226    -3.75   <.001                0.4291
DL2250 (CAG45)            3.57    1.22     2.93   0.004                 35.49
DL2639 (CAG84)          -0.477   0.235    -2.03   0.043                0.6204

Parameters for factors are differences compared with the reference level:

Factor   Reference level
Group    DL1995 (CAG75)
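
To make the link between the estimates and the antilog column concrete, the sketch below recomputes the antilogs from the tabulated CAG estimates and also converts each group's log odds to a fitted proportion. The helper names `odds_ratio` and `fitted_proportion` are hypothetical, introduced only for this illustration. The fitted proportions are derived from the rounded estimates and are not reported in the source.

```python
import math

# Estimates on the log-odds scale, copied from the CAG parameter table.
constant = 1.890  # reference group DL1995 (CAG75)
estimates = {
    "DL2301 (dnaQ)": -0.617,
    "DL2302 (mutS)": -0.464,
    "DL2303 (sbcCD)": 0.653,
    "DL2976 (dnaQ sbcCD)": -0.846,
    "DL2250 (CAG45)": 3.57,
    "DL2639 (CAG84)": -0.477,
}


def odds_ratio(estimate):
    """Antilog of a log odds ratio, i.e. the odds ratio vs the reference."""
    return math.exp(estimate)


def fitted_proportion(constant, estimate=0.0):
    """Inverse logit of the group's log odds (constant + group estimate)."""
    log_odds = constant + estimate
    return 1.0 / (1.0 + math.exp(-log_odds))


print(f"reference odds: {math.exp(constant):.4g}, "
      f"fitted proportion: {fitted_proportion(constant):.3f}")
for group, est in estimates.items():
    print(f"{group}: odds ratio {odds_ratio(est):.4g}, "
          f"fitted proportion {fitted_proportion(constant, est):.3f}")
```

The odds ratios are multiplicative on the odds scale, so they should not be read as ratios of proportions; the inverse logit step is needed to get back to a proportion.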

CTG repeats

A logistic regression model was fitted to the CTG repeat groups. Each group is compared directly with DL2009 (CTG)95. The t-probabilities indicate whether a group has a significantly higher or lower instability proportion than (CTG)95. The final column is the antilog of the estimate, i.e. the odds ratio comparing each group with (CTG)95 (for the Constant row, the odds for (CTG)95 itself), and is a measure of relative instability. Values greater than 1 indicate more instability than (CTG)95, and values less than 1 indicate less.

Summary of analysis

Source       d.f.   deviance   mean deviance   deviance ratio   approx F pr.
Regression      6      220.7          36.780            25.94          <.001
Residual      281      398.5           1.418
Total         287      619.2           2.157

Dispersion parameter is estimated to be 1.42 from the residual deviance.

Estimates of parameters

Parameter             estimate    s.e.   t(281)   t pr.   antilog of estimate
Constant                 3.505   0.323    10.85   <.001                 33.29
DL2445 (dnaQ)           -2.316   0.348    -6.66   <.001               0.09871
DL2300 (mutS)            0.158   0.589     0.27   0.788                 1.172
DL2104 (sbcCD)           0.158   0.475     0.33   0.739                 1.172
DL3046 (dnaQ sbcCD)     -1.704   0.359    -4.75   <.001                0.1820
DL1994 (CTG48)            1.95    1.13     1.73   0.085                 7.058
DL2305 (CTG140)         -1.481   0.365    -4.06   <.001                0.2275

Parameters for factors are differences compared with the reference level:

Factor   Reference level
Group    DL2009 (CTG95)
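
The t column in the CTG parameter table is the estimate divided by its standard error, and the t-probability is the corresponding two-sided tail probability. The sketch below recomputes both from the tabulated values. It makes one assumption that differs from the source: the tail probability uses a standard normal approximation in place of the t distribution on 281 d.f., so that only the Python standard library is needed. With this many degrees of freedom the two are close but not identical. The function names are hypothetical.

```python
import math

# (estimate, s.e.) pairs copied from the CTG parameter table.
ctg_rows = {
    "DL2445 (dnaQ)": (-2.316, 0.348),
    "DL2300 (mutS)": (0.158, 0.589),
    "DL2104 (sbcCD)": (0.158, 0.475),
    "DL3046 (dnaQ sbcCD)": (-1.704, 0.359),
    "DL1994 (CTG48)": (1.95, 1.13),
    "DL2305 (CTG140)": (-1.481, 0.365),
}


def t_statistic(estimate, se):
    """Estimate divided by its standard error."""
    return estimate / se


def approx_two_sided_p(t):
    """Two-sided tail probability under a standard normal approximation.

    Assumption: the source uses t on 281 d.f.; the normal distribution is
    substituted here as a close approximation for large d.f.
    """
    return math.erfc(abs(t) / math.sqrt(2.0))


for group, (est, se) in ctg_rows.items():
    t = t_statistic(est, se)
    print(f"{group}: t = {t:.2f}, approx. two-sided p = {approx_two_sided_p(t):.3f}")
```

On this reading, the mutS, sbcCD and (CTG)48 groups do not differ significantly from (CTG)95 at the 5% level, in line with the tabulated t-probabilities of 0.788, 0.739 and 0.085.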