Quantitative Structure-Activity Relationship (QSAR) studies: QSAR studies were performed on 60 mono-indolyl-quinones using the TSAR 3.0 suite of programs (Accelrys, San Diego, CA). Substitutions were considered at five indole positions (2, 4, 5, 6, and 7). The substituents were numbered for the QSAR studies as follows: substituent 1 corresponds to the 2-position on the indole; substituent 2, the 4-position; substituent 3, the 5-position; substituent 4, the 6-position; and substituent 5, the 7-position. We excluded substitutions on the indole nitrogen and the benzoquinone hydroxyls because only two compounds (ZL-194 and LD-I-206) had non-hydrogen substituents at these positions. We also excluded compounds with different quinone ring structures. Nineteen theoretical parameters were considered for each substituent, including PI (aromatic), molecular refractivity (aromatic), Verloop steric parameters (L, B1, B2, B3, B4), molecular mass, molecular surface area, molecular volume, moments of inertia (size 1, 2, 3; length 1, 2, 3), ellipsoidal volume, MlogP, and Wiener coefficient. Five empirical parameters were considered for each substituent, including Swain and Lupton F, Swain and Lupton R, sigma meta, sigma para, and Taft ES. TSAR was also used to perform molecular modeling studies of the compounds: 3D structures were generated with the CORINA algorithm (Accelrys, San Diego, CA) and then charge-optimized using COSMIC (Accelrys, San Diego, CA). Whole molecule parameters were calculated from these structures. All compounds were included in the analysis of whole molecule descriptors because substitutions were not explicitly considered. We used 16 whole molecule descriptors, including molecular mass, molecular surface area, molecular volume, ellipsoidal volume, lipole moments, moments of inertia (size and length), and log P, to generate QSAR models.

Multiple regression models were generated using F-stepping. Initially, models were limited to linear regression. Cross-validation, either by leaving out each of three groups of data or by leaving out individual rows with multiple random selections, was used to test predictive power (r2(CV)). The F-enter and F-leave parameters were systematically varied to optimize the model without including too many parameters. Partial F-values were inspected to look for obvious cut-offs. For this analysis, we considered theoretical, empirical and whole molecule parameters separately. Twenty-five independent variables were considered and eight variables were included in the final linear regression model using F-test stepping (F to enter = 5, F to leave = 5). Cross-validation by leaving out rows of data in multiple random repeats was performed to test predictive power. After running the linear regression models, the models were re-run with the parameters allowed to vary up to the 4th power. No improvement in fit was observed with the non-linear models, so non-linear regression models were subsequently excluded. The predicted values were correlated against the empirical values. The resulting model for IR activation using theoretical parameters shows a high correlation between predicted and measured activity and explains 94% of the variance in the data (R2 = 0.9369). The empirical and whole molecule parameters gave less predictive models (R2 = 0.7637 and 0.4398, respectively). The significant individual terms in the theoretical parameter model include MlogP, inertia moment 3 size, and Verloop B3/B4 on substituent 7, and Verloop B2/B3, MlogP, and inertia moment 2 length on substituent 6.
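
The F-stepping procedure described above can be illustrated with a minimal Python sketch. This is not the TSAR implementation; the function names (f_stepwise, cv_r2), the synthetic data, and the choice of NumPy are assumptions made for illustration only. At each pass the candidate with the largest partial F is added if it reaches F-to-enter, and the weakest included term is dropped if its partial F falls below F-to-leave; cross-validated r2 is then estimated by leaving out each of three random groups in turn.

```python
import numpy as np

def _design(X, cols):
    """Design matrix: intercept column plus the chosen descriptor columns."""
    return np.column_stack([np.ones(X.shape[0])] + [X[:, c] for c in cols])

def _rss(X, y, cols):
    """Residual sum of squares of an ordinary least-squares fit."""
    A = _design(X, cols)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def _partial_f(rss_small, rss_big, n, k_big):
    """Partial F for one extra term; k_big counts descriptors in the larger model."""
    df = n - k_big - 1  # residual degrees of freedom (intercept included)
    if df <= 0:
        return 0.0
    return (rss_small - rss_big) / max(rss_big / df, 1e-12)

def f_stepwise(X, y, f_enter=5.0, f_leave=5.0, max_steps=100):
    """Hypothetical helper: stepwise selection with F-to-enter / F-to-leave."""
    n, p = X.shape
    selected = []
    for _ in range(max_steps):
        changed = False
        # Forward step: add the best candidate if it passes F-to-enter.
        rss_cur = _rss(X, y, selected)
        scores = {c: _partial_f(rss_cur, _rss(X, y, selected + [c]), n, len(selected) + 1)
                  for c in range(p) if c not in selected}
        if scores:
            best = max(scores, key=scores.get)
            if scores[best] >= f_enter:
                selected.append(best)
                changed = True
        # Backward step: drop the weakest term if it falls below F-to-leave.
        if len(selected) > 1:
            rss_full = _rss(X, y, selected)
            scores = {c: _partial_f(_rss(X, y, [s for s in selected if s != c]),
                                    rss_full, n, len(selected))
                      for c in selected}
            worst = min(scores, key=scores.get)
            if scores[worst] < f_leave:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    return selected

def cv_r2(X, y, cols, n_groups=3, seed=0):
    """Cross-validated r2: leave out each of n_groups random groups in turn."""
    rng = np.random.default_rng(seed)
    groups = rng.permutation(len(y)) % n_groups
    press = 0.0
    for g in range(n_groups):
        train, test = groups != g, groups == g
        beta, *_ = np.linalg.lstsq(_design(X[train], cols), y[train], rcond=None)
        press += float(np.sum((y[test] - _design(X[test], cols) @ beta) ** 2))
    return 1.0 - press / float(np.sum((y - y.mean()) ** 2))

# Synthetic example (not the study data): 60 rows, 6 descriptors, 2 of them informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=60)
selected = f_stepwise(X, y)
q2 = cv_r2(X, y, selected)
print(sorted(selected), round(q2, 3))
```

Raising F-to-enter and F-to-leave makes the selection stricter, which is one way to keep the number of terms small relative to the number of compounds.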

Feed-forward neural networks were generated using the IR activity as measured by ELISA as the target parameter and the substituent or whole molecule parameters as variables. The ratio of variables to samples must be approximately 1:2 to prevent over-fitting of the data. As the reduced monoquinone library contained only 47 members, we wanted to limit the initial theoretical variables considered to 20-25 from the potential 95 variables (19 parameters, 5 positions). Data reduction was achieved by generating correlation coefficients between all potential variables. Variables with correlation coefficients greater than 0.90 were considered redundant and all but one were eliminated. We chose the 5 variables with the lowest correlations. Of the nineteen theoretical variables, the most divergent were Verloop L, molecular surface area, Wiener coefficient, inertia moment 1 length, and inertia moment 1 size, so these five variables were included for the theoretical parameter model. For the empirical parameters, we chose Swain and Lupton F, Swain and Lupton R, sigma meta, and sigma para on all positions and Taft ES on substituent 5. Two hidden nodes were predicted. The test RMS fit of the model was calculated by randomly excluding 10% of the data at each cycle and calculating the model fit. After modeling, the predicted values were plotted against the empirical values and correlation coefficients were generated. The best RMS fit of the model was compared to the test RMS fit to gauge the predictive value of the model; the RMS values should be similar for a stable predictive model. The parameter dependencies were determined by systematically varying each parameter and measuring the effect on the output variable. The neural network models for all three classes of parameters showed higher correlations than the linear regression models (R2 = 0.965, 0.900, and 0.7986, respectively). Here again, the theoretical parameters gave a more predictive model than the empirical or whole molecule parameters. For the theoretical parameters, 25 initial parameters were included and the resulting net configuration was 25-2-1. The best model RMS fit was 0.047 after 1720 cycles of training, compared to a test RMS fit of 0.090, indicating good predictive power. The major dependencies were analyzed. The parameters showing the greatest dependencies were Wiener coefficient, inertia moment length 2, Verloop L, inertia moment length 1, and inertia moment size 1 on substituent 7, and the Wiener coefficient and inertia moment size 1 on substituent 6. The empirical parameter model was also predictive (test RMS 0.116, model RMS 0.09), but the whole molecule model was not predictive (test RMS 0.260, model RMS 0.095).
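
The correlation-based data reduction described above can be sketched as follows. This is an illustrative Python example, not the procedure as run in TSAR: the function name drop_redundant, the greedy keep-the-first rule, the use of the absolute value of the correlation coefficient, and the synthetic columns are all assumptions. Constant columns should be removed beforehand, because their correlation coefficients are undefined.

```python
import numpy as np

def drop_redundant(X, names, threshold=0.90):
    """Hypothetical helper: keep one variable from each group of highly correlated columns.

    Columns are scanned in order; a column is kept only if its absolute
    correlation with every column already kept is at or below the threshold.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return [names[j] for j in keep]

# Synthetic example (not the study data): 47 rows, four candidate variables.
rng = np.random.default_rng(1)
a = rng.normal(size=47)
b = 2.0 * a + 0.01 * rng.normal(size=47)   # nearly a rescaled copy of a
c = rng.normal(size=47)                    # independent of a
d = -a + 0.01 * rng.normal(size=47)        # nearly the negative of a
X = np.column_stack([a, b, c, d])
kept = drop_redundant(X, ["a", "b", "c", "d"])
print(kept)
```

Which member of a correlated group survives depends on the column order, so in practice the variable that is easiest to interpret would be placed first.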

Figure S1. Glycogenolytic effect of DAQ-B1: A, 3T3-L1 adipocytes were stimulated with insulin (0.2, 1.7, 8.3 nM), DAQ-B1 (3-300 µM) or both for 2 h in the presence of 14C-glucose. Glycogen was extracted and precipitated counts were measured. Asterisks indicate statistical significance relative to basal incorporation; # indicates significance relative to insulin-stimulated values. B, 3T3-L1 adipocytes were stimulated with insulin (1.7 nM), DAQ-B1 (30 µM), lithium chloride (10 or 50 mM) or combinations thereof for 2 h in the presence of 14C-glucose. Glycogen was extracted and precipitated counts were measured as before. Asterisks indicate statistical significance relative to basal incorporation; # indicates significance relative to insulin-stimulated values. C, 3T3-L1 adipocytes were stimulated with insulin (8.3 nM) or DAQ-B1 (30 µM) or both for 2 h. Glycogen synthase activity was measured in cell lysates following published protocols (Metabolism 49, 962-968, 2000). Asterisks indicate statistical significance relative to basal values. D, 3T3-L1 adipocytes were pre-loaded with 14C-glucose in the presence of insulin for 2 h. The label was washed away, then the cells were stimulated with agonists (8.3 nM insulin, 30 µM DAQ, 10 µM forskolin, 100 nM isoproterenol) for 30 min. Glycogen was extracted and precipitated counts were measured as before. Asterisks indicate statistical significance relative to basal incorporation. E, 3T3-L1 adipocytes were stimulated with insulin (1.7 or 8.3 nM) or DAQ-B1 (30 µM) for 15 min at 37 °C in triplicate. Cells were lysed and extracts were blotted for phosphoGSK3 (Ser9/21).