SUPPLEMENTARY MATERIAL

S1 Methods

S1.1 Extracting zinc-binding sites from PDB

We downloaded 2506 zinc-containing PDB entries with less than 95% mutual sequence identity. Redundant chains were removed from the downloaded structures. Zinc sites with multiple conformations, a metal atom occupancy < 0.5, or a B factor > 90 were considered “abnormal” and discarded. For the remaining zinc sites, zinc ligand residues, i.e., residues containing a zinc-coordinating atom within 2.8 Å of the zinc ion, were identified. Sites for which fewer than three coordinating atoms could be found from the protein were also excluded. This is similar to the approach of Torrance, et al. (2008). The reasoning is that such sites are mostly located on the surface of the respective proteins and lack a link to biological function. The resulting dataset consisted of 2869 zinc-binding sites from 1965 PDB entries.
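The filtering criteria above can be summarized in a short sketch. The `ZincSite` record and its field names are hypothetical and introduced only for illustration; the thresholds are those stated in the text, and treating "within 2.8 Å" as an inclusive bound is an assumption.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the text of S1.1.
MIN_OCCUPANCY = 0.5
MAX_B_FACTOR = 90.0
MAX_COORD_DIST = 2.8   # Å; inclusive bound is an assumption
MIN_COORD_ATOMS = 3


@dataclass
class ZincSite:
    """Hypothetical record for one zinc ion and its protein neighbours."""
    occupancy: float
    b_factor: float
    has_alt_conformations: bool
    # Distances (Å) from the zinc ion to candidate coordinating protein atoms.
    ligand_atom_distances: List[float]


def keep_site(site: ZincSite) -> bool:
    """Return True if the site passes all filters described in S1.1."""
    if site.has_alt_conformations:
        return False
    if site.occupancy < MIN_OCCUPANCY or site.b_factor > MAX_B_FACTOR:
        return False
    n_coord = sum(1 for d in site.ligand_atom_distances if d <= MAX_COORD_DIST)
    return n_coord >= MIN_COORD_ATOMS
```

In this sketch a site with three protein atoms at 2.0, 2.1 and 2.3 Å is kept, whereas a site with only two atoms inside the 2.8 Å shell is dropped.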

S1.2 Defining templates of zinc-binding residue pairs

To extract the information needed by our prediction method from the experimentally determined zinc-binding sites above, a pair of zinc ligand residues contained in the same zinc site was used as the basic unit, or “template”. All possible templates were extracted from the above dataset of zinc sites. These templates fall into different chemical types depending on the types of their constituent residues. For example, a template formed by two histidine residues is of type H-H, and so on. In addition, the geometry of a template composed of residues A and B was represented by the following four parameters:

d or the distance between their Cα atoms,

A or the angle formed by the Cβ atom of residue A, the Cα atom of residue A and the Cα atom of residue B,

B or the angle formed by the Cβ atom of residue B, the Cα atom of residue B and the Cα atom of residue A,

ω or the torsional angle formed by the Cβ atom of residue A, the Cα atom of residue A, the Cα atom of residue B, and the Cβ atom of residue B.

The above four parameters completely specify the relative positions of the Cα and Cβ atoms of the two composing residues for a template.
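As an illustration of the four parameters, the following minimal sketch computes them from Cα and Cβ coordinates. The function names are hypothetical, coordinates are assumed to be (x, y, z) tuples in Å, and the sign convention of the torsion follows the common atan2 formulation, which may differ from the one used by the authors.

```python
import math


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def angle_deg(p, q, r):
    """Angle in degrees at vertex q formed by the points p, q, r."""
    u, v = _sub(p, q), _sub(r, q)
    cos_t = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def torsion_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees for the chain p0-p1-p2-p3."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def template_geometry(ca_a, cb_a, ca_b, cb_b):
    """Return (d, A, B, omega) for a residue pair A, B."""
    diff = _sub(ca_a, ca_b)
    d = math.sqrt(_dot(diff, diff))              # Cα(A)-Cα(B) distance
    ang_a = angle_deg(cb_a, ca_a, ca_b)          # Cβ(A)-Cα(A)-Cα(B)
    ang_b = angle_deg(cb_b, ca_b, ca_a)          # Cβ(B)-Cα(B)-Cα(A)
    omega = torsion_deg(cb_a, ca_a, ca_b, cb_b)  # Cβ(A)-Cα(A)-Cα(B)-Cβ(B)
    return d, ang_a, ang_b, omega
```

For two residues whose Cα atoms are 5 Å apart and whose Cβ atoms point in the same direction perpendicular to the Cα-Cα axis, this gives d = 5, A = B = 90° and ω = 0°.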

S1.3 Constructing a library of representative templates

In our prediction method, every possible residue pair in a target protein needs to be compared with every template residue pair. We found that the speed of the method could be increased without losing accuracy if the original 13,368 templates were replaced with a smaller number of representative templates. A library of representative templates was constructed by first grouping the original templates based on their chemical types and geometries, and then retaining only one representative for each group. Templates grouped together were always of the same chemical type. For each chemical type, each geometric property was divided into uniform bins covering the range [pmin, pmax], where p is one of {d, A, B, ω}, and pmin and pmax are the minimum and maximum values, respectively, of that property among all templates of the same chemical type. Templates in the same group have geometric properties that fall into the same bins. The half widths of the bins for each geometric property of each template type are given in Supplementary Table S3. This resulted in a reduced template library of 8011 representative templates. The bin widths were chosen such that the accuracy of predictions was not affected by using the reduced template library instead of the complete one.
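The grouping described above can be sketched as follows. This is only an illustration: the data layout and function names are hypothetical, the half widths in the usage note are made-up values rather than those of Supplementary Table S3, and keeping the first template encountered in each group as its representative is an assumption, since the text does not state how the representative is chosen.

```python
import math


def bin_index(value, p_min, half_width):
    """Index of the uniform bin (width 2 * half_width, starting at p_min)."""
    return int(math.floor((value - p_min) / (2.0 * half_width)))


def reduce_templates(templates, half_widths):
    """Keep one representative per (chemical type, bin tuple).

    templates:   list of (chem_type, (d, A, B, omega))
    half_widths: {chem_type: (hw_d, hw_A, hw_B, hw_omega)}
    """
    # Per-type minimum of each geometric property (pmin in the text).
    p_min = {}
    for chem, geom in templates:
        cur = p_min.get(chem)
        p_min[chem] = geom if cur is None else tuple(
            min(a, b) for a, b in zip(cur, geom))

    groups = {}
    for chem, geom in templates:
        bins = tuple(
            bin_index(v, lo, hw)
            for v, lo, hw in zip(geom, p_min[chem], half_widths[chem]))
        # Assumption: the first member seen represents the group.
        groups.setdefault((chem,) + bins, (chem, geom))
    return list(groups.values())
```

With hypothetical half widths of (0.25 Å, 5°, 5°, 10°), two H-H templates at (5.0, 90, 90, 10) and (5.1, 91, 92, 11) fall into the same bins and collapse to one representative, while an H-H template at d = 7.0 Å and a C-C template remain separate.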

S1.4 The training dataset and the independent test dataset

To perform an unbiased test of the prediction method, the target protein must be unrelated to any protein used to train the method or contributing to the template library. To achieve this, the target proteins used to train the parameters of our prediction method (the training dataset) and those used to test the final model (the independent test dataset) were chosen in the following way.

First, the proteins in the 1965 zinc-containing PDB entries were clustered at a sequence identity cutoff of 30% after multiple sequence alignment. Then from each cluster, one representative protein chain was culled using the CD-HIT program (Li and Godzik, 2006). The resulting 438 representative proteins were then randomly divided into a training dataset of 338 protein chains and an independent test dataset of 100 protein chains. Finally, all templates contributed by proteins contained in the independent test dataset, as well as by all homologs of these proteins (based on the 30% sequence identity cutoff), were removed from the reduced template library. There remained 6529 templates in the library.

We note that the training dataset contains 472 and the independent test dataset contains 136 experimentally determined zinc-binding sites, not including sites containing ligand residues from different peptide chains.

To check to what extent our method is sensitive to the presence of templates from proteins that are structurally similar to a test protein but have low sequence similarity to it (< 30% sequence identity), we also performed tests in which templates contributed by all structural analogs (Dali Z-score ≥ 10) of a given test protein were disregarded. The performance of TEMSP did not decline, with a sensitivity of 81.6%, a selectivity of 95.7%, and an average IoUR maintained at 0.943.

S1.5 Predicting zinc-binding sites

The prediction process includes two steps. The first step is template match, and the second is residue pair combination.

S1.5.1 The template match step

In this step, candidate pairs of zinc ligand residues in the target protein are detected, on the basis that a candidate must “match” at least one template contained in the template library. A match is defined when the target pair and the template are of the same chemical type and the differences between their respective geometric parameters are all smaller than the cutoffs given in Supplementary Table S3. These cutoffs were treated as parameters of the method and were determined by optimizing the performance of the method on the training dataset using a ROC curve-based criterion (see “Training and testing”).

For a target protein, all possible pairwise combinations of its residues that can act as zinc ligands (i.e., residues of type Cys, His, Asp, or Glu) were tested, one by one, to determine whether a particular pair matches any template. We note that matches with self templates were excluded, although the template library contained template pairs from target proteins in the training dataset. For a detected candidate pair and each of its matching templates, the structure of the template was transformed to superpose the Cα and Cβ atoms of its two residues onto the corresponding atoms of the two candidate residues. The transformed positions of the remaining atoms contained in the template, including the side chain atoms and the zinc ion, provided the basis for predicting atomic positions for the zinc-coordinated target protein.
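The match criterion of the template match step can be sketched as below. The data layout and function names are hypothetical, the cutoff values in the usage note are placeholders rather than those of Supplementary Table S3, and comparing the torsion ω on a periodic (360°) scale is an assumption that the text does not state. The superposition step is not shown.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two angles in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def is_match(target, template, cutoffs):
    """True if target and template share a chemical type and all four
    geometric differences are below the cutoffs for that type.

    target, template: (chem_type, (d, A, B, omega))
    cutoffs:          {chem_type: (c_d, c_A, c_B, c_omega)}
    """
    t_type, t_geom = target
    m_type, m_geom = template
    if t_type != m_type:
        return False
    c_d, c_a, c_b, c_w = cutoffs[t_type]
    return (abs(t_geom[0] - m_geom[0]) < c_d
            and abs(t_geom[1] - m_geom[1]) < c_a
            and abs(t_geom[2] - m_geom[2]) < c_b
            and angular_difference(t_geom[3], m_geom[3]) < c_w)


def matching_templates(target, library, cutoffs):
    """Indices of all library templates matched by the target pair."""
    return [i for i, tpl in enumerate(library)
            if is_match(target, tpl, cutoffs)]
```

A target pair is retained as a candidate when `matching_templates` returns a non-empty list; each matching template would then be superposed onto the candidate to predict side chain and zinc positions.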

S1.5.2 The residue pair combination step

In this step, possible pairwise combinations of the candidate ligand residue pairs detected in the first step (i.e., pair-of-pairs) are tested using the following filters, which are defined using the atomic positions predicted for the two candidate pairs from their matching templates.

(i) The zinc distance filter. The distance between the two zinc positions, one predicted for each of the two candidate pairs, should be smaller than 1 Å.

(ii) The coordination geometry filter. To allow tetrahedral coordination, all of the predicted ligand atom-zinc-ligand atom angles should be in the range between 73° and 145°. To calculate these angles, the position of the zinc was taken as the average of the two predictions based on the two candidate pairs.

(iii) The side chain-main chain clash filter. The predicted positions of the side chain atoms of the zinc ligand residues should not lead to any significant steric clash with any main chain atom. Clashes with other side chain atoms were not considered.
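Filters (i) and (ii) can be illustrated with the minimal sketch below. The function names and the argument layout are hypothetical, and treating the angle bounds as inclusive is an assumption. Filter (iii) is omitted because the clash criterion is not quantified in the text.

```python
import math
from itertools import combinations

MAX_ZN_DIST = 1.0            # Å, zinc distance filter (i)
ANGLE_RANGE = (73.0, 145.0)  # degrees, coordination geometry filter (ii)


def _angle_deg(p, center, q):
    """Angle p-center-q in degrees."""
    u = [a - c for a, c in zip(p, center)]
    v = [a - c for a, c in zip(q, center)]
    cos_t = sum(a * b for a, b in zip(u, v)) / (
        math.dist(p, center) * math.dist(q, center))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def passes_geometry_filters(zn_1, zn_2, ligand_atoms):
    """Apply filters (i) and (ii) to one pair-of-pairs.

    zn_1, zn_2:   zinc positions predicted from the two candidate pairs
    ligand_atoms: predicted positions of the distinct coordinating atoms
    """
    if math.dist(zn_1, zn_2) >= MAX_ZN_DIST:
        return False
    # The zinc position is the average of the two predictions.
    zn = tuple((a + b) / 2.0 for a, b in zip(zn_1, zn_2))
    lo, hi = ANGLE_RANGE
    return all(lo <= _angle_deg(p, zn, q) <= hi
               for p, q in combinations(ligand_atoms, 2))
```

For an ideal tetrahedral arrangement all ligand-zinc-ligand angles are about 109.5°, so such a site passes; two ligand atoms on opposite sides of the zinc (180°) fail filter (ii).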

For the pair-of-pairs combinations passing the above filters, a numerical score, T-deviation, is empirically defined as the sum of the distance between the two zinc positions, each predicted for one candidate ligand residue pair in the pair-of-pairs, and the deviations between the geometries of the target residue pairs and the template,

(2)

in which

(3)

Because of the zinc distance filter, the distance DistZn (in Å) falls in the range 0-1. The cutoffs pcutoff were used to define the match between candidate pairs and templates (Supplementary Table S3), so the total contribution of the geometric deviations to T-deviation also falls between 0 and 1. Thus T-deviation falls between 0 and 2, with smaller values representing a better geometric match between the candidate pairs and their respective templates, as well as better consistency for the pair of candidate ligand residue pairs to form one zinc-binding site. In our final model, any pair-of-pairs with a T-deviation score above 1.5 was discarded. The remaining pair-of-pairs constituted the final predicted zinc-binding sites. By this definition, one predicted zinc site contains only one zinc ion with three or four ligand residues. A binuclear or multinuclear zinc site with more ligand residues could be predicted as two or more zinc sites with shared ligand residues.

In the residue pair combination step, the threshold values for the various filters as well as for the final T-deviation score have all been treated as parameters of the method. They have again been determined by optimizing the performance of the method on the training dataset using a ROC curve-based criterion (see below).

S1.6 Training and testing

The parameters mentioned above were optimized using proteins in the training set as targets and a ROC curve-based objective index. The parameters were optimized one at a time, i.e., one parameter was scanned while the remaining parameters were held at fixed values, producing a curve of sensitivity versus (1 − selectivity). An optimal value for the parameter was obtained when Youden's index (sensitivity − (1 − selectivity)) reached a maximum. The process was iterated manually, including reoptimization of some parameters with the other parameters set to their new, optimized values. The final model corresponds to at least a local optimum in the parameter space: each final parameter value maximizes Youden's index when the other parameters are fixed at their final values. We found that the performance is most responsive to DistZn, the cutoff on the distance between the two zinc positions, each predicted for one candidate ligand residue pair in the pair-of-pairs. Optimization of the parameters led to significant improvements in both sensitivity and selectivity (about 5%, absolute changes) compared with various initially “guessed” parameter values.
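The one-parameter-at-a-time scan can be sketched as follows. This is an illustration only: the `evaluate` callable, the parameter names and the candidate grids are hypothetical, and the automated loop over rounds stands in for the manual iteration described above. Youden's index is computed as defined in the text.

```python
def youden(sensitivity, selectivity):
    """Youden's index as defined in the text."""
    return sensitivity - (1.0 - selectivity)


def scan_parameter(name, candidates, params, evaluate):
    """Scan one parameter with all others fixed; return (best value, index)."""
    best_value, best_j = None, float("-inf")
    for value in candidates:
        trial = dict(params)
        trial[name] = value
        sensitivity, selectivity = evaluate(trial)
        j = youden(sensitivity, selectivity)
        if j > best_j:
            best_value, best_j = value, j
    return best_value, best_j


def coordinate_search(grid, params, evaluate, n_rounds=3):
    """Repeat the per-parameter scan for a fixed number of rounds."""
    params = dict(params)
    for _round in range(n_rounds):
        for name, candidates in grid.items():
            params[name], _best_j = scan_parameter(
                name, candidates, params, evaluate)
    return params
```

Here `evaluate` would run the predictor on the training targets with the given parameter set and return the resulting (sensitivity, selectivity) pair. For the reported values of 81.6% and 95.7%, the index is 0.816 − (1 − 0.957) = 0.773.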

Testing was performed using proteins in the independent test dataset as targets. These structures are in their holo states. Another set of tested targets comprised the proteins in the CHED test set, which are structures in the apo state. We note that the test set in ref (Babor, et al., 2008) contained 27 protein structures. However, close inspection of the corresponding holo state structures (Dataset 2, Supplementary Material of ref (Babor, et al., 2008)) of these target proteins indicated that the “zinc-binding” sites in three structures were abnormal and doubtful. These were 1ENR, in which the occupancy of the zinc atom as given in the PDB data is actually zero; 1GL4, in which the relative geometry between the zinc ion and the ligand His515 is abnormal; and 1L0Y, in which the zinc ion is shown to coordinate with the Nε2 (instead of Oε1) of Gln215. The “apo” structures corresponding to these three problematic holo structures were excluded from the CHED test set, resulting in a test set of 24 chains containing 25 zinc sites in their apo states (Supplementary Table S1).

S1.7 Functional inference of the predicted zinc sites in proteins of “Unknown Function”

The predicted zinc sites were judged empirically, based on geometric criteria, as potential “catalytic sites” or “structural sites”. For example, TEMSP may find only three ligands occupying three vertices of a tetrahedron (possibly corrected by visual inspection), and a water molecule may serve as the fourth ligand occupying the remaining empty vertex. Such a zinc site is often associated with catalytic activity, especially when an equally conserved auxiliary chemical group (e.g., a carboxylate close to the water molecule) can be found nearby. This type of predicted zinc site may be categorized as a “potential catalytic site”. Another type of site comprises only protein residues as zinc ligands and lacks any space for a catalytically important water. These may usually be characterized as “potential structural sites”.

For these 45 proteins, we found almost no direct information in the literature about their possible functions, so we investigated whether the inferences could be supported by indirect evidence. Briefly, for each protein, we searched the structure database using Dali (Holm, et al., 2008). If, in any of the Dali hits (i.e., proteins structurally similar to the target according to Dali), the site corresponding to the one predicted in the target protein contained a metal ion coordinated by the same types of ligand residues as in the target protein, then the literature on the Dali hit was checked for possible biochemical functions associated with the metal site. By analogy, it may be inferred that the predicted zinc site in the target protein could be associated with the same type of biochemical function. The functions inferred in this way are listed as “Possible function related to the predicted zinc site” in Supplementary Table S5. The PDB IDs of the most relevant Dali hits and the related literature, where available for a target protein, are also listed in this table. As we cannot be certain whether zinc would be the native metal substrate in many of these inferred functions, we have used the terms “metalloenzymes” or “metalloproteins” instead of “zinc enzymes” or “zinc proteins” in the table.

Supplementary Figure S1:


Fig. S1. Comparison of the local conformations of the zinc-binding site in 1RDZ_A (apo state, green carbon) and 1FRP_A (holo state, gray carbon). Side chains predicted by TEMSP are shown with carbon in salmon. Ligand residues for zinc are shown as thicker sticks. A zinc-coordinating water in 1FRP_A is shown as a small red sphere. The two partially overlapping large spheres represent zinc in 1FRP_A (gray) and in the structure predicted by TEMSP (white), respectively.

Supplementary Table S1:

Note: Entries are described by their PDB ID followed by chain identifier.

Training dataset (338 chains)
1A5T_A, 1ADT_A, 1AJY_A, 1AK0_A, 1B66_A, 1BBO_A, 1BF6_A, 1BI0_A, 1BOR_A, 1C3R_A, 1CO4_A, 1CTT_A, 1CU1_A, 1D4U_A, 1DGZ_A, 1DO5_A, 1DX8_A, 1DY1_A, 1DYQ_A, 1E4U_A, 1EH6_A, 1EU4_A, 1F6U_A, 1FAQ_A, 1FBV_A, 1G25_A, 1GL4_A, 1H1Z_A, 1HP7_A, 1HR6_B, 1I3J_A, 1I6N_A, 1IA9_A, 1J3G_A, 1J7N_A, 1JJD_A, 1JK0_A, 1JM7_A, 1JM7_B, 1JN7_A, 1JOC_A, 1JQG_A, 1JW9_B, 1K24_A, 1K6Y_A, 1L1T_A, 1LBU_A, 1LLM_C, 1LPV_A, 1M9O_A, 1NCS_A, 1NZJ_A, 1ODH_A, 1OHT_A, 1OZJ_A, 1P4Q_B, 1P5D_X, 1P7M_A, 1PCX_A, 1PL8_A, 1PY0_A, 1Q14_A, 1Q2L_A, 1QWR_A, 1QX0_A, 1R1H_A, 1R79_A, 1RLY_A, 1RQG_A, 1RUT_X, 1RYQ_A, 1S0U_A, 1SE0_A, 1SRK_A, 1SU0_B, 1T4W_A, 1T8H_A, 1TAQ_A, 1U5K_A, 1UL4_A, 1UWY_A, 1V33_A, 1V5R_A, 1V87_A, 1VD4_A, 1VK6_A, 1VQ0_A, 1VQ2_A, 1VSR_A, 1VYX_A, 1W57_A, 1WD2_A, 1WEP_A, 1WEQ_A, 1WEV_A, 1WFE_A, 1WFF_A, 1WII_A, 1WIM_A, 1WIR_A, 1WJ2_A, 1WJP_A, 1WJV_A, 1WKQ_A, 1WPK_A, 1WWF_A, 1X31_D, 1X3C_A, 1X3Z_A, 1X6M_A, 1XA6_A, 1XER_A, 1XJH_A, 1XRT_A, 1XTO_A, 1Y02_A, 1Y0J_B, 1Y79_1, 1YG9_A, 1YLK_A, 1YUJ_A, 1Z05_A, 1Z3I_X, 1Z8R_A, 1ZE9_A, 1ZR9_A, 1ZU1_A, 1ZY7_A, 2A0B_A, 2A25_A, 2AKL_A, 2AQP_A, 2AU3_A, 2AYJ_A, 2B5L_C, 2B5W_A, 2BAI_A, 2BJR_A, 2BNM_A, 2BZ1_A, 2C1I_A, 2CJS_C, 2CKL_A, 2CKL_B, 2CON_A, 2CRW_A, 2CS2_A, 2CS8_A, 2CSH_A, 2CT7_A, 2CTT_A, 2CUP_A, 2CZR_A, 2D5B_A, 2D8R_A, 2D8S_A, 2D9H_A, 2DIP_A, 2DJR_A, 2DKT_A, 2DMI_A, 2DPH_A, 2DRP_A, 2E2Z_A, 2E5R_A, 2E61_A, 2E6I_A, 2EBV_A, 2ECG_A, 2ECT_A, 2ECW_A, 2EGM_A, 2EK8_A, 2ELI_A, 2ELM_A, 2ELN_A, 2ELP_A, 2EO4_A, 2EQE_A, 2EXU_A, 2F8B_A, 2F9Y_B, 2FE3_A, 2FGY_A, 2FK4_A, 2FNF_X, 2FR5_A, 2G0D_A, 2GHF_A, 2GTQ_A, 2GVI_A, 2H1N_A, 2H6L_A, 2HF1_A, 2HJN_A, 2HVY_C, 2HZ8_A, 2I0M_A, 2I50_A, 2I5O_A, 2IDA_A, 2IHX_A, 2ISW_A, 2IXD_A, 2J6A_A, 2JIG_A, 2JM1_A, 2JM3_A, 2JMO_A, 2JOX_A, 2JR7_A, 2JUN_A, 2JVX_A, 2K0A_A, 2K16_A, 2K1P_A, 2K2D_A, 2K4X_A, 2K5C_A, 2K7R_A, 2K9H_A, 2KAE_A, 2KAK_A, 2KDP_A, 2KDX_A, 2KI7_B, 2KKH_A, 2KN9_A, 2KQ9_A, 2KQB_A, 2KR1_A, 2NUT_A, 2O3E_A, 2ODD_A, 2ODX_A, 2OH3_A, 2OLM_A, 2OSO_A, 2OSV_A, 2OWA_A, 2OWO_A, 2OZU_A, 2P09_A, 2PEB_A, 2PG3_A, 2PGF_A, 2PPT_A, 2Q7S_A, 2QFA_A, 2QGP_A, 
2QKD_A, 2QSW_A, 2R6M_A, 2RI7_A, 2RIQ_A, 2RMN_A, 2ROW_A, 2RPR_A, 2RPZ_A, 2UZ9_A, 2UZG_A, 2V0C_A, 2V9K_A, 2VL6_A, 2VR2_A, 2VRD_A, 2W0T_A, 2W3Q_A, 2WAD_A, 2WJY_A, 2WKX_A, 2WWY_A, 2YQP_A, 2YRE_A, 2YRK_A, 2YSA_A, 2YSM_A, 2YU4_A, 2YV5_A, 2YVR_A, 2ZE7_A, 2ZNR_A, 2ZZE_A, 3ALC_A, 3BAL_A, 3BK2_A, 3BO5_A, 3BOC_A, 3BOF_A, 3BQ5_A, 3BVU_A, 3C10_A, 3C37_A, 3C5K_A, 3CG7_A, 3CSQ_A, 3D00_A, 3D68_A, 3DI4_A, 3DPL_R, 3DRA_B, 3DXT_A, 3DZY_A, 3DZY_D, 3E1Z_A, 3E6U_A, 3EH2_A, 3EO3_A, 3EPQ_A, 3F6Q_B, 3FEH_A, 3FW3_A, 3GI1_A, 3GIP_A, 3H1W_A, 3H3E_A, 3H6L_A, 3HTK_C, 3I2D_A, 3ICJ_A, 3IEH_A, 3IO2_A, 3IT7_A, 3IUF_A, 3IUU_A, 3JU2_A, 3JWP_A, 3K9T_A, 3KDE_C, 3KHI_A, 3KNV_A, 3KSV_A, 3L0A_A, 3L11_A, 3LGD_A, 3LQ0_A, 3LY0_A, 5GAT_A
Independent test dataset (100 chains)
1B8T_A, 1CL4_A, 1DVP_A, 1E7L_A, 1EPW_A, 1KWG_A, 1LBA_A, 1LFW_A, 1LI5_A, 1LML_A, 1LV3_A, 1ML9_A, 1N0Z_A, 1NEE_A, 1OHL_A, 1P9R_A, 1PG5_B, 1PXE_A, 1PZW_A, 1QWY_A, 1QYP_A, 1R42_A, 1R44_A, 1R6O_A, 1RMD_A, 1S3G_A, 1T3K_A, 1TON_A, 1TOT_A, 1TXL_A, 1UUF_A, 1UW0_A, 1V5N_A, 1VJE_B, 1WEO_A, 1WEU_A, 1WFK_A, 1WIL_A, 1XB8_A, 1Y13_A, 1ZFD_A, 1ZW8_A, 258L_A, 2C6A_A, 2CQE_A, 2CS3_A, 2CSY_A, 2D6F_C, 2DAR_A, 2E5S_A, 2E9H_A, 2ELO_A, 2ELU_A, 2EN8_A, 2EOD_A, 2F3B_A, 2FC6_A, 2FU5_A, 2GFO_A, 2GQJ_A, 2GX8_A, 2HJH_A, 2HSI_A, 2I1O_A, 2I9W_A, 2IJD_1, 2IMR_A, 2J2S_A, 2J4X_A, 2J9U_B, 2JKS_A, 2JZ8_A, 2KE1_A, 2NLY_A, 2P53_A, 2PW6_A, 2QWX_A, 2R3A_A, 2RF5_A, 2RO1_A, 2VY5_A, 2WFQ_A, 2YRT_A, 2ZKL_A, 3A32_A, 3BKF_A, 3BLD_A, 3BYR_A, 3CE2_A, 3COQ_A, 3CWW_A, 3GA8_A, 3GL6_A, 3HRU_A, 3I9F_A, 3IFU_A, 3IR9_A, 3IRB_A, 3ISZ_A, 3L2Q_A
CHED test set (24 chains containing 25 zinc sites)
Apo / 1arl_A, / 1ozt_M, / 1emv_B, / 1qlp_A, / 1qov_H, / 1et9_A, / 1xla_A, / 1fnu_A,
Holo / 1m4l_A, / 1mfm_A, / 1fr2_B, / 1hp7_A, / 1dv6_A, / 1eu4_A, / 1xll_A, / 1l0y_B,
Apo / 1om6_A, / 1i60_A, / 1iad_A, / 1lt7_A, / 3enl_A, / 1e65_A, / 1k0f_A, / 1pta_A,
Holo / 1g9k_A, / 1i6n_A, / 1ast_A, / 1lt8_A, / 4enl_A, / 1e67_A, / 1toa_A, / 1i0d_A,
Apo / 1rdz_A, / 1et6_A, / 2cbe_A, / 1ilw_A, / 1ksp_A, / 1bi1_A, / 1c3p_A, / 1k6k_A,
Holo / 1frp_A, / 1eu3_A, / 1moo_A, / 1im5_A, / 2kfn_A, / 1bi0_A, / 1c3r_A, / 1mbx_A,
Note: Three structures were excluded from the CHED apo test set: 1ENR, in which the occupancy of the zinc atom is actually zero; 1GL4, in which the relative geometry between the zinc ion and the ligand His515 is abnormal; and 1L0Y, in which the zinc ion is shown to coordinate with the Nε2 (instead of Oε1) of Gln215.
1888 protein structures from the PDB classified as of “Unknown Function” and determined by X-ray diffraction at resolutions higher than 2.5 Å.
1DI6_A, 1DI7_A, 1DM5_A, 1EW4_A, 1F89_A, 1FUX_A, 1G2R_A, 1HQQ_AE, 1HRU_A, 1HTW_A, 1HXL_AC, 1HXZ_AC, 1HY2_AE, 1I36_A, 1I60_A, 1I6N_A, 1I9H_A, 1IHN_A, 1IJ8_A, 1ILV_A, 1IN0_A, 1IUJ_A, 1IUK_A, 1IUL_A, 1IXL_A, 1IZM_A, 1J27_A, 1J2R_A, 1J2V_A, 1J31_A, 1J3M_A, 1J3W_A, 1J5U_A, 1J74_A, 1J7D_AB, 1J8B_A, 1J9J_A, 1J9K_A, 1J9L_A, 1JAL_A, 1JO0_A, 1JOG_A, 1JOV_A, 1JRK_A, 1JSX_A, 1JYH_A, 1JZT_A, 1K26_A, 1K2E_A, 1K3R_A, 1K4N_A, 1K77_A, 1K7J_A, 1K7K_A, 1KJN_A, 1KK9_A, 1KON_A, 1KQ3_A, 1KQ4_A, 1KR4_A, 1KUU_A, 1KYH_A, 1KYT_A, 1L0B_A, 1L1S_A, 1L5X_A, 1L6R_A, 1LCV_A, 1LCW_A, 1LCZ_A, 1LDO_A, 1LJO_A, 1LPL_A, 1LXJ_A, 1LXN_A, 1M1S_A, 1M33_A, 1M3S_A, 1M65_A, 1M68_A, 1M98_A, 1MK4_A, 1MOG_A, 1MW5_A, 1MW7_A, 1MWQ_A, 1MWW_A, 1MZG_A, 1N1Q_A, 1N81_A, 1NC5_A, 1NC7_A, 1NE2_A, 1NE8_A, 1NF2_A, 1NG6_A, 1NI9_A, 1NIG_A, 1NIJ_A, 1NJH_A, 1NJK_A, 1NJR_A, 1NKQ_A, 1NMN_A, 1NMO_A, 1NMP_A, 1NNH_A, 1NNQ_A, 1NNW_A, 1NNX_A, 1NO5_A, 1NOG_A, 1NPD_A, 1NPY_A, 1NQM_A, 1NQN_A, 1NRI_A, 1NS5_A, 1NU0_A, 1NX4_A, 1NX8_A, 1NXJ_A, 1NXZ_A, 1NY1_A, 1NYE_A, 1NZA_A, 1NZJ_A, 1NZN_A, 1O13_A, 1O1Y_A, 1O22_A, 1O3U_A, 1O4T_A, 1O4W_A, 1O50_A, 1O5J_A, 1O5U_A, 1O61_A, 1O62_A, 1O65_A, 1O69_A, 1O6A_A, 1O6D_A, 1O89_A, 1ON0_A, 1OQ1_A, 1ORU_A, 1OSC_A, 1OYZ_A, 1OZ9_A, 1P1L_A, 1P1M_A, 1P5F_A, 1P8C_A, 1P99_A, 1P9I_A, 1P9Q_C, 1PB0_A, 1PBJ_A, 1PG6_A, 1PQY_A, 1PT5_A, 1PT7_A, 1PT8_A, 1PUG_A, 1PV5_A, 1PVM_A, 1Q2Y_A, 1Q4R_A, 1Q7H_A, 1Q8B_A, 1Q8C_A, 1Q9U_A, 1QVV_A, 1QVW_A, 1QVZ_A, 1QW2_A, 1QY9_A, 1QYA_A, 1QZ4_A, 1R0U_A, 1R3D_A, 1R4V_A, 1R5X_A, 1R6Y_A, 1R75_A, 1R7L_A, 1RFE_A, 1RI6_A, 1RKI_A, 1RKQ_A, 1RLH_A, 1RLJ_A, 1RLK_A, 1RTT_A, 1RTW_A, 1RTY_A, 1RV9_A, 1RVK_A, 1RW0_A, 1RW1_A, 1RW7_A, 1RXD_A, 1RXJ_A, 1RXK_A, 1RYL_A, 1RZ2_A, 1RZ3_A, 1S12_A, 1S2X_A, 1S4C_A, 1S4K_A, 1S5A_A, 1S7H_A, 1S7I_A, 1S7O_A, 1S8N_A, 1S9U_A, 1SAW_A, 1SBK_A, 1SC0_A, 1SD5_A, 1SDI_A, 1SDJ_A, 1SED_A, 1SEF_A, 1SEN_A, 1SF9_A, 1SFN_A, 1SFS_A, 1SFX_A, 1SG9_A, 1SH8_A, 1SMB_A, 1SPV_A, 1SQE_A, 1SQH_A, 1SQS_A, 1SQU_A, 1SQW_A, 1SS4_A, 1SU0_B, 1SU1_A, 1T06_A, 1T07_A, 1T0B_A, 1T0T_V, 1T1J_A, 1T2B_A, 
1T57_A, 1T5R_A, 1T6A_A, 1T6S_A, 1T6T_1, 1T8H_A, 1T95_A, 1T9F_A, 1TC5_A, 1TE5_A, 1TLQ_A, 1TOV_A, 1TP6_A, 1TQ5_A, 1TQ8_A, 1TQX_A, 1TTZ_A, 1TU1_A, 1TU9_A, 1TUA_A, 1TUH_A, 1TUV_A, 1TUW_A, 1TWU_A, 1TWY_A, 1TXJ_A, 1TXL_A, 1TXZ_A, 1TY8_A, 1TZ0_A, 1TZA_A, 1TZZ_A, 1U0K_A, 1U5W_A, 1U61_A, 1U69_A, 1U7I_A, 1U7N_A, 1U84_A, 1U9C_A, 1U9D_A, 1U9P_A, 1UAN_A, 1UC2_A, 1UCR_A, 1UF3_A, 1UFA_A, 1UFB_A, 1UFH_A, 1UJ8_A, 1UMJ_A, 1V30_A, 1V6H_A, 1V6T_A, 1V70_A, 1V8D_A, 1V8H_A, 1V96_A, 1V99_A, 1V9B_A, 1VAJ_A, 1VBK_A, 1VCT_A, 1VDH_A, 1VDW_A, 1VE3_A, 1VGG_A, 1VGY_A, 1VH0_A, 1VH5_A, 1VH9_A, 1VHC_A, 1VHE_A, 1VHF_A, 1VHM_A, 1VHN_A, 1VHO_A, 1VHQ_A, 1VHS_A, 1VHU_A, 1VHY_A, 1VI3_A, 1VI4_A, 1VI8_A, 1VIM_A, 1VIZ_A, 1VJ1_A, 1VJ2_A, 1VJF_A, 1VJG_A, 1VJK_A, 1VJL_A, 1VJX_A, 1VK0_A, 1VK1_A, 1VK5_A, 1VK8_A, 1VKA_A, 1VKB_A, 1VKD_A, 1VKH_A, 1VKI_A, 1VKM_A, 1VKW_A, 1VL0_A, 1VL4_A, 1VL5_A, 1VL7_A, 1VLY_A, 1VM0_A, 1VMF_A, 1VMH_A, 1VMJ_A, 1VP2_A, 1VP4_A, 1VP8_A, 1VPB_A, 1VPH_A, 1VPQ_A, 1VPV_A, 1VPZ_A, 1VQR_A, 1VQS_A, 1VQW_A, 1VQY_A, 1VQZ_A, 1VR4_A, 1VR9_A, 1VRM_A, 1W8I_A, 1W9A_A, 1WD5_A, 1WDI_A, 1WDJ_A, 1WDT_A, 1WDV_A, 1WEH_A, 1WEK_A, 1WHZ_A, 1WJ9_A, 1WKC_A, 1WLU_A, 1WLV_A, 1WLZ_A, 1WM6_A, 1WMM_A, 1WN3_A, 1WN9_A, 1WNA_A, 1WOL_A, 1WOZ_A, 1WPB_A, 1WR2_A, 1WSC_A, 1WTY_A, 1WUE_A, 1WV3_A, 1WV8_A, 1WV9_A, 1WVI_A, 1WVQ_A, 1WVT_A, 1WWI_A, 1WWP_A, 1WWS_A, 1WWZ_A, 1WY6_A, 1X25_A, 1X6I_A, 1X6J_A, 1X7F_A, 1X7V_A, 1X9G_A, 1XAF_A, 1XBF_A, 1XBV_A, 1XBW_A, 1XBX_A, 1XBY_A, 1XBZ_A, 1XCC_A, 1XE1_A, 1XE7_A, 1XFI_A, 1XFJ_A, 1XFS_A, 1XG7_A, 1XG8_A, 1XHN_A, 1XHO_A, 1XIZ_A, 1XJC_A, 1XKF_A, 1XKL_A, 1XKQ_A, 1XM7_A, 1XMT_A, 1XMX_A, 1XPJ_A, 1XQ6_A, 1XQA_A, 1XRG_A, 1XSV_A, 1XTL_B, 1XTM_B, 1XUV_A, 1XV2_A, 1XVS_A, 1XW8_A, 1XX7_A, 1XXL_A, 1XY7_A, 1Y0H_A, 1Y0K_A, 1Y0N_A, 1Y0Z_A, 1Y12_A, 1Y1X_A, 1Y2I_A, 1Y5H_A, 1Y63_A, 1Y6Z_A, 1Y71_A, 1Y7I_A, 1Y7M_A, 1Y7P_A, 1Y7R_A, 1Y80_A, 1Y81_A, 1Y82_A, 1Y88_A, 1Y89_A, 1Y8A_A, 1Y8T_A, 1Y9B_A, 1Y9I_A, 1YAV_A, 1YB2_A, 1YB3_A, 1YBM_A, 1YBX_A, 1YCD_A, 1YDH_A, 1YDW_A, 1YE5_A, 1YEM_A, 1YEY_A, 1YF9_A, 1YHF_A, 1YKU_A, 
1YKW_A, 1YLK_A, 1YLL_A, 1YLM_A, 1YLN_A, 1YLO_A, 1YLX_A, 1YN4_A, 1YN5_A, 1YN8_A, 1YOA_A, 1YOC_A, 1YOX_A, 1YOY_A, 1YOZ_A, 1YQE_A, 1YQF_A, 1YQH_A, 1YRE_A, 1YTL_A, 1YWF_A, 1YX1_A, 1YYV_A, 1YZ1_A, 1YZV_A, 1YZY_A, 1Z0P_A, 1Z1S_A, 1Z40_A, 1Z67_A, 1Z6M_A, 1Z6N_A, 1Z7A_A, 1Z7U_A, 1Z84_A, 1Z85_A, 1Z8H_A, 1Z90_A, 1Z94_A, 1Z9T_A, 1ZBM_A, 1ZBP_A, 1ZBS_A, 1ZC6_A, 1ZCE_A, 1ZD0_A, 1ZEE_A, 1ZEL_A, 1ZHV_A, 1ZKD_A, 1ZKE_A, 1ZKI_A, 1ZKP_A, 1ZL0_A, 1ZN6_A, 1ZNO_A, 1ZOX_A, 1ZPV_A, 1ZPW_X, 1ZQ7_A, 1ZS7_A, 1ZSO_A, 1ZSW_A, 1ZTC_A, 1ZTD_A, 1ZUP_A, 1ZUU_A, 1ZVP_A, 1ZWJ_A, 1ZWY_A, 1ZX5_A, 1ZX8_A, 1ZXU_A, 1ZZM_A, 2A13_A, 2A15_A, 2A1V_A, 2A2L_A, 2A2M_A, 2A2O_A, 2A33_A, 2A35_A, 2A3N_A, 2A3Q_A, 2A5B_A, 2A5Z_A, 2A67_A, 2A6B_A, 2A6C_A, 2A6P_A, 2A8G_A, 2A9S_A, 2AAM_A, 2AB0_A, 2ACA_A, 2AEG_A, 2AEU_A, 2AEV_A, 2AH5_A, 2AH6_A, 2AI4_AB, 2AJ6_A, 2AJ7_A, 2ALI_A, 2AMH_A, 2AMU_A, 2AO9_A, 2AP3_A, 2APJ_A, 2APL_A, 2AQW_A, 2AR1_A, 2ARH_A, 2ARZ_A, 2ASF_A, 2ATR_A, 2ATZ_A, 2AU5_A, 2AUA_A, 2AUW_A, 2AV4_A, 2AVN_A, 2AX3_AB, 2AXO_A, 2AZ4_A, 2AZP_A, 2AZW_A, 2B06_A, 2B0A_A, 2B0C_A, 2B0V_A, 2B1Y_A, 2B33_A, 2B3M_A, 2B4A_A, 2B4W_A, 2B6C_A, 2B6E_A, 2B78_A, 2B8M_A, 2BA2_A, 2BAZ_A, 2BBE_A, 2BDT_A, 2BDV_A, 2BE4_A, 2BO3_A, 2C5Q_A, 2CS7_A, 2CSU_A, 2CU3_A, 2CU5_A, 2CU6_A, 2CUW_A, 2CVB_A, 2CVE_A, 2CVI_A, 2CVL_A, 2CW4_A, 2CW5_A, 2CWQ_A, 2CWY_A, 2CX0_A, 2CX1_A, 2CXD_A, 2CY2_A, 2CYJ_A, 2CZ4_A, 2CZ8_A, 2CZL_A, 2D13_A, 2D16_A, 2D4G_A, 2D4O_A, 2D4P_A, 2D4R_A, 2D59_A, 2D5A_A, 2D7V_A, 2D8O_A, 2D8P_A, 2D9R_A, 2DBI_A, 2DBN_A, 2DBS_A, 2DC4_A, 2DCL_A, 2DCT_A, 2DDZ_A, 2DEC_A, 2DEG_A, 2DEH_A, 2DEV_A, 2DF8_A, 2DFA_A, 2DGB_A, 2DJW_A, 2DLB_A, 2DOK_A, 2DP9_A, 2DR3_A, 2DRH_A, 2DRV_A, 2DST_A, 2DSY_A, 2DT4_A, 2DUY_A, 2DVK_A, 2DX6_A, 2E5F_A, 2E66_A, 2E6U_X, 2E6X_A, 2E87_A, 2E8C_A, 2E8E_A, 2E8F_A, 2EA9_A, 2EBE_A, 2EBG_A, 2ECE_A, 2EEN_A, 2EFF_A, 2EFV_A, 2EG0_A, 2EGI_A, 2EGJ_A, 2EGR_A, 2EGT_A, 2EGX_A, 2EHP_A, 2EHW_A, 2EI5_A, 2EIS_A, 2EIU_A, 2EJ8_A, 2EJQ_A, 2EJX_A, 2EKD_A, 2EKM_A, 2EKY_A, 2EKZ_A, 2EPG_A, 2EPI_A, 2ES9_A, 2ESH_A, 2ESN_A, 2ETD_A, 2ETH_A, 
2ETS_A, 2ETV_A, 2EVE_A, 2EVR_A, 2EW0_A, 2EWC_A, 2EWR_A, 2EXX_A, 2F06_A, 2F0R_A, 2F1L_A, 2F20_A, 2F22_A, 2F3L_A, 2F46_A, 2F4I_A, 2F4P_A, 2F4Z_A, 2F7W_A, 2F7Y_A, 2F8L_A, 2F9H_A, 2FA8_A, 2FB0_A, 2FB6_A, 2FBL_A, 2FBM_A, 2FBN_A, 2FCJ_A, 2FCK_A, 2FCL_A, 2FD7_A, 2FD9_A, 2FDO_B, 2FDR_A, 2FDS_A, 2FE1_A, 2FE7_A, 2FEF_A, 2FEX_A, 2FFG_A, 2FFJ_A, 2FG0_A, 2FG1_A, 2FG9_A, 2FGG_A, 2FHP_A, 2FHQ_A, 2FI0_A, 2FIU_A, 2FIY_A, 2FKB_A, 2FML_A, 2FMU_A, 2FNA_A, 2FNE_A, 2FNO_A, 2FO3_A, 2FPN_A, 2FQP_A, 2FR2_A, 2FSQ_A, 2FSU_A, 2FSX_A, 2FTR_A, 2FU0_A, 2FU2_A, 2FUP_A, 2FUR_A, 2FWV_A, 2FYW_A, 2FYX_A, 2FZT_A, 2FZV_A, 2G03_A, 2G0I_A, 2G0W_A, 2G0Y_A, 2G1U_A, 2G2C_A, 2G2N_A, 2G2P_A, 2G2X_A, 2G38_AB, 2G39_A, 2G3V_AE, 2G3W_A, 2G40_A, 2G42_A, 2G6B_A, 2G7U_A, 2G7Z_A, 2G84_A, 2G8L_A, 2GA1_A, 2GA8_A, 2GAA_A, 2GAN_A, 2GAX_A, 2GBO_A, 2GD9_A, 2GDQ_A, 2GF4_A, 2GF6_A, 2GFG_A, 2GFQ_A, 2GGE_A, 2GHS_A, 2GJG_A, 2GJU_A, 2GJV_A, 2GK3_A, 2GK4_A, 2GKP_A, 2GL5_A, 2GLZ_A, 2GM3_A, 2GM6_A, 2GMQ_A, 2GMY_A, 2GNP_A, 2GNX_A, 2GO8_A, 2GPI_A, 2GPJ_A, 2GPY_A, 2GS5_A, 2GSC_A, 2GSV_A, 2GTC_A, 2GTR_A, 2GTS_A, 2GU3_A, 2GUK_A, 2GUS_A, 2GUU_A, 2GUV_A, 2GVI_A, 2GWG_A, 2GWN_A, 2GX8_A, 2GYQ_A, 2GZ4_A, 2GZX_A, 2H1Q_A, 2H1T_A, 2H28_A, 2H5N_A, 2H6L_A, 2H9F_A, 2HBO_A, 2HBW_A, 2HD9_A, 2HE4_A, 2HEK_A, 2HH6_A, 2HHG_A, 2HI1_A, 2HIA_A, 2HIQ_A, 2HIY_A, 2HJ1_A, 2HJS_A, 2HKV_A, 2HLJ_A, 2HLY_A, 2HMC_A, 2HNE_A, 2HNG_A, 2HQ4_A, 2HQ8_A, 2HQ9_A, 2HQV_A, 2HQY_A, 2HRX_A, 2HRZ_A, 2HSB_A, 2HSI_A, 2HSJ_A, 2HSZ_A, 2HTD_A, 2HUH_A, 2HUJ_A, 2HV2_A, 2HV6_A, 2HX5_A, 2HXJ_A, 2HXT_A, 2HXU_A, 2HYT_A, 2HYX_A, 2I02_A, 2I0M_A, 2I15_A, 2I1S_A, 2I2O_A, 2I3D_A, 2I3F_A, 2I51_A, 2I52_A, 2I5E_A, 2I5H_A, 2I5I_A, 2I5R_A, 2I5T_A, 2I5U_A, 2I6G_A, 2I6H_A, 2I6T_A, 2I71_A, 2I7R_A, 2I8D_A, 2I8E_A, 2I8G_A, 2I9C_A, 2I9I_A, 2I9W_A, 2I9X_A, 2I9Z_A, 2IA0_A, 2IA1_A, 2IA7_A, 2IAB_A, 2IAF_A, 2IAI_A, 2IAY_A, 2IAZ_A, 2IB0_A, 2IBD_A, 2ICG_A, 2ICH_A, 2ICU_A, 2IDL_A, 2IEC_A, 2IEE_A, 2IEL_A, 2IF6_A, 2IFA_A, 2IFX_A, 2IG6_A, 2IG8_A, 2IGL_A, 2IGS_A, 2IIU_A, 2IIZ_A, 2IJC_A, 2IJQ_A, 2IKB_A, 2IKK_A, 
2IL5_A, 2IM8_A, 2IM9_A, 2IMH_A, 2IMJ_A, 2IMR_A, 2INB_A, 2INW_A, 2IOJ_A, 2IPQ_X, 2IS5_A, 2IT2_A, 2IT3_A, 2IT9_A, 2ITB_A, 2IVY_A, 2JB7_A, 2JEK_A, 2NLV_A, 2NN4_A, 2NN5_A, 2NQL_A, 2NQW_A, 2NR4_A, 2NR5_A, 2NR7_A, 2NRH_A, 2NRK_A, 2NS0_A, 2NS9_A, 2NUH_A, 2NV4_A, 2NVM_A, 2NVP_A, 2NWI_A, 2NWU_A, 2NWV_A, 2NX2_A, 2NX4_A, 2NXO_A, 2NYD_A, 2NYI_A, 2NZC_A, 2O08_A, 2O0P_A, 2O0Q_A, 2O14_A, 2O16_A, 2O1M_A, 2O1O_A, 2O1Q_A, 2O2A_A, 2O2X_A, 2O30_A, 2O34_A, 2O35_A, 2O38_A, 2O3A_A, 2O3I_A, 2O3L_A, 2O4D_A, 2O4T_A, 2O56_A, 2O57_A, 2O5H_A, 2O62_A, 2O6K_A, 2O6W_A, 2O8Q_A, 2O8S_A, 2O95_A, 2OA2_A, 2OBB_A, 2OBN_A, 2OC5_A, 2OC6_A, 2OD0_A, 2OD4_A, 2OD5_A, 2OD6_A, 2ODF_A, 2ODK_A, 2ODM_A, 2OEB_A, 2OEE_A, 2OEZ_A, 2OGF_A, 2OGI_A, 2OHW_A, 2OIK_A, 2OJH_A, 2OJL_A, 2OKF_A, 2OKQ_A, 2OLT_A, 2ONF_A, 2OO2_A, 2OO3_A, 2OOJ_A, 2OOK_A, 2OP5_A, 2OPK_A, 2OPL_A, 2OQM_A, 2OSD_A, 2OSO_A, 2OT9_A, 2OTA_A, 2OTM_A, 2OU3_A, 2OU6_A, 2OUF_A, 2OWP_A, 2OX6_A, 2OX7_A, 2OXJ_A, 2OXK_A, 2OY9_A, 2OYN_A, 2OYR_A, 2OYS_A, 2OYZ_A, 2OZ5_A, 2OZ8_A, 2OZH_A, 2OZI_A, 2OZJ_A, 2OZZ_A, 2P06_A, 2P0G_A, 2P0N_A, 2P0O_A, 2P0T_A, 2P0V_A, 2P10_A, 2P11_A, 2P12_A, 2P13_A, 2P17_A, 2P1A_A, 2P1G_A, 2P1X_A, 2P2E_A, 2P2L_A, 2P3P_A, 2P3Y_A, 2P4G_A, 2P4O_A, 2P4P_A, 2P5D_A, 2P5I_A, 2P5X_A, 2P65_A, 2P6C_A, 2P6H_A, 2P6Y_A, 2P7H_A, 2P7I_A, 2P84_A, 2P8T_A, 2P90_A, 2P92_A, 2P97_A, 2P9J_A, 2P9X_A, 2PAG_A, 2PBL_A, 2PCS_A, 2PD0_A, 2PD1_A, 2PD2_A, 2PEB_A, 2PFC_A, 2PFS_A, 2PFW_A, 2PG4_A, 2PGX_A, 2PH0_A, 2PH7_A, 2PHC_B, 2PHP_A, 2PIF_A, 2PIH_A, 2PIM_A, 2PJS_A, 2PJZ_A, 2PK7_A, 2PK8_A, 2PKT_A, 2PKW_A, 2PLI_A, 2PLM_A, 2PLS_A, 2PMA_A, 2PMB_A, 2PMR_A, 2PMY_A, 2PN2_A, 2PNK_A, 2PNT_A, 2POD_A, 2POZ_A, 2PPV_A, 2PPX_A, 2PQV_A, 2PRV_A, 2PS2_A, 2PSB_A, 2PST_X, 2PTF_A, 2PV4_A, 2PVZ_A, 2PW0_A, 2PW6_A, 2PW9_A, 2PWW_A, 2PYQ_A, 2PYT_A, 2PZZ_A, 2Q00_A, 2Q02_A, 2Q03_A, 2Q07_A, 2Q0T_A, 2Q0X_A, 2Q17_A, 2Q22_A, 2Q30_A, 2Q3L_A, 2Q3P_A, 2Q3S_A, 2Q3T_A, 2Q3V_A, 2Q40_A, 2Q44_A, 2Q46_A, 2Q48_A, 2Q4B_A, 2Q4D_A, 2Q4M_A, 2Q4N_A, 2Q4O_A, 2Q4P_A, 2Q4U_A, 2Q52_A, 2Q53_A, 2Q78_A, 2Q7X_A, 2Q8O_A, 2Q8U_A, 
2Q9K_A, 2Q9R_A, 2Q9T_A, 2QBW_AB, 2QDD_A, 2QE6_A, 2QE8_A, 2QE9_A, 2QEA_A, 2QEN_A, 2QF9_A, 2QG3_A, 2QGG_A, 2QGQ_A, 2QGS_A, 2QGU_A, 2QGY_A, 2QH1_A, 2QH8_A, 2QH9_A, 2QHQ_A, 2QIK_A, 2QIO_A, 2QIP_A, 2QIW_A, 2QJ8_A, 2QJV_A, 2QJW_A, 2QKP_A, 2QL8_A, 2QM0_A, 2QM2_A, 2QML_A, 2QMM_A, 2QMW_A, 2QNG_A, 2QNI_A, 2QNL_A, 2QNT_A, 2QNU_A, 2QP2_A, 2QPV_A, 2QQY_A, 2QQZ_A, 2QRU_A, 2QS7_A, 2QSB_A, 2QSI_A, 2QSV_A, 2QTD_A, 2QTI_A, 2QTP_A, 2QU8_A, 2QUP_A, 2QV5_A, 2QVO_A, 2QVP_A, 2QVT_A, 2QWZ_A, 2QX2_A, 2QY6_A, 2QYA_A, 2QYC_A, 2QYZ_A, 2QZ7_A, 2QZB_A, 2QZC_A, 2QZG_A, 2QZI_A, 2R01_A, 2R0X_A, 2R1F_A, 2R39_A, 2R44_A, 2R47_A, 2R4I_A, 2R5S_A, 2R5X_A, 2R6O_A, 2R6S_A, 2R6U_A, 2R7H_A, 2R84_A, 2R85_A, 2R87_A, 2R8C_A, 2RA8_A, 2RA9_A, 2RAR_A, 2RAV_A, 2RB5_A, 2RB9_A, 2RBD_A, 2RBG_A, 2RBK_A, 2RC3_A, 2RCD_A, 2RD1_A, 2RD9_A, 2RDC_A, 2RDE_A, 2RDM_A, 2RDX_A, 2RE2_A, 2RFR_A, 2RG4_A, 2RGQ_A, 2RHM_A, 2RJZ_A, 2RLD_A, 2UVK_A, 2UVP_A, 2UYJ_A, 2UYK_A, 2UYN_A, 2UYP_A, 2V1L_A, 2V7S_A, 2VH3_A, 2VL7_A, 2W0M_A, 2WM3_A, 2YQY_A, 2YSK_A, 2YVT_A, 2YWI_A, 2YWO_A, 2YX5_A, 2YX6_A, 2YXH_A, 2YXY_A, 2YYO_A, 2YYS_A, 2YZI_A, 2YZJ_A, 2YZQ_A, 2YZS_A, 2YZT_A, 2YZY_A, 2Z08_A, 2Z09_A, 2Z0J_A, 2Z0R_A, 2Z13_A, 2Z3V_A, 2ZBN_A, 2ZBU_A, 2ZBV_A, 2ZCA_A, 2ZDC_A, 2ZDJ_A, 2ZFH_A, 2ZOP_A, 2ZZ8_A, 3B48_A, 3B49_A, 3B4Q_A, 3B5M_A, 3B5O_A, 3B5P_A, 3B73_A, 3B77_A, 3B7C_A, 3B83_A, 3B8L_A, 3B9C_A, 3BB5_A, 3BB6_A, 3BB9_A, 3BBJ_A, 3BCW_A, 3BCY_A, 3BDD_A, 3BDE_A, 3BDV_A, 3BDZ_A, 3BE3_A, 3BEE_A, 3BEY_A, 3BF4_A, 3BFM_A, 3BGH_A, 3BGU_A, 3BHN_A, 3BHP_A, 3BHW_A, 3BJQ_A, 3BJR_A, 3BK5_A, 3BL4_A, 3BLZ_A, 3BM7_A, 3BN7_A, 3BN8_A, 3BQ9_A, 3BQS_A, 3BQW_A, 3BQX_A, 3BRC_A, 3BS4_AB, 3BT5_A, 3BUT_A, 3BUU_A, 3BWS_A, 3BWW_A, 3BXP_A, 3BYQ_A, 3BZ6_A, 3C0B_A, 3C0F_B, 3C1L_A, 3C2Q_A, 3C4N_A, 3C4R_A, 3C4S_A, 3C5O_A, 3C5Y_A, 3C6C_A, 3C6V_A, 3C8C_A, 3C8L_A, 3C9P_A, 3C9Q_AL, 3CA8_A, 3CAX_A, 3CBN_A, 3CBU_A, 3CBW_A, 3CC8_A, 3CE8_A, 3CEC_A, 3CER_A, 3CET_A, 3CEW_A, 3CEX_A, 3CG4_A, 3CGH_A, 3CGI_A, 3CGV_A, 3CHV_A, 3CJE_A, 3CJL_A, 3CK1_A, 3CKJ_A, 3CKM_A, 3CKN_A, 3CKV_A, 3CLW_A, 3CNR_A, 
3CNU_A, 3CNW_A, 3CNX_A, 3CNY_A, 3CP3_A, 3CPG_A, 3CSX_A, 3CU3_A, 3CVJ_A, 3CVO_A, 3CWQ_A, 3CWX_A, 3CY6_A, 3CYF_A, 3CYM_A, 3CZ9_A, 3CZA_A, 3D00_A, 3D01_A, 3D0J_A, 3D0W_A, 3D19_A, 3D33_A, 3D37_A, 3D3Y_A, 3D4R_A, 3D5P_A, 3D79_A, 3D7A_A, 3D7J_A, 3D7L_A, 3D7Q_A, 3D82_A, 3DB2_A, 3DB7_A, 3DBY_A, 3DC7_A, 3DCD_A, 3DCL_A, 3DCX_A, 3DCZ_A, 3DDE_A, 3DDJ_A, 3DEE_A, 3DF6_A, 3DF7_A, 3DFU_A, 3DI4_A, 3DI5_A, 3DKA_A, 3DL2_A, 3DL3_A, 3DLO_A, 3DM8_A, 3DMA_A, 3DMB_A, 3DMC_A, 3DME_A, 3DMN_A, 3DMY_A, 3DN7_A, 3DNH_A, 3DNP_A, 3DNU_A, 3DNX_A, 3DR5_A, 3DRZ_A, 3DS8_A, 3DSG_A, 3DSM_A, 3DT5_A, 3DTD_A, 3DTN_A, 3DTZ_A, 3DUE_A, 3DUK_A, 3DXI_A, 3DXP_A, 3DZA_A, 3DZZ_A, 3E02_A, 3E0F_A, 3E0H_A, 3E0S_A, 3E0X_A, 3E0Z_A, 3E11_A, 3E23_A, 3E29_A, 3E35_A, 3E38_A, 3E48_A, 3E49_A, 3E56_A, 3E57_A, 3E5Z_A, 3E61_A, 3E8O_A, 3E8P_A, 3E8V_A, 3E8X_A, 3E98_A, 3EBT_A, 3EBV_A, 3EBY_A, 3EC6_A, 3EC9_A, 3ECF_A, 3ED5_A, 3EDP_A, 3EEA_A, 3EEQ_A, 3EFG_A, 3EGC_A, 3EGL_A, 3EHC_A, 3EHD_A, 3EII_A, 3EIR_A, 3EJA_A, 3EJN_A, 3EJV_A, 3ELG_A, 3EMM_A, 3EN2_A, 3EN8_A, 3EO4_A, 3EO6_A, 3EOI_A, 3EQZ_A, 3ER6_A, 3ER7_A, 3ERM_A, 3ERV_A, 3ES1_A, 3ES4_A, 3ESG_A, 3ESM_A, 3EUR_A, 3EW1_A, 3EW2_A, 3EWL_A, 3EWN_A, 3EYP_A, 3EYR_A, 3EYT_A, 3EZ0_A, 3EZY_A, 3F08_A, 3F14_A, 3F1T_A, 3F2I_A, 3F2V_A, 3F2Z_A, 3F3K_A, 3F40_A, 3F42_A, 3F5D_A, 3F7C_A, 3F7E_A, 3F7S_A, 3F7X_A, 3F86_A, 3F87_A, 3F8H_A, 3F8X_A, 3F9S_A, 3F9U_A, 3FA5_A, 3FB9_A, 3FBG_A, 3FCN_A, 3FDB_A, 3FDI_A, 3FDJ_A, 3FDX_A, 3FEZ_A, 3FF0_A, 3FF2_A, 3FF4_A, 3FFY_A, 3FG8_A, 3FG9_A, 3FGB_A, 3FGG_A, 3FGV_A, 3FGY_A, 3FH0_A, 3FH1_A, 3FH3_A, 3FHK_A, 3FIJ_A, 3FJ2_A, 3FJS_A, 3FJV_A, 3FKA_A, 3FKJ_A, 3FKQ_A, 3FLE_A, 3FLH_A, 3FLJ_A, 3FM2_A, 3FMB_A, 3FMC_A, 3FOJ_A, 3FOV_A, 3FRM_A, 3FRN_A, 3FRW_A, 3FSD_A, 3FSE_A, 3FUY_A, 3FVV_A, 3FVW_A, 3FX7_A, 3FXA_A, 3FXD_AB, 3FXE_AB, 3FXH_A, 3FY6_A, 3FYB_A, 3FYF_A, 3FYN_A, 3FZX_A, 3G0K_A, 3G16_A, 3G17_A, 3G1J_A, 3G3L_A, 3G6I_A, 3G74_A, 3G7G_A, 3G7P_A, 3G8Y_A, 3G8Z_A, 3GAN_A, 3GBH_A, 3GBY_A, 3GDW_A, 3GEK_A, 3GF6_A, 3GG7_A, 3GGM_A, 3GGN_A, 3GH1_A, 3GHJ_A, 3GI7_A, 3GIW_A, 3GJU_A, 
3GK5_A, 3GK6_A, 3GKB_A, 3GKX_A, 3GL5_A, 3GMG_A, 3GMI_A, 3GMS_A, 3GN6_A, 3GNJ_A, 3GNL_A, 3GO4_A, 3GO5_A, 3GOZ_A, 3GPI_A, 3GQJ_A, 3GQM_A, 3GQS_A, 3GRD_A, 3GUX_A, 3GV1_A, 3GVE_A, 3GW4_A, 3GX1_A, 3GZR_A, 3H04_A, 3H05_A, 3H0N_A, 3H2D_A, 3H35_A, 3H3H_A, 3H6P_AC, 3H79_A, 3H8U_A, 3H92_A, 3H9P_A, 3H9W_A, 3HA9_A, 3HBZ_A, 3HC1_A, 3HCZ_A, 3HDG_A, 3HDJ_A, 3HE1_A, 3HFI_A, 3HFQ_A, 3HG9_A, 3HIU_A, 3HIX_A, 3HL1_A, 3HLZ_A, 3HN5_A, 3HNW_A, 3HP7_A, 3HQX_A, 3HRG_A, 3HRL_A, 3HRP_A, 3HSA_A, 3HTN_A, 3HTR_A, 3HTY_A, 3HVZ_A, 3HWU_A, 3HXL_A, 3HZ7_A, 3HZE_A, 3HZP_A, 3I0T_A, 3I0Y_A, 3I18_A, 3I3F_A, 3I42_A, 3I4T_A, 3I8N_A, 3IB6_A, 3IBM_A, 3IBS_A, 3IBZ_A, 3IC5_A, 3IC8_A, 3ICL_A, 3IDF_A, 3IDU_A, 3IEE_A, 3IF4_A, 3II8_A, 3IJD_A, 3IJM_A, 3IJT_A, 3IKB_A, 3ILK_A, 3ILM_A, 3ILX_A, 3IMI_A, 3IMO_A, 3IPF_A, 3IR9_A, 3IRA_A, 3IRS_A, 3IRU_A, 3IUK_A, 3IUS_A, 3IWH_A, 3IWT_A, 3IX7_A, 3JR7_A, 3JRT_A, 3JSR_A, 3JU2_A, 3JYG_A, 3JZV_A, 3K0B_A, 3K12_A, 3K29_A, 3K2T_A, 3K4I_A, 3K4W_A, 3K67_A, 3K6A_A, 3K6C_A, 3K6O_A, 3K7X_A, 3K9R_A, 3KAW_A, 3KB2_A, 3KB4_A, 3KBQ_A, 3KD3_A, 3KD4_A, 3KEV_A, 3KG4_A, 3KGZ_A, 3KHN_A, 3KJJ_A, 3KJK_A, 3KK4_A, 3KL2_A, 3KLU_A, 3KOP_A, 3KQ5_A, 3KYE_A, 3KZP_A

Supplementary Table S2:

Predicted false-positive (FP) sites in the training set and the test set.

PDB ID & chain / Predicted ligand residues / Note
Training set
3C37_A / E107,H110,H208 / H110 and H208 were ligand residues
1ADT_A / C339,H341,C355 / C339 and C355 were ligand residues
1XRT_A / D153,C181,D183 / D153 and C181 were ligand residues
1T8H_A / H80,C125,H142 / Zn in 1U05_A & 1XAF_A (both E-values with 1T8H_A: 9E-31)
3CG7_A / D15,E17,D184 / Mn in 3CM5_A (isoform with 3CG7)
1UWY_A / C121,C267,C268 / FP*
2IDA_A / C21,C24,C45,H53 / FP
1XTO_A / D92,H93,H269 / FP
3H3E_A / D61,H62,H224 / FP
1X3Z_A / C191,H218,C237,E238 / FP
Test set
1ZFD_A / C44,H46,C49 / C44 and C49 were ligand residues
3ISZ_A / D100,E135,H349 / Zn in 3IC1_A (isoform with 3ISZ)
3I9F_A / C40,C46,C58 / FP
D81,H107,E111 / FP
2JKS_A / C16,C46,C138,C155 / FP
FP*: No evidence indicates that this site is a potential zinc-binding site.

Supplementary Table S3:

Cutoff criteria for the four geometric parameters (d, A, B, ω) used for template grouping in the reduced template library (top half) and for searching potential ligand residue pairs (bottom half).

Template type / C-C / C-E / C-H / C-D / D-D / D-H / D-E / E-E / E-H / H-H
half bin width d_group (Å) / 0.42 / 0.35 / 0.50 / 0.34 / 0.40 / 0.51 / 0.53 / 0.52 / 0.56 / 0.59
half bin width A_group (°) / 4.66 / 5.04 / 4.83 / 3.29 / 4.09 / 4.45 / 4.59 / 5.02 / 4.78 / 4.51
half bin width B_group (°) / 4.70 / 3.81 / 4.34 / 3.88 / 4.16 / 4.63 / 4.59 / 2.86 / 4.93 / 5.00
half bin width ω_group (°) / 7.20 / 6.62 / 7.20 / 7.12 / 6.58 / 7.20 / 6.63 / 7.17 / 7.15 / 7.15
half bin width d_cutoff (Å) / 0.64 / 0.53 / 0.75 / 0.51 / 0.56 / 0.76 / 0.79 / 0.77 / 0.84 / 0.89
half bin width A_cutoff (°) / 9.31 / 10.09 / 9.31 / 6.20 / 8.01 / 8.91 / 9.17 / 10.04 / 9.56 / 9.03
half bin width B_cutoff (°) / 8.85 / 7.45 / 8.32 / 7.76 / 8.32 / 8.17 / 9.18 / 5.72 / 9.87 / 10.00
half bin width ω_cutoff (°) / 18.00 / 16.55 / 17.94 / 17.29 / 16.44 / 18.00 / 16.51 / 17.93 / 17.87 / 17.86
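
As an illustration of how the cutoff rows of Table S3 might be applied, the following Python sketch tests whether a candidate residue pair lies within the half bin widths of a template in all four parameters. It is only a sketch under stated assumptions: the matching rule (absolute difference no larger than the half bin width, with ω treated as periodic) is our reading of the table, and the names CUTOFFS, Geometry, torsion_diff and matches, as well as the example geometry values, are hypothetical and not part of the published method.

```python
from dataclasses import dataclass

# Half bin widths copied from the "cutoff" rows of Table S3 for three
# template types, in the order (d in Å, A in °, B in °, ω in °).
CUTOFFS = {
    "C-C": (0.64, 9.31, 8.85, 18.00),
    "C-H": (0.75, 9.31, 8.32, 17.94),
    "H-H": (0.89, 9.03, 10.00, 17.86),
}


@dataclass
class Geometry:
    d: float      # Cα-Cα distance (Å)
    A: float      # angle Cβ(A)-Cα(A)-Cα(B) (°)
    B: float      # angle Cβ(B)-Cα(B)-Cα(A) (°)
    omega: float  # torsion Cβ(A)-Cα(A)-Cα(B)-Cβ(B) (°)


def torsion_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two torsion angles (degrees)."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def matches(template_type: str, template: Geometry, candidate: Geometry) -> bool:
    """Assumed rule: every parameter must lie within its half bin width."""
    d_cut, a_cut, b_cut, w_cut = CUTOFFS[template_type]
    return (
        abs(candidate.d - template.d) <= d_cut
        and abs(candidate.A - template.A) <= a_cut
        and abs(candidate.B - template.B) <= b_cut
        and torsion_diff(candidate.omega, template.omega) <= w_cut
    )


# Hypothetical geometries, for illustration only.
template = Geometry(d=6.0, A=90.0, B=95.0, omega=170.0)
close_pair = Geometry(d=6.5, A=95.0, B=100.0, omega=-175.0)
far_pair = Geometry(d=6.7, A=95.0, B=100.0, omega=-175.0)
```

The periodic treatment of ω matters near ±180°: in the example, torsions of 170° and -175° differ by 15°, not 345°.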

Supplementary Table S4:

Prediction of zinc-binding sites for 1888 PDB entries (TEMSP reported 186 zinc-binding sites in 145 protein structures).

(a) 82 predicted sites from 65 structures were confirmed by zinc-binding.

PDB ID & chain / Predicted ligand residues
1I6N_A / 142,174,200,246;
1KQ3_A / 169,252,269;
1M65_A / 15,194,40;
1M68_A / 101,131,73; 15,194,40; 192,7,9;
1NNQ_A / 114,117,80; 142,145,157,160; 19,52,55;
1NO5_A / 46,48,79;
1PB0_A / 101,131,73; 15,194,40; 7,73,9;
1R5X_A / 67,69,80;
1SU0_B / 127,40,42,65;
1SU1_A / 11,129,9;
1T8H_A / 182,183,242,245;
1TQX_A / 179,36,38,70;
1TXL_A / 166,175,177;
1VHE_A / 182,237,68;
1XTL_B / 104,112,121,124;
1XTM_B / 104,112,121,124;
1XV2_A / 173,175,186;
1YLK_A / 35,37,88,91;
1YLO_A / 166,199,308; 166,221,62;
1Z84_A / 133,184,63,66; 216,219,255,310;
1ZKP_A / 134,155,59,61; 155,211,64;
1ZWJ_A / 133,184,63,66; 216,219,255,310;
1ZZM_A / 11,207,9; 117,144,148;
2AZ4_A / 167,189,92,94; 189,404,96,97;
2CS7_A / 28,31,33,7;
2FQP_A / 33,35,76;
2G84_A / 112,115,77;
2GLZ_A / 15,17,19,55;
2GPY_A / 29,93,97;
2GVI_A / 16,18,20,61; 174,177,195,198;
2GWG_A / 178,6,8;
2GX8_A / 107,336,68; 333,336,69;
2H6L_A / 104,89,91;
2HEK_A / 161,54,84;
2HSI_A / 180,184,259,261;
2I0M_A / 152,156,191,195;
2I9W_A / 10,28,29,8; 167,169,178,179;
2IMR_A / 238,97,99;
2NYD_A / 102,329,64; 326,329,65;
2O1Q_A / 101,59,61;
2OIK_A / 11,14,49,95;
2OSD_A / 116,139,142,150;
2OSO_A / 116,139,142;
2P2L_A / 100,104,63;
2P6Y_A / 75,77,91;
2PEB_A / 12,14,84;
2PLM_A / 200,279,55,57;
2PW6_A / 22,234,57;
2Q02_A / 138,169,196;
2QGS_A / 123,29,59;
2R8C_A / 324,63,65;
3BB6_A / 30,32,83;
3CHV_A / 233,51,53;
3D00_A / 165,168,180,183;
3DI4_A / 150,167,237;
3E02_A / 258,49,51;
3E0F_A / 24,262,49;
3E38_A / 113,44,46; 216,51,76;
3E49_A / 258,49,51;
3H0N_A / 147,152,168,172;
3HTR_A / 34,60,62;
3IMI_A / 101,11,50,8;
3IR9_A / 330,333,355,358;
3JU2_A / 141,203,247;
3JYG_A / 15,38,40;
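
For readers who wish to process the listing above programmatically, the following Python sketch parses rows of the form "PDB ID & chain / residue numbers; residue numbers;" and tallies structures and sites. It assumes the two-column layout of part (a), with each semicolon-terminated group denoting one predicted site; the function name parse_row and the three sample rows (copied from the listing) are used for illustration only.

```python
def parse_row(line: str):
    """Split 'ID_chain / r1,r2,r3; r4,r5,r6;' into (entry, list of sites)."""
    entry, _, sites_text = line.partition("/")
    sites = []
    for group in sites_text.split(";"):
        group = group.strip()
        if group:  # skip the empty piece after the final semicolon
            sites.append([int(r) for r in group.split(",")])
    return entry.strip(), sites


# Three rows copied from part (a) of the table.
rows = [
    "1I6N_A / 142,174,200,246;",
    "1M68_A / 101,131,73; 15,194,40; 192,7,9;",
    "1YLO_A / 166,199,308; 166,221,62;",
]

parsed = [parse_row(r) for r in rows]
n_structures = len(parsed)
n_sites = sum(len(sites) for _, sites in parsed)
```

Rows with a third column, as in part (b), would need the extra field split off before the residue groups are parsed.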

(b) 5 predicted sites from 4 structures were confirmed by zinc-binding in their isoform structures.

PDB ID & chain / Predicted ligand residues / Isoform structure
1I60_A / 142,174,246; / 1I6N_A
1PB0_A / 164,192,194; / 1M68_A
2PNK_A / 26,28,355; / 2Q6E_A
1M65_A / 101,131,73; 192,7,73,9; / 1M68_A

(c) 36 predicted sites from 31 structures were found to bind non-zinc transition metal ions (manganese, iron, cobalt, nickel, copper or cadmium).

PDB ID & chain / Predicted ligand residues
1NX4_A / 101,103,251;
1NX8_A / 101,103,251;
1O4T_A / 102,61,63;
1P1M_A / 200,55,57;
1VJ2_A / 52,54,92;
1XX7_A / 33,67,68;
1ZTC_A / 140,69,71; 162,73,74;
2AMU_A / 115,118,17;
2FIY_A / 185,188,211,214; 225,228,256,259;
2GM6_A / 147,94,96;
2GZX_A / 204,6,8;
2HKV_A / 123,127,48;
2ITB_A / 122,154,69; 151,38,69,72;
2O08_A / 21,50,51;
2O8Q_A / 100,58,60;
2OC5_A / 128,160,73; 45,73,76;
2OGI_A / 29,58,59;
2OZI_A / 32,34,76;
2QE9_A / 124,128,44;
2QH1_A / 146,149,169,174;
2R6S_A / 160,162,292;
2RG4_A / 119,121,187;
3BWW_A / 138,171,203; 203,30,59;
3C6C_A / 250,47,49;
3D82_A / 44,51,85;
3E0F_A / 17,260,74;
3GG7_A / 194,5,7;
3GVE_A / 16,18,248;
3HC1_A / 121,151,152; 121,237,248;
3HTN_A / 131,133,147;

(d) 3 predicted sites from 3 structures are located on the artificially fused His-tag tails.

PDB ID & chain / Predicted ligand residues
3IPF_A / 84,86,87;
3E61_A / 325,326,328;
1MZG_A / 142,144,145;

(e) 4 sites from 4 structures may actually be wrong predictions upon close inspection.