Supplementary Material
Compilation and Physicochemical Classification Analysis of a Diverse hERG Inhibition Database
Remigijus Didziapetris1,2, Kiril Lanevskij1,2,*
1VšĮ „Aukštieji algoritmai“, A. Mickevičiaus 29, LT-08117 Vilnius, Lithuania
2ACD/Labs, Inc., 8 King Street East, Suite 107, Toronto, Ontario, Canada M5C 1B5
*Corresponding author: Address for correspondence: Traidenio 34, LT-08116 Vilnius, Lithuania
Telephone: +370 5 262 3408; Fax: +370 5 262 3728.
E-mail:
Description of fields in the hERG inhibition database
Name: the name of the compound. For a congeneric series, this includes a short description of the series and the ID number of the compound in the original publication.
SMILES: the compound structure provided in SMILES notation format.
I_hERG: assigned binary classification of hERG inhibition (1/0):
- 1 if IC50 (Ki) ≤ 10 µM
- 0 if IC50 (Ki) > 10 µM.
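The binary assignment above can be sketched in a few lines. The function name and the input check are illustrative assumptions; only the 10 µM cutoff and the 1/0 coding come from the field description.

```python
THRESHOLD_UM = 10.0  # classification cutoff from the field description, in µM


def classify_herg(value_um: float) -> int:
    """Return 1 (inhibitor) if IC50 or Ki <= 10 µM, otherwise 0.

    `classify_herg` is a hypothetical helper name, not part of the database.
    """
    if value_um <= 0:
        raise ValueError("concentration must be positive")
    return 1 if value_um <= THRESHOLD_UM else 0


if __name__ == "__main__":
    for value in (0.05, 10.0, 35.0):
        print(value, classify_herg(value))
```

Note that a value of exactly 10 µM falls into class 1, since the cutoff is inclusive on the inhibitor side.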
Type: indicates what kind of metric is provided in the Value field:
- IC50: half-inhibitory concentration determined in patch-clamp or radioligand displacement assay
- Ki: inhibition constant determined in radioligand displacement assay
- Kimin: the lower limit of Ki value
- IC50min: the lower limit of IC50 value
- IC50max: the upper limit of IC50 value
- IC50 (%): IC50 value estimated from single-point data (percentage inhibition at a fixed ligand concentration [L]) using the following equation:
Since this is a very rough estimate, only the entries with a resulting IC50 (%) ≥ 50 µM or IC50 (%) ≤ 2 µM were recorded in the database.
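As an illustration of the single-point conversion, the sketch below assumes the standard one-site relationship with a Hill slope of 1, %inhibition = 100·[L]/([L] + IC50), which rearranges to IC50 = [L]·(100 − %inhibition)/%inhibition. This form is an assumption and may differ from the equation the authors applied. The 2 µM and 50 µM cutoffs are taken from the text; treating them as inclusive bounds, like the function names, is also an assumption.

```python
def ic50_from_single_point(percent_inhibition: float, ligand_conc_um: float) -> float:
    """Estimate IC50 (µM) from one % inhibition value measured at [L] (µM).

    Assumes a one-site model with Hill slope 1 (not confirmed by the source):
        IC50 = [L] * (100 - %inh) / %inh
    """
    if not 0.0 < percent_inhibition < 100.0:
        raise ValueError("percent inhibition must lie strictly between 0 and 100")
    if ligand_conc_um <= 0:
        raise ValueError("ligand concentration must be positive")
    return ligand_conc_um * (100.0 - percent_inhibition) / percent_inhibition


def is_recordable(ic50_um: float, low_um: float = 2.0, high_um: float = 50.0) -> bool:
    """Keep only estimates far from the 10 µM class boundary.

    Inclusive comparison at both cutoffs is an assumption.
    """
    return ic50_um <= low_um or ic50_um >= high_um


if __name__ == "__main__":
    estimate = ic50_from_single_point(percent_inhibition=20.0, ligand_conc_um=10.0)
    print(estimate, is_recordable(estimate))
```

Under this model, 20% inhibition at 10 µM gives an estimate of 40 µM, which lies between the two cutoffs and would therefore not be recorded.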
Value: quantitative hERG activity value of the given Type.
Assay: a brief description of hERG inhibition assay used to derive the given activity value:
- Patch-clamp conventional (cell line indicated in parentheses: HEK293, CHO, XO, or ND if not specified)
- Patch-clamp automated (cell line indicated in parentheses: HEK293, CHO, or ND if not specified)
- Patch-clamp ND – patch-clamp with unspecified details
- Electrophysiology (myocytes)
- Binding (reference ligand indicated in parentheses: dofetilide, astemizole, or MK-499).
This field may also contain two additional notes:
- “assumed” – exact assay details were not reported in the article, but could be reasonably implied to be identical to those described in related publications by the same laboratory
- “confirmed by authors” – the authors of the publication had provided some of the missing information upon request
Code: the assigned assay code as outlined in Table 1 of the article.
Reference: the literature reference number. The full list of references is provided as a separate data sheet alongside the main database.
Set: indicates whether the compound was part of the Modeling set (used as training or internal validation data in different modeling runs) or the External validation set.
n (congeneric): the number of compounds in the congeneric series, or 1 for non-congeneric compounds (see Data & Methods section of the article).
w (adjusted): weight adjustment factor (see Data & Methods for details); can be 0.5, 1, or 2.
weight: the final weight of the compound used in modeling, weight = w (adjusted) / n (congeneric).
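The weight calculation can be sketched as follows. The function name and the validation checks are illustrative assumptions; the formula and the allowed values of w (adjusted) are taken from the field descriptions.

```python
ALLOWED_W_ADJUSTED = (0.5, 1.0, 2.0)  # permitted adjustment factors per the field description


def compound_weight(w_adjusted: float, n_congeneric: int) -> float:
    """Return weight = w (adjusted) / n (congeneric).

    `compound_weight` is a hypothetical helper name.
    """
    if w_adjusted not in ALLOWED_W_ADJUSTED:
        raise ValueError("w (adjusted) must be 0.5, 1, or 2")
    if n_congeneric < 1:
        raise ValueError("n (congeneric) must be at least 1")
    return w_adjusted / n_congeneric


if __name__ == "__main__":
    # A compound from a congeneric series of four, with adjustment factor 2
    print(compound_weight(2.0, 4))
```

Dividing by n (congeneric) means that the members of one series together contribute a total weight of w (adjusted) when they share the same adjustment factor, so a large series does not dominate the model.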
logP: logarithm of the octanol/water partition coefficient
pKa1(Acid): the strongest acidic pKa
pKa1(Base): the strongest basic pKa
pKa2(Base): the second strongest basic pKa
logP and pKa were calculated using the ACD/LogP GALAS and ACD/pKa GALAS algorithms implemented in ACD/Percepta software. In the several cases where these produced very unreliable estimates, they were replaced by ACD/LogP Classic and ACD/pKa Classic predictions (marked in italics), or by experimental values if available (marked in bold).
MW: Molecular weight
TPSA: Topological Polar Surface Area
NAR: Number of Aromatic Rings
FRB: Fraction of Rotatable Bonds
p (predicted): predicted probability of the compound being a hERG inhibitor with IC50 ≤ 10 µM as an averaged output of ten models:
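A minimal sketch of the averaging step, assuming a simple unweighted arithmetic mean of the ten individual model outputs; the exact averaging scheme is not spelled out here, so both the unweighted form and the function name are assumptions.

```python
from statistics import fmean
from typing import Sequence

N_MODELS = 10  # number of models named in the field description


def averaged_probability(model_outputs: Sequence[float]) -> float:
    """Return the unweighted mean of the ten per-model probabilities.

    Unweighted averaging is an assumption; `averaged_probability` is a
    hypothetical helper name.
    """
    if len(model_outputs) != N_MODELS:
        raise ValueError(f"expected {N_MODELS} model outputs")
    if any(not 0.0 <= p <= 1.0 for p in model_outputs):
        raise ValueError("each probability must lie in [0, 1]")
    return fmean(model_outputs)


if __name__ == "__main__":
    outputs = [0.2] * 5 + [0.8] * 5
    print(averaged_probability(outputs))
```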