Assigning CHARMM-CGenFF parameters to a ligand

Author: Toni Giorgino ()
Copyrights 2011, University Pompeu Fabra. All rights reserved.

Software requirements: VMD. The latest CHARMMforcefield files, including CGenFF.
Knowledge requirements: Use of CHARMM topology and parameter files, CHARMM system building.
Input requirements: See below.

Disclaimer:This protocol is based on a best practice. In no way we guarantee that it is optimal for your system. You use it at your own risk. Note that the CGenFF and the Param Chem web sites are under development. Results may vary when new releases are out. Procedure tested on 27/feb/2011 with VMD 1.8.7 and CHARMM parameter set 36a.

Example files

Download themhere.

Basic procedure

Small organic molecules that are not standard biological molecules (amino acids, lipids, etc.) often do not have pre-built files to match the force field you are going to use, which means you will have to make them on your own. The procedure outlined in this page generates CHARMM-style topology (PSF) and parameter files (INP) of a small organic molecule; the molecule can then be embedded in a larger biomolecular system, typically built with the CHARMM all-atom forcefield.

This procedure relies on theParamChemweb service, which performs fragment-based parameter assignments by analogy; the parameters themselves belong to the general CHARMM forcefield (CGenFF) [4]. Further QM-based optimization of parameters, e.g. dihedrals (i.e. "parametrization" in the strict sense), is beyond the scope of this tutorial [1].

In some cases, you may be able to parametrize molecules and/or build the system usingCHARMM-GUI("PDB reader" function).

Step 1: Prerequisites for the input

For the following procedure to work, you will need the ligand's structure with the following characteristics:

·  molecular structure in Tripos MOL2 file formats, with coordinate information (in this example:ligand.mol2)

·  the same molecule in PDB format, for system preparation

·  explicit hydrogens should be present

·  protonation states should be appropriate for the simulation conditions

·  total charge should be known

·  all of the atoms of the parametrized ligand should be present when you generate a PSF structure.

Please see the "solution to common problems" section if some of the conditions are not met.

Step 2: Upload molecule to the ParamChem server

1.  Register with theParamChemweb site (if not already done) and note the password.

2.  Select "upload molecule" and upload theMOL2file.

3.  If the process is successful, you will be able to download astreamfile. Save and rename it with thetoppar_prefix and the .strextension. For example:toppar_ligand_cgenff.str.

4.  Inspect the file to verify the quality of guessed parameter assignments. The server inserts notes concerning the 'penalty' of charges and bonded parameters. If penalties are low, you can proceed.

Step 3: Split the stream file into topology and parameters

Two new files, the topology and parameter components, must now be generated from the stream file.

1.  Intop_ligand.rtf(topology file): copy the lines betweenreadrtfcardappendand the nextEND(both excluded). Manually modify the line starting withRESIin order to set your chosen residue name. This will be the name of the residue in the PDB file.

2.  Inpar_ligand.inp(parameter file): copy the lines betweenreadparamcardflexappendand the nextEND(both excluded).

Note: The parameter file may or may not contain values, depending on the molecule.

Step 4: Generate PSF

The system can now be built. Likely you will use eitherautopsforpsfgen. In either case, you will add the newly-generated residue topology filetop_ligand.rtfandthe CGenFF topology filetop_all36_cgenff.rtfto the list of topology files considered by the building program.

·  If usingautopsf,this is achieved clicking "Add" in the "topology files" section.

·  Inpsfgen, it is done with the "topology top_ligand.rtf" directive.

Structure building is based on manipulation or merging of PDB files and uses initial coordinate information stored therein. Please refer to the appropriate tutorials [3].

Step 5: Merge parameter files

If any of the sections of thepar_ligand.inpfile is not empty, the corresponding lines should be merged in the parameter file chosen for the simulation. For example, the widely-usedCHARMM22 All-Hydrogen Parameter File for Proteins and Lipidsis found in a file namedpar_all27_prot_lipid_na.inp. Make a local copy of the latter and insert the sections ofpar_ligand.inpin the corresponding places. Do the same with the CGenFF parameter filepar_all36_cgenff.prm. A pre-merged version of the two ishere.

NAMD users can skip step 5 using.strfiles directly.

Solution to common problems

·  Molecule not in MOL2 format.Software likeMarvinSketchandopenbabelcan perform conversions, while computing bonds and bond orders from the three-dimensional structure. Inspect and fix them manually if necessary.

·  Molecule not in PDB format.As above,openbabelor other molecular manipulation software (VMD,AmberTools'santechamber,MarvinSketch, etc.) can perform the conversion. Note that the conversion is lossy.

·  Error messages fromParamChemmay be due to wrong bond order: correct connectivity and bond order information in the MOL2 file is crucial. In particular, make sure that carbon atoms have the correct valence. Some software (notably, VMD) outputs incorrect bond orders in MOL2 files. To fix them, you have two options:

o  (recommended) use a chemically-aware software (see above) to produce the correct ones; or

o  enable theguess bond orders from connectivitycheck-box inParamChem. Note that this will disable some consistency checks; in particular, it may fail to produce reasonable parameters if the input connectivity is wrong.

·  Missing explicit hydrogen atoms, protonation states, total chargecan be computed withinMarvinSketch.

·  Stereoisomerism.If the specie being parametrized is compound by two or more conformations that do not interconvert within your sampling time, make sure that the initial coordinates used for the simulation reflect the configuration of interest. However, stereoisomers have the same parameter set.

·  Missing starting coordinates, or 2D coordinates only available. You may obtain an initial guess of the molecular geometry running a short simulation or minimization on the ligand. See e.g. how it is done inbasic energy minimizationin CHARMM.

·  Guess coordinates works poorly. Although generally unnecessary, the "guess coordinates" output can be improved adding an IC table to the RTF. A workaround to get the IC table is to submit the molecule (in the appropriate geometry) to theSwissParamwebsite. Open the resulting topology file, and copyonlythe lines beginning by IC into the topology created byParamChem.

Example files

Download themhere.

References

[1] QM-based fitting of parameters,http://dogmans.umaryland.edu/~kenno/tutorial/#equil.

[2] Parametrization by analogy,http://www.ks.uiuc.edu/Training/Tutorials/science/forcefield-tutorial/forcefield-html/node6.html.

[3]NAMD tutorial, Generating a Protein Structure File (PSF)

[4] Vanommeslaeghe K, Hatcher E, Acharya C, Kundu S, Zhong S, Shim J, et al. CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields.J. Comput. Chem. 2010;31(4):671–90.