Automated Reaction Kinetics using Excel VBA and JCAMP
JCAMP automation comprises two software packages which work together in processing NMR data for the UsefulChem project. The Java program CreateBLOCKFile takes a set of JCAMP files from the Varian UNITY INOVA-300 NMR, decompresses their data (using the algorithm/class from JSpecView), and combines them into a single JCAMP BLOCK file with the following format:
##TITLE=
##JCAMP-DX=
##DATA TYPE=
##TITLE=
##JCAMP-DX=
##DATA TYPE=
##DATA CLASS=
##ORIGIN=
##SPECTROMETER/DATA SYSTEM=
##.OBSERVE FREQUENCY=
##.OBSERVE NUCLEUS=
##.FIELD=
##.ACQUISITION TIME=
##.AVERAGES=
##RESOLUTION=
##.SPINNING RATE=
##.PHASE 0=
##.PHASE 1=
##XYPOINTS=(XY..XY)
(XY Data Pairs …)
##END=
(etc., more spectra …)
##END= $$end of BLOCKs
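The layout above can be read back with a short script. The following Python sketch is not part of CreateBLOCKFile or the workbook; it only illustrates the structure. It assumes that the first ##TITLE= record belongs to the BLOCK header, that each later ##TITLE= opens a spectrum closed by the next ##END=, and that every data line holds exactly one X, Y pair. The function name split_block is hypothetical.

```python
def split_block(text):
    """Split JCAMP BLOCK text into one dict per inner spectrum (illustrative only)."""
    spectra, current, titles_seen = [], None, 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not line.startswith("##"):
            # Assumption: one "x, y" (or "x y") pair per data line.
            if current is not None:
                x, y = (float(v) for v in line.replace(",", " ").split())
                current["data"].append((x, y))
            continue
        label, _, value = line[2:].partition("=")
        label = label.strip().upper()
        value = value.split("$$")[0].strip()  # drop trailing $$ comments
        if label == "TITLE":
            titles_seen += 1
            if titles_seen > 1:  # the first TITLE is the BLOCK header
                current = {"TITLE": value, "data": []}
        elif label == "END":
            if current is not None:  # the final END closes the BLOCK itself
                spectra.append(current)
                current = None
        elif current is not None:
            current[label] = value
    return spectra


SAMPLE_BLOCK = """##TITLE=Exp054126
##JCAMP-DX=5.00
##DATA TYPE=LINK
##TITLE=0.98
##JCAMP-DX=5.00
##XYPOINTS=(XY..XY)
1.0, 2.0
1.1, 2.5
##END=
##TITLE=7.72
##XYPOINTS=(XY..XY)
1.0, 3.0
##END=
##END= $$end of BLOCKs
"""

for spectrum in split_block(SAMPLE_BLOCK):
    print(spectrum["TITLE"], len(spectrum["data"]), "points")
```

The sample text is invented for the illustration; real files carry the full set of header records listed above.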
CreateBLOCKFile can be executed from a Windows command prompt using the command
java -cp . CreateBLOCKFile <configuration file>
where <configuration file> is an XML file from which the configuration parameters are read (do not include the angle brackets). As shown, the command assumes that the path to the Java run-time executable is in your PATH system variable. If it is not, you will need to modify your PATH variable accordingly, or give the full path to the executable on the command line. The configuration file has the following format, e.g.:
<?xml version="1.0" encoding="utf-8" ?>
<Configuration
InputFileDirectory="C:\Drexel\Cheminformatics\JCAMP\exp054\126"
SequenceStart="1/26/2007 10:50:00 EST"
InputFileRoot="*"
InputFiles=""
HTTPath=""
OutputFile="Exp054126"
Title="##TITLE=Exp054126"
JCAMPVersion="##JCAMP-DX=5.00"
DataType="##DATA TYPE=LINK"
BunchingFactor="10"
BaselineRange="0.5,1.5,10,11"
/>
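Because every parameter is an attribute of the single Configuration element, the file is easy to inspect from a script. The Python sketch below is a hypothetical helper, not the reader used by CreateBLOCKFile. It loads the attributes into a dictionary and converts BunchingFactor and BaselineRange to numbers; treating those two as an integer and a list of floats is an assumption based on the sample values.

```python
import xml.etree.ElementTree as ET

SAMPLE_CONFIG = r'''<?xml version="1.0" encoding="utf-8" ?>
<Configuration
InputFileDirectory="C:\Drexel\Cheminformatics\JCAMP\exp054\126"
SequenceStart="1/26/2007 10:50:00 EST"
InputFileRoot="*"
InputFiles=""
OutputFile="Exp054126"
BunchingFactor="10"
BaselineRange="0.5,1.5,10,11"
/>'''


def load_configuration(xml_text):
    """Return the Configuration attributes as a dict (illustrative helper)."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    config = dict(root.attrib)
    # Assumed types: an integer bunching factor and comma-separated PPM values.
    config["BunchingFactor"] = int(config["BunchingFactor"])
    config["BaselineRange"] = [float(v) for v in config["BaselineRange"].split(",")]
    return config


config = load_configuration(SAMPLE_CONFIG)
print(config["InputFileDirectory"], config["BunchingFactor"], config["BaselineRange"])
```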
InputFileDirectory is the directory in which the JCAMP files reside. Note that this directory must already exist; the software does not attempt to create it. InputFileDirectory can be a network drive path (i.e. //<machine name>/<directory>/<sub-directory>), although the user must have write access to it, as output files will be written there. SequenceStart is a parameter used for concentration/kinetics studies: if it is set to a date/time in the format shown here, then when CreateBLOCKFile generates the BLOCK file it will replace the ##TITLE= fields in the JCAMP files with a time stamp (in minutes) calculated as the difference between the file’s internal date/time field and SequenceStart. If SequenceStart is left blank, this calculation/replacement will not be performed.
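The time stamp calculation can be pictured as follows. This Python sketch is an illustration, not the code in CreateBLOCKFile. It assumes that the file's internal date/time is available in the same textual format as SequenceStart and that both stamps are in the same time zone, so the trailing zone abbreviation is simply ignored. The function names are hypothetical.

```python
from datetime import datetime

STAMP_FORMAT = "%m/%d/%Y %H:%M:%S"


def parse_stamp(stamp):
    """Parse '1/26/2007 10:50:00 EST', ignoring the zone abbreviation (assumption)."""
    date_part, time_part = stamp.split()[:2]
    return datetime.strptime(date_part + " " + time_part, STAMP_FORMAT)


def minutes_since_start(sequence_start, file_stamp):
    """Offset of a spectrum from SequenceStart, in minutes."""
    delta = parse_stamp(file_stamp) - parse_stamp(sequence_start)
    return delta.total_seconds() / 60.0


# A spectrum acquired 59 seconds after SequenceStart:
print(minutes_since_start("1/26/2007 10:50:00 EST", "1/26/2007 10:50:59 EST"))
```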
InputFileRoot and InputFiles determine which files in InputFileDirectory are processed. InputFileRoot can be used as a search prefix; here it is set to "*" to indicate all files in the directory with the extension jdx. InputFiles is simply a list of comma-separated file names. Note that if the search prefix field is used, the file extension is assumed to be jdx. OutputFile is the prefix name for all the files created by the software; these files are also placed in InputFileDirectory.
If there is an entry under HTTPPath (which must be terminated with /), this will be combined with any entries in InputFiles to search for and download those files from the specified address into InputFileDirectory. Once the files are downloaded, this field can be blanked out, of course.
The Title, JCAMPVersion, and DataType fields comprise the three header fields for the JCAMP BLOCK file. They should be set as shown here, except that Title should be set to the actual experiment title, e.g., Title="##TITLE=Exp054126". BunchingFactor reduces the amount of data in the JCAMP file by replacing each bunched set of points with the average of the bunch; i.e., BunchingFactor="10" replaces each set of 10 points with its average value, reducing the amount of data by 90%. This is useful to save processing time, and may also be necessary for the peak-searching algorithm to work well. BaselineRange is a set of PPM values, encompassing baseline regions near the beginning and end of the spectrum, that will be used to subtract out any baseline drift; at present, suitable values can only be determined by viewing the spectra.
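The two data-reduction steps can be sketched in Python as below. The bunching function follows the description above directly, except that dropping an incomplete trailing bunch is an assumption. The baseline function is more speculative: the text does not say how drift is modelled, so this sketch simply draws a straight line through the average point of each of the two baseline regions and subtracts it. Both function names are hypothetical and neither is taken from CreateBLOCKFile.

```python
def bunch(points, factor):
    """Replace each run of `factor` (x, y) points with its average point."""
    bunched = []
    # Assumption: a trailing run shorter than `factor` is discarded.
    for i in range(0, len(points) - factor + 1, factor):
        chunk = points[i:i + factor]
        mean_x = sum(x for x, _ in chunk) / factor
        mean_y = sum(y for _, y in chunk) / factor
        bunched.append((mean_x, mean_y))
    return bunched


def subtract_linear_baseline(points, baseline_range):
    """Subtract a straight line fixed by two baseline regions (assumed model)."""
    low1, high1, low2, high2 = baseline_range

    def centroid(low, high):
        region = [(x, y) for x, y in points if low <= x <= high]
        count = len(region)
        return (sum(x for x, _ in region) / count,
                sum(y for _, y in region) / count)

    (x1, y1), (x2, y2) = centroid(low1, high1), centroid(low2, high2)
    slope = (y2 - y1) / (x2 - x1)
    return [(x, y - (y1 + slope * (x - x1))) for x, y in points]


raw = [(float(i), 2.0 * i) for i in range(25)]
print(bunch(raw, 10))  # 25 points reduced to 2 averaged points
```

With BunchingFactor="10", 25 input points yield two averaged points and the last five are dropped under the stated assumption.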
As noted, CreateBLOCKFile can be executed as a stand-alone program. The resulting BLOCK JCAMP file, which has the file path name <InputFileDirectory>\<OutputFile>.jdx, can be loaded and viewed in JSpecView. (In addition to the BLOCK JCAMP file, CreateBLOCKFile also creates an output file in CML format, with the path <InputFileDirectory>\<OutputFile>.cml.) However, the real strength of the software lies with the JCAMPNMR.xls Excel workbook which accompanies it. When you open this workbook, it prompts you for an initialization file; opening the test initialization file yields the screen below (if you cancel out of the File Open box, you will get this same screen, with only a few parameters filled in):
There are five command buttons in Row #1 of this sheet: |Get Parameters|, which opens the same File Open box that appears when the workbook is opened; |Save Parameters|, which opens a File Save As box to save the initialization parameters (this function must be invoked after every parameter change); |Build BLOCK File|, which runs the CreateBLOCKFile program described above, using the parameters in the [BLOCKFileBuilder] parameter section (first grey row); and finally |Process BLOCK File| and |Reintegrate Peaks|, which will be described shortly.
The entry fields (white areas) in the [BLOCKFileBuilder] section are the same as those described previously for the CreateBLOCKFile program. Again, the Java run-time environment must be in your PATH system variable for the workbook to run CreateBLOCKFile as shown; otherwise you will need to specify this path. Clicking |Save Parameters| not only saves the initialization file for the workbook but also creates or overwrites the configuration file for CreateBLOCKFile. The fields in the [Parameters] section, except for ThreshholdMultiplier, XLabel, and YLabel, control the layout and appearance of the various plots on the worksheets which will be created. They should be left as they are in the beginning.
The [Peak#] sections describe the various peaks, or integration regions, expected to be found in the NMR spectra to be processed. If these fields are left blank, the software will search for and produce a list of peaks when |Process BLOCK File| is clicked. To search for peaks effectively it may be necessary to tweak the BunchingFactor and ThreshholdMultiplier fields. Remember that when any field in the [BLOCKFileBuilder] section is changed, both |Save Parameters| and |Build BLOCK File| must be run before |Process BLOCK File|; changing field entries in either the [Parameters] or [Peak#] sections only requires |Save Parameters| to be run first.
The entries in the [Peak#] section above are as follows. PeakName, PeakStart, and PeakEnd are self-descriptive. NucleusCount is the number of NMR active nuclei in the peak being integrated; for example, 3 for the “Methyl” species in [Peak1]. The Concentration field is used only if InternalStandard is set to “True”; in this case, a response factor is calculated for each spectrum processed based on this standard, and this factor is used in conjunction with the NucleusCount fields of all the peaks to determine the actual run-time concentration of the associated species in all the spectra of the BLOCK file. InternalStandard should therefore only be set to “True” for species known to be constant across a set of spectra in a BLOCK. If this is done, then if the entry in the Concentration field is, say, in mmol, “mmol” should be the entry in the YLabel field of the [Parameters] section. ReactionOrder is used in time studies, and specifies the expected reaction order for the particular species represented by the peak.
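The internal-standard calculation can be illustrated with a small Python sketch. The exact formula used by the workbook is not given here, so the code assumes the usual NMR quantitation rule: the response factor is the known concentration of the standard divided by its integrated area per nucleus, and every other peak's concentration is that factor times its own area per nucleus. The dictionary keys and the function name are hypothetical, and the areas are invented for the example.

```python
def concentrations_from_standard(peaks):
    """Convert integrated areas to concentrations using one internal standard.

    Each peak is a dict with 'name', 'area' and 'nucleus_count'; the standard
    also carries 'internal_standard': True and its known 'concentration'.
    """
    standard = next(p for p in peaks if p.get("internal_standard"))
    # Response factor: concentration units per unit of area per nucleus.
    factor = standard["concentration"] / (standard["area"] / standard["nucleus_count"])
    return {p["name"]: factor * p["area"] / p["nucleus_count"] for p in peaks}


example_peaks = [
    {"name": "Methyl", "area": 300.0, "nucleus_count": 3,
     "internal_standard": True, "concentration": 750.0},
    {"name": "imine", "area": 20.0, "nucleus_count": 1},
    {"name": "aldehyde", "area": 40.0, "nucleus_count": 1},
]
print(concentrations_from_standard(example_peaks))
```

Because the factor is recomputed for each spectrum, the standard always comes out at its entered concentration, which is why it must be a species whose amount does not change.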
Note that in the above example, there is an entry in the SequenceStart field of the [BLOCKFileBuilder] section. Because of this, when |Build BLOCK File| is clicked, the spectra will be assigned a time offset from SequenceStart in their "##TITLE=" fields, as described above for the CreateBLOCKFile program. As these time offsets will be in minutes, the XLabel field under [Parameters] is set to “Minutes”.
Once the BLOCK file is created with |Build BLOCK File|, click |Process BLOCK File| to read and plot the spectra, integrate all selected regions/peaks, and create a kinetics plot for the entire data set. When the calculations and plotting are finished you should be presented with a “Summary” worksheet like this one:
This is the concentration versus time data for the spectra analyzed in this example, as well as (the top part of) the kinetics plots. Note that the peaks in column A correspond to the entries in the [Peak#] sections, while row 1 contains the time offsets calculated by CreateBLOCKFile using the entry in the SequenceStart field. The rest of the data are actual concentrations, calculated by using the Methyl peak as an internal standard (which is of course why its concentration remains constant). The growth of the imine peak, and the concurrent loss of aldehyde, are apparent. The kinetics of these reactions can be studied with the kinetics plots.
In addition to this worksheet, the results are also saved in XML format, with the file name <OutputFile>.Summary.xls. An example of this file is shown below.
<?xml version="1.0" encoding="utf-8" ?>
<summary>
<peak name="Methyl (2.1-2.3 PPM)" response="750" units="mmol">
<peak name="imine (8.2-8.3 PPM)" response="69.990229061292" units="mmol">
<peak name="aldehyde (9.7-9.8 PPM)" response="342.4731068236" units="mmol">
</time>
<peak name="Methyl (2.1-2.3 PPM)" response="750" units="mmol">
<peak name="imine (8.2-8.3 PPM)" response="175.355496494093" units="mmol">
<peak name="aldehyde (9.7-9.8 PPM)" response="262.953188975104" units="mmol">
</time>
<peak name="Methyl (2.1-2.3 PPM)" response="750" units="mmol">
<peak name="imine (8.2-8.3 PPM)" response="196.558712273177" units="mmol">
<peak name="aldehyde (9.7-9.8 PPM)" response="245.172397666404" units="mmol">
</time>
<peak name="Methyl (2.1-2.3 PPM)" response="750" units="mmol">
<peak name="imine (8.2-8.3 PPM)" response="233.131664764647" units="mmol">
<peak name="aldehyde (9.7-9.8 PPM)" response="218.374584166605" units="mmol">
</time>
<peak name="Methyl (2.1-2.3 PPM)" response="750" units="mmol">
<peak name="imine (8.2-8.3 PPM)" response="364.182213112691" units="mmol">
<peak name="aldehyde (9.7-9.8 PPM)" response="110.220554319361" units="mmol">
</time>
<peak name="Methyl (2.1-2.3 PPM)" response="750" units="mmol">
<peak name="imine (8.2-8.3 PPM)" response="460.692963297133" units="mmol">
<peak name="aldehyde (9.7-9.8 PPM)" response="31.4825284968032" units="mmol">
</time>
<peak name="Methyl (2.1-2.3 PPM)" response="750" units="mmol">
<peak name="imine (8.2-8.3 PPM)" response="473.891906730923" units="mmol">
<peak name="aldehyde (9.7-9.8 PPM)" response="18.6522035285094" units="mmol">
</time>
<peak name="Methyl (2.1-2.3 PPM)" response="750" units="mmol">
<peak name="imine (8.2-8.3 PPM)" response="476.217523575164" units="mmol">
<peak name="aldehyde (9.7-9.8 PPM)" response="17.111315851817" units="mmol">
</time>
<peak name="Methyl (2.1-2.3 PPM)" response="750" units="mmol">
<peak name="imine (8.2-8.3 PPM)" response="499.173044708272" units="mmol">
<peak name="aldehyde (9.7-9.8 PPM)" response="2.64182945669399" units="mmol">
</time>
<peak name="Methyl (2.1-2.3 PPM)" response="750" units="mmol">
<peak name="imine (8.2-8.3 PPM)" response="514.74935840871" units="mmol">
<peak name="aldehyde (9.7-9.8 PPM)" response="0.554508078018416" units="mmol">
</time>
<peak name="Methyl (2.1-2.3 PPM)" response="750" units="mmol">
<peak name="imine (8.2-8.3 PPM)" response="496.82837596795" units="mmol">
<peak name="aldehyde (9.7-9.8 PPM)" response="0.305176908201525" units="mmol">
</time>
</summary>
<kinetics>
<peak name="imine (8.2-8.3 PPM)">
<times units = "Minutes">
0.983333333333333 7.71666666666666 9.71666666666666 13.1833333333333 36.6666666666666 90.4166666666666
119.65 122.2 297.583333333333 410.183333333333 454.183333333333
</times>
<responses units="Ln(mmol)">
4.2483556474678 5.1668153219409 5.2809611768747 5.4516033787922 5.89765432786005 6.13273179788803
6.16097925081742 6.16587473214757 6.21295281861871 6.24368009949374 6.20824464647595
</responses>
</peak>
<peak name="aldehyde (9.7-9.8 PPM)">
<times units = "Minutes">
0.983333333333333 7.71666666666666 9.71666666666666 13.1833333333333 36.6666666666666 90.4166666666666
119.65 122.2 297.583333333333 410.183333333333 454.183333333333
</times>
<responses units="Ln(mmol)">
5.83619313439634 5.57197602764808 5.50196162703384 5.38621186423537 4.70248339765095 3.44943274106074
2.92596429077317 2.83973999037175 0.971471653181662 -0.589673904185266 -1.18686364363506
</responses>
</peak>
</kinetics>
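The Ln(mmol) responses in the kinetics section are the natural logarithms of the concentrations in the summary section (for example, ln 342.47 ≈ 5.836 for the first aldehyde point). For a first-order process a plot of ln(concentration) against time is a straight line whose slope is the negative of the rate constant. The Python sketch below fits that line to the aldehyde data listed above by ordinary least squares. It is an independent illustration, not the workbook's own fitting routine, and it assumes first-order behaviour for the aldehyde.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Times (minutes) and Ln(mmol) responses for the aldehyde peak, copied from above.
times = [0.983333333333333, 7.71666666666666, 9.71666666666666, 13.1833333333333,
         36.6666666666666, 90.4166666666666, 119.65, 122.2,
         297.583333333333, 410.183333333333, 454.183333333333]
ln_aldehyde = [5.83619313439634, 5.57197602764808, 5.50196162703384, 5.38621186423537,
               4.70248339765095, 3.44943274106074, 2.92596429077317, 2.83973999037175,
               0.971471653181662, -0.589673904185266, -1.18686364363506]

slope, intercept = linear_fit(times, ln_aldehyde)
rate_constant = -slope  # first-order rate constant, per minute
print("k =", rate_constant, "per minute")
```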
The concentration data are collected from the worksheets containing the individual spectra. These are listed at the bottom of the workbook, with the names “Spectrum #1”, “Spectrum #2”, etc. These spectra are read from the JCAMP BLOCK file, plotted and integrated, before the “Summary” sheet is generated. A typical spectrum worksheet is shown below (“Spectrum #4”):
The actual JCAMP BLOCK file data are in columns A and B of the worksheet (column C contains the integrated data as shown in purple on the plot), while columns D through F contain the integration regions/peaks and their summed areas or Response. In addition to the integration plot, the individual integration regions are delineated on the plot with triangular tick marks. If any fields in the [Parameters] or [Peak#] sections are changed, the spectra can be re-integrated by clicking |Reintegrate Peaks|; this is simply a convenience over closing the workbook and reprocessing with |Process BLOCK File| instead.
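The worksheet columns described above can be mimicked in a few lines of Python. This is a sketch under simple assumptions, not the VBA used in the workbook: the Response of a region is taken to be the plain sum of the intensities whose PPM value lies between PeakStart and PeakEnd, and the integral trace (column C) is taken to be a running sum of the intensities. The function names and the data are invented for the illustration.

```python
def integrate_region(points, peak_start, peak_end):
    """Sum the intensities of all (ppm, intensity) points inside the region."""
    low, high = min(peak_start, peak_end), max(peak_start, peak_end)
    return sum(y for x, y in points if low <= x <= high)


def running_integral(points):
    """Cumulative sum of intensities, analogous to the integral trace."""
    total, trace = 0.0, []
    for _, y in points:
        total += y
        trace.append(total)
    return trace


spectrum = [(2.0, 1.0), (2.1, 2.0), (2.2, 3.0), (2.4, 4.0)]
print(integrate_region(spectrum, 2.1, 2.3))
print(running_integral(spectrum))
```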