Petroleum Hydrocarbon Fingerprinting - Numerical Interpretation Developments

John W. Wigger, P.E. Environmental Liability Management, Inc., Tulsa, Oklahoma

Bruce E. Torkelson, Torkelson Geochemistry, Inc., Tulsa, Oklahoma

ABSTRACT

A feasibility study was conducted to assess the use of mathematical algorithms as aids for interpreting hydrocarbon fingerprint data. The first algorithm developed applies a correlation routine to determine the degree of similarity among different hydrocarbon samples. The second algorithm models the evaporation portion of the weathering process in gasoline. Controlled evaporation experiments of different grades of gasoline samples were used to create a matrix of numerical multipliers that describes individual component volatilization. This matrix can be used to: 1) estimate the composition of gasoline after a release, and inversely 2) estimate the original composition of a gasoline obtained subsequent to a release (for example, from a monitoring well).

Algorithms like these will probably never fully replace the visual process now used to interpret hydrocarbon fingerprints; however, they have the potential to add a more objective and quantitative perspective to the process.

INTRODUCTION

Hydrocarbon fuels and derivative products discovered in soils and groundwater at environmental release sites are often characterized by a laboratory technique known as capillary column gas chromatography (also referred to as hydrocarbon fingerprinting, gas chromatography, or GC fingerprinting). GC fingerprinting is an extremely useful tool in the investigation of subsurface contamination of soil and groundwater (Bruce and Schmidt). GC fingerprints are used to obtain information from liquid hydrocarbon samples (free product) by determining the composition of the hydrocarbons present. The identification and interpretation of GC fingerprints, however, is largely a qualitative practice that depends upon the skill and experience of the individual(s) involved.

BACKGROUND

Petroleum Hydrocarbon Chemistry

Petroleum hydrocarbons consist of a very large number of compounds that, by definition, are found in crude oil, as well as other sources of petroleum such as natural gas, coal, and peat. Petroleum hydrocarbons consist of three major groups of compounds. These are alkanes (paraffins), alkenes (olefins), and aromatics.

Paraffins are one of the major constituents of crude oil and are found in refined petroleum products such as gasoline, kerosene, diesel fuel, heating oil, etc. There are three major classes of paraffins: linear alkanes, branched alkanes, and naphthenes. The linear alkanes have carbon atoms arranged in a line, so these molecules have only two ends. Linear alkanes are also referred to in the literature as n-alkanes. Branched alkanes have the carbon atoms arranged similarly to the n-alkanes; however, some of the carbon atoms are branched, creating many differing configurations. Naphthenes are molecules in which the carbon atoms are arranged in one or more rings.

Olefins are formed during the refining process of creating petroleum products from crude oil. These molecules have a double bond and two fewer hydrogen atoms than their corresponding alkane.

Aromatics contain one or more 6 carbon rings, each conventionally drawn with 3 alternating carbon-carbon double bonds. Examples of 1 ring (or mononuclear) aromatics are benzene, toluene, ethylbenzene, and xylene (BTEX). Multiple ring (polynuclear) aromatics are aromatic compounds containing two or more 6 carbon rings. Examples of these are naphthalene, anthracene, pyrene, and many more.

Hydrocarbon products such as gasoline, diesel fuel, and asphalts are all created from crude oil by a variety of refining and distillation processes. Each product is a combination of many individual hydrocarbon compounds, all of which have slightly different vaporization and boiling temperatures. For example, gasoline is a combination of many lower boiling range compounds, including C4 to C12 alkanes, C4 to C7 alkenes, and the BTEX aromatics. The middle boiling range compounds are used in differing proportions to create products such as kerosene, diesel, and heating oil. These products predominantly contain C10 to C24 alkanes and polynuclear aromatics, with little to no olefins (Zemo, Graff, and Bruya).

Hydrocarbon Fingerprinting

GC fingerprints are created by injecting a small portion of the sample into a gas chromatograph. Once injected, the product is heated, vaporized, and carried into a capillary column by a flow of inert gas. After injection, the temperature of the column is slowly raised. As the temperature increases, the compounds begin to move through the column; in general, the more volatile, lower boiling compounds start moving first. A flame ionization detector connected to the end of the column detects the components of the product as they elute from the column. The time that it takes for an individual component to pass through the column depends on the temperature, the length of the column, the column characteristics, and the character of the compound itself.

Five GC fingerprints of various hydrocarbon products (gasoline, kerosene, diesel, JP-8 jet fuel and crude oil) are presented in Figures 1 through 5. A few of the peaks have been labeled identifying some of the compounds in each of the products.

Figure 1. Gasoline I (Regular Unleaded Gasoline)

Figure 2. Kerosene I

Figure 3. Diesel Fuel

Figure 4. JP-8 Jet Fuel

Figure 5. Crude Oil

MATERIALS & METHODS

Gas Chromatography

Hydrocarbon samples were analyzed on a Hewlett Packard 5890 instrument equipped with a split/splitless injector, a J&W 30 meter DB-1 column, and a flame ionization detector (FID). All gas flow rates were set to manufacturer specifications. Injections were made in split mode with a split ratio of 1:100. The column oven was programmed from -10 °C to 350 °C at 10 °C/minute with a 4 minute hold at 350 °C. The injector temperature was set at 350 °C and the detector temperature at 360 °C. Data were acquired and processed with an EZChrom Chromatography data system.

Gasoline Weathering Simulation

One of the weathering processes that can affect released gasoline is evaporation. To simulate evaporation under controlled conditions, three different grades of gasoline were obtained from a local retailer and allowed to evaporate to controlled volumes.

The gasoline components with the lowest boiling points tend to volatilize more rapidly than the components with higher boiling points. On the GC fingerprint the components that have the shortest retention times (left side of the GC fingerprint) are the most volatile and will tend to decrease in peak intensity preferentially as more volatilization takes place. This is clearly illustrated in Figure 6, where GC fingerprints from the same gasoline are shown under differing levels of volatilization.

Figure 6. Controlled Evaporation Of Regular Grade Of Unleaded Gasoline (Note: Chromatograms Have Been Normalized To Make The Heights of Naphthalene Peaks Equal)

Evaporation Procedures

The evaporation procedure that was used is described below:

1) Samples of three grades of gasoline (87, 89, and 93 octane) were acquired at a local service station.
2) Equal volumes of each grade of gasoline were placed in four (4) 40 milliliter vials, making a total of 12 vials.
3) Three (3) vials of each grade (9 in total) were uncovered and allowed to volatilize at room temperature.
4) The uncovered vials were closely monitored and capped when the gasoline had been reduced to the desired volume, resulting in one vial each at 75%, 50%, and 25% of original volume for each of the three grades of gasoline.

Algorithms Described

Correlation Coefficient

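The correlation coefficient, denoted by $r$, measures the relationship between two data sets, scaled to be independent of the unit of measure, and is given by the formula:

$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$

where $x_i$ and $y_i$ are the paired values in each corresponding data set, $\bar{x}$ and $\bar{y}$ are the means of the two data sets, and $n$ is the number of paired values.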

The value of the correlation coefficient is always between -1 and +1. A value of r equal to -1 indicates a perfect linear relationship between the sample values of x and y, with the value of y decreasing as the value of x increases. A value of r equal to +1 also indicates a perfect linear relationship between the sample values, but one in which the value of y increases as x increases: larger values of y are associated with larger values of x, and smaller values of y are associated with smaller values of x. If there is no linear relationship between the sample values of x and y, then r will have a value near or equal to zero (Hayslett).

The correlation coefficient determines whether two data sets move together; that is, whether large values of one set are associated with large values of the other (positive correlation), whether small values of one set are associated with large values of the other (negative correlation), or whether the values in the two sets are unrelated.
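The calculation described above can be sketched in a few lines of Python. This is a minimal illustration of the standard (Pearson) correlation coefficient, not the routine used in the study; the function name `correlation_coefficient` and the peak-area values are hypothetical and chosen only to show the behavior.

```python
import math

def correlation_coefficient(x, y):
    """Pearson correlation coefficient between two equal-length data sets."""
    if len(x) != len(y):
        raise ValueError("data sets must contain the same number of values")
    n = len(x)
    if n < 2:
        raise ValueError("at least two paired values are required")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a data set is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical integrated peak areas for five components (illustration only).
sample_a = [120.0, 450.0, 300.0, 80.0, 15.0]
sample_b = [240.0, 900.0, 600.0, 160.0, 30.0]  # same pattern, twice the response
sample_c = [15.0, 80.0, 300.0, 450.0, 120.0]   # different distribution of peaks

r_ab = correlation_coefficient(sample_a, sample_b)
r_ac = correlation_coefficient(sample_a, sample_c)
print(f"r(a, b) = {r_ab:.3f}, r(a, c) = {r_ac:.3f}")
```

Because the coefficient is scaled to be independent of the unit of measure, doubling every peak area (as in `sample_b`) leaves the correlation with `sample_a` at +1, while a different distribution of peak areas gives a much lower value.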

In this study, 71 hydrocarbon chromatogram peaks, each representing a different hydrocarbon compound, were used in the analysis. Table 1 presents a list of the compounds. Integrated peak areas were measured and then tabulated for each of the five hydrocarbon samples in Figures 1 through 5. The integrated peak area, measured in millivolt seconds, is a measure of the intensity of the response of the flame ionization detector to each of the individual compounds. The library samples include gasoline, kerosene, diesel, JP-8 jet fuel, and crude oil. Once the peak area data were collected and tabulated for these hydrocarbon samples, three additional hydrocarbon samples were also collected. The first sample had been identified as a kerosene by the provider; however, the GC fingerprint clearly showed a much broader range of hydrocarbons than the kerosene run earlier. The second sample was a laboratory standard mixture of gasoline and diesel fuel. The third sample was a regular grade of unleaded gasoline from a different service station. Figures 7, 8, and 9 present the GC fingerprints of these three samples, respectively.

Table 1. 71 Hydrocarbon Compounds Used For Analysis

1 / iC4 = Isobutane / 37 / IP14 = C14 Isoprenoid
2 / nC4 = Butane / 38 / 1 M naph = 1 Methylnaphthalene
3 / iC5 = Isopentane / 39 / nC13 = Tridecane
4 / nC5 = Pentane / 40 / IP15 = Farnesane
5 / 2 M Pent = 2 Methylpentane / 41 / nC14 = Tetradecane
6 / 3 M Pent = 3 Methylpentane / 42 / IP16 = C16 Isoprenoid
7 / nC6 = Hexane / 43 / nC15 = Pentadecane
8 / C6 Olefin = C6 Olefin / 44 / nC16 = Hexadecane
9 / M Cycl Pent = Methyl cyclopentane / 45 / IP18 = Norpristane
10 / 2,4 DMP = 2,4 Dimethylpentane / 46 / nC17 = Heptadecane
11 / Bnz = Benzene / 47 / Pristane = Pristane
12 / Cyclo Hexane = Cyclohexane / 48 / nC18 = Octadecane
13 / 2 M Hex = 2 Methylhexane / 49 / Phytane = Phytane
14 / 3 M Hex = 3 Methylhexane / 50 / nC19 = Nonadecane
15 / Isooctane = Isooctane or 2,2,4 Trimethylpentane / 51 / nC20 = Eicosane
16 / nC7 = Heptane / 52 / nC21 = Heneicosane
17 / MCHX = Methylcyclohexane / 53 / nC22 = Docosane
18 / Tol = Toluene / 54 / nC23 = Tricosane
19 / nC8 = Octane / 55 / nC24 = Tetracosane
20 / EB = Ethylbenzene / 56 / nC25 = Pentacosane
21 / m/p-xyl = m/p Xylene / 57 / nC26 = Hexacosane
22 / o-xyl = o Xylene / 58 / nC27 = Heptacosane
23 / nC9 = Nonane / 59 / nC28 = Octacosane
24 / propylbenzene = n Propylbenzene / 60 / nC29 = Nonacosane
25 / 1M 3E benz = 1 Methyl 3 ethylbenzene / 61 / nC30 = Triacontane
26 / 1M 4E benz = 1 Methyl 4 ethylbenzene / 62 / nC31 = Hentriacontane
27 / 1,3,5 T M Benz = 1,3,5 Trimethylbenzene / 63 / nC32 = Dotriacontane
28 / 3,3,4 T M Hept = 3,3,4 Trimethylheptane / 64 / nC33 = Tritriacontane
29 / 1,2,4 T M Benz = 1,2,4 Trimethylbenzene / 65 / nC34 = Tetratriacontane
30 / nC10 = Decane / 66 / nC35 = Pentatriacontane
31 / 1,2,3 T M Benz = 1,2,3 Trimethylbenzene / 67 / nC36 = Hexatriacontane
32 / nC11 + 1,2,4,5 TeMB = Undecane and 1,2,4,5 Tetramethylbenzene / 68 / nC37 = Heptatriacontane
33 / Naph = Naphthalene / 69 / nC38 = Octatriacontane
34 / nC12 = Dodecane / 70 / nC39 = Nonatriacontane
35 / IP13 = C13 Isoprenoid / 71 / nC40 = Tetracontane
36 / 2 M naph = 2 Methylnaphthalene

Figure 7. Kerosene II

Figure 8. Gasoline Diesel Laboratory Mixture

Figure 9. Gasoline II (Regular Unleaded Gasoline)

Once the peak areas for all of the 71 individual components were established for each of the samples, the correlation coefficient was calculated between the samples presented in Figures 1 through 5, and those presented in Figures 7, 8, and 9. Table 2 presents the results of these calculations.

Table 2. Results of Correlation Coefficient Determinations

Sample / Gasoline I (Figure 1) / Kerosene I (Figure 2) / Diesel (Figure 3) / JP-8 Jet Fuel (Figure 4) / Crude Oil (Figure 5)
Kerosene II (Figure 7) / -0.156 / 0.732 / 0.882 / 0.932 / 0.379
Gas/Diesel Mixture (Figure 8) / 0.638 / 0.333 / 0.528 / 0.505 / 0.440
Gasoline II (Figure 9) / 0.894 / -0.112 / -0.213 / -0.065 / 0.134
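As a small illustration of how a table of this kind might be used, the sketch below picks, for each of the three additional samples, the library sample with the highest correlation coefficient. The coefficients are copied from Table 2; the helper name `best_match` is hypothetical, and selecting the single highest value is only one simple way to read the table.

```python
# Library samples (columns of Table 2) and correlation coefficients (rows of Table 2).
library = ["Gasoline I", "Kerosene I", "Diesel", "JP-8 Jet Fuel", "Crude Oil"]
table_2 = {
    "Kerosene II": [-0.156, 0.732, 0.882, 0.932, 0.379],
    "Gas/Diesel Mixture": [0.638, 0.333, 0.528, 0.505, 0.440],
    "Gasoline II": [0.894, -0.112, -0.213, -0.065, 0.134],
}

def best_match(coefficients, names):
    """Return (library sample name, r) for the highest correlation coefficient."""
    r, name = max(zip(coefficients, names))
    return name, r

best = {unknown: best_match(row, library) for unknown, row in table_2.items()}
for unknown, (name, r) in best.items():
    print(f"{unknown}: closest library sample is {name} (r = {r:.3f})")
```

The highest value alone does not establish identity. The gasoline and diesel mixture, for example, is closest to Gasoline I at only 0.638, which is consistent with it not being a single library product.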

To evaluate the reproducibility of this process, the regular unleaded gasoline presented in Figure 9 was run on the GC five separate times, creating five GC fingerprints and five slightly differing numerical data sets. The data collected for all 71 compounds in each run were then compared with those of each of the other four runs, giving a total of twenty (20) correlation coefficient comparisons. Because the correlation coefficient is symmetric in its two data sets, these twenty comparisons correspond to ten unique pairs of runs.
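The bookkeeping for such a replicate comparison can be sketched as follows. The peak areas are hypothetical stand-ins for five replicate runs (six components rather than 71), and the function name `pearson` is an assumption; the sketch only shows how the pairs are enumerated and summarized.

```python
import math
from itertools import combinations, permutations
from statistics import mean, stdev

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length data sets."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical peak areas from five replicate injections of one gasoline.
runs = [
    [5100, 53800, 129400, 22000, 34200, 18100],
    [5050, 54100, 128900, 22150, 34000, 18200],
    [5200, 53500, 130100, 21900, 34300, 18000],
    [5150, 53900, 129000, 22050, 34100, 18150],
    [5080, 54000, 129700, 21950, 34250, 18050],
]

ordered_pairs = list(permutations(range(len(runs)), 2))  # each run vs. the other four
unique_pairs = list(combinations(range(len(runs)), 2))   # symmetric duplicates removed

coefficients = [pearson(runs[i], runs[j]) for i, j in unique_pairs]
summary = {
    "min": min(coefficients),
    "max": max(coefficients),
    "mean": mean(coefficients),
    "stdev": stdev(coefficients),
}
print(len(ordered_pairs), len(unique_pairs), summary)
```

Summarizing over the ordered pairs instead would give the same minimum, maximum, and mean, since each unique value simply appears twice; only the standard deviation changes slightly.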

Gasoline Weathering

An algorithm was developed to model the volatilization process of gasoline released into the environment. This was accomplished by using experimental data obtained from the controlled evaporation of the three different grades of gasoline described earlier. GC fingerprint data were used to create a numerical function that describes the volatilization process. This numerical function can then be applied to fresh gasoline samples to predict what the product GC fingerprint would look like if weathered in an environmental release.

As discussed earlier, the gasoline components with the lowest boiling points tend to volatilize more rapidly than the rest of the components. The components with the highest boiling points (components at the right hand side of the GC fingerprints with retention times > 10 minutes) experience little volatilization under the weathering conditions described above.

With the above in mind, it was assumed that the actual volume of naphthalene stayed relatively constant during the weathering simulation, so that naphthalene could be used in a manner similar to an internal standard. On this basis, the GC data from each stage of the weathering process were normalized to the naphthalene peak. Once each GC fingerprint was normalized to naphthalene, the change in each component was evaluated as the total volume of product decreased. Once this process was completed for all components, a matrix of volatilization multipliers was created. This matrix consists of a table of factors ranging from 0.0 to 1.0 describing the fraction of each of the 71 components remaining at differing stages of evaporation of the total product.
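The normalization step can be illustrated with a short sketch. The raw peak areas are the Prem-25 values for the first eight components from Table 3. The naphthalene peak areas are not reported in this paper, so the two values used here are hypothetical and were chosen only so that the scale factor matches the ratio implied by comparing Tables 3 and 4 (about 0.72). The function name `normalize_to_reference` is likewise an assumption.

```python
def normalize_to_reference(peak_areas, reference_area, fresh_reference_area):
    """Rescale a chromatogram so its reference peak matches the fresh sample's."""
    scale = fresh_reference_area / reference_area
    return [area * scale for area in peak_areas]

# Raw Prem-25 peak areas (iC4 through C6 Olefin), copied from Table 3.
prem_25_raw = [0, 737, 23654, 6196, 20421, 11421, 10368, 1940]

# HYPOTHETICAL naphthalene peak areas; the paper does not report them.
naphthalene_fresh = 10000.0
naphthalene_prem_25 = 13876.0

prem_25_normalized = [
    round(area)
    for area in normalize_to_reference(prem_25_raw, naphthalene_prem_25, naphthalene_fresh)
]
print(prem_25_normalized)
```

With these assumed reference areas the rounded result reproduces the Prem-25 row of Table 4. The weathered sample's naphthalene peak is larger than the fresh sample's because naphthalene is concentrated as the lighter components evaporate.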

To demonstrate how the matrix was created, Table 3 presents the integrated peak areas for the first 8 of the 71 components from the premium grade unleaded gasoline used in the experiment. Table 4 presents the same component integrated peak areas after they have been normalized to naphthalene. Table 5 presents each component integrated peak area from Table 4 expressed as a fraction, from 0 to 1, of its value in the unevaporated sample. Table 5 represents the matrix of multipliers. Two similar tables were also produced for the mid-grade and regular grades of gasoline. The entire matrix of multipliers for each of the grades of gasoline is not presented because of the size of the tables.

Table 3. Peak Areas For Components At Differing % Volatilization (Premium Grade Gasoline)

Sample Id (Gasoline-% Vol.) / iC4 / nC4 / iC5 / nC5 / 2 M Pent / 3 M Pent / nC6 / C6 Olefin
Prem-0 / 5139 / 53846 / 129407 / 21998 / 34183 / 18112 / 14922 / 2865
Prem-25 / 0 / 737 / 23654 / 6196 / 20421 / 11421 / 10368 / 1940
Prem-50 / 0 / 0 / 0 / 0 / 1666 / 2305 / 2894 / 547
Prem-75 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0

Table 4. Peak Areas For Components At Differing % Volatilization After Normalizing With Naphthalene (Premium Grade Gasoline)

Sample Id (Gasoline-% Vol.) / iC4 / nC4 / iC5 / nC5 / 2 M Pent / 3 M Pent / nC6 / C6 Olefin
Prem-0 / 5139 / 53846 / 129407 / 21998 / 34183 / 18112 / 14922 / 2865
Prem-25 / 0 / 531 / 17047 / 4465 / 14717 / 8231 / 7472 / 1398
Prem-50 / 0 / 0 / 0 / 0 / 1165 / 1612 / 2024 / 383
Prem-75 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0

Table 5. Matrix Of Multipliers For Individual Components Of Gasoline Under Differing Percentages Of Overall Product Volatilization (Premium Grade Gasoline)

Sample Id (Gasoline-% Vol.) / iC4 / nC4 / iC5 / nC5 / 2 M Pent / 3 M Pent / nC6 / C6 Olefin
Prem-0 / 1 / 1 / 1 / 1 / 1 / 1 / 1 / 1
Prem-25 / 0 / 0.01 / 0.132 / 0.203 / 0.4305 / 0.4544 / 0.501 / 0.488
Prem-50 / 0 / 0 / 0 / 0 / 0.0341 / 0.089 / 0.136 / 0.1336
Prem-75 / 0 / 0 / 0 / 0 / 0 / 0 / 0 / 0
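The step from Table 4 to Table 5 amounts to dividing each normalized peak area by the corresponding value in the unevaporated sample. A minimal sketch, using the Table 4 values as input (the function name `multipliers` is hypothetical):

```python
components = ["iC4", "nC4", "iC5", "nC5", "2 M Pent", "3 M Pent", "nC6", "C6 Olefin"]

# Naphthalene-normalized peak areas, copied from Table 4.
normalized = {
    "Prem-0": [5139, 53846, 129407, 21998, 34183, 18112, 14922, 2865],
    "Prem-25": [0, 531, 17047, 4465, 14717, 8231, 7472, 1398],
    "Prem-50": [0, 0, 0, 0, 1165, 1612, 2024, 383],
    "Prem-75": [0, 0, 0, 0, 0, 0, 0, 0],
}

def multipliers(stage_areas, fresh_areas):
    """Fraction of each component remaining relative to the fresh sample."""
    return [s / f if f > 0 else 0.0 for s, f in zip(stage_areas, fresh_areas)]

matrix = {
    stage: multipliers(areas, normalized["Prem-0"])
    for stage, areas in normalized.items()
}

for stage, row in matrix.items():
    print(stage, [round(m, 4) for m in row])
```

The computed values agree with Table 5 to within rounding; for example, nC6 at 25% volatilization is 7472 / 14922, or about 0.501.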

RESULTS & DISCUSSION

Reproducibility

The reproducibility of the gas chromatography analysis technique was evaluated by analyzing the gasoline sample presented in Figure 9 a total of 5 times. The 20 correlation coefficients calculated between each of the 5 analyses and the other four had a minimum of 0.99545, a maximum of 0.999989, an average of 0.997921, and a standard deviation of 0.001704. This evaluation suggests that chromatograms of truly alike samples will probably have correlation coefficients of 0.99 or better. It is possible that other product types may have different reproducibilities, since their data may include different peaks that come from a different part of the GC fingerprint.

Correlation

Table 2 presents the results of the correlation coefficients calculated when comparing the samples in Figures 1 through 5 to those in Figures 7 through 9. Prior to calculating the correlation coefficients, it was expected that similar products, for example gasolines, would show higher correlations among themselves and lower correlations when compared to other product types. The exact numbers, however, could not be anticipated, nor could the way in which the correlation coefficients would vary between similar and different product types. Significant, logical, and reproducible differences and similarities in the correlation coefficients are crucial for this process to be a useful tool. The correlation coefficient results must also make sense and compare favorably with visual inspection of the GC fingerprints. From this feasibility study, it appears that there are meaningful similarities and differences in correlation coefficients calculated using GC fingerprint data. This study suggests that similar product types, such as gasolines, could be expected to have correlation coefficients of about 0.9 or better. Dissimilar product types have much lower correlation coefficients, of perhaps 0.6 or less. The correlation coefficients shown here also make sense when compared to the visual evaluation of the GC fingerprints.

An unexpected and interesting result of the correlations was that the JP-8 jet fuel and the Kerosene II sample had a high correlation coefficient. At first this seemed unusual, but it must be remembered that JP-8 and kerosene are often from the same distillation range of the crude oil. Visual comparison of the two chromatograms confirms the rather high degree of similarity of the two products.

Gasoline Weathering

The matrix of multipliers created for the evaporation sequences for the three different grades of gasoline numerically models how the volatile components tend to evaporate from the sample. The matrix of multipliers can be used to numerically alter ("evaporate") the data from a fresh sample in an attempt to estimate the composition of the sample after a weathering process. Once the sample has been artificially altered, it can then be numerically compared to other controlled weathered samples.
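A minimal sketch of this forward use of the matrix is shown below. The multipliers are the 25% volatilization row of Table 5; the fresh-sample peak areas are hypothetical, and the function name `apply_weathering` is an assumption. The result is on the naphthalene-normalized scale, like Table 4.

```python
components = ["iC4", "nC4", "iC5", "nC5", "2 M Pent", "3 M Pent", "nC6", "C6 Olefin"]

# Multipliers for 25% overall volatilization, copied from Table 5 (premium grade).
multipliers_25 = [0, 0.01, 0.132, 0.203, 0.4305, 0.4544, 0.501, 0.488]

def apply_weathering(fresh_areas, multipliers):
    """Estimate weathered peak areas by scaling each fresh peak area."""
    if len(fresh_areas) != len(multipliers):
        raise ValueError("one multiplier is required per component")
    return [area * m for area, m in zip(fresh_areas, multipliers)]

# HYPOTHETICAL peak areas for a fresh gasoline sample (illustration only).
fresh_sample = [4000, 50000, 120000, 20000, 30000, 16000, 14000, 2500]

predicted = apply_weathering(fresh_sample, multipliers_25)
for name, area in zip(components, predicted):
    print(f"{name}: {area:.1f}")
```

The predicted peak areas could then be correlated against measured weathered samples in the same way as any other pair of data sets.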

This weathering algorithm can also be used in the inverse. For example, if one had a hydrocarbon sample from a site but did not know the extent of weathering that had already taken place, the sample could be correlated with the library of samples of known weathered gasolines. Once the library sample with the highest correlation has been determined, the matrix of multipliers for that library sample could be used to reconstruct the composition of the original gasoline when fresh. The multipliers needed to estimate the original gasoline composition would be constructed by simply taking the inverse (reciprocal) of the multiplier for each individual compound in the matrix that most closely correlated with the weathered sample. (For example, if Benzene's multiplier is 0.25, then to reconstruct the original amount of Benzene, one would multiply the peak area by 1/0.25 = 4.0.)
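The inverse calculation can be sketched as follows, using the Benzene example from the text. The weathered peak areas and the function name `reconstruct_fresh` are hypothetical. One practical limitation is made explicit in the sketch: a component whose multiplier is zero has been completely lost, so its original amount cannot be recovered by taking a reciprocal, and the sketch returns `None` for it.

```python
def reconstruct_fresh(weathered_areas, multipliers):
    """Estimate fresh peak areas by dividing each weathered area by its multiplier.

    Components with a multiplier of zero cannot be reconstructed and yield None.
    """
    if len(weathered_areas) != len(multipliers):
        raise ValueError("one multiplier is required per component")
    return [
        area / m if m > 0 else None
        for area, m in zip(weathered_areas, multipliers)
    ]

# Benzene with the multiplier of 0.25 used in the text, plus a component
# (isobutane) that has fully evaporated. Peak areas are HYPOTHETICAL.
names = ["Benzene", "iC4"]
weathered = [1200.0, 0.0]
multipliers = [0.25, 0.0]

estimated = reconstruct_fresh(weathered, multipliers)
for name, value in zip(names, estimated):
    print(name, "cannot be reconstructed" if value is None else value)
```

Here the Benzene peak area of 1200 is multiplied by 1/0.25 = 4.0 to give an estimated fresh value of 4800, while the fully evaporated component is flagged rather than estimated.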