Supplementary Information

Fast Photochemical Oxidation of Proteins (FPOP) Maps the Epitope of EGFR Binding to Adnectin

Yuetian Yan1, Guodong Chen2, Hui Wei2, Richard Y.-C. Huang2, Jingjie Mo2, Don L. Rempel1, Adrienne A. Tymiak2, Michael L. Gross*1

Center for Biomedical and Bioorganic Mass Spectrometry, Department of Chemistry1, Washington University in St. Louis, St. Louis, MO

Bioanalytical and Discovery Analytical Sciences, Research and Development, Bristol-Myers Squibb2, Princeton, NJ

1. Modeling of the product distribution at the global protein level after FPOP labeling

The custom software measured the intensities of the components that contribute to the mass spectrum, extracted the intensities of the series of components representing modifications due only to the addition of oxygens, and then fit a Poisson probability density function to the oxygen-series intensities. The quality of the fit was reported as the coefficient of determination (R²).
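
The last step of this procedure can be illustrated with a minimal sketch. It is not the authors' software: the grid search over the Poisson mean, the normalization of the intensities to unit sum, and the names `poisson_pmf` and `fit_poisson` are assumptions made for this example.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of k oxygen additions for a mean of lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def fit_poisson(intensities, lam_max=10.0, steps=10000):
    """Grid-search the Poisson mean that best matches the oxygen series.

    intensities[k] is the measured intensity of the +k oxygen component.
    Returns (lam, r_squared).
    """
    total = sum(intensities)
    observed = [i / total for i in intensities]  # assumption: normalize to unit sum
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    best_lam, best_ss_res = 0.0, float("inf")
    for step in range(1, steps + 1):
        lam = lam_max * step / steps
        ss_res = sum((o - poisson_pmf(k, lam)) ** 2
                     for k, o in enumerate(observed))
        if ss_res < best_ss_res:
            best_lam, best_ss_res = lam, ss_res
    # Coefficient of determination of the best fit
    return best_lam, 1.0 - best_ss_res / ss_tot
```

A call such as `fit_poisson([unmodified, plus_o, plus_o2, ...])` returns the fitted mean number of oxygen additions together with R². An R² close to 1 indicates that the oxygen series is well described by a single Poisson distribution.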

The spectrum profile was windowed to reduce the amount of computation. The window had two parts, each anchored to the lowest mass exhibited by the protein, that is, the mass of the protein after its greatest loss owing to the postulated modifications. The second part was used for the discovery of peaks; it started 6.5 Da below the anchor and extended to an end mass that was 170 Da higher. The first part was used to measure the contribution of chemical noise to the immediate area of the spectrum; it began 40 Da below the start mass of the second part and extended to 0.6 Da below that start mass. The baseline used for the calculations was the average of the profile intensities in the first part. The windows, defined above in terms of neutral mass, were rescaled to mass-to-charge ratios by using the charge state specified for processing.
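
The window arithmetic can be sketched as follows. The function names are hypothetical, the 170 Da extension is assumed to be measured from the anchor (the text leaves the reference point open), and the conversion to m/z assumes protonated ions.

```python
PROTON_MASS = 1.007276  # Da; mass of a hydrogen atom minus one electron

def neutral_to_mz(mass, charge):
    """Convert a neutral mass (Da) to m/z for a protonated ion."""
    return (mass + charge * PROTON_MASS) / charge

def build_windows(anchor_mass, charge):
    """Return ((noise_lo, noise_hi), (peak_lo, peak_hi)) in m/z units."""
    peak_start = anchor_mass - 6.5
    peak_end = anchor_mass + 170.0   # assumption: 170 Da above the anchor
    noise_start = peak_start - 40.0
    noise_end = peak_start - 0.6
    noise = (neutral_to_mz(noise_start, charge),
             neutral_to_mz(noise_end, charge))
    peaks = (neutral_to_mz(peak_start, charge),
             neutral_to_mz(peak_end, charge))
    return noise, peaks

def baseline(mz_values, intensities, window):
    """Average the profile intensities whose m/z lies inside the window."""
    lo, hi = window
    inside = [i for m, i in zip(mz_values, intensities) if lo <= m <= hi]
    return sum(inside) / len(inside)
```

The noise window ends just below the peak-discovery window, so the baseline estimate is taken from a region that should contain no protein signal.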

The measurement of component intensities began with the processing of peaks from a profile spectrum. First, estimates of the peak locations were marked, and the beginning and end of each peak were found. The m/z centroid, peak area, and second moment of each peak were then calculated. The peak intensity was calculated as the scale factor required by a model Gaussian with the same centroid and second moment to produce the same peak area when sampled at the peak profile m/z values between the peak beginning and end.
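
A minimal sketch of these peak measurements is given below. It assumes intensity-weighted moments, trapezoidal integration for the area, and a model Gaussian of unit height, so that the scale factor is the fitted peak height. None of these choices is specified in the text, and the function names are hypothetical.

```python
import math

def trapezoid(xs, ys):
    """Trapezoidal area under the sampled curve (xs, ys)."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))

def peak_metrics(mz, intensity):
    """Return (centroid, area, second_moment, scale) for one profile peak.

    mz and intensity hold the profile points between the peak beginning
    and end, with the baseline already subtracted.
    """
    weight = sum(intensity)
    centroid = sum(m * i for m, i in zip(mz, intensity)) / weight
    second_moment = sum((m - centroid) ** 2 * i
                        for m, i in zip(mz, intensity)) / weight
    area = trapezoid(mz, intensity)
    # Unit-height Gaussian with the same centroid and second moment,
    # sampled at the same m/z values as the measured peak.
    model = [math.exp(-(m - centroid) ** 2 / (2.0 * second_moment))
             for m in mz]
    scale = area / trapezoid(mz, model)
    return centroid, area, second_moment, scale
```

Because the model is sampled at the same m/z values as the data, the scale factor is less sensitive to coarse or uneven sampling than the height of the tallest profile point would be.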

A model isotopic pattern was constructed for each component that was postulated to contribute to the spectrum. The isotopic pattern is a discrete probability density function based on the component’s molecular formula, which is the molecular formula of the protein plus the molecular formula of a modification. The density function takes into account the number of atoms of each element in the formula and the natural abundance of each element’s isotopes. A hydrogen atom was added for each charge on the ions, and the mass of an electron was deducted for each charge when ion masses were calculated.
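
One common way to build such a pattern is to convolve the isotope distributions of the individual atoms. The sketch below illustrates that approach at nominal-mass resolution; it is an assumption that the original software worked in this way. The abundance and mass values are standard reference values, and all function names are hypothetical.

```python
# Natural isotope abundances, indexed by nominal-mass offset from the
# lightest isotope of each element.
ISOTOPES = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
    "S": [0.9499, 0.0075, 0.0425, 0.0, 0.0001],
}
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
                "O": 15.9949146, "S": 31.9720707}
ELECTRON_MASS = 0.00054858  # Da

def convolve(a, b, max_len):
    """Discrete convolution of two distributions, truncated to max_len."""
    out = [0.0] * min(len(a) + len(b) - 1, max_len)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < len(out):
                out[i + j] += x * y
    return out

def add_formulas(protein, modification):
    """Element counts of the protein plus those of a modification."""
    merged = dict(protein)
    for element, delta in modification.items():
        merged[element] = merged.get(element, 0) + delta
    return merged

def isotope_pattern(formula, max_peaks=50):
    """Probability of each nominal-mass offset above the monoisotopic mass."""
    pattern = [1.0]
    for element, count in formula.items():
        for _ in range(count):
            pattern = convolve(pattern, ISOTOPES[element], max_peaks)
    return pattern

def monoisotopic_mz(formula, charge):
    """m/z of the monoisotopic ion: add one H and remove one electron per charge."""
    neutral = sum(MONOISOTOPIC[e] * n for e, n in formula.items())
    return (neutral + charge * (MONOISOTOPIC["H"] - ELECTRON_MASS)) / charge
```

For a protein-sized formula, `max_peaks` must be large enough to cover the full isotopic envelope; otherwise the truncation removes part of the distribution.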

The list of peaks was filtered to form a series (there may be more than one peak for each nominal neutral mass). Each peak had to have at least three points in the profile. To be accepted for the series, a peak’s measured mass-to-charge ratio centroid had to lie within the expected full width at half height of a theoretical mass-to-charge ratio computed from the spectrum model. A peak whose measured full width at half height was more than three standard deviations from the average of all the measured full widths at half height in the list was excluded from the series.
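
The three acceptance criteria can be sketched as a simple filter. The dictionary keys (`mz`, `fwhm`, `n_points`) are hypothetical, the m/z tolerance is interpreted as plus or minus one expected full width at half height, and the sample standard deviation is used; the text does not settle any of these details.

```python
import statistics

def filter_peaks(peaks, theoretical_mz, expected_fwhm):
    """Keep the peaks that satisfy all three acceptance criteria.

    peaks: list of dicts with keys 'mz' (centroid), 'fwhm' (measured
    width), and 'n_points' (number of profile points on the peak).
    theoretical_mz: m/z values computed from the spectrum model.
    """
    widths = [p["fwhm"] for p in peaks]
    mean_width = statistics.mean(widths)
    sd_width = statistics.stdev(widths)  # requires at least two peaks

    kept = []
    for p in peaks:
        if p["n_points"] < 3:
            continue  # too few profile points
        if not any(abs(p["mz"] - t) <= expected_fwhm for t in theoretical_mz):
            continue  # centroid too far from every theoretical m/z
        if abs(p["fwhm"] - mean_width) > 3.0 * sd_width:
            continue  # width is an outlier among the measured widths
        kept.append(p)
    return kept
```

Note that the width criterion can reject a peak only when the list is reasonably long, because a single value in a short list cannot lie more than three standard deviations from the mean of that list.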

The series of peak intensities was least-squares fitted by a superposition of the isotopic probability density functions for the components in the spectrum. The areas of the component density functions were postulated in each trial of the search, and the fitted areas were used as the intensities of the components. The fraction unmodified was calculated as the signal intensity of the one unmodified component divided by the sum of all the signal intensities. The calculations were performed by using Mathcad version 14.0 M010 (Parametric Technology Corporation, Needham, MA). A table of postulated modifications follows.
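
The fit can be illustrated with a minimal sketch that is not the authors' Mathcad implementation. Instead of the trial-based search described above, it solves the linear least-squares problem directly through the normal equations, and it does not constrain the areas to be non-negative. The function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_components(observed, patterns):
    """Least-squares areas of the component patterns.

    observed[i] is the measured intensity at series position i, and
    patterns[j][i] is the unit-area isotopic pattern of component j
    evaluated at the same position.
    """
    k = len(patterns)
    ata = [[sum(p * q for p, q in zip(patterns[i], patterns[j]))
            for j in range(k)] for i in range(k)]
    atb = [sum(p * o for p, o in zip(patterns[i], observed))
           for i in range(k)]
    return solve_linear(ata, atb)

def fraction_unmodified(areas, unmodified_index=0):
    """Area of the unmodified component divided by the sum of all areas."""
    return areas[unmodified_index] / sum(areas)
```

With two overlapping toy patterns, `fit_components` recovers the areas used to build the observed series, and `fraction_unmodified` then gives the share of the signal carried by the unmodified protein.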

Description / # of C, H, N, O, S, Ca, Na, Mg, K
"unox" / 0 0 0 0 0 0 0 0 0
"plusO" / 0 0 0 1 0 0 0 0 0
"carbonyl" / 0 -2 0 1 0 0 0 0 0
"plusO2" / 0 0 0 2 0 0 0 0 0
"Ocarbonyl" / 0 -2 0 2 0 0 0 0 0
"carbonyl2" / 0 -4 0 2 0 0 0 0 0
"plusO3" / 0 0 0 3 0 0 0 0 0
"O2carbonyl" / 0 -2 0 3 0 0 0 0 0
"Ocarbonyl2" / 0 -4 0 3 0 0 0 0 0
"carbonyl3" / 0 -6 0 3 0 0 0 0 0
"lossCO" / -1 0 0 -1 0 0 0 0 0
"lossCO2" / -1 0 0 -2 0 0 0 0 0
"decarbox" / -1 -2 0 -1 0 0 0 0 0
"plusO4" / 0 0 0 4 0 0 0 0 0
"carbonyl4" / 0 -8 0 4 0 0 0 0 0
"plusO5" / 0 0 0 5 0 0 0 0 0
"carbonyl5" / 0 -10 0 5 0 0 0 0 0
"plusO6" / 0 0 0 6 0 0 0 0 0
"carbonyl6" / 0 -12 0 6 0 0 0 0 0
"carbonyl7" / 0 -14 0 7 0 0 0 0 0
"HistRingOpen" / -1 -2 -2 2 0 0 0 0 0
"Oxygenloss" / 0 0 0 -1 0 0 0 0 0
"Hydrogen2loss" / 0 -2 0 0 0 0 0 0 0
"Carbonloss" / -1 0 0 0 0 0 0 0 0
"carbonylminusH2" / 0 -4 0 1 0 0 0 0 0
"water" / 0 2 0 1 0 0 0 0 0
"Na" / 0 -1 0 0 0 0 1 0 0
"K" / 0 -1 0 0 0 0 0 0 1
"water_O" / 0 2 0 2 0 0 0 0 0
"Na_O" / 0 -1 0 1 0 0 1 0 0
"K_O" / 0 -1 0 1 0 0 0 0 1
"HISplus5" / -1 -1 -1 2 0 0 0 0 0

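Each row of the table above can be converted into a monoisotopic mass shift by summing the element counts weighted by the monoisotopic masses. The sketch below shows this arithmetic for four of the rows; the masses are standard reference values, and the function and dictionary names are hypothetical.

```python
# Monoisotopic masses (Da), in the column order used by the table.
ELEMENT_ORDER = ("C", "H", "N", "O", "S", "Ca", "Na", "Mg", "K")
MONOISOTOPIC = {
    "C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146,
    "S": 31.9720707, "Ca": 39.9625909, "Na": 22.9897693,
    "Mg": 23.9850417, "K": 38.9637065,
}

# Element-count changes copied from four rows of the table.
MODIFICATIONS = {
    "plusO": (0, 0, 0, 1, 0, 0, 0, 0, 0),
    "carbonyl": (0, -2, 0, 1, 0, 0, 0, 0, 0),
    "HistRingOpen": (-1, -2, -2, 2, 0, 0, 0, 0, 0),
    "HISplus5": (-1, -1, -1, 2, 0, 0, 0, 0, 0),
}

def mass_shift(counts):
    """Monoisotopic mass change (Da) for one row of element counts."""
    return sum(n * MONOISOTOPIC[e] for e, n in zip(ELEMENT_ORDER, counts))

for name, counts in MODIFICATIONS.items():
    print(f"{name}: {mass_shift(counts):+.3f} Da")
```

The shifts computed for these four rows (+15.995, +13.979, -10.032, and +4.979 Da) coincide with mass shifts listed as modification species in Table S1.
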
Table S1. Modifications by FPOP on all peptides identified by ProtMapMS

Peptide / Modification species (mass shift in Da)
15.995 / 13.979 / -10.032 / -22.032 / 31.99 / -30.011 / 4.979 / -22.032 / -31.99
14-29 / + / + / - / - / - / - / - / - / -
30-48 / + / - / - / - / - / - / - / - / -
49-56 / + / - / - / - / - / - / - / - / -
57-74 / + / - / - / - / - / - / - / - / -
75-84 / + / - / - / - / - / - / - / - / -
85-105 / + / - / - / - / - / - / - / - / -
106-114 / - / - / - / - / - / - / - / - / -
110-114 / - / - / - / - / - / - / - / - / -
115-125 / + / - / + / + / - / - / - / - / -
126-141 / + / - / - / - / + / - / - / - / -
142-165 / + / - / - / - / + / - / - / - / -
166-185 / + / - / - / - / - / - / - / - / -
189-198 / + / - / - / - / - / - / - / - / -
201-220 / + / - / - / - / - / - / - / - / -
203-220 / + / - / - / - / - / + / + / - / -
221-228 / + / - / - / - / - / - / - / - / -
230-260 / + / - / - / - / + / - / - / - / -
238-260 / + / - / - / - / + / - / - / - / -
261-269 / + / - / - / - / - / + / + / - / -
261-270 / + / - / - / - / - / - / - / - / -
274-285 / + / - / - / - / + / + / + / + / -
286-300 / + / - / - / - / - / - / - / - / -
286-301 / + / + / - / - / + / - / - / - / +
311-322 / + / - / - / - / - / - / - / - / -
312-322 / + / - / - / - / - / - / - / - / -
323-333 / + / + / - / - / - / - / - / - / -
334-353 / - / - / - / - / - / - / - / - / -
337-353 / + / - / - / - / + / - / - / - / -
354-372 / + / + / + / + / + / - / - / - / -
376-390 / + / - / - / - / + / - / - / - / -
391-403 / + / + / - / - / - / - / + / - / -
408-427 / - / - / + / - / - / - / + / - / -
428-443 / + / - / - / - / - / - / - / - / -
431-443 / + / + / - / - / - / - / - / - / -
444-445 / + / - / - / - / + / - / - / - / -
444-454 / + / - / - / - / + / - / - / - / -
456-463 / + / + / - / - / - / - / - / - / -
477-497 / + / - / - / - / + / + / + / - / -
515-523 / + / - / - / - / - / - / - / - / -
524-550 / + / - / - / - / + / + / - / - / -
“+” indicates the modification is detected and “-” indicates the modification is not detected

Figure S1. Global level analysis of ENV E DIII after FPOP labeling at protein charge state of +8

Figure S2. Coverage map obtained for the tryptic digest of exEGFR.

Table S2. Student’s t-tests for all triplicates of both EGFR free and EGFR-Adnectin1 bound states.

Peptide / P value / Peptide / P value
14-29 / 0.006 / 286-300 / 0.641
30-48 / 0.822 / 286-301 / 0.157
49-56 / 0.421 / 311-322 / 0.143
57-74 / 0.007 / 312-322 / 0.177
75-84 / 0.044 / 323-333 / 0.048
85-105 / 0.449 / 334-353 / NA
106-114 / NA / 337-353 / 0.069
110-114 / NA / 354-372 / 0.352
115-125 / 0.032 / 376-390 / 0.517
126-141 / 0.123 / 391-403 / 0.122
142-165 / 0.807 / 408-427 / 0.063
166-185 / 0.617 / 428-443 / 0.186
189-198 / 0.713 / 431-443 / 0.686
201-220 / 0.087 / 444-454 / 0.052
203-220 / 0.147 / 456-463 / 0.403
221-228 / 0.598 / 477-497 / 0.231
230-260 / 0.253 / 515-523 / 0.104
238-260 / 0.453 / 524-550 / 0.541
261-269 / 0.446 / 551-569 / 0.111
261-270 / 0.902 / 570-585 / 0.635
274-285 / 0.425 / 586-618 / 0.570
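
For reference, the p value of a two-sample Student's t-test on two triplicates can be computed as sketched below. The sketch assumes an unpaired, equal-variance test, which gives 4 degrees of freedom; the text does not state which variant was used. The closed-form p value is valid only for 4 degrees of freedom, and the function names are hypothetical.

```python
import math
import statistics

def pooled_t(a, b):
    """Equal-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    std_err = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / std_err

def two_sided_p_df4(t):
    """Two-sided p value of Student's t distribution with 4 degrees of freedom."""
    u = 1.0 + t * t / 4.0
    return 1.0 - 0.75 * (abs(t) / math.sqrt(u)) * (1.0 - t * t / (12.0 * u))

def triplicate_p_value(free, bound):
    """p value for two triplicates (3 + 3 - 2 = 4 degrees of freedom)."""
    return two_sided_p_df4(pooled_t(free, bound))
```

The inputs would be the three modification extents measured for a peptide in the free state and the three measured in the bound state. The function fails with a division by zero if both triplicates have zero variance.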

Table S3. Student’s t-tests for all seven residues or short regions labeled in peptides 14-29 and 57-74, comparing the EGFR free and EGFR-Adnectin1 bound states.

Residue / P value
L14 / 0.003
L17 / 0.017
F20 / 0.014
F24 / 0.021
L17-H23 / 0.204
57-74unknown / 0.030
L69 / 0.032