18O labelling of C-termini of cross-linked peptides using Trypsin and Glu-C

Pascal van Alphen

Swammerdam Institute for Life Sciences (SILS), Mass Spectrometry Group, University of Amsterdam, Amsterdam, The Netherlands.

The pH dependency of the carboxyl oxygen exchange reaction catalysed by Glu-C has been studied. This resulted in a protocol for efficiently labelling cross-linked proteins, digested by more than one protease, by 18O incorporation into the C-termini. Cross-links between amino acid residues in close proximity can provide distance constraints with which computer models of the 3D structure of proteins can be validated. An 18O-labelled cross-link differs from the unlabelled cross-link by 8 amu, whereas surface-labels (mono-links) and loop-links shift by only 4 amu. Bis(succinimidyl)-3-azidomethyl-glutarate (BAMG) was used to cross-link cytochrome c and Gas2p, respectively. BAMG is a cross-linking agent with an azido group that allows for selective and efficient purification of peptide mixtures. Here, it is shown that Glu-C is able to efficiently label peptides with 18O under conditions similar to those normally chosen for trypsin. Using this method, several cross-links have been identified in cytochrome c and Gas2p.

Keywords: cross-linking - double digestion - trypsin - Glu-C - 18O labelling - mass spectrometry

Introduction

For the development of drugs, knowledge of the tertiary or quaternary structure and active site of a protein or protein complex is of great importance. The technique most suited for obtaining this knowledge is X-ray diffraction, but it suffers from the difficulty with which proteins form crystals.

Computer models predicting the structure of proteins from their amino acid sequence are becoming more and more sophisticated and may remove the need for X-ray diffraction. However, they still require experimental validation. Chemical cross-linking of surface residues of proteins can provide distance constraints with which those models can be validated[1,2]. Unfortunately, flexibility in protein structure is required for susceptibility to proteolytic digestion[3], which implies that cross-linking can only be partial and therefore results in a low abundance of cross-linker-modified peptides. This hampers efficient analysis by mass spectrometry and sequence elucidation by MS/MS.

Recently, a cross-linker was synthesised that enables selective reactions with modified peptides in order to purify peptide mixtures[4]. However, this purification does not distinguish between highly informative cross-links and surface-labels (modified by cross-linker but not actually cross-linked).

A problem with cross-linking large proteins and protein complexes is that the number of possible cross-links increases dramatically, and with it the number of false positives. An elegant method to identify actual cross-links has been described, based on the notion that cross-links have two C-termini that can be labelled with 18O whereas surface-labels have only one[5]. Labelled cross-links will show a shift of 8 amu whereas surface-labels will shift by only 4 amu.
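The arithmetic behind the 8 amu and 4 amu shifts can be sketched as follows. This is a minimal illustration, assuming complete exchange of both carboxyl oxygens at every C-terminus; the function names and the 0.5 Da tolerance are hypothetical choices made for the example, not part of the published method.

```python
# Monoisotopic masses in Da (standard values).
MASS_16O = 15.994915
MASS_18O = 17.999161
SHIFT_PER_ATOM = MASS_18O - MASS_16O  # about 2.004 Da per exchanged oxygen


def expected_shift(n_c_termini, atoms_per_terminus=2):
    """Mass shift after complete 16O -> 18O exchange at each C-terminus."""
    return n_c_termini * atoms_per_terminus * SHIFT_PER_ATOM


def classify(observed_shift, tolerance=0.5):
    """Assign an observed shift to two C-termini, one C-terminus, or neither.

    The tolerance (in Da) is an assumed value for illustration only.
    """
    if abs(observed_shift - expected_shift(2)) <= tolerance:
        return "two C-termini (cross-link)"
    if abs(observed_shift - expected_shift(1)) <= tolerance:
        return "one C-terminus (mono-link or loop-link)"
    return "unassigned"


if __name__ == "__main__":
    for shift in (8.0, 4.0, 0.0):
        print(shift, "->", classify(shift))
```

A cross-link carries four exchangeable oxygens and a surface-label two, which is why the nominal shifts differ by a factor of two.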

It has long been known that trypsin can catalyse this carboxyl oxygen exchange reaction[6] and evidence has been presented that Glu-C can catalyse this reaction as well[7]. However, conditions for efficient labelling by both Glu-C and trypsin have not yet been defined. A double digestion with Glu-C and trypsin would be useful as it will increase the number of cross-linked peptides in the mass range optimal for mass spectrometry, i.e. approximately 1000-3000 Da.

In this study, cross-linked horse heart cytochrome c will be used to define conditions under which efficient double labelling occurs. Various BAMG cross-links in cytochrome c have been mapped previously[4], and these are used for validation of our method. Subsequently, the technology will be applied to Gas2p to identify new cross-links, which will be used to construct a model for the 3D structure of Gas2p.

Mass spectra will be analysed with VIRTUALMSLAB[8]. VIRTUALMSLAB can produce virtual mass spectra of digested and modified proteins and is used to find candidate cross-links. The next step is establishing a generally applicable protocol for providing distance constraints with which 3D structures predicted by computer models can be validated.

Materials and Methods

Materials. Mass spectrometry grade modified porcine trypsin (Trypsin Gold) was obtained from Promega (Madison, WI). Glu-C from S. aureus was obtained from Roche (Switzerland). Bovine insulin was purchased from Sigma. The peptide fragment monitored for the optimisation of the carboxyl oxygen exchange reaction catalysed by Glu-C was Phe-Val-Asn-Gln-His-Leu-Cys-Gly-Ser-His-Leu-Val-Glu (1539.7 m/z) from a Glu-C digest of reduced and alkylated bovine insulin. Insulin was reduced using tris(2-carboxyethyl)phosphine (TCEP) and alkylated by iodoacetamide (IAA). 18O-enriched water (>95%) was obtained from Spectra Stable Isotopes. Bis(succinimidyl)-3-azidomethyl-glutarate (BAMG) and purified Gas2p were kindly provided by Luitzen de Jong[4,9]. The amino acid sequences of cytochrome c and Gas2p used in this research can be found in the Supporting Information. All chemicals used were either research grade or of the highest purity commercially available.

Cross-linking and digesting cytochrome c and Gas2p. A solution of cross-linker bis(succinimidyl)-3-azidomethyl-glutarate (BAMG), 20 mM dissolved in DMF, was added to 10 µM Gas2p and 40 µM cytochrome c, respectively, to a final concentration of 0.1 mM BAMG (0.5% v/v DMF) in 50 mM sodium phosphate buffer (pH 7.5) and 100 mM NaCl. The reaction mixture was incubated for 30 min at room temperature.

Subsequently the pH was raised to pH 9 by addition of sodium carbonate and incubated for another 30 min in order to quench the reaction by saponifying unreacted N-hydroxy succinimidyl esters and scavenge any esters that may have formed from the reaction of cross-linker with Ser, Tyr and Thr side chains. The remaining reaction mixture was concentrated to 50 μl by Biomax cutoff filter (5 kDa and 30 kDa for cytochrome c and Gas2p, respectively) and washed twice by phosphate/NaCl buffer.

Subsequently the proteins were denatured by washing with 9 M Urea/50 mM citrate (pH 3) and incubating for 10 min. Cysteines were reduced by incubating with TCEP (final concentration 10 mM) for 20 min at room temperature and alkylated by incubating with IAA (final concentration 9 M Urea, 0.2 M IAA, 0.2 M ammonium bicarbonate) for 60 min in the dark. The protein solution was then washed four times with a solution of 9 M Urea/50 mM phosphate (pH 7.5) and diluted to 1 M Urea/50 mM phosphate by washing with 50 mM phosphate. Subsequently trypsin (1:25 w/w) was added for digestion overnight at 37 °C followed by digestion by Glu-C overnight at 25 °C. Samples containing 5 μg peptides were taken for MALDI-TOF analysis.

Measurement of the pH dependency of the Carboxyl Oxygen Exchange Reaction catalysed by Glu-C. The efficiency of the carboxyl oxygen exchange reaction catalysed by Glu-C was determined by MALDI-TOF analysis of 18O incorporation into the C-terminus of a bovine insulin fragment. For the pH studies, a 40 μM solution of insulin digest in aqueous buffers at various pHs was lyophilised and reconstituted with a 400 nM Glu-C solution in a total volume of 10 μl 95% H₂¹⁸O, after which it was incubated at 25 °C. The buffer solutions used were 100 mM phosphate at pH 5.8, 6.2, 6.8, 7.4 and 7.8 and 100 mM citric acid/phosphate at pH 4, 5 and 6. The incubation lasted 2 hours and was stopped by the addition of one volume of 4% formic acid/0.2% TFA in water. The final percentage of 18O incorporation was determined by measuring the relative heights of the 12C monoisotopic peaks containing two, one or no 18O atoms. The influence of the second 13C isotope peak on 18O-containing peaks is assumed to be negligible due to the low mass of the peptide. Ionisation efficiency is assumed not to be affected by 18O incorporation. This leads to the following formula:

In which 18O1 and 18O2 represent the relative heights of the peaks corresponding to peptides with one 18O atom or two 18O atoms, respectively. The contribution of unlabelled peptides is implicit in that it subtracts from the 18O1 and 18O2 peaks, which results in a lower incorporation. The theoretical maximum is 18O1 = 0 and 18O2 = 1, which results in 100% incorporation.
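A minimal sketch of how such an incorporation percentage could be computed from peak heights is given below. The expression used, (18O1 + 2 x 18O2)/2 on heights normalised to the sum of the three peaks, is an assumption chosen because it is consistent with the definitions and the stated maximum above (18O1 = 0, 18O2 = 1 gives 100%); it is not necessarily the exact expression used in this study. The function and argument names are hypothetical.

```python
def incorporation_percent(h_unlabelled, h_one_18o, h_two_18o):
    """Estimate 18O incorporation (%) from three monoisotopic peak heights.

    Heights are normalised to their sum, so the unlabelled peak lowers the
    relative heights of the labelled peaks. Assumed expression (not taken
    from the source): 100 * (r1 + 2 * r2) / 2, with r1 and r2 the relative
    heights of the singly and doubly labelled peaks.
    """
    total = h_unlabelled + h_one_18o + h_two_18o
    if total <= 0:
        raise ValueError("peak heights must sum to a positive value")
    r1 = h_one_18o / total
    r2 = h_two_18o / total
    return 100.0 * (r1 + 2.0 * r2) / 2.0


if __name__ == "__main__":
    # Hypothetical peak heights in arbitrary units.
    print(incorporation_percent(1.0, 1.0, 2.0))
```

Under this assumption a peptide population that is entirely singly labelled scores 50%, reflecting that only one of the two carboxyl oxygens has been exchanged.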

18O-labelling of BAMG-treated cytochrome c and Gas2p digests. A solution of 30 µM cytochrome c and 6 µM Gas2p, respectively, in 100 mM sodium phosphate buffer (pH 6.2) was lyophilised and reconstituted with a solution of 2.1 µM and 1.5 µM trypsin, respectively, in a total volume of 10 µl H₂¹⁸O. The mixture was incubated for 2 hours at 37 °C, after which Glu-C (final concentration 2.1 µM and 1.5 µM for cytochrome c and Gas2p, respectively) was added and incubation resumed at 25 °C overnight. Incubation was stopped by addition of one volume of 4% formic acid/0.2% TFA in water.

BAMG capture on cyclooctyne-functionalised beads. To purify BAMG-linked peptides, a solution of 100 µg Gas2p and 18 µg cytochrome c, respectively, in 60 µl 50% acetonitrile/50% 50 mM potassium phosphate buffer (pH 7.5) was prepared from previously cross-linked and digested Gas2p and cytochrome c. Subsequently it was added to approximately 2 mg of dry PL-DMA beads and incubated on a turning rotavap at 40 °C for 24 hours. The reaction was stopped by spinning down the suspension at 12,000 g for 1 minute, after which the supernatant was collected by pipette and stored for future reference. The supernatant from the first washing step was added to the supernatant removed prior to washing. This combined fraction contains the majority of unmodified peptides.

The beads were washed seven times with 60 µl of the following solutions: 50% acetonitrile/50% 50 mM potassium phosphate buffer (pH 7.5, twice), 100% acetonitrile, 50% acetonitrile/50% 50 mM potassium phosphate buffer (pH 7.5), 50 mM potassium phosphate buffer (pH 7.5), 2 M NaCl and finally 50 mM potassium phosphate buffer (pH 7.5). Each washing step consisted of turning on a rotavap for 15 min at room temperature and spinning down at 12,000 g for 1 min, after which the supernatant was removed by pipette.

To release the captured peptides, the washed beads were incubated with a solution of 5 mM TCEP in 50 mM potassium phosphate buffer (pH 7.5) on a turning rotavap at room temperature for 60 min, followed by 2.5 mM TCEP in 50% acetonitrile/50% 50 mM potassium phosphate buffer (pH 7.5) for 15 min at room temperature. After each step, the beads were spun down and the liquid was collected by pipette and combined. Finally, 250 mM IAA in 50 mM potassium phosphate (pH 7.5) was added to a final concentration of 55 mM IAA and incubated for 30 min at room temperature in the dark. The resulting mixtures were cleaned with ZipTip µC18 pipette tips as per the manufacturer's instructions (Millipore, Bedford, USA).

Mass spectrometry. Peptides were collected on ZipTip μC18 pipette tips, washed with 0.1% TFA and eluted with 50% acetonitrile/0.1% TFA. Mass spectrometry was performed by MALDI-TOF, LC-ESI-Q-TOF and LC-ESI-FTICR. For MALDI-TOF, 0.5 µl samples were mixed with an equal volume of a 10 mg/ml α-cyano-4-hydroxycinnamic acid solution in 50% acetonitrile/50% ethanol. The mixture was spotted on a MALDI target plate and allowed to dry. MALDI-TOF spectra were acquired on a TofSpec 2EC mass spectrometer (Micromass, Wythenshawe, U.K.) in reflectron mode and externally mass calibrated using a standard peptide mixture.

For MS/MS analysis, samples were diluted to <5% acetonitrile/0.1% TFA, loaded onto a precolumn of an Ultimate nano-HPLC system (LC Packings, Amsterdam, The Netherlands) and separated on a PepMap C18 nano-reversed-phase column (75 µm i.d.). Elution was performed using a gradient of 5-50% acetonitrile with 0.1% TFA. The flow was infused directly into an ESI-QTOF mass spectrometer (Micromass) via a modified nanoelectrospray device (New Objective, Woburn, MA). Argon was used as a collision gas at 4 × 10⁻⁵ bar, measured at the quadrupole pressure gauge. External mass calibration was done using a standard tryptic cytochrome c digest.

Accurate mass was determined by an Apex QFTICR mass spectrometer (Bruker Daltonics, Billerica, MA, USA) coupled in-line to an HPLC equipped with an ESI ion source for which samples were dried in a vacuum centrifuge and reconstituted in 0.1% TFA. Mass spectra were internally calibrated using exact masses of known unmodified peptides.

Mass spectra were analysed with VIRTUALMSLAB to identify unmodified (linear) peptides, mono-link, loop-link and cross-link candidates. VIRTUALMSLAB can perform in silico digestions to create reference mass spectra and match them to mass spectrometry data from the real experiment, taking into account the expected mass shifts from modifications.
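The idea of an in silico double digestion can be illustrated with a short sketch. This is not the VIRTUALMSLAB implementation, whose internals are not described here. It assumes the commonly used rules that trypsin cleaves C-terminal to Lys and Arg except before Pro, and that Glu-C in phosphate buffer cleaves C-terminal to Glu and Asp; missed cleavages are ignored, and the function names and test sequence are hypothetical.

```python
import re


def cleave(sequence, residues, not_before=""):
    """Split a sequence C-terminal to any residue in `residues`.

    Cleavage is suppressed when the next residue is in `not_before`.
    """
    lookahead = f"(?![{not_before}])" if not_before else ""
    pattern = f"(?<=[{residues}]){lookahead}"
    # Zero-width split (Python 3.7+); drop empty pieces at the ends.
    return [piece for piece in re.split(pattern, sequence) if piece]


def double_digest(sequence):
    """Complete trypsin digestion followed by complete Glu-C digestion."""
    tryptic = cleave(sequence, "KR", not_before="P")
    return [p for t in tryptic for p in cleave(t, "DE")]


if __name__ == "__main__":
    toy = "GDVEKGKKIF"  # hypothetical test sequence
    print(cleave(toy, "KR", not_before="P"))
    print(double_digest(toy))
```

A real search would additionally allow missed cleavages, since a lysine modified by the cross-linker is no longer a trypsin cleavage site.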

Results

pH dependency of the Carboxyl Oxygen Exchange Reaction for Oxidised Insulin Digest catalysed by Glu-C. The pH dependency of the carboxyl oxygen exchange reaction and the 18O incorporation over time are shown in Figure 1a and b, respectively.

Figure 1. Effect of pH on the carboxyl oxygen exchange activity of Glu-C (a) and 18O incorporation followed in time (b). Efficiency is calculated from relative peak intensities (see Materials and Methods). a) A 40 µM solution of bovine insulin (reduced, alkylated and digested with Glu-C) was incubated with 400 nM Glu-C (1:100 molar ratio) in 100 mM sodium phosphate buffered at various pHs. Incubation time was 2 hours. b) -●-: 1:100 molar ratio in sodium phosphate buffer pH 6.2. -■-: 1:20 molar ratio in sodium phosphate buffer pH 6.2.

The pH dependency of the carboxyl oxygen exchange reaction catalysed by Glu-C was found to have an optimum in sodium phosphate buffer at pH 6.2. This is 1 unit lower than the reported optimum for amidase activity at pH 7.2[10]. The pH 4 to 6 range in citric acid/phosphate buffer showed a significantly lower efficiency (data not shown). The optimum found for Glu-C is conveniently close to the reported optimum of pH 6 for the reaction catalysed by trypsin[11]. This facilitates a double labelling experiment without the need for buffer adjustments.

Cross-links in cytochrome c and Gas2p. Cross-linking was done with bis(succinimidyl)-3-azidomethyl-glutarate (BAMG) by the formation of amide bonds with the amine group of the lysine side chain. Amine-reactive cross-linkers are often used for protein cross-linking due to the presence of multiple lysine residues on the surface of most soluble proteins.

Table 1. Overview of cross-link candidates from cytochrome c and the observed shift after labelling.
Experimental [M+H]+ / Calculated [M+H]+ / Error (ppm)[a] / Sequence[b] / 18O Shift (amu)[c,d]
931.53521 / 931.53598 / 1 / K8-K13 (ML) / 4
975.52484 / 975.52581 / 1 / K73-K79 (ML) / 4
1060.54257 / 1060.54219 / 0 / Y67-K73 (ML) / 4
1098.64089 / 1098.64184 / 1 / G6-K13 (LL) / 4
1186.67616 / 1186.67651 / 0 / M80-K88 (LL) / 4
1261.67880 / 1261.67868 / 0 / E92-K100 (ML) / 4
1400.75224 / 1400.74922 / 2 / K100-E104~V3-K8 (XL) / 8
1602.82478 / 1602.82479 / 0 / H26-R38 (ML) / Not Found
1658.83608 / 1658.83843 / 1 / E92-E104 (LL) / Not Found
1701.89490 / 1701.89589 / 1 / Y67-K79 (LL) / 4
1767.82832 / 1767.82966 / 1 / K39-K53 (ML) / Not Found
1802.98964 / 1802.99118 / 1 / K73-K79~N54-K60 (XL) / 4
1826.95136 / 1826.95211 / 0 / G23-R38 (LL) / 4
1861.02163 / 1861.01779 / 2 / K5-K7~D93-E104 (XL) / Not Found
1861.02163 / 1861.01779 / 2 / G6-K8~D93-E104 (XL)
1861.02163 / 1861.02182 / 0 / L94-K100~G56-E62 (XL)
1864.05187 / 1864.05134 / 0 / Y74-K88 (ML) / Not Found
1864.05187 / 1864.05134 / 0 / K73-K87 (ML)
1864.05187 / 1864.05134 / 0 / K87-K88~Y74-K86 (XL)
1864.05187 / 1864.05134 / 0 / K73-K79~M80-K87 (XL)
1881.87191 / 1881.87259 / 0 / T40-K55 (ML) / Not Found
1949.06764 / 1949.06772 / 0 / Y67-K73~M80-K87 (XL) / 8
1976.04754 / 1976.04876 / 1 / G56-K62~D93-K100 (XL) / Not Found
1991.95540 / 1991.95699 / 1 / K39-K55 (LL) / 4**
2105.09018 / 2105.09135 / 1 / G56-E62~E92-K100 (XL) / 8
2449.18704 / 2449.18951 / 1 / T40-K60 (LL) / Not Found
2480.33670 / 2480.33702 / 0 / Y67-K86 (ML) / Not Found
2480.33670 / 2480.33702 / 0 / Y67-K73~Y74-K86 (XL)
2595.29347 / 2595.29503 / 1 / K39-K60 (ML) / 8*
2595.29347 / 2595.29503 / 1 / N54-K60~K39-K53 (XL)
2595.29347 / 2595.29503 / 1 / A51-K60~K39-D50 (XL)
2611.24368 / 2611.24233 / 1 / G56-E62~K39-K53 (XL) / 8*
[a] Numbers in blue represent a mass surplus whereas numbers in red a mass deficit. [b] ML, mono-link; LL, loop-link; XL, cross-link. [c] Shifts marked with an asterisk are observed in low abundance. [d] Shifts marked with ** show very poor 18O incorporation.
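The error column of Table 1 can be reproduced from the experimental and calculated masses. The sketch below uses the usual definition of a relative mass error in parts per million; the function name is a hypothetical choice, and the sign convention (positive for a mass surplus) is assumed to correspond to the colour coding described in footnote [a].

```python
def ppm_error(experimental, calculated):
    """Relative mass error in ppm; positive values indicate a mass surplus."""
    return (experimental - calculated) / calculated * 1e6


if __name__ == "__main__":
    # Two rows taken from Table 1: K8-K13 (ML) and K100-E104~V3-K8 (XL).
    for exp, calc in ((931.53521, 931.53598), (1400.75224, 1400.74922)):
        print(f"{exp:.5f}  {calc:.5f}  {ppm_error(exp, calc):+.2f} ppm")
```

Rounding the absolute value to the nearest integer gives the entries listed in the table.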

Table 2. Overview of cross-link candidates from Gas2p and the observed shift after labelling.
Experimental [M+H]+ / Calculated [M+H]+ / Error (ppm)[a] / Sequence Matched[b] / 18O Shift (amu)
928.53682 / 928.536313 (LL) / 1 / F308-K313 (LL) / 4
1378.65475 / 1378.653214 (ML) / 1 / L430-R439 (ML) / 4
1450.62924 / 1450.630748 (ML) / 1 / Y389-D398 (ML) / 0
1507.73162 / 1507.732193 (ML) / 0 / S462-R472 (ML) / 4
1539.72635 / 1539.728548 (XL) / 1 / K313-E314~A352-D361 (XL) / 4
1647.72645 / 1647.727898 (ML) / 1 / V397-E410 (ML) / 4
1677.81172 / 1677.807861 (XL) / 2 / K313-E314~I402-E413 (XL) / 4
1677.81172 / 1677.807861 (XL) / 2 / T236-E238~N289-D298 (XL)
1767.90105 / 1767.898407 (XL) / 1 / E234-E238~K355-R362 (XL) / 4
1850.96391 / 1850.964695 (XL) / 0 / F308-K312~Y328-E336 (XL) / 4
1904.95190 / 1904.95348 (ML) / 1 / T52-R66 (ML) / 4
1931.87003 / 1931.870479 (ML) / 0 / H440-E453 (ML) / 4
1958.03384 / 1958.034172 (XL) / 0 / K7-K12~N289-D298 (XL) / Not Found
2131.99714 / 2131.989075 (XL) / 4 / S383-E388~A352-R362 (XL) / 4
2209.12233 / 2209.124779 (LL) / 1 / I296-K313 (LL) / 4
2287.09895 / 2287.091748 (XL) / 3 / A172-E174~T236-E250 (XL) / 4
2356.17786 / 2356.177938 (ML) / 0 / I296-E314 (ML) / 4
2356.17786 / 2356.177938 (XL) / 0 / K313-E314~I296-K312 (XL)
2391.07045 / 2391.059148 (XL) / 5 / D171-E174~V81-E95 (XL) / 4
2409.07788 / 2409.084968 (ML) / 3 / Y389-K407 (ML) / 4
2485.24643 / 2485.235787 (XL) / 4 / F308-E314~I13-E25 (XL) / Not Found
2485.24643 / 2485.257786 (XL) / 5 / S462-D469~L423-K433 (XL)
2549.23417 / 2549.237913 (XL) / 1 / Y481-K490~V81-D91 (XL) / 4
2625.16694 / 2625.167452 (XL) / 0 / S383-E388~H440-E453 (XL) / Not Found
2641.21699 / 2641.217379 (XL) / 0 / L430-R439~Y389-D398 (XL) / Not Found
[a] Numbers in blue represent a mass surplus whereas numbers in red a mass deficit. [b] ML, mono-link; LL, loop-link; XL, cross-link.

Cytochrome c is a protein with a relatively large content of lysine residues. Another useful property of BAMG is the aptly positioned azido group, which can be used to purify peptide mixtures.

After cross-linking and proteolytic digestion, peptide mixtures will contain, in addition to a vast majority of unmodified peptides, several types of cross-linker-modified peptides: a cross-link within the same peptide (loop-link), a cross-link between different peptides (cross-link) and peptides modified by partially hydrolysed cross-linker (mono-link). A modification adds 151.038 Da in the case of a cross-link or loop-link and 169.049 Da in the case of a mono-link. MALDI-TOF mass spectrometry was used to quickly assess whether any BAMG-linked peptides were present.
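How these mass increments translate into candidate [M+H]+ values can be sketched as follows. The two increments are taken from the text; the assumption that they are added to the neutral monoisotopic masses of the peptides involved, the function names and the peptide masses in the example are hypothetical.

```python
PROTON = 1.007276   # mass of a proton in Da
BAMG_LINK = 151.038  # increment for a cross-link or loop-link (from the text)
BAMG_MONO = 169.049  # increment for a mono-link (from the text)


def mh_cross_link(peptide_a, peptide_b):
    """[M+H]+ of two peptides joined by one cross-linker."""
    return peptide_a + peptide_b + BAMG_LINK + PROTON


def mh_loop_link(peptide):
    """[M+H]+ of one peptide with an internal cross-link."""
    return peptide + BAMG_LINK + PROTON


def mh_mono_link(peptide):
    """[M+H]+ of one peptide carrying a partially hydrolysed cross-linker."""
    return peptide + BAMG_MONO + PROTON


if __name__ == "__main__":
    # Hypothetical neutral monoisotopic peptide masses in Da.
    print(round(mh_cross_link(500.0, 700.0), 3))
    print(round(mh_mono_link(1000.0) - mh_loop_link(1000.0), 3))
```

The difference between the two increments, 18.011 Da, corresponds to one water molecule, consistent with hydrolysis of the second reactive ester in a mono-link.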

For the most abundant BAMG-linked peptides, satellite signals at Δ = -26 to -28 Da from the main peak can be found in MALDI-TOF mass spectra (Figure 3a), which is probably due to in-source loss of N2 from the azido group and subsequent uptake of either two, one or no hydrogen atoms. Accurate mass was determined by FTICR mass spectrometry. FTICR data were calibrated and matched to virtual mass spectra in VIRTUALMSLAB. In order to confirm a cross-link, the 18O incorporation experiment was done as described in Materials and Methods. A cross-link can incorporate four 18O atoms whereas a mono- or loop-link can only incorporate two. This leads to a difference in mass shift and provides a simple method to distinguish between actual cross-links and mono- or loop-links. A typical result of a peptide displaying a shift of 4 amu from incorporation of two 18O atoms is shown in Figure 2.
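The satellite offsets of -26 to -28 Da follow directly from the proposed explanation, as the short calculation below illustrates. It uses standard monoisotopic masses for nitrogen and hydrogen; the function name is hypothetical, and the calculation only checks the arithmetic, not the proposed mechanism.

```python
MASS_N = 14.003074  # monoisotopic mass of 14N in Da
MASS_H = 1.007825   # monoisotopic mass of 1H in Da


def satellite_offsets():
    """Mass offsets after loss of N2 and uptake of 0, 1 or 2 hydrogen atoms."""
    n2_loss = -2 * MASS_N
    return {n_h: round(n2_loss + n_h * MASS_H, 3) for n_h in (0, 1, 2)}


if __name__ == "__main__":
    for n_h, offset in satellite_offsets().items():
        print(f"loss of N2, uptake of {n_h} H: {offset:+.3f} Da")
```

The three offsets are approximately -28.0, -27.0 and -26.0 Da, matching the observed range.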

Figure 2. MALDI-TOF mass spectra before (A) and after (B) 18O labelling of a cross-link candidate from Gas2p; the matched sequences are K313-E314~I402-E413 and T236-E238~N289-D298. High 18O incorporation is shown, but the shift of 4 amu indicates that it is a false positive.

Unfortunately, deconvoluting FTICR mass spectra of 18O-labelled peptides proved to be impossible with the available software. Based on the retention time of unlabelled peptides, cross-link candidates matched in VIRTUALMSLAB were found manually. A list of cross-link candidates from cytochrome c is shown in Table 1.

Five out of ten cross-link candidates could be confirmed by a shift of 8 amu, of which four are consistent with previously published data by L. de Jong et al.[4]. Of the cross-links that could not be confirmed by a shift of 8 amu, three no longer appeared in the mass spectrum after labelling, whereas one candidate cross-link, K73-K79~N54-K60 (m/z 1802.99), was confirmed as a cross-link by L. de Jong et al. The unambiguous 4 amu shift after labelling indicates that either that particular C-terminus is blocked for labelling, despite not being blocked for cleavage, or it is a false positive. Sequence analysis is required to determine the correct interpretation.

The cross-link candidate K100-E104~V3-K8 (m/z 1400.75), which showed a shift of 8 amu but was not confirmed by L. de Jong et al., was validated by fitting it to the known 3D structure of cytochrome c in solution (PDB ID: 1AKK). The distances between the Cα (12.07 Å) and Nε (18.57 Å) atoms of the linked lysine residues fit well within the spacer length of BAMG (7.5 Å) given the usually flexible lysine side-chains.

The promising results with cytochrome c indicated that the method works, and it was applied directly to Gas2p. In Gas2p, however, no cross-link candidates found in VIRTUALMSLAB could be confirmed by a shift of 8 amu. Four candidate cross-links did not appear in the mass spectrum after labelling. However, various peptides displaying a shift of 8 amu were found (m/z 2970.36 and 2549.23, data not shown) that could not be matched in VIRTUALMSLAB. In general, relatively few matches could be made in VIRTUALMSLAB considering the size and complexity of Gas2p compared to cytochrome c. It is likely that various possible modifications that have not been taken into account have caused this. On the other hand, many cross-link candidates were proven to be false positives. With an increase in the size of a protein, and thus in the number of lysine residues, it becomes more likely to find a mass match for any given cross-link candidate.
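The growth in the number of candidates with protein size can be made concrete with a simple count. The sketch below assumes that every unordered pair of distinct lysine-containing peptides is a possible cross-link candidate and ignores missed cleavages and multiple lysines per peptide; the peptide counts in the example are hypothetical and are not taken from cytochrome c or Gas2p.

```python
from math import comb


def candidate_pairs(n_lysine_peptides):
    """Number of unordered pairs of distinct lysine-containing peptides."""
    return comb(n_lysine_peptides, 2)


if __name__ == "__main__":
    for n in (10, 20, 40):  # hypothetical numbers of lysine-containing peptides
        print(n, "peptides ->", candidate_pairs(n), "candidate pairs")
```

Because the count grows roughly with the square of the number of peptides, the chance that some pair matches an observed mass within the instrument tolerance rises quickly for larger proteins.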