Considerations for developing and validating an LC–MS biomarker assay using the surrogate peptide approach

Written by Timothy Sikorski

Timothy Sikorski, PhD is an Investigator and Associate Fellow in the Exploratory Biomarker Assay Group within the In Vitro / In Vivo Translation Platform at GlaxoSmithKline, and is based in King of Prussia, PA, USA.

After graduating from the University of Pennsylvania (PA, USA), Tim completed his PhD at Harvard University (MA, USA), where he developed proteomic methods to study the dynamics of protein complexes during transcription. Tim joined GSK (PA, USA) as a member of the Biological Mass Spectrometry group in Molecular Discovery Research. There, he developed mass spectrometry-based methods to map post-translational modifications, such as acetylation and phosphorylation, on a proteome-wide scale for mechanism-of-action studies and to identify potential biomarkers. Tim then transitioned to the Exploratory Biomarkers Group, where he has been working on developing novel methods for measuring endogenous protein and metabolite biomarkers in systemic matrices to support early Experimental Medicine clinical trials. These assays are serving as important pharmacodynamic endpoints in proving target engagement and mechanisms of action of GSK medicines.

GlaxoSmithKline, PTS – In Vivo / In Vitro Translation; Bioanalysis, Immunogenicity & Biomarkers; 709 Swedeland Road, King of Prussia, PA 19406 USA

Keywords: LC–MS, Surrogate Peptide, Protein Biomarker, Digestion Efficiency, Protease

In recent years, comprehensive incorporation of biomarker measurements into drug development strategies and clinical protocols has become standard, as it is clear that biomarker-guided trial design can mitigate the risk of failure and enable more informative clinical experiments [1,2]. Analysis of protein biomarkers was traditionally done by activity-based or immuno-based assays, as these often could provide the sensitivity required to measure low-concentration but clinically important proteins in complex matrices [3,4]. With the advent of modern high-sensitivity instrumentation, liquid chromatography in conjunction with mass spectrometry (LC–MS) has become a key tool for biomarker concentration measurements, as stratifying biomarkers often require highly selective assays for novel proteoforms [5]. LC–MS assays can differentiate highly homologous proteins, isoforms, or post-translational modifications without the need to develop highly specific immunoreagents that can be cost- or time-prohibitive.

Ideally, LC–MS assays would detect and quantify intact protein molecules. However, large proteins become highly charged in the mass spectrometer, with the signal for a given population of protein molecules being split across many peaks. This can make low-level proteins difficult to detect and quantify. Instead, most protein biomarker assays quantify an enzymatically derived peptide that represents a protein or a domain of that protein, a strategy known as the surrogate peptide approach [6]. Protein digestion with a site-specific protease is an extra sample processing step that needs to be carefully considered during assay development and validation. It is vital to ensure that the surrogate peptide sequence chosen is unique to the biomarker of interest and can differentiate between homologous proteins. The choice of peptides should ultimately be driven by the biology that is being probed. For example, selecting peptides from a tissue-specific isoform region of the protein might provide the most informative marker that a therapeutic is engaging with its target at the desired site of action.
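As a rough illustration of the uniqueness check described above, the following Python sketch performs a simplified in-silico tryptic digest (cleavage after K or R, but not before P, with no missed cleavages) and keeps only the peptides that are absent from a set of homologous sequences. The sequences, function names, and peptide length limits are hypothetical and chosen only for illustration; this is not a substitute for dedicated assay design software.

```python
import re


def tryptic_peptides(sequence, min_len=6, max_len=25):
    """Simplified in-silico trypsin digest.

    Cleaves C-terminal to K or R unless the next residue is P.
    Missed cleavages are not modelled. Peptides outside the length
    window (an assumed, adjustable filter) are discarded.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if min_len <= len(p) <= max_len]


def unique_surrogate_candidates(target, homologs, min_len=6, max_len=25):
    """Tryptic peptides of `target` that occur in none of the `homologs`."""
    candidates = tryptic_peptides(target, min_len, max_len)
    return [p for p in candidates if not any(p in h for h in homologs)]


# Hypothetical toy sequences that differ by a single residue (N -> S).
target_seq = "MSTAGKVLEDFPQRAAVTLNEWGKDPLTSYIEHRGG"
homolog_seq = "MSTAGKVLEDFPQRAAVTLSEWGKDPLTSYIEHRGG"

print(tryptic_peptides(target_seq))
print(unique_surrogate_candidates(target_seq, [homolog_seq]))
```

Sequence uniqueness alone is not sufficient in practice. Leucine and isoleucine have identical masses, for example, so a peptide that differs from a homolog only at such a position would still depend on chromatographic separation.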

There are several open-access tools that can help pick the most mass spectrometry-friendly surrogate peptides for your protein of interest. Applications such as Skyline work with instrument vendor software to help identify ideal surrogate peptides, refine targeted methods to optimize for selectivity and sensitivity, and provide quantitation of the surrogate peptides [7]. Databases such as PeptideAtlas store experimental peptide spectra from a variety of organisms that have been compiled from a diverse set of tandem mass spectrometry proteomics experiments [8]. These spectra can be analyzed to identify which peptides and their fragments will potentially give the best selectivity and sensitivity as surrogates for the protein biomarker of interest.

Robust and reproducible digestion of the protein into peptides is just as crucial as peptide selection for accurate quantification of the protein biomarker across samples [9]. Assessing and optimizing digestion efficiency during method development is critical for developing a robust and accurate LC–MS protein biomarker assay. Traditionally, processes for optimizing digestion efficiency were largely informed by practices from the pharmacokinetic space. Scientists with expertise in developing LC–MS assays for biotherapeutics using the surrogate peptide method would approach method development for biomarker digestion parameters in an analogous manner. This usually included spiking the reference standard into a matrix and varying conditions such as time, temperature, enzyme concentration and denaturing additives. Conditions were optimized to provide a method with the highest sensitivity and the required selectivity as measured by the mass spectrometer.

There are important differences between biotherapeutics and endogenous proteins that make this approach less effective for biomarker assays. Whereas a biotherapeutic reference standard is equivalent to the material that is dosed and measured in patient samples, endogenous protein biomarkers in matrix typically have very different properties from those of the recombinant protein used as a reference standard. Endogenous proteins are often found in complexes with other proteins and can have numerous post-translational modifications that are not found on recombinant proteins produced in an unrelated organism or cell line. Also, endogenous proteins may have a different folding structure than a recombinant version or be found in a different multimeric state. Taken together, these differences may make the endogenous protein significantly more or less resistant to proteolytic digestion compared with the recombinant standard. Therefore, optimization of digestion conditions using a reference standard alone is not sufficient for biomarker assays, as it can lead to under- or overestimation of the true endogenous analyte during study support.

It is critical to perform all digestion optimization experiments with both the reference standard and a true matrix, measuring the endogenous analyte. This begins by testing the LC–MS response as a function of trypsin concentration, with the goal of identifying a plateau where increasing enzyme concentration no longer increases the LC–MS response for the selected peptide. For an optimal method, the enzyme concentration chosen for the assay should sit somewhere in the middle of this plateau. This will ensure that small variations in enzymatic activity due to temperature fluctuation or digestion time have an insignificant effect on digestion and therefore do not affect analyte quantification.
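The plateau search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the plateau is defined here as the trailing run of titration points whose response is within a fractional tolerance of the maximum, and the 5% tolerance, the three-point minimum, the function name and the data values are all hypothetical choices rather than part of any validated procedure.

```python
def find_plateau(amounts, responses, tolerance=0.05, min_points=3):
    """Return (low, high) enzyme amounts bounding the response plateau.

    The plateau is taken as the trailing run of points (highest enzyme
    amounts) whose response is within `tolerance` of the maximum response.
    Returns None if fewer than `min_points` points qualify.
    """
    if len(amounts) != len(responses) or not amounts:
        raise ValueError("amounts and responses must be non-empty and equal in length")
    pairs = sorted(zip(amounts, responses))
    top = max(r for _, r in pairs)
    start = len(pairs)
    # Walk down from the highest enzyme amount until the response drops off.
    for i in range(len(pairs) - 1, -1, -1):
        if pairs[i][1] >= (1 - tolerance) * top:
            start = i
        else:
            break
    if len(pairs) - start < min_points:
        return None
    return pairs[start][0], pairs[-1][0]


# Hypothetical titration: trypsin amount (micrograms) vs. normalized response.
trypsin_ug = [2, 4, 6, 8, 10, 12, 14, 16]
response = [0.25, 0.52, 0.81, 0.97, 1.00, 1.01, 0.99, 1.00]

plateau = find_plateau(trypsin_ug, response)
if plateau is not None:
    low, high = plateau
    print(f"plateau from {low} to {high} ug; midpoint {(low + high) / 2} ug")
```

Running the same function separately on the reference standard curve and on the endogenous analyte curve would show whether the two plateaus begin at the same enzyme amount.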

As an example, when our group was developing a protein biomarker assay in plasma, we found a reference standard protein was fully digested with just 6 micrograms of trypsin. The endogenous protein, however, was fully digested with 10 micrograms of trypsin, but only partially digested with 8 micrograms. Therefore, for this assay, we needed to use more than 10 micrograms of enzyme to ensure that lot-to-lot enzyme activity variability would not affect the assay’s robustness while ensuring full digestion of both reference standard and endogenous analyte. Notably, had this method development been done only with reference standard, an assay could have been validated that would lead to only partial digestion of the endogenous analyte, resulting in an inaccurate measurement of our protein biomarker.

Although the above example varied the amount of trypsin in the sample, other assay conditions, such as temperature or digestion time, could be investigated to provide a similar result. In addition, several additives have proven useful to increase digestion efficiency. Chaotropic agents like urea and guanidine can make proteins more amenable to digestion [9]. Reducing disulfide linkages with compounds such as dithiothreitol or TCEP can also help to unfold proteins. Similarly, detergents or small amounts of organic solvents can facilitate disruption of protein interactions and unfolding of biomarker proteins, allowing easier access for the proteolytic enzymes and enhancing digestion. However, with all these additives, care needs to be taken to ensure that they do not substantively affect the activity of the protease enzyme. In addition, they often need to be removed prior to MS analysis, as they can affect chromatographic performance and/or ionization efficiency. MS-friendly detergents, which either degrade during protein digestion or can be precipitated and removed at low pH prior to MS analysis, have been successfully implemented in sample processing protocols.

Indeed, with the myriad of options available, optimizing digestion conditions can be a long and arduous process, but this step is critical to ensure accurate and robust measurement of protein biomarkers with LC–MS. In our lab, automation is becoming increasingly utilized for many aspects of sample preparation during study support, yet it remains seldom applied to the method development process [10,11]. However, we see a convincing case for using many of these automation tools for systematic method development, including protein digestion. The incorporation of liquid handlers and digital dispensers can allow for hundreds of different conditions to be set up in a matter of minutes, which can then be screened with LC–MS to identify the optimal digestion conditions. This allows for more systematic procedures that avoid analyst bias and ensure that the best protocol is identified for the analyte of interest.
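To give a sense of how such a screen might be laid out, the sketch below builds a full-factorial grid of digestion conditions and assigns each one to a well of a 96-well plate. The factor names, levels and row-by-row plate mapping are hypothetical assumptions for illustration; translating the layout into instructions for a particular liquid handler or dispenser is instrument-specific and not shown.

```python
from itertools import product


def digestion_grid(factors):
    """Full-factorial list of condition dicts from {factor_name: [levels]}."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


def assign_wells(conditions, rows="ABCDEFGH", n_cols=12):
    """Map conditions to plate wells row by row (A1, A2, ..., H12)."""
    wells = [f"{row}{col}" for row in rows for col in range(1, n_cols + 1)]
    if len(conditions) > len(wells):
        raise ValueError("more conditions than wells on one plate")
    return dict(zip(wells, conditions))


# Hypothetical factors and levels for a digestion screen.
factors = {
    "trypsin_ug": [4, 8, 12, 16],
    "time_h": [2, 4, 16],
    "temp_c": [37, 50],
    "urea_m": [0, 1],
}

grid = digestion_grid(factors)
layout = assign_wells(grid)
print(len(grid), "conditions")
print("A1:", layout["A1"])
```

A full-factorial design grows multiplicatively with each added factor, so larger screens may call for a fractional design or multiple plates.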

In addition, learning more about the endogenous analyte and how it differs from the reference standard should become a more standard part of protein biomarker method development and validation. A thorough characterization of the post-translational modifications, isoforms, and protein interaction partners of the analyte in the matrix of interest can be obtained with proteomic approaches like data-dependent acquisition mass spectrometry and database searching [12]. This strategy provides a deeper understanding of the protein of interest in a disease-relevant setting. This information can aid in determining what can serve as a suitable reference standard, which region of the protein the surrogate peptide should be picked from, and whether a multiplex assay could be developed to monitor multiple proteoforms, helping to identify the most stratifying biomarker.

In conclusion, there are important differences between biotherapeutic proteins and endogenous protein biomarkers that need to be considered, and therefore a new paradigm for method development and validation is necessary. This process begins with a thorough characterization of reference standards and authentic analytes to help identify which peptides will be the most useful to measure as surrogates for the total protein. Then, with help from the ever-expanding toolbox of automation tools available, a systematic approach is taken to identify the best sample processing conditions that ensure that the surrogate peptide will provide an accurate measurement of endogenous protein in the matrix. Finally, we can move to a validation that assesses parallelism of the recombinant reference protein and the endogenous biomarker, to ensure quantitative accuracy across the range of the assay [13]. The use of authentic quality control samples during validation and study sample analysis is key to ensure reproducible digestion of the endogenous analyte across assay runs. Although here we focused on the endogenous and recombinant protein analytes, similar care should be taken in selecting proper internal standards that normalize for sample-to-sample variance in digestion efficiency [14]. Through this process, specific, quantitative and robust assays for the desired analyte can be developed while minimizing unforeseen obstacles during study support and ultimately providing direct evidence of target engagement and clinical efficacy.
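One common way to examine the parallelism mentioned above is to serially dilute a sample containing the endogenous analyte, read each dilution against the recombinant calibration curve, and check that the dilution-corrected concentrations agree. The sketch below summarizes that agreement as a percent coefficient of variation. It assumes this particular approach, and the function name and data values are hypothetical; acceptance limits are assay-specific and not defined here.

```python
from statistics import mean, stdev


def parallelism_cv(dilution_factors, measured_conc):
    """Dilution-corrected concentrations and their %CV.

    `measured_conc` holds concentrations back-calculated from the
    recombinant calibration curve for each dilution of one sample.
    """
    if len(dilution_factors) != len(measured_conc) or len(measured_conc) < 2:
        raise ValueError("need at least two paired dilution/concentration values")
    corrected = [d * c for d, c in zip(dilution_factors, measured_conc)]
    cv_percent = 100 * stdev(corrected) / mean(corrected)
    return corrected, cv_percent


# Hypothetical serial dilution of one endogenous sample (ng/mL).
dilutions = [1, 2, 4, 8]
measured = [100.0, 52.0, 24.0, 13.0]

corrected, cv = parallelism_cv(dilutions, measured)
print("corrected:", corrected)
print(f"%CV: {cv:.2f}")
```

A low %CV across dilutions suggests the endogenous analyte and the recombinant standard respond proportionally over the tested range, while a systematic trend with dilution would point to a lack of parallelism.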


  1. Morgan P, et al. Impact of a five-dimensional framework on R&D productivity at AstraZeneca. Nat. Rev. Drug Discov. 17(3), 167–181 (2018).
  2. Townsend MJ and Arron JR. Reducing the risk of failure: biomarker-guided trial design. Nat. Rev. Drug Discov. 15(8), 517–518 (2016).
  3. Geyer PE, et al. Revisiting biomarker discovery by plasma proteomics. Mol. Syst. Biol. 13(9), 942 (2017).
  4. Hottenstein C, et al. Platforms and techniques used for biomarker assays: where are we now? Bioanalysis, 9(14), 1029–1031 (2017).
  5. Hoofnagle AN and Wener MH. The fundamental flaws of immunoassays and potential solutions using tandem mass spectrometry. J. Immunol. Methods, 347(1–2), 3–11 (2009).
  6. Anderson NL, et al. A human proteome detection and quantitation project. Mol. Cell Proteomics, 8(5), 883–886 (2009).
  7. Pino LK, et al. The Skyline ecosystem: Informatics for quantitative mass spectrometry proteomics. Mass Spectrom. Rev. (2017).
  8. Kusebauch U, et al. Human SRMAtlas: A Resource of Targeted Assays to Quantify the Complete Human Proteome. Cell, 166(3), 766–778 (2016).
  9. Szapacs M, et al. Utilizing enzymatic digestion procedures in the bioanalytical laboratory. Bioanalysis, 8(1), 29–36 (2016).
  10. Patel V, et al. Automating bioanalytical sample analysis through enhanced system integration. Bioanalysis, 5(13), 1649–1659 (2013).
  11. Li M. Automation in the bioanalytical laboratory: what is the future? Bioanalysis, 5(23), 2859–2861 (2013).
  12. Aebersold R and Mann M. Mass-spectrometric exploration of proteome structure and function. Nature, 537(7620), 347–355 (2016).
  13. Jones BR, et al. Surrogate matrix and surrogate analyte approaches for definitive quantitation of endogenous biomolecules. Bioanalysis, 4(19), 2343–2356 (2012).
  14. Shuford CM, et al. Absolute Protein Quantification by Mass Spectrometry: Not as Simple as Advertised. Anal. Chem. 89(14), 7406–7415 (2017).

Financial & competing interests disclosure

The author has no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.