Large-Scale Label-Free Quantitative Mapping of the Sputum Proteome
Analysis of induced sputum supernatant is a minimally invasive approach to studying the epithelial lining fluid, and thereby provides insight into normal lung biology and the pathobiology of lung diseases. We present a novel proteomics approach to sputum analysis, developed within the international U-BIOPRED (unbiased biomarkers predictive of respiratory disease outcomes) project, together with practical and analytical techniques to optimize the detection of robust biomarkers in proteomic studies. The normal sputum proteome was derived using data-independent HDMSE applied to 40 healthy nonsmoking participants, providing an essential baseline against which to compare modulation of protein expression in respiratory diseases. The "core" sputum proteome (proteins detected in >40% of participants) was composed of 284 proteins, and the extended proteome (proteins detected in >3 participants) contained 1666 proteins. Quality control procedures were developed to optimize the accuracy and consistency of measurement of sputum proteins and to analyze the distribution of sputum proteins in the healthy population. The analysis showed that quantitation of proteins by HDMSE is influenced by several factors. Some proteins are measured in all participants' samples, with low measurement variance between samples from the same participant. The measurement of other proteins is highly variable between repeat analyses, susceptible to sample processing effects, or difficult to quantify accurately by mass spectrometry. Still other proteins show high interindividual variance. We also show that the sputum proteome of healthy individuals is related to sputum neutrophil levels, but not to gender or allergic sensitization. We illustrate the importance of considering such population and technical measurement variance when designing and interpreting disease biomarker studies.
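The two detection thresholds quoted above (core: detected in >40% of participants; extended: detected in >3 participants) can be illustrated with a minimal sketch. This is not the study's pipeline: the function name `classify_proteome`, its parameters, and the input layout (a mapping from protein identifier to the set of participants in whom it was detected) are hypothetical, chosen only to show how the cut-offs apply.

```python
from typing import Dict, Set, Tuple


def classify_proteome(
    detections: Dict[str, Set[str]],
    n_participants: int,
    core_percent: int = 40,   # core: detected in MORE than this percentage of participants
    extended_min: int = 3,    # extended: detected in MORE than this many participants
) -> Tuple[Set[str], Set[str]]:
    """Split proteins into 'core' and 'extended' sets by detection frequency.

    `detections` maps a protein identifier to the set of participant IDs
    in whom the protein was detected (a hypothetical input layout).
    """
    core: Set[str] = set()
    extended: Set[str] = set()
    for protein, participants in detections.items():
        n_detected = len(participants)
        if n_detected > extended_min:
            extended.add(protein)
        # Integer arithmetic avoids floating-point edge cases at the boundary.
        if n_detected * 100 > core_percent * n_participants:
            core.add(protein)
    return core, extended
```

With 40 participants, the strict inequalities mean a protein needs at least 17 detections to count as core and at least 4 to count as extended, so in this sketch every core protein is also in the extended set.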