Optimisation of urine sample preparation for shotgun proteomics

Abstract Urine reflects the function of the kidneys and urinary tract, but it may also reflect the presence of cancer in other parts of the body. Because it can be monitored non-invasively, urine also has potential for providing prognostic information during therapeutic treatment. A quick and reproducible protein purification procedure is essential to allow data comparison between proteomic studies in urine biomarker discovery. This article describes a simple, reproducible and cheap sample preparation procedure, based on cut-off filter desalting and digestion, with a maximum protein yield (400 µg) obtained from only 10 mL of urine. The reported procedure removes yellowish coloured residues and thus prevents errors in spectrophotometric protein concentration determination. The different extraction solvents used in the presented procedure point to the possibility of partially eliminating abundant proteins (albumin and the keratin family) and of improving the sequence coverage of the proteins identified, which helps to reveal changes in the urinary proteome. With this workflow, proteins can be easily obtained on standard laboratory equipment within 3 h. Data are available via ProteomeXchange with identifier PXD019738.


Introduction
The aim of preventive diagnostics in medicine is to find suitable biomarkers in easily accessible body fluids. The standardized definition of a proteomic biomarker was proposed as "a specific peptide or protein that is associated with a specific condition, such as the onset, manifestation, or progression of a disease or a response to treatment" [1]. Urine is a body fluid that can be collected non-invasively and risk free, but for a long time it was neglected in proteomic analysis in the search for biomarkers. It was thought that urine could not contain sufficient amounts of peptides and proteins for qualitative or quantitative proteomics. A significant breakthrough in this field came with the great improvement in LC/MS measurements at the beginning of the twenty-first century, when the first publications about proteins identified in urine appeared. Recently, a number of genome-wide association studies have revealed that some proteins present in urine (uromodulin and albumin) can serve as biomarkers of developing hypertension and chronic kidney diseases in the general population [2,3]. Although advancements in analytical techniques allow proteins to be identified ever more precisely, recent studies show that the use of a single existing urinary biomarker is not accurate enough to predict cancer [4,5]. Thus, monitoring a whole panel of promising, easily obtainable biomarkers is needed.
Human urine is a very complex matrix comprising 95% water and a mixture of water-soluble components such as urea, NaCl, KCl, various amounts of organic acids such as oxalic and citric acid, phosphates, poorly water-soluble uric acid and creatinine. It contains soluble and insoluble proteins, peptides, extracellular vesicles, nucleic acids, cells and cell debris. In the case of infection, leucocytes and a higher quantity of proteins are also present. The yellowish colour of urine is mainly caused by urochrome, the haemoglobin degradation product, but also by urobilin, an orange-brown pigment, and uroerythrin, which has a pink colour [6]. All these colour components complicate proteomic determination, usually by clogging the HPLC nano-columns, and they influence protein concentration measurement by the Bradford method. Having a molar mass of less than 600 Da, they are expected to be eliminated from the sample together with other small molecules, e.g. salts. The standard procedure for sample desalting is protein precipitation with a suitable organic solvent such as ethanol, various concentrations of acetone, trichloroacetic acid, etc. Precipitation is usually conducted overnight, as it takes 12-18 h. The deficiency of the precipitation procedure is that a small portion of the yellow components and salts remains in the pellet. Washing the protein pellet with water or a mixture of organic solvent with water can partially improve the technique, but on the other hand, it can lower the protein yield. A well-known technique for lowering the salt content, dialysis through a cut-off membrane, is very time and solvent consuming and inconvenient for analysts, although it causes little loss of protein. Moreover, the proteins remain in a large volume of solvent and certain amounts of pigments and salts are still present.
To evaporate the solvent, vacuum concentrators are employed, but this step is time-consuming, and the stability of the proteins during evaporation should be taken into account. Nowadays, ultrafiltration through cut-off filter membranes is used with the same or even better results. Thanks to centrifugal force, the sample preparation time is shorter and proteins remain in a small amount of solvent. Currently, there are many different membranes on the market, usually with molecular weight cut-offs from 3 up to 100 kDa. To cut off salts and other small molecules in urinary sample preparation, 3 or 10 kDa cut-off filters are used depending on the protein of interest. Our work aimed to optimize the ultrafiltration step of urine sample preparation for subsequent LC/MS analysis, for maximum protein yield from 10 and 20 mL of urine. This was followed by protein digestion, comparing digestion on the cut-off membrane, so-called filter-aided sample preparation (FASP), an in-house modified Wisniewski protocol [7], with digestion in a vial. A standard solid phase extraction method for desalting the urine sample was also tested (results are not included). Consistent with the observations of Sigdel [8], urinary contaminants still persisted.

Sample collection
To optimize the method, urine samples were collected from two healthy volunteers (male and female, age 35-46, Caucasian) with signed written informed consent approved by the local ethics committee. All method modifications were applied to a pooled sample consisting of male and female urine samples to eliminate differences between the sexes. The volunteers did not use any medication during the 30 days prior to the urine collection, nor did they have a clinically significant history. It is well known that the number of proteins in urine varies significantly over the course of a day, with more proteins present in the morning urine compared to a common 24-h urine [9,10]. All urine samples were collected as first morning urine as a midstream, clean catch specimen, in a sterile urine container. During transport, the samples were kept at room temperature and analysed within 1 h. Protease inhibitors were not added to the samples, since some studies have shown that they reduce protein identification and may interfere with the subsequent digestion procedure in untargeted urine proteomics [11,12].

Sample preparation
Insoluble materials were removed from the urine by centrifugation at 2,500 × g at 21°C for 30 min. The Centriprep filter unit was initially washed with 10 mL of deionized water to remove traces of glycerine by centrifuging at 3,000 × g at 21°C for 20 min. Water in both the container and filtrate collector was discarded. A total of 10 mL of previously centrifuged urine sample was applied to the sample container of the Centriprep and centrifuged at 3,000 × g at 21°C for 30 min, and the filtrate was discarded. In the case of a 20 mL urine sample, another 10 mL of urine supernatant was applied to the sample container and the centrifugation step was repeated. To reduce the salt concentration in the urine concentrate, filtration against three solvents was tested: 0.005 mol dm⁻³ NaCl (method numbers 1, 4, 5 and 8 in Table 1), 0.1 mol dm⁻³ ammonium bicarbonate (AMB; method numbers 2 and 6 in Table 1) and water (method numbers 3 and 7 in Table 1). The filtration process was the same for all tested solvents: the urine concentrate in the sample container was diluted with 10 mL of solvent and centrifuged at 3,000 × g at 21°C for 30 min, and the filtrate was discarded. The dilution step was repeated, the sample was centrifuged again for 40 min, and the filtrate was discarded. The sample was centrifuged again at 3,000 × g at 21°C for 30 min to obtain approximately 500 µL of sample. If the sample volume obtained was larger, the spin time was extended. The obtained samples were next subjected to two different digestion processes together with their modifications. The first process involved digestion on an Amicon centrifugal filter unit (method numbers 1, 2, 3, 5, 6 and 7 in Table 1). The second involved precipitation of proteins with ethanol/acetone followed by digestion in a vial (method numbers 4 and 8 in Table 1).
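As a rough check on the time demand of this stage, the spin times listed above can be tallied. The sketch below is only a bookkeeping aid under stated assumptions: the step labels are illustrative names of our own, the steps are assumed to run strictly one after another, and handling time between spins and any extended final spin are ignored.

```python
# Spin times (minutes) as listed in the sample preparation text.
# Step labels are illustrative, not terms taken from the protocol.
BASE_STEPS = [
    ("clear urine at 2,500 x g", 30),
    ("pre-wash Centriprep with water", 20),
    ("concentrate first 10 mL of urine", 30),
    ("first solvent dilution", 30),
    ("second solvent dilution", 40),
    ("final concentration to about 500 uL", 30),
]


def total_spin_minutes(urine_ml: int) -> int:
    """Sum centrifugation time, assuming strictly sequential steps."""
    if urine_ml not in (10, 20):
        raise ValueError("only the 10 mL and 20 mL variants are described")
    total = sum(minutes for _, minutes in BASE_STEPS)
    if urine_ml == 20:
        total += 30  # the second 10 mL portion repeats the 30 min spin
    return total


print(total_spin_minutes(10))  # 180
print(total_spin_minutes(20))  # 210
```

Under these assumptions, the 10 mL variant adds up to 180 min of centrifugation, which is in line with the roughly 3 h quoted in the abstract for obtaining the proteins.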
Details of all the digestion steps are as follows. FASP: concentrated urine was transferred to the Amicon filter unit and centrifuged at 7,000 × g for 30 min at 4°C. The flow-through filtrate was discarded. To reduce disulphide bonds, 100 µL of 0.01 mol dm⁻³ DTT in 0.1 mol dm⁻³ AMB was added, and the filter unit was placed in a thermomixer for 45 min at 37°C. Next, the filter unit was centrifuged for 10 min at 3,000 × g at 21°C. Alkylation was performed by adding 100 µL of 0.05 mol dm⁻³ IAA in 0.1 mol dm⁻³ AMB to the sample, and the reaction was performed in a thermomixer for 30 min at 37°C in the dark. Subsequently, the filter unit was centrifuged for 10 min at 3,000 × g at 4°C. The proteins on the filter were washed with 500 µL of ACN/water 1/1 v/v solution, vortexed properly and centrifuged for 10 min at 3,000 × g at 4°C. Afterwards, the proteins were washed twice with 500 µL of 0.1 mol dm⁻³ AMB and centrifuged for 10 min at 4,000 × g at 4°C. A volume up to 400 µL should be obtained. If the volume was higher, an additional spin step was added. For digestion, 100 µL of 0.002 mol dm⁻³ CaCl₂ in 0.1 mol dm⁻³ AMB with 30 µL of trypsin was added to the sample, and the filter unit was placed in a thermomixer at 37°C overnight. The next day, the digest was centrifuged for 45 min at 7,000 × g at 4°C, and the reaction was stopped by the addition of approximately 10 µL of concentrated formic acid to adjust the pH to 3-4, and the sample was submitted to HPLC-MS/MS analysis.
Table 1 abbreviations: NaCl: 0.005 mol dm⁻³ NaCl; AMB: 0.1 mol dm⁻³ ammonium bicarbonate; water: distilled water; TRIS urea: 8 mol dm⁻³ urea in 0.1 mol dm⁻³ TRIS pH = 8; FASP: disulphide bond reduction, alkylation and digestion processes on an Amicon centrifugal filter unit 3 kDa; precipitation: disulphide bond reduction, alkylation and digestion processes after protein precipitation in solution; trypsin volume fixed: 30 µL; counted: trypsin amount was calculated according to the measured protein concentration in the sample in a trypsin-protein ratio of 1:50.
The precipitation procedure with ethanol/acetone and in-solution digestion in a vial is as follows: the concentrated urine sample after centrifugation on Centriprep was precipitated with 3 mL of chilled ethanol and placed for 1 h at −20°C. Then, 3 mL of chilled acetone was added, and the sample was left overnight at −20°C. The following day, the sample was centrifuged for 15 min at 13,000 × g at 4°C. The supernatant was discarded, and after acetone evaporation, the pellet was dissolved in 300 µL of 8 mol dm⁻³ urea in 0.1 mol dm⁻³ TRIS pH = 8, and sonication was used to improve solubility. Protein concentration was determined by the Bradford assay. DTT reducing solution (0.1 mol dm⁻³ DTT in 0.1 mol dm⁻³ TRIS) was used in an amount of 1/10 weight of the protein amount taken for digestion with a 10% excess, and the samples were incubated for 30 min at 37°C. The alkylation solution (0.5 mol dm⁻³ IAA in 0.1 mol dm⁻³ TRIS) was added in an amount of 1/10 volume of the protein sample. The reaction was carried out in the dark at 37°C. To remove the excess IAA, 1/5th of the previously added volume of DTT was added to the sample and incubated for 30 min at 37°C. Subsequently, the sample was diluted eight times with 0.002 mol dm⁻³ CaCl₂ in TRIS to reduce the urea content below 1 mol dm⁻³. Trypsin was added in a trypsin-protein ratio of 1:50, and the sample was incubated at 37°C overnight. The reaction was quenched by adding concentrated formic acid to achieve a pH in the range of 3-4. The solution was centrifuged at 13,000 × g at 4°C for 15 min, and the supernatant was concentrated in a speed vac to 200 µL and subjected to HPLC-MS/MS analysis.
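Two of the quantities in the in-solution digestion follow from simple arithmetic: the trypsin mass at a 1:50 trypsin-protein ratio and the urea concentration after the eightfold dilution. The following is a minimal sketch of that arithmetic. The function names are hypothetical, the ratio is assumed to be by weight, and the example inputs (100 µg of protein, 30 µL of added reagent) are illustrative values rather than measurements from this study.

```python
def trypsin_mass_ug(protein_ug: float, ratio: float = 50.0) -> float:
    """Trypsin mass for a trypsin:protein ratio of 1:ratio (assumed w/w)."""
    return protein_ug / ratio


def urea_after_dilution(urea_molar: float, sample_ul: float,
                        added_reagent_ul: float, dilution_fold: float) -> float:
    """Urea concentration after reagent additions and an n-fold dilution.

    added_reagent_ul is the total volume of urea-free solutions (for
    example DTT and IAA) added to the sample before the dilution step.
    """
    before_dilution = urea_molar * sample_ul / (sample_ul + added_reagent_ul)
    return before_dilution / dilution_fold


# Illustrative input: 100 ug of protein taken for digestion.
print(trypsin_mass_ug(100.0))  # 2.0

# 300 uL of 8 mol/dm^3 urea plus 30 uL of IAA solution (1/10 of the
# sample volume), then diluted eightfold; the DTT volume is left out here.
print(round(urea_after_dilution(8.0, 300.0, 30.0, 8.0), 3))  # 0.909
```

An eightfold dilution of exactly 8 mol dm⁻³ urea would give exactly 1 mol dm⁻³, so it is the volume of the reagents added beforehand that brings the final concentration below that value.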

HPLC-MS/MS analysis and protein identification
An AmaZon speed ETD ion trap mass spectrometer (Bruker Daltonik, Germany) coupled with an Ultimate 3000 RSLC NCP system (Thermo Scientific, USA) was used for HPLC-MS analysis. Trypsin-treated samples were analysed using a Compass 1.   Table 3. Fractions were collected in a 70 min elution gradient, starting from 7 min of run at 4-6 min intervals to obtain six fractions, which was terminated after 33 min. Peptide analysis was performed by HPLC-MS/MS as described above.

Results and discussion
The aim of the study was to optimize the method of protein extraction from urine samples. Two basic digestion processes for LC-MS/MS analysis were compared, on the filter unit and in the vial, to achieve maximum protein recovery using two sample amounts (10 and 20 mL). The results are shown in Table 4. The reported numbers of proteins identified are averages of at least three biological samples. Each sample was measured as a technical duplicate. As expected, the amount of urine collected for analysis plays an important role in the amount of isolated proteins. Using the Bradford concentration assay, on average 400 µg of protein was obtained from 10 mL of urine and about 800 µg from 20 mL. However, the higher amount of urine significantly increases the noise in MS measurements. In 1D LC-MS/MS analysis, doubling the urine volume yielded only about 26% more identified proteins (124 IDs from 10 mL of urine compared to 156 proteins from 20 mL, performed in AMB solution). In addition, significant non-reproducibility of protein identifications obtained from 20 mL of urine by 1D analysis was observed, as shown in Figure 1. The coefficient of variation calculated from the numbers of proteins identified was 13.8% for a 20 mL urine sample in comparison to 8.5% for a 10 mL sample.
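The two summary figures used in this comparison, the relative gain in identifications and the coefficient of variation, can be reproduced with a few lines of Python. This is a sketch under assumptions: the function names are our own, the sample standard deviation is assumed (the text does not state which estimator was used), and the replicate counts in the second example are made-up placeholders, not data from Table 4.

```python
from statistics import mean, stdev


def percent_increase(baseline: float, new: float) -> float:
    """Relative change with respect to the baseline value, in percent."""
    return 100.0 * (new - baseline) / baseline


def coefficient_of_variation(values) -> float:
    """CV in percent, using the sample standard deviation (assumption)."""
    return 100.0 * stdev(values) / mean(values)


# Protein IDs reported in the text: 124 (10 mL) versus 156 (20 mL).
print(round(percent_increase(124, 156), 1))  # 25.8

# Hypothetical replicate counts, for illustration only.
print(round(coefficient_of_variation([120, 130, 110]), 1))  # 8.3
```

Relative to the 10 mL result, the gain from 124 to 156 identifications is about 26%; expressed relative to the 20 mL result, the same difference is about 21%.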
By changing the solvent for disulphide bond reduction and alkylation from AMB to 4 mol dm⁻³ urea in 0.1 mol dm⁻³ TRIS, while the other reactants remained in AMB, the number of identified proteins increased by about 25%. When performing all digestion steps in TRIS solution, we observed a significantly higher wash peak in LC analysis compared to that in AMB solution. The spectra obtained by HPLC-MS/MS analysis are always cleaner in AMB than in TRIS, but with a slightly lower protein yield. This may be due to lower protein solubility in AMB than in urea in TRIS. The use of the precipitation step instead of the Amicon filter unit provided a comparable protein yield, but the percentage of compounds identified by the search engine was always lower: 2.2-2.4% for the precipitated samples diluted in TRIS compared to 2.5-2.8% for samples digested on an Amicon filter. However, the number of accepted proteins did not differ significantly. Figure 2 shows the percentage of proteins identified against their sequence coverage (SC) to compare the quality of protein identification depending on the isolation procedure.
Upon comparison of the proteins identified by the five isolation methods tested, namely (1) FASP with ultrafiltration against AMB, (2) FASP with ultrafiltration against water, (3) FASP with ultrafiltration against NaCl with two different volumes of urine, (4) FASP with ultrafiltration against water with reduction/alkylation reagents in TRIS urea and (5) ethanol/acetone precipitation, the FASP method achieved a higher protein yield than precipitation.
Across all tested methods, a permutation test with both the Benjamini-Hochberg and the Bonferroni correction showed no statistically significant difference in the number of peptides identified per protein. Although the highest number of proteins with an SC greater than 40% was identified with the precipitation method, this was observed only for abundant proteins such as immunoglobulin kappa constant, serum albumin, prostaglandin, protein AMBP, zinc α-2-glycoprotein and apolipoprotein. Reducing the complexity of the sample using SCX prior to MS/MS analysis increased the number of identified proteins to approximately 300 from 10 mL of urine and to more than 400 from a 20 mL urine sample. This indicates the high potential of a 20 mL urine sample to reveal higher proteome/PTM coverage. Non-reproducibility in 1D analysis with a 20 mL urine sample can be overcome using SCX as a pre-separation step.
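The kind of comparison described above can be sketched in plain Python. This is an illustrative sketch, not the analysis pipeline used in the study: the test statistic (absolute difference of group means), the number of permutations, the seed and all function names are assumptions made for the example.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values from four hypothetical pairwise method comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(raw))
print(benjamini_hochberg(raw))
```

Bonferroni controls the family-wise error rate and is the more conservative of the two, whereas Benjamini-Hochberg controls the false discovery rate; a difference that is non-significant under both, as reported here, is therefore non-significant under either criterion.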
To facilitate selection of the appropriate procedure, we monitored the number of peptides identified for the selected protein groups. The first group contains abundant proteins that affect the identification of minor proteins: ALB (serum albumin), KER (keratin; we summarize the peptides identified for keratin I and II cytoskeletal), TRFE (serotransferrin) and UROM (uromodulin). The second group consists of proteins that play a role in screening biomarkers: EGF (urogastrone), PSAP (prosaposin) and VTNC (vitronectin).
EGF (epidermal growth factor, known in humans as urogastrone) stimulates the growth of various epidermal and epithelial tissues in vivo and in vitro and of some fibroblasts in cell culture. Many aggressive types of cancers have excessive signalling through the EGF system. They either create excess amounts of EGF or develop mutant forms of the receptor that are unnaturally active [14]. Sasaki revealed that in the primary colon, tumours contribute to the spread of tumour cells to the lymph nodes and liver [15].
PSAP, a secreted protein, is a well-known pleiotropic growth factor. Several studies have shown that PSAP is overexpressed in breast cancer cell lines [16]. Koochekpour et al. found that PSAP is overexpressed in prostate cancer [17].
VTNC is a cell adhesion and spreading factor found in serum and tissues. Vitronectin interacts with glycosaminoglycans and proteoglycans. Kadowaki et al. demonstrated that vitronectin expression was significantly elevated in the serum of early and more advanced breast cancer patients compared to normal controls [18]. Turan et al. have proposed vitronectin as a marker in ovarian cancer [19]. Figure 3 shows that the precipitation method provides higher SC of the keratin family proteins and albumin. This leads to inferior identification of minor proteins such as PSAP and VTNC. The FASP method against water with reducing and alkylating agents in TRIS urea showed the lowest number of keratin peptide identifications. Control blank samples confirmed the absence of contamination by keratin proteins in the samples.
A combination of a Centriprep centrifugal filter followed by an Amicon filter unit and subsequent transfer of the protein solution from the filter into a vial for digestion was also tested. The protein yield was significantly lower, possibly due to protein adsorption on the cut-off membrane. The amount of proteins depended on the membrane washing process, while we still identified approximately 30 proteins.
Figure abbreviations: water: ultrafiltration against water; NaCl 10 mL: ultrafiltration against 0.005 mol dm⁻³ NaCl; NaCl 20 mL: ultrafiltration against 0.005 mol dm⁻³ NaCl, amount of urine 20 mL; water/TRIS: ultrafiltration against water, reducing and alkylating agents dissolved in TRIS urea; precip.: ethanol/acetone precipitation method.
Urine protein analysis can provide information about the urinary tract, the state of the organism and the risk of cancer. In order to monitor proteomic profile changes in urine, a rapid, simple, robust and especially reproducible method is needed. Our procedure combines two cut-off filter units: 10 kDa for urine purification and sample volume reduction, followed by a digestion process on a 3 kDa centrifugal unit. This leads to the removal of naturally coloured compounds, such as urobilin and urochrome, which falsely increase the protein concentration determined by the Bradford assay. Our modified FASP method yields approximately 400 µg of proteins from just 10 mL of urine, resulting in 220 proteins identified in a simple 80 min gradient on an ion trap mass spectrometer. By using an SCX pre-separation step with only six fractions, more than 400 proteins were identified under the same LC-MS/MS conditions. For label-free quantitative analysis, such as spectral counting, the SC of the desired proteins can be partially influenced by the selection of various diluents. We present this procedure to offer a robust and repeatable method that allows a large group of samples to be analysed without frequent LC/MS system cleaning and can serve as a standardized method for urine proteomics.