
Arne H. Smits and Michiel Vermeulen


Most proteins assemble into multi-subunit complexes to perform their cellular functions. To understand the biological role of a protein of interest, it is therefore important to identify its protein-protein interactions (PPIs). Besides the qualitative identification of PPIs, the protein complex architecture is also of major importance. By distinguishing core-subunits from substoichiometric complex subunits, key subunits and their intrinsic binding domains, enzymatic activities and/or regulatory functions can be identified.

Recent developments in quantitative mass spectrometry (qMS) have made it possible to screen for these protein-protein interactions in a comprehensive and unbiased manner (Vermeulen et al, 2008). Most recent qMS methods are based on label-free quantification, which does not rely on isotope labeling and is therefore ideally suited for PPI identification in any kind of tissue or cell type (Hubner et al, 2010). To determine the protein complex stoichiometry, the abundance of each interactor must also be known. This used to be accomplished by spiking in isotope-labeled reference peptides, a laborious and expensive method. Recently, however, computational methods have been developed that approximate protein abundance from the qMS intensity, one of which is Intensity Based Absolute Quantification (iBAQ) (Schwanhausser et al, 2011).
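As an illustration of the idea behind iBAQ, the minimal sketch below divides the summed peptide intensity of a protein by its number of theoretically observable tryptic peptides. It is not the MaxQuant implementation. The cleavage rule (after K or R, not before P), the 6 to 30 residue length window and the function names are assumptions made for this example.

```python
def tryptic_peptides(sequence):
    """In-silico digest: cleave after K or R unless the next residue is P.

    Simplified rule with no missed cleavages (an assumption of this sketch).
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def count_observable_peptides(sequence, min_len=6, max_len=30):
    """Count peptides inside an assumed observable length window."""
    return sum(min_len <= len(p) <= max_len for p in tryptic_peptides(sequence))


def ibaq(peptide_intensities, sequence):
    """Summed measured intensity per theoretically observable peptide."""
    n_observable = count_observable_peptides(sequence)
    if n_observable == 0:
        raise ValueError("no theoretically observable peptides for this sequence")
    return sum(peptide_intensities) / n_observable
```

Normalizing by the number of observable peptides compensates for the fact that larger proteins yield more peptides, and therefore more summed intensity, at the same molar amount.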

We recently combined the label-free quantitative PPI identification method QUBIC with the iBAQ algorithm, which enables easy and fast abundance determination of the identified PPIs. Thereby, we are able to determine the stoichiometry of protein complex subunits and can easily distinguish core-subunits from substoichiometric, transient interactions (Smits et al, 2013).

In our approach, we use single-step affinity purification of GFP-tagged proteins of interest and of an untagged control sample, followed by on-bead trypsin digestion and LC-MS/MS. In principle, our workflow can be adapted to any kind of single-step affinity purification. GFP-tagging is our method of choice, as tagging is optimized using bacterial artificial chromosomes (BACs) and the tagged exogenous protein is expressed at near-endogenous levels (Poser et al, 2008). Besides stable transfection using BACs, GFP-tagged proteins can also be expressed by other means, such as transient transfection. Additionally, the use of GFP-trap beads (Chromotek) enables highly efficient affinity enrichments without heavy and light chain contaminants (Rothbauer et al, 2008; see also EpiGeneSys protocol PROT44).

Once the GFP fusion protein is expressed in the tissue or cell type of choice, the cells are collected and nuclear extract is obtained as described before (Dignam et al, 1983). This nuclear extract and that of the untagged control are subjected to GFP-trap affinity purification. To allow statistical analysis of the label-free quantification, GFP and control purifications are done in triplicate. After extensive washing, the remaining proteins are digested using trypsin, as described (Hubner & Mann, 2011). Next, the peptides are purified on StAGE-tips (Rappsilber et al, 2007) and applied to LC-MS/MS. Raw data are analyzed by MaxQuant (Cox & Mann, 2008) and PPIs are identified using label-free quantification (Hubner et al, 2010). For each of the identified PPIs, we correct for background binding by subtracting the iBAQ value obtained in the control from the iBAQ value obtained in the GFP sample. These corrected abundance values are then scaled so that one protein, most commonly the bait, has a value of one, which reveals the stoichiometry of each interacting protein relative to that protein.
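The background correction and scaling described above can be sketched as follows. This is a minimal illustration and not the authors' analysis script. It assumes that the triplicate iBAQ values are averaged before subtraction and that a protein absent from the control has zero background. The function name and protein names are hypothetical.

```python
from statistics import mean


def relative_stoichiometry(gfp_ibaq, control_ibaq, bait):
    """Background-correct iBAQ values and scale them to the bait.

    gfp_ibaq / control_ibaq map a protein name to its replicate iBAQ values.
    Averaging replicates before subtraction is an assumption of this sketch.
    """
    corrected = {}
    for protein, gfp_values in gfp_ibaq.items():
        # A protein not detected in the control is treated as zero background.
        background = mean(control_ibaq.get(protein, [0.0]))
        corrected[protein] = mean(gfp_values) - background

    bait_abundance = corrected[bait]
    if bait_abundance <= 0:
        raise ValueError("bait is not enriched over the control")
    # Scale so that the bait has a stoichiometry of exactly one.
    return {protein: value / bait_abundance for protein, value in corrected.items()}


# Hypothetical example values (arbitrary intensity units).
gfp = {
    "BAIT": [100.0, 110.0, 90.0],
    "CORE1": [210.0, 220.0, 200.0],
    "TRANSIENT1": [6.0, 5.0, 7.0],
}
control = {"CORE1": [10.0, 10.0, 10.0], "TRANSIENT1": [1.0, 1.0, 1.0]}
stoichiometry = relative_stoichiometry(gfp, control, bait="BAIT")
```

With these made-up numbers, a value close to or above one would point to a core subunit, whereas a value far below one would point to a substoichiometric interactor.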

In this approach, iBAQ is used to estimate the abundance of identified interactors, based on the mass spec intensity of the tryptic peptides. Some peptides, however, can belong to multiple proteins, as these proteins are (partially) homologous. The MaxQuant software assigns such peptides to the protein with the most unique peptides. This leads to overestimation of the stoichiometry of that particular protein and underestimation of the stoichiometry of the other proteins that give rise to the peptide. To circumvent this problem, we collapse the stoichiometries of proteins with shared peptides into a single value. We thereby lose resolution, as we cannot distinguish the stoichiometries of the individual proteins, yet we can be confident of the combined stoichiometry value.
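A minimal sketch of this collapsing step is given below. It assumes that the combined value is the sum of the individual stoichiometries and that the groups of proteins sharing peptides are already known and do not overlap. The function name and protein names are hypothetical.

```python
def collapse_shared(stoichiometry, shared_groups):
    """Merge proteins that share peptides into one combined entry.

    stoichiometry maps a protein name to its relative stoichiometry.
    shared_groups is a list of non-overlapping groups of protein names.
    Summing the member values is an assumption of this sketch.
    """
    collapsed = dict(stoichiometry)
    for group in shared_groups:
        members = [protein for protein in group if protein in collapsed]
        if len(members) < 2:
            continue  # nothing to merge for this group
        combined = sum(collapsed.pop(protein) for protein in members)
        collapsed["/".join(members)] = combined
    return collapsed


# Hypothetical example: two paralogs that share tryptic peptides.
values = {"BAIT": 1.0, "PARALOG_A": 1.5, "PARALOG_B": 0.5, "OTHER": 0.25}
merged = collapse_shared(values, [("PARALOG_A", "PARALOG_B")])
```

The merged entry no longer says how the shared signal is divided between the two paralogs, but the combined value does not depend on how the shared peptides were assigned.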


Arne H. Smits and Michiel Vermeulen

Molecular Cancer Research – Division of Biomedical Genetics –
University Medical Center Utrecht – Universiteitsweg 100, 3584 CG Utrecht, The Netherlands.

Corresponding author: Arne H. Smits