20.08.2024

Tracing the origin of truncated protein variants

A new study led by Claire McWhite at the Lewis-Sigler Institute for Integrative Genomics and with the collaboration of Silvia Ramundo has established a novel system to identify protein isoforms in plant and human proteomics analyses. These truncated versions of full-length proteins, which can have unique interactions and specialized functions, can now be detected at a large scale.

Proteins perform many functions in our cells, from mediating responses to external stimuli to building other cellular components. Interestingly, truncated proteins, which lack specific regions of the full-length protein, often have different activities or interactions than their full-length counterparts and can therefore carry out specialized functions.

But how are these truncated versions produced by the cell? And how can scientists effectively identify and study truncated protein isoforms? A new study led by Claire McWhite at the Lewis-Sigler Institute for Integrative Genomics at Princeton University and Edward Marcotte at the University of Texas, and with the collaboration of Silvia Ramundo at the Gregor Mendel Institute of Molecular Plant Biology (GMI) and Masayuki Onishi at Duke University, has established a novel system to identify protein isoforms in plant and human proteomics analyses.

Their results, published on August 20th in Molecular Systems Biology, suggest that up to one percent of proteins have truncated isoforms, with many of these isoforms being produced by proteolytic cleavage. 

Uncovering protein variants using co-fractionation mass spectrometry  

Identifying truncated protein variants is no easy task: “When we break open a cell and extract its proteins, there are many chances for the proteins to be partially broken down or degraded,” explains Claire McWhite, the study’s first author. “For this reason, when we detect shorter versions of proteins, we must distinguish whether they are an artifact of our method or real variants.”  

To distinguish between artifacts and real variants, the team developed a new way of looking at proteomics datasets. Their approach simultaneously analyzes several pre-existing co-fractionation mass spectrometry datasets from different cell types and even different species, identifying short variants that recur within a dataset or across several datasets. “This allows us to discard artifacts, which will be less frequent and/or conserved, and instead identify many previously unknown truncated proteins,” McWhite comments. “Our results indicate that up to one percent of all proteins have these variants, which represents a great deal of unstudied protein function.”

Proteolytic events are key to the formation of truncated protein variants 

The team observed that a sizable proportion of the truncated protein variants could not be readily explained by alternative mRNA splicing, a process during which an mRNA molecule is reshuffled to produce different versions of a protein. Instead, many of the truncated proteins were likely generated through a process known as proteolysis: “Proteolysis often serves as a regulatory process in signaling pathways,” Silvia Ramundo, co-author and Group Leader at the GMI, explains. “For example, specific stimuli induce cleavage of the human NOTCH1 protein into two fragments. One of these fragments then translocates to the nucleus, where it regulates gene transcription.”   

The collaboration between McWhite, Ramundo and Onishi led to the unexpected identification of a similar processing mechanism occurring in the unicellular alga Chlamydomonas reinhardtii. “It is fascinating that this type of signaling, usually associated with multicellularity, is conserved in a relatively simple unicellular organism like Chlamydomonas,” Masayuki Onishi notes. These results confirm that by analyzing proteomics data across different organisms, it is possible to identify proteolytic processes conserved throughout evolution. “We’re hopeful that our approach will help reveal new proteolytic mechanisms responsible for controlling protein activities,” adds co-author Edward Marcotte. 

A new way to study proteolytic events 

The system developed by McWhite and colleagues offers a new way of identifying proteolytic events on a large scale. “Before our study, the main way to identify these processes was to look at one protein at a time, which is very time-consuming and labor-intensive,” McWhite explains. This new paradigm in proteomics analysis streamlines the process, enabling broader and more efficient detection of proteolytic events and promising many applications. “We can use this approach to compare proteomics datasets from healthy and stressed cells and identify protein isoforms involved in stress responses,” Silvia Ramundo comments. 

“We are also curious to explore whether post-translational modifications can be specific to full-length or truncated proteins,” adds Claire McWhite. “A deeper understanding of how proteolytic events occur and their function will expand our knowledge of how cells work,” Masayuki Onishi remarks. For instance, while mutations in fibrocystin are known to cause autosomal recessive polycystic kidney disease, the protein’s exact function remains unclear. “Our findings reveal that both human and unicellular algal homologs of this protein undergo similar proteolytic processing, making Chlamydomonas an ideal and simplified genetic model for studying fibrocystin and its regulated proteolysis,” Masayuki Onishi adds.