Authors: Michael P Washburn
Publish Date: 2015/06/20
Volume: 26, Issue: 11, Pages: 1799-1803
Abstract
Over 20 years ago, a remarkable paper was published in the Journal of the American Society for Mass Spectrometry. This paper, from Jimmy Eng, Ashley McCormack, and John Yates, described the use of protein databases to drive the interpretation of tandem mass spectra of peptides. It now has over 3660 citations and has averaged more than 260 per year over the last decade. This is an amazing scientific achievement. The reason is that the paper was a cutting-edge development at the moment when genomes of organisms were being sequenced, protein and peptide mass spectrometry was growing into the field of proteomics, and the power of computing was growing quickly in accordance with Moore’s law. This work by the Yates lab grew in importance as genomics, proteomics, and computation all advanced, and it eventually resulted in the widely used SEQUEST algorithm and platform for the analysis of tandem mass spectrometry data. This commentary analyzes the impact of this paper by examining the citations it has generated and the impact of the citing papers.
Figure: Word clouds of the titles of all citing papers. From www.wordle.net: (a) shows the word cloud using all the terms in the titles of all citing papers, and (b) shows the word cloud using only biological terms, after the mass spectrometry and proteomics-related terms have been removed.
It is important to note that while Eng et al. [1] is widely associated with SEQUEST, the first time the term SEQUEST appeared in print was in a paper published in Rapid Communications in Mass Spectrometry entitled ‘Direct database searching with MALDI-PSD spectra of peptides’ [3], and this paper has been cited 71 times. The first time SEQUEST appeared in PubMed was in 1996, in another paper in JASMS [4], which has been cited 52 times. In between the two important papers in JASMS, the Yates lab published two papers in Analytical Chemistry that expanded on the landmark 1994 study by Eng et al. [1]. Both of these papers are highly cited in their own right, with 298 citations for one [5] and 923 citations for the other [6]. A strong argument can be made that the less cited of these two papers, which describes a method to mine genomes by using a six-frame translation of a nucleotide sequence database to identify peptides [5], provided an underappreciated foundation for the current practice of proteogenomics. The second of these two papers is significant because it describes the analysis of covalently modified peptides, including posttranslational modifications like phosphorylation [6]. Both proteogenomics and the analysis of posttranslational modifications are important applications of modern proteomics, and these two papers played important early roles in these fields.