Partially sequenced organisms, decoy searches and false discovery rates.

B. Victor, S. Gabriel, K. Kanoba, E. Mostovenko, K. Polman, P. Dorny, A.M. Deelder, M. Palmblad

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

Tandem mass spectrometry is commonly used to identify peptides, typically by comparing their product ion spectra with those predicted from a protein sequence database and scoring these matches. The most reported quality metric for a set of peptide identifications is the false discovery rate (FDR), the fraction of expected false identifications in the set. This metric has so far only been used for completely sequenced organisms or known protein mixtures. We have investigated whether FDR estimations are also applicable in the case of partially sequenced organisms, where many high-quality spectra fail to identify the correct peptides because the latter are not present in the searched sequence database. Using real data from human plasma and simulated partial sequence databases derived from two complete human sequence databases with different levels of redundancy, we could demonstrate that the mixture model approach in PeptideProphet is robust for partial databases, particularly if used in combination with decoy sequences. We therefore recommend using this method when estimating the FDR and reporting peptide identifications from incompletely sequenced organisms. © 2012 American Chemical Society.
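The two FDR estimates contrasted in the abstract can be illustrated with a minimal sketch. It is not PeptideProphet's implementation and not the authors' code. The function names, the `(score, is_decoy)` tuple layout, and the example numbers are all hypothetical. The first function uses the common target-decoy ratio (decoy hits divided by target hits above a score threshold). The second uses the general mixture-model identity that the expected number of false identifications in an accepted set is the sum of `1 - p` over the accepted posterior probabilities.

```python
from typing import List, Tuple


def decoy_fdr(psms: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the FDR among target matches scoring at or above `threshold`.

    `psms` holds hypothetical (score, is_decoy) pairs, one per
    peptide-spectrum match, from a search that included decoy sequences.
    """
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # nothing accepted, so no false discoveries to report
    # Assumption: decoy hits occur about as often as false target hits.
    return min(1.0, decoys / targets)


def probability_fdr(probabilities: List[float], min_probability: float) -> float:
    """Estimate the FDR from per-match posterior probabilities of being correct.

    The expected number of false identifications among the accepted matches
    is the sum of (1 - p); dividing by the number accepted gives the FDR.
    """
    accepted = [p for p in probabilities if p >= min_probability]
    if not accepted:
        return 0.0
    return sum(1.0 - p for p in accepted) / len(accepted)


# Hypothetical example data, for illustration only.
example_psms = [(9.1, False), (8.7, False), (8.2, False), (7.9, True),
                (7.5, False), (6.0, True), (5.5, False)]
example_probabilities = [0.99, 0.95, 0.90, 0.60]

fdr_from_decoys = decoy_fdr(example_psms, threshold=7.0)
fdr_from_probabilities = probability_fdr(example_probabilities,
                                         min_probability=0.9)
```

With a partial sequence database, many high-quality spectra have no correct target sequence to match, which is the situation in which the abstract asks whether estimates of this kind remain reliable.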
Original language: English
Article number: 11
Pages (from-to): 1991-5
Journal: Journal of Proteome Research
Volume: 2012
Issue number: 3
Publication status: Published - 2012
