Proteomics is the foundational layer that decides which peptides count as proteins, which microproteins are functional, and which sequences a clinical-development pipeline should care about. The 2026 work has been collaborative and standard-setting: international consortia are cleaning up the annotation databases that drug discovery, vaccine design, and immunopeptidomics all depend on.
Lead pieces covered here: the May 6 Nature paper from the international TransCODE Consortium, which expanded the human proteome by ~10% by adding 1,785 microproteins from non-canonical open reading frames (ncORFs) and introduced the conceptual model of 'peptideins' for microproteins with indeterminate functional potential; the April 28 Nature Communications paper on DDA-BERT, a transformer model that improves peptide identification across species and in HLA immunopeptidomics; and the March 28 Stanford work on reverse translation, which converts peptides into DNA that can be sequenced and so offers a foundational tool for matching peptide hits back to genomic context.
Stories here cover annotation frameworks, mass-spec advances, and the bridge between proteomics and peptide therapeutics. See #microproteins, #peptideins, and #nature-communications.
The international TransCODE Consortium published a Nature paper May 6 reporting that roughly 25% of 7,264 non-canonical open reading frames (ncORFs) give rise to detectable peptides, based on a meta-analysis of 95,520 proteomics experiments. The work identifies 1,785 previously unrecognized microproteins, expands the human proteome by ~10%, and introduces the conceptual model of 'peptideins' — microproteins with indeterminate functional potential. Most peptideins are under 50 amino acids and lack similarity to traditional proteins. The consortium, launched in 2022, includes GENCODE, PeptideAtlas, HUPO-HPP, and HUPO-HIPP and aims to set the reference annotation standard for ncORF-encoded microproteins.
A Nature Communications paper published April 27 introduced DDA-BERT, an end-to-end transformer-based deep learning model for peptide identification in data-dependent acquisition (DDA) proteomics. The model improves peptide identification accuracy across multiple species and outperforms existing methods on HLA immunopeptidomics — directly relevant to neoantigen peptide vaccine discovery for personalized cancer immunotherapy. The work joins the broader 2026 wave of AI-driven peptide tooling captured in this month's ACS Biochemistry and Nature Biotechnology papers.
Stanford researchers developed a reverse-translation method that converts peptide sequences into DNA, so that peptides can be read out with standard DNA sequencing; the work was published in Nature Biotechnology. The approach could dramatically accelerate peptide drug discovery.