AI-designed peptides — how computational protein design is changing drug discovery
10 min read · Uplevel editorial
In November 2020, a system called AlphaFold2 solved a problem that structural biologists had spent fifty years treating as practically unsolvable. Given a protein's amino acid sequence, it predicted the three-dimensional shape that protein would fold into — with an accuracy that stunned the field, matched experimental crystallography in many cases, and made the entire Protein Data Bank look like a starting point rather than an endpoint. The news traveled fast, landed in the scientific press like a thunderclap, and then did what major scientific advances usually do: it started rewriting the assumptions underneath a whole industry.
The question that followed was obvious. If you could predict a protein's structure from its sequence, could you run the process in reverse? Could you design a sequence that would fold into a structure you wanted — a binding pocket, a receptor interface, a scaffold with specific functional properties — and then synthesize it and test it? Could you, in other words, design proteins from scratch?
It turned out you could.
David Baker's lab at the University of Washington had been working on computational protein design for years before AlphaFold, developing tools like Rosetta that could predict and design protein structures. That work sits at the center of what has become a genuine revolution in how biologic drugs, including peptides, get discovered. The combination of structure prediction and generative machine learning has now produced tools like RFdiffusion, a diffusion model that treats protein backbone generation the way image diffusion models treat pixels and can generate novel protein structures conditioned on a wide range of user-specified constraints. You tell the model what the target looks like and what the binding interface should accomplish, and it generates candidate backbones. Other tools, including ProteinMPNN, ESMFold, and LigandMPNN, handle different parts of the design and validation pipeline. Much of the toolkit is open-source. You can run parts of it on academic compute budgets. The barrier to entry for computational protein design has dropped from a dedicated structural biology department to a competent machine learning team with access to a GPU cluster.
This matters for peptides specifically, and in ways that aren't obvious if you think of peptides primarily as the compounded products circulating in wellness medicine. The peptide drug discovery problem has always been a sampling problem. The possible sequence space for even a modest twelve-amino-acid peptide is astronomically large: twenty amino acids at each position gives you twenty to the twelfth power possible sequences, which is roughly four quadrillion. Historically, peptide drug discovery worked by starting from natural peptides that evolution had already selected for biological function, modifying them systematically, running in vitro assays on thousands of variants, and hoping something with improved properties emerged. It was expensive, it was slow, and it was heavily biased toward the regions of sequence space that happened to neighbor known natural compounds.
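To make the "twenty to the twelfth power" figure concrete, here is a minimal Python sketch of the arithmetic. It assumes only the 20 canonical amino acids and a linear peptide; the screening throughput of one million variants per day is a made-up number chosen purely for illustration, not a figure from any real lab.

```python
# Size of the sequence space for a linear peptide, assuming only the
# 20 canonical amino acids (no modified or non-natural residues).
ALPHABET_SIZE = 20


def sequence_space(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


def years_to_screen(length: int, variants_per_day: int) -> float:
    """Years needed to test every sequence once at a fixed daily throughput."""
    return sequence_space(length) / variants_per_day / 365


if __name__ == "__main__":
    print(f"{sequence_space(12):,}")  # 4,096,000,000,000,000

    # HYPOTHETICAL throughput: one million variants per day.
    years = years_to_screen(12, 1_000_000)
    print(f"about {years / 1e6:.1f} million years to screen exhaustively")
```

Even at that generous hypothetical throughput, exhaustive screening is out of reach, which is why discovery historically stayed close to known natural peptides.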
AI approaches can navigate that space differently. Instead of random or systematic sampling, a trained model can design sequences predicted to satisfy multiple constraints simultaneously: fold into a specific shape, bind a specific target at a specific interface, avoid sequences likely to be proteolytically degraded, avoid sequences likely to trigger immune responses. The model isn't searching blindly. It's proposing candidates from a prior distribution shaped by everything structural biology knows about how proteins fold and bind. The result is a dramatic compression of the early discovery phase — from years of experimental screening toward months, sometimes weeks, of computational design followed by experimental validation of a much smaller, more targeted set of candidates.
Several AI-designed proteins and peptides have now entered or are approaching clinical development. Absci, Generate:Biomedicines, Profluent, and a growing number of other companies have AI-designed biologics in their pipelines. The mechanisms range from receptor-targeted peptides to engineered antibodies to peptide-based immunotherapies. None of these are household names yet, and most are still in early clinical phases — but the pipeline five years from now will look substantially different from the pipeline today, and a meaningful fraction of those differences will trace back to computational design.
The improvements aren't only about speed of initial discovery. Machine learning is also being applied to peptide optimization problems that were previously intractable. Metabolic stability — how quickly a peptide gets degraded by proteases in the bloodstream or gut — has always been a central limitation of peptide therapeutics. Traditional approaches involved systematically substituting D-amino acids for L-amino acids at vulnerable sites, cyclizing the peptide backbone, adding PEG chains, or making other modifications that empirically improved stability but required extensive testing to find the right combination. Predictive models trained on large stability datasets can now make those recommendations computationally, identifying the substitution pattern most likely to preserve binding activity while improving half-life. The same applies to membrane permeability, target selectivity, and immunogenicity prediction. Each of these optimization dimensions used to require independent experimental campaigns; increasingly they're being handled in silico before synthesis begins.
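As a rough illustration of what "vulnerable sites" means, the sketch below uses a deliberately simplified textbook rule in place of a learned stability model: trypsin-like proteases tend to cleave after lysine (K) or arginine (R), unless the next residue is proline (P). The function names and the example sequence are hypothetical, and lowercase letters are used as a common shorthand for D-amino acids. A real predictive model would weigh many more factors, including whether the substitution preserves binding.

```python
# Rule-based stand-in for a learned stability model (an assumption of
# this sketch, not the method used by any specific tool).
# Simplified rule: cleavage after K or R, except when followed by P.
TRYPSIN_RESIDUES = {"K", "R"}


def trypsin_like_sites(seq: str) -> list[int]:
    """Return 0-based indices of residues after which cleavage is likely."""
    sites = []
    # The last residue has no peptide bond after it, so it is skipped.
    for i, residue in enumerate(seq[:-1]):
        if residue in TRYPSIN_RESIDUES and seq[i + 1] != "P":
            sites.append(i)
    return sites


def mark_d_substitutions(seq: str, sites: list[int]) -> str:
    """Lowercase the residues at the given positions (shorthand for D-amino acids)."""
    return "".join(r.lower() if i in sites else r for i, r in enumerate(seq))


if __name__ == "__main__":
    peptide = "GAKLRPFKEW"  # hypothetical sequence, not a real candidate
    sites = trypsin_like_sites(peptide)
    print(sites)                                 # [2, 7]
    print(mark_d_substitutions(peptide, sites))  # GAkLRPFkEW
```

The arginine at index 4 is not flagged because it is followed by proline. Whether a flagged substitution keeps the peptide active is exactly the question the trained models described above are meant to answer.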
This changes the economics of peptide drug development in ways that will eventually reshape the field. Development costs for small molecules and biologics at major pharmaceutical companies routinely run into the hundreds of millions of dollars before a drug reaches approval, with most of that cost concentrated in the long tail of failed candidates. If computational design meaningfully compresses the discovery and lead optimization phases — and the early evidence suggests it does, at least for some target classes — the number of programs a given research budget can support increases substantially. Smaller companies can now run peptide discovery programs that would have required big-pharma infrastructure a decade ago. Academic labs can produce credible peptide candidate series. The democratization of the toolkit genuinely lowers the barrier.
What AI does not change is everything that comes after discovery. The FDA's requirements for clinical validation of peptide therapeutics are the same whether the initial candidate was designed computationally or discovered by screening a natural product library. Phase I through Phase III. Safety studies. Efficacy studies. Manufacturing validation. Regulatory submission. Approval. All of that still happens on timelines that are governed by biology — by how long it takes to enroll trials, observe outcomes, accumulate safety data. A peptide that was computationally designed in three months still takes eight to fifteen years to reach approval if it proceeds through the full clinical pathway. The discovery acceleration is real. The clinical timeline compression is modest at best. People in the drug development world who talk about AI as if it's about to produce approved drugs at dramatically accelerated rates are conflating the very beginning of the pipeline with the pipeline as a whole.
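The gap between discovery speedup and overall speedup can be sketched with a few lines of arithmetic, in the spirit of Amdahl's law. The three-month discovery figure comes from the paragraph above; the three-year conventional discovery phase and the ten-year downstream phase are assumptions picked for illustration, with the resulting total falling inside the eight-to-fifteen-year range.

```python
def overall_speedup(discovery_before: float,
                    discovery_after: float,
                    downstream: float) -> float:
    """Ratio of total pipeline time before vs. after compressing discovery.

    All arguments are durations in years. `downstream` covers everything
    after discovery (trials, manufacturing validation, regulatory review)
    and is assumed to be unchanged by computational design.
    """
    return (discovery_before + downstream) / (discovery_after + downstream)


if __name__ == "__main__":
    before = 3.0       # years of conventional discovery (ASSUMED)
    after = 0.25       # three months of computational design (from the text)
    downstream = 10.0  # years of clinical and regulatory work (ASSUMED)

    print(f"discovery speedup: {before / after:.0f}x")  # 12x
    print(f"overall speedup:   "
          f"{overall_speedup(before, after, downstream):.2f}x")  # 1.27x
```

Under these assumptions, a twelvefold faster discovery phase shortens the whole path to approval by only about a quarter, because the unchanged downstream work dominates the total.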
There's also an honest conversation to have about what AI-designed peptides are not, at least not yet. The field is genuinely early. The models are impressive at generating structurally plausible candidates, somewhat less reliable at predicting functional activity in complex biological contexts, and substantially less reliable at predicting off-target effects, toxicity, or in vivo behavior. Computational design is making better starting points. It's not yet replacing the iterative biological characterization that turns a starting point into a drug. The ratio of computationally designed candidates to experimentally validated successes remains humbling, even if it's better than the previous ratio from conventional screening.
The consumer peptide space — the compounded peptides circulating in clinical wellness contexts — is not directly adjacent to this revolution. AI-designed peptides are entering clinical development as novel therapeutic candidates for specific diseases; they're not being designed for the lifestyle optimization uses that drive most of the wellness peptide conversation. The regulatory pathway for an AI-designed peptide is the same as for any other novel therapeutic: it needs to go through clinical trials and achieve approval before it can be prescribed. The compounding context that makes wellness peptides accessible doesn't apply to novel designed sequences that lack any existing pharmacopeial monograph or clinical history. So the excitement about computational design isn't excitement about the near-term expansion of the wellness peptide catalog. It's excitement about something different in kind.
The more interesting question for the next decade is what categories of therapeutic become possible through AI-designed peptides that weren't feasible before. Highly selective, multi-specific peptides that can simultaneously engage two or three targets — previously nearly impossible to design without computational assistance — are now tractable design problems. Peptides designed to adopt specific binding orientations at receptor interfaces with sub-angstrom precision are being published. Conditional peptides — sequences designed to change conformation and activate only in specific tissue environments, like the low-pH environment of a tumor — are in early research. These are not things that natural peptide diversity hands you. They require design.
The decade ahead will likely produce a new category of peptide therapeutics that looks substantially different from the GLP-1 analogs and growth hormone secretagogues that have defined the last two decades of peptide medicine. Designed rather than discovered, optimized by models trained on more structural data than any single laboratory could experimentally generate, built to engage targets that natural evolution never had reason to address. That is the actual shape of the AI-peptide future — specific, incremental in its clinical development even if radical in its design methodology, and still bound by the same biological timelines that govern all of medicine. The computation accelerates the beginning. The rest remains work.
Frequently asked