Beyond Structure: Predicting How Tightly Drugs Bind
Boltz-2 is the first deep learning model to approach the accuracy of physics-based free-energy perturbation (FEP) methods for binding affinity prediction, while running about 1,000× faster. It was open-sourced by the MIT Jameel Clinic and Recursion Pharmaceuticals.
🧬 What Boltz-2 Predicts
- 3D Complex Structures — protein-ligand, protein-protein, protein-DNA/RNA, antibody-antigen
- Binding Affinity (continuous) — log₁₀(IC₅₀) for hit-to-lead optimization
- Binder Probability (binary) — active vs. decoy classification for hit discovery
- Molecular Dynamics — RMSF prediction competitive with AlphaFlow/BioEmu
- Confidence Scores — iPTM, pLDDT for reliability assessment
🔑 Key Innovations
- Unified Structure + Affinity — first co-folding model to jointly predict both
- Method Conditioning — specify X-ray, NMR, or MD emulation mode
- Template Steering — input reference structures as prior knowledge
- Contact Constraints — enforce specific distance constraints
- MD Ensemble Training — trained on MISATO, ATLAS, mdCATH dynamics
Discovery Use Cases
Hit Discovery
Screen large chemical libraries. On the MF-PCBA benchmark, Boltz-2 discriminates binders from decoys with roughly double the average precision of docking and ML baselines.
Hit-to-Lead
Rank-order compounds by binding affinity. Boltz-2 approaches FEP accuracy (Pearson r = 0.62) on the protein-ligand benchmark while running about 1,000× faster.
Lead Optimization
Predict how small chemical modifications affect binding. Pairwise intra-assay difference training captures subtle SAR relationships.
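As a rough illustration of what "pairwise intra-assay differences" could mean in practice, the sketch below groups measurements by assay and forms difference targets only between compounds measured in the same assay. The records, field layout, and function name are hypothetical; this is not the authors' training code, only one plausible reading of the idea that differences within an assay are more comparable than raw values across assays.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical measurements: (assay_id, compound_id, log10 IC50 in uM).
measurements = [
    ("assay_1", "cpd_a", 0.5),
    ("assay_1", "cpd_b", -0.3),
    ("assay_1", "cpd_c", 1.2),
    ("assay_2", "cpd_a", 0.9),
    ("assay_2", "cpd_d", -1.0),
]


def intra_assay_pairs(rows):
    """Return (assay, cpd_i, cpd_j, value_i - value_j) for pairs sharing an assay."""
    by_assay = defaultdict(list)
    for assay, compound, value in rows:
        by_assay[assay].append((compound, value))

    pairs = []
    for assay, items in by_assay.items():
        # Only compounds from the same assay are paired, so any
        # assay-specific offset appears in both values and cancels.
        for (ci, vi), (cj, vj) in combinations(items, 2):
            pairs.append((assay, ci, cj, vi - vj))
    return pairs


pairs = intra_assay_pairs(measurements)
for assay, ci, cj, delta in pairs:
    print(f"{assay}: {ci} vs {cj} -> delta = {delta:+.2f}")
```

A model trained on such pairs would be asked to reproduce the difference between its two predictions, not the absolute values.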
De Novo Generation
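Paired with the SynFlowNet generative model, Boltz-2 was used to design TYK2 binders. All of the top-10 generated compounds were predicted to bind by absolute binding free energy (ABFE) simulations, which validates the generative design workflow.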
Model Architecture
Boltz-2 extends the AlphaFold3/Boltz-1 architecture with an affinity module, controllability features, and GPU optimizations. It has four main components, applied in order: Trunk → Denoising → Confidence → Affinity.
Architecture Components
| Component | Role | Key Details | New in Boltz-2? |
|---|---|---|---|
| Trunk | Core representation learning | PairFormer stack with triangle attention, bf16, trifast kernel, 768-token crops | Optimized ✓ |
| MSA Module | Multiple sequence alignment processing | Column attention over evolutionary sequences | — |
| Diffusion Module | Structure generation via denoising | Atom transformer, AF3 σ hyperparameters | Updated ✓ |
| Steering (Boltz-2x) | Physical quality enforcement | Inference-time physics potentials, clash removal | ✓ New |
| Method Conditioning | Experimental mode specification | X-ray / NMR / MD emulation modes | ✓ New |
| Template Steering | Prior structure integration | Multi-chain templates, soft/strict modes | ✓ New |
| Confidence Module | Prediction reliability | iPTM, pLDDT, PAE, pDE scores | — |
| Affinity Module | Binding strength prediction | Pocket PairFormer → binary head + affinity value head | ✓ New |
| B-factor Head | Local dynamics prediction | Supervised on experimental + MD B-factors | ✓ New |
⚡ Compute Optimizations
- Mixed precision (bf16) — Reduced memory, faster matmuls
- trifast kernel — Optimized triangle attention
- 768-token crops — Matching AF3 training scale
- cuEquivariance — NVIDIA GPU acceleration
- Pre-computed pockets — Efficient affinity training
Drug Discovery Pipeline
From protein sequence to binding affinity prediction — the complete Boltz-2 workflow for computational drug discovery.
Input Preparation
Define the biomolecular system in YAML format: protein sequences, ligand SMILES, DNA/RNA chains. Optionally provide MSA, templates, method conditioning, and contact constraints.
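A minimal sketch of generating such an input file from Python is shown here. The YAML keys (`version`, `sequences`, `protein`, `ligand`, `smiles`, `properties`, `affinity`, `binder`) follow the schema as recalled from the Boltz repository documentation and should be checked against the current docs. The helper name, the sequence, and the ligand are purely illustrative.

```python
def build_boltz_yaml(protein_seq, ligand_smiles, protein_id="A", ligand_id="B"):
    """Build a YAML document describing one protein chain and one ligand.

    Assumption: key names mirror the Boltz input schema; verify against
    the repository's prediction docs before relying on them.
    """
    lines = [
        "version: 1",
        "sequences:",
        "  - protein:",
        f"      id: {protein_id}",
        f"      sequence: {protein_seq}",
        "  - ligand:",
        f"      id: {ligand_id}",
        # Single quotes keep SMILES characters such as '[', '@', '\' literal.
        f"      smiles: '{ligand_smiles}'",
        "properties:",
        "  - affinity:",
        f"      binder: {ligand_id}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder sequence and aspirin SMILES, used only as example values.
yaml_text = build_boltz_yaml(
    protein_seq="MKTAYIAKQRQISFVKSHFSRQ",
    ligand_smiles="CC(=O)OC1=CC=CC=C1C(=O)O",
)
print(yaml_text)
```

The resulting text can be written to a `.yaml` file and passed to the `boltz predict` command described in the repository README, which is also the place to check the current command-line flags.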
MSA Generation
The ColabFold MSA server generates multiple sequence alignments from protein databases. Evolutionary co-variation patterns in these alignments inform structural contact predictions.
Trunk Processing
The PairFormer stack processes pair and single representations through triangle attention layers (bf16, trifast kernel). B-factor supervision captures local dynamics. Crop size: 768 tokens.
Structure Prediction (Denoising)
The diffusion module generates 3D coordinates through iterative denoising, and the atom transformer refines all-atom positions. Boltz-2x applies physics-based steering to remove clashes and fix stereochemistry.
Confidence Scoring
The confidence module outputs iPTM (interface predicted TM-score), pLDDT (predicted local distance difference test), PAE (predicted aligned error), and pDE (predicted distance error) for each prediction.
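One common way to use these scores is to rank several sampled structures and keep the most confident one. The sketch below does this with a weighted combination of pLDDT and iPTM. The sample values, the field names, and the 0.8/0.2 weights are illustrative assumptions, not values taken from this page.

```python
# Hypothetical per-sample confidence outputs (pLDDT on a 0-100 scale).
samples = [
    {"name": "sample_0", "iptm": 0.82, "plddt": 88.0},
    {"name": "sample_1", "iptm": 0.55, "plddt": 91.0},
    {"name": "sample_2", "iptm": 0.90, "plddt": 70.0},
]


def combined_score(sample, w_plddt=0.8, w_iptm=0.2):
    """Weighted confidence on a 0-1 scale; the weights are an assumption."""
    return w_plddt * sample["plddt"] / 100.0 + w_iptm * sample["iptm"]


# Highest combined confidence first.
ranked = sorted(samples, key=combined_score, reverse=True)
for s in ranked:
    print(f'{s["name"]}: score = {combined_score(s):.3f}')
```

Weighting pLDDT more heavily favors overall local accuracy, while raising the iPTM weight favors well-formed interfaces. The right balance depends on the downstream use.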
Binding Affinity Prediction
The affinity module's pocket PairFormer processes protein-ligand interactions. Two output heads: binary P(binder) for hit discovery screening, and log₁₀(IC₅₀) continuous affinity for lead optimization.
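Because the continuous head reports log₁₀(IC₅₀) with IC₅₀ in μM, a few lines of arithmetic convert it into more familiar quantities. The sketch below assumes that unit convention and a temperature of 298.15 K. The free-energy figure treats IC₅₀ as if it were a dissociation constant, which is only a rough approximation because IC₅₀ depends on assay conditions.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15  # assumed temperature
RT_LN10 = R_KCAL * TEMPERATURE_K * math.log(10)  # about 1.364 kcal/mol


def ic50_micromolar(log10_ic50_um):
    """IC50 in uM from the predicted log10(IC50 / uM)."""
    return 10.0 ** log10_ic50_um


def pic50(log10_ic50_um):
    """pIC50 = -log10(IC50 in M); 1 uM = 1e-6 M, hence the offset of 6."""
    return 6.0 - log10_ic50_um


def approx_delta_g_kcal(log10_ic50_um):
    """Rough binding free energy, treating IC50 as a dissociation constant."""
    return -RT_LN10 * pic50(log10_ic50_um)


y = -1.0  # hypothetical model output
print(f"IC50  = {ic50_micromolar(y) * 1000:.0f} nM")
print(f"pIC50 = {pic50(y):.1f}")
print(f"dG    = {approx_delta_g_kcal(y):.2f} kcal/mol (approximate)")
```

Each unit decrease in the predicted value corresponds to a tenfold tighter IC₅₀, or about 1.36 kcal/mol under this approximation.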
Generative Design (Optional)
Boltz-2 can be coupled with SynFlowNet for de novo molecule generation. The loop iteratively generates and scores synthesizable compounds. It was validated on the TYK2 target, where all top-10 compounds were predicted to be binders by ABFE simulations.
📥 Input Modalities
| Input | Format | Required? |
|---|---|---|
| Protein | Amino acid sequence | ✓ |
| Ligand | SMILES / SDF | For affinity |
| DNA | Nucleotide sequence | Optional |
| RNA | Nucleotide sequence | Optional |
| MSA | Auto-generated or custom | Recommended |
| Templates | CIF files | Optional |
| Constraints | Distance / contact pairs | Optional |
📤 Output Predictions
| Output | Unit | Use Case |
|---|---|---|
| 3D Structure | CIF coordinates | Visualization, docking |
| P(binder) | 0 → 1 probability | Hit discovery |
| Affinity value | log₁₀(IC₅₀), with IC₅₀ in μM | Lead optimization |
| iPTM | 0 → 1 | Interface quality |
| pLDDT | 0 → 100 | Local confidence |
| B-factors | Ų | Flexibility |
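The flexibility outputs in the table can be related to each other with standard formulas. The sketch below computes per-atom RMSF from a small ensemble of pre-aligned frames and converts it to an isotropic B-factor using B = (8π²/3)·RMSF². The toy coordinates and function names are illustrative, and the conversion assumes isotropic, harmonic motion.

```python
import math


def rmsf_per_atom(frames):
    """Per-atom RMSF in the coordinate units (e.g. angstrom).

    frames: list of frames, each a list of (x, y, z) tuples, one per atom.
    Frames are assumed to be already superimposed on a common reference.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for a in range(n_atoms):
        mean = [sum(f[a][k] for f in frames) / n_frames for k in range(3)]
        mean_sq_dev = sum(
            sum((f[a][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(mean_sq_dev))
    return result


def rmsf_to_bfactor(rmsf):
    """Isotropic B-factor in squared units: B = (8 * pi^2 / 3) * RMSF^2."""
    return (8.0 * math.pi ** 2 / 3.0) * rmsf ** 2


# Toy ensemble: atom 0 is rigid, atom 1 oscillates by 1 angstrom along x.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
rmsf = rmsf_per_atom(frames)
bfactors = [rmsf_to_bfactor(r) for r in rmsf]
print(rmsf, bfactors)
```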
Benchmarks & Performance
Reported Boltz-2 results across structure prediction, binding affinity, and virtual screening benchmarks.
Affinity Prediction: FEP+ Benchmark
Pearson correlation on the 4-target FEP+ subset (CDK2, TYK2, JNK1, p38); higher is better. These proteins were held out of training.
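For readers who want to reproduce this kind of metric on their own data, the sketch below computes a Pearson correlation per target and then averages across targets. The target names and numbers are made up for illustration and are not benchmark data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (experimental, predicted) affinities grouped by target.
per_target = {
    "target_1": ([-9.1, -8.4, -7.9, -10.2], [-8.8, -8.6, -7.5, -9.9]),
    "target_2": ([-7.2, -8.8, -9.5, -6.9], [-7.9, -8.1, -9.0, -7.4]),
}

correlations = {
    name: pearson_r(exp, pred) for name, (exp, pred) in per_target.items()
}
mean_r = sum(correlations.values()) / len(correlations)
print(correlations, mean_r)
```

Computing the correlation within each target before averaging avoids rewarding a model merely for separating targets from one another.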
Structure Prediction: Cross-Modality Accuracy
Structural accuracy across biomolecular modalities. Boltz-2 matches or exceeds Boltz-1 across all categories, with notable gains on antibody-antigen and DNA-protein complexes.
Virtual Screening: MF-PCBA Hit Discovery
Average precision for binder/decoy discrimination in high-throughput screens. Boltz-2 doubles average precision over docking and ML baselines.
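Average precision summarizes how well a ranked list places true binders near the top. The following sketch implements the usual definition (the mean of precision values at each rank where a binder appears) on toy labels and scores. It is a generic illustration, not the benchmark's evaluation script.

```python
def average_precision(labels, scores):
    """Average precision for binary labels (1 = binder, 0 = decoy).

    Compounds are ranked by descending score; precision is recorded at
    each rank where a binder is found and then averaged over binders.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Toy screen: two binders among four compounds.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
ap = average_precision(labels, scores)
print(f"AP = {ap:.4f}")
```

For a random ranking, average precision is close to the fraction of binders in the library, so in screens with very few actives even small absolute values can represent a large enrichment.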
Benchmark Summary
| Benchmark | Task | Metric | Boltz-2 | Best Competitor | Note |
|---|---|---|---|---|---|
| FEP+ (4-target) | Affinity (lead opt.) | Pearson r | 0.62 | 0.65 (OpenFE) | 1,000× faster than FEP |
| CASP16 Affinity | Affinity (140 complexes) | Ranking | #1 | All submitted methods | Retrospective, no fine-tuning |
| MF-PCBA | Hit discovery | Avg. Precision | 2× baseline | Docking / ML methods | Binder vs decoy screen |
| Protein-Ligand | Structure | Success rate | ≥ Boltz-1 | AlphaFold3 | Improved over Boltz-1 |
| Antibody-Antigen | Structure | DockQ | Notable gains | Boltz-1 | Challenging modality |
| DNA-Protein | Structure | LDDT | Improved | Boltz-1 | Better with distillation |
| RNA Structure | Structure | TM-score | Improved | Boltz-1 | Expanded training data |
| Dynamics (RMSF) | Flexibility | Correlation | Competitive | AlphaFlow / BioEmu | MD-conditioned mode |
| TYK2 Generative | De novo design | Top-10 validated | 10/10 binders | — | Via ABFE simulation |
Model Arena
Compare Boltz-2 against leading biomolecular structure prediction and affinity models across key dimensions.
| Model | Organization | Structure | Affinity | Open Source | License | Modalities | Speed |
|---|---|---|---|---|---|---|---|
| Boltz-2 | MIT + Recursion | ★★★★☆ | ★★★★★ | ✅ Full | MIT | Protein, Ligand, DNA, RNA, Ab-Ag | ~30s / complex |
| AlphaFold3 | DeepMind | ★★★★★ | ★★☆☆☆ | ⚠️ Code public, weights on request | Non-commercial | Protein, Ligand, DNA, RNA, ions | Server queue |
| Boltz-1 | MIT + Recursion | ★★★★☆ | — | ✅ Full | MIT | Protein, Ligand, DNA, RNA | ~25s / complex |
| Chai-1 | Chai Discovery | ★★★★☆ | ★★☆☆☆ | ✅ Weights | Non-commercial | Protein, Ligand, DNA, RNA | ~30s / complex |
| OpenFold | Columbia | ★★★☆☆ | — | ✅ Full | Apache 2.0 | Protein (monomer/multimer) | ~45s / chain |
| FEP+ (Schrödinger) | Schrödinger | — | ★★★★★ | ❌ Commercial | Commercial | Protein-Ligand only | ~8h / compound |
| OpenFE | Open Source | — | ★★★★☆ | ✅ Full | MIT | Protein-Ligand only | ~4h / compound |
| RoseTTAFold2 | UW Baker Lab | ★★★★☆ | — | ✅ Full | BSD | Protein, NA, Ligand | ~40s / complex |
| ESMFold | Meta FAIR | ★★★☆☆ | — | ✅ Full | MIT | Protein (single-sequence) | ~5s / chain |
| DiffDock | MIT | ★★★☆☆ | ★★★☆☆ | ✅ Full | MIT | Protein-Ligand docking | ~10s / pose |
| NeuralPLexer | Iambic | ★★★★☆ | ★★☆☆☆ | ✅ Weights | Research | Protein-Ligand complexes | ~20s / complex |
| BoltzGen | MIT + Recursion | ★★★☆☆ | — | ✅ Full | MIT | Generative protein design | ~15s / sample |
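The approximate per-item timings in the Speed column can be turned into speedup factors with simple arithmetic. The sketch below uses only the table's figures (about 30 s per complex for Boltz-2, about 8 h per compound for FEP+, about 4 h for OpenFE). The 100,000-compound library size is a hypothetical example.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86400

# Approximate per-compound runtimes taken from the table.
boltz2_seconds = 30
fep_plus_seconds = 8 * SECONDS_PER_HOUR
openfe_seconds = 4 * SECONDS_PER_HOUR


def speedup(slow_seconds, fast_seconds):
    """How many times faster the fast method is."""
    return slow_seconds / fast_seconds


fep_speedup = speedup(fep_plus_seconds, boltz2_seconds)
openfe_speedup = speedup(openfe_seconds, boltz2_seconds)

# Hypothetical library size, for a sense of scale on a single device.
library_size = 100_000
boltz2_days = library_size * boltz2_seconds / SECONDS_PER_DAY
fep_days = library_size * fep_plus_seconds / SECONDS_PER_DAY

print(f"vs FEP+:   {fep_speedup:.0f}x")
print(f"vs OpenFE: {openfe_speedup:.0f}x")
print(f"{library_size} compounds: {boltz2_days:.1f} days vs {fep_days:.0f} days")
```

With these figures the ratio comes out at roughly 500× to 1,000×, consistent in order of magnitude with the headline claim.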
Why Boltz-2 Stands Out
- 🏆 Only model combining structure + affinity: AlphaFold3 does structure; FEP does affinity. Boltz-2 does both.
- 🔓 Fully open-source (MIT): weights, code, and training pipeline are all under the MIT license. AF3 code and weights are restricted to non-commercial use.
- ⚡ 1,000× faster than FEP: approaches FEP accuracy in seconds, not hours, which enables large-scale virtual screening.
- 🎛️ Controllable predictions: method conditioning, template steering, and contact constraints, with no retraining needed.
- 📊 CASP16 #1 in affinity: outperforms all submitted methods on the 140-complex CASP16 affinity challenge.
References & Resources
Primary literature, code repositories, and community resources for Boltz-2.
Primary Papers
- Passaro, S., Corso, G., Wohlwend, J., et al. (2025). Boltz-2: Towards Accurate and Efficient Binding Affinity Prediction. bioRxiv. doi:10.1101/2025.06.14.659707
- Wohlwend, J., Corso, G., Passaro, S., et al. (2024). Boltz-1: Democratizing Biomolecular Interaction Modeling. bioRxiv. doi:10.1101/2024.11.19.624167
- Abramson, J., Adler, J., Dunger, J., et al. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 630, 493–500.
- Ross, G.A., et al. (2023). The maximal and current accuracy of rigorous protein-ligand binding free energy calculations. Communications Chemistry.
- Buterez, D., et al. (2023). MF-PCBA: Multifidelity High-Throughput Screening Benchmarks for Drug Discovery and Machine Learning. J. Chem. Inf. Model.
- Cretu, M., et al. (2024). SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints. arXiv.
- Jing, B., Berger, B., Jaakkola, T. (2024). AlphaFold Meets Flow Matching for Generating Protein Ensembles. ICML.
- Lewis, S., et al. (2025). Scalable emulation of protein equilibrium ensembles with generative deep learning (BioEmu). Science.
- Mirdita, M., et al. (2022). ColabFold: making protein folding accessible to all. Nature Methods.
- Kim, S., et al. (2023). PubChem 2023 update. Nucleic Acids Research.
- Zdrazil, B., et al. (2024). The ChEMBL Database in 2023. Nucleic Acids Research.
- Lin, T.Y., et al. (2017). Focal Loss for Dense Object Detection. ICCV.
Resources
- GitHub Repository: github.com/jwohlwend/boltz. MIT-licensed code, weights, and training pipeline.
- Full Manuscript: PDF at jeremywohlwend.com. Complete technical details and supplementary material.
- NVIDIA NIM: cloud inference via a NIM microservice.
- Tamarind Bio: web UI for running Boltz-2 in the browser, with template upload.
- Slack Community: discuss with developers and users.
- Recursion (RXRX): rxrx.ai/boltz-2. Industry partner and NASDAQ-listed TechBio company.