Self-Evaluating Diffusion Embeddings Improve Reliability of Biomolecular Structure Predictions

Here, we introduce CODE (Chain of Diffusion Embedding), a latent-space trajectory-based metric, and CONFIDE, an integrated score. Together they quantify the topological and energetic frustration underlying unreliable AlphaFold3 predictions. These tools can be used to detect hallucinations in structure prediction, as loss functions to improve de novo binder design, and for virtual screening.

Overview

Inspired by topological frustration theory in protein folding, we reformulated the diffusion embedding trajectories of the AF3-series structure predictors into CODE, a metric that captures topological frustration overlooked by the conventional confidence metric pLDDT. We propose CONFIDE, a unified framework that integrates topological frustration with the energetic frustration represented by pLDDT to characterize the protein folding energy landscape comprehensively.

Getting Started

Prerequisites

Python 3.10
Required Python packages (see Installation section)

Installation

For basic installation instructions, please refer to the Boltz-1 repository.
Then clone this repository:

git clone https://github.com/zjgao02/CONFIDE.git
cd CONFIDE

Usage

Data Preparation

First, construct the YAML input files for Boltz-1 prediction. You can use 'examples/prot_no_msa.yaml' as a template.
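As a rough illustration of the step above, the following sketch writes a single-chain input file without an MSA. The field layout (`version`, `sequences`, `protein`, `id`, `sequence`, `msa: empty`) is an assumption based on the public Boltz-1 input format, and the helper name `write_boltz_yaml` is hypothetical; check the result against 'examples/prot_no_msa.yaml' before relying on it.

```python
from pathlib import Path


def write_boltz_yaml(sequence: str, out_path: str, chain_id: str = "A") -> str:
    """Write a single-chain Boltz-1 input file that skips the MSA.

    The schema below is assumed from the Boltz-1 documentation, not taken
    from the CONFIDE repository.
    """
    text = (
        "version: 1\n"
        "sequences:\n"
        "  - protein:\n"
        f"      id: {chain_id}\n"
        f"      sequence: {sequence}\n"
        "      msa: empty\n"  # assumed keyword for single-sequence mode
    )
    path = Path(out_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)
    return text
```

For example, `write_boltz_yaml("MKTAYIAKQR", "inputs/my_protein.yaml")` would create `inputs/my_protein.yaml`, which can then be passed to the prediction command as the input path.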

Next, to save the CODE trajectory, change the hard-coded output path in CONFIDE/src/boltz/data/write/writer.py (line 82 at commit b436e7c) to a directory that exists on your machine:

    torch.save(prediction['token_a_list'], f'/data/home/luruiqiang/guchunbin/boltz/examples/{record.id}.pt')

Inference

Then you can run inference using Boltz with:

boltz predict input_path --out_dir output_path --cache ./

Data Analysis

The confidence score, which represents energetic frustration, can be read directly from the Boltz output.
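The sketch below shows one way to gather those scores after a run. It assumes that Boltz writes one JSON file per prediction whose name starts with `confidence_` and that the file contains a `confidence_score` entry; both are assumptions about the Boltz-1 output layout and should be checked against your own output directory. The function name `collect_confidence` is hypothetical.

```python
import json
from pathlib import Path


def collect_confidence(out_dir: str, key: str = "confidence_score") -> dict[str, float]:
    """Map each confidence file (by stem) to the value stored under `key`."""
    scores: dict[str, float] = {}
    # Search recursively so the exact sub-folder layout does not matter.
    for path in sorted(Path(out_dir).rglob("confidence_*.json")):
        with path.open() as fh:
            data = json.load(fh)
        if key in data:  # skip files that lack the requested entry
            scores[path.stem] = float(data[key])
    return scores
```

If a pLDDT-based value is wanted instead, the `key` argument can be pointed at whichever pLDDT field your Boltz output files actually contain.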

After that, run the code.py script to perform the energy combination analysis in CONFIDE:

python code.py

Contact

If you have any questions, please feel free to contact the authors.

Zijun Gao (zjgao24@cse.cuhk.edu.hk)
Chunbin Gu (guchunbin200888@gmail.com)
Changyu Hsieh (kimhsieh@zju.edu.cn)

