Simone Chiarella edited this page Dec 8, 2024 · 3 revisions

Output overview

The output of a ProtACon run depends on the command executed: on_chain or on_set.

on_chain output

Taking 1HPV as the PDB code of an example protein, the run saves the following files.

Files

  • 1HPV_residue_df.csv

    The data frame containing the information about the amino acids that constitute the peptide chain.

Plots

  • 1HPV_distance_and_contact.png

    The distance map and the normalized contact map between each pair of amino acids.


  • 1HPV_binary_contact_map.png

    The binary, thresholded contact map between each pair of amino acids.


  • 1HPV_att_layer_30.png

    The attention matrices from each head of the last layer. Head 14 of this layer is the one that specializes in detecting contacts between residues.


  • 1HPV_att_layer_avg.png

    The averages of the attention matrices, computed independently for each layer.


  • 1HPV_att_model_avg.png

    The average of the per-layer attention averages, that is, the average attention over the whole model.


  • 1HPV_att_to_aa.png

    The heatmaps of the total attention given to each amino acid.


  • 1HPV_att_sim.png

    The heatmap of the attention similarity between each pair of amino acids.


  • 1HPV_att_align_heads.png

    The heatmap of the attention alignment of each head.


  • 1HPV_att_align_layers.png

    The bar plot of the attention alignment of each layer.

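The two averaging steps behind 1HPV_att_layer_avg.png and 1HPV_att_model_avg.png can be sketched as follows. This is a minimal illustration in plain Python, not ProtACon's own code: it assumes the attention is available as a nested list indexed as `attention[layer][head]`, and it reads "layer average" as the element-wise mean over the heads of that layer. The function names and the toy 2x2 matrices are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def mean_matrices(matrices: List[Matrix]) -> Matrix:
    """Element-wise mean of a list of equally sized matrices."""
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(m[i][j] for m in matrices) / len(matrices) for j in range(n_cols)]
        for i in range(n_rows)
    ]


def layer_averages(attention: List[List[Matrix]]) -> List[Matrix]:
    """Average the head matrices of each layer independently.

    Assumption: attention[layer][head] is one attention matrix.
    """
    return [mean_matrices(heads) for heads in attention]


def model_average(attention: List[List[Matrix]]) -> Matrix:
    """Average of the per-layer averages, i.e. one matrix for the whole model."""
    return mean_matrices(layer_averages(attention))


# Toy input (hypothetical): 2 layers x 2 heads, each a 2x2 attention matrix.
toy_attention = [
    [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]],
    [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]],
]
per_layer = layer_averages(toy_attention)
whole_model = model_average(toy_attention)
```

Here `per_layer` holds one matrix per layer, while `whole_model` collapses them into a single matrix.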

on_set output

Besides the files and plots listed below, the per-chain output files described above are also saved if the dedicated flag is specified.

Files

  • protein_codes.txt

    The list of PDB codes corresponding to the proteins included in the analysis.

  • total_residue_df.csv

    The data frame containing the information about the amino acids that constitute the peptide chains included in the analysis.

  • PT_att_to_aa.pt

    The percentage of total attention given to each amino acid.

  • PWT_att_to_aa.pt

    The percentage of total attention given to each amino acid, weighted by the occurrences of that amino acid in all the proteins of the set.

  • PH_att_to_aa.pt

    The percentage of each head's attention given to each amino acid.

  • att_sim_df.csv

    The global attention similarity between each pair of amino acids.

  • avg_head_att_align.npy

    The attention-contact alignment in the attention heads, averaged over the whole set of proteins.

  • avg_layer_att_align.npy

    The attention-contact alignment across the layers, averaged over the whole set of proteins.
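As a quick sanity check after an on_set run, one might verify that every data file in the list above was written. The sketch below uses only the standard library; the file names come from the list, while the helper name `missing_on_set_files` and the idea that all files sit in one flat directory are assumptions, so adapt the path handling to the actual output layout.

```python
from pathlib import Path
from typing import List

# Data files listed above for an on_set run.
ON_SET_FILES = [
    "protein_codes.txt",
    "total_residue_df.csv",
    "PT_att_to_aa.pt",
    "PWT_att_to_aa.pt",
    "PH_att_to_aa.pt",
    "att_sim_df.csv",
    "avg_head_att_align.npy",
    "avg_layer_att_align.npy",
]


def missing_on_set_files(output_dir: Path) -> List[str]:
    """Return the expected file names that are not present in output_dir.

    Assumption: all files are saved directly inside output_dir.
    """
    return [name for name in ON_SET_FILES if not (output_dir / name).is_file()]
```

Calling `missing_on_set_files(Path("path/to/output"))`, with a placeholder path, returns an empty list when nothing is missing. Judging by the extensions alone, the .csv, .pt and .npy files can presumably be read with `pandas.read_csv`, `torch.load` and `numpy.load` respectively.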

Plots

  • PT_att_to_aa.png

    The heatmaps of the percentage of attention given to each amino acid.


  • PWT_att_to_aa.png

    The heatmaps of the percentage of attention given to each amino acid, weighted by the occurrences of that amino acid in all the proteins of the set.


  • PH_att_to_A.png

    The heatmap of the percentage of each head's attention given to the amino acid alanine (A). A similar image is generated for every amino acid that appears in the set.


  • att_sim.png

    The heatmap of the global attention similarity between each pair of amino acids.


  • avg_att_align_heads.png

    The heatmap of the average head attention alignment.


  • avg_att_align_layers.png

    The bar plot of the average layer attention alignment.

