Hierarchical Clustering of Correlation Patterns (HCCP)

HCCP is an algorithm and software for dynamic domain identification in proteins. See the following papers for details:

Yesylevskyy, S. O., V. N. Kharkyanen, and A. P. Demchenko. Hierarchical clustering of the correlation patterns: New method of domain identification in proteins. Biophys. Chem. 2006. 119:84-93.
Yesylevskyy, S. O., V. N. Kharkyanen, and A. P. Demchenko. Dynamic protein domains: identification, interdependence and stability. Biophys. J. 2006. 91:670-685.

The code was written in 2006-2009 in Fortran 90 and was never published; it was available by request only. In 2025 I resurrected it by fixing minor compatibility issues with modern gfortran and uploaded it to GitHub.

Despite being very old and written in Fortran, HCCP is rather user-friendly and easy to modify.

When using HCCP, please always cite the papers above.

Compiling

You need gfortran to compile. Just run make.

Running hccp

Run hccp < inp, where inp is your input file.

Use the sample_input file as a template. The comments there are detailed enough. The input file has a fixed line-by-line format: do not add or remove lines, otherwise hccp will not work. Only standard PDB files can be read; CHARMm PDB files and similar variants are not supported.

Output

Standard output contains:

Natural number of clusters (most probable for this protein)
Second natural number of clusters (the second most probable)
Mean intra-domain correlation for the natural number of clusters
Mean inter-domain correlation for the natural number of clusters. NOTE: These features are not published yet.

The most useful information is written to the file color.tcl. The human-readable lines begin with #. Typical output looks like:

 #-- Number of clusters  2 -------
 ...
 # (  1|   2: 108)
 # (  2| 109: 254)
 # (  1| 255: 284)
 # (  2| 285: 306)
      ^  ^    ^
      |  |    End of segment
      |  Start of segment
      Domain

A domain can consist of several segments, each of which is continuous in sequence. Chain identifiers are added if needed.

Such blocks of information are written for ALL hierarchical levels. It is a good idea to read the file from the end, because the largest clusters are there.
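
If you want to post-process the domain assignment outside VMD, the human-readable lines can be collected with a short script. The following is only a sketch: the regular expressions are inferred from the sample output above and are not a specification of the color.tcl format, the function name parse_levels is hypothetical, and lines carrying chain identifiers are not handled because their exact layout is not shown here.

```python
import re
from collections import defaultdict

# Patterns inferred from the sample output in this README (an assumption,
# not a documented format). Lines with chain identifiers will not match.
HEADER_RE = re.compile(r"^\s*#-+\s*Number of clusters\s+(\d+)")
SEGMENT_RE = re.compile(r"^\s*#\s*\(\s*(\d+)\s*\|\s*(-?\d+)\s*:\s*(-?\d+)\s*\)")


def parse_levels(lines):
    """Return {number_of_clusters: {domain: [(start, end), ...]}}."""
    levels = {}
    current = None
    for line in lines:
        header = HEADER_RE.match(line)
        if header:
            # A new hierarchical level starts here.
            current = defaultdict(list)
            levels[int(header.group(1))] = current
            continue
        segment = SEGMENT_RE.match(line)
        if segment and current is not None:
            domain, start, end = (int(g) for g in segment.groups())
            current[domain].append((start, end))
    return {n: dict(domains) for n, domains in levels.items()}


sample = """\
 #-- Number of clusters  2 -------
 # (  1|   2: 108)
 # (  2| 109: 254)
 # (  1| 255: 284)
 # (  2| 285: 306)
"""
levels = parse_levels(sample.splitlines())
print(levels[2])
```

For a real run, pass an open file instead of the sample string, for example parse_levels(open("color.tcl")).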

The file color.tcl is overwritten each time you run hccp with another protein. Be careful not to lose your data!
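
One way to avoid losing results is to copy color.tcl to a protein-specific name right after each run. This is a minimal sketch, not part of HCCP: the function name archive_color_tcl and the color_<label>.tcl naming scheme are assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def archive_color_tcl(workdir, label):
    """Copy workdir/color.tcl to workdir/color_<label>.tcl and return the new path."""
    source = Path(workdir) / "color.tcl"
    target = Path(workdir) / f"color_{label}.tcl"
    if target.exists():
        # Refuse to clobber an earlier archive.
        raise FileExistsError(f"{target} already exists; choose another label")
    shutil.copy2(source, target)
    return target


# Demonstration in a temporary directory with a stand-in color.tcl.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "color.tcl").write_text("# demo content\n")
    saved = archive_color_tcl(demo_dir, "protein1")
    print(saved.name)
```

The archived copy can later be loaded in VMD with source in the same way as color.tcl.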

Using VMD to visualize domains

Start VMD
Type cd <path to the directory where your files reside> in the console
Load your PDB file
Type source color.tcl

Now you can use the following commands:

next - show next hierarchical level.
prev - show previous hierarchical level.
last - show last level (2 domains).
goto <N> - go to the level number N.

The clusters are color-coded. Segments that appear in blue are not assigned to any cluster larger than 5 residues (they can still be assigned to smaller clusters).

Possible problems

PDB cannot be read: Check that the PDB file is valid. Try removing the ligands and hetero atoms. Check the histidine residues: they must be named HIS; other names are not supported.
Incorrect number of residues is detected: The number of residues is determined as the number of CA atoms. If some CA atoms are missing, you will get fewer residues. Non-standard residues are also ignored, even if they have a CA atom.
Other problems: Please open a GitHub issue.
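
Before opening an issue, it can help to check your PDB file against the first two points. The sketch below counts CA atoms in standard residues and lists the CA atoms that would presumably be ignored. It is not HCCP's own reader: the set of 20 standard residue names, the list of alternative histidine names, and the treatment of HETATM records are assumptions, and alternate locations are not handled specially. The helper make_pdb_line exists only to build correctly aligned demo records.

```python
# Assumed definition of "standard" residues: the 20 common amino acids.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
# Histidine names used by some force fields; per this README only HIS is read.
HISTIDINE_VARIANTS = {"HSD", "HSE", "HSP", "HID", "HIE", "HIP"}


def summarize_ca_atoms(pdb_lines):
    """Return (count of CA atoms in standard residues, list of skipped CA atoms)."""
    counted = 0
    skipped = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Columns 13-16 hold the atom name; " CA " is an alpha carbon,
        # while a calcium ion is written as "CA  ".
        if line[12:16] != " CA ":
            continue
        res_name = line[17:20].strip()
        if line.startswith("ATOM") and res_name in STANDARD_RESIDUES:
            counted += 1
        else:
            skipped.append((res_name, line[21], line[22:26].strip()))
    return counted, skipped


def make_pdb_line(record, serial, name, res_name, chain, res_seq):
    """Hypothetical helper: build the fixed-column start of a PDB atom record."""
    return f"{record:<6}{serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}"


demo = [
    make_pdb_line("ATOM", 1, " CA ", "ALA", "A", 1),
    make_pdb_line("ATOM", 2, " CA ", "HSD", "A", 2),
    make_pdb_line("HETATM", 3, " CA ", "MSE", "A", 3),
    make_pdb_line("ATOM", 4, " N  ", "GLY", "A", 4),
    make_pdb_line("ATOM", 5, " CA ", "GLY", "A", 4),
]
counted, skipped = summarize_ca_atoms(demo)
renamed_his = [entry for entry in skipped if entry[0] in HISTIDINE_VARIANTS]
print(counted, skipped, renamed_his)
```

If the count differs from the number of residues you expect, the skipped list shows which residues to rename or repair.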
