Improve speed of OFF molecule/network molecule comparisons and isomorphisms #353

Open
@j-wags

Description

System is a box of 1000 hexadecanes.

```python
from openforcefield.typing.engines.smirnoff import ForceField
from openforcefield.topology import Molecule, Topology
from simtk.openmm import app
import time

hexadecane = Molecule.from_smiles('CCCCCCCCCCCCCCCC')
pdbfile = app.PDBFile('output.pdb')
start_time = time.time()
top = Topology.from_openmm(pdbfile.topology, unique_molecules=[hexadecane])
print("--- Topology creation finished in %s seconds ---" % (time.time() - start_time))
ff = ForceField('test_forcefields/smirnoff99Frosst.offxml')
```

```
--- Topology creation finished in 1585.9495000839233 seconds ---
```

```python
start_time = time.time()

system = ff.create_openmm_system(top)
print("--- System creation finished in %s seconds ---" % (time.time() - start_time))
```

```
--- System creation finished in 57.06095290184021 seconds ---
```

output.pdb.gz

Thoughts

I suspect that this slowdown occurs in the graph matching stage of topology creation. Hexadecane has a huge number of self-symmetries. Although the function only uses the first molecular topology match it finds, it may be generating all of them unnecessarily.
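To put a rough number on "huge", here is a back-of-the-envelope count. It assumes the graph includes explicit hydrogens and that matching considers only element and connectivity, which may not be exactly what the toolkit does. Under those assumptions, the hydrogens on each carbon can be permuted freely and the chain can be reversed. The function name is made up for illustration.

```python
from math import factorial


def linear_alkane_automorphism_count(n_carbons):
    """Count self-symmetries of an all-atom linear alkane graph.

    Hypothetical helper, valid for n_carbons >= 2. Assumes atoms are
    distinguished only by element and connectivity.
    """
    if n_carbons < 2:
        raise ValueError("formula below only covers chains of 2+ carbons")
    chain_reversal = 2                        # read the chain left-to-right or right-to-left
    methyl_perms = factorial(3) ** 2          # 3 H on each of the two terminal carbons
    methylene_perms = factorial(2) ** (n_carbons - 2)  # 2 H on each interior carbon
    return chain_reversal * methyl_perms * methylene_perms


print(linear_alkane_automorphism_count(16))  # hexadecane: 2 * 36 * 2**14 = 1179648
```

If the matcher really does enumerate every mapping, that would be over a million mappings per hexadecane under these assumptions, and the box contains 1000 of them.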

Relevant code is here: https://github.com/openforcefield/openforcefield/blob/master/openforcefield/topology/topology.py#L1635-L1664

We should poke around NetworkX to see if there's a faster algorithm that this code could use.
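As a starting point, here is a minimal sketch of the difference between taking the first match and enumerating all of them with NetworkX's `GraphMatcher`. It does not use the toolkit's own graph representation: `linear_alkane_graph`, `first_mapping`, and `count_mappings` are hypothetical helpers, and the `element` node attribute is an assumption made for the example. `isomorphisms_iter()` is a generator, so pulling a single item with `next()` avoids walking the rest of the search space.

```python
import networkx as nx
from networkx.algorithms import isomorphism


def linear_alkane_graph(n_carbons):
    """Build an all-atom graph of a linear alkane (n_carbons >= 2)."""
    graph = nx.Graph()
    for i in range(n_carbons):
        graph.add_node(("C", i), element="C")
        if i > 0:
            graph.add_edge(("C", i - 1), ("C", i))
    for i in range(n_carbons):
        # Terminal carbons carry 3 hydrogens, interior carbons carry 2.
        n_hydrogens = 3 if i in (0, n_carbons - 1) else 2
        for j in range(n_hydrogens):
            graph.add_node(("H", i, j), element="H")
            graph.add_edge(("C", i), ("H", i, j))
    return graph


def _matcher(g1, g2):
    # Only match atoms that share the same element label.
    node_match = isomorphism.categorical_node_match("element", None)
    return isomorphism.GraphMatcher(g1, g2, node_match=node_match)


def first_mapping(g1, g2):
    """Return one atom mapping, or None, without enumerating the rest."""
    return next(_matcher(g1, g2).isomorphisms_iter(), None)


def count_mappings(g1, g2):
    """Exhaust the generator; cost grows with the number of symmetries."""
    return sum(1 for _ in _matcher(g1, g2).isomorphisms_iter())


propane = linear_alkane_graph(3)
print(count_mappings(propane, propane))  # 144 self-mappings for propane

hexadecane = linear_alkane_graph(16)
mapping = first_mapping(hexadecane, hexadecane)
print(len(mapping))  # 50 atoms mapped
```

If the code linked above builds a full list from `isomorphisms_iter()` before picking one entry, switching to the `next()` pattern (or to `GraphMatcher.is_isomorphic()` followed by reading `GraphMatcher.mapping`) would be one thing to try. Whether that is the actual bottleneck still needs profiling.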
