Description
The system is a box of 1000 hexadecane molecules.
from openforcefield.typing.engines.smirnoff import ForceField
from openforcefield.topology import Molecule, Topology
from simtk.openmm import app
import time
hexadecane = Molecule.from_smiles('CCCCCCCCCCCCCCCC')
pdbfile = app.PDBFile('output.pdb')
start_time = time.time()
top = Topology.from_openmm(pdbfile.topology, unique_molecules=[hexadecane])
print("--- Topology creation finished in %s seconds ---" % (time.time() - start_time))
ff = ForceField('test_forcefields/smirnoff99Frosst.offxml')
--- Topology creation finished in 1585.9495000839233 seconds ---
start_time = time.time()
system = ff.create_openmm_system(top)
print("--- System creation finished in %s seconds ---" % (time.time() - start_time))
--- System creation finished in 57.06095290184021 seconds ---
Thoughts
I suspect that this slowdown occurs in the graph matching stage of topology creation. Hexadecane has a huge number of self-symmetries. While the function only takes the first molecular topology match that's found, it may be generating all of them unnecessarily.
Relevant code is here: https://github.com/openforcefield/openforcefield/blob/master/openforcefield/topology/topology.py#L1635-L1664
We should poke around NetworkX to see if there's a faster algorithm that this code could use.
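To get a feel for the size of the problem, here is a rough sketch that uses NetworkX's VF2 `GraphMatcher` on a toy all-atom alkane graph. The helpers `alkane_graph`, `has_match`, `count_self_matches`, and `expected_self_matches` are hypothetical names made up for this illustration, and this is not the toolkit's actual matching code, which may compare more than element labels. It only contrasts stopping at the first match (`is_isomorphic()`) with enumerating every match (`isomorphisms_iter()`).

```python
import networkx as nx
from networkx.algorithms import isomorphism


def alkane_graph(n_carbons):
    """Build the all-atom graph of a linear alkane; nodes carry an 'element' label."""
    graph = nx.Graph()
    # Carbon backbone: a simple path of n_carbons nodes.
    for i in range(n_carbons):
        graph.add_node(("C", i), element="C")
        if i > 0:
            graph.add_edge(("C", i - 1), ("C", i))
    # Fill every carbon up to four bonds with hydrogens.
    h_index = 0
    for i in range(n_carbons):
        carbon = ("C", i)
        n_hydrogens = 4 - graph.degree(carbon)
        for _ in range(n_hydrogens):
            hydrogen = ("H", h_index)
            graph.add_node(hydrogen, element="H")
            graph.add_edge(carbon, hydrogen)
            h_index += 1
    return graph


def _matcher(graph_a, graph_b):
    # Only element labels are compared here (an assumption for this sketch).
    node_match = isomorphism.categorical_node_match("element", None)
    return isomorphism.GraphMatcher(graph_a, graph_b, node_match=node_match)


def has_match(graph_a, graph_b):
    """Return True as soon as VF2 finds one valid mapping."""
    return _matcher(graph_a, graph_b).is_isomorphic()


def count_self_matches(graph):
    """Enumerate every mapping of the graph onto itself (this is the slow path)."""
    return sum(1 for _ in _matcher(graph, graph).isomorphisms_iter())


def expected_self_matches(n_carbons):
    """Closed-form count for a linear alkane with at least two carbons:
    2 (chain flip) * 3!**2 (two CH3 groups) * 2**(n - 2) (CH2 groups)."""
    return 2 * 6**2 * 2 ** (n_carbons - 2)


butane = alkane_graph(4)
print(has_match(butane, butane))      # True
print(count_self_matches(butane))     # 288
print(expected_self_matches(16))      # 1179648 for hexadecane
```

By this count, a single hexadecane has more than a million equivalent atom mappings, so any code path that enumerates all of them for each of the 1000 molecules would be extremely slow, while an early exit only needs one mapping per molecule. Whether the linked code actually enumerates them still needs to be confirmed by profiling.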