Skip to content

Conversation

@davidlmobley
Copy link
Contributor

Some of our users (eg Dan McKay) could benefit from being able to easily drop SMIRNOFF parameters into existing workflows, such as by swapping out GAFF parameters for a ligand with OpenFF parameters.

This attempts to add an example of that, but it gets stalled on re-combining the parameterized protein with the OpenFF-parameterized ligand -- an issue with LJ ACOEFF parameters I haven't been able to sort out:

AmberError: FLAG LENNARD_JONES_ACOEF has 171 elements; expected 105

This odd, in that a very similar procedure works when setting up from scratch in one of our examples (https://github.com/openforcefield/openforcefield/blob/master/examples/using_smirnoff_with_amber_protein_forcefield/BRD4_inhibitor_benchmark.ipynb). I haven't been able to sort out what the critical difference is here, so I wanted to get this up in case someone can help.

Tagging @slochower and @j-wags who I think have touched some of these issues before.

@codecov-io
Copy link

codecov-io commented Dec 18, 2019

Codecov Report

Merging #476 into master will not change coverage.
The diff coverage is n/a.

@slochower
Copy link

slochower commented Dec 18, 2019 via email

@davidlmobley
Copy link
Contributor Author

Ugh, sorry about that, missed the most important part, @slochower ! Should be here now.

@slochower
Copy link

I haven't dug in yet, but curiously, it seems like it's getting more LJ parameters than expected! Usually, it's less :)

@j-wags
Copy link
Member

j-wags commented Dec 19, 2019

Root cause is probably our lack of atom types -- ParmEd is expecting to fill a numerical array with the same number of pre-computed LJ interaction coefficients as the old ligand (apparently 14) [1].

However, since our new system has "more atom types", the size comparison fails. Oddly, the ligand has 26 atoms, not 18 (which should yield 171, per the formula in the code). My guess for the discrepancy is that ParmEd is clever enough not to pre-compute terms for atoms that only ever appear one or two bonds away from each other (since scale12 and scale13 are 0 in the normal FF)

[1]

>>> for i in range(14,20):
...     print(i, (i*(i+1))//2)
... 
14 105
15 120
16 136
17 153
18 171
19 190

@j-wags
Copy link
Member

j-wags commented Dec 19, 2019

My suspicion is that pieces[0][0] still has ties to the full system that was loaded initially. Even though it should just have the protein, asking it for its LJ types returns 47 unique types.

print(pieces[0][0].LJ_types)
print(len(pieces[0][0].LJ_types))

{'N3': 1, 'H': 2, 'CX': 3, 'HP': 4, '2C': 3, 'HC': 5, 'C': 6, 'O': 7, 'N': 1, 'CT': 3, 'H1': 8, 'CO': 6, 'O2': 7, '3C': 3, 'OH': 9, 'HO': 10, 'C8': 3, 'N2': 1, 'CA': 6, 'HA': 11, 'C*': 6, 'CW': 6, 'H4': 12, 'NA': 1, 'CN': 6, 'CB': 6, 'CC': 6, 'NB': 1, 'CR': 6, 'H5': 13, 'S': 14, 'SH': 14, 'HS': 2, 'na': 1, 'nc': 1, 'nd': 1, 'cc': 6, 'cd': 6, 'c3': 3, 'ca': 6, 'ha': 11, 'hc': 5, 'h5': 13, 'Na+': 15, 'Cl-': 16, 'OW': 17, 'HW': 18}
47

I'm not all that familiar with ParmEd, but I think that what we need to do is to "truly" separate off pieces[0][0] into its own Structure (with no preconcieved notions of the ligand), and then combine the systems.

@davidlmobley
Copy link
Contributor Author

That's what split is SUPPOSED to do, as I understand it, @j-wags .

Do you have other ideas how to do this?

@davidlmobley
Copy link
Contributor Author

@nividic have you dealt with this? You probably have, perhaps you can give a couple quick pointers?

@slochower
Copy link

I know this is old now, but this is how I did this exact thing before: https://github.com/slochower/smirnoff-host-guest/blob/master/01-convert-benchmarkset.ipynb I think some of the functions here could be useful: https://github.com/slochower/smirnoff-host-guest/blob/master/smirnovert/utils.py but I haven't cross-checked with @davidlmobley's notebook yet.

@slochower
Copy link

Might also be worthwhile to try remake_parm(): https://parmed.github.io/ParmEd/html/api/parmed/parmed.amber.html?highlight=remake_parm#parmed.amber.AmberParm.remake_parm if you are splitting and recombining topologies.

@j-wags
Copy link
Member

j-wags commented Dec 20, 2019

@slochower's code had the trick we needed. To combine them successfully, we can make a new Structure, and add the pieces we want to it a la carte. For example, like this:

complex_structure = parmed.Structure()
complex_structure += pieces[0][0]
complex_structure += ligand_structure

I've updated the notebook to produce the combined system, and run a brief minimization/simulation. So, there's still a lot of debug output, but this produces at least a system that doesn't explode immediately. It shouldn't be hard to carry the solvent through using the same method.

One thing I'd like another set of eyes on: I've also added paranoid checks to ensure that ParmEd is not incorrectly merging protein and ligand atom parameters when they have the same "atom type". The number of unique sigmas/epsilons DOES seem to be preserved, but we come up one "atom type" short after the merge. "H1" appears in both ligand and protein systems. I hope that's just cosmetic, and the number of unique params seems to indicate that everything is alright.

@davidlmobley
Copy link
Contributor Author

Let me take a quick look, @j-wags . Thanks for looking at this!

@davidlmobley
Copy link
Contributor Author

@j-wags -- this looks like this should be fine. It's just the atom NAMES, which are cosmetic, which collide. The number of atom types (sigmas and epsilons) is correct, indicating that indeed we're just seeing a cosmetic issue.

So, this looks good to go once we re-insert the documentation, etc. (most of it can be copied over from the other BRD4 example). Do you have a few minutes to do that? If not LMK and I may be able to squeeze it in tomorrow.

Also, we need to get the water back in too-- the goal of this one was to take the original system, pull out the ligand and re-parameterize it, and stick it back into the original system. LMK if you need me to do that. It should be just a matter of re-adding the water pieces too.

@j-wags
Copy link
Member

j-wags commented Dec 20, 2019

I'll take a shot at this tonight. My main goals will be to clean up and comment the code, and I'll draft the text fields as well. I find that polishing up wording and adding friendly links takes me unbounded amounts of time, so I'll leave the finalization to you.

@slochower
Copy link

slochower commented Dec 20, 2019 via email

@j-wags
Copy link
Member

j-wags commented Dec 20, 2019

Ok, I've polished up the documentation for the first half of the notebook, and I've figured out how to efficiently recombine the components to create the system. Note that complex_structure += pieces[2][0] * pieces[2][1] doesn't work, since trying to use the * operator on whatever object is returned by structure.split() yields the same error as we were getting initially (about a mismatched number of atom types).

The code is now all in place, and the resulting brief simulations look reasonable. I want to get ChargeIncrementModel in reviewable shape before I head out, and I'm already pretty booked up tomorrow. @davidlmobley, could you take over from here? The remaining work is to document the second half of the notebook.

@davidlmobley
Copy link
Contributor Author

Yes, I'll document from here. Thanks, @j-wags !

@davidlmobley davidlmobley changed the title [WIP] This attempts to add an example of swapping SMIRNOFF for GAFF parameters for a protein-ligand complex This adds an example of swapping SMIRNOFF for GAFF parameters for a protein-ligand complex Dec 20, 2019
@davidlmobley
Copy link
Contributor Author

@j-wags this is ready to go unless we want to also:
a) do something to enable testing on this notebook (or is it already/automatically enabled?)
b) Get the MyBinder stuff working as in (some of?) the other examples -- there's a link in the README.md for some of the other examples.

I'll go ahead and point Dan here, so that you can either merge now, or wait to get that info in before merging.

Once merged, LMK and I'll tweet about this. I imagine this is an example which will help a lot of people get started since this allows them to drop OpenFF ligands into existing workflows.

@davidlmobley davidlmobley requested a review from j-wags December 20, 2019 20:32
Copy link
Member

@j-wags j-wags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a few changes, but this looks really good overall. I corrected two statements about hbond length constraints. Remember -- Even if we DO want to apply hbond length constraints in the combined system, we load the UNCONSTRAINED forcefield, parameterize everything, and then run parmed_structure.create_system(..., constraints=Hbonds, ...).

As proof that this works, the ligand begins at atom 2036, and here's the constraints section in the resulting system:

from simtk.openmm import XmlSerializer
XmlSerializer.serialize(system)

...<Constraint d=".10850661293899999" p1="2040" p2="2051"/>\n\t\t<Constraint d=".1092888378383" p1="2043" p2="2052"/>...

These values match (to high precision) the hbond lengths in Parsley

Anyway, I'm really happy with this, and I enjoyed learning more about ParmEd. If we wait to binder-ize this now, it'll never get done, so I think this is good to merge!

@j-wags
Copy link
Member

j-wags commented Dec 21, 2019

Oh, but maybe let's wait for tests. Something really strange happened in the last build.

@j-wags
Copy link
Member

j-wags commented Dec 21, 2019

Alright. Tests look good. Clear to merge.

@j-wags j-wags merged commit 7858706 into master Dec 21, 2019
@j-wags j-wags deleted the amber_swap branch December 21, 2019 01:04
@j-wags j-wags added this to the 0.7.0 milestone Feb 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants