This adds an example of swapping SMIRNOFF for GAFF parameters for a protein-ligand complex #476

davidlmobley · 2019-12-18T23:13:41Z

Some of our users (eg Dan McKay) could benefit from being able to easily drop SMIRNOFF parameters into existing workflows, such as by swapping out GAFF parameters for a ligand with OpenFF parameters.

This attempts to add an example of that, but it gets stalled on re-combining the parameterized protein with the OpenFF-parameterized ligand -- an issue with LJ ACOEFF parameters I haven't been able to sort out:

AmberError: FLAG LENNARD_JONES_ACOEF has 171 elements; expected 105

This odd, in that a very similar procedure works when setting up from scratch in one of our examples (https://github.com/openforcefield/openforcefield/blob/master/examples/using_smirnoff_with_amber_protein_forcefield/BRD4_inhibitor_benchmark.ipynb). I haven't been able to sort out what the critical difference is here, so I wanted to get this up in case someone can help.

Tagging @slochower and @j-wags who I think have touched some of these issues before.

…ountered error.

codecov-io · 2019-12-18T23:32:14Z

Codecov Report

Merging #476 into master will not change coverage.
The diff coverage is n/a.

slochower · 2019-12-18T23:43:29Z

I might be missing something obvious, but I only see the topology in this PR. Can you post the code you tried?

…

Sent from my iPhone

On Dec 18, 2019, at 15:13, David L. Mobley ***@***.***> wrote: Some of our users (eg Dan McKay) could benefit from being able to easily drop SMIRNOFF parameters into existing workflows, such as by swapping out GAFF parameters for a ligand with OpenFF parameters. This attempts to add an example of that, but it gets stalled on re-combining the parameterized protein with the OpenFF-parameterized ligand -- an issue with LJ ACOEFF parameters I haven't been able to sort out: AmberError: FLAG LENNARD_JONES_ACOEF has 171 elements; expected 105 This odd, in that a very similar procedure works when setting up from scratch in one of our examples (https://github.com/openforcefield/openforcefield/blob/master/examples/using_smirnoff_with_amber_protein_forcefield/BRD4_inhibitor_benchmark.ipynb). I haven't been able to sort out what the critical difference is here, so I wanted to get this up in case someone can help. Tagging @slochower and @j-wags who I think have touched some of these issues before. You can view, comment on, or merge this pull request online at: #476 Commit Summary Try to prepare example of swapping SMIRNOFF for AMBER parameters. Encountered error. File Changes M examples/README.md (1) A examples/swap_amber_parameters/tmp.top (305) Patch Links: https://github.com/openforcefield/openforcefield/pull/476.patch https://github.com/openforcefield/openforcefield/pull/476.diff — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

davidlmobley · 2019-12-18T23:50:38Z

Ugh, sorry about that, missed the most important part, @slochower ! Should be here now.

slochower · 2019-12-19T03:48:05Z

I haven't dug in yet, but curiously, it seems like it's getting more LJ parameters than expected! Usually, it's less :)

j-wags · 2019-12-19T04:49:00Z

Root cause is probably our lack of atom types -- ParmEd is expecting to fill a numerical array with the same number of pre-computed LJ interaction coefficients as the old ligand (apparently 14) [1].

However, since our new system has "more atom types", the size comparison fails. Oddly, the ligand has 26 atoms, not 18 (which should yield 171, per the formula in the code). My guess for the discrepancy is that ParmEd is clever enough not to pre-compute terms for atoms that only ever appear one or two bonds away from each other (since scale12 and scale13 are 0 in the normal FF)

[1]

>>> for i in range(14,20):
...     print(i, (i*(i+1))//2)
... 
14 105
15 120
16 136
17 153
18 171
19 190

j-wags · 2019-12-19T06:00:08Z

My suspicion is that pieces[0][0] still has ties to the full system that was loaded initially. Even though it should just have the protein, asking it for its LJ types returns 47 unique types.

print(pieces[0][0].LJ_types)
print(len(pieces[0][0].LJ_types))

{'N3': 1, 'H': 2, 'CX': 3, 'HP': 4, '2C': 3, 'HC': 5, 'C': 6, 'O': 7, 'N': 1, 'CT': 3, 'H1': 8, 'CO': 6, 'O2': 7, '3C': 3, 'OH': 9, 'HO': 10, 'C8': 3, 'N2': 1, 'CA': 6, 'HA': 11, 'C*': 6, 'CW': 6, 'H4': 12, 'NA': 1, 'CN': 6, 'CB': 6, 'CC': 6, 'NB': 1, 'CR': 6, 'H5': 13, 'S': 14, 'SH': 14, 'HS': 2, 'na': 1, 'nc': 1, 'nd': 1, 'cc': 6, 'cd': 6, 'c3': 3, 'ca': 6, 'ha': 11, 'hc': 5, 'h5': 13, 'Na+': 15, 'Cl-': 16, 'OW': 17, 'HW': 18}
47

I'm not all that familiar with ParmEd, but I think that what we need to do is to "truly" separate off pieces[0][0] into its own Structure (with no preconcieved notions of the ligand), and then combine the systems.

davidlmobley · 2019-12-19T16:31:10Z

That's what split is SUPPOSED to do, as I understand it, @j-wags .

Do you have other ideas how to do this?

davidlmobley · 2019-12-19T16:31:46Z

@nividic have you dealt with this? You probably have, perhaps you can give a couple quick pointers?

slochower · 2019-12-19T16:50:13Z

I know this is old now, but this is how I did this exact thing before: https://github.com/slochower/smirnoff-host-guest/blob/master/01-convert-benchmarkset.ipynb I think some of the functions here could be useful: https://github.com/slochower/smirnoff-host-guest/blob/master/smirnovert/utils.py but I haven't cross-checked with @davidlmobley's notebook yet.

slochower · 2019-12-19T16:56:39Z

Might also be worthwhile to try remake_parm(): https://parmed.github.io/ParmEd/html/api/parmed/parmed.amber.html?highlight=remake_parm#parmed.amber.AmberParm.remake_parm if you are splitting and recombining topologies.

j-wags · 2019-12-20T00:55:45Z

@slochower's code had the trick we needed. To combine them successfully, we can make a new Structure, and add the pieces we want to it a la carte. For example, like this:

complex_structure = parmed.Structure()
complex_structure += pieces[0][0]
complex_structure += ligand_structure

I've updated the notebook to produce the combined system, and run a brief minimization/simulation. So, there's still a lot of debug output, but this produces at least a system that doesn't explode immediately. It shouldn't be hard to carry the solvent through using the same method.

One thing I'd like another set of eyes on: I've also added paranoid checks to ensure that ParmEd is not incorrectly merging protein and ligand atom parameters when they have the same "atom type". The number of unique sigmas/epsilons DOES seem to be preserved, but we come up one "atom type" short after the merge. "H1" appears in both ligand and protein systems. I hope that's just cosmetic, and the number of unique params seems to indicate that everything is alright.

davidlmobley · 2019-12-20T00:59:37Z

Let me take a quick look, @j-wags . Thanks for looking at this!

davidlmobley · 2019-12-20T01:03:34Z

@j-wags -- this looks like this should be fine. It's just the atom NAMES, which are cosmetic, which collide. The number of atom types (sigmas and epsilons) is correct, indicating that indeed we're just seeing a cosmetic issue.

So, this looks good to go once we re-insert the documentation, etc. (most of it can be copied over from the other BRD4 example). Do you have a few minutes to do that? If not LMK and I may be able to squeeze it in tomorrow.

Also, we need to get the water back in too-- the goal of this one was to take the original system, pull out the ligand and re-parameterize it, and stick it back into the original system. LMK if you need me to do that. It should be just a matter of re-adding the water pieces too.

j-wags · 2019-12-20T01:34:47Z

I'll take a shot at this tonight. My main goals will be to clean up and comment the code, and I'll draft the text fields as well. I find that polishing up wording and adding friendly links takes me unbounded amounts of time, so I'll leave the finalization to you.

slochower · 2019-12-20T02:32:47Z

Glad to help. Adding water back in should be in my notebook too.

…

Sent from my iPhone

On Dec 19, 2019, at 17:34, Jeff Wagner ***@***.***> wrote: I'll take a shot at this tonight. My main goals will be to clean up and comment the code, and I'll draft the text fields as well. I find that polishing up wording and adding friendly links takes me unbounded amounts of time, so I'll leave the finalization to you. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

j-wags · 2019-12-20T03:55:15Z

Ok, I've polished up the documentation for the first half of the notebook, and I've figured out how to efficiently recombine the components to create the system. Note that complex_structure += pieces[2][0] * pieces[2][1] doesn't work, since trying to use the * operator on whatever object is returned by structure.split() yields the same error as we were getting initially (about a mismatched number of atom types).

The code is now all in place, and the resulting brief simulations look reasonable. I want to get ChargeIncrementModel in reviewable shape before I head out, and I'm already pretty booked up tomorrow. @davidlmobley, could you take over from here? The remaining work is to document the second half of the notebook.

davidlmobley · 2019-12-20T19:35:07Z

Yes, I'll document from here. Thanks, @j-wags !

davidlmobley · 2019-12-20T20:32:32Z

@j-wags this is ready to go unless we want to also:
a) do something to enable testing on this notebook (or is it already/automatically enabled?)
b) Get the MyBinder stuff working as in (some of?) the other examples -- there's a link in the README.md for some of the other examples.

I'll go ahead and point Dan here, so that you can either merge now, or wait to get that info in before merging.

Once merged, LMK and I'll tweet about this. I imagine this is an example which will help a lot of people get started since this allows them to drop OpenFF ligands into existing workflows.

j-wags

Made a few changes, but this looks really good overall. I corrected two statements about hbond length constraints. Remember -- Even if we DO want to apply hbond length constraints in the combined system, we load the UNCONSTRAINED forcefield, parameterize everything, and then run parmed_structure.create_system(..., constraints=Hbonds, ...).

As proof that this works, the ligand begins at atom 2036, and here's the constraints section in the resulting system:

from simtk.openmm import XmlSerializer
XmlSerializer.serialize(system)

...<Constraint d=".10850661293899999" p1="2040" p2="2051"/>\n\t\t<Constraint d=".1092888378383" p1="2043" p2="2052"/>...

These values match (to high precision) the hbond lengths in Parsley

Anyway, I'm really happy with this, and I enjoyed learning more about ParmEd. If we wait to binder-ize this now, it'll never get done, so I think this is good to merge!

j-wags · 2019-12-21T00:04:45Z

Oh, but maybe let's wait for tests. Something really strange happened in the last build.

j-wags · 2019-12-21T00:21:32Z

Alright. Tests look good. Clear to merge.

Try to prepare example of swapping SMIRNOFF for AMBER parameters. Enc…

4d7be46

…ountered error.

Missed committing notebook.

51f1fbc

j-wags added 3 commits December 19, 2019 16:23

Cleared parameter separation roadblock

2835d48

Add paranoid checks about number of unique parameters

a4113d8

Paste in some code to go all the way to simulation

ac1a92e

j-wags added 2 commits December 19, 2019 19:46

simplify and explain ligand parameterization

a8519bc

add renamed notebook

61bd004

Clean up and further comment notebook, provide README.md for example.

5c4bf6b

davidlmobley changed the title ~~[WIP] This attempts to add an example of swapping SMIRNOFF for GAFF parameters for a protein-ligand complex~~ This adds an example of swapping SMIRNOFF for GAFF parameters for a protein-ligand complex Dec 20, 2019

davidlmobley requested a review from j-wags December 20, 2019 20:32

j-wags added 2 commits December 20, 2019 15:37

remove misleading statement about which FF to use

9613f3e

remove unnecessary file tmp.top

c0b675b

fix another statement about hbond length constraints

fd0c7db

j-wags approved these changes Dec 21, 2019

View reviewed changes

j-wags merged commit 7858706 into master Dec 21, 2019

j-wags deleted the amber_swap branch December 21, 2019 01:04

j-wags added this to the 0.7.0 milestone Feb 14, 2020

This adds an example of swapping SMIRNOFF for GAFF parameters for a protein-ligand complex #476

This adds an example of swapping SMIRNOFF for GAFF parameters for a protein-ligand complex #476

Uh oh!

Conversation

davidlmobley commented Dec 18, 2019

Uh oh!

codecov-io commented Dec 18, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

slochower commented Dec 18, 2019 via email

Uh oh!

davidlmobley commented Dec 18, 2019

Uh oh!

slochower commented Dec 19, 2019

Uh oh!

j-wags commented Dec 19, 2019

Uh oh!

j-wags commented Dec 19, 2019

Uh oh!

davidlmobley commented Dec 19, 2019

Uh oh!

davidlmobley commented Dec 19, 2019

Uh oh!

slochower commented Dec 19, 2019

Uh oh!

slochower commented Dec 19, 2019

Uh oh!

j-wags commented Dec 20, 2019

Uh oh!

davidlmobley commented Dec 20, 2019

Uh oh!

davidlmobley commented Dec 20, 2019

Uh oh!

j-wags commented Dec 20, 2019

Uh oh!

slochower commented Dec 20, 2019 via email

Uh oh!

j-wags commented Dec 20, 2019

Uh oh!

davidlmobley commented Dec 20, 2019

Uh oh!

davidlmobley commented Dec 20, 2019

Uh oh!

j-wags left a comment

Choose a reason for hiding this comment

Uh oh!

j-wags commented Dec 21, 2019

Uh oh!

j-wags commented Dec 21, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

codecov-io commented Dec 18, 2019 •

edited

Loading