Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cross-check SMIRFF parm@frosst energies against parm@frosst energies from AMBER files #38

Merged
merged 22 commits into from
Jul 4, 2016

Conversation

davidlmobley
Copy link
Contributor

Working to address #37 by cross-comparing energies for OpenMM Systems/Topologies created from both starting points.

I have this working in examples/SMIRFF_comparison for a single molecule as a first attempt, and have coopted the system_checker functionality from openmoltools in order to do this.

Currently, I'm getting a force constant error on the first force parameter check I run:
AssertionError: Error: Harmonic Bond (0, 1) has k0 values of 284512.000000 and 14225.600000, respectively.

*This is exactly a factor of 20 error, so I'm expecting we DO have a units error somewhere (a factor of two from a k vs k/2 convention, maybe, and a factor of 10 from a typing error?)

This is for examples/SMIRFF_comparison/AlkEthOH_inputfiles/AlkEthOH_chain_filt1/AlkEthOH_c0. I'll attempt to troubleshoot, though it's unclear how much more time I'll have today.

@bannanc , take a quick look for the factor of 10 issue? I can see if I can sort out the factor of 2.

@davidlmobley davidlmobley self-assigned this Jul 2, 2016
@davidlmobley
Copy link
Contributor Author

Well, I checked the xml file against the parm@frosst .frcmod file and I certainly don't see any factors of 10 or 20 in the bonds section -- there are only six there and they all very much have the magnitudes matching the .frcmod file. So if there's a factor of 20 issue, it has to be on the parsing side.

@jchodera
Copy link
Member

jchodera commented Jul 2, 2016

Is the amber force constant in units of kcal/mol/angstrom or does it use nanometers?

The factor of 2 is probably because Amber uses a silly definition of harmonic energies: K r^2 instead of (K/2) r^2 or something.

I would prefer if we keep the standard (K/2) harmonic convention and convert the numbers in the XML file, or add an option to the XML HarmonicBondForce block to select between standard and weird amber conventions.

@davidlmobley
Copy link
Contributor Author

Is the amber force constant in units of kcal/mol/angstrom or does it use nanometers?

I'm trying to find the place in the docs where it describes that and this:

The factor of 2 is probably because Amber uses a silly definition of harmonic energies: K r^2 instead of (K/2) r^2 or something.

as both are consistent with my theory. But, I have a meeting in five minutes so I'll probably have to get back to this, @jchodera .

I would prefer if we keep the standard (K/2) harmonic convention and convert the numbers in the XML file, or add an option to the XML HarmonicBondForce block to select between standard and weird amber conventions.

Yes, agreed.

@davidlmobley
Copy link
Contributor Author

@jchodera : Bond force constant is kcal/(mol angstrom^2). http://ambermd.org/formats.html see parm format.

But, er, note that your file says "kilocalories_per_mole/angstrom"; if these text strings are parsed automatically (?? I haven't checked that) then missing the square might cause problems...?

Off for my meeting, will investigate factor of 2 later.

@davidlmobley
Copy link
Contributor Author

@swails - do you happen to know off the top of your head whether AMBER uses k or k/2 in front of harmonic bond forces? We're trying to sort out whether there is a difference in what k means here and I haven't figured out where to look this up.

@jchodera
Copy link
Member

jchodera commented Jul 2, 2016

Whoops! Change angstrom to angstrom**2 to fix the factor of ten issue then.

@davidlmobley
Copy link
Contributor Author

Thanks, @jchodera . Fixing a similar typo regarding radians as well.

@davidlmobley
Copy link
Contributor Author

davidlmobley commented Jul 2, 2016

@jchodera : The GAFF page (http://ambermd.org/antechamber/gaff.html) and Wikipedia (https://en.wikipedia.org/wiki/AMBER#Functional_form) both drop the 2 in the equation for the force field functional form, which I think confirms this is the origin of the issue. I'll convert all of the spring constants.

@cbayly13 will need to be reminded of this for the record. :)

@davidlmobley
Copy link
Contributor Author

AMBER also drops the factor of 2 from angles, so I'll fix that as well.

@davidlmobley
Copy link
Contributor Author

OK, bonds and angles are fixed but now I have an issue with sigma:

Error: Particle 0 has sigma of 0.339967 and 0.190800, respectively

I'll investigate.

@davidlmobley
Copy link
Contributor Author

Hmm, the ratio of 0.1908/0.339967 is 0.561. That doesn't mean anything to me, and the only thing I can think of it's even vaguely close to is related to k_B*T at 300K, but it's not exactly that. And there's no reason I can think of that would show up here... @jchodera , ideas?

@jchodera
Copy link
Member

jchodera commented Jul 2, 2016

@swails - do you happen to know off the top of your head whether AMBER uses k or k/2 in front of harmonic bond forces?

I'm quite certain AMBER uses k and not k/2, while OpenMM uses k/2, hence the need for the factor of two conversion here.

@jchodera
Copy link
Member

jchodera commented Jul 2, 2016

Hmm, the ratio of 0.1908/0.339967 is 0.561. That doesn't mean anything to me, and the only thing I can think of it's even vaguely close to is related to k_B*T at 300K, but it's not exactly that. And there's no reason I can think of that would show up here... @jchodera , ideas?

That's the conversion factor between rmin---the minimum energy distance for an LJ interaction---and sigma, which is 2^(1/6), along with division by a factor of 2 thrown in for good measure:

2^(1/6) / 2 = 0.561231...

OpenMM uses the effective sigma, from which the effective pair sigma is computed by the geometric combining rule sigma = sqrt(sigma1*sigma2). AMBER may use some sort of weird rmin/2 scheme instead.

@davidlmobley
Copy link
Contributor Author

That's the conversion factor between rmin---the minimum energy distance for an LJ interaction---and sigma, which is 2^(1/6), along with division by a factor of 2 thrown in for good measure:

2^(1/6) / 2 = 0.561231...

Ha, this is awesome. One of my old physics profs said "physicists carry minus signs and factors of two in their back pockets", and this exponentiates that maxim. :)

What's your preference on handling this one? Handle it in the XML file or internally? Seems like XML file is probably preferable again.

I'm still looking at the docs on this one. http://ambermd.org/formats.html#frcmod says this cryptic thing about NONB parameters:

 follow this card by card of type - 10A, B or C - listed
                    in the unit 10 instructions above.  E.g. if you specify
                    STDA in parm.in for the "regular" parm.dat file, this is
                    the convention that will be used when reading frcmod.
                    End with a blank line.

10A translates to something irrelevant, 10B is r and epsilon, and 10C is LJ coefficients for r6 and r12. I don't know where 10A-10C is specified, but since these definitely aren't 10A or 10C, that puts us in category 10B. It says that the r is the LJ radius.

@davidlmobley
Copy link
Contributor Author

Ahha, confirmed what you were thinking by looking for the "MOD4" flag via Google, which turns up info from Swails: https://www.researchgate.net/post/Where_are_the_charge_sigma_epsilon_parameters_in_GAFF

Specifically:

In gaff.dat, look for the section that starts with "MOD4".  It lists the atom types, then two numbers.  These are the Lennard-Jones parameters Rmin/2 and epsilon. Note that Rmin/2 is NOT the same as sigma -- it is sigma * the sixth root of 2 (see http://ambermd.org/vdwequation.pdf for more details here).

@jchodera
Copy link
Member

jchodera commented Jul 2, 2016

I would really like to keep "standard" forms for spring constants and Lennard-Jones sigma and epsilon here, much like OpenMM does, but I'm not opposed to the OpenMM style of being permissive in reading but strict in writing. We can add an option where we allow weird input formats by using different attributes, like instead of

   <Atom smirks="[#8:1]" sigma="1.6837" epsilon="0.1700"/> <!-- making OS the generic oxygen --

we support reading of

   <Atom smirks="[#8:1]" rmin_half="1.6837" epsilon="0.1700"/> <!-- making OS the generic oxygen --

as well.

@davidlmobley
Copy link
Contributor Author

@jchodera : Is that a one-line change somewhere? I haven't closely looked at that part of the code yet. Certainly it would simplify things, as I imagine Chris is going to be staring at these files in some viewer at some point and is going to want to compare them to the AMBER values he's got stuck in his head, so the rmin_half thing is going to be easier for him to wrap his mind around.

@davidlmobley
Copy link
Contributor Author

It would be really great if that tool you're using in openmoltools printed out detailed information about what was different in the two systems!

Yup, I'm working on that now. Obviously it should have been coded that way to begin with. :)

@davidlmobley
Copy link
Contributor Author

It's nice it's there, it just doesn't provide enough info now. I have a small bit of it working to print out detailed info now, but am still working.

@davidlmobley
Copy link
Contributor Author

OK, here's what I have now, which is slightly more info, @jchodera :

Compared torsion involving atoms 'O2 - C7 - C3 - H7' with that involving atoms 'O2 - C7 - C3 - H7':
Keys for system0: '[(1, 0.0)]'; keys for system1: '[(3, 0.0)]'

Have to backtrack to figure out what the keys are supposed to be. :)

I do know that Chris mentioned that his recent work uncovered a bug in parm@frosst somewhere where this implementation and parm@frosst SHOULD disagree (but I don't have the details of this written down - I'll have to re-ask tomorrow) so at SOME point we'll run into that, but it's unlikely we'd run into it on the second molecule we look at, as it was a small corner case.

@davidlmobley
Copy link
Contributor Author

OK, periodicity and phase. Looking back to check files...

@davidlmobley
Copy link
Contributor Author

The issue involves an O-C-C-H torsion involving the hydroxyl at the top of the attached image, the attached tetrahedral carbon, and one of the methyls, i.e. OH-CT-CT-H* where I'd have to look at the typing to see what that last proton is (it's part of a methyl). I'm drilling down into SMARTS matching to figure out what term is being applied and check details.

picto_4_3_0_4

@jchodera
Copy link
Member

jchodera commented Jul 4, 2016

Looks like good progress!

I'm also wondering if there will be a difference in how AMBER and OpenMM specify torsion barriers. I vaguely recall that OpenMM just adds all the torsions up, while AMBER---at the time torsions are applied in LEaP---averages them in some way dependent on how many torsions are present for a given bond. But I might be totally mistaken there.

@davidlmobley
Copy link
Contributor Author

davidlmobley commented Jul 4, 2016

@jchodera : OK, so I almost have this tracked down, in that it seems to PARTLY originate from the fact that we don't yet have all the torsions typed into the XML file.

Tangentially, did you end up implementing the id, parent_id stuff discussed here? openforcefield/open-forcefield-data#9 . If so, while I am editing the XML file I should add these.

(There was an earlier version of this file which asked about multiple parameter specifications for a particular SMARTS, but I answered this question myself. Sorry.)

@davidlmobley
Copy link
Contributor Author

@jchodera - updated the comment above. Sorry for the confusion.

@jchodera
Copy link
Member

jchodera commented Jul 4, 2016

My example from openforcefield/open-forcefield-data#9 shows that this is handled as a single XML Proper or Improper tag with (periodicity1, phase1, k1) specifying the first term, (periodicity2, phase2, k2) specifying the second term, and so on. Here's the example I wrote:

  <Proper smirks="[#6X4]-[#6X4]-[#8X2]-[#1]" periodicity1="3" phase1="0.0" k1="0.16" periodiity2="1" phase2="0.0" k2="0.25"/> <!-- HO-OH-CT-CT from frcmod.Frosst_AlkEthOH -->
  <Proper smirks="[#6X4]-[#6X4]-[#6X4]-[#6X4]" periodicity1="3" phase1="0.0" k1="0.18" periodicity2="2" phase2="180.0" k2="0.25" periodicity3="1" phase3="180.0" k3="0.20"/> <!-- CT-CT-CT-CT from frcmod.Frosst_AlkEthOH -->

Also, did you end up implementing the id, parent_id stuff discussed here?

The ForceField class just ignores attributes that it doesn't directly use, so if the XML file containsidandparent_id`, that's fine. We haven't written the sampler that would generate these XML files when sampling over parameter sets, however---that's where these would be written and read.

@davidlmobley
Copy link
Contributor Author

@jchodera : One other point of confusion. Bayly's file has lines like this:

[a,A]-[#6X4]-[#8X2]-[!#1]   parTor ( (3, 1.15, 0.0, 3) ) parNum t0003 {X -CT-OS-X from frcmod.Frosst_AlkEthOH}
[#1]-[#6X4]-[#6X4]-[#1]     parTor ( (1, 0.15, 0.0, 3) ) parNum t0004 {HC-CT-CT-HC from frcmod.Frosst_AlkEthOH}

Notice that the first entry in the torsional parameters is different for the two. What does this entry mean? It's in the AMBER formats. But I'm not seeing how it makes it into your XML file, which only has periodicity, phase, and k:

   <Proper smirks="[a,A:1]-[#6X4:2]-[#8X2:3]-[!#1:4]" periodicity1="3" phase1="00" k1="1.15"/> <!-- X -CT-OS-X from frcmod.Frosst_AlkEthOH -->
   <Proper smirks="[#1:1]-[#6X4:2]-[#6X4:3]-[#1:4]" periodicity1="3" phase1="0.0" k1="0.15"/> <!-- HC-CT-CT-HC from frcmod.Frosst_AlkEthOH -->

@davidlmobley
Copy link
Contributor Author

OK, I fixed THAT issue by adding the rest of the contents to the XML file. Onwards and upwards to the next issue, but first I have to continue to add more useful output to openmoltools.

@davidlmobley
Copy link
Contributor Author

OK, so now I'm getting issues with torsional spring constants, which possibly could be unit issues or something. I'll have to investigate; however, I have to take a break to write a grant progress report, etc. So I'll just leave this here:

Compared torsion involving atoms 'O3 - C7 - O2 - H16' with that involving atoms 'O3 - C7 - O2 - H16':
Traceback (most recent call last):
 ...
    raise Exception( "Error: (proper) PeriodicTorsionForce strength (%d, %d, %d, %d) (%d, %f) has values of %f and %f, respectively." % (i0, i1, i2, i3, per, phase, val0, val1) )
Exception: Error: (proper) PeriodicTorsionForce strength (9, 6, 8, 25) (3, 0.000000) has values of 0.697333 and 2.092000, respectively.

@davidlmobley
Copy link
Contributor Author

@jchodera : This comment could still use your attention when you have a moment. #38 (comment)

@jchodera
Copy link
Member

jchodera commented Jul 4, 2016

@jchodera : One other point of confusion. Bayly's file has lines like this:

[a,A]-[#6X4]-[#8X2]-[!#1]   parTor ( (3, 1.15, 0.0, 3) ) parNum t0003 {X -CT-OS-X from frcmod.Frosst_AlkEthOH}
[#1]-[#6X4]-[#6X4]-[#1]     parTor ( (1, 0.15, 0.0, 3) ) parNum t0004 {HC-CT-CT-HC from frcmod.Frosst_AlkEthOH}

Notice that the first entry in the torsional parameters is different for the two. What does this entry mean?

That's the periodicity, which corresponds to periodicity in our attributes list for SMIRFF XML.

@jchodera
Copy link
Member

jchodera commented Jul 4, 2016

Can you clarify what your error string means?

Exception: Error: (proper) PeriodicTorsionForce strength (9, 6, 8, 25) (3, 0.000000) has values of 0.697333 and 2.092000, respectively.

What does value mean here? You're reporting the value of what, exactly? A parameter? An energy? In what units?

@jchodera
Copy link
Member

jchodera commented Jul 4, 2016

Whatever it is, it's off by a factor of 3. For an X-C-O-H torsion with a presumably tetrahedral carbon, there would be three torsion terms acting on that central -C-O- bond. As noted earlier, I dimly recall that AMBER may divide the barrier heights (k) by the number of torsions impinging on the central bond when assigning parameters in LEaP, which differs from what OpenMM does (which is just to sum the torsions). This might be going on here.

@jchodera
Copy link
Member

jchodera commented Jul 4, 2016

If true, fixing that would first require a philosophical discussion with @cbayly13: Do we want to adopt this behavior during torsion assignment in SMIRFF forcefields, or do we want to apply torsions additively, so that each torsion term always contributes the true barrier height in the parameter?

@davidlmobley
Copy link
Contributor Author

@jchodera : There are several issues here and I'm going to go from most important to least important.

@jchodera : One other point of confusion. Bayly's file has lines like this:

[a,A]-[#6X4]-[#8X2]-[!#1] parTor ( (3, 1.15, 0.0, 3) ) parNum t0003 {X -CT-OS-X from frcmod.Frosst_AlkEthOH}
[#1]-[#6X4]-[#6X4]-[#1] parTor ( (1, 0.15, 0.0, 3) ) parNum t0004 {HC-CT-CT-HC from frcmod.Frosst_AlkEthOH}
Notice that the first entry in the torsional parameters is different for the two. What does this entry mean?
That's the periodicity, which corresponds to periodicity in our attributes list for SMIRFF XML.

OK, so the point I'm making here is that there are FOUR parameters for each line - there is the periodicity, the force constant, the phase, and (some other number). You've implemented these in a file which only gives THREE parameters for each line. Specifically:

<Proper smirks="[a,A:1]-[#6X4:2]-[#8X2:3]-[!#1:4]" periodicity1="3" phase1="0.0" k1="1.15"/> <!-- X -CT-OS-X from frcmod.Frosst_AlkEthOH -->
   <Proper smirks="[#1:1]-[#6X4:2]-[#6X4:3]-[#1:4]" periodicity1="3" phase1="0.0" k1="0.15"/> <!-- HC-CT-CT-HC from frcmod.Frosst_AlkEthOH -->

So, if the FIRST parameter is the periodicity, then you've implemented these incorrectly, since you've said in the XML that the periodicity is 3 for both of the above, but the first parameter is 3 for one of them and 1 for the other. On the other hand, if the FOURTH parameter is the periodicity then they are implemented correctly, and the question is: what's the first parameter?

Regarding the error string:

Can you clarify what your error string means?

Exception: Error: (proper) PeriodicTorsionForce strength (9, 6, 8, 25) (3, 0.000000) has values of 0.697333 and 2.092000, respectively.
<
What does value mean here? You're reporting the value of what, exactly? A parameter? An energy? In what units?

This is NOT my error string. It's what was already in openmoltools from Kyle, and I have precisely the same questions as you. I'll implement that as soon as I can sort out the answers. Right now my guess is that these are (periodicity, phase) and constant, constant, but I have no idea what the units are.

Also, it particularly bothered me that these provide no information whatsoever about which atoms these actually are aside from their numbers. I've now got it so it prints that info out, but I'll have to get back to the other parts later.

@davidlmobley
Copy link
Contributor Author

It's also worth noting that the distinction between 1 and 3 among those four parameters might explain the factor of 3 difference...

@davidlmobley
Copy link
Contributor Author

Ahh, @jchodera , from the AMBER documentation (http://ambermd.org/formats.html):

IDIVF      The factor by which the torsional barrier is divided.
                    Consult Weiner, et al., JACS 106:765 (1984) p. 769 for
                    details. Basically, the actual torsional potential is

                           (PK/IDIVF) * (1 + cos(PN*phi - PHASE))

         PK         The barrier height divided by a factor of 2.

         PHASE      The phase shift angle in the torsional function.

                    The unit is degrees.

         PN         The periodicity of the torsional barrier.
                    NOTE: If PN .lt. 0.0 then the torsional potential
                          is assumed to have more than one term, and the
                          values of the rest of the terms are read from the
                          next cards until a positive PN is encountered.  The
                          negative value of pn is used only for identifying
                          the existence of the next term and only the
                          absolute value of PN is kept.

So I think that first number in the files is IDIVF. Right now you do not have IDIVF in your XML files even though it's in the mock-up, and it seems to vary between 1 and 3 in the mock-up. If OpenMM doesn't use this, then it seems like we would need to (a) read IDIVF if present, and (b) divide the barrier height by IDIVF if present. This would allow us to specify AMBER-style parameters by specifying IDIVF, and to NOT by just leaving it out. Does this sound right to you, @jchodera ?

That would explain the factor of 3, as in this set IDIVF is always 1 or 3, it seems.

@jchodera
Copy link
Member

jchodera commented Jul 4, 2016

Thanks for the clarification!

I can implement an optional idivf attribute for torsion XML tags---no problem! Create an issue and tag me?

I can also take a look at making the feedback for that system comparison code more informative if you can create an issue in openmoltools and tag me, ideally pointing out where that function is located in the codebase.

Thanks again for digging into the details here, and apologies for the bugs!

@davidlmobley
Copy link
Contributor Author

I can implement an optional idivf attribute for torsion XML tags---no problem! Create an issue and tag me?

I'll create an issue and tag you.

I don't think you need to work on making the system comparison code more informative. I can sort it out eventually. I just have to change gears for the rest of today. I'd rather have you work on the stuff I can't do. :)

@davidlmobley davidlmobley changed the title [WIP] Cross-check SMIRFF parm@frosst energies against parm@frosst energies from AMBER files Cross-check SMIRFF parm@frosst energies against parm@frosst energies from AMBER files Jul 4, 2016
@davidlmobley davidlmobley merged commit ff6bfcc into master Jul 4, 2016
@davidlmobley
Copy link
Contributor Author

Merged this so that @jchodera can implement the IDIVF flag. I will start a new PR to continue with related testing once that flag is in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants