Strip water molecules from all topology/coordinate files in current database #21
Comments
Which topology/coordinate files in particular are of interest? The Amber ones? I might have some time to make progress on this today. |
This should be the GROMACS ones, as I always solvated things after On Fri, Apr 17, 2015 at 10:25 AM, John Chodera [email protected]
David Mobley |
Is a workflow in which we first solvate in AMBER Also, if there's already an issue on the preferred way to generate these files, my apologies---feel free to just post a pointer. |
I have not validated whether acpype handles box conversions properly. (At
one point in the past, it did not). So normally I just prep the molecule
itself in AMBER and then solvate in GROMACS. Do you know?
(We should create an issue on GitHub to lay out the protocol for
re-generating everything from the source data. I'm working on figuring out
who in my lab can go ahead and do this, but as noted that's a separate
issue - the most immediate solution is just to strip the waters.)
|
I don't think we can invest any time in trying to fix up manually curated files with throwaway scripts. If we do put time into this, it has to be to establish automated pipelines that build this from the ground up. Creating a workflow to create unsolvated and solvated AMBER Info on |
See #22 |
For now, we absolutely ought to be doing the same thing we (in my group) (And, if my student for some reason takes a while to get this done, I'm not |
This was resolved by the full rebuild of the database for version 0.5, in #28. |
Because of prior manual curation of files, not all topology and coordinate files contain water molecules. In addition, I just found out (from Sereina Riniker; e-mail excerpt below) that some of them contain TIP4P-Ew water molecules rather than TIP3P. Again, this is a result of gathering the topology/coordinate files for these manually (in some cases by students). The best long-term solution is to regenerate all topology/coordinate files from the original source data (Issue #20), but an interim solution is simply to strip all water molecules from the existing topology/coordinate files.
Riniker's e-mail said this, in part:
"Regarding the [input files] I noticed two things which I thought you might like to know if you do not already. In the most recent version v0.31, I encountered 78 molecules where the GROMACS coordinate file .gro does not contain the solvent coordinates. In addition, there are 23 molecules where the solvent model in the coordinate file is not TIP3P (it contains 4 coordinates per solvent molecule). I attach the list of molecule numbers in case you would like to have a look at them."
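The check described in the e-mail (no solvent coordinates at all, or 4 coordinates per solvent molecule instead of 3) could be reproduced with something like the sketch below. It is only an illustration, not the script anyone in this thread actually used. It assumes the standard fixed-width `.gro` layout (residue number in columns 1-5, residue name in columns 6-10) and assumes that water residues are named `SOL`, `WAT` or `HOH`; both the residue-name set and the function names are hypothetical and may need adjusting for these files.

```python
from itertools import groupby

# Assumption: residue names used for water in these files.
WATER_RESNAMES = {"SOL", "WAT", "HOH"}


def water_site_counts(gro_text, water_resnames=WATER_RESNAMES):
    """Return the set of atoms-per-water-molecule counts found in a .gro file."""
    lines = gro_text.splitlines()
    n_atoms = int(lines[1])              # second line holds the atom count
    atom_lines = lines[2:2 + n_atoms]    # last line (box vectors) is excluded

    counts = set()
    # Group consecutive atom lines sharing residue number and residue name,
    # so that residue numbers wrapping around in large boxes do no harm.
    for (_, resname), group in groupby(
        atom_lines, key=lambda l: (l[0:5], l[5:10].strip())
    ):
        if resname in water_resnames:
            counts.add(sum(1 for _ in group))
    return counts


def describe_water(gro_text):
    """Classify a .gro file as 'no water', '3-site', '4-site' or 'other'."""
    counts = water_site_counts(gro_text)
    if not counts:
        return "no water"
    if counts == {3}:
        return "3-site"
    if counts == {4}:
        return "4-site"
    return "other"
```

A "4-site" result only says the solvent has four coordinates per molecule, which is consistent with TIP4P-Ew but does not by itself identify the exact model.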
The compound ID numbers for setups with TIP4P are:
1323538
1728386
186894
1873346
1875719
1923244
2005792
2049967
20524
2068538
2178600
2972906
3053621
3727287
3738859
4035953
511661
5157661
525934
5449201
8427539
9055303
9979854
And those for setups with no water are:
1034539
1160109
1469079
172879
1893815
1905088
1944394
2126135
2316618
242480
2484519
2492140
2613240
2636578
2659552
2844990
2845466
2850833
2960202
2972345
3040612
3083321
3211679
3265457
3269819
3359593
3515580
3686115
3802803
3976574
4149784
4371692
4479135
4587267
4603202
4613090
4678740
4689084
486214
4936555
5003962
5006685
5282042
5371840
5456566
5510474
5538249
5561855
5616693
5917842
6102880
6190089
6195751
6198745
628951
6359156
667278
6688723
6935906
7239499
7417968
7676709
7913234
8052240
819018
8208692
8311303
8337722
8823527
8827942
8883511
9257453
9510785
9653690
9717937
9741965
9821936
9897248
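For the interim fix proposed above (stripping all water molecules), a minimal sketch for the GROMACS coordinate files might look like the following. It assumes the standard `.gro` layout (title line, atom count, fixed-width atom lines, box line) and assumes water residues are named `SOL`, `WAT` or `HOH`; the function name and the residue-name set are hypothetical. It does not renumber atoms, and it does not touch the topology, so the matching water entry in each `.top` file would still need to be removed separately.

```python
# Assumption: residue names used for water in these files.
WATER_RESNAMES = {"SOL", "WAT", "HOH"}


def strip_waters(gro_text, water_resnames=WATER_RESNAMES):
    """Return .gro file contents with all water atoms removed."""
    lines = gro_text.splitlines()
    title = lines[0]
    n_atoms = int(lines[1])
    atom_lines = lines[2:2 + n_atoms]
    box_line = lines[2 + n_atoms]        # box vectors follow the atom lines

    # Residue name occupies columns 6-10 of each atom line.
    kept = [l for l in atom_lines if l[5:10].strip() not in water_resnames]

    # Rewrite the atom count so it matches the remaining atoms.
    return "\n".join([title, f"{len(kept):5d}", *kept, box_line]) + "\n"
```

Files that already lack water pass through unchanged apart from the normalized atom-count line, so the same function could be run over the whole database rather than only the compound IDs listed above.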