3D-vac pipeline #163

gcroci2 · 2023-09-26T09:32:44Z

We want to add ready-to-use notebooks that run the entire 3D-Vac pipeline. In particular, we can develop two notebooks:

1. 3D modeling notebook. Given one or more peptide-protein complex sequences as input, build 3D structure models with PANDORA and output them as PDB files. @DarioMarzella
2. Featurization and prediction script. @gcroci2
- 2.1 Use deeprank2 to featurize the structures and save them into HDF5 files.
- 2.2 Run a pre-trained GNN model on the featurized data. Side note: we need to re-train the GNN architecture on all the data we have available (~100k), using the best-selected parameters as concluded in issue #151 ("Finalize GNNs for the scientific paper").
- 2.3 Print the predictions and report the classification threshold selected on the validation set of the shuffled data configuration.
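As a rough illustration of step 2.3, here is a minimal sketch of how the model outputs could be turned into labels once a threshold is known. The function name `label_predictions`, the complex IDs, and the scores are hypothetical, and the convention "score >= threshold means positive" is an assumption rather than something confirmed in this issue. The threshold used in the example is the value reported later in this thread (0.5151).

```python
def label_predictions(scores, threshold):
    """Map each complex ID to (score, label).

    Assumption: a score greater than or equal to the threshold is
    labeled 1 (positive), otherwise 0 (negative).
    """
    return {cid: (score, int(score >= threshold)) for cid, score in scores.items()}


if __name__ == "__main__":
    # Hypothetical model outputs keyed by hypothetical complex IDs.
    example_scores = {"complex_A": 0.91, "complex_B": 0.20, "complex_C": 0.5151}
    threshold = 0.5151  # value reported later in this thread

    for cid, (score, label) in label_predictions(example_scores, threshold).items():
        print(f"{cid}\tscore={score:.4f}\tlabel={label}")
```

Keeping the threshold as an explicit argument makes it easy to swap in a new value if the model is re-trained.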

gcroci2 · 2023-10-25T16:07:47Z

The DeepRank2 part (data processing + testing) is in the script src/4_train_models/DeepRank2/GNN/pre-trained_testing.py.

The threshold selected by maximizing MCC on the validation set of the shuffled data configuration is 0.5151 (AUC on test 0.8565, MCC on test 0.5582, from exp_100k_std_transf_bs64_naivegnn1_wloss_0_230607 as described in #151).
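For readers unfamiliar with this kind of threshold selection, the idea can be sketched in plain Python as below. This is not the code used for the experiment; the helper names `mcc` and `select_threshold`, the toy labels and scores, and the choice of candidate thresholds (every distinct score in the validation set) are assumptions made for illustration.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def select_threshold(y_true, scores):
    """Return (threshold, mcc) for the candidate threshold with the highest MCC.

    Candidates are the distinct scores; ties keep the lowest threshold.
    """
    best_thr, best_mcc = None, -1.0
    for thr in sorted(set(scores)):
        y_pred = [1 if s >= thr else 0 for s in scores]
        value = mcc(y_true, y_pred)
        if value > best_mcc:
            best_thr, best_mcc = thr, value
    return best_thr, best_mcc


if __name__ == "__main__":
    # Toy validation labels and scores, for illustration only.
    val_labels = [0, 0, 1, 1]
    val_scores = [0.1, 0.4, 0.6, 0.9]
    print(select_threshold(val_labels, val_scores))
```

The threshold is chosen on the validation set only and then applied unchanged to the test set, which is why the test AUC and MCC above are reported separately.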

Any suggestions for improvement? @LilySnow, @DarioMarzella. Otherwise, I am done with the DeepRank2 part.

gcroci2 · 2024-01-12T15:05:46Z

Note that the script src/4_train_models/DeepRank2/GNN/pre-trained_testing.py for now runs only with this branch of DeepRank2, since the edits are still under review in PR515 (but will be merged soon).

The relevant scripts are now in src/6_test_cases/. @DarioMarzella will further finalize the part that generates the PDB files (currently in src/6_test_cases/generate_pdb_test_case.py).

gcroci2 added docs pMHC-I GNNs production labels Sep 26, 2023

gcroci2 assigned gcroci2 and DarioMarzella Sep 26, 2023

github-project-automation bot added this to Development Aug 29, 2024

github-project-automation bot moved this to In progress in Development Aug 29, 2024
