RDKit generation of new non-SMILES representation #3

kaichop · 2024-06-11T13:54:21Z

Assess the different features that can be generated with RDKit.

For example, convert the SMILES strings to Morgan fingerprints, use those as features, and train a simple neural network to predict binding. Assess the performance on held-out test data, and compare it with the scores currently reported on Kaggle so we know how much we need to improve.
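As a starting point, here is a minimal sketch of that pipeline: Morgan fingerprints from RDKit, fed into a one-hidden-layer network written in plain NumPy. The function names, the fingerprint radius and size, the hidden width, and the training settings are all assumptions for illustration, not values anyone has agreed on in this issue.

```python
import numpy as np


def morgan_fingerprint(smiles, radius=2, n_bits=2048):
    """Return the Morgan fingerprint of one SMILES string as a 0/1 vector."""
    # RDKit is imported here so the network code below also runs without it.
    from rdkit import Chem
    from rdkit.Chem import rdFingerprintGenerator

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    generator = rdFingerprintGenerator.GetMorganGenerator(radius=radius, fpSize=n_bits)
    return generator.GetFingerprintAsNumPy(mol).astype(np.float32)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_mlp(X, y, hidden=32, lr=0.2, epochs=500, seed=0):
    """Train a one-hidden-layer network with full-batch gradient descent.

    X has shape (n_samples, n_features); y holds 0/1 labels.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, np.sqrt(2.0 / d), size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, np.sqrt(1.0 / hidden), size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        h = np.maximum(X @ W1 + b1, 0.0)      # ReLU hidden layer
        p = sigmoid(h @ W2 + b2)              # predicted binding probability
        g_out = (p - y) / n                   # gradient of mean cross-entropy w.r.t. logits
        g_h = np.outer(g_out, W2) * (h > 0)   # backpropagate through the ReLU
        W2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum()
        W1 -= lr * (X.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return W1, b1, W2, b2


def predict_proba(params, X):
    """Return predicted binding probabilities for the rows of X."""
    W1, b1, W2, b2 = params
    return sigmoid(np.maximum(X @ W1 + b1, 0.0) @ W2 + b2)
```

To use it, stack `morgan_fingerprint(s)` for each training SMILES into `X`, call `train_mlp(X, y)`, and score `predict_proba(params, X_test)` against the test labels. Full-batch gradient descent is only practical for a small subsample; for the full data set a mini-batch framework would be the more realistic choice.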

Paste the code here.

wangwpi · 2024-06-14T03:49:54Z

I have generated Morgan fingerprints, protein names (one-hot encoded), and binds (labels) for all training and validation data as NumPy arrays, split into chunks of 500,000 rows each. The data are located in "/mnt/isilon/wang_lab/shared/Belka/analysis/morgan" and "/mnt/isilon/wang_lab/shared/Belka/analysis/morgan_validation".
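For anyone consuming these chunks, the sketch below shows one way to write and read data in this shape. The file naming (`chunk_00000.npz`), the array keys (`fingerprint`, `protein`, `binds`), and the helper names are hypothetical; the files in the directories above may be laid out differently.

```python
from pathlib import Path

import numpy as np


def one_hot(names, vocabulary):
    """One-hot encode a sequence of protein names against a fixed vocabulary."""
    index = {name: i for i, name in enumerate(vocabulary)}
    columns = np.array([index[name] for name in names], dtype=np.intp)
    encoded = np.zeros((len(names), len(vocabulary)), dtype=np.uint8)
    encoded[np.arange(len(names)), columns] = 1
    return encoded


def save_chunks(out_dir, fingerprints, proteins, binds, chunk_size=500_000):
    """Write aligned arrays to disk in files of at most chunk_size rows."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for k, start in enumerate(range(0, len(binds), chunk_size)):
        stop = start + chunk_size
        path = out_dir / f"chunk_{k:05d}.npz"  # hypothetical naming scheme
        np.savez_compressed(
            path,
            fingerprint=fingerprints[start:stop],
            protein=proteins[start:stop],
            binds=binds[start:stop],
        )
        paths.append(path)
    return paths


def iter_chunks(chunk_dir):
    """Yield (features, labels) one chunk at a time to keep memory bounded."""
    for path in sorted(Path(chunk_dir).glob("chunk_*.npz")):
        with np.load(path) as data:
            # Features are the fingerprint bits followed by the protein one-hot.
            features = np.hstack([data["fingerprint"], data["protein"]])
            labels = data["binds"]
        yield features, labels
```

Iterating chunk by chunk means only one file of 500,000 rows needs to be in memory at a time, which suits mini-batch training.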

wangwpi self-assigned this Jun 14, 2024
