Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/id mapping #113

Merged
merged 3 commits into from
Apr 26, 2023
Merged

Feature/id mapping #113

merged 3 commits into from
Apr 26, 2023

Conversation

mikekucera
Copy link
Contributor

@mikekucera mikekucera commented Apr 24, 2023

General information

Associated issues: bridgedb/BridgeDb#112

Checklist

Author:

  • One or more reviewers have been assigned.
  • Automated tests have been included in this pull request, if possible, for the new feature(s) or bug fix.
  • The associated GitHub issues are included (above).
  • Notes have been included (below).

Reviewers:

  • All automated checks are passing (green check next to latest commit).
  • At least one reviewer has signed off on the pull request. Reviewers have two business days to review the pull request, after which the author may merge in the pull request unilaterally.

Notes

Adds Ensembl ID mapping...

  • Detects Ensembl IDs and makes a call to the BridgeDB service to map them to HGNC IDs.
  • The UI displays the HGNC names.
  • Logs any IDs that failed to get mapped in Sentry.
  • The ClassChooser component now automatically sets non-numeric columns to "ignored".
  • The sample data file 'GSE129943_rsem_counts_2016.txt' is now (finally) able to round-trip through the services and produces a usable network.
  • Note an update to the FGSEA service is required. I don't want to push the change to FGSEA yet because it will automatically get deployed before this pull request is approved.

@mikekucera mikekucera requested a review from maxkfranz April 24, 2023 19:36
Copy link
Member

@maxkfranz maxkfranz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. I suppose the main thing in merging this would be coordinating the timing with the FGSEA merge, as you mentioned.

Were there any things that stood out to you re. BirdgeDB? Any problems? Anything we should log for future? It looks straightforward to me at first glance, but you'd have a better sense of the details.

Unless you have something you'd like to raise or ask specifically, I think we should just merge this in.

@mikekucera
Copy link
Contributor Author

No major problems using BridgeDB. I couple things to mention would be...

  1. the endpoint has a query parameter to limit the results to a particular type, like HGNC, but it doesn't work. I get back a list of IDs for each Ensembl id and have to parse it. But its no big deal.
  2. The sample data file from Ruth has 39% of the Ensembl ids not mappable (22594 of 57906). That file is from 2016 so probably not something to be concerned about but worth mentioning.

We should probably do better data validation up front at some point. Things like blank lines, randomly corrupted lines, or anything else will probably error out in the FGSEA service. It would be cleaner to catch all that on the client and report errors properly. But that's work that we can punt.

@maxkfranz
Copy link
Member

maxkfranz commented Apr 26, 2023 via email

@mikekucera mikekucera merged commit 7d4575f into main Apr 26, 2023
@maxkfranz
Copy link
Member

It looks like this issue already addresses validation: Improved validation of input files bridgedb/BridgeDb#42

@mikekucera mikekucera deleted the feature/id-mapping branch May 2, 2024 16:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants