Feature/id mapping #113
Looks great. I suppose the main thing in merging this would be coordinating the timing with the FGSEA merge, as you mentioned.
Were there any things that stood out to you re. BridgeDB? Any problems? Anything we should log for the future? It looks straightforward to me at first glance, but you'd have a better sense of the details.
Unless you have something you'd like to raise or ask specifically, I think we should just merge this in.
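No major problems using BridgeDB. A couple of things to mention would be...
The endpoint has a query parameter to limit the results to a particular type, like HGNC, but it doesn't work. I get back a list of IDs for each Ensembl ID and have to parse it. But it's no big deal.
The sample data file from Ruth has 39% of the Ensembl IDs not mappable (22594 of 57906). That file is from 2016, so it's probably not something to be concerned about, but worth mentioning.
We should probably do better data validation up front at some point. Things like blank lines, randomly corrupted lines, or anything else will probably error out in the FGSEA service. It would be cleaner to catch all that on the client and report errors properly. But that's work that we can punt.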
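To make the validation idea concrete, here is a minimal sketch of an up-front check for blank and corrupted lines. It assumes the input is a headerless, tab-separated file with a gene ID and a numeric rank per line; that format, and the name `validate_ranks`, are assumptions for illustration and not something this PR defines.

```python
import math
from typing import List, Tuple


def validate_ranks(text: str) -> Tuple[List[Tuple[str, float]], List[str]]:
    """Split input into valid (gene, rank) rows and readable error messages.

    Assumed format (hypothetical): one "gene<TAB>rank" pair per line, no header.
    """
    rows: List[Tuple[str, float]] = []
    errors: List[str] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            errors.append(f"line {lineno}: blank line")
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            errors.append(
                f"line {lineno}: expected 2 tab-separated fields, found {len(fields)}"
            )
            continue
        gene, raw_rank = fields[0].strip(), fields[1].strip()
        if not gene:
            errors.append(f"line {lineno}: missing gene ID")
            continue
        try:
            rank = float(raw_rank)
        except ValueError:
            errors.append(f"line {lineno}: rank {raw_rank!r} is not a number")
            continue
        if not math.isfinite(rank):
            errors.append(f"line {lineno}: rank {raw_rank!r} is not finite")
            continue
        rows.append((gene, rank))
    return rows, errors
```

Collecting every error instead of stopping at the first one would let the client report all problems in a single pass before anything is sent to the FGSEA service.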
OK, I’ll kick off an issue for preemptive client-side data validation tomorrow. We can use it to keep notes until we plan out the details.
I’ll ask Ruth and Veronique if they have a newer file. It would be nice to have one that's a bit more representative.
As for BridgeDB, it looks like it may be a known issue, although as you said it doesn’t block us: bridgedb/BridgeDbR#33
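For the record, the parsing workaround Mike describes (the type filter is ignored, so every mapped ID comes back for each Ensembl ID) could look roughly like the sketch below. The response shape is an assumption: one line per input ID with three tab-separated columns, where the third column is a comma-separated list of `code:value` pairs, `H` is the system code for HGNC symbols, and unmapped IDs carry a placeholder with no colon. The function name `parse_xrefs_batch` is hypothetical, and none of this is taken from the PR's actual code.

```python
from typing import Dict, List


def parse_xrefs_batch(body: str, wanted_code: str = "H") -> Dict[str, List[str]]:
    """Keep only the mapped IDs whose system code matches `wanted_code`.

    Assumed line shape (not confirmed by the PR):
        <source id> TAB <source name> TAB <code:value>,<code:value>,...
    """
    result: Dict[str, List[str]] = {}
    for line in body.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # blank or malformed line
        source_id, mapped = fields[0], fields[2]
        hits: List[str] = []
        for entry in mapped.split(","):
            code, sep, value = entry.strip().partition(":")
            # Entries without a colon (e.g. an unmapped placeholder) are skipped.
            if sep and code == wanted_code and value:
                hits.append(value)
        result[source_id] = hits
    return result
```

IDs that end up with an empty list are the unmappable ones, so counting them would give the same kind of figure as the 22594 of 57906 reported for the sample file.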
On Apr 25, 2023, at 19:39, Mike Kucera wrote:
No major problems using BridgeDB. A couple of things to mention would be...
The endpoint has a query parameter to limit the results to a particular type, like HGNC, but it doesn't work. I get back a list of IDs for each Ensembl ID and have to parse it. But it's no big deal.
The sample data file from Ruth has 39% of the Ensembl IDs not mappable (22594 of 57906). That file is from 2016, so it's probably not something to be concerned about, but worth mentioning.
We should probably do better data validation up front at some point. Things like blank lines, randomly corrupted lines, or anything else will probably error out in the FGSEA service. It would be cleaner to catch all that on the client and report errors properly. But that's work that we can punt.
It looks like this issue already addresses validation: Improved validation of input files bridgedb/BridgeDb#42
General information
Associated issues: bridgedb/BridgeDb#112
Checklist
Author:
Reviewers:
Notes
Adds Ensembl ID mapping...