anndata initializer does not verify indices match on layers #1769
Comments
Happy to try to fix this and submit a PR, given some advice on how to proceed.
@amcpherson I think this should be ok, although I am not sure the correct intention for
Layers can be accessed in the same way as X, for instance
@amcpherson Those global indices are internally converted to integer-like indices (NumPy array, slice, int), so I don't think we would have a great way of special-casing this check, but I will wait for @flying-sheep to weigh in. To be honest, I would probably say this should be allowed, but I am not sure.
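To illustrate the conversion described above, here is a minimal sketch using plain pandas and NumPy. It is not anndata's internal code; the names `obs_names` and `X` are stand-ins chosen for illustration. It shows that once labels are resolved to integer positions, the raw array has no further knowledge of the labels.

```python
import numpy as np
import pandas as pd

# Stand-ins for an object's observation labels and its raw data matrix.
obs_names = pd.Index(["cell_a", "cell_b", "cell_c"])
X = np.array([[1, 2], [3, 4], [5, 6]])

# Labels are resolved to integer positions against the index...
positions = obs_names.get_indexer(["cell_c", "cell_a"])

# ...and only those positions are applied to the raw array.
subset = X[positions]
```

Because the array itself carries no labels, any alignment check has to happen before this conversion step.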
@ilan-gold, yes, exactly. The global indices are converted to integer-like indices, which is how it should be. An AnnData object encapsulates a set of axis-aligned objects, but the underlying representations are just raw NumPy arrays. This is the same as for .X. However, if we are trusting anndata to keep obs, var, X and layers axis-aligned, then it is really helpful to have some checks to ensure we don't do something wrong. I use anndata extensively, and this is the number one cause of bugs, especially among junior developers who do not know all the caveats of AnnData objects. Another gotcha is this:
This completely scrambles the relationship between obs and X/layers. It should either raise an exception, because the indices are not exactly equal (including order), or align the index of the new adata.obs. Ideally the former. Looking forward to hearing from @flying-sheep.
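The two options mentioned here (raise on any mismatch, or align the new obs to the existing order) could look roughly like the sketch below. `checked_obs` is a hypothetical helper written against plain pandas, not an anndata API, and it assumes the existing observation names are available as a `pd.Index`.

```python
import pandas as pd

def checked_obs(current_obs_names, new_obs, align=False):
    """Return a version of new_obs that is safe to assign (hypothetical helper)."""
    # Index.equals is True only for the same labels in the same order.
    if new_obs.index.equals(current_obs_names):
        return new_obs
    same_labels = (
        new_obs.index.is_unique
        and set(new_obs.index) == set(current_obs_names)
    )
    if align and same_labels:
        # Reorder the rows of new_obs to follow the existing order.
        return new_obs.reindex(current_obs_names)
    raise ValueError("new obs index does not match existing obs_names")

names = pd.Index(["c1", "c2", "c3"])
shuffled = pd.DataFrame({"score": [30, 10, 20]}, index=["c3", "c1", "c2"])
aligned = checked_obs(names, shuffled, align=True)
```

With `align=False` (the stricter option preferred above), the same call raises `ValueError` instead of silently reordering.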
In that case, the flipside is people who expect `obs_index_changed = adata.obs.set_index('foo', inplace=False)` followed by `adata.obs = obs_index_changed` to work. We could distinguish the two cases by looking at the set of index values and ensuring they match, or something similar. But it may also be better to simply document that people need to be aware of this tradeoff we have made. I agree that this behavior is confusing, though. I think this requires a broader discussion, because we have no way of knowing what people's intentions are.
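One way to read the "look at the set of index values" idea is the heuristic sketched below. `classify_index_change` is a hypothetical function, not part of anndata, and the category names are assumptions made for illustration. It cannot recover intent; it only separates the case that looks like an accidental reorder from the case that looks like a deliberate relabel.

```python
import pandas as pd

def classify_index_change(old_index, new_index):
    """Guess what kind of change a new obs index represents (heuristic only)."""
    if new_index.equals(old_index):
        return "unchanged"
    if len(new_index) != len(old_index):
        return "length mismatch"
    if set(new_index) == set(old_index):
        # Same labels in a different order: likely a misaligned assignment.
        return "reordered"
    if set(new_index).isdisjoint(old_index):
        # Entirely new labels: likely an intentional set_index-style relabel.
        return "relabeled"
    return "ambiguous"

old = pd.Index(["c1", "c2", "c3"])
kind_reorder = classify_index_change(old, pd.Index(["c3", "c1", "c2"]))
kind_relabel = classify_index_change(old, pd.Index(["x", "y", "z"]))
```

A partial overlap falls through to "ambiguous", which is exactly where the intent question raised above remains open.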
Report
The AnnData initializer verifies that the index and columns of the input X match obs and var, but it does not do the same for layers. This can result in bugs, especially when initializing with layers but no X.
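Until such a check exists, a caller-side guard along these lines could be run before constructing the object. This is a sketch under assumptions: `check_layers_aligned` is a hypothetical helper, not an anndata function, and it assumes layers are passed as a dict that may contain pandas DataFrames.

```python
import pandas as pd

def check_layers_aligned(obs, var, layers):
    """Raise ValueError if a DataFrame layer's labels differ from obs/var."""
    for name, layer in layers.items():
        if not isinstance(layer, pd.DataFrame):
            continue  # raw arrays carry no labels to compare
        # Index.equals requires the same labels in the same order.
        if not layer.index.equals(obs.index):
            raise ValueError(f"layer {name!r}: rows do not match obs index")
        if not layer.columns.equals(var.index):
            raise ValueError(f"layer {name!r}: columns do not match var index")

obs = pd.DataFrame(index=["c1", "c2"])
var = pd.DataFrame(index=["g1", "g2"])
counts = pd.DataFrame([[1, 2], [3, 4]], index=["c1", "c2"], columns=["g1", "g2"])

check_layers_aligned(obs, var, {"counts": counts})  # passes silently
shuffled = counts.loc[["c2", "c1"]]  # same labels, different row order
```

Passing `{"counts": shuffled}` to the helper raises `ValueError`, which is the behavior the report asks the initializer to provide for layers.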
Code:
Versions