-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: Timeseries can have timestamps + data that don't match in length #1536
Comments
Hey there, When running the latest version of PyNWB, (v2.1.0 - try UserWarning: <Object type> '<Object name>': Length of data does not match length of timestamps. Your data may be transposed. Time should be on the 0th dimension. whenever such a size mismatch occurs. As for why it's only a warning instead of an error, I believe the reasoning was to not break back-compatibility (is that right @lry @bendichter @oruebel?) or otherwise be too strict, as it can be a commonly observed mistake to unintentionally transpose the dimensions of a This is also a |
Yes, that is correct. If we had made it an error then read of files that have this issue would not work due to the ValueError being triggered on read. However, as of the most recent HDMF release, we now have the ability to detect in Long story short, with this, the check whether timestamps have the correct length could be updated to raise an error in user mode and still preserve backward compatibility by only raising a warning when reading an invalid file. CC @weiglszonja since she implement this behavior for |
Got it, thank you both! I will update my version to get my desired behavior as a warning for now, which will at least alert me to it.
… On Aug 11, 2022, at 3:28 PM, Oliver Ruebel ***@***.***> wrote:
As for why it's only a warning instead of an error, I believe the reasoning was to not break back-compatibility (is that right
Yes, that is correct. If we had made it an error then read of files that have this issue would not work due to the ValueError being triggered on read.
However, as of the most recent HDFM release, we now have the ability to detect in __init__ of Container classe (e.g,. TimeSeries) whether we are in construct mode (i.e., automatic construction during read) or not. This was implemented in hdmf-dev/hdmf#751 <hdmf-dev/hdmf#751> . The following #1516 <#1516> shows how this can be used to implement checks in a way that you warn when reading from file and raise an error when constructing new data. Essentially, what it boils down in most cases is that you simply call self._error_on_new_warn_on_construct(error_msg="my message") in __init__, which will either: 1) raise a ValueError when in user mode, 2) raise a warning when in construct mode (i.e., read from file), or 3) do nothing if error_msg=None is passed. In this way you can implement check function to either return a string or None and then simpy call self._error_on_new_warn_on_construct(error_msg=self._check_timestamps_valid()) in __init__. If you need more complex behavior then you can also check if self._in_construct_mode: directly to distinguis between behavior during constuction on read and new creation by the user.
Long story short, with this, the check whether timestamps have the correct length could be updated to raise an error in user mode and still preserve backward compatibility by only raising a warning when reading an invalid file.
—
Reply to this email directly, view it on GitHub <#1536 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AQKOLZCCDVG3E5FXHSWYPGDVYVH4NANCNFSM56JF4MRA>.
You are receiving this because you authored the thread.
|
Thank you for raising this, I started working on a PR that should fix this behavior. |
What happened?
I use the convenient timeseries pointer feature, where you can just store timestamps in one timeseries and have many others point to them. This saves a lot of memory! However, I just fixed a bug that arose because some data I stored in a timeseries didn't have the same length as the timeseries (due to a processing mistake). Then I went and checked, and it turns out that timeseries' timestamps don't need to have the same length as the data at all! I consider this a bug, given that timeseries are meant as a core unit of data in NWB, but let me know if I'm just using them wrong.
Steps to Reproduce
Traceback
Operating System
macOS
Python Executable
Conda
Python Version
3.9
Package Versions
nwb 2.0.0
Code of Conduct
The text was updated successfully, but these errors were encountered: