Syncing the current folder structure across separate acquisition machines. #373

JoeZiminski · 2024-04-25T14:03:03Z

A problem with acquiring data across machines is that the subject / session numbers can get out of 'alignment'. For example, you may have two acquisition machines (ephys, behaviour). On one, you accidentally create a 'sub-002' that is empty. Now when you automatically get the next subject, it is 'sub-003' on that machine and 'sub-002' on the other.
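To make the misalignment concrete, here is a minimal sketch. It is not datashuttle's implementation: `next_sub` is a hypothetical helper that only looks at folder names, and the folder lists are made up to match the example above.

```python
import re

# Matches names such as "sub-001" or "sub-001_date-20240425"
SUB_PATTERN = re.compile(r"^sub-(\d+)")


def next_sub(folder_names):
    """Suggest the next 'sub-XXX' name from existing folder names (hypothetical helper)."""
    numbers = [
        int(match.group(1))
        for name in folder_names
        if (match := SUB_PATTERN.match(name))
    ]
    return f"sub-{max(numbers, default=0) + 1:03d}"


ephys = ["sub-001", "sub-002"]  # 'sub-002' was created by accident and is empty
behaviour = ["sub-001"]

print(next_sub(ephys))      # sub-003
print(next_sub(behaviour))  # sub-002

# If both machines could see every folder name, the suggestions would agree.
print(next_sub(set(ephys) | set(behaviour)))  # sub-003
```

The two machines only disagree because each one sees its own folders; once the names are pooled, both get the same suggestion.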

This is a general problem of acquisition pipelines across machines and is not datashuttle-specific. However, it would be nice to include as many protections against it, and ways to mitigate it, in datashuttle as possible.

Some ways to handle this are:

1. The current way, which is sub-optimal. When data is transferred from the local machines to central, central contains the most up-to-date project. When using get_next_sub or ses, the search (by default) includes central, so the correct recommendations should be made. However, it is not reasonable to expect that data across all acquisition machines is transferred immediately after the session is finished (e.g. you might have two mice back to back).
2. A layer of protection can be added quite easily by writing a metadata file to hidden .datashuttle folders at the subject / session level whenever a folder is created. This could include the time that the folder was written. Then checks can be run, for example if ses-002 and ses-003 were created within 1 minute of each other, one of them is probably a mistake.
3. The best solution will be to write metadata detailing the local folder structure and send this to central immediately (i.e. when folders are created). Then central can always contain a total overview of folders on the project at any time. This could be updated either when a) folders are created or b) data is transferred. This is a bigger job and a lot of care will have to be taken to consider the many possible edge cases. It will be necessary to expose the central directory tree through the GUI / through a datashuttle method that prints the file tree.
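The timestamp check proposed above (a metadata file in a hidden .datashuttle folder) could look roughly like the sketch below. Everything except the `.datashuttle` folder name is an assumption made for illustration: the `created.json` file name, the JSON layout, and the `make_folder` / `suspicious_pairs` helpers are hypothetical and not part of datashuttle.

```python
import json
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path

META_DIR = ".datashuttle"   # hidden folder name from the proposal
META_FILE = "created.json"  # hypothetical file name


def make_folder(path, created=None):
    """Create a folder and record its creation time in a hidden metadata file."""
    meta_dir = Path(path) / META_DIR
    meta_dir.mkdir(parents=True, exist_ok=True)
    created = created or datetime.now(timezone.utc)
    (meta_dir / META_FILE).write_text(json.dumps({"created": created.isoformat()}))


def read_created(path):
    """Return the recorded creation time, or None if no metadata was written."""
    meta_file = Path(path) / META_DIR / META_FILE
    if not meta_file.is_file():
        return None
    return datetime.fromisoformat(json.loads(meta_file.read_text())["created"])


def suspicious_pairs(parent, prefix="ses-", min_gap=timedelta(minutes=1)):
    """List pairs of folders whose creation times are closer together than min_gap."""
    stamped = []
    for folder in Path(parent).glob(f"{prefix}*"):
        created = read_created(folder)
        if folder.is_dir() and created is not None:
            stamped.append((created, folder.name))
    stamped.sort()
    return [
        (first, second)
        for (t_first, first), (t_second, second) in zip(stamped, stamped[1:])
        if t_second - t_first < min_gap
    ]


with tempfile.TemporaryDirectory() as tmp:
    sub = Path(tmp) / "sub-001"
    start = datetime(2024, 4, 25, 14, 0, tzinfo=timezone.utc)
    make_folder(sub / "ses-001", start)
    make_folder(sub / "ses-002", start + timedelta(days=1))
    make_folder(sub / "ses-003", start + timedelta(days=1, seconds=20))
    flagged = suspicious_pairs(sub)

print(flagged)  # [('ses-002', 'ses-003')]
```

The 1 minute threshold is taken from the example in the proposal; in practice it would probably need to be configurable per project.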

adamltyson · 2024-04-25T14:46:56Z

Some kind of "sync metadata" function would be great. Users could certainly run this before/after every session, and then sync the data itself later. As you say though, a lot of work.

niksirbi · 2024-04-25T15:09:53Z

I agree that the 3rd option, albeit more difficult, is the only permanent solution to this. We need to sync a representation of the project's tree structure without the actual files.
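As a rough illustration of what "a representation of the tree structure without the actual files" could be, here is a minimal sketch. The nested-dict format and the `folder_tree` / `merge_trees` helpers are assumptions made for this example, not an existing datashuttle API; how the snapshot is sent to central is left out.

```python
import json
import tempfile
from pathlib import Path


def folder_tree(root):
    """Return a nested dict of the folder names under root, ignoring all files."""
    return {
        path.name: folder_tree(path)
        for path in sorted(Path(root).iterdir())
        if path.is_dir()
    }


def merge_trees(tree_a, tree_b):
    """Combine two folder trees into one overview containing every folder from both."""
    merged = dict(tree_a)
    for name, subtree in tree_b.items():
        merged[name] = merge_trees(merged.get(name, {}), subtree)
    return merged


with tempfile.TemporaryDirectory() as tmp:
    ephys = Path(tmp) / "ephys_machine"
    behaviour = Path(tmp) / "behaviour_machine"

    (ephys / "sub-001" / "ses-001").mkdir(parents=True)
    (ephys / "sub-001" / "ses-001" / "recording.bin").write_bytes(b"\x00")
    (ephys / "sub-002").mkdir()  # the accidental, empty subject

    (behaviour / "sub-001" / "ses-001").mkdir(parents=True)
    (behaviour / "sub-001" / "ses-002").mkdir()

    ephys_tree = folder_tree(ephys)
    behaviour_tree = folder_tree(behaviour)

overview = merge_trees(ephys_tree, behaviour_tree)
print(json.dumps(overview, indent=2, sort_keys=True))
```

Because only folder names are stored, the snapshot is small enough to serialise (here as JSON) and send whenever a folder is created, long before the data itself is transferred.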

JoeZiminski added the enhancement (New feature or request) label · Apr 25, 2024

JoeZiminski mentioned this issue Jul 2, 2024

Add suggest next sub / ses including remote project to TUI #409

Open
