I found that the snapshot flow would hang when it tried to receive the exit status while collecting stdout and stderr from `zfs list -t snapshot`, preventing any new snapshots from being taken. This happens once the output exceeds a certain number of lines. I believe in our case it's because rudaux is now taking four times as many snapshots as it used to because of the section overrides. At the time the hanging started, around 20000 snapshots had been taken.
The issue is described here and is also described in a warning in the documentation here. I increased the `window_size` and `max_packet_size` and moved the exit status check until after reading stdout and stderr, as recommended in the comments and warning.
Also added polling on `exit_status_ready` to make sure the command has finished executing before reading stdout and stderr. rudaux on dsci-100-instructor has been running with these changes for the last three weeks and snapshots are now being taken properly again.
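
For reviewers, here is a minimal sketch of the ordering described above. It is not the exact code in this PR. It assumes the SSH layer is paramiko (the names `window_size`, `max_packet_size` and `exit_status_ready` suggest that), and `run_and_collect`, `poll_interval` and `chunk_size` are hypothetical names used only for illustration. One difference from the description: the sketch also drains stdout and stderr while polling, so a large listing cannot fill the channel window while we wait.

```python
import time


def run_and_collect(channel, command, poll_interval=0.1, chunk_size=65536):
    """Run `command` on an open paramiko-style channel.

    Returns (exit_status, stdout_bytes, stderr_bytes). The exit status is
    requested only after the output has been read, so a large output cannot
    block the remote side while we wait on the status.
    """
    channel.exec_command(command)
    stdout_chunks, stderr_chunks = [], []

    def drain():
        # Read whatever is currently buffered without blocking.
        while channel.recv_ready():
            stdout_chunks.append(channel.recv(chunk_size))
        while channel.recv_stderr_ready():
            stderr_chunks.append(channel.recv_stderr(chunk_size))

    # Poll until the remote command reports that it has finished.
    while not channel.exit_status_ready():
        drain()
        time.sleep(poll_interval)

    drain()  # pick up anything that arrived just before the command exited
    status = channel.recv_exit_status()  # safe now: output already consumed
    return status, b"".join(stdout_chunks), b"".join(stderr_chunks)


# Hypothetical usage with paramiko (sizes below are placeholders, not the
# values chosen in this PR):
#   transport = ssh_client.get_transport()
#   channel = transport.open_session(window_size=2**27, max_packet_size=2**27)
#   status, out, err = run_and_collect(channel, "zfs list -t snapshot")
```

The function only relies on the channel methods it calls, so it can be exercised with a stub channel in tests without a real SSH connection.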