Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

watched_tasks: maintain a list of spawned async tasks and propagate errors #47

Merged
merged 8 commits into from
Jan 19, 2024

Conversation

hnez
Copy link
Member

@hnez hnez commented Oct 2, 2023

This PR is based on an idea from a discussion in PR #43, where it was noticed that errors from tasks and threads propagate nowhere.

This PR fixed that by assembling a list of threads and tasks that should not complete before the end of the program and propagating all errors from these threads and tasks to main() and beyond.

This touches a lot of places, as we spawn quite a lot of async_std tasks (the tacd logs now say "Spawned 58 tasks tasks and 5 threads").

This PR allows us to add better error handling for tasks and threads (we can for example use the ? operator to propagate errors out and do not have to resort to .unwrap() to at least crash the tacd from within a task on error). Adding this better error handling is out of scope for this PR, as it is already large enough.

@hnez
Copy link
Member Author

hnez commented Oct 4, 2023

The failed cargo deny run should be fixed by the dependency updates in #42 and has nothing to do with this PR.

Copy link
Member

@KarlK90 KarlK90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I like the changes and implementation. I thought whether or not the async runtime provides mechanisms to query running tasks and see if anything failed like tokio-console, but this might be messy (or not even possible). Anyway, there are some minor things that should be corrected but apart from that LGTM.

src/watched_tasks.rs Outdated Show resolved Hide resolved
src/adc/iio/hardware.rs Outdated Show resolved Hide resolved
src/adc/iio/hardware.rs Outdated Show resolved Hide resolved
src/adc/iio/hardware.rs Outdated Show resolved Hide resolved
src/watched_tasks.rs Outdated Show resolved Hide resolved
@hnez
Copy link
Member Author

hnez commented Oct 6, 2023

While addressing your comments I have also changed the way ThreadHandle notifies a task about the completion of a thread.
Using a async_std::channel there was somewhat of a hack so I do not have to work with Future and Waker directly, because that sounded somewhat scary.
As it turns out working with Wakers is not scary at all, so we can just use one here:

impl ThreadHandle {
fn new<F>(name: String, function: F) -> Result<Self>
where
F: FnOnce() -> TaskResult + Send + 'static,
{
let wake_on_exit = Arc::new(Mutex::new(None::<Waker>));
let wake_on_exit_thread = wake_on_exit.clone();
let handle = thread::Builder::new().name(name).spawn(move || {
// We could std::panic::catch_unwind() here in the future to handle
// panics inside of spawned threads.
let res = function();
// Keep the Mutex locked until exiting the thread to prevent
// a race with another thread checking for task completion and
// adding a waker.
let mut wake_on_exit = wake_on_exit_thread
.lock()
.map_err(|_| anyhow!("Tried to lock a tainted Mutex"))?;
if let Some(waker) = wake_on_exit.take() {
waker.wake();
}
res
})?;

This does however mean that you have to re-review that part.

@hnez
Copy link
Member Author

hnez commented Oct 6, 2023

The failed cargo clippy run should be fixed by #49 and has nothing to do with this PR.

Copy link
Member

@KarlK90 KarlK90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me now and will definitely improve the error handling story in the code base. Feel free to merge.

@jluebbe
Copy link
Member

jluebbe commented Oct 6, 2023

Adding a custom wrapper around standard APIs seems very invasive to me. Is this really such an unusual problem that there is no idiomatic way?

@hnez
Copy link
Member Author

hnez commented Oct 6, 2023

Hi @jluebbe

no idiomatic way

Not in async_std itself, so we would have to use a crate for it. I had a look around and found async_nursery which may be coerced into doing what we we want (minus the thread handling), but I would prefer not to use it because it does not look too mature (and does not support threads in addition to tasks).

I quite like the tailor-made solution implemented in this PR and think 206 not-that-dense lines of code are an acceptable complexity compared to adding a new dependency to handle this for us.

I would prefer merging this as-is rather than adding a new dependency.

@hnez
Copy link
Member Author

hnez commented Oct 6, 2023

The recently merged pull requests #33 and #34 made some changes to this PR necessary, as both of them spawn tasks.
Please re-review affected files.

@hnez hnez force-pushed the watched-tasks branch 5 times, most recently from 3114a63 to ea5cc2b Compare November 9, 2023 11:13
@hnez
Copy link
Member Author

hnez commented Nov 9, 2023

I've rebased this PR on top of recent main and think it should be the next one we merge.

@KarlK90
Copy link
Member

KarlK90 commented Nov 9, 2023

I remember that @jluebbe wanted some written commentary why we choose to implement a bespoke solution and not use async_nursery or tokio's JoinSet. Could you add this to watched_tasks.rs? After that I'm happy to approve this.

@hnez hnez self-assigned this Nov 23, 2023
@hnez hnez force-pushed the watched-tasks branch 2 times, most recently from f66b1b4 to 0612c98 Compare January 19, 2024 09:32
hnez added 4 commits January 19, 2024 11:05
The WatchedTasksBuilder makes sure that the program ends if any of the
tasks fails and also handles error propagation from the tasks to
the main() function.

Signed-off-by: Leonard Göhrs <[email protected]>
Execution should stop if any of the important threads ends, not just
one of the tasks.

Signed-off-by: Leonard Göhrs <[email protected]>
The large re-indentation would have made the commit that adds the
wtb.spawn_thread() call rather unreadable, as it mixes formatting
and code changes.
Instead add this commit that _only_ changes formatting.

Signed-off-by: Leonard Göhrs <[email protected]>
The large re-indentation would have made the commit that adds the
wtb.spawn_thread() call rather unreadable, as it mixes formatting
and code changes.
Instead add this commit that _only_ changes formatting.

Signed-off-by: Leonard Göhrs <[email protected]>
hnez added 4 commits January 19, 2024 11:06
While .expect() still means panicking it is considerd a better kind of
panicking than .unwrap().

Signed-off-by: Leonard Göhrs <[email protected]>
This allows us to display an error message if the server setup fails.

Signed-off-by: Leonard Göhrs <[email protected]>
Many small functions is a nice thing in general, but a three line function
that is only used once seems a bit over the top.

Signed-off-by: Leonard Göhrs <[email protected]>
@hnez
Copy link
Member Author

hnez commented Jan 19, 2024

I've added a comment …

// This is a wrapper around async_std::task:spawn() that keeps track of the
// tasks it spawned. This solves the problem of error propagation from tasks
// for us.
// When using async_std::task:spawn() you get a handle back that can be used
// to check if and how the task has completed, but there is no common way in
// async_std to have a list of tasks to watch at the same time.
// We want to keep track of the various long running tasks we spawn in the
// tacd and want to propagate errors from back to main().
// We also want the same, with a similar API for actual threads instead of
// async tasks. WatchedTasks does both.
//
// There are other solutions that do basically the same, but did not quite
// fit our needs:
//
// - async_nursery - Works for async_std but does not handle actual threads
// and does not look that great code-wise.
// - tokio JoinSet - Does roughly the same as WatchedTasks, but also without
// native thread support and - more importantly for tokio (which we do
// not use).

… that talks about why we went with our own task spawning wrapper instead of using something off-the-shelf.

Copy link
Member

@KarlK90 KarlK90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hnez The comment summarizes the discussion very well. You have my approval - feel free to merge.

@hnez hnez merged commit e94c996 into linux-automation:main Jan 19, 2024
10 checks passed
@hnez hnez deleted the watched-tasks branch January 19, 2024 14:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants