-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't allow unwinding from Drop impls #3288
base: master
Are you sure you want to change the base?
Conversation
I think it would also be good to add an explicit section to the RFC about how users will find out about the change in behavior, just to document that going in - for example, release blog post and compat note would be a good start in my eyes, but we could also consider some softer change (e.g., adjusting the panic logging to say "this will abort starting in X release) to phase this in. AIUI, there's not really much static checking we can do, so that's probably about our limit... |
I'm a huge fan of this proposal. When writing the rustonomicon this was one of the few major details of rust I couldn't really get a straight answer from anyone about. From my perspective it existed in the same grey area as unwinding through FFI, where we just lacked institutional expertise on the subtleties of unwinding corner cases and had punted on it. This is why I never really adequately documented it. The current behaviour is very subtle and not something even stdlib devs ever had institutional knowledge or consideration of. Writing exception-safe unsafe code is already enough of a headache, partially because it comes up rarely enough that it's hard to stay actively vigilant for. I think like, BinaryHeap is one of the only places in liballoc that has to do some explicit exception-safety fuckery because it needs to invoke the user-defined comparator repeatedly while doing mutations that put it in a transiently invalid state. All those stars lining up is actually surprisingly rare! Most of the time there are distinct lookup, allocate, then modify phases; so code is trivially exception-safe (assuming your own code doesn't panic, which collections generally only do during the lookup and allocations, not the final modification). Having less places where this rare problem can surprisingly come up, especially from user code, seems like a great thing. We've never had a good story about this situation and at this point I think we have ample evidence it should just be banished to the shadow realm for being more trouble than it's worth. Especially since the stdlib already has the established precedent of Buf and File both just swallowing errors on Drop. Those cases were going to be my one hand-wring for this change, so it turns out I have nothing to complain about! (My memory was that they panicked, but honestly I never really gave it much consideration). |
I'd like to rehash my earlier arguments against this change, for those starting from this issue. I'd like to begin with some statements that I think everyone here agrees with:
Thus, I believe that since this is a silent change that will harm some current use cases, we should really show that the benefit of this change is greater than the cost, and that there are neither more beneficial nor less costly alternatives. I argue that such an alternative exists: keep Currently, the RFC twice compares this change to C++11's The ability of Rust code to explicitly control dropping is quite relevant here. In C++, destruction is always invisible and uncontrollable (except if a value is placed into an The problem is, unsafe code today can easily implicitly drop values without realizing there's an issue; just look at the There are a few different variations on how the
|
I would like to echo that the RFC as currently written puts too much weight on the code size / compilation benefits of the change. These are a strong argument for making the option available, but distract from the correctness argument for making it mandatory. Presenting this in the main text which is concerned with justifying changing the semantics of existing code serves to distinctly weaken the justification, since the argued position is not necessary for a primary argued benefit. If the change is to be made, it should be justified on the unwind safety benefits alone, and the other benefits are just a very convenient bonus. There is additionally a third undiscussed option: panic-in-drop=catch. Effectively, this automatically converts types to the recommended I also need to point out what I think should be addressed, or at least acknowledged as a drawback: it is impossible to drop a structure where dropping its fields may panic without risking a panic-in-drop=abort. Even if you use A final really silly option: add If the compiler had some sort of Footnotes
|
I'll just note that the performance and code size changes are pretty inconsequential, for a break in backwards compatibility. 3-5% decrease in code size is irrelevant for all but the most resource-constrained applications, which will likely use Also, while the change would possibly simplify the correctness of unsafe code, it would compromise the correctness of the application at large, which may rely on properly handling Drop panics. This is a big issue, both because the application is much harder to audit due to its size and lack of compiler-enforced panic boundaries, and because the application is usually written by less experienced users, including total newbies, while unsafe code is relegated to the experts. The change in behaviour of Drop would be a breaking change, and a very insidious one, since it is entirely silent and hard to audit. That's the worst kind of breakage that can be introduced. It is also well-known that Rust currently allow panic in Drop, so many crates may rely on it. |
An attribute that specific unsafe types can put on their own drop impl if they need to be absolutely sure that no unwind can escape their drop is one thing. This is too much. |
I feel this is a throwing baby with the bathwater situation. I'd prefer exploration of alternative designs. For example:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've thought about it some more, and I am currently against changing the global default to panic-in-drop=abort
. I have a hypothesis that end of scope drops are not the problem to writing unwind-safe code. In most cases, unsafe code is written to have an unsafe precondition but a sound (or at least no-more-unsound) postcondition in a { suspend_safety; defer! { restore_safety; } tricky_code; return; }
pattern.
Because of this, most code, even unsafe
code, is unwind safe. Of course, the issue making unwind safety difficult is specifically that most code doesn't need to care, so when you do need to care, it's fairly easy to forget to. (This is an inherent limitation of operational pre/post-condition soundness semantics; an axiomatic invariant is just never put into an invalid unsound state in the first place.)
In order to justify changing the default, the RFC should justify the claim that RAII drop glue is an issue, and not just explicit drop_in_place
calls unwinding. I would much rather see (and am drafting) a proposal providing APIs for unsafe code to control unwinds more carefully; namely some combination of:
drop_unwind
which returnsOption<R>
rather thanResult<R, UnwindPayload>
to prevent Footgun withcatch_unwind
when catching panic-on-drop types rust#86027 issues.catch_unwind2
that forces explicitly deciding how to deal with payload drop panics.- Two main ways to do so: a
UnwindPayload
type which isManuallyDrop
, or a#[must_use]
Result
-variant that doesn't offer functionality implicitly dropping the unwind payload. - Even better if this can replace
catch_unwind
in-place. (Doing so in existing editions would require#[deprecated]
coercions for use of the current API or a strong argument for being a soundness fix that breaks inference.) - Author's current weak preference:
suspend_unwind
, which returnsResult<R, Unwind>
, and droppingUnwind
callsresume_unwind
.
- Two main ways to do so: a
#[abort_unwind]
to add unwind-to-abort shims to arbitrary functions.#[catch_unwind(return)]
to addcatch_unwind
shims returning some fallback value.- Some
#[drop_abort_unwind]
/#[drop_catch_unwind]
to do the above but on the full drop glue including dropping fields (as just shimmingDrop::drop
will not catch fields' drop glue unwinding). - Per-type control between leaking fields on drop glue unwind and suspending the unwind to drop fields.
drop_in_place_abort_unwinds
which calls drop glue with an unwind-to-abort shim.drop_in_place_catch_unwinds
which calls drop glue returning someResult<(), PanicPayload>
(seecatch_unwind2
for options here).drop_in_place_drop_unwinds
which calls drop glue withdrop_unwind
glue.#[warn(implicit_drop_glue)]
to warn where drop glue is inserted. (A goodrestriction
lint candidate, Though perhaps perf sensitive?)- Potential refinements involve only warning on drop glue which is not
nounwind
.
- Potential refinements involve only warning on drop glue which is not
The goal of the above is providing options to unsafe
code to allow code to choose their guarantees.
The below is a very interesting alternative to both panic-in-drop=unwind
and panic-in-drop=abort
.
There's additionally another intriguing middle ground which should be listed as an alternative: drop glue / drop_in_place
can still unwind, and a field's drop glue unwinding continues to unwind and drop other fields (siblings and of parent containing types), but unwind is implicitly caught and turned into an abort only when the unwind exits from the drop glue stack into a calling function.
(How this behavior translates to async
witness tables is a question in this scheme that still would need to be answered. My personal preferred solution is to treat drop
ping an async
witness table identically to as if it were an unwind (as the sync analogue is exactly an unwind), thus setting thread::panicking()
to true to eagerly convert any panics into aborts.)
This scheme has the potential to combine the benefits of both panic-in-drop=abort
1 and panic-in-drop=unwind
, though with a bit of additional effort. Namely, implicit drop glue unwind points are removed as a thing that unsafe
code has to worry about, but drop_in_place
still allows drop implementations to unwind at clearly marked points. Also notable is that drop
under this scheme can be made to opt-in by-value drops unwinding by reimplementing it as ManuallyDrop::drop(&mut ManuallyDrop::new(it))
.
Combined with some of the above controls for managing unwinding, this is a very promising approach to very potentially gain the benefit of this RFC while mitigating the downsides. (For maximum control, make a function-level switch between drop glue unwinds propagating or aborting.)
Footnotes
-
The unwind-safety benefits, anyway; the code size benefits will be reduced, if not completely eliminated, by drop glue handling suspend-and-continue unwind semantics. ↩
|
||
This RFC is already implemented as an unstable compiler flag which controls the behavior of panics escaping from a `Drop` impl. | ||
|
||
Performance results from [rust-lang/rust#88759](https://github.com/rust-lang/rust/pull/88759) show up to 10% reduction in compilation time ([perf](https://perf.rust-lang.org/compare.html?start=c9db3e0fbc84d8409285698486375f080d361ef3&end=1f815a30705b7e96c149fc4fa88a98ca04e2deee)) and a 5MB (3%) reduction in the size of `librustc_driver.so`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For comparison, I would like to see results where just std
's destructors are nounwind
/unwind-in-drop=abort
. This is likely a worst-of-both-worlds situation where using std
containers turns a drop unwind into an unconditional abort but stack types can still unwind from drop glue, but is well within std
's QOI rights to decide that unwinds out of its drop glue is forbidden while still allowed by the language. (Whether this is acceptable QOI is a separate question and depends on the potential benefit.)
- QOI
- Quality Of Implementation
text/0000-panic-in-drop.md
Outdated
// Codegen with -Z panic-in-drop=unwind (old behavior) | ||
unsafe fn drop_in_place<T>(ptr: *mut T) { | ||
// Call the Drop impl if there is one. | ||
try { | ||
<T as Drop>::drop(&mut *ptr); | ||
} catch { | ||
// If the Drop impl panics the keep trying to drop fields. | ||
try { | ||
drop_in_place(&mut (*ptr).field1); | ||
drop_in_place(&mut (*ptr).field2); | ||
drop_in_place(&mut (*ptr).field3); | ||
} catch { | ||
abort(); | ||
} | ||
} | ||
|
||
// Drop the first field. | ||
try { | ||
drop_in_place(&mut (*ptr).field1); | ||
} catch { | ||
// If dropping field1 panics the keep trying to drop the remaining fields. | ||
try { | ||
drop_in_place(&mut (*ptr).field2); | ||
drop_in_place(&mut (*ptr).field3); | ||
} catch { | ||
abort(); | ||
} | ||
} | ||
|
||
// Drop the second field. | ||
try { | ||
drop_in_place(&mut (*ptr).field2); | ||
} catch { | ||
// If dropping field2 panics the keep trying to drop the remaining fields. | ||
try { | ||
drop_in_place(&mut (*ptr).field3); | ||
} catch { | ||
abort(); | ||
} | ||
} | ||
|
||
// Drop the third field. | ||
drop_in_place(&mut (*ptr).field3); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize that this is illustrative rather than accurate, but this is also significantly worse than what is necessary and as such exaggerates the benefit of panic-in-drop=abort
simplifying drop glue. A better representation of code the compiler could emit would be
unsafe fn drop_in_place<T>(ptr: *mut T) {
let mut unwind = None;
// Call the Drop impl if there is one.
try {
<T as Drop>::drop(&mut *ptr);
} catch(payload) {
std::panicking::panic_count::increase();
match unwind {
Some(_) => abort(),
None => unwind = Some(payload),
}
}
// Drop the first field.
try {
drop_in_place(&mut (*ptr).field1);
} catch {
std::panicking::panic_count::increase();
match unwind {
Some(_) => abort(),
None => unwind = Some(payload),
}
}
// Drop the second field.
try {
drop_in_place(&mut (*ptr).field2);
} catch {
std::panicking::panic_count::increase();
match unwind {
Some(_) => abort(),
None => unwind = Some(payload),
}
}
// Drop the third field.
try {
drop_in_place(&mut (*ptr).field3);
} catch {
std::panicking::panic_count::increase();
match unwind {
Some(_) => abort(),
None => unwind = Some(payload),
}
}
// Resume an in-progress unwind.
if let Some(payload) = unwind {
std::panicking::panic_count::decrease();
resume_unwind(payload);
}
}
but even this is more complicated than it need be, as those abort
calls could be replaced by unreachable_unchecked
, as hitting that case requires a double panic to unwind rather than immediately abort.
maybe better suggestion
// Codegen with -Z panic-in-drop=unwind (old behavior) | |
unsafe fn drop_in_place<T>(ptr: *mut T) { | |
// Call the Drop impl if there is one. | |
try { | |
<T as Drop>::drop(&mut *ptr); | |
} catch { | |
// If the Drop impl panics the keep trying to drop fields. | |
try { | |
drop_in_place(&mut (*ptr).field1); | |
drop_in_place(&mut (*ptr).field2); | |
drop_in_place(&mut (*ptr).field3); | |
} catch { | |
abort(); | |
} | |
} | |
// Drop the first field. | |
try { | |
drop_in_place(&mut (*ptr).field1); | |
} catch { | |
// If dropping field1 panics the keep trying to drop the remaining fields. | |
try { | |
drop_in_place(&mut (*ptr).field2); | |
drop_in_place(&mut (*ptr).field3); | |
} catch { | |
abort(); | |
} | |
} | |
// Drop the second field. | |
try { | |
drop_in_place(&mut (*ptr).field2); | |
} catch { | |
// If dropping field2 panics the keep trying to drop the remaining fields. | |
try { | |
drop_in_place(&mut (*ptr).field3); | |
} catch { | |
abort(); | |
} | |
} | |
// Drop the third field. | |
drop_in_place(&mut (*ptr).field3); | |
} | |
// Codegen with -Z panic-in-drop=unwind (old behavior) | |
unsafe fn drop_in_place<T>(ptr: *mut T) { | |
let unwind; | |
// Call the Drop impl if there is one. | |
try { | |
<T as Drop>::drop(&mut *ptr); | |
} catch(payload) { | |
magic::assume!(!magic::initialized!(unwind)); | |
std::panicking::panic_count::increase(); | |
unwind = payload; | |
} | |
// Drop the first field. | |
try { | |
drop_in_place(&mut (*ptr).field1); | |
} catch { | |
magic::assume!(!magic::initialized!(unwind)); | |
std::panicking::panic_count::increase(); | |
unwind = payload; | |
} | |
// Drop the second field. | |
try { | |
drop_in_place(&mut (*ptr).field2); | |
} catch { | |
magic::assume!(!magic::initialized!(unwind)); | |
std::panicking::panic_count::increase(); | |
unwind = payload; | |
} | |
// Drop the third field. | |
try { | |
drop_in_place(&mut (*ptr).field3); | |
} catch { | |
magic::assume!(!magic::initialized!(unwind)); | |
std::panicking::panic_count::increase(); | |
unwind = payload; | |
} | |
// Resume an in-progress unwind. | |
if magic::initialized!(unwind) { | |
std::panicking::panic_count::decrease(); | |
resume_unwind(unwind); | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the code that I showed is pretty much what is currently emitted by the compiler. Specifically:
- The panic count is not checked, because it requires a TLS access which can be very slow.
- Separate drop calls are used in the normal path and the unwind path because a different landing pad needs to be used when those calls unwind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, because of extern "C-unwind"
we can have C++ exceptions unwinding through Rust code which don't increment the panic count. This means that we can't rely on the panic count to guarantee that a drop within an unwind guard won't itself panic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the compiler is emitting intrinsics::r#try
directly. AIUI, this means that the drop glue can in fact just not touch the panic count, as the panic count is already set by the in-progress unwind. It is currently the case that a panic-in-panic will abort via the panic_count check in rust_panic_with_hook
.
It is the case that calling drop_in_place
while panicking cannot unwind due to a rust panic. The only way that the separate landing pad would landed on is if an external unwind is caught. (IIUC, the intent is for extern "C-unwind"
to convert allowed unwinds into a rust panic. This must involve incrementing the panic count (and thus a place to abort on double panic), as catching will decrement the panic count.)
If you can show a case, where the drop glue second landing pad is landed on, I'll happily be proven wrong. But the normal case is in fact that the abort comes immediately from the panic!
, not the drop glue catching it.
So here's my updated drop glue impl:
// Codegen with -Z panic-in-drop=unwind (necessary behavior, different from current)
unsafe fn drop_in_place<T>(ptr: *mut T) {
let unwind;
// Note that try {} catch {} here is the raw intrinsics::try.
// The panic count is not touched, so remains set from the first panic.
// Call the Drop impl if there is one.
try {
<T as Drop>::drop(&mut *ptr);
} catch(payload) {
unwind = payload;
goto 'unwind_field1;
}
// Drop the first field.
try {
drop_in_place(&mut (*ptr).field1);
} catch(payload) {
unwind = payload;
goto 'unwind_field2;
}
// Drop the second field.
try {
drop_in_place(&mut (*ptr).field2);
} catch(payload) {
unwind = payload;
goto 'unwind_field3;
}
// Drop the third field.
try {
drop_in_place(&mut (*ptr).field3);
} catch(payload) {
unwind = payload;
goto 'done;
}
goto 'done;
try {
'unwind_field1:
drop_in_place(&mut (*ptr).field1);
'unwind_field2:
drop_in_place(&mut (*ptr).field1);
'unwind_field3:
drop_in_place(&mut (*ptr).field1);
} catch {
// A rust panic cannot hit this landing pad, because it would've aborted.
__rust_foreign_exception(); // aborts
}
'done:
// Resume an in-progress unwind.
if magic::initialized!(unwind) {
__rust_resume_panic(unwind);
}
}
I personally find the code size argument more convincing since it is backed by hard numbers which show the improvement. The exception safety argument is inherently subjective since it can just be countered by saying "well it's your fault for writing buggy unsafe code". It's not impossible to write exception-safe Rust code, just difficult.
I find this argument quite strange. Does any existing code actually rely on panics for normal operation? In general, panics will only happen in the case of buggy code, when an unexpected error occurs. Additionally, the proportion of code executed by
This is something that has been suggested a lot. If I understand correctly, what you are suggesting is that unwinding out of a drop aborts only for implicit drops but not explicit ones. The problem is: if you already know which drop calls are going to potentially panic then you could just convert that code to use a
The RAII drop glue calls
The problem with these is that none of them give the same code size & compilation time benefits as the approach proposed in this RFC. This is something that I feel quite strongly about: a 10% speedup in compilation time on a crate like |
Given that the Rust standard library (ostensibly a codebase with a much higher level of rigor and review than most crates) has gotten this wrong on several occasions, I'm doubtful that most developers would be able to disarm this footgun, or even recognize that this exists. |
Panics may be used to implement non-local exception-based control flow, even though it is clearly not an intended usage. But to the point, while I doubt that panics could occur during normal operation of most non-buggy code, the correctness argument is about the graceful handling of bugs. Code may rely on being able to catch the panics for correctness. A long-running web server, or an embedded device, or a security-critical firmware, or an OS kernel absolutely should not panic. End-users may rely on being able to catch any possible panics with With the proposed changes, a relatively benign panic for some rare edge case will be escalated to a total crash, which can lead to DoS or worse.
The argument about distribution of panics is quite weird, I don't know what to make of it. Since Drop must deal with disposing potentially fallible resources, I find it likely that Drop impls on average have more panics, but this is also a baseless assumption. |
But the code size can be achieved by opt-in or opt-out behavior. It is not a justification for making
Yes. For example, rust-analyzer uses unwinds for cancellation. Unwinding is the semantically correct way to handle cancellation, and is essentially what cancelling an This is not unique to just rust-analyzer; this is a design choice of salsa. It has generally been understood that while you SHOULD NOT require unwinding support in a library (and MUST support unwinding), you MAY rely on unwinding being available if you control the application compilation. I agree that likely no code is deliberately relying on unwinding out of drop for application correctness, but it is the case that long-running programs intend to catch any task unwind, and turning unwinds that would have been recovered from into aborts is adding a new DoS vector to such servers.
Then I can take this argument and apply it unchanged to making Unless I'm sorely mistaken, the justification for making this a mandatory breaking change without opt-out is that it is too difficult to know which drop calls are going to potentially panic, and the proposed resolution is to make no drop calls panic. I fail to see how this alternative fails at the goal, as every point that can unwind is clearly marked. And if you do want the behavior where I'd even be happy if we took the usual "safer by default" approach (as with
I will again also reiterate that the compile-time and code-size improvements are not an argument to make this mandatory. They are a very good argument to make it an option. They may even support making this the default. They would definitely be an argument against adding Swift's lack of If anything, this is an argument in support of
Yes, but I'm not arguing that point. It is a perfectly viable strategy for the unwind-to-abort shim to be added in the drop glue rather than in
Nobody (including std devs) really understood the issue of a caught unwind's drop unwinding until recently. (This discussion started almost immediately after it was discovered.) We can attack this with documentation on Other than this specific case, std has generally had no issue with unwind safety, because it is something you so rarely have to think about that std basically only encounters it for BTreeMap and where |
Also many TUI applications depend on unwinding to avoid getting stuck in raw mode when panicking. Very few register an atexit callback and signal handlers to exit raw mode on crashes. |
I wrote up an outline for a related alternative in an IRLO thread a while back. The idea is to have a use std::{any::Any, panic::UnwindSafe};
pub fn catch_unwind_v2<F: FnOnce() -> R + UnwindSafe, R>(f: F) -> UnwindResult<R>;
#[must_use]
pub struct UnwindResult<T> { ... }
impl<T> UnwindResult<T> {
// Unwinding functions
pub fn into_result(self) -> Result<T, Box<dyn Any + Send>>;
pub fn drop_err(self) -> Option<T>;
pub fn resume_unwind(self) -> T;
// Unwind-free functions
pub fn inspect_err<F>(self, f: F) -> UnwindResult<T>
where
F: FnOnce(&(dyn Any + Send + 'static));
pub fn inspect_err_mut<F>(mut self, f: F) -> UnwindResult<T>
where
F: FnOnce(&mut (dyn Any + Send + 'static));
pub fn forget_err(self) -> Option<T>;
pub fn unwrap_or_abort(self) -> T;
pub fn try_drop_err(self) -> Result<T, UnwindResult<()>>;
pub fn drop_err_or_forget(self) -> Option<T>;
pub fn drop_err_or_abort(self) -> Option<T>;
// Add `new()`, `map()`, etc. as desired
} I've been doing some surveying of real-world |
After a bit of research, I wanna say I don't particularly find the comparisons to C++
Rust code provides neither of these guarantees, even in idiomatic code. (It is only better than having no exception guarantee that memory corruption iins not allowed). Leak amplification is used throughout the Rust stdlib (unlike C++). In addition, unwinds do not require object invariants to be preserved, which is the entire reason why Given that objects with throwing destructors were already forbidden in STL containers, the change in C++ likely impacted a lot less code than these proposed Rust changes. And I think given the difference in semantics for unwinding vs exceptions, this change is too big a hammer. |
Yeah, |
The RFC text (in the Downsides) ought IMO to discuss the "drop bomb" pattern, sometimes used to simulate linear types - or rather, to avoid shipping code which treats a particular type as nonlinear. E.g. rust-lang/rust#78802 (comment) This is relevant here, because drop bombs precisely involve deliberately panicking inside drop, in circumstances where the author would probably have made very different design choices if they expected that such a panic would be an immediately irrecoverable abort. |
Drop bombs are usually being used to cover up for things you would want to be compile-time errors, so it's not at all clear to me that you would care about whether the bomb unwinds or aborts -- if linear (true must-use) type lovers had their druthers, the program wouldn't have even been runnable! |
I think there's been a huge amount of focus on the optimization possibilities here, as opposed to the correctness argument. The optimization possibilities would justify creating an option, but not changing the default. The correctness issue justifies changing the default. I think the correctness argument is the primary reason to do this; that it enables great optimizations is a bonus. Handling objects that panic in their drop implementations is incredibly complex, and as noted, even the standard library has gotten it wrong. I think it's reasonable for us to evaluate whether we expect libraries to handle this, or whether we want to proscribe it entirely so that libraries don't have to handle it. I am quite sympathetic to cases like long-running servers that want to catch panics and turn them into
Such servers already have multiple DoS vectors:
This proposal is not making all panics abort; I agree that that would be a huge change to the default. This proposal is making panics in |
"you already can't recover from lock poisoning" was obviously a gap in the design, and closing that gap is already in progress: rust-lang/rust#96469 There's three alternatives given, including "opt-in to unwinding", but there isn't an alternative given for "opt-out of unwinding". Since particular unsafe code is the code that can go wrong with a panic in drop (not all unsafe code does), why don't we put the burden on the particular instance of unwind sensitive unsafe code to opt in to having a drop that will abort if an unwind tries to escape from it? Call it |
@Amanieu We discussed this in today's @rust-lang/lang meeting. We felt like all of the arguments have been made at this point, and now we need to weigh the tradeoffs between possibilities:
However, before we do that, there are a few changes we'd like to see:
|
Once the correctness argument is at the forefront, I think it would be great for the RFC to also consider (or at least mention) alternatives that would aid the user in writing correct code, such as:
I'm no expert on the compiler, but I feel like (2) should be reasonably easy to implement. Implementing (3) in a useful manner would likely require a lot of additional design work1. Footnotes
|
This already exists: an unwind guard will call the panic handler with a |
I think this is an idea worth pursuing. Implementation-wise it should just be as simple as not making drop glues nounwind, but we emit drop terminators with We then only need to make sure |
While I think this isn't a bad idea at its core, I'm wary of such a rule being applied everywhere, since then every time you see an explicit drop mechanism, you'd have to judge whether the author was specifically thinking about unwinding, or whether the explicit drop is just being used for readability or performance. (E.g., |
This is the kind of thing where, similar to unsafe blocks, an explicit drop should almost always have an accompanying comment stating intent. Though I would also say that for example, safe code almost never actually cares about the panic behaviour of drop (or panic behaviour in general), so it's also fairly easy to tell from context |
I don't like this exceptional behavior, giving one main reason to use explicit |
catch_panic: warn about panicking payload drop Warns about the footgun in rust-lang/rust#86027. Will be unnecessary if panics escaping from drop leads to abort (rust-lang/rfcs#3288). But until that is enforced everywhere, let's warn users about this.
(xref to some discussion of drop bombs in Tor Project gitlab: https://gitlab.torproject.org/tpo/core/arti/-/merge_requests/2091#note_3021454) |
I don't understand the motivation for some explicit drops being allowed to panic but not others. It seems like these kinds of scenarios would be better encoded as a Put another way, I really think we should be striving for a single set of rules that apply to |
The problem I see with that is that every generic container would need to forward And regardless, I don't see how all these added workarounds would be superior to just opting out at the call sites that are sensitive to unwinding. The whole point of unwinding is that it should soundly pass through any safe function with well-behaved local values (disrupting logical invariants but not causing UB): catching an unwind and continuing should be the difficult part, and being unwound through should be harmless except for a small minority of code. Of course, that minority is also quite important to get correct, hence my recommendation to create better call-site tooling. Also, I'm still quite curious, what are those "many bugs" from unwind-on-drop that you were referring to? |
A primary difficulty of using a I do think a native A notable failure case of opt-in no-drop-unwinds is that it's an uncontrolled property whether a type gets dropped in an unwind-friendly context or an unwind-abort context (and not one that can be tested for like |
This is a gap in the language that could be filled in. As you observe, it's already possible but unsafe to do. It could be made safe, by providing a modifier on a struct or enum pattern (possibly restricted to use in the defining crate?) which:
Just like a struct literal expression introduces a droppable struct value, by taking ownership of its ingredients as a primitive operation, this “de-construction” operation does the opposite. Even if no changes are made to Rust's general finalization story, de-construction would be useful for all operations which consume a value and want to do something other than its (To be clear, I'm not arguing in favor of adding |
well, since impl Drop for MyType {
fn drop(self) { // takes self by value, requires MyType: Unpin
let Self { field_1, field_2 } = self;
// ...
}
} |
Would it help to have some standard library type that panics in its own struct Transaction {
drop_bomb: Option<std::panic::Bomb>,
}
impl Transaction {
fn commit(self) {
self.drop_bomb.take().map(std::panic::Bomb::defuse);
}
} and/or have: std::panic::no_unwind_on_drop(|| {
// write delicate code here without worrying about unwinding
}); |
First off, as some others have said regarding this, not knowing if you're in an unwind-friendly context is already the status quo: your crate might have been compiled with Also, an opt-in context would be mostly equivalent to an aborting drop bomb (except possibly more eager), which is already a widespread approach for unwind-sensitive code that doesn't want to carefully forward panics, or is physically unable to (e.g., FFI callbacks). If anything, this would make it easier to add a Second, I don't see unsafe-collection destructors as the kind of "unwind-during-cleanup sensitive" code that would want or need an abort bomb. Most collections don't have a crucial invariant to be maintained during drop: if a panic cuts them off, they just leak some of their elements and/or allocations. After all, the collection is totally inaccessible anyway once it's dropped, so any broken invariant would have to lie somewhere outside, which is far from common. I envision such an opt-in mechanism as specifically being for those cases where unsoundness or extreme incorrectness might occur from an unwind. (E.g., perhaps a cryptographic zero-on-drop type might be particularly concerned about not skipping the remainder of the destructor.) I suppose it could indeed be a failure mode if authors become afraid and start using it unnecessarily. Still, I think that the benefits of first-class explicit unwind control are worth this cost, especially given how unwieldy the current alternative (drop guards) is; we'd just have to take care in how we document it. |
catch_panic: warn about panicking payload drop Warns about the footgun in rust-lang/rust#86027. Will be unnecessary if panics escaping from drop leads to abort (rust-lang/rfcs#3288). But until that is enforced everywhere, let's warn users about this.
On the other hand one of the motivators for aborting upon unwind out of drop is preventing this kind of partial destruction and simplifying drop glue by not ever needing to handle unwinds during normal cleanup.
A subtle one I wrote somewhat recently (in proof of concept code) was assuming that dropping the sending half of a channel can't drop any queued messages; that dropping the receiving half drops the queue and sending fails after that sync point. For the usual impls, this is true unless dropping a message panics, in which case any remaining messages in the queue resume being dropped when the final sender is dropped. Compare dropping |
Sure, I'm all for making abort-on-unwinding-drop something that crates can opt into, given the measured perf benefits; it's just making it mandatory that I have big issues with. The basic principle is, random crate authors have no obligation to make their panic-on-drop behavior sane (only to make it sound), but I should have the ability to handle unwinds as robustly as I want, and treat (E.g., not too long ago, in some of my own experimental code, I messed around with making trait objects that execute all their behavior on drop. That way, the vtable could be made as small as possible for what the objects were required to do. Or, for a more standard example, a codebase might want to use unwinds to implement robust thread cancellation for a particular thread, by Regarding partial destruction, I don't see the status quo of 'best-effort' cleanup as particularly problematic: it just means that if you do have anything really important to clean up on panic (open sockets, perhaps?), you might want to segregate it into its own collection, where each element is known to be well-behaved on drop.
Thank you, that is a good example. I suppose channels would be an archetypical case of dropping a container possibly causing external effects. More generally, in the status quo, when implementing a data structure with shared ownership (e.g.,
Yeah, writing fully unwind-correct code is pretty tedious currently. But I don't think it necessarily has to remain that way forever. There will always be lots of tricky unwind-sensitive code outside of destructors, so the more we can add built-in functionality to help with those, the less tedious destructors in particular can become. In the case of fn for_each_guarded<F: FnMut(Self::Item)>(mut self, mut f: F) where Self: Sized {
struct Guard<'a, I: Iterator, F: FnMut(I::Item)>(&'a mut I, &'a mut F);
impl<I: Iterator, F: FnMut(I::Item)> Drop for Guard<'_, I, F> { fn drop(&mut self) {
self.0.for_each(|item| (self.1)(item));
} }
let guard = Guard(&mut self, &mut f);
guard.0.for_each(|item| (guard.1)(item));
core::mem::forget(guard);
} so you could then write let (front, back) = self.as_mut_slices();
[front, back].into_iter().for_each_guarded(|slice| drop_in_place(slice)); It would be perfectly composable for tree-like data structures. But I don't know how well it would optimize.
I suppose you're referring to certain forms of stack pinning in particular, where the data structure holds an owned value in borrowed storage? For heap pinning, the invariant would be upheld by just leaking the heap storage after the value's destructor panics. |
(NOT A CONTRIBUTION) I do not think this change would be an improvement to the language, chiefly for the same reasons as other commenters in this thread: if a buggy destructor panics, I want to be able to isolate it without bringing down my entire application. More importantly, as @nikomatsakis notes in his proposal to merge this RFC, this is a breaking change to your existing user base and a violation of the Rust project's stability guarantees. For this reason alone this RFC should be closed & I don't understand how this can be brushed off so lightly. Is Rust a reliable tool that I can upgrade without fear, or does it make breaking changes in point releases? The primary motivation for this RFC seems to be the issue of panicking drops breaking unsafe code that is not exception safe. I agree that this is a difficult thing to work around, however I would point out that drops (being silently inserted code) can cause bugs in unsafe code that hasn't considered them even if they don't panic. I wrote about this years and years ago when I wrote two bugs in ringbahn that both were the result of destructors being inserted that I didn't consider: https://without.boats/blog/two-memory-bugs-from-ringbahn/ For this reason I think the project should pursue @LegionMammal978's proposal from 2022:
A totally non-invasive change would be to allow users to warn when any destructors are inserted in a code block, so they instead use explicit calls to |
I generally agree with the "it's a breaking change" argument, but there's an important caveat to the above statement: even today, recovering from a buggy destructor is not guaranteed, because double-panicking (due to a second buggy destructor) leads to an immediate abort. |
(NOT A CONTRIBUTION) @Jules-Bertholet yes that's true in an absolute sense, but its about the degree of risk. In a correctly implemented program, zero panics occur. But we need to tolerate the risk of programmer error. Using unwinding to isolate the programmer error is an effect way to reduce the degree of risk in practice so that it is less likely to escalate into an urgent system failure. With the current system, the risk that unwinding fails and becomes a process abort (increasing the chance of escalation) is the multiple of the probability of a panic, the probability of panic in a destructor, and the inverse of their correlation (because these two probabilities are likely somewhat correlated). By eliminating one of the multiples, this increases the risk of process abortion and therefore the risk of escalating failure. Instead of reducing the degree to which unwinding is available to mitigate programmer error, the project should find ways to make correct unsafe code easier to write that doesn't require eliminating this mitigation. |
Thank you for your intervention. I agree.
Quite so,.
This is of course an excellent idea. This kind of alternative is IMO not at all properly discussed in the RFC. There is one short sentence about it, nearly at the bottom of the list of alternatives. The Alternatives section ought to explain, for each rejected alternative, why that option has been rejected. In this case, the RFC does not give any reason why a lint wouldn't be a superior answer. This is an obvious red flag for any RFC, let alone one which is such a significant departure from previously advertised guarantees, and which has attracted such substantial opposition. The proper response to criticisms here in the discussion thread, is to refute them in the RFC text (if they can be refuted, which I suspect isn't the case here). Instead, , the discussion here has become an unmanageable behemoth, and many good points made go unanswered. TBH I have found it quite alarming, how ready the project seems to be to adopt a proposal which is so controversial, on the basis of a document which has such an obvious deficiency. The RFC process is failing here. I don't think this is because it's a bad process; rather, we don't seem to be applying it rigourously. |
FWIW, I agree. I don't have a strong opinion either way for this RFC: from an unsafe code perspective, it'd be very nice to have to worry about fewer unwinding points -- but it's not clear that by "just" removing drops from the picture, we are making a significant enough dent here; there's still an unwinding point at each function call after all. But there are some interesting alternatives -- not just a lint; thanks to all the C-unwind work (that happened after this PR was initiated), the As a t-lang advisor, I'd like to register this as a formal concern: to better describe the possible alternatives and explain why they are worse than the proposed solution. |
(NOT A CONTRIBUTION) I like Ralf's idea as a way of formalizing what libraries like rayon apparently already do: upgrading any panic to an abort whenever you enter their routines. But I think it's important that libraries document if they do this; given the choice between a library which handles unwinding correctly and one which doesn't, I would be more likely to choose the unwinding-friendly library, and I don't think we should abandon that as the goal for unsafe code. |
@rfcbot concern describe alternatives Ralf writes:
|
As an interesting note, C++ denied |
Something Nadrieril came up with during unsafe fields rfc discussion: How about unsafe locals ( |
(More precisely, my intent was for the borrow-checker to prevent automatic drops out of
|
This RFC proposes to change this behavior to always abort if a panic attempts to escape a drop, even if this drop was invoked as part of normal function execution (as opposed to unwinding which already aborts with a double-panic).
Previous discussion: rust-lang/lang-team#97
Rendered
FCP comment