Tracking issue for allocation APIs #27700
Comments
cc @pnkfelix |
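It’s already possible to indirectly use the allocator through Vec:
fn allocate<T>(count: usize) -> *mut T {
    let mut v = Vec::with_capacity(count);
    let ptr = v.as_mut_ptr();
    std::mem::forget(v); // leak the Vec so its buffer is not freed here
    ptr
}
// Safety: `ptr` must come from `allocate::<T>(count)` with the same `count`.
unsafe fn deallocate<T>(ptr: *mut T, count: usize) {
    // Rebuild a zero-length Vec over the buffer; dropping it frees the memory.
    unsafe { std::mem::drop(Vec::from_raw_parts(ptr, 0, count)) };
}
Any future GC support (mentioned in the
While this hack has the merit of existing (and enabling many libraries to make themselves available on stable Rust),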
While I’m not attached to the details of the current |
Random idea for a (more rustic?) RAII-based API for
/// Allocated but not-necessarily-initialized memory.
struct Buffer {
ptr: Unique<u8>,
size: usize,
align: usize,
}
impl Drop for Buffer {
/* deallocate */
}
impl Buffer {
fn new(size: usize, align: usize) -> Result<Self, ()> {
/* allocate, but avoid undefined behavior by asserting or something
(maybe skip calling allocate() on `size == 0`?) */
}
// Maybe skip those and keep the unsafe functions API?
fn into_ptr(self) -> *mut u8 { /* forget self */ }
unsafe fn from_raw_parts(ptr: *mut u8, size: usize, align: usize) -> Self { /* */ }
fn as_ptr(&self) -> *const u8 { self.ptr }
fn as_mut_ptr(&mut self) -> *mut u8 { self.ptr }
fn size(&self) -> usize { self.size } // Call this len?
fn align(&self) -> usize { self.align }
// The caller is responsible for not reading uninitialized memory
unsafe fn as_slice(&self) -> &[u8] { /* ... */ }
unsafe fn as_mut_slice(&mut self) -> &mut [u8] { /* ... */ }
fn reallocate(&mut self, new_size: usize) -> Result<(), ()> { /* ... */ }
fn reallocate_in_place(&mut self, new_size: usize) -> Result<(), ()> { /* ... */ }
} |
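For readers who want to try the RAII idea above, here is a minimal sketch written against std::alloc, which was stabilized after this comment was posted. It is not the commenter's implementation: returning Option instead of Result<Self, ()>, storing a Layout instead of separate size and align fields, and rejecting zero-sized requests are all assumptions made for illustration.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

/// Allocated but not-necessarily-initialized memory, freed on drop.
pub struct Buffer {
    ptr: NonNull<u8>,
    layout: Layout,
}

impl Buffer {
    /// Returns `None` for zero-sized requests, invalid alignments,
    /// or when the allocator reports failure (assumed error handling).
    pub fn new(size: usize, align: usize) -> Option<Self> {
        if size == 0 {
            return None; // calling `alloc` with a zero-sized layout is undefined behavior
        }
        let layout = Layout::from_size_align(size, align).ok()?;
        // SAFETY: `layout` has a non-zero size.
        let ptr = NonNull::new(unsafe { alloc(layout) })?;
        Some(Buffer { ptr, layout })
    }

    pub fn as_ptr(&self) -> *const u8 {
        self.ptr.as_ptr()
    }

    pub fn as_mut_ptr(&mut self) -> *mut u8 {
        self.ptr.as_ptr()
    }

    pub fn size(&self) -> usize {
        self.layout.size()
    }

    pub fn align(&self) -> usize {
        self.layout.align()
    }
}

impl Drop for Buffer {
    fn drop(&mut self) {
        // SAFETY: `ptr` was returned by `alloc` with exactly this layout.
        unsafe { dealloc(self.ptr.as_ptr(), self.layout) }
    }
}
```

Keeping the Layout inside the value means the caller never has to remember the size and alignment for deallocation, which is the main ergonomic gain over the raw functions.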
this doesn't get you access to the |
Not directly, but you can influence alignment by carefully picking |
@SimonSapin ooh nice! I'd have to read up on alignment rules to figure out if there's a type that would let me align to a page boundary or not, but nice trick :-) |
@kamalmarhubi This stackoverflow answer is probably the most relevant thing that is actually implemented today: http://stackoverflow.com/questions/32428153/how-can-i-align-a-struct-to-a-specifed-byte-boundary (Longer term we'll presumably put in something better. The |
@pnkfelix thanks for the link. Sounds like I'm out of luck for page boundary alignment though! I am also unclear if the allocator would hate me for |
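As a hedged illustration of the "pick the element type to influence alignment" trick discussed above: later Rust versions added the #[repr(align(N))] attribute, which makes page alignment expressible through Vec. The Page type, the alloc_pages helper, and the 4096-byte page size are assumptions for this sketch, not something proposed in the thread.

```rust
/// Hypothetical page-sized, page-aligned element type.
/// 4096 is an assumed page size; real page sizes vary by platform.
#[repr(C, align(4096))]
pub struct Page(pub [u8; 4096]);

/// Allocates `pages` zero-filled pages through `Vec`.
/// The buffer's alignment follows `align_of::<Page>()`.
pub fn alloc_pages(pages: usize) -> Vec<Page> {
    let mut v = Vec::with_capacity(pages);
    v.resize_with(pages, || Page([0u8; 4096]));
    v
}
```

Because Vec allocates with the alignment of its element type, the pointer returned by as_ptr() on a non-empty result is a multiple of 4096.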
It took me several seconds to realize that "aligning to a page boundary" actually wasn't a tongue-in-cheek joke about text rendering... |
I noticed that Echoing @glandium, my focus is:
Per #33082 (comment), I understand the answer is:
I'm focused on server use cases - particularly those where large allocations are both common and recoverable. I think a strong case for OOM recoverability has been made in several ways. I would, ideally, like to see
Looking at actionable options, I appear to be presented with:
And last, is this the right place to discuss, or should this be a separate issue? |
I've seen a lot of push-back about whether Rust should offer graceful handling of malloc failure. A few points that recur:
And a few recurring points about panic-on-OOM specifically:
I have not yet found any sound arguments as to why OOM should not panic by default (vs. current behavior - panic sometimes, but usually abort). Perhaps I should write a rust app that demonstrates how malloc failure isn't what people expect, and link to it here. |
I am concerned that as
As code paths dependent upon this may start appearing, the likelihood of breakage decreases the earlier OOM behavior is changed (or clearly documented to be changing, or documented as changeable).
To present a couple strawmen that, through simplicity, might be faster to stabilize:
Opt-in to panic
Opt-out of panic
Would either of these be easier to fast-track for stabilization (compared to stabilizing
EDIT: Is there concern that libstd may not be robust in the face of panic-on-OOM due to unsafe bits with un-exercised OOM failure states? If so, perhaps this could be opened as an issue for me to work on? With custom allocators this can be pretty straightforward to test. |
OOM always aborts. It's only capacity overflows that panic, which For example, |
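The distinction drawn above (capacity overflow panics, allocator failure aborts) can be observed in a small sketch. The function names are hypothetical, and try_reserve is a later addition to Vec that did not exist when this comment was written.

```rust
use std::collections::TryReserveError;
use std::panic;

/// Requesting a capacity whose byte size overflows panics with
/// "capacity overflow" before the allocator is called.
/// Returns true if that panic was caught.
pub fn capacity_overflow_panics() -> bool {
    panic::catch_unwind(|| {
        let _v: Vec<u64> = Vec::with_capacity(usize::MAX);
    })
    .is_err()
}

/// The fallible counterpart reports the same condition as an error value
/// instead of panicking.
pub fn try_reserve_overflow() -> Result<(), TryReserveError> {
    let mut v: Vec<u64> = Vec::new();
    v.try_reserve(usize::MAX)
}
```

Since the overflow is an ordinary unwinding panic (under the default panic strategy), it can be caught, whereas an abort on allocator failure cannot.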
As the person who implemented Since this behavior was confusing many people, I added this function, which is used by libstd to set an OOM handler that prints "Out of memory" to standard error before calling This functionality couldn't have been put into liballoc directly since that crate is used in bare metal systems and kernels which don't have a standard error to print to and don't have an abort function to call. So the default behavior is to call TLDR: |
Regarding your proposal of panicking on OOM, the biggest issue I can see is that unsafe code may leave inconsistent state if a panic occurs where it does not expect one. This inconsistent state could result in memory unsafety and could even be used by exploits that can trigger OOM conditions. |
@Amanieu The first step to either improving robust OOM-panic handling in libstd or creating replacement APIs is to identify which APIs perform allocations. Is there an automated way to do this? I would love to see such APIs flagged in documentation, although a simple list would suffice. Is there already a compiler plugin for this? For APIs whose state is contained in a single region of memory, my default testing approach would be to employ a custom allocator to exhaustively force *alloc failures, while requiring that post-panic state involves a bit-for-bit match with the original. I would hope those writing unsafe code in libstd put allocations as early as possible. Also, thank you for explaining the motivation behind |
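The testing approach described above, forcing allocation failures with a custom allocator, might be sketched as follows using the GlobalAlloc trait, which was stabilized after this discussion. FailAfter and its budget are hypothetical names; the sketch only shows the failure injection, not the bit-for-bit state comparison.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical test allocator: forwards to `System` until its budget of
/// successful allocations is used up, then reports failure with a null pointer.
pub struct FailAfter {
    remaining: AtomicUsize,
}

impl FailAfter {
    pub const fn new(budget: usize) -> Self {
        FailAfter { remaining: AtomicUsize::new(budget) }
    }
}

unsafe impl GlobalAlloc for FailAfter {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Atomically take one unit of budget; fail once it has reached zero.
        let took = self
            .remaining
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |n| n.checked_sub(1))
            .is_ok();
        if took {
            unsafe { System.alloc(layout) }
        } else {
            std::ptr::null_mut()
        }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // Deallocation always succeeds and is forwarded unchanged.
        unsafe { System.dealloc(ptr, layout) }
    }
}
```

Calling alloc on an instance directly exercises the failure path in isolation; to drive real library code, a static instance could be registered with #[global_allocator] and the budget swept from zero upward.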
I'd like for the OOM API to be stabilized in some form. I currently have a sandboxed process which performs calculations on bignums, and OOMs frequently result. Because there is no stabilized API for reporting and handling OOM failures, I have to assume that all aborts are OOMs, which is not always the case. I don't need to be able to recover from an OOM failure - I just need to signal to the parent process that it was an OOM and not some other crash. |
Perhaps this is not the right place to ask, but what's the progress on the |
Nominating for libs team discussion. |
In my previous comment I mentioned I was using |
There’s even a crate for it: https://crates.io/crates/memalloc |
Proposal: Edit: removed drive-by changes per discussion below.
@aturon how does this sound? |
And of course, before stabilizing:
|
If we insist on building more ergonomic APIs, I think we should expose the current ones as-is, suffixed with |
I proposed small changes in passing because that seemed an easy improvement, but I’m not particularly attached to them. And I don’t think they’re worth doubling the API surface. @gankro What’s the value of the |
There is little to no value; I've just seen this API get punted from stabilization time and time again over hemming and hawing over potential improvements, when everyone just needs some way to allocate with a given size and alignment. So basically I'm desperate to do anything it takes to get this landed. I had already been planning to suggest this rename precisely to get them "out of the way" of premium name real-estate. I personally don't think our lowest level allocation functions should provide any help or do anything on top of the system allocator. This is a critical code path. We should definitely provide higher level abstractions that do the sorts of things you suggest, but the API as it exists should be preserved and pushed out ASAP. |
Sounds fair enough. I’ve edited my "proposal" message above to skip the changes. I don’t have an opinion on the rename. |
Discussed briefly in the libs meeting today; I proposed that we need to get all of the allocator stakeholders together to discuss our plan (in light of @sfackler's RFC, etc). The goal would be to lay out a definite plan of action for incremental stabilization of pieces of the allocator system, trying to get something stable ASAP. If you'd like to take part in this discussion, please leave a comment and I'll be in touch. |
@aturon I'm happy to share any feedback/thoughts/grumpy remarks/etc. |
I'd like to be involved. |
Ok! We've now had a chance to get many of the stakeholders together and chat about the current state of affairs. Action items coming out of this moot:
So with that in mind hopefully we can aim to start closing out this issue soon! |
[I hope this is the right place to ask] Playing around with aligned allocation using alloc::heap::allocate, I noticed that usable_size() does not steadily increase by a factor of 2 with increasing requests, but instead jumps in reported (effectively allocated?) storage after 2K directly to 16K. Playground: https://is.gd/HGhRYR |
@rolandsteiner Have you tried requesting the system allocator to see if that behaviour changes? ...because I know jemalloc rounds allocations up as part of its approach for combatting memory fragmentation in long-running programs. EDIT: Yep. I just tried adding You'll have to ask the jemalloc devs what rationale they used to decide against having a 4K or 8K arena. |
@ssokolow Thanks for the follow-up! I was mainly puzzled because this behavior is only triggered by the requested alignment, not size (i.e., a 4K or 8K request on a 2K alignment returns a 4K/8K block just fine). This means, one cannot naively request a single 4K page. But perhaps the playground server runs on a different architecture that uses 16K pages (?). |
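For anyone repeating this experiment today, a rough equivalent of the size-plus-alignment request can be written with the stable std::alloc functions. This is a sketch under assumptions: aligned_alloc_ok is a hypothetical helper, and the stable API does not expose usable_size, so only the returned pointer's alignment is checked here.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Allocates `size` bytes at `align`, reports whether the returned pointer
/// honors the alignment, and frees the block with the same layout.
/// Returns false for invalid layouts, zero sizes, or allocation failure.
pub fn aligned_alloc_ok(size: usize, align: usize) -> bool {
    let layout = match Layout::from_size_align(size, align) {
        Ok(l) if l.size() > 0 => l,
        _ => return false,
    };
    // SAFETY: `layout` has a non-zero size, and the block is freed
    // with the same layout it was allocated with.
    unsafe {
        let ptr = alloc(layout);
        if ptr.is_null() {
            return false;
        }
        let aligned = ptr as usize % align == 0;
        dealloc(ptr, layout);
        aligned
    }
}
```

A call such as aligned_alloc_ok(4096, 4096) requests a single 4K block on a 4K boundary; how much memory the allocator reserves behind that request remains an implementation detail of the allocator in use.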
Upon review of the allocator-related tracking issues we actually have quite a lot now! I'm going to close this in favor of #32838 as the stabilization of the I'll be copying over some of the points at the top of this tracking issue to that issue as well. |
Current status
Final incarnation of std::heap is being proposed in rust-lang/rfcs#1974, hopefully for stabilization thereafter.
Open questions for stabilization are:
Is it required to deallocate with the exact size that you allocate with? With the usable_size business we may wish to allow, for example, that if you allocate with (size, align) you must deallocate with a size somewhere in the range of size...usable_size(size, align). It appears that jemalloc is totally ok with this (doesn't require you to deallocate with the precise size you allocate with) and this would also allow Vec to naturally take advantage of the excess capacity jemalloc gives it when it does an allocation (although actually doing this is also somewhat orthogonal to this decision, we're just empowering Vec). So far @gankro has most of the thoughts on this.
Is it required to deallocate with the exact align that you allocate with? Concerns have been raised that allocators like jemalloc don't require this, and it's difficult to envision an allocator that does require this. (more discussion). @ruuda and @rkruppe look like they've got the most thoughts so far on this.
Original report
This is a tracking issue for the unstable APIs related to allocation. Today this encompasses:
alloc
heap_api
oom
This largely involves just dealing with the stabilization of liballoc itself, but it needs to address issues such as:
liballoc?
Should oom be a generally available piece of functionality, and should it be pluggable?
This will likely take a good deal of work to stabilize, and this may be done piecemeal, but this issue should serve as a location for tracking at least these unstable features.