The purpose of this document is to describe the interface between a MiniRust program and memory.
The interface shown below already makes several key decisions. It is not intended to be able to support any imaginable memory model, but rather to start the process of reducing the design space of what we consider a "reasonable" memory model for Rust. For example, it explicitly acknowledges that pointers are not just integers and that uninitialized memory is special (both are true for C and C++ as well, but you have to read the standard very carefully, and consult defect report responses, to see this). Another key property of the interface presented below is that it is untyped. This implies that in MiniRust, operations are typed, but memory is not: a key difference from C and C++ with their type-based strict aliasing rules.
One key question a memory model has to answer is what a pointer is. It might seem like the answer is just "an integer of appropriate size", but that is not the case (as more and more discussion shows). This becomes even more prominent with aliasing models such as Stacked Borrows. The memory model hence takes the stance that a pointer consists of the address (which truly is just an integer of appropriate size) and a provenance. What exactly provenance is, is up to the memory model. As far as the interface is concerned, it is some opaque extra data that we carry around with our pointers and that places restrictions on which pointers may be used for which operations, and when.
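To make the "address plus provenance" idea concrete, here is a minimal sketch. It assumes, purely for illustration, that provenance is an allocation ID (`AllocId` is a hypothetical name) and that addresses fit in a `u64`; the interface itself leaves provenance opaque. The addresses are made up.

```rust
/// Hypothetical provenance: the ID of the allocation a pointer was derived from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct AllocId(u64);

/// A pointer is an address plus optional provenance.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Pointer {
    addr: u64,
    provenance: Option<AllocId>,
}

/// Compare only the integer part of two pointers.
fn same_address(a: Pointer, b: Pointer) -> bool {
    a.addr == b.addr
}

/// A one-past-the-end pointer of allocation 1 and a pointer to the start of
/// an adjacent allocation 2, both at the (made-up) address 0x1008.
fn adjacent_pointers() -> (Pointer, Pointer) {
    let end_of_first = Pointer { addr: 0x1008, provenance: Some(AllocId(1)) };
    let start_of_second = Pointer { addr: 0x1008, provenance: Some(AllocId(2)) };
    (end_of_first, start_of_second)
}
```

The two pointers returned by `adjacent_pointers` agree on the address but compare unequal as pointers, which is exactly the information an integer-only view would lose.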
The unit of communication between the memory model and the rest of the program is a byte.
To distinguish our MiniRust bytes from `u8`, we will call them "abstract bytes".
Abstract bytes differ from `u8` to support representing uninitialized memory and to support maintaining pointer provenance when pointers are stored in memory.
We define the `AbstractByte` type as follows, where `Provenance` will later be instantiated with the `Memory::Provenance` associated type.
#[derive(PartialEq, Eq)]
enum AbstractByte<Provenance> {
    /// An uninitialized byte.
    Uninit,
    /// An initialized byte, optionally with some provenance (if it is encoding a pointer).
    Init(u8, Option<Provenance>),
}

impl<Provenance> AbstractByte<Provenance> {
    /// The data stored in this byte, or `None` if it is uninitialized.
    fn data(self) -> Option<u8> {
        match self {
            AbstractByte::Uninit => None,
            AbstractByte::Init(data, _) => Some(data),
        }
    }

    /// The provenance carried by this byte, if any.
    fn provenance(self) -> Option<Provenance> {
        match self {
            AbstractByte::Uninit => None,
            AbstractByte::Init(_, provenance) => provenance,
        }
    }
}
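As an illustration of why each byte carries provenance, the following sketch stores a pointer as eight abstract bytes and reads it back. The helpers `encode_ptr` and `decode_ptr` are hypothetical and not part of the interface. The 64-bit little-endian layout is an assumption, and so is the rule that mixed provenance decodes to no provenance.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AbstractByte<Provenance> {
    Uninit,
    Init(u8, Option<Provenance>),
}

impl<Provenance> AbstractByte<Provenance> {
    fn data(self) -> Option<u8> {
        match self {
            AbstractByte::Uninit => None,
            AbstractByte::Init(data, _) => Some(data),
        }
    }

    fn provenance(self) -> Option<Provenance> {
        match self {
            AbstractByte::Uninit => None,
            AbstractByte::Init(_, provenance) => provenance,
        }
    }
}

/// Hypothetical helper: encode an address as 8 little-endian abstract bytes,
/// attaching the pointer's provenance to every byte.
fn encode_ptr<P: Copy>(addr: u64, provenance: Option<P>) -> Vec<AbstractByte<P>> {
    addr.to_le_bytes()
        .iter()
        .map(|&b| AbstractByte::Init(b, provenance))
        .collect()
}

/// Hypothetical helper: decode 8 abstract bytes back into (address, provenance).
/// Returns `None` if the length is wrong or any byte is uninitialized.
/// Assumption: provenance survives only if all bytes agree on it.
fn decode_ptr<P: Copy + Eq>(bytes: &[AbstractByte<P>]) -> Option<(u64, Option<P>)> {
    if bytes.len() != 8 {
        return None;
    }
    let mut raw = [0u8; 8];
    for (slot, byte) in raw.iter_mut().zip(bytes) {
        *slot = byte.data()?;
    }
    let first = bytes[0].provenance();
    let all_agree = bytes.iter().all(|b| b.provenance() == first);
    let provenance = if all_agree { first } else { None };
    Some((u64::from_le_bytes(raw), provenance))
}
```

A round trip through `encode_ptr` and `decode_ptr` preserves both address and provenance, while overwriting any single byte with `Uninit` makes decoding fail. A plain `u8` could express neither.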
The MiniRust memory interface is described by the following (not-yet-complete) trait definition:
/// An "address" is a location in memory. This corresponds to the actual
/// location in the real program.
/// We make it a mathematical integer, but of course it is bounded by the size
/// of the address space.
type Address = BigInt;

/// A "pointer" is an address together with its Provenance.
/// Provenance can be absent; those pointers are
/// invalid for all non-zero-sized accesses.
#[derive(PartialEq, Eq)]
struct Pointer<Provenance> {
    addr: Address,
    provenance: Option<Provenance>,
}

/// *Note*: All memory operations can be non-deterministic, which means that
/// executing the same operation on the same memory can have different results.
/// We also let read operations potentially mutate memory (they actually can
/// change the current state in concurrent memory models and in Stacked Borrows).
trait MemoryInterface {
    /// The type of pointer provenance.
    type Provenance: Eq;

    /// We use `Self::Pointer` as notation for `Pointer<Self::Provenance>`,
    /// and `Self::AbstractByte` as notation for `AbstractByte<Self::Provenance>`.
    type Pointer = Pointer<Self::Provenance>;
    type AbstractByte = AbstractByte<Self::Provenance>;

    /// Create a new allocation.
    /// The initial contents of the allocation are `AbstractByte::Uninit`.
    fn allocate(&mut self, size: Size, align: Align) -> NdResult<Self::Pointer>;

    /// Remove an allocation.
    fn deallocate(&mut self, ptr: Self::Pointer, size: Size, align: Align) -> Result;

    /// Write some bytes to memory.
    fn store(&mut self, ptr: Self::Pointer, bytes: List<Self::AbstractByte>, align: Align) -> Result;

    /// Read some bytes from memory.
    fn load(&mut self, ptr: Self::Pointer, len: Size, align: Align) -> Result<List<Self::AbstractByte>>;

    /// Test whether the given pointer is dereferenceable for the given size and alignment.
    /// Raises UB if that is not the case.
    /// Note that a successful read/write/deallocate implies that the pointer
    /// was dereferenceable before that operation (but not vice versa).
    fn dereferenceable(&self, ptr: Self::Pointer, size: Size, align: Align) -> Result;
}
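To show how these operations fit together, here is a toy memory written in plain Rust. It is a sketch under several assumptions, not a proposed MiniRust memory model: provenance is an allocation ID, addresses are `u64` rather than unbounded integers, allocation is deterministic (so there is no `NdResult`), and `ToyMemory`, `Ub`, and `locate` are hypothetical names.

```rust
use std::collections::HashMap;

/// Assumption: provenance is simply the ID of the allocation a pointer belongs to.
type AllocId = u64;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AbstractByte {
    Uninit,
    Init(u8, Option<AllocId>),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Pointer {
    addr: u64, // stands in for the unbounded `Address` integer
    provenance: Option<AllocId>,
}

/// Hypothetical error type standing in for "this operation is UB".
#[derive(Debug, PartialEq, Eq)]
struct Ub(&'static str);

struct Allocation {
    base: u64,
    bytes: Vec<AbstractByte>,
}

#[derive(Default)]
struct ToyMemory {
    allocs: HashMap<AllocId, Allocation>,
    next_id: AllocId,
    next_addr: u64,
}

impl ToyMemory {
    /// Create a new allocation whose bytes are all `Uninit`.
    /// `align` is assumed to be a non-zero power of two.
    fn allocate(&mut self, size: u64, align: u64) -> Pointer {
        // Never hand out address 0, then round up to the requested alignment.
        let base = (self.next_addr.max(1) + align - 1) / align * align;
        let id = self.next_id;
        self.next_id += 1;
        self.next_addr = base + size;
        let bytes = vec![AbstractByte::Uninit; size as usize];
        self.allocs.insert(id, Allocation { base, bytes });
        Pointer { addr: base, provenance: Some(id) }
    }

    /// Find the allocation and offset for a non-zero-sized access.
    fn locate(&self, ptr: Pointer, len: u64) -> Result<(AllocId, usize), Ub> {
        let id = ptr.provenance.ok_or(Ub("pointer without provenance"))?;
        let alloc = self.allocs.get(&id).ok_or(Ub("dangling pointer"))?;
        let end = alloc.base + alloc.bytes.len() as u64;
        if ptr.addr < alloc.base || ptr.addr + len > end {
            return Err(Ub("out-of-bounds access"));
        }
        Ok((id, (ptr.addr - alloc.base) as usize))
    }

    fn dereferenceable(&self, ptr: Pointer, len: u64, align: u64) -> Result<(), Ub> {
        if ptr.addr % align != 0 {
            return Err(Ub("misaligned pointer"));
        }
        // Zero-sized accesses need no provenance.
        if len == 0 { Ok(()) } else { self.locate(ptr, len).map(|_| ()) }
    }

    fn load(&mut self, ptr: Pointer, len: u64, align: u64) -> Result<Vec<AbstractByte>, Ub> {
        self.dereferenceable(ptr, len, align)?;
        if len == 0 {
            return Ok(Vec::new());
        }
        let (id, off) = self.locate(ptr, len)?;
        Ok(self.allocs[&id].bytes[off..off + len as usize].to_vec())
    }

    fn store(&mut self, ptr: Pointer, bytes: &[AbstractByte], align: u64) -> Result<(), Ub> {
        self.dereferenceable(ptr, bytes.len() as u64, align)?;
        if bytes.is_empty() {
            return Ok(());
        }
        let (id, off) = self.locate(ptr, bytes.len() as u64)?;
        let alloc = self.allocs.get_mut(&id).expect("`locate` found this allocation");
        alloc.bytes[off..off + bytes.len()].copy_from_slice(bytes);
        Ok(())
    }

    fn deallocate(&mut self, ptr: Pointer, size: u64, align: u64) -> Result<(), Ub> {
        let id = ptr.provenance.ok_or(Ub("pointer without provenance"))?;
        let alloc = self.allocs.get(&id).ok_or(Ub("double free"))?;
        if alloc.base != ptr.addr || alloc.bytes.len() as u64 != size || ptr.addr % align != 0 {
            return Err(Ub("deallocation does not match the allocation"));
        }
        self.allocs.remove(&id);
        Ok(())
    }
}
```

In this sketch, a fresh allocation loads as all `Uninit`, and a store followed by a load of the same range returns the stored bytes. Out-of-bounds accesses, accesses through a pointer without provenance, and accesses after `deallocate` all report `Ub`. Because every check goes through the pointer's provenance, a pointer with the right address but the wrong allocation ID is rejected as well.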
This is a very basic memory interface that is incomplete in at least the following ways:
- We need to add support for casting pointers to integers and back.
- To represent concurrency, many operations need to take a "thread ID" and `load` and `store` need to take an `Option<Ordering>` (with `None` indicating non-atomic accesses).
- To represent Stacked Borrows, there needs to be a "retag" operation, and that one might have to be "lightly typed" (to care about `UnsafeCell`).
- Maybe we want operations that can compare pointers without casting them to integers.
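As a rough illustration of the concurrency item, the sketch below shows what a memory model could do with a thread ID and an `Option<Ordering>` on each access. `ThreadId`, `Access`, and `may_race` are hypothetical names. The check is deliberately conservative: it ignores happens-before relationships, so it over-approximates which access pairs are data races.

```rust
use std::sync::atomic::Ordering;

/// Hypothetical thread identifier that memory operations would take.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ThreadId(u32);

/// Hypothetical record of one access; `ordering: None` marks it as non-atomic.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Access {
    thread: ThreadId,
    addr: u64,
    is_write: bool,
    ordering: Option<Ordering>,
}

/// Conservative check: could these two accesses form a data race?
/// They must come from different threads, touch the same address,
/// include at least one write, and include at least one non-atomic access.
fn may_race(a: Access, b: Access) -> bool {
    a.thread != b.thread
        && a.addr == b.addr
        && (a.is_write || b.is_write)
        && (a.ordering.is_none() || b.ordering.is_none())
}
```

Without the thread ID and the ordering, the memory model could not tell a racing non-atomic write apart from a properly atomic one.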