19.4 Box<T>: Simple Heap Allocation
Box<T> is the most basic smart pointer, providing ownership of data allocated on the heap. Conceptually, Box<T> is a simple struct that holds a raw pointer to heap-allocated data of type T.
- Creation: Box::new(value) allocates memory on the heap, moves value into that memory, and returns a Box<T> instance (which itself usually lives on the stack or in another structure).
- Ownership: The Box<T> exclusively owns the heap-allocated data. Only one Box<T> points to a given allocation at a time (though ownership can be transferred via moves).
- Deallocation: When the Box<T> goes out of scope, its Drop implementation is called, which drops the contained value T and deallocates the heap memory.
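The three steps above can be traced with a small sketch. The Noisy type, its shared log, and the consume function are hypothetical helpers introduced only for illustration; they record when each boxed value is dropped so that the order of events becomes visible.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical helper type: records its own drop in a shared log.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

// Takes ownership of the Box. The boxed value is dropped and its heap
// memory freed when `b` goes out of scope at the end of this function.
fn consume(b: Box<Noisy>) -> &'static str {
    b.name
}

fn demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        // Creation: the Noisy value is moved into a fresh heap allocation.
        let first = Box::new(Noisy { name: "first", log: Rc::clone(&log) });

        // Ownership transfer: only the pointer moves; `first` is unusable afterwards.
        let second = first;

        // Ownership moves again, into `consume`, which drops the Box on return.
        let name = consume(second);
        log.borrow_mut().push(format!("consume returned {name}"));

        let _third = Box::new(Noisy { name: "third", log: Rc::clone(&log) });
        log.borrow_mut().push("end of scope".to_string());
    } // Deallocation: `_third` is dropped here.

    let entries = log.borrow().clone();
    entries
}

fn main() {
    for entry in demo() {
        println!("{entry}");
    }
}
```

The log shows that the first allocation is released inside consume, before control returns to the caller, while the third one lives until the end of its enclosing block.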
19.4.1 Key Features of Box<T>
- Exclusive Ownership: Ensures only one owner exists, aligning with Rust’s default ownership rules but for heap data.
- Heap Allocation: The primary way to explicitly put data on the heap in Rust.
- Known Size Pointer: A Box<T> has a size known at compile time, regardless of the size of T. For sized T it is exactly one pointer wide; for dynamically sized types such as dyn Trait or slices it is a fat pointer (the data pointer plus a vtable pointer or a length). This is crucial for types whose size isn’t known at compile time (like trait objects) or for recursive types.
- Indirection: Provides a level of pointer indirection to access the data.
- Deref and DerefMut: Implements these traits. Deref allows a Box<T> to be treated like &T (e.g., using * for immutable access, passing &my_box where a &T is expected via deref coercion, or calling methods through automatic dereferencing: my_box.some_method()). If the Box<T> binding itself is mutable (let mut my_box), DerefMut allows treating it like &mut T, enabling mutation of the heap-allocated value (e.g., *my_box = new_value; or my_box.some_mut_method()). It’s worth noting that the single mut keyword on the binding enables both the standard reassignment of the Box itself (my_box = Box::new(...)) and, thanks to DerefMut, the mutation of the value it points to.
- Minimal Overhead: Because Box<T> is essentially just a wrapper around a raw pointer, accessing the data via Box<T> involves the same level of indirection as a C pointer. There’s no additional overhead for the pointer access itself compared to a raw pointer, beyond the initial heap allocation cost.
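A short sketch of the pointer-size and Deref/DerefMut points above. The functions double and bump are hypothetical helpers. The two-word size printed for Box<dyn Debug> and Box<[u8]> reflects how current compilers lay out fat pointers and should be read as an observation rather than a documented guarantee.

```rust
use std::fmt::Debug;
use std::mem::size_of;

// Takes a plain reference; a &Box<i32> coerces to &i32 through Deref.
fn double(x: &i32) -> i32 {
    *x * 2
}

// Takes a mutable reference; a &mut Box<i32> coerces to &mut i32 through DerefMut.
fn bump(x: &mut i32) {
    *x += 1;
}

fn demo() -> (i32, i32) {
    let mut b: Box<i32> = Box::new(20);
    let doubled = double(&b); // deref coercion: &Box<i32> -> &i32
    bump(&mut b); // deref coercion: &mut Box<i32> -> &mut i32
    *b += 1; // explicit dereference, needs the `mut` binding
    (doubled, *b)
}

fn main() {
    // For a sized T, the Box is one pointer wide no matter how large T is.
    println!("usize:              {} bytes", size_of::<usize>());
    println!("Box<[u8; 4096]>:    {} bytes", size_of::<Box<[u8; 4096]>>());

    // For dynamically sized types the Box is a fat pointer.
    println!("Box<dyn Debug>:     {} bytes", size_of::<Box<dyn Debug>>());
    println!("Box<[u8]>:          {} bytes", size_of::<Box<[u8]>>());

    let (doubled, final_value) = demo();
    println!("doubled = {doubled}, final value = {final_value}");
}
```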
19.4.2 Use Cases and Trade-Offs
Common Use Cases:
- Recursive Data Structures: To define types that need to contain pointers to themselves (e.g., nodes in a list or tree), Box<T> breaks the infinite size calculation at compile time by providing indirection with a known pointer size.

      enum List {
          Cons(i32, Box<List>),
          Nil,
      }

- Trait Objects: To store an object implementing a specific trait when the concrete type isn’t known at compile time (dyn Trait). Box<dyn Trait> provides the necessary indirection and owns the unknown-sized object on the heap.
- Transferring Large Data: Moving a Box<T> is efficient because it only involves copying the pointer itself (which is small and typically resides on the stack or in a register), not the potentially large data structure located on the heap. This is typically cheaper than moving the entire data structure by value, which may require copying all of its bytes.
- Explicit Heap Placement: To avoid placing large data structures on the stack, preventing potential stack overflows, especially in constrained environments or deep recursion.
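The first two use cases can be sketched together. The List enum is the one shown above; the sum function, the Shape trait, and the Square and Rect types are hypothetical names added for illustration.

```rust
// Recursive type: Box gives the Cons variant a fixed, known size.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

// Walks the list recursively; `rest` is a &Box<List> that coerces to &List.
fn sum(list: &List) -> i32 {
    match list {
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

// Trait objects: each Box owns a value whose concrete type is erased.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Builds a heterogeneous collection; each element is coerced to Box<dyn Shape>.
fn sample_shapes() -> Vec<Box<dyn Shape>> {
    vec![Box::new(Square(2.0)), Box::new(Rect(1.0, 3.0))]
}

// Calls `area` through dynamic dispatch on every element.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("list sum = {}", sum(&list));
    println!("total area = {}", total_area(&sample_shapes()));
}
```

Without the Box in Cons, the compiler would reject List because its size would depend on itself; with it, each node stores one i32 and one pointer.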
Trade-Offs:
- Indirection Cost: Accessing heap data via a pointer involves an extra memory lookup compared to direct stack access, potentially leading to cache misses and a small performance penalty.
- Allocation Cost: Heap allocation and deallocation operations are generally slower than stack allocation.
Example:
    fn main() {
        let stack_val = 5; // On the stack

        // Allocate an integer on the heap, owned by a mutable Box binding
        let mut boxed_val: Box<i32> = Box::new(stack_val);

        // The 'mut' allows the binding itself to be reassigned:
        // boxed_val = Box::new(7); // This would drop the original Box(5) and point to a new Box(7).

        // Access the value using immutable dereferencing
        println!("Initial value on heap: {}", *boxed_val); // Output: 5

        // Mutate the value on the heap via mutable dereferencing
        // This requires `boxed_val` to be declared with `let mut`
        *boxed_val += 10;
        println!("Mutated value on heap: {}", *boxed_val); // Output: 15

        // You can still work with the mutated value directly
        let added_val = *boxed_val + 10;
        println!("Heap value + 10: {}", added_val); // Output: 25

        // Methods defined on i32 can be called directly on the Box<i32>:
        // method-call syntax automatically dereferences the Box to find them.
        println!("Absolute value on heap: {}", boxed_val.abs()); // Output: 15
        // Note: i32::abs takes `self` by value. Because i32 is Copy, the value is
        // copied out of the Box, so no explicit deref is needed. The same holds
        // for other by-value methods, e.g. boxed_val.checked_add(10).

        // `boxed_val` goes out of scope here. Its Drop implementation runs,
        // freeing the heap memory.
    }
Note: For specific advanced scenarios, particularly involving async code or FFI where data must not be moved in memory after allocation, Pin<Box<T>> is used. This provides guarantees about memory location stability.
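As a minimal sketch of the note above, Box::pin allocates a value on the heap and returns a Pin<Box<T>>. The Anchored type and the pass_through function are hypothetical; PhantomPinned is used here only to opt the type out of Unpin, so that safe code cannot move the value out of its allocation once it is pinned.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical type that opts out of Unpin via the PhantomPinned marker.
struct Anchored {
    value: u32,
    _pin: PhantomPinned,
}

// Moving the Pin<Box<_>> handle moves only the pointer, not the heap data.
fn pass_through(p: Pin<Box<Anchored>>) -> Pin<Box<Anchored>> {
    p
}

fn demo() -> (bool, u32) {
    let pinned: Pin<Box<Anchored>> = Box::pin(Anchored {
        value: 7,
        _pin: PhantomPinned,
    });

    // Record the heap address of the pinned value.
    let before = &*pinned as *const Anchored;

    // Hand the handle to another function and get it back.
    let pinned = pass_through(pinned);
    let after = &*pinned as *const Anchored;

    // Safe code cannot obtain a `&mut Anchored` from `pinned`, so it
    // cannot swap or move the value out of its allocation.
    (before == after, pinned.value)
}

fn main() {
    let (same_address, value) = demo();
    println!("same address: {same_address}, value: {value}");
}
```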