19.4 Box<T>: Simple Heap Allocation
`Box<T>` is the most basic smart pointer, providing ownership of data allocated on the heap. Conceptually, `Box<T>` is a simple struct that holds a raw pointer to heap-allocated data of type `T`.
- Creation: `Box::new(value)` allocates memory on the heap, moves `value` into that memory, and returns a `Box<T>` instance (which itself usually lives on the stack or in another structure).
- Ownership: The `Box<T>` exclusively owns the heap-allocated data. Only one `Box<T>` points to a given allocation at a time (though ownership can be transferred via moves).
- Deallocation: When the `Box<T>` goes out of scope, its `Drop` implementation runs: it drops the contained value of type `T` and then frees the heap memory.
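The three steps above can be sketched in a few lines. `Noisy` and `consume` are hypothetical names introduced only for this illustration; `Noisy` prints a message from its `Drop` implementation so that the point of deallocation becomes visible.

```rust
// Hypothetical type whose only job is to announce when it is dropped.
struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
    }
}

// Takes ownership of the box. The heap value is dropped when `b`
// goes out of scope at the end of this function.
fn consume(b: Box<Noisy>) -> &'static str {
    b.0 // `&'static str` is Copy, so the name is copied out of the heap value
}

fn main() {
    let a = Box::new(Noisy("first")); // Creation: allocate on the heap
    let b = a;                        // Ownership: moved, `a` is no longer usable
    // println!("{}", a.0);           // would not compile: use of moved value

    let name = consume(b);            // Deallocation happens inside `consume`
    println!("consumed {}", name);

    let _c = Box::new(Noisy("second")); // dropped when `main` ends
}
```

Running this should print `dropping first` before `consumed first`, because the box passed to `consume` is released as soon as that function returns.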
19.4.1 Key Features of Box<T>
- Exclusive Ownership: Ensures only one owner exists, aligning with Rust’s default ownership rules but for heap data.
- Heap Allocation: The primary way to explicitly put data on the heap in Rust.
- Known Size Pointer: For a sized `T`, a `Box<T>` is always exactly one pointer wide, regardless of the size of `T`. For dynamically sized types such as trait objects and slices, it is a fat pointer (a data pointer plus a vtable pointer or a length), which still has a fixed size known at compile time. This is what makes `Box` usable for recursive types and for values whose size isn't known at compile time.
- Indirection: Provides a level of pointer indirection to access the data.
- `Deref` and `DerefMut`: `Box<T>` implements both traits. `Deref` allows a `Box<T>` to be treated like `&T` (e.g., reading through `*my_box`, or calling methods directly as `my_box.some_method()` thanks to automatic dereferencing). If the `Box<T>` binding itself is mutable (`let mut my_box`), `DerefMut` allows treating it like `&mut T`, enabling mutation of the heap-allocated value (e.g., `*my_box = new_value;` or `my_box.some_mut_method()`). It's worth noting that the single `mut` keyword on the binding enables both reassignment of the `Box` itself (`my_box = Box::new(...)`) and, thanks to `DerefMut`, mutation of the value it points to.
- Minimal Overhead: Because `Box<T>` is essentially just a wrapper around a raw pointer, accessing the data via `Box<T>` involves the same level of indirection as a C pointer. Beyond the initial heap allocation cost, there is no additional overhead for the access itself compared to a raw pointer.
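As a rough check of the size and dereferencing claims above, the following sketch measures a few `Box` types and passes a box where a reference is expected. The helper names `box_sizes`, `total`, and `push_next` are made up for this example, and the two-word size of `Box<dyn Debug>` reflects how current compilers lay out trait-object pointers rather than a documented guarantee.

```rust
use std::fmt::Debug;
use std::mem::size_of;

// Sizes of: a box of a tiny type, a box of a large type, a boxed trait object.
fn box_sizes() -> (usize, usize, usize) {
    (
        size_of::<Box<u8>>(),
        size_of::<Box<[u8; 4096]>>(),
        size_of::<Box<dyn Debug>>(),
    )
}

// Expects a slice; a `&Box<Vec<i32>>` coerces to it via `Deref`.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Expects `&mut Vec<i32>`; a `&mut Box<Vec<i32>>` coerces to it via `DerefMut`.
fn push_next(values: &mut Vec<i32>) {
    let next = values.len() as i32 + 1;
    values.push(next);
}

fn main() {
    let (small, large, dynamic) = box_sizes();
    println!("pointer width:      {}", size_of::<usize>());
    println!("Box<u8>:            {}", small);
    println!("Box<[u8; 4096]>:    {}", large);
    println!("Box<dyn Debug>:     {}", dynamic);

    let mut boxed: Box<Vec<i32>> = Box::new(vec![1, 2, 3]);
    println!("total = {}", total(&boxed)); // deref coercion to &[i32]
    push_next(&mut boxed);                 // deref coercion to &mut Vec<i32>
    boxed.push(5);                         // method call through the box
    *boxed = vec![10, 20];                 // replace the heap value in place
    println!("total = {}", total(&boxed));
}
```

Note that `let mut boxed` is what permits `push_next(&mut boxed)`, `boxed.push(5)`, and `*boxed = ...`; without `mut`, only the read-only calls compile.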
19.4.2 Use Cases and Trade-Offs
Common Use Cases:
- Recursive Data Structures: To define types that need to contain pointers to themselves (e.g., nodes in a list or tree), `Box<T>` breaks the infinite size calculation at compile time by providing indirection with a known pointer size:

      enum List {
          Cons(i32, Box<List>),
          Nil,
      }

- Trait Objects: To store an object implementing a specific trait when the concrete type isn't known at compile time (`dyn Trait`). `Box<dyn Trait>` provides the necessary indirection and owns the unknown-sized object on the heap.
- Transferring Large Data: Moving a `Box<T>` is efficient because it only involves copying the pointer itself (which is small and typically resides on the stack or in a register), not the potentially large data structure located on the heap. This can be much cheaper than moving the same structure by value, which may copy all of its bytes.
- Explicit Heap Placement: To avoid placing large data structures on the stack, preventing potential stack overflows, especially in constrained environments or deep recursion.
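The first two use cases can be sketched together. The `List` enum is the one from the text; the `sum` function, the `Shape` trait, and the `Square` and `Rect` types are hypothetical additions for illustration only.

```rust
// Recursive type: the Box gives `Cons` a fixed, known size.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

// Walks the list recursively; `rest` is a `&Box<List>` that coerces to `&List`.
fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => *value + sum(rest),
        List::Nil => 0,
    }
}

// Hypothetical trait with two implementors of different sizes.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Each element is a boxed trait object, so the Vec can mix concrete types.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("sum = {}", sum(&list));

    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Square(2.0)),
        Box::new(Rect(1.0, 3.0)),
    ];
    println!("total area = {}", total_area(&shapes));
}
```

The explicit `Vec<Box<dyn Shape>>` annotation matters here: without it, the compiler would infer `Vec<Box<Square>>` from the first element and reject the second.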
Trade-Offs:
- Indirection Cost: Accessing heap data via a pointer involves an extra memory lookup compared to direct stack access, potentially leading to cache misses and a small performance penalty.
- Allocation Cost: Heap allocation and deallocation operations are generally slower than stack allocation.
Example:
    fn main() {
        let stack_val = 5; // On the stack

        // Allocate an integer on the heap, owned by a mutable Box binding
        let mut boxed_val: Box<i32> = Box::new(stack_val);

        // The 'mut' allows the binding itself to be reassigned:
        // boxed_val = Box::new(7);
        // This would drop the original Box(5) and point to a new Box(7).

        // Access the value using immutable dereferencing
        println!("Initial value on heap: {}", *boxed_val); // Output: 5

        // Mutate the value on the heap via mutable dereferencing
        // This requires `boxed_val` to be declared with `let mut`
        *boxed_val += 10;
        println!("Mutated value on heap: {}", *boxed_val); // Output: 15

        // You can still work with the mutated value directly
        let added_val = *boxed_val + 10;
        println!("Heap value + 10: {}", added_val); // Output: 25

        // Methods defined on i32 can be called directly on the Box<i32>:
        // method-call syntax dereferences the box automatically.
        println!("Absolute value on heap: {}", boxed_val.abs()); // Output: 15
        // Note: `abs` (like `checked_add`) takes `self` by value. That works
        // through the box because i32 is Copy: the value is copied out of
        // the heap for the call, so no explicit `*boxed_val` is needed.

        // `boxed_val` goes out of scope here. Its Drop implementation runs,
        // freeing the heap memory.
    }
Note: For specific advanced scenarios, particularly involving `async` code or FFI where data must not be moved in memory after allocation, `Pin<Box<T>>` is used. For types that do not implement `Unpin`, it guarantees that the value stays at its heap location until it is dropped.
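A minimal sketch of creating a pinned box with `Box::pin` follows. The helpers `heap_addr` and `address_survives_move` are hypothetical names for this example. Because `u64` implements `Unpin`, pinning adds no real restriction here; the example only illustrates that moving the `Pin<Box<T>>` handle leaves the heap value where it is, which is the property that self-referential types such as compiler-generated futures rely on.

```rust
use std::pin::Pin;

// Address of the heap value behind the pinned box (for illustration only).
fn heap_addr(p: &Pin<Box<u64>>) -> usize {
    &**p as *const u64 as usize
}

// Moves the pinned handle and reports whether the heap address stayed the same.
fn address_survives_move() -> bool {
    let pinned: Pin<Box<u64>> = Box::pin(42);
    let before = heap_addr(&pinned);

    let moved = pinned; // only the pointer-sized handle moves
    let after = heap_addr(&moved);

    before == after
}

fn main() {
    println!("address stable across move: {}", address_survives_move());
}
```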