19.4 Box<T>: Simple Heap Allocation

Box<T> is the most basic smart pointer, providing ownership of data allocated on the heap. Conceptually, Box<T> is a simple struct that holds a raw pointer to heap-allocated data of type T.

  • Creation: Box::new(value) allocates memory on the heap, moves value into that memory, and returns a Box<T> instance (which itself usually lives on the stack or in another structure).
  • Ownership: The Box<T> exclusively owns the heap-allocated data. Only one Box<T> points to a given allocation at a time (though ownership can be transferred via moves).
  • Deallocation: When the Box<T> goes out of scope, its Drop implementation runs. It first drops the contained value T and then deallocates the heap memory.
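
The lifecycle described above can be sketched as follows. Tracked and drop_counts are hypothetical names introduced only for this illustration; the counter simply makes the moment of the drop observable.

```rust
use std::cell::Cell;

// Hypothetical helper type: increments a shared counter when it is dropped.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the drop count observed after a move and after the owner leaves scope.
fn drop_counts() -> (u32, u32) {
    let drops = Cell::new(0);
    let after_move;
    {
        // Creation: the value is moved into a fresh heap allocation.
        let a = Box::new(Tracked { drops: &drops });
        // Ownership transfer: `a` can no longer be used after this move.
        let _b = a;
        // A move copies only the pointer; nothing has been dropped yet.
        after_move = drops.get();
    } // `_b` goes out of scope: Tracked is dropped, then the memory is freed.
    (after_move, drops.get())
}

fn main() {
    let (after_move, after_scope) = drop_counts();
    println!("drops after move: {}", after_move);
    println!("drops after scope end: {}", after_scope);
}
```

Moving the box transfers ownership without touching the heap value, so the counter only changes once the final owner goes out of scope.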

19.4.1 Key Features of Box<T>

  1. Exclusive Ownership: Ensures only one owner exists, aligning with Rust’s default ownership rules but for heap data.
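  2. Heap Allocation: The most direct way to explicitly place a single value on the heap in Rust (collections such as Vec<T> and String also allocate on the heap, but manage their own buffers).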
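  3. Known Size Pointer: A Box<T> has a fixed size known at compile time, regardless of the size of T. For a sized T it is exactly one pointer wide; for dynamically sized types such as dyn Trait or [T] it is a "fat" pointer, carrying the data pointer plus a vtable pointer or a length. This is crucial for types whose size isn’t known at compile time (like trait objects) or for recursive types.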
  4. Indirection: Provides a level of pointer indirection to access the data.
  5. Deref and DerefMut: Implements both traits. Deref allows a Box<T> to be treated like &T (e.g., reading through *my_box, or calling methods of T directly as my_box.some_method(), because method calls auto-dereference the receiver and a &Box<T> coerces to &T). If the Box<T> binding itself is mutable (let mut my_box), DerefMut allows treating it like &mut T, enabling mutation of the heap-allocated value (e.g., *my_box = new_value; or my_box.some_mut_method()). Note that the single mut keyword on the binding enables both reassignment of the Box itself (my_box = Box::new(...)) and, thanks to DerefMut, mutation of the value it points to.
  6. Minimal Overhead: Because Box<T> is essentially just a wrapper around a raw pointer, accessing the data via Box<T> involves the same level of indirection as a C pointer. There’s no additional overhead for the pointer access itself compared to a raw pointer, beyond the initial heap allocation cost.
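
A minimal sketch of the size and dereferencing properties listed above. The Shape trait and the functions box_sizes and bump are hypothetical names used only here. The two-word sizes for Box<dyn Shape> and Box<[u8]> reflect how current compilers lay out fat pointers and should be read as an observation rather than a documented guarantee.

```rust
use std::mem::size_of;

// Hypothetical trait, used only to form a trait object type.
trait Shape {}

// Sizes of three boxes: a large sized type, a trait object, and a slice.
fn box_sizes() -> (usize, usize, usize) {
    (
        size_of::<Box<[u8; 4096]>>(), // one pointer, although T is 4 KiB
        size_of::<Box<dyn Shape>>(),  // data pointer + vtable pointer
        size_of::<Box<[u8]>>(),       // data pointer + length
    )
}

// Mutates the heap value through DerefMut, then calls a u32 method directly.
fn bump(mut counter: Box<u32>) -> u32 {
    *counter += 1; // DerefMut: write through the box
    counter.pow(2) // auto-deref: `pow` is defined on u32, not on Box
}

fn main() {
    let (thin, trait_object, slice) = box_sizes();
    println!("Box<[u8; 4096]>: {} bytes", thin);
    println!("Box<dyn Shape>:  {} bytes", trait_object);
    println!("Box<[u8]>:       {} bytes", slice);
    println!("bump(Box::new(2)) = {}", bump(Box::new(2)));
}
```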

19.4.2 Use Cases and Trade-Offs

Common Use Cases:

  1. Recursive Data Structures: To define types that need to contain pointers to themselves (e.g., nodes in a list or tree), Box<T> breaks the infinite size calculation at compile time by providing indirection with a known pointer size.
    enum List {
        Cons(i32, Box<List>),
        Nil,
    }
  2. Trait Objects: To store an object implementing a specific trait when the concrete type isn’t known at compile time (dyn Trait). Box<dyn Trait> provides the necessary indirection and owns the unknown-sized object on the heap.
  3. Transferring Large Data: Moving a Box<T> is cheap because it only copies the pointer itself (which is small and typically resides on the stack or in a register), not the potentially large data structure located on the heap. Moving a large value stored inline can instead require copying all of its bytes, unless the compiler manages to optimize the copy away.
  4. Explicit Heap Placement: To avoid placing large data structures on the stack, preventing potential stack overflows, especially in constrained environments or deep recursion.
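
The first two use cases can be sketched together as follows. The Shape trait, the Square and Rect types, and the functions total_area and sum are hypothetical and exist only for this illustration; List mirrors the enum shown in the list above.

```rust
// Hypothetical trait and implementors for the trait-object use case.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Each element may have a different concrete type behind the box.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

// Recursive type: the Box gives the Cons variant a known size.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

// Walks the list recursively; `rest` is a &Box<List> that coerces to &List.
fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => value + sum(rest),
        List::Nil => 0,
    }
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    println!("total area: {}", total_area(&shapes));

    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("list sum: {}", sum(&list));
}
```

Without the Box in Cons, the compiler would reject List as having infinite size. Without the Box around dyn Shape, the vector could not hold elements of differing concrete types.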

Trade-Offs:

  • Indirection Cost: Accessing heap data via a pointer involves an extra memory lookup compared to direct stack access, potentially leading to cache misses and a small performance penalty.
  • Allocation Cost: Heap allocation and deallocation operations are generally slower than stack allocation.

Example:

fn main() {
    let stack_val = 5; // On the stack

    // Allocate an integer on the heap, owned by a mutable Box binding
    let mut boxed_val: Box<i32> = Box::new(stack_val);

    // The 'mut' allows the binding itself to be reassigned:
    // boxed_val = Box::new(7);
    // This would drop the original Box(5) and point to a new Box(7).

    // Access the value using immutable dereferencing
    println!("Initial value on heap: {}", *boxed_val); // Output: 5

    // Mutate the value on the heap via mutable dereferencing
    // This requires `boxed_val` to be declared with `let mut`
    *boxed_val += 10;
    println!("Mutated value on heap: {}", *boxed_val); // Output: 15

    // You can still work with the mutated value directly
    let added_val = *boxed_val + 10;
    println!("Heap value + 10: {}", added_val); // Output: 25

    // Methods defined on i32 can be called directly on the Box<i32>,
    // because method calls auto-dereference the receiver.
    println!("Absolute value on heap: {}", boxed_val.abs()); // Output: 15
    // Note: abs() and checked_add() take `self` by value. This still works
    // through the Box without an explicit deref because i32 is Copy:
    // boxed_val.checked_add(10) yields Some(25).

    // `boxed_val` goes out of scope here. Its Drop implementation runs,
    // freeing the heap memory.
}

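Note: For specific advanced scenarios where data must not be moved in memory after allocation, notably self-referential futures in async code and some FFI patterns, Pin<Box<T>> (typically created with Box::pin) is used. For types that do not implement Unpin, it guarantees that the value stays at the same memory location until it is dropped.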
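
A minimal sketch of pinning a heap value, assuming a hypothetical Anchored type that opts out of Unpin via PhantomPinned. It only demonstrates that moving the Pin<Box<T>> handle leaves the heap address unchanged, which is also true of a plain Box<T>. What Pin adds is that safe code can no longer move the value out of its allocation.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical type that must not move once pinned (it is not Unpin).
struct Anchored {
    data: u32,
    _pin: PhantomPinned,
}

// Returns the heap address of the value before and after moving the handle.
fn pinned_addresses() -> (usize, usize) {
    let pinned: Pin<Box<Anchored>> = Box::pin(Anchored {
        data: 7,
        _pin: PhantomPinned,
    });
    let before = &*pinned as *const Anchored as usize;

    // Moving the Pin<Box<_>> moves only the pointer, not the heap value.
    let moved = pinned;
    let after = &*moved as *const Anchored as usize;

    // Shared access through the pin still works as usual.
    assert_eq!(moved.data, 7);
    (before, after)
}

fn main() {
    let (before, after) = pinned_addresses();
    println!("address unchanged: {}", before == after);
}
```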