Generics in Rust - Rust for C-Programmers

11.2 Generics: Abstracting Over Types

Generics allow you to write code parameterized by types. This means you can define functions, structs, enums, and methods that operate on values of various types without knowing the concrete type beforehand, while still benefiting from Rust’s compile-time type checking. This contrasts sharply with C’s approaches like macros (which lack type safety) or void* pointers (which require unsafe casting and manual type management).

Generic items use abstract type parameters (like T, U, etc.) as placeholders for concrete types. These parameters are declared inside angle brackets (<>) immediately following the name of the function, struct, enum, or impl block.

Key Points

Type Parameters: Declared within angle brackets (<>), commonly using single uppercase letters like T, U, V. These act as placeholders for concrete types.
Monomorphization: Rust compiles generic code into specialized versions for each concrete type used, resulting in efficient machine code equivalent to manually written specialized code (a “zero-cost abstraction”).
Flexibility and Reuse: Write algorithms and data structures once and apply them to many types. The compiler guarantees, through type checking and trait bounds, that the generic code is used correctly with the specific types provided at each call site.

11.2.1 Generic Functions

Functions can use generic type parameters for their arguments and return values. You declare these type parameters in angle brackets (<>) right after the function name. Optionally, you can restrict which types are allowed by specifying trait bounds using the colon (:) syntax after the type parameter name.

Once declared, you can use the type parameter (T in the examples below) within the function signature and body just like any concrete type name – for parameter types, return types, and even type annotations of local variables.

// Declares a generic type parameter 'T'. 'T' can be any type.
// 'T' is used as both the parameter type and the return type.
fn identity<T>(value: T) -> T {
    value
}

// Declares 'T' but restricts it: T must implement the 'PartialOrd' trait
// (which provides comparison operators like >).
// 'T' is used for both parameters and the return type.
fn max<T: PartialOrd>(a: T, b: T) -> T {
    if a > b {
        a
    } else {
        b
    }
}

fn main() {
    // When calling a generic function, the compiler usually infers the concrete
    // type for 'T' based on the arguments.
    let five = identity(5);      // Compiler infers T = i32
    let hello = identity("hello"); // Compiler infers T = &str

    println!("Max of 10, 20 is {}", max(10, 20)); // T = i32 satisfies PartialOrd
    println!("Max of 3.14, 1.61 is {}", max(3.14, 1.61)); // T = f64 satisfies PartialOrd

    // Why wouldn't max(10, 3.14) work?
    // let invalid_max = max(10, 3.14); // Compile-time error!
}

The call max(10, 3.14) would fail to compile for two primary reasons:

Single Generic Type Parameter T: The function signature fn max<T: PartialOrd>(a: T, b: T) -> T uses only one generic type parameter T. This requires both input arguments a and b to be of the exact same concrete type at the call site. In max(10, 3.14), the first argument 10 is inferred as i32 (or some integer type), while 3.14 is inferred as f64. Since i32 and f64 are different types, they cannot both substitute for the single parameter T.
PartialOrd Trait Bound: The PartialOrd trait bound (T: PartialOrd) enables the > comparison. The standard library implementation of PartialOrd for primitive types like i32 and f64 only defines comparison between values of the same type (e.g., i32 vs i32, or f64 vs f64). There is no built-in implementation to compare an i32 directly with an f64 using >. Even if the function were generic over two types (<T, U>), comparing T and U would require a specific trait implementation allowing such a cross-type comparison, which PartialOrd does not provide out-of-the-box.

11.2.2 Generic Structs and Enums

Structs and enums can also be defined with generic type parameters declared after their name. These parameters can then be used as types for fields within the definition.

// A generic Pair struct holding two values, possibly of different types T and U.
// T and U are used as the types for the fields 'first' and 'second'.
struct Pair<T, U> {
    first: T,
    second: U,
}

// The standard library Option enum is generic over the contained type T.
enum Option<T> {
    Some(T), // The Some variant holds a value of type T
    None,
}

// The standard library Result enum is generic over the success type T and error type E
enum Result<T, E> {
    Ok(T),   // Ok holds a value of type T
    Err(E),  // Err holds a value of type E
}

fn main() {
    // Instantiate generic types by providing concrete types.
    // Often, the compiler can infer the types from the values provided.
    let integer_pair = Pair { first: 5, second: 10 }; // Inferred T=i32, U=i32
    let mixed_pair = Pair { first: "hello", second: true }; // Inferred T=&str, U=bool

    // Explicitly specifying types using the 'turbofish' syntax ::<>
    let specific_pair = Pair::<u8, f32> { first: 255, second: 3.14 };

    // Alternatively, using type annotation on the variable binding
    let another_pair: Pair<i64, &str> = Pair { first: 1_000_000, second: "world" };

    println!("Integer Pair: ({}, {})", integer_pair.first, integer_pair.second);
    println!("Mixed Pair: ({}, {})", mixed_pair.first, mixed_pair.second);
    println!("Specific Pair: ({}, {})", specific_pair.first, specific_pair.second);
    println!("Another Pair: ({}, {})", another_pair.first, another_pair.second);
}

As shown in the main function, while Rust can often infer the concrete types for T and U when you create an instance of Pair, you can also specify them explicitly. This is done using the ::<> syntax (often called “turbofish”) immediately after the struct name (Pair::<u8, f32>) or by adding a type annotation to the variable declaration (let another_pair: Pair<i64, &str> = ...). Explicit annotation is necessary when inference is ambiguous or when you want to ensure a specific type is used (e.g., using u8 instead of the default i32 for an integer literal).

Standard library collections like Vec<T> (vector of T) and HashMap<K, V> (map from key K to value V) are prominent examples of generic types, providing type-safe containers.

11.2.3 Generic Methods

Methods can be defined on generic structs or enums using an impl block. When implementing methods for a generic type, you typically need to declare the same generic parameters on the impl keyword as were used on the type definition.

Consider the syntax impl<T, U> Pair<T, U> { ... }:

The first <T, U> after impl declares generic parameters T and U scope for this implementation block. This signifies that the implementation itself is generic.
The second <T, U> after Pair specifies that this block implements methods for the Pair type when it is parameterized by these same types T and U.

For implementing methods directly on the generic type (like Pair<T, U>), these parameter lists usually match. Methods within the impl block can then use T and U. Furthermore, methods themselves can introduce additional generic parameters specific to that method, if needed, which would be declared after the method name.

struct Pair<T, U> {
    first: T,
    second: U,
}

// The impl block is generic over T and U, matching the struct definition.
impl<T, U> Pair<T, U> {
    // This method uses the struct's generic types T and U.
    // It consumes the Pair<T, U> and returns a new Pair<U, T>.
    fn swap(self) -> Pair<U, T> {
        Pair {
            first: self.second, // Accessing fields of type U and T
            second: self.first,
        }
    }

    // Example of a method introducing its own generic parameter V
    // We add a trait bound 'Display' to ensure 'description' can be printed.
    fn describe<V: std::fmt::Display>(&self, description: V) {
        // Here, V is specific to this method, T and U come from the struct.
        println!("{}", description);
        // Cannot directly print self.first or self.second unless T/U implement Display
    }
}

fn main() {
    let pair = Pair { first: 5, second: 3.14 }; // Pair<i32, f64>
    let swapped_pair = pair.swap(); // Becomes Pair<f64, i32>
    println!("Swapped: ({}, {})", swapped_pair.first, swapped_pair.second);

    // Call describe; the type for V is inferred as &str which implements Display
    swapped_pair.describe("This is the swapped pair.");
}

11.2.4 Trait Bounds on Generics

Often, generic code needs to ensure that a type parameter T has certain capabilities (methods provided by traits). This is done using trait bounds, specified after a colon (:) when declaring the type parameter.

To require that a type implements multiple traits, you can use the + syntax. For example, T: Display + PartialOrd means T must implement both Display and PartialOrd.

use std::fmt::Display;

// Requires T to implement the Display trait so it can be printed with {}.
fn print_item<T: Display>(item: T) {
    println!("Item: {}", item);
}

// Requires T to implement both Display and PartialOrd using the '+' syntax.
fn compare_and_print<T: Display + PartialOrd>(a: T, b: T) {
    if a > b {
        println!("{} > {}", a, b);
    } else {
        println!("{} <= {}", a, b);
    }
}

fn main() {
    print_item(123); // Works because i32 implements Display
    compare_and_print(5, 3); // Works because i32 implements Display and PartialOrd
}

When trait bounds become numerous or complex, listing them inline can make function signatures hard to read. In these cases, you can use a where clause after the function signature to list the bounds separately, improving readability.

use std::fmt::Display;
struct Pair<T, U> { first: T, second: U }
// Assume Pair implements Display if T and U do (implementation not shown)
impl<T: Display, U: Display> Pair<T, U> { fn display(&self) { println!("({}, {})", self.first, self.second); } }

// Using a 'where' clause for clarity with multiple types and bounds.
fn process_items<T, U>(item1: T, item2: U)
where // 'where' starts the clause listing bounds
    T: Display + Clone, // Bounds for T
    U: Display + Copy,  // Bounds for U
{
    let item1_clone = item1.clone(); // Possible because T: Clone
    let item2_copied = item2; // Possible because U: Copy (implicit copy)
    println!("Item 1 (cloned): {}, Item 2 (copied): {}", item1_clone, item2_copied);
    // Original item1 is still available due to clone
    println!("Original Item 1: {}", item1);
}

fn main() {
    process_items(String::from("test"), 42); // String: Display+Clone, i32: Display+Copy
}

11.2.5 Const Generics

Rust also supports const generics, allowing generic parameters to be constant values (like integers, bools, or chars), most commonly used for array sizes. These are declared using const NAME: type within the angle brackets.

// Generic struct parameterized by type T and a constant N of type usize.
struct FixedArray<T, const N: usize> {
    data: [T; N], // Use N as the array size
}

// Implementation block requires T: Copy to initialize the array easily
impl<T: Copy, const N: usize> FixedArray<T, N> {
    // Constructor taking an initial value
    fn new(value: T) -> Self {
        // Creates an array [value, value, ..., value] of size N
        FixedArray { data: [value; N] }
    }
}

fn main() {
    // Create an array of 5 i32s, initialized to 0.
    // N is specified as 5. T is inferred as i32.
    let arr5: FixedArray<i32, 5> = FixedArray::new(0);

    // Create an array of 10 bools, initialized to true.
    // N is 10. T is inferred as bool.
    let arr10: FixedArray<bool, 10> = FixedArray::new(true);

    println!("Length of arr5: {}", arr5.data.len()); // Output: 5
    println!("Length of arr10: {}", arr10.data.len()); // Output: 10
}

Const generics allow encoding invariants like array sizes directly into the type system, enabling more compile-time checks.

11.2.6 Generics and Performance: Monomorphization

Rust implements generics using monomorphization. During compilation, the compiler generates specialized versions of the generic code for each concrete type used.

// Generic function
fn print<T: std::fmt::Display>(value: T) { println!("{}", value); }

fn main() {
    print(5);   // Compiler generates specialized code for T = i32
    print("hi"); // Compiler generates specialized code for T = &str
}

This means:

No Runtime Cost: Generic code runs just as fast as manually written specialized code because the specialization happens at compile time.
Potential Binary Size Increase: If generic code is used with many different concrete types, the compiled binary size might increase due to the duplicated specialized code. This is similar to the trade-off with C++ templates.

11.2.7 Comparison to C++ Templates

Rust generics are often compared to C++ templates:

Compile-Time Expansion: Both are expanded at compile time (monomorphization in Rust, template instantiation in C++).
Zero-Cost Abstraction: Both generally result in highly efficient specialized code with no runtime overhead compared to non-generic code.
Type Checking: Rust generics require trait bounds to be explicitly satisfied before monomorphization (using : or where clauses). This checks that the required methods/capabilities exist for the type parameter T itself. If the bounds are met, the generic function body is type-checked once abstractly. This typically leads to clearer error messages originating from the point of definition or the unsatisfied bound. C++ templates traditionally use “duck typing,” where type checking happens during instantiation. Errors might only surface deep within the template code when a specific operation fails for a given concrete type, sometimes leading to complex error messages.
Concepts vs. Traits: C++20 Concepts aim to provide similar pre-checking capabilities as Rust’s trait bounds, allowing constraints on template parameters to be specified and checked earlier.
Specialization: C++ templates support extensive specialization capabilities. Rust’s support for specialization is currently limited and considered unstable, though similar effects can sometimes be achieved using other mechanisms like trait object dispatch or careful trait implementation choices.