10.3 Enums with Associated Data

The true power of Rust enums lies in the fact that variants can hold associated data. An enum value can then be one of several different kinds of things, with each kind carrying its own information. This combines the roles of a C enum (selecting a kind) and a C union (storing the data for that kind) in a type-safe manner.

10.3.1 Defining Enums with Data

Variants can contain data similar to tuples or structs:

#[derive(Debug)] // Allow printing the enum
enum Message {
    Quit,                      // No associated data (unit-like variant)
    Move { x: i32, y: i32 },   // Data like a struct (named fields)
    Write(String),             // Data like a tuple struct (single String)
    ChangeColor(u8, u8, u8),   // Data like a tuple struct (three u8 values)
}

fn main() {
    // Creating instances of each variant
    let msg1 = Message::Quit;
    let msg2 = Message::Move { x: 10, y: 20 };
    let msg3 = Message::Write(String::from("Hello, Rust!"));
    let msg4 = Message::ChangeColor(255, 0, 128);

    println!("Message 1: {:?}", msg1);
    println!("Message 2: {:?}", msg2);
    println!("Message 3: {:?}", msg3);
    println!("Message 4: {:?}", msg4);
}
  • Variant Kinds:
    • Quit: A simple variant with no data.
    • Move: A struct-like variant with named fields x and y.
    • Write: A tuple-like variant containing a single String.
    • ChangeColor: A tuple-like variant containing three u8 values.

Each Message value holds exactly one of the following: no data, an x and y coordinate, a String, or three u8 values. It also stores a discriminant (tag) that identifies which variant is active.
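To make the "data plus variant information" point concrete, the sketch below prints the size of Message next to the size of its largest payload and asks a value which variant it holds. The helper variant_name is a hypothetical name introduced only for this illustration. Exact sizes depend on the target and on compiler layout decisions, so the code checks only that the enum is at least as large as a String.

```rust
use std::mem::size_of;

#[allow(dead_code)] // Not every variant or field is used in this sketch
#[derive(Debug)]
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(u8, u8, u8),
}

// Hypothetical helper (not part of the original text): reports the active variant.
fn variant_name(msg: &Message) -> &'static str {
    match msg {
        Message::Quit => "Quit",
        Message::Move { .. } => "Move",
        Message::Write(_) => "Write",
        Message::ChangeColor(..) => "ChangeColor",
    }
}

fn main() {
    // The enum must have room for its largest variant (here, the String).
    // The concrete numbers are not guaranteed by the language.
    println!("size_of::<Message>() = {}", size_of::<Message>());
    println!("size_of::<String>()  = {}", size_of::<String>());
    assert!(size_of::<Message>() >= size_of::<String>());

    // A value always knows which variant it currently holds.
    let msg = Message::Write(String::from("Hello, Rust!"));
    println!("Active variant: {}", variant_name(&msg));
}
```

Unlike the C version shown in the next section, there is no separate tag field to keep in sync: the compiler stores and checks the discriminant itself.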

10.3.2 Comparison with C Tagged Unions

To achieve a similar result in C, you typically use a combination of a struct, an enum (as a tag), and a union:

#include <stdio.h>
#include <stdlib.h> // For malloc/free
#include <string.h> // For strlen/strcpy

// 1. Enum to identify the active variant (the tag)
typedef enum { MSG_QUIT, MSG_MOVE, MSG_WRITE, MSG_CHANGE_COLOR } MessageType;

// 2. Structs to hold data for complex variants
typedef struct { int x; int y; } MoveData;
typedef struct { unsigned char r; unsigned char g; unsigned char b; } ChangeColorData;

// 3. Union to hold the data for different variants
typedef union {
    MoveData move_coords;
    char* write_text; // Using char* requires manual memory management
    ChangeColorData color_values;
    // Quit needs no data field in the union
} MessageData;

// 4. The main struct combining the tag and the union
typedef struct {
    MessageType type;
    MessageData data;
} Message;

// Helper function to create a Write message. Falls back to MSG_QUIT if
// allocation fails. Whoever consumes the message must free write_text.
Message create_write_message(const char* text) {
    Message msg;
    msg.type = MSG_WRITE;
    msg.data.write_text = malloc(strlen(text) + 1); // Allocate heap memory
    if (msg.data.write_text != NULL) {
        strcpy(msg.data.write_text, text); // Copy data
    } else {
        fprintf(stderr, "Memory allocation failed for text\n");
        msg.type = MSG_QUIT; // Revert to a safe state on error
    }
    return msg;
}

// Function to process messages (MUST check type before accessing data)
void process_message(Message msg) {
    switch (msg.type) {
        case MSG_QUIT:
            printf("Received Quit\n");
            break;
        case MSG_MOVE:
            // Access is safe *because* we checked msg.type
            printf("Received Move to x: %d, y: %d\n",
                   msg.data.move_coords.x, msg.data.move_coords.y);
            break;
        case MSG_WRITE:
            // Access is safe *because* we checked msg.type
            printf("Received Write: %s\n", msg.data.write_text);
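            // CRUCIAL: Free the allocated memory when done with the message
            free(msg.data.write_text);
            // msg is a by-value copy, so this clears only the local pointer.
            // The caller's copy still dangles and must not be used or freed again.
            msg.data.write_text = NULL;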
            break;
        case MSG_CHANGE_COLOR:
            // Access is safe *because* we checked msg.type
            printf("Received ChangeColor to R:%d, G:%d, B:%d\n",
                   msg.data.color_values.r, msg.data.color_values.g,
                   msg.data.color_values.b);
            break;
        default:
            printf("Unknown message type\n");
    }
}

int main(void) {
    Message quit_msg = { .type = MSG_QUIT }; // Designated initializer
    process_message(quit_msg);

    Message move_msg = { .type = MSG_MOVE, .data.move_coords = {100, 200} };
    process_message(move_msg);

    Message write_msg = create_write_message("Hello from C!");
    if(write_msg.type == MSG_WRITE) { // Check if creation succeeded
       process_message(write_msg); // Handles printing and freeing
    }

    // Potential Pitfall: nothing stops us from reading the wrong union member.
    // move_msg.type is MSG_MOVE, but if we accidentally read write_text, the bytes
    // of move_coords are reinterpreted as a pointer. Dereferencing that pointer
    // is Undefined Behavior.
    // printf("Incorrect access: %s\n", move_msg.data.write_text); // CRASH or garbage!

    return 0;
}
  • Complexity: Requires multiple definitions (enum, potentially structs, union, main struct).
  • Manual Tag Management: The programmer must keep the type tag and the contents of the data union in sync by hand.
  • Lack of Safety: The compiler does not prevent accessing the wrong field in the union. This relies entirely on programmer discipline.
  • Manual Memory Management: Heap-allocated data within the union (like write_text) requires manual malloc and free, risking leaks or use-after-free bugs.

10.3.3 Advantages of Rust’s Enums with Data

Rust’s approach elegantly solves the problems seen with C’s tagged unions:

  • Conciseness: A single enum definition handles variants and their data.
  • Type Safety: Compile-time checks prevent accessing data for the wrong variant.
  • Integrated Memory Management: Rust’s ownership system automatically frees heap data held in a variant (like the String in Write) when the value goes out of scope.
  • Pattern Matching: match provides a structured, safe way to access associated data.
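
As a rough counterpart to the C process_message function above, the following sketch shows how these four points might look in practice. It assumes the Message enum from Section 10.3.1. The function name process_message and the choice to return a String (rather than print directly) are decisions made for this illustration only.

```rust
#[derive(Debug)]
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(u8, u8, u8),
}

// Illustrative counterpart to the C function. Takes ownership of the message.
fn process_message(msg: Message) -> String {
    // match must cover every variant; omitting one is a compile-time error.
    // Each arm can only see the data that belongs to its own variant.
    match msg {
        Message::Quit => String::from("Received Quit"),
        Message::Move { x, y } => format!("Received Move to x: {}, y: {}", x, y),
        Message::Write(text) => format!("Received Write: {}", text),
        Message::ChangeColor(r, g, b) => {
            format!("Received ChangeColor to R:{}, G:{}, B:{}", r, g, b)
        }
    }
    // No free() call: the String moved out of a Write message is dropped
    // automatically at the end of its match arm.
}

fn main() {
    let messages = vec![
        Message::Quit,
        Message::Move { x: 100, y: 200 },
        Message::Write(String::from("Hello from Rust!")),
        Message::ChangeColor(255, 0, 128),
    ];

    for msg in messages {
        println!("{}", process_message(msg));
    }
}
```

Because process_message takes the message by value, the caller cannot use it afterwards. The dangling-pointer situation from the C version, where the caller still holds a copy of the freed write_text pointer, is rejected by the compiler.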