Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

18.4 The HashMap<K, V> Type

HashMap<K, V> is Rust’s primary implementation of a hash map (also known as a hash table, dictionary, or associative array). It stores mappings from unique keys of type K to associated values of type V. It provides efficient average-case time complexity for insertion, retrieval, and removal operations, typically O(1).

To use HashMap, you first need to bring it into scope:

#![allow(unused)]
fn main() {
use std::collections::HashMap;
}

18.4.1 Key Characteristics

  • Unordered: The iteration order of elements in a HashMap is arbitrary and depends on the internal hashing and layout. You should not rely on any specific order. The order might even change between different program runs.
  • Key Requirements: The key type K must implement the Eq (equality comparison) and Hash (hashing) traits. Most built-in types that can be meaningfully compared for equality, like integers, booleans, String, and tuples composed of hashable types, satisfy these requirements. Floating-point types (f32, f64) do not implement Hash by default because NaN != NaN and other precision issues make consistent hashing difficult. To use floats as keys, you typically need to wrap them in a custom struct that defines appropriate Hash and Eq implementations (e.g., by handling NaN explicitly or comparing based on bit patterns).
  • Hashing Algorithm: By default, HashMap uses SipHash 1-3, a cryptographically secure hashing algorithm designed to be resistant to Hash Denial-of-Service (HashDoS) attacks. These attacks involve an adversary crafting keys that deliberately cause many hash collisions, degrading the map’s performance to O(n). While secure, SipHash is slightly slower than simpler, non-cryptographic hashers. For performance-critical scenarios where HashDoS is not a concern (e.g., keys are not derived from external input), you can switch to a faster hasher using crates like fnv or ahash.
  • Ownership: HashMap takes ownership of its keys and values. When you insert an owned type like a String key or a Vec<T> value, that specific instance is moved into the map. If you insert types that implement the Copy trait (like i32), their values are copied into the map.

18.4.2 Creating and Populating a HashMap

#![allow(unused)]
fn main() {
use std::collections::HashMap; // Required import

// Create an empty HashMap
let mut scores: HashMap<String, i32> = HashMap::new();

// Insert key-value pairs using .insert()
// Note: .to_string() creates an owned String from the &str literal
scores.insert("Alice".to_string(), 95);
scores.insert(String::from("Bob"), 88); // String::from also works

// Create with initial capacity estimate
let mut map_cap: HashMap<u64, i32> = HashMap::with_capacity(50);

// Create from an iterator of tuples (K, V)
let teams = vec![String::from("Blue"), String::from("Red")];
let initial_scores = vec![10, 50];
// zip combines the two iterators into an iterator of pairs
// collect consumes the iterator and creates the HashMap
let team_scores: HashMap<String, i32> =
    teams.into_iter().zip(initial_scores.into_iter()).collect();
}

18.4.3 Accessing Values

#![allow(unused)]
fn main() {
use std::collections::HashMap;
let mut scores: HashMap<String, i32> = HashMap::new();
scores.insert(String::from("Alice"), 95);
scores.insert(String::from("Bob"), 88);

// Using .get(&key) for safe access (returns Option<&V>)
// Note: .get() takes a reference to the key type.
// If K is String, you can often pass &str due to borrowing rules.
let alice_score: Option<&i32> = scores.get("Alice");
match alice_score {
    Some(score_ref) => println!("Alice's score: {}", score_ref),
    None => println!("Alice not found."),
}

// Using indexing map[key] -> &V or &mut V
// Panics the current thread if the key is not found!
// Use only when absolutely sure the key exists.
// The Index trait implementation returns an immutable reference &V.
let alice_ref: &i32 = &scores["Alice"];
println!("Alice's score via index: {}", alice_ref);

// let charlie_ref = &scores["Charlie"]; // This would panic the current thread

// Checking for key existence
if scores.contains_key("Bob") {
    println!("Bob is in the map.");
}
}

Unlike Vec, HashMap indexing (map[key]) does not return the value directly even if V is Copy. It always returns a reference (&V for immutable indexing, &mut V for mutable indexing). Dereferencing this reference (*map[key]) will copy the value if V is Copy. The panic occurs if the key is absent, preventing access to non-existent data.

18.4.4 Updating and Removing Values

  • Overwriting with insert: If you insert a key that already exists, the old value is overwritten, and insert returns Some(old_value). If the key was new, it returns None.

    #![allow(unused)]
    fn main() {
    use std::collections::HashMap;
    let mut scores: HashMap<String, i32> = HashMap::new();
    scores.insert(String::from("Alice"), 95);
    let old_alice = scores.insert("Alice".to_string(), 100); // Update Alice's score
    assert_eq!(old_alice, Some(95));
    }
  • Conditional Insertion/Update with the entry API: The entry method is powerful for handling cases where you might need to insert a value only if the key doesn’t exist, or update an existing value.

    #![allow(unused)]
    fn main() {
    use std::collections::HashMap;
    let mut word_counts: HashMap<String, u32> = HashMap::new();
    let text = "hello world hello";
    
    for word in text.split_whitespace() {
        // entry(key) returns an Entry enum (Occupied or Vacant)
        // or_insert(default_value) gets a mutable ref to the existing value
        // or inserts the default and returns a mutable ref to the new value.
        let count: &mut u32 = word_counts.entry(word.to_string()).or_insert(0);
        *count += 1; // Dereference the mutable reference to increment the count
    }
    // word_counts is now {"hello": 2, "world": 1} (order may vary)
    println!("{:?}", word_counts);
    }

    The entry API has other useful methods like or_default() (uses Default::default() if vacant) and and_modify() (runs a closure to update if occupied).

  • Removing with remove: remove(&key) removes a key-value pair if the key exists, returning Some(value) (the owned value). If the key doesn’t exist, it returns None.

    #![allow(unused)]
    fn main() {
    use std::collections::HashMap;
    let mut scores: HashMap<String, i32> = HashMap::new();
    scores.insert(String::from("Alice"), 95);
    if let Some(score) = scores.remove("Alice") {
        println!("Removed Alice with score: {}", score); // score is the owned i32
    }
    }

18.4.5 Iteration

You can iterate over keys, values, or key-value pairs. Remember that the iteration order is not guaranteed.

#![allow(unused)]
fn main() {
use std::collections::HashMap;
let scores: HashMap<String, i32> = HashMap::from([
    ("Alice".to_string(), 95), ("Bob".to_string(), 88)
]);

// Iterate over key-value pairs (yields immutable references: (&K, &V))
println!("Scores:");
for (name, score) in &scores { // or scores.iter()
    println!("- {}: {}", name, score);
}

// Iterate over keys only (yields immutable references: &K)
println!("\nNames:");
for name in scores.keys() {
    println!("- {}", name);
}

// Iterate over values only (yields immutable references: &V)
println!("\nValues:");
for score in scores.values() {
    println!("- {}", score);
}

// Iterate with mutable references to values:
// let mut mutable_scores = scores; // Need mutable binding
// for score in mutable_scores.values_mut() { *score += 1; }
// for (key, value) in mutable_scores.iter_mut() { *value += 1; }
}

18.4.6 Internal Details: Hashing, Collisions, and Resizing

Internally, HashMap typically uses an array (often a Vec) of buckets or slots. When inserting a key-value pair:

  1. The key is hashed to produce an integer.
  2. This hash is used to calculate an index (a bucket or slot) into the internal array.
  3. If the calculated slot is empty, the key-value pair (or information pointing to them) is stored there.
  4. If the calculated slot is already occupied by a different key (a hash collision, where multiple keys hash to the same initial slot index), the map must employ a collision resolution strategy. The two primary approaches are:
    • Separate Chaining: Each slot in the array acts as the head of a secondary collection (like a linked list) that stores all the key-value pairs which hashed to that initial slot. Operations involve searching this secondary collection.
    • Open Addressing: If the target slot is occupied, the map probes subsequent slots within the main array itself according to a defined sequence (e.g., linear probing, quadratic probing, double hashing) until an empty slot is found (for insertion), the key is found (for lookup), or it’s determined the key is absent.

Rust’s std::collections::HashMap uses a highly optimized implementation of open addressing (specifically, a variation of Robin Hood probing adapted from the hashbrown crate). This strategy tends to have better cache performance compared to separate chaining because elements are stored directly within the contiguous array, reducing pointer indirections.

To maintain efficient average O(1) lookups, the HashMap monitors its load factor (number of elements / number of slots). When the load factor exceeds a certain threshold (meaning the table is becoming too full, increasing collision probability), the map allocates a larger array of slots (resizing) and re-inserts all existing elements into the new, larger table using their hash values. This resizing operation takes O(n) time but happens infrequently enough that the average (amortized) insertion time remains O(1).

18.4.7 Summary: HashMap vs. C Hash Tables

Implementing hash tables manually in C requires significant effort: choosing or implementing a suitable hash function, designing an effective collision resolution strategy (like chaining or open addressing), writing the logic for resizing the table, and managing memory for the table structure, keys, and values. Using a third-party C library can help, but integration and ensuring type safety and memory safety still rely heavily on the programmer.

Rust’s HashMap<K, V> provides:

  • A ready-to-use, performant, and robust implementation.
  • Automatic memory management for keys, values, and the internal table structure, preventing leaks.
  • Compile-time type safety enforced by generics (K, V).
  • A secure default hashing algorithm (SipHash 1-3) resistant to HashDoS attacks.
  • Integration with Rust’s ownership and borrowing system, preventing dangling pointers to keys or values.
  • Average O(1) performance for insertion, lookup, and removal, comparable to well-tuned C implementations but with built-in safety guarantees.