18.4 The HashMap<K, V>
Type
HashMap<K, V>
is Rust’s primary implementation of a hash map (also known as a hash table, dictionary, or associative array). It stores mappings from unique keys of type K
to associated values of type V
. It provides efficient average-case time complexity for insertion, retrieval, and removal operations, typically O(1)
.
To use HashMap
, you first need to bring it into scope:
#![allow(unused)] fn main() { use std::collections::HashMap; }
18.4.1 Key Characteristics
- Unordered: The iteration order of elements in a
HashMap
is arbitrary and depends on the internal hashing and layout. You should not rely on any specific order. The order might even change between different program runs. - Key Requirements: The key type
K
must implement theEq
(equality comparison) andHash
(hashing) traits. Most built-in types that can be meaningfully compared for equality, like integers, booleans,String
, and tuples composed of hashable types, satisfy these requirements. Floating-point types (f32
,f64
) do not implementHash
by default becauseNaN != NaN
and other precision issues make consistent hashing difficult. To use floats as keys, you typically need to wrap them in a custom struct that defines appropriateHash
andEq
implementations (e.g., by handlingNaN
explicitly or comparing based on bit patterns). - Hashing Algorithm: By default,
HashMap
uses SipHash 1-3, a cryptographically secure hashing algorithm designed to be resistant to Hash Denial-of-Service (HashDoS) attacks. These attacks involve an adversary crafting keys that deliberately cause many hash collisions, degrading the map’s performance toO(n)
. While secure, SipHash is slightly slower than simpler, non-cryptographic hashers. For performance-critical scenarios where HashDoS is not a concern (e.g., keys are not derived from external input), you can switch to a faster hasher using crates likefnv
orahash
. - Ownership:
HashMap
takes ownership of its keys and values. When you insert an owned type like aString
key or aVec<T>
value, that specific instance is moved into the map. If you insert types that implement theCopy
trait (likei32
), their values are copied into the map.
18.4.2 Creating and Populating a HashMap
#![allow(unused)] fn main() { use std::collections::HashMap; // Required import // Create an empty HashMap let mut scores: HashMap<String, i32> = HashMap::new(); // Insert key-value pairs using .insert() // Note: .to_string() creates an owned String from the &str literal scores.insert("Alice".to_string(), 95); scores.insert(String::from("Bob"), 88); // String::from also works // Create with initial capacity estimate let mut map_cap: HashMap<u64, i32> = HashMap::with_capacity(50); // Create from an iterator of tuples (K, V) let teams = vec![String::from("Blue"), String::from("Red")]; let initial_scores = vec![10, 50]; // zip combines the two iterators into an iterator of pairs // collect consumes the iterator and creates the HashMap let team_scores: HashMap<String, i32> = teams.into_iter().zip(initial_scores.into_iter()).collect(); }
18.4.3 Accessing Values
#![allow(unused)] fn main() { use std::collections::HashMap; let mut scores: HashMap<String, i32> = HashMap::new(); scores.insert(String::from("Alice"), 95); scores.insert(String::from("Bob"), 88); // Using .get(&key) for safe access (returns Option<&V>) // Note: .get() takes a reference to the key type. // If K is String, you can often pass &str due to borrowing rules. let alice_score: Option<&i32> = scores.get("Alice"); match alice_score { Some(score_ref) => println!("Alice's score: {}", score_ref), None => println!("Alice not found."), } // Using indexing map[key] -> &V or &mut V // Panics the current thread if the key is not found! // Use only when absolutely sure the key exists. // The Index trait implementation returns an immutable reference &V. let alice_ref: &i32 = &scores["Alice"]; println!("Alice's score via index: {}", alice_ref); // let charlie_ref = &scores["Charlie"]; // This would panic the current thread // Checking for key existence if scores.contains_key("Bob") { println!("Bob is in the map."); } }
Unlike Vec
, HashMap
indexing (map[key]
) does not return the value directly even if V
is Copy
. It always returns a reference (&V
for immutable indexing, &mut V
for mutable indexing). Dereferencing this reference (*map[key]
) will copy the value if V
is Copy
. The panic occurs if the key is absent, preventing access to non-existent data.
18.4.4 Updating and Removing Values
-
Overwriting with
insert
: If youinsert
a key that already exists, the old value is overwritten, andinsert
returnsSome(old_value)
. If the key was new, it returnsNone
.#![allow(unused)] fn main() { use std::collections::HashMap; let mut scores: HashMap<String, i32> = HashMap::new(); scores.insert(String::from("Alice"), 95); let old_alice = scores.insert("Alice".to_string(), 100); // Update Alice's score assert_eq!(old_alice, Some(95)); }
-
Conditional Insertion/Update with the
entry
API: Theentry
method is powerful for handling cases where you might need to insert a value only if the key doesn’t exist, or update an existing value.#![allow(unused)] fn main() { use std::collections::HashMap; let mut word_counts: HashMap<String, u32> = HashMap::new(); let text = "hello world hello"; for word in text.split_whitespace() { // entry(key) returns an Entry enum (Occupied or Vacant) // or_insert(default_value) gets a mutable ref to the existing value // or inserts the default and returns a mutable ref to the new value. let count: &mut u32 = word_counts.entry(word.to_string()).or_insert(0); *count += 1; // Dereference the mutable reference to increment the count } // word_counts is now {"hello": 2, "world": 1} (order may vary) println!("{:?}", word_counts); }
The
entry
API has other useful methods likeor_default()
(usesDefault::default()
if vacant) andand_modify()
(runs a closure to update if occupied). -
Removing with
remove
:remove(&key)
removes a key-value pair if the key exists, returningSome(value)
(the owned value). If the key doesn’t exist, it returnsNone
.#![allow(unused)] fn main() { use std::collections::HashMap; let mut scores: HashMap<String, i32> = HashMap::new(); scores.insert(String::from("Alice"), 95); if let Some(score) = scores.remove("Alice") { println!("Removed Alice with score: {}", score); // score is the owned i32 } }
18.4.5 Iteration
You can iterate over keys, values, or key-value pairs. Remember that the iteration order is not guaranteed.
#![allow(unused)] fn main() { use std::collections::HashMap; let scores: HashMap<String, i32> = HashMap::from([ ("Alice".to_string(), 95), ("Bob".to_string(), 88) ]); // Iterate over key-value pairs (yields immutable references: (&K, &V)) println!("Scores:"); for (name, score) in &scores { // or scores.iter() println!("- {}: {}", name, score); } // Iterate over keys only (yields immutable references: &K) println!("\nNames:"); for name in scores.keys() { println!("- {}", name); } // Iterate over values only (yields immutable references: &V) println!("\nValues:"); for score in scores.values() { println!("- {}", score); } // Iterate with mutable references to values: // let mut mutable_scores = scores; // Need mutable binding // for score in mutable_scores.values_mut() { *score += 1; } // for (key, value) in mutable_scores.iter_mut() { *value += 1; } }
18.4.6 Internal Details: Hashing, Collisions, and Resizing
Internally, HashMap
typically uses an array (often a Vec
) of buckets or slots. When inserting a key-value pair:
- The key is hashed to produce an integer.
- This hash is used to calculate an index (a bucket or slot) into the internal array.
- If the calculated slot is empty, the key-value pair (or information pointing to them) is stored there.
- If the calculated slot is already occupied by a different key (a hash collision, where multiple keys hash to the same initial slot index), the map must employ a collision resolution strategy. The two primary approaches are:
- Separate Chaining: Each slot in the array acts as the head of a secondary collection (like a linked list) that stores all the key-value pairs which hashed to that initial slot. Operations involve searching this secondary collection.
- Open Addressing: If the target slot is occupied, the map probes subsequent slots within the main array itself according to a defined sequence (e.g., linear probing, quadratic probing, double hashing) until an empty slot is found (for insertion), the key is found (for lookup), or it’s determined the key is absent.
Rust’s std::collections::HashMap
uses a highly optimized implementation of open addressing (specifically, a variation of Robin Hood probing adapted from the hashbrown
crate). This strategy tends to have better cache performance compared to separate chaining because elements are stored directly within the contiguous array, reducing pointer indirections.
To maintain efficient average O(1)
lookups, the HashMap
monitors its load factor (number of elements / number of slots). When the load factor exceeds a certain threshold (meaning the table is becoming too full, increasing collision probability), the map allocates a larger array of slots (resizing) and re-inserts all existing elements into the new, larger table using their hash values. This resizing operation takes O(n)
time but happens infrequently enough that the average (amortized) insertion time remains O(1)
.
18.4.7 Summary: HashMap
vs. C Hash Tables
Implementing hash tables manually in C requires significant effort: choosing or implementing a suitable hash function, designing an effective collision resolution strategy (like chaining or open addressing), writing the logic for resizing the table, and managing memory for the table structure, keys, and values. Using a third-party C library can help, but integration and ensuring type safety and memory safety still rely heavily on the programmer.
Rust’s HashMap<K, V>
provides:
- A ready-to-use, performant, and robust implementation.
- Automatic memory management for keys, values, and the internal table structure, preventing leaks.
- Compile-time type safety enforced by generics (
K
,V
). - A secure default hashing algorithm (SipHash 1-3) resistant to HashDoS attacks.
- Integration with Rust’s ownership and borrowing system, preventing dangling pointers to keys or values.
- Average
O(1)
performance for insertion, lookup, and removal, comparable to well-tuned C implementations but with built-in safety guarantees.