The Book·Chapter 8·14 min

Common collections

Vec, String, HashMap. The handful of containers you use every day, with the ownership rules that bite.

The Rust standard library ships a small set of collections. Vec<T>, String, HashMap<K, V> cover 90% of cases. BTreeMap, HashSet, VecDeque cover most of the rest. Each one has predictable ownership rules and a with_capacity constructor. Knowing which collection to reach for, and which method on it to call, separates Rust that runs fast from Rust that allocates in every loop.

Vec<T>: a growable array

The default sequence container. Heap-allocated, contiguous storage, doubles in capacity when it fills:

let mut v: Vec<i32> = Vec::new();
v.push(1);
v.push(2);
v.push(3);
 
let v = vec![1, 2, 3];           // macro form for literals
let zeros = vec![0; 10];          // ten zeros
 
println!("{:?}", v);              // [1, 2, 3]
println!("{}", v[0]);             // 1
println!("{}", v.len());          // 3

Indexing with v[i] panics if i is out of bounds. Indexing with v.get(i) returns Option<&T>, which is the right choice when you cannot prove the bound at compile time.

Iteration uses references unless you say otherwise:

let v = vec![1, 2, 3];
 
for n in &v { println!("{}", n); }   // borrow; v still usable after
for n in &mut v { *n += 1; }          // mutable borrow
for n in v { println!("{}", n); }     // consumes v; cannot use after

That last form moves v into the loop. The values are owned i32s, so the original v is gone after the loop ends.

Ownership rules that bite

Pushing to a Vec can reallocate. If a reallocation moves the backing buffer, every reference into the old buffer becomes invalid. Rust prevents this at compile time:

let mut v = vec![1, 2, 3];
let first = &v[0];
v.push(4);                // error: cannot borrow `v` as mutable
println!("{}", first);    // because `first` borrows it immutably

The fix is to copy out the value (let first = v[0]; for Copy types) or to restructure so the borrow ends before the mutation.

with_capacity: pre-allocate when you know the size

A Vec doubles its capacity when full, which means a series of pushes can trigger multiple reallocations. If you know the size in advance, pre-allocate:

let mut out: Vec<String> = Vec::with_capacity(rows.len());
for row in rows {
    out.push(format!("row-{}", row.id));
}

with_capacity reserves space without changing the length. The pushes never reallocate. Same code, fewer allocations.

String has the same trick: String::with_capacity(n). So does HashMap::with_capacity(n).

String and &str: two distinct types

String owns a heap-allocated UTF-8 buffer. &str is a borrowed view: a pointer and length into UTF-8 bytes that someone else owns.

let owned: String = String::from("hello");
let borrowed: &str = &owned;
let literal: &'static str = "world";

String literals ("world") are &'static str: they live in the binary's read-only data and exist for the whole program.

The conversion direction matters:

  • String to &str: free. &owned[..] or just &owned (auto-deref).
  • &str to String: allocates. s.to_string() or String::from(s).

Function arguments that read a string should take &str. Function arguments that take ownership (because they will store the string somewhere) should take String. Defaulting to &str is almost always right.

UTF-8 and why s[0] is a compile error

Rust strings are UTF-8. A character is not a fixed number of bytes:

let s = "résumé";
println!("{}", s.len());          // 8, not 6 (é is 2 bytes)
println!("{}", s[0]);             // error[E0277]: `String` cannot be indexed by `{integer}`

The compiler refuses to let you index a string by integer, because the answer is meaningless. s[0] would either be a byte (u8, often invalid mid-character) or a char (variable cost to find). Rust forces you to choose:

let s = "résumé";
println!("{}", s.as_bytes()[0]);                       // 114 (the byte 'r')
println!("{:?}", s.chars().next());                    // Some('r')
println!("{}", s.chars().nth(2).unwrap());             // 's', but O(n) to find

Slicing with &s[0..3] works, but only if the range is on a character boundary. Otherwise it panics at runtime:

let s = "résumé";
let head = &s[0..2];              // ok: 'r' + first byte of 'é'... wait, that panics

The fix is to find boundaries via char_indices():

let s = "résumé";
let first_three: String = s.chars().take(3).collect();

Most "indexing" you actually want is "iterate characters" or "search by substring." Lean on those, not byte slicing.

HashMap<K, V>

Standard hash map. Keys must implement Eq and Hash:

use std::collections::HashMap;
 
let mut counts: HashMap<String, u32> = HashMap::new();
counts.insert("apple".to_string(), 3);
counts.insert("pear".to_string(), 1);
 
if let Some(n) = counts.get("apple") {
    println!("{}", n);
}

Insertion takes ownership of both key and value. The map cannot keep references to data it does not own. If you need to look up by a borrow without allocating, use .get(s) where s: &str works thanks to the Borrow trait:

let key: &str = "apple";
counts.get(key);                  // ok, no allocation

The entry API: get-or-insert without two lookups

The most useful HashMap pattern, and one agents miss constantly:

△ Bad
let mut counts: HashMap<String, u32> = HashMap::new();
for word in words {
  if counts.contains_key(word) {
      let count = counts.get_mut(word).unwrap();
      *count += 1;
  } else {
      counts.insert(word.to_string(), 1);
  }
}
// Two lookups in the hit path. Also allocates word.to_string() unconditionally.
◇ Good
let mut counts: HashMap<String, u32> = HashMap::new();
for word in words {
  *counts.entry(word.to_string()).or_insert(0) += 1;
}
// One lookup. Still allocates word.to_string(), but only because the key type is String.
// If you can store &str keys (with a matching lifetime), even that goes away.

The entry API returns an Entry enum: Occupied or Vacant. Methods on it:

map.entry(key).or_insert(default);                  // insert if absent
map.entry(key).or_insert_with(|| expensive_default()); // lazy default
map.entry(key).or_default();                         // insert V::default() if absent
map.entry(key).and_modify(|v| *v += 1).or_insert(1); // update or insert

This is the difference between idiomatic Rust and Rust that allocates twice per loop iteration.

BTreeMap and HashSet

BTreeMap<K, V> is the same shape as HashMap, but keys are sorted (so iteration is in key order) and operations are O(log n) instead of O(1). Use it when you need ordered iteration, range queries (map.range(low..high)), or when keys do not implement Hash but do implement Ord.

HashSet<T> is just HashMap<T, ()> with set-flavored methods (contains, insert, union, intersection, difference). BTreeSet<T> is the ordered equivalent.

VecDeque<T> is a double-ended queue (push/pop at either end is O(1)). Use it for FIFOs or sliding windows.

Capacity, length, allocation strategy

Every growable collection has two numbers:

  • Length (len()): how many elements are currently stored.
  • Capacity (capacity()): how many can be stored before reallocating.

The allocation strategy is: when you push and length equals capacity, allocate a new buffer at roughly 2x capacity, copy elements over, free the old buffer. The amortized cost of push is O(1), but individual pushes can be expensive.

shrink_to_fit() reduces capacity to length (rarely needed). clear() sets length to zero without freeing the buffer (reuse pattern).

Where agents go wrong