The Price of Minimalism: The Art of Memory Layout in Compact Vectors
In systems programming, every byte counts. While the standard library's Vec<T> is versatile and powerful, its memory layout can be too "luxurious" for certain edge cases.
The "Obesity" of Standard Vectors
A standard Vec<T> (or std::vector in C++) typically occupies 24 bytes (on a 64-bit system):
- Pointer (8 bytes)
- Capacity (8 bytes)
- Length (8 bytes)
Consider a graph with millions of nodes, where each node has an adjacency list. If the graph is sparse, most adjacency lists might be empty. Even when empty, each node pays the 24-byte tax. For 100 million nodes, that's 2.4 GB of memory overhead—just to represent "nothing."
Compact Design: Header Pre-allocation
To solve this, we can adopt a more compact design: hide the metadata (capacity and length) in a header allocated on the heap, keeping only a single pointer on the stack.
This design is known as a Compact Vector or Slim Vector.
- Empty State: Just a null pointer (8 bytes). One-third the overhead of a standard vector.
- Non-Empty State: The pointer points to a heap memory block. The header of this block stores
capacityandlen, immediately followed by the actual data.
Rust Implementation Walkthrough
To implement this structure in Rust, we need to play with unsafe and raw pointer layouts.
use std::alloc::{alloc, dealloc, realloc, Layout};
use std::ptr::{self, NonNull};
use std::marker::PhantomData;
// Metadata hidden before the data
#[repr(C)]
struct Header {
len: usize,
cap: usize,
}
pub struct SlimVec<T> {
// Pointer to Header, data follows immediately
// If None, the vector is empty
ptr: Option<NonNull<Header>>,
_marker: PhantomData<T>,
}
impl<T> SlimVec<T> {
pub fn new() -> Self {
Self { ptr: None, _marker: PhantomData }
}
pub fn push(&mut self, elem: T) {
if let Some(header_ptr) = self.ptr {
unsafe {
let header = header_ptr.as_ptr();
if (*header).len == (*header).cap {
self.grow();
}
// Write data
let header = self.ptr.unwrap().as_ptr();
let data_ptr = (header as *mut u8).add(std::mem::size_of::<Header>()) as *mut T;
ptr::write(data_ptr.add((*header).len), elem);
(*header).len += 1;
}
} else {
self.allocate_initial(elem);
}
}
// ... Complex grow, allocate_initial and Drop implementations omitted ...
}
// Just to demonstrate layout size
fn main() {
println!("Standard Vec size: {}", std::mem::size_of::<Vec<i32>>());
println!("SlimVec size: {}", std::mem::size_of::<SlimVec<i32>>());
}
The output would be:
Standard Vec size: 24
SlimVec size: 8
What's the Catch?
There is no free lunch. While we save memory, we pay in CPU instructions:
- Indirection: Accessing length or capacity is no longer a direct read from a stack variable; it requires dereferencing a pointer to access heap memory (potential Cache Miss).
- Pointer Arithmetic: Every element access requires skipping the
Header, adding an extra addition operation. - Branching: Every operation must check if the pointer is null.
Conclusion
The Compact Vector is a classic "space-time trade-off"—or more accurately, "sacrificing tiny access latency for massive memory savings." In scenarios like graph computing or DOM tree construction involving massive amounts of small objects, this optimization is worth it. It reminds us: the standard library is the optimal general-purpose solution, but not necessarily the only solution for specific domains.