Compact Vector: The Art of Memory Layout Optimization
In embedded systems or memory-sensitive scenarios, every byte counts. Today we analyze an industrial-grade Compact Vector implementation to understand how designers balance space and performance.
The Core Problem
Standard library vectors typically occupy 24 bytes when empty (three pointers: begin, end, capacity). In scenarios requiring大量空容器的创建,这会造成严重的内存浪费。
Design: Embedded Metadata
// vector that is 8 bytes when empty (TVector is 24 bytes)
template <typename T>
class TCompactVector {
private:
struct THeader {
size_t Size;
size_t Capacity;
};
T* Ptr; // Empty state only 8 bytes
THeader* Header() {
return ((THeader*)Ptr) - 1; // Metadata stored before data
}
};
Key Design Decisions
- Empty State Compression: Only stores one pointer (8 bytes), metadata allocated on demand
- Embedded Metadata: Header stored before the data pointer, located via pointer arithmetic
- Copy-based Growth: Creates copy and swaps when capacity needs to expand
Trade-off Analysis
Advantages
- Extreme Space: Empty vector is 8 bytes vs standard 24 bytes
- Cache Friendly: Metadata and data contiguous, improving cache hit rate
- Simple Interface: Compatible with standard vector
Costs
- Pointer Arithmetic Overhead: Each access requires calculating real address
- Allocation Complexity: Additional handling for header memory
- Copy Cost: Uses copy instead of move during reserve
Clean Room Reimplementation: Rust Implementation
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;
pub struct CompactVector<T> {
ptr: Option<NonNull<T>>,
}
struct Header {
size: usize,
capacity: usize,
}
impl<T> CompactVector<T> {
pub fn new() -> Self {
CompactVector { ptr: None }
}
fn header(&self) -> Option<&Header> {
unsafe {
self.ptr.map(|p| {
let header_ptr = (p.as_ptr() as *mut Header).offset(-1);
&*header_ptr
})
}
}
pub fn size(&self) -> usize {
self.header().map(|h| h.size).unwrap_or(0)
}
pub fn capacity(&self) -> usize {
self.header().map(|h| h.capacity).unwrap_or(0)
}
pub fn push(&mut self, value: T) {
self.reserve(self.size() + 1);
unsafe {
let ptr = self.ptr.unwrap().as_ptr();
let idx = self.header_mut().unwrap().size;
ptr.add(idx).write(value);
self.header_mut().unwrap().size += 1;
}
}
pub fn reserve(&mut self, new_cap: usize) {
if new_cap <= self.capacity() {
return;
}
// Allocate new memory with embedded header...
}
}
Summary
The compact vector design embodies the classic space-time trade-off:
- Empty State Compression: From 24 bytes to 8 bytes, suitable for scenarios with many empty containers
- Embedded Metadata: Achieves compact layout through pointer arithmetic
- Copy-based Growth: Simplifies implementation but has performance overhead
This design is particularly effective in memory-constrained embedded environments or scenarios requiring large numbers of containers.
Series: Arch (82/90)
View
▼