← Back to Blog

Compact Vector: The Art of Memory Layout Optimization

In embedded systems or memory-sensitive scenarios, every byte counts. Today we analyze an industrial-grade Compact Vector implementation to understand how designers balance space and performance.

The Core Problem

Standard library vectors typically occupy 24 bytes when empty (three pointers: begin, end, capacity). In scenarios requiring大量空容器的创建,这会造成严重的内存浪费。

Design: Embedded Metadata

// vector that is 8 bytes when empty (TVector is 24 bytes)

template <typename T>
class TCompactVector {
private:
    struct THeader {
        size_t Size;
        size_t Capacity;
    };

    T* Ptr;  // Empty state only 8 bytes

    THeader* Header() {
        return ((THeader*)Ptr) - 1;  // Metadata stored before data
    }
};

Key Design Decisions

  1. Empty State Compression: Only stores one pointer (8 bytes), metadata allocated on demand
  2. Embedded Metadata: Header stored before the data pointer, located via pointer arithmetic
  3. Copy-based Growth: Creates copy and swaps when capacity needs to expand

Trade-off Analysis

Advantages

  • Extreme Space: Empty vector is 8 bytes vs standard 24 bytes
  • Cache Friendly: Metadata and data contiguous, improving cache hit rate
  • Simple Interface: Compatible with standard vector

Costs

  • Pointer Arithmetic Overhead: Each access requires calculating real address
  • Allocation Complexity: Additional handling for header memory
  • Copy Cost: Uses copy instead of move during reserve

Clean Room Reimplementation: Rust Implementation

use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

pub struct CompactVector<T> {
    ptr: Option<NonNull<T>>,
}

struct Header {
    size: usize,
    capacity: usize,
}

impl<T> CompactVector<T> {
    pub fn new() -> Self {
        CompactVector { ptr: None }
    }

    fn header(&self) -> Option<&Header> {
        unsafe {
            self.ptr.map(|p| {
                let header_ptr = (p.as_ptr() as *mut Header).offset(-1);
                &*header_ptr
            })
        }
    }

    pub fn size(&self) -> usize {
        self.header().map(|h| h.size).unwrap_or(0)
    }

    pub fn capacity(&self) -> usize {
        self.header().map(|h| h.capacity).unwrap_or(0)
    }

    pub fn push(&mut self, value: T) {
        self.reserve(self.size() + 1);
        
        unsafe {
            let ptr = self.ptr.unwrap().as_ptr();
            let idx = self.header_mut().unwrap().size;
            ptr.add(idx).write(value);
            self.header_mut().unwrap().size += 1;
        }
    }

    pub fn reserve(&mut self, new_cap: usize) {
        if new_cap <= self.capacity() {
            return;
        }
        // Allocate new memory with embedded header...
    }
}

Summary

The compact vector design embodies the classic space-time trade-off:

  1. Empty State Compression: From 24 bytes to 8 bytes, suitable for scenarios with many empty containers
  2. Embedded Metadata: Achieves compact layout through pointer arithmetic
  3. Copy-based Growth: Simplifies implementation but has performance overhead

This design is particularly effective in memory-constrained embedded environments or scenarios requiring large numbers of containers.