Intrusive Red-Black Trees: The Ultimate Balance of Memory Layout

In high-performance systems programming, we often face a brutal dichotomy: pursue extreme memory layout and cache friendliness, or enjoy the convenience of standard library containers?

When we use std::map or std::set (or similar structures in other languages), we often overlook their hidden costs: every insertion comes with a heap allocation. Thousands of tiny nodes scattered across the heap not only lead to severe memory fragmentation but also cripple CPU cache prefetching mechanisms. As we traverse the tree, the CPU is forced to "jump around" in memory—a phenomenon known as Pointer Chasing.

Today, we deconstruct a pattern widely used in industrial-grade infrastructure (like OS kernels, game engines, and HFT systems): the Intrusive Red-Black Tree. We'll use Zig to demonstrate how clever memory layout achieves zero extra allocations, cache friendliness, and $O(\log N)$ random access capabilities simultaneously.

What is "Intrusive"?

In standard (non-intrusive) containers, the container is responsible for allocating node memory and holding data. The data knows nothing about the container.

// Non-intrusive: Container wraps data
const Node = struct {
    left: ?*Node,
    right: ?*Node,
    data: UserData, // Data is copied or referenced
};

In an intrusive design, the data itself is the node. The business object "knows" it is part of a container and actively embeds the required linkage fields (like left, right, parent).

// Intrusive: Data embeds node
const UserData = struct {
    id: u64,
    name: []const u8,
    // Intrusive node, directly embedded in the struct
    rb_node: RbNode, 
};

This inversion brings massive advantages:

Zero Extra Allocations: Inserting an element requires no malloc for a new node because the node is already inside the object.
Cache Friendliness: Business data and tree linkage information are contiguous in memory.
Multiple Indexing: An object can embed multiple different nodes (e.g., rb_node, list_node), allowing it to exist in a Red-Black Tree and a Linked List simultaneously without complex back-pointer maintenance.

The Zig Implementation: The Magic of `fieldParentPtr`

In C++, intrusive containers are often implemented via complex templates and inheritance (CRTP). In Zig, we can handle this logic of "recovering the object from the node" much more elegantly and explicitly using fieldParentPtr.

First, define an embedded Red-Black Tree node. Note that we add a size field, which is crucial for the efficient indexing mentioned later.

const Color = enum { red, black };

/// Intrusive RB-Tree node, embedded in business objects
pub const RbNode = struct {
    parent: ?*RbNode = null,
    left: ?*RbNode = null,
    right: ?*RbNode = null,
    color: Color = .red,
    /// Subtree size (including self) for O(log N) random access
    size: usize = 1, 
};

Next is the container's core logic. Pay attention to the getEntry function, which uses pointer casting to restore the RbNode pointer back to the host object T pointer. This is the soul of intrusive containers.

pub fn IntrusiveRbTree(comptime T: type, comptime field_name: []const u8) type {
    return struct {
        const Self = @This();
        root: ?*RbNode = null,

        /// "Reverse" the node pointer back to the business object T
        fn getEntry(node: *RbNode) *T {
            return @fieldParentPtr(T, field_name, node);
        }

        fn getSize(node: ?*RbNode) usize {
            if (node) |n| return n.size;
            return 0;
        }

        // ... Insertion and Rotation Logic ...
    };
}

Core Feature: `size` Counting & $O(\log N)$ Random Access

Standard Red-Black Trees can only lookup by Key. What if we want to find the "100th smallest element"? Usually, that requires an $O(N)$ traversal.

In our design, each node maintains a size field (current node + left subtree + right subtree total count). This gives the Red-Black Tree array-like characteristics:

/// Get element at rank `index`, time complexity O(log N)
pub fn at(self: *Self, index: usize) ?*T {
    var curr = self.root;
    var idx = index;

    while (curr) |n| {
        const left_sz = getSize(n.left);
        if (idx < left_sz) {
            // Target is in the left subtree
            curr = n.left;
        } else if (idx == left_sz) {
            // Current node is the target
            return getEntry(n);
        } else {
            // Target is in the right subtree
            // adjust index by subtracting left subtree + current
            idx -= left_sz + 1;
            curr = n.right;
        }
    }
    return null;
}

This design makes the structure perfect for scenarios needing frequent access by "rank" or "index" (like leaderboards, random sampling of ordered sequences), while maintaining the Red-Black Tree's $O(\log N)$ insertion and deletion performance.

Of course, the cost is that every rotation and recoloring operation must update the size on the path, but this is merely a constant-factor overhead.

Full Demonstration

Let's see how to use it in actual code:

const std = @import("std");

// Business Object
const Monster = struct {
    hp: i32,
    attack: i32,
    // Embed node to make it part of the tree
    node: RbNode = .{},

    // Comparison function defines tree order
    pub fn compare(self: *Monster, other: *Monster) i32 {
        if (self.hp < other.hp) return -1;
        if (self.hp > other.hp) return 1;
        return 0;
    }
};

test "intrusive tree demo" {
    // Define an RB-Tree using the "node" field hook
    var tree = IntrusiveRbTree(Monster, "node"){};

    // Objects allocated on stack or anywhere, container doesn't care
    var m1 = Monster{ .hp = 100, .attack = 10 };
    var m2 = Monster{ .hp = 200, .attack = 50 };
    var m3 = Monster{ .hp = 50, .attack = 5 };

    // Insert pointers, zero memory allocation
    tree.insert(&m1);
    tree.insert(&m2);
    tree.insert(&m3);

    // Verify Order (Sorted by HP: 50, 100, 200)
    // Get 1st element (index 1), should be m1 with HP=100
    const found = tree.at(1); 
    try std.testing.expect(found.?.hp == 100);
}

Design Trade-offs & Conclusion

This design is not without cost.

Pros:

Extreme Performance: Zero allocation, high cache locality.
Powerful Features: Supports both Key lookup and Index lookup.
Flexibility: Object lifecycle controlled by user, not bound by container.

Cons:

Intrusiveness: Your business struct must be "polluted" with tree node fields.
Safety: Users must ensure objects are not destroyed before removal from the tree (though Zig's ownership model helps, care is still needed).

In systems programming, when performance hits a bottleneck, intrusive data structures are often the sledgehammer that breaks the ceiling. Zig, with fieldParentPtr and explicit memory modeling, makes this ancient and hardcore technique clearer and safer than ever.