← Back to Blog

Disjoint Interval Tree: Efficient Interval Management

When handling interval queries and interval merging, Disjoint Interval Tree is an efficient data structure. Today we dive into an industrial-grade disjoint interval tree implementation, exploring its ordered map-based design wisdom.

Scenarios and Requirements

Disjoint interval tree application scenarios:

  • Time scheduling: Managing non-overlapping time slots
  • Memory management: Tracking allocated/free memory blocks
  • Interval queries: Quickly determining if a point is in an interval

Core requirements:

  • Efficient insertion: Support single element and interval insertion
  • Automatic merging: Automatically merge adjacent intervals
  • Efficient queries: Quickly determine if point exists
  • Interval erasure: Support point and interval erasure

Solution: Ordered Map Storage

The industrial implementation uses a clever design:

// Core design: using ordered map
using TTree = TMap<T, T>;  // key = interval start, value = interval end (exclusive)

// Invariant: intervals never overlap

Core Design

  1. Interval Representation

    • Interval [a, b) represented as (a, b)
    • key = start, value = end (exclusive)
    • All intervals sorted by start, never overlap
  2. Insertion with Merging

    // Try to extend left interval
    if (p != Tree.end() && p->second == begin) {
        p->second = end;
        // Try to merge adjacent intervals
        if (next != Tree.end() && next->first == end) {
            p->second = next->second;
            Tree.erase(next);
        }
    }
  3. Point Query

    // Find first interval with start > t, then check previous
    TIterator l = Tree.lower_bound(t);
    if (l != Tree.begin()) {
        --l;
        if (l->first < t && t < l->second) {
            return l;  // Found
        }
    }
  4. Interval Erasure

    • May need to split interval
    • Handle three cases: erase at start, end, or middle

Trade-off Analysis

Advantages

  1. O(log n) lookup: Uses map's ordering
  2. Automatic merging: Adjacent intervals merge automatically
  3. Space efficient: Great when interval count << element count
  4. Rich operations: Supports various interval operations

Costs

  1. Memory overhead: Needs map node storage
  2. Complex implementation: Interval merging/splitting logic complex
  3. Invariant maintenance: Must ensure intervals never overlap

Use Cases

  • Frequent point-in-interval queries
  • Interval count much less than element count
  • Need efficient interval merging

Clean Room Reimplementation: Python Implementation

Here's the same design philosophy demonstrated in Python:

class DisjointIntervalTree:
    """Disjoint interval tree"""
    
    def __init__(self):
        self.intervals = []  # sorted list of (start, end)
        self.num_elements = 0
    
    def insert_interval(self, begin, end):
        """Insert interval [begin, end)"""
        # Try to merge with adjacent intervals
        # ...
    
    def has(self, t):
        """Check if point exists"""
        idx = bisect_left(self.intervals, (t + 1, float('inf'))) - 1
        if idx >= 0:
            start, inter_end = self.intervals[idx]
            return start <= t < inter_end
        return False
    
    def erase(self, t):
        """Erase single point, may split interval"""
        # Find containing interval, split based on position
        # ...

The output validates the design:

=== Disjoint Interval Tree Demo (Python) ===

--- Test 1: Insert Elements ---
Insert 5: [(5, 6)], elements=1
Insert 1: [(1, 2), (5, 6)], elements=2
Insert 3: [(3, 4), (1, 2), (5, 6)], elements=3

--- Test 2: Has ---
Has 1: True
Has 3: True
Has 10: False

--- Test 3: Erase ---
Before erase 3: [(1, 2), (3, 4), (5, 6)]
After erase 3: [(1, 2), (5, 6)]

Summary

Disjoint interval tree design embodies simplicity + efficiency balance:

  1. Ordered map: Trade space for time, O(log n) lookup
  2. Automatic merging: Intelligent merge on insertion
  3. Interval splitting: Correct handling of three erasure cases
  4. Non-overlapping invariant: Simplifies query logic

This design isn't a silver bullet, but for interval management scenarios, it's a very elegant trade-off choice.