← Back to Blog

Heap Dict: The Art of Delayed Repair

In scenarios requiring both fast lookup and priority ordering, how do we design a data structure that balances both operations efficiently? Today we analyze an industrial-grade Heap Dict implementation to understand how designers use delayed repair to balance performance.

The Core Problem

Traditional heaps require O(log N) to maintain heap property after priority updates, while hash tables provide O(1) lookup but can't sort directly. When we need:

  • Get extremum by priority
  • Fast lookup and update by key

Design: Hash + Heap Hybrid

template <class TKey, class TPriority, ...>
class THeapDict {
    // Dual data structure: hash for position, heap for priority
    THashMap<TKey, size_t> PositionsInHeap;  // key -> heap position
    TVector<std::pair<TKey, TPriority>> Heap; // heap storage
    
    TMaybe<size_t> ModifiedPosition;  // delayed repair marker
};

Core Design: Delayed Repair

void ConsiderHeapifying() {
    if (!ModifiedPosition) {
        return;
    }
    Heapify(*ModifiedPosition);  // lazy rebuild
    ModifiedPosition.Clear();
}

When an element is modified, only the position is marked. The heap is actually repaired only when accessed next time.

Trade-off Analysis

Advantages

  • O(1) Lookup: Hash table for direct positioning
  • Delayed Repair: Reduces unnecessary heap adjustments
  • Update Friendly: Priority changes only need marking, no immediate rebuild

Costs

  • Space Overhead: Additional hash table maintenance
  • Complexity: Delayed repair increases code complexity
  • Consistency: Need to ensure timely access after modification

Clean Room Reimplementation: Go Implementation

package heapdict

import "container/heap"

type HeapItem struct {
    Key      string
    Priority int
}

type HeapDict struct {
    items       map[string]int
    positionMap map[string]int
    heap        []HeapItem
}

func New() *HeapDict {
    return &HeapDict{
        items:       make(map[string]int),
        positionMap: make(map[string]int),
        heap:        make([]HeapItem, 0),
    }
}

func (h *HeapDict) Push(key string, priority int) {
    if _, exists := h.items[key]; exists {
        h.items[key] = priority
        h.rebuild()
        return
    }
    
    h.items[key] = priority
    heap.Push(h, HeapItem{Key: key, Priority: priority})
}

func (h *HeapDict) Pop() HeapItem {
    item := heap.Pop(h).(HeapItem)
    delete(h.items, item.Key)
    delete(h.positionMap, item.Key)
    return item
}

func (h *HeapDict) rebuild() {
    h.heap = make([]HeapItem, 0, len(h.items))
    for key, priority := range h.items {
        h.heap = append(h.heap, HeapItem{Key: key, Priority: priority})
    }
    for i := len(h.heap)/2 - 1; i >= 0; i-- {
        h.siftDown(i)
    }
}

Summary

The delayed repair mechanism in Heap Dict embodies the classic lazy evaluation principle:

  1. Hybrid Data Structure: Hash table + heap, combining strengths
  2. Delayed Repair: Mark on modification, repair on access
  3. Trade-off: Space for time, reducing unnecessary computation

This design is particularly effective in scenarios requiring frequent priority updates (like schedulers, leaderboards).