← Back to Blog

Lock-Free MPMC Ring Buffer: The Pawl Design

In concurrent programming, MPMC (Multiple Producer Multiple Consumer) queues are one of the most complex queue types. Today we dive into an industrial-grade lock-free MPMC ring buffer implementation, exploring its Pawl (Word Lock Free) design wisdom.

The MPMC Challenge

Compared to MPSC (Single Consumer), MPMC faces core challenges:

  • Multi-consumer competition: Multiple consumers may try to grab the same element
  • Order uncertainty: Speed differences between producers/consumers cause out-of-order execution
  • Empty/Full detection complexity: Precise counting needed to determine queue state

Solution: Pawl Mechanism

The industrial implementation uses a unique Pawl (Producer/Consumer Word Lock Free) mechanism:

// Core design: dual pointer system
ui64 WritePawl = 0;   // Producer竞争指针
ui64 WriteFront = 0;  // Producer实际槽位
ui64 ReadPawl = 0;    // Consumer竞争指针
ui64 ReadFront = 0;   // Consumer实际槽位

Core Design

  1. Pawl vs Front Separation

    • Pawl used for CAS competition (ensures lock-free)
    • Front used for actual slot calculation
    • This separation allows producers/consumers to advance lock-free
  2. WeakPush

    bool WeakPush(void* msg) noexcept {
        auto pawl = AtomicIncrement(WritePawl);
        if (pawl - AtomicGet(ReadFront) >= RingSize) {
            // Queue full, rollback
            AtomicDecrement(WritePawl);
            return false;
        }
        auto slot = AtomicGetAndIncrement(WriteFront);
        // CAS set slot
        return AtomicCas(&RingBuffer[slot % RingSize], msg, nullptr);
    }
  3. Retry Mechanism

    • Try up to MAX_PUSH_TRIES (4 times)
    • Can choose blocking push (StubbornPush) on failure
  4. No Order Guarantee

    • No FIFO guarantee
    • Mostly ordered, but elements may "stick"
    • Provides UnsafeScanningPop to recover stuck elements

Trade-off Analysis

Advantages

  1. Almost lock-free: Producers and consumers rarely need to block
  2. High performance: Ring buffer + atomic operations
  3. No complex synchronization: Simpler compared to other MPMC implementations

Costs

  1. No order guarantee: No FIFO
  2. Elements may stick: Elements may stuck in almost-empty queue
  3. Extra scanning needed: UnsafeScanningPop adds complexity

Use Cases

  • High-throughput scenarios
  • Order-insensitive task distribution
  • Scenarios requiring near lock-free high performance

Clean Room Reimplementation: Python Implementation

Here's the same design philosophy demonstrated in Python:

import threading
from typing import Optional, Any

class AtomicCounter:
    """Simulated atomic counter"""
    def __init__(self, initial: int = 0):
        self._value = initial
        self._lock = threading.Lock()
    
    def fetch_add(self, increment: int = 1) -> int:
        with self._lock:
            old = self._value
            self._value += increment
            return old
    
    def load(self) -> int:
        with self._lock:
            return self._value

class MPMCUnorderedRing:
    """Lock-free MPMC ring buffer"""
    
    MAX_PUSH_TRIES = 4
    MAX_POP_TRIES = 4
    
    def __init__(self, size: int):
        self.size = size
        self.buffer: list[Optional[Any]] = [None] * size
        # Pawl mechanism core
        self.write_pawl = AtomicCounter(0)
        self.write_front = AtomicCounter(0)
        self.read_pawl = AtomicCounter(0)
        self.read_front = AtomicCounter(0)
    
    def weak_push(self, msg: Any) -> bool:
        # 1. Increment write_pawl to get competition position
        pawl = self.write_pawl.fetch_add(1)
        
        # 2. Check if queue is full
        if pawl - self.read_front.load() >= self.size:
            self.write_pawl.fetch_sub(1)
            return False
        
        # 3. Get actual slot
        slot = self.write_front.fetch_add(1)
        
        # 4. CAS set slot
        idx = slot % self.size
        if self.buffer[idx] is None:
            self.buffer[idx] = msg
            return True
        return False
    
    def pop(self) -> Optional[Any]:
        for _ in range(self.MAX_POP_TRIES):
            pawl = self.read_pawl.fetch_add(1)
            
            if pawl > self.write_front.load():
                self.read_pawl.fetch_sub(1)
                return None
            
            slot = self.read_front.fetch_add(1)
            idx = slot % self.size
            
            msg = self.buffer[idx]
            self.buffer[idx] = None
            
            if msg is not None:
                return msg
        return None

The output validates the design:

=== MPMC Unordered Ring Buffer Demo (Python) ===

--- Test 1: Basic Push/Pop ---
Popped: msg-1 msg-2 msg-3 
Popped 3 messages

--- Test 2: Multiple Producers (simulated) ---
Pushed 20 messages, buffer size: 16
Popped 16 messages

--- Test 3: Full Queue Behavior ---
Push 0: True
...
Overflow push result: False
Retry after popping 2: True

Summary

Lock-free MPMC ring buffer's Pawl design embodies simplicity + high performance engineering wisdom:

  1. Pawl mechanism: Uses dual pointers to separate competition from actual position
  2. Weak push: Allows quick failure, retry or block
  3. No-order design: Trades order for performance
  4. Scanning recovery: Handles edge cases

This design isn't a silver bullet, but for order-insensitive high-throughput scenarios, it's a very elegant trade-off choice.