Priority-Aware Work Stealing: The Major/Minor Queue Pattern
In the design of high-performance concurrent systems, schedulers often face a fundamental dilemma: prioritize global load balancing or optimize for localized, low-latency execution?
Traditional work-stealing algorithms (like the Chase-Lev deque) excel at distributing work but can introduce significant implementation complexity and cache coherency overhead. This post explores a more streamlined, asymmetric scheduling model—the Major/Minor Queue Architecture. This design doesn't attempt to solve every scheduling problem but instead offers a pragmatic trade-off for systems where certain tasks must take precedence.
The Core Concept: Explicit Hierarchy
The philosophy behind this model is simple: not all work is created equal. It divides task sources into two distinct categories:
- The Major Queue: This is the designated lane for high-priority tasks generated by the core execution path. It is often bounded and optimized for the local consumer.
- The Minor Queues: A dynamic collection of secondary task sources. These could be queues from other idle threads, background maintenance pools, or lower-priority batch jobs.
Asymmetric Consumption Logic
Unlike symmetric work-stealing where any thread can steal from any other with equal probability, this model enforces a strict Two-Level Pop strategy:
- Check Major First: The scheduler always polls the Major Queue first. This operation is designed to be non-blocking and extremely fast. If work exists here, it is executed immediately. This guarantees that the critical path remains hot and responsive.
- Fallback to Minors: Only when the Major Queue is empty does the scheduler "downgrade" its priority to scan the collection of Minor Queues. This is the "stealing" phase—an opportunistic attempt to find work from other sources to fill idle cycles.
This strict ordering keeps the critical path insulated: the main workflow never pays the cost of scanning secondary sources unless it has absolutely nothing else to do, so high-priority work is never delayed on behalf of low-priority work.
The Cost of Simplicity (Trade-offs)
Every architectural decision comes with a price. While the Major/Minor pattern reduces latency for high-priority tasks, it introduces specific challenges that system designers must navigate.
1. Starvation Risk
The most significant downside is the potential for starvation. If the Major Queue is continuously populated, the scheduler will never enter the second phase. Tasks sitting in the Minor Queues may languish indefinitely. This makes the model unsuitable for workloads requiring strict fairness but ideal for systems with a clear "control plane vs. data plane" separation.
2. Lifecycle Complexity
Managing the lifecycle of dynamic queues is tricky. If the Major Queue is shut down, does that imply the Minor sources should also be detached? In a system where Minor queues are registered dynamically (e.g., by transient worker threads), ensuring clean shutdown without leaking resources requires careful orchestration.
3. Scanning Overhead
When the system is idle, a worker might enter a busy loop scanning an empty list of Minor queues. While checking the local Major queue is cheap, iterating through a dynamic list of remote queues can cause cache thrashing, especially on NUMA architectures.
A Go Implementation Demo
To visualize this scheduling philosophy, we can build a simplified model using Go. Go's channel and select primitives are perfect for demonstrating non-blocking priority selection.
In this demonstration, the PriorityScheduler explicitly manages a primary Major channel and a thread-safe map of Minor channels.
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// Task is the interface for generic work units.
type Task interface {
	Execute()
}

// PriorityScheduler implements the asymmetric Major/Minor pattern.
// It prioritizes a local Major channel over a dynamic set of Minor channels.
type PriorityScheduler struct {
	majorChan chan Task
	minors    sync.Map // simulates a lock-free container for stealable sources
	count     int64    // tracks the number of registered minor queues
}
func NewPriorityScheduler(bufferSize int) *PriorityScheduler {
	return &PriorityScheduler{
		majorChan: make(chan Task, bufferSize),
	}
}

// PushMajor submits a core task. This is the "fast path".
func (s *PriorityScheduler) PushMajor(task Task) error {
	select {
	case s.majorChan <- task:
		return nil
	default:
		return errors.New("major queue full")
	}
}

// RegisterMinor allows dynamic registration of a "victim" queue.
// This simulates other components exposing their work to the scheduler.
func (s *PriorityScheduler) RegisterMinor(id string, minorChan chan Task) {
	s.minors.Store(id, minorChan)
	atomic.AddInt64(&s.count, 1)
}
// Fetch implements the critical "Two-Level Pop" logic.
func (s *PriorityScheduler) Fetch(ctx context.Context) (Task, error) {
	// Phase 1: check the Major Queue strictly first.
	// Using select with a default case keeps this non-blocking.
	select {
	case task := <-s.majorChan:
		return task, nil
	case <-ctx.Done():
		return nil, ctx.Err()
	default:
		// Phase 2: downgrade to scanning the Minor queues.
		// We only reach here if the Major path was empty.
		var foundTask Task
		// Iterate through all registered minor sources.
		// Note: the iteration order here affects fairness; real
		// implementations might randomize the starting point.
		s.minors.Range(func(key, value interface{}) bool {
			ch := value.(chan Task)
			select {
			case t := <-ch:
				foundTask = t
				return false // stole a task, stop scanning
			default:
				return true // this minor queue is empty, check the next
			}
		})
		if foundTask != nil {
			return foundTask, nil
		}
	}
	return nil, errors.New("no tasks available")
}
// ---------------------------------------------------------
// Test scaffolding
// ---------------------------------------------------------

type SampleTask struct {
	ID string
}

func (t *SampleTask) Execute() {
	fmt.Printf("Executing task: %s\n", t.ID)
}
func main() {
	scheduler := NewPriorityScheduler(10)
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// Scenario: register a remote node's queue as a Minor source.
	minorQ := make(chan Task, 5)
	scheduler.RegisterMinor("node-1", minorQ)

	// 1. Inject a high-priority task (Major).
	if err := scheduler.PushMajor(&SampleTask{ID: "Major-Priority-Job"}); err != nil {
		fmt.Println("Push error:", err)
	}

	// 2. Inject a low-priority/remote task (Minor).
	minorQ <- &SampleTask{ID: "Stolen-Job-From-Minor"}

	// 3. Demonstrate the consumption order.
	fmt.Println("--- Start Scheduling ---")
	for i := 0; i < 2; i++ {
		task, err := scheduler.Fetch(ctx)
		if err == nil {
			task.Execute()
		} else {
			fmt.Println("Fetch error:", err)
		}
	}
}
Conclusion
The Major/Minor queue architecture demonstrates a pragmatic engineering truth: symmetry is not always the goal. By decoupling "local high-priority work" from "external stealable work," we gain latency guarantees for critical paths while retaining the ability to balance load during idle periods. Although it introduces starvation risks and management overhead, it remains a powerful tool for systems where the responsiveness of the main thread is non-negotiable.