Serial Post-Process Queue: Parallel-Serial Hybrid Architecture
In concurrent programming, some tasks need parallel processing first, then serial aggregation. This article walks through an industrial-grade serial post-process queue implementation and the design of its parallel-serial hybrid architecture.
Scenarios and Requirements
This two-stage processing pattern applies to:
- Batch data processing: Parallel computation first, then serial write
- Log collection: Parallel aggregation first, then serial disk write
- Report generation: Parallel statistics first, then serial report generation
Core requirements:
- High performance: First stage needs parallel processing for throughput
- Order guarantee: Second stage needs serial execution for order
- Resource control: Queue size limit to prevent memory overflow
Solution: Two-Stage Queue Design
The industrial implementation uses a two-stage architecture built around two queues:
// Core design: two queues
IThreadPool* ParallelQueue;        // Parallel processing queue
THolder<IThreadPool> SerialQueue;  // Serial processing queue

// Task wrapping
class TAsync: public IObjectInQueue {
    void Process(void* threadSpecificResource) override {
        WorkItem->ParallelProcess();
        Done(); // Mark completion
    }
};

class TSync: public IObjectInQueue {
    void Process(void*) override {
        Async.Wait();              // Wait for the parallel stage to complete
        WorkItem->SerialProcess(); // Serial post-process
    }
};
Core Design
TAsync: Parallel Processing
- Executes in thread pool
- Notifies via condition variable on completion
TSync: Serial Wrapper
- Added to serial queue first
- Waits for TAsync to complete
- Executes SerialProcess()
Add Flow
bool Add(TAutoPtr<IProcessObject> obj) {
    TSync* sync = new TSync(obj);
    // Add to the serial queue first
    SerialQueue->Add(sync);
    // Then add to the parallel queue
    IObjectInQueue* async = sync->GetAsync();
    ParallelQueue->Add(async);
    return true;
}
Trade-off Analysis
Advantages
- High performance: Parallel stage utilizes multiple cores
- Order guarantee: Serial stage ensures order
- Resource control: Configurable queue size
Costs
- Complexity: Two-stage coordination adds complexity
- Latency: Serial stage may become bottleneck
- Memory: Tasks need to be kept until serial completes
Use Cases
- Tasks requiring parallel first then serial
- Order-sensitive post-processing
- Large batch data processing
Clean Room Reimplementation: Go Implementation
Here's the same design philosophy demonstrated in Go:
type SerialPostProcessQueue struct {
    parallelPool *WorkerPool
    serialQueue  chan *SerialTask
}

// Two-stage processing
func (q *SerialPostProcessQueue) Add(obj ProcessObject) {
    task := &SerialTask{
        object:    obj,
        completed: make(chan bool, 1),
    }
    // Stage 1: parallel processing
    q.parallelPool.Submit(func() {
        obj.ParallelProcess()
        task.completed <- true
    })
    // Stage 2: enqueue for serial processing
    q.serialQueue <- task
}

// Serial worker
go func() {
    for task := range q.serialQueue {
        <-task.completed            // Wait for the parallel stage to complete
        task.object.SerialProcess() // Serial post-process
    }
}()
The output validates the design:
=== Serial-PostProcess Queue Demo (Go) ===
--- Adding Tasks ---
Added task 1
Added task 2
Added task 3
Added task 4
Added task 5
--- Processing (parallel first, then serial) ---
[Parallel] Task 1 processing...
[Parallel] Task 4 processing...
[Parallel] Task 2 processing...
[Parallel] Task 3 processing...
[Parallel] Task 3 parallel done
[Parallel] Task 5 processing...
...
[Serial] Task 1 post-processing...
[Serial] Task 2 post-processing...
[Serial] Task 3 post-processing...
...
Summary
The serial post-process queue's parallel-serial hybrid architecture balances throughput against ordering:
- Two-stage design: Parallel improves throughput, serial ensures order
- Task wrapping: TSync waits for TAsync to complete
- Resource control: Queue size limits
- Graceful shutdown: Stop waits for all tasks to complete
This design isn't a silver bullet, but for scenarios that need parallel processing followed by ordered serial post-processing, it is an elegant trade-off.