
Predictive Queue Autoscaling with Little's Law

How I built an SLA-driven autoscaler for Laravel queues using queueing theory, trend prediction, and backlog drain algorithms. Replacing arbitrary worker counts with math.

· 10 min read · Sylvester Damgaard

Most Laravel queue setups boil down to a Supervisor config with a hardcoded numprocs=10. It works until it doesn't. Traffic spikes at 9 AM, a marketing campaign fires off 50,000 emails, or a batch import floods the default queue. You either overprovision (wasting resources around the clock) or underprovision (and jobs sit for minutes before a worker picks them up).

I built Laravel Queue Autoscale to replace that guesswork with a declarative SLA target. Instead of saying "I want 10 workers," you say:

php
'max_pickup_time_seconds' => 30,

The package figures out how many workers you need and adjusts continuously.

The Core Idea: SLA-Based Scaling

The traditional approach treats worker count as a knob you tune manually:

ini
[program:queue-worker]
numprocs=10

The problem is that the "right" number changes constantly. Laravel Queue Autoscale flips the model. You define a service level objective (the maximum time a job should wait before a worker picks it up) and the autoscaler maintains whatever worker count meets that target.

The configuration is straightforward. Each queue gets its own SLA, min/max bounds, and cooldown:

php
// config/queue-autoscale.php
'queues' => [
    [
        'connection' => 'redis',
        'queue' => 'critical',
        'max_pickup_time_seconds' => 10,
        'min_workers' => 5,
        'max_workers' => 50,
        'scale_cooldown_seconds' => 30,
    ],
    [
        'connection' => 'redis',
        'queue' => 'background',
        'max_pickup_time_seconds' => 300,
        'min_workers' => 0,
        'max_workers' => 10,
    ],
],

The critical queue guarantees 10-second pickup with at least 5 workers always warm. The background queue can scale to zero when idle and tolerates 5-minute wait times.

The Hybrid Predictive Algorithm

A single formula can't handle every traffic pattern. Steady-state math breaks down during spikes; trend prediction is useless when traffic is flat. So the autoscaler runs three algorithms in parallel and takes the maximum result:

1. Little's Law (Steady State)

Little's Law is a classic result from queueing theory: L = λ × W, where L is the average number of jobs in the system, λ the arrival rate, and W the average time a job spends in it. Rearranged for worker calculation:

Workers = (Pending Jobs / Target Pickup Time) / Processing Rate Per Worker

The actual implementation in HybridPredictiveStrategy:

php
private function littlesLaw(object $metrics, QueueConfiguration $config): int
{
    $pending = $metrics->depth->pending ?? 0;
    $rate = $metrics->processingRate ?? 0.0;
    $sla = $config->maxPickupTimeSeconds;

    // Note: $rate is a float, so compare against 0.0 (a strict === 0
    // check against the int zero would never match).
    if ($pending <= 0 || $rate <= 0.0) {
        return $config->minWorkers;
    }

    return (int) ceil(($pending / $sla) / $rate);
}

For 100 pending jobs, a 60-second SLA, and a processing rate of 10 jobs/second per worker, that gives ceil((100/60) / 10) = 1 worker. Simple, deterministic, O(1).

2. Trend Prediction (Proactive)

Little's Law is reactive. It only sees current state. The trend algorithm forecasts future arrival rates and scales before demand materializes:

Workers = Forecasted Rate x Average Job Time

If current throughput is 10 jobs/second but trending upward at +20%, the forecasted rate becomes 12 jobs/second. The autoscaler provisions for the predicted load, not the current one. This is the difference between scaling up at 9:00 when traffic starts climbing versus scaling up at 9:15 when jobs are already piling up.
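As a minimal standalone sketch (the function name and parameters are illustrative, not the package's actual API), the forecast step boils down to:

php
// Sketch of the trend calculation, assuming a measured arrival rate,
// a trend slope in percent, and an average job duration.
// Illustrative names, not the package's real classes.
function trendWorkers(float $currentRate, float $trendPercent, float $avgJobSeconds): int
{
    // Project the arrival rate forward by the observed trend.
    $forecastedRate = $currentRate * (1 + $trendPercent / 100);

    // Workers = forecasted rate x average job time: Little's Law
    // applied to the predicted load instead of the current one.
    return (int) ceil($forecastedRate * $avgJobSeconds);
}

// 10 jobs/second trending up 20%, jobs averaging 0.5s each:
echo trendWorkers(10.0, 20.0, 0.5); // 6

With a flat trend the same call collapses back to the steady-state answer, which is why the trend estimate only ever adds capacity on top of Little's Law, never removes it.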

3. Backlog Drain (SLA Protection)

When the oldest job in the queue approaches the SLA limit, the backlog drain algorithm takes over aggressively:

php
$timeRemaining = max(1, $slaTarget - $oldestJobAge);
$requiredThroughput = $pendingJobs / $timeRemaining;
$workers = (int) ceil($requiredThroughput / $processingRate);

If you have 500 pending jobs, the oldest is 55 seconds old, and your SLA is 60 seconds, that leaves 5 seconds. Required throughput: 100 jobs/second. At 8 jobs/second per worker, you need 13 workers, plus a safety margin that scales with proximity to breach:

php
$slaUsage = $oldestJobAge / $slaTarget;
$safetyMargin = 1.0 + max(0, ($slaUsage - 0.8) * 2);

At 91.7% SLA usage, the margin is 1.23x, pushing the result to 16 workers. At active breach (>100%), it jumps straight to maxWorkers.
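Putting the three algorithms together, the hybrid selection reduces to taking the maximum of the estimates and clamping the result to the queue's configured bounds. A simplified sketch of that combination (standalone illustration, not the package's actual HybridPredictiveStrategy):

php
// Sketch of the hybrid decision: run all estimators, take the
// maximum, clamp to the queue's min/max worker bounds.
function hybridWorkers(array $estimates, int $minWorkers, int $maxWorkers): int
{
    $target = max($estimates);

    return max($minWorkers, min($maxWorkers, $target));
}

// Worked example: Little's Law says 1 worker, the trend forecast
// says 6, backlog drain (near breach) says 16 -- the drain wins.
echo hybridWorkers([1, 6, 16], 2, 50); // 16

Taking the max means the most pessimistic algorithm always wins, which is the right bias for an SLA: overshooting costs a few idle workers for a cooldown period, undershooting costs a breach.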

The Metrics Pipeline

The autoscaler doesn't collect metrics itself. That part belongs to Laravel Queue Metrics, which provides queue discovery, depth tracking, processing rates, oldest-job age, and trend data through a single call:

php
$allQueues = QueueMetrics::getAllQueuesWithMetrics();

foreach ($allQueues as $queue) {
    // $queue->processingRate   -- jobs/second
    // $queue->depth->pending   -- backlog size
    // $queue->depth->oldestJobAgeSeconds
    // $queue->trend->direction -- 'up', 'down', 'stable'
}

This separation means you can use queue metrics independently for dashboards and alerting, without running the autoscaler.

Resource Constraints

Scaling workers is pointless if you run out of CPU or memory. The autoscaler integrates with System Metrics to cap worker counts at what the system can actually handle:

php
'resource_limits' => [
    'max_cpu_percent' => 90,
    'max_memory_percent' => 85,
    'worker_memory_mb_estimate' => 128,
],

The ResourceConstraintPolicy evaluates system capacity before every scaling decision. On a 16 GB machine with 128 MB per worker, you get a hard cap of ~100 workers regardless of what Little's Law requests.
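The memory side of that cap is plain arithmetic: how many worker-sized slices fit inside the allowed share of total memory. A sketch of the calculation (illustrative, not the actual ResourceConstraintPolicy internals):

php
// Sketch of the memory-based worker cap: the number of workers at a
// fixed per-worker estimate that fit under the memory ceiling.
function memoryWorkerCap(int $totalMemoryMb, int $maxMemoryPercent, int $workerMemoryMb): int
{
    $budgetMb = $totalMemoryMb * $maxMemoryPercent / 100;

    return (int) floor($budgetMb / $workerMemoryMb);
}

// 16 GB machine, 85% memory ceiling, 128 MB per worker:
echo memoryWorkerCap(16384, 85, 128); // 108

With these numbers the ceiling lands just over 100 workers; the effective cap is the tighter of the CPU and memory constraints, so the real limit can be lower under load.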

Scaling Events

Every decision is observable through Laravel events:

php
Event::listen(WorkersScaled::class, function (WorkersScaled $event) {
    Log::info("Scaled {$event->queue}: {$event->from} -> {$event->to} workers");
});

Event::listen(SlaBreachPredicted::class, function (SlaBreachPredicted $event) {
    Notification::route('slack', env('SLACK_ALERT_WEBHOOK'))
        ->notify(new SlaBreachAlert($event->decision));
});

Getting Started

bash
composer require cboxdk/laravel-queue-autoscale
php artisan vendor:publish --tag=queue-autoscale-config
php artisan queue:autoscale

The autoscaler evaluates every 5 seconds by default, spawning php artisan queue:work processes and managing their lifecycle with graceful SIGTERM/SIGKILL shutdown.

The full algorithm documentation, custom strategy interfaces, and scaling policy hooks are covered in the package docs.


// Sylvester Damgaard