Workload Profiles
A profile is a PHP class whose resolve() method returns a pre-tuned bundle of SLA, forecast, worker, and spawn-compensation settings. Point a queue at a profile class and get a reasonable default without tuning individual keys.
SLA Timing Floor
All SLA targets are subject to a hard floor imposed by Laravel's queue worker internals. Because of the queue:work command's sleep/poll loop, pickup time cannot be faster than ~3-5 seconds, even with idle workers running. SLA targets below 5 seconds will therefore produce spurious breach events no matter how many workers are running.
Profiles with workers.min = 0 (BurstyProfile, BackgroundProfile) have an additional ~5-7 seconds of scale-from-zero overhead (evaluation interval + worker spawn latency) on top of the poll floor. This is a conscious cost-vs-responsiveness trade-off built into those profiles.
See Understanding SLA Timing for the full explanation.
Shipped Profiles
Six profiles ship with the package:
| Profile | SLA | Min | Max | Percentile | Forecast policy | Intended use |
|---|---|---|---|---|---|---|
| CriticalProfile | 10s | 5 | 50 | p99 | Aggressive | Payments, order fulfilment |
| HighVolumeProfile | 20s | 3 | 40 | p95 | Moderate | Emails, batch notifications |
| BalancedProfile ⭐ | 30s | 1 | 10 | p95 | Moderate | General purpose default |
| BurstyProfile | 60s | 0 | 100 | p90 | Aggressive | Webhook storms, campaign fanouts |
| BackgroundProfile | 300s | 0 | 5 | p95 | Hint-only | Cleanup, analytics |
| ExclusiveProfile (v3) | 60s | 1 (pinned) | 1 (pinned) | p95 | Disabled | Sequential integrations |
BalancedProfile is the default sla_defaults. Change it only if your typical queue has a tighter or looser SLA than 30 seconds.
Using a profile
use Cbox\LaravelQueueAutoscale\Configuration\Profiles\BalancedProfile;
use Cbox\LaravelQueueAutoscale\Configuration\Profiles\CriticalProfile;

'sla_defaults' => BalancedProfile::class,

'queues' => [
    'payments' => CriticalProfile::class,
],
Or override individual keys on top of the default profile (sla_defaults):
'queues' => [
    'exports' => [
        'sla' => ['target_seconds' => 45],
        'workers' => ['max' => 20],
    ],
],
The deep merge keeps every value the profile defines except what you explicitly overrode.
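The merge behaviour described above can be approximated with PHP's built-in array_replace_recursive(). This is only a sketch: the package's own merge routine is not shown in this document and may differ in edge cases. The $profile array is a hand-copied stand-in for the BalancedProfile values listed further down, not a call into the package.

```php
<?php
// Stand-in for the array a profile's resolve() returns (BalancedProfile values).
$profile = [
    'sla' => [
        'target_seconds' => 30,
        'percentile' => 95,
        'window_seconds' => 300,
        'min_samples' => 20,
    ],
    'workers' => ['min' => 1, 'max' => 10, 'sleep_seconds' => 3],
];

// The per-queue overrides from the 'exports' example.
$overrides = [
    'sla' => ['target_seconds' => 45],
    'workers' => ['max' => 20],
];

// Nested keys present in $overrides win; everything else is kept from $profile.
$effective = array_replace_recursive($profile, $overrides);
```

After the merge, sla.target_seconds and workers.max carry the overridden values, while sla.percentile, sla.window_seconds, workers.min, and the rest keep the profile's values.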
The five autoscaling profiles
Each profile below lists the exact values from its resolve() method. All settings defaulted elsewhere (like workers.timeout_seconds = 3600, spawn_compensation.enabled = true) are omitted here for brevity — check the class source if you need the full shape.
CriticalProfile
Use for: payment processing, order fulfilment, real-time user-facing operations where a 10-second pickup SLA is a hard requirement.
Don't use for: background work. workers.min = 5 keeps five workers running 24/7.
'sla' => [
    'target_seconds' => 10,
    'percentile' => 99,
    'window_seconds' => 120,
    'min_samples' => 20,
],
'workers' => ['min' => 5, 'max' => 50, 'sleep_seconds' => 1],
'forecast' => ['policy' => AggressiveForecastPolicy::class, 'horizon_seconds' => 60],
'spawn_compensation' => ['min_samples' => 3, 'ema_alpha' => 0.3],
Aggressive forecasting blends predicted traffic into the scaling decision, so spikes are met before backlog builds. The p99 signal tolerates one outlier per 100 jobs without over-reacting.
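To see why a p99 signal absorbs a single outlier in 100 samples, here is a minimal nearest-rank percentile sketch. The function name nearestRankPercentile() is hypothetical, and the package may compute its percentile differently (for example with interpolation); the sketch only illustrates the arithmetic.

```php
<?php
/**
 * Nearest-rank percentile: the smallest sample such that at least
 * $percentile percent of all samples are less than or equal to it.
 */
function nearestRankPercentile(array $samples, int $percentile): float
{
    if ($samples === []) {
        throw new InvalidArgumentException('At least one sample is required.');
    }

    sort($samples);

    // Integer ceiling of (percentile * count / 100), avoiding float rounding.
    $rank = intdiv($percentile * count($samples) + 99, 100);

    return (float) $samples[max(0, $rank - 1)];
}

// 99 pickups at 2 seconds plus one 60-second outlier.
$pickupTimes = array_fill(0, 99, 2.0);
$pickupTimes[] = 60.0;

$p99 = nearestRankPercentile($pickupTimes, 99);   // the outlier is excluded
$max = nearestRankPercentile($pickupTimes, 100);  // the outlier itself
```

With these samples the p99 stays at the typical pickup time, so one slow job in 100 does not push the signal over a 10-second target; a second outlier in the same window would.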
HighVolumeProfile
Use for: high-throughput queues where absolute latency is less critical than keeping up — bulk emails, notifications, mass imports.
Don't use for: queues with tight per-job SLA requirements or spiky traffic.
'sla' => [
    'target_seconds' => 20,
    'percentile' => 95,
    'window_seconds' => 300,
    'min_samples' => 50,
],
'workers' => ['min' => 3, 'max' => 40, 'sleep_seconds' => 2],
'forecast' => ['policy' => ModerateForecastPolicy::class, 'horizon_seconds' => 60],
Higher min_samples (50) means the autoscaler waits for more measurements before trusting the p95 — appropriate when throughput is high and noise is averaged out quickly.
BalancedProfile ⭐
Use for: the default. Anything user-facing but not real-time: exports, synchronous-equivalent workflows, general-purpose queues.
Don't use for: queues where defaults don't fit either direction (too loose for critical, too tight for background).
'sla' => [
    'target_seconds' => 30,
    'percentile' => 95,
    'window_seconds' => 300,
    'min_samples' => 20,
],
'workers' => ['min' => 1, 'max' => 10, 'sleep_seconds' => 3],
'forecast' => ['policy' => ModerateForecastPolicy::class, 'horizon_seconds' => 60],
This is the sla_defaults in the published config. Most apps never change it.
BurstyProfile
Use for: queues with highly variable traffic — webhook receivers, campaign fanouts, anything where you go from 0 to 1000 jobs in seconds then back to idle.
Don't use for: steady-state high-throughput queues (use HighVolumeProfile instead — BurstyProfile's workers.min = 0 means cold-starts on every burst).
'sla' => [
    'target_seconds' => 60,
    'percentile' => 90,
    'window_seconds' => 600,
    'min_samples' => 20,
],
'workers' => ['min' => 0, 'max' => 100, 'sleep_seconds' => 3],
'forecast' => ['policy' => AggressiveForecastPolicy::class, 'horizon_seconds' => 120],
The longer forecast horizon (120s) lets the aggressive forecast catch ramps earlier. min = 0 lets the queue scale fully to zero between bursts to save cost.
BackgroundProfile
Use for: cleanup jobs, analytics batches, reports — anywhere a 5-minute pickup SLA is comfortably acceptable.
Don't use for: anything a user is waiting for.
'sla' => [
    'target_seconds' => 300,
    'percentile' => 95,
    'window_seconds' => 900,
    'min_samples' => 20,
],
'workers' => ['min' => 0, 'max' => 5, 'sleep_seconds' => 10],
'forecast' => ['policy' => HintForecastPolicy::class, 'horizon_seconds' => 300],
HintForecastPolicy barely influences the scaling decision — for a 5-minute SLA queue, reactive scaling is fine, and the extra prediction machinery isn't worth the compute. sleep_seconds = 10 also reduces idle-worker CPU.
The pinned profile
ExclusiveProfile
Use for: queues where jobs must run one at a time in order. Customer integrations that require single-connection semantics, legacy APIs with strict per-client rate limits, anything where two concurrent jobs would corrupt state.
Don't use for: anything you want the autoscaler to actually scale. This profile disables scaling entirely.
'sla' => ['target_seconds' => 60, 'percentile' => 95, 'min_samples' => 20],
'workers' => ['min' => 1, 'max' => 1, 'scalable' => false],
'forecast' => ['policy' => DisabledForecastPolicy::class],
'spawn_compensation' => ['enabled' => false],
scalable = false flips the autoscaler into supervisor mode for this queue: it ensures exactly one live worker and respawns on death, but never evaluates scaling. SLA breach events still fire so operators see when the queue is falling behind — but no scaling happens, because the whole point is to preserve ordering.
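The supervisor behaviour can be pictured as a simple reconciliation step: compare the pinned count with the number of live workers and spawn only the difference. The function below is a hypothetical illustration, not the package's internal code; its name and parameters are assumptions.

```php
<?php
/**
 * How many workers a supervisor would need to start to get back to the
 * pinned count. It never returns a negative number and never scales
 * beyond the pinned count.
 */
function workersToSpawn(int $pinnedCount, int $aliveCount): int
{
    return max(0, $pinnedCount - $aliveCount);
}

$afterCrash = workersToSpawn(1, 0); // the single worker died: respawn it
$healthy    = workersToSpawn(1, 1); // the worker is alive: nothing to do
```

Queue depth and SLA signals do not appear in the calculation at all, which is what keeps job ordering intact.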
See Queue Topology → Exclusive Queues for the full behaviour model.
Scale-to-Zero
Any scalable profile can scale down to zero workers by setting workers.min = 0. Two shipped profiles already do this: BurstyProfile and BackgroundProfile.
When to use
Scale-to-zero is appropriate for queues with sporadic or unpredictable traffic where it is acceptable that jobs are not processed immediately. Examples:
- Webhook receivers that only fire during external events
- Nightly report generation
- Cleanup and maintenance jobs
- Campaign or batch notification queues
Wakeup latency
When a queue scales from 0 to 1 worker, there is a cold-start delay before the first job is picked up:
| Component | Typical duration |
|---|---|
| Evaluation interval (polling) | 5 seconds (configurable) |
| Worker process spawn | 1–3 seconds |
| Worker poll/sleep loop | 1–3 seconds |
| Total wakeup latency | ~7–11 seconds |
This delay is inherent to the polling-based architecture: the autoscaler must first observe pending jobs, then spawn a worker, which then polls the queue.
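The total in the table is just the sum of the three components. A small sketch of that arithmetic, using the typical durations from the table as defaults (the function name and parameter layout are illustrative assumptions):

```php
<?php
/**
 * Best-case and worst-case wakeup latency in seconds, given the
 * evaluation interval and [min, max] ranges for spawn and poll time.
 */
function wakeupLatencyRange(
    int $evaluationInterval = 5,
    array $spawnRange = [1, 3],
    array $pollRange = [1, 3],
): array {
    return [
        $evaluationInterval + $spawnRange[0] + $pollRange[0],
        $evaluationInterval + $spawnRange[1] + $pollRange[1],
    ];
}

[$best, $worst] = wakeupLatencyRange();          // default 5-second interval
[$fastBest, $fastWorst] = wakeupLatencyRange(2); // a shorter, hypothetical interval
```

Shortening the evaluation interval is the only lever in this sum that configuration controls directly; spawn and poll time depend on the host and the worker's sleep setting.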
SLA implications
An SLA target should be at least 2–3x the wakeup latency to avoid false breach events on every cold start. For a default 5-second evaluation interval:
- SLA < 15 seconds: Will breach on nearly every cold start. Not compatible with scale-to-zero.
- SLA 30–60 seconds: Comfortable buffer. Recommended minimum.
- SLA 300+ seconds: Ideal for background work.
BurstyProfile (SLA 60s) and BackgroundProfile (SLA 300s) are both designed with this trade-off in mind.
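As a worked example of the 2–3x rule, the sketch below derives the minimum SLA from the worst-case wakeup latency of about 11 seconds and compares a few targets against the lower bound. The helper name and the choice to use the worst case are assumptions made for illustration.

```php
<?php
/** Minimum SLA target in seconds for a given wakeup latency and safety factor. */
function minimumSlaSeconds(float $wakeupSeconds, float $safetyFactor): float
{
    return $wakeupSeconds * $safetyFactor;
}

$worstCaseWakeup = 11.0; // worst case for a 5-second evaluation interval

$lowerBound = minimumSlaSeconds($worstCaseWakeup, 2.0);
$upperBound = minimumSlaSeconds($worstCaseWakeup, 3.0);

// Compare some SLA targets against the 2x lower bound.
$clearsLowerBound = static fn (int $slaSeconds): bool => $slaSeconds >= $lowerBound;

$critical   = $clearsLowerBound(10);  // CriticalProfile's target
$bursty     = $clearsLowerBound(60);  // BurstyProfile's target
$background = $clearsLowerBound(300); // BackgroundProfile's target
```

A 10-second target falls below the bound, which is why CriticalProfile keeps workers.min = 5 rather than scaling to zero, while the 60-second and 300-second targets clear it.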
Configuration
Override workers.min on any scalable profile:
'queues' => [
    'webhooks' => [
        'profile' => BalancedProfile::class,
        'overrides' => ['workers' => ['min' => 0]],
    ],
],
Or set it directly:
'queues' => [
    'webhooks' => [
        'sla' => ['target_seconds' => 60],
        'workers' => ['min' => 0, 'max' => 20],
    ],
],
Scale-to-zero is not compatible with ExclusiveProfile or any non-scalable configuration (scalable = false), which requires at least one worker at all times.
Custom profiles
If none of the shipped profiles matches your workload, write your own. It's a small class:
<?php

namespace App\QueueAutoscale\Profiles;

use Cbox\LaravelQueueAutoscale\Contracts\ProfileContract;
use Cbox\LaravelQueueAutoscale\Scaling\Calculators\LinearRegressionForecaster;
use Cbox\LaravelQueueAutoscale\Scaling\Forecasting\Policies\ModerateForecastPolicy;

final readonly class ReportsProfile implements ProfileContract
{
    public function resolve(): array
    {
        return [
            'sla' => [
                'target_seconds' => 90,
                'percentile' => 95,
                'window_seconds' => 600,
                'min_samples' => 20,
            ],
            'forecast' => [
                'forecaster' => LinearRegressionForecaster::class,
                'policy' => ModerateForecastPolicy::class,
                'horizon_seconds' => 120,
                'history_seconds' => 600,
            ],
            'workers' => [
                'min' => 0,
                'max' => 8,
                'tries' => 3,
                'timeout_seconds' => 3600,
                'sleep_seconds' => 5,
                'shutdown_timeout_seconds' => 30,
            ],
            'spawn_compensation' => [
                'enabled' => true,
                'fallback_seconds' => 2.0,
                'min_samples' => 5,
                'ema_alpha' => 0.2,
            ],
        ];
    }
}
Use it in config:
'queues' => [
    'reports' => \App\QueueAutoscale\Profiles\ReportsProfile::class,
],
Start from the shipped profile closest to your target and tweak. Most common changes:
- Different SLA target. Change sla.target_seconds.
- Different scaling aggressiveness. Swap forecast.policy between Aggressive, Moderate, Hint, Disabled.
- Different worker bounds. Change workers.min and workers.max.
- Pinned count > 1. Set workers.min = workers.max = N and workers.scalable = false.
See Also
- Queue Topology — when to use per-queue, groups, exclusive, excluded
- Configuration — the full config reference
- How It Works — the algorithms a profile's values tune
- Custom Strategies — replacing entire scaling algorithms, not just tuning them