Performance Tuning
Optimize Queue Autoscale for Laravel for maximum efficiency and cost-effectiveness.
Table of Contents
- Overview
- Configuration Tuning
- Strategy Optimization
- Resource Efficiency
- Scaling Patterns
- Cost Optimization
- Troubleshooting Performance
Overview
Performance tuning focuses on:
- Response Time: How quickly autoscaling reacts to load changes
- Resource Efficiency: Minimizing wasted capacity
- Cost Effectiveness: Balancing performance and expenses
- SLA Compliance: Meeting service level agreements consistently
Performance Metrics
Key Indicators:
- SLA compliance rate (target: >99%)
- Average worker utilization (target: 70-90%)
- Scaling latency (time to adjust workers)
- Cost per job processed
- Oscillation rate (unnecessary scaling events)
Configuration Tuning
Evaluation Interval
The evaluation_interval_seconds setting controls how often scaling decisions are made.
'evaluation_interval_seconds' => 30, // Default
Faster Intervals (10-20s):
- ✅ Quicker response to traffic spikes
- ✅ Better SLA compliance for burst traffic
- ❌ Higher CPU overhead
- ❌ More potential for oscillation
Slower Intervals (60-120s):
- ✅ Lower system overhead
- ✅ More stable, less oscillation
- ❌ Slower reaction to traffic changes
- ❌ Risk of SLA breaches during spikes
Recommendation:
// Bursty traffic: Fast response needed
'evaluation_interval_seconds' => 15,
// Steady traffic: Optimize for stability
'evaluation_interval_seconds' => 60,
// Mixed traffic: Balanced approach
'evaluation_interval_seconds' => 30,
Cooldown Period
scaling.cooldown_seconds (a top-level global setting) prevents rapid oscillation.
'scaling' => ['cooldown_seconds' => 60], // Default
Shorter Cooldown (30-45s): fast reactions, better for variable traffic, but risk of oscillation.
Longer Cooldown (90-180s): very stable, but slower to adapt and may overprovision during decreasing load.
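For example, following the recommendation style above (values illustrative):
// Variable traffic: adapt quickly, accept some oscillation risk
'scaling' => ['cooldown_seconds' => 45],
// Steady traffic: prioritize stability
'scaling' => ['cooldown_seconds' => 120],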
Worker Limits
Per-queue bounds live under the workers key — set via profile or override:
'queues' => [
'payments' => ['workers' => ['min' => 5, 'max' => 50]], // Always warm
'emails' => ['workers' => ['min' => 0, 'max' => 20]], // Can scale to zero
],
The right ceiling depends on:
$maxWorkers = min(
$systemCpuCores * 2, // System capacity
$budgetPerHour / $workerCost, // Cost constraints
$maxConcurrentJobs, // Application limits
);
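With illustrative numbers: 8 cores, a $20/hour budget, $0.50/hour per worker, and an application cap of 25 concurrent jobs gives min(8 × 2, 20 / 0.50, 25) = min(16, 40, 25) = 16 workers.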
SLA Target
sla.target_seconds drives scaling behavior. Change it via a profile or a per-queue override.
'queues' => [
'payments' => ['sla' => ['target_seconds' => 10]],
'reports' => ['sla' => ['target_seconds' => 300]],
],
Aggressive SLA (5-15s): very responsive, but higher cost and potential overprovisioning. Use CriticalProfile for the full bundle.
Moderate SLA (30-90s): balanced cost and performance — BalancedProfile.
Relaxed SLA (120-300s): cost-optimized — BackgroundProfile.
See Workload Profiles for the full comparison.
Strategy Optimization
Choosing the Right Strategy
HybridStrategy (default):
- ✅ Best all-around performance
- ✅ Adapts to different traffic patterns
- ✅ Predictive capabilities
- Use for: Most production workloads
Custom Strategies:
- Consider if you have:
- Very specific traffic patterns
- Domain-specific knowledge
- Unique cost constraints
- Integration with external data
Tuning Hybrid Strategy
'strategy' => [
'class' => \Cbox\LaravelQueueAutoscale\Scaling\Strategies\HybridStrategy::class,
'options' => [
'trend_weight' => 0.7, // How much to trust trend predictions (0-1)
'safety_margin' => 1.2, // Safety buffer (1.0 = no buffer, 1.5 = 50% buffer)
'min_trend_samples' => 3, // Samples needed for trend analysis
],
],
Aggressive Scaling (Responsive):
'options' => [
'trend_weight' => 0.8, // Trust predictions more
'safety_margin' => 1.3, // 30% safety buffer
'min_trend_samples' => 2, // React quickly
]
Conservative Scaling (Stable):
'options' => [
'trend_weight' => 0.5, // Less trust in predictions
'safety_margin' => 1.1, // 10% safety buffer
'min_trend_samples' => 5, // Wait for more data
]
Resource Efficiency
Worker Configuration
Per-worker runtime knobs live under the workers key of a queue config:
'queues' => [
'exports' => [
'workers' => [
'timeout_seconds' => 300, // --max-time= on queue:work
'sleep_seconds' => 3, // --sleep= on queue:work
'tries' => 3, // --tries= on queue:work
],
],
],
Tuning timeout_seconds (how long a worker is kept alive before recycling). Profile your jobs and set it at p95 + ~30%:
// Look at recent job durations in your metrics store or database.
// Set timeout_seconds at p95 + 30%.
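A minimal sketch of that lookup, assuming job durations in seconds are recorded in a hypothetical job_metrics table with a duration_seconds column:
use Illuminate\Support\Facades\DB;

// Pull the last week of durations for one queue and compute p95 + ~30%.
$durations = DB::table('job_metrics')              // hypothetical table
    ->where('queue', 'exports')
    ->where('created_at', '>=', now()->subDays(7))
    ->pluck('duration_seconds')
    ->sort()
    ->values();

$p95 = $durations->get((int) floor($durations->count() * 0.95)) ?? 0;
$timeoutSeconds = (int) ceil($p95 * 1.3);          // p95 + ~30%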
Tuning sleep_seconds (how long a worker sleeps when the queue is empty). Higher-frequency queues benefit from 1–2s; background queues save CPU with 5–10s.
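For example, via the workers key shown above:
'queues' => [
    'payments' => ['workers' => ['sleep_seconds' => 1]],  // hot queue: poll frequently
    'reports'  => ['workers' => ['sleep_seconds' => 10]], // background: save idle CPU
],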
System Resource Limits
The global limits section protects the host from runaway spawning:
'limits' => [
'max_cpu_percent' => 85, // Skip spawning at or above this
'max_memory_percent' => 85, // Same for memory
'worker_memory_mb_estimate' => 128, // Used to derive a per-worker ceiling
'reserve_cpu_cores' => 1, // Cores kept for OS/other services
],
How the worker ceiling is derived (see Resource Constraints for the full math):
$maxByMemory = floor(
$systemMemoryMb * ($limits['max_memory_percent'] / 100) / $limits['worker_memory_mb_estimate']
);
$maxByCpu = ($cpuCores - $limits['reserve_cpu_cores']) * 2;
$hostCeiling = min($maxByMemory, $maxByCpu);
The autoscaler's per-queue workers.max is further capped by this host ceiling.
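With illustrative numbers: a host with 16 GB of RAM (16384 MB) and 8 cores under the defaults above gives maxByMemory = floor(16384 × 0.85 / 128) = 108 and maxByCpu = (8 − 1) × 2 = 14, so the host ceiling is min(108, 14) = 14 workers; CPU, not memory, is the binding limit here.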
Queue Prioritization
Route jobs to appropriate queues:
// High priority: tight SLA, always warm
dispatch(new CriticalJob())->onQueue('critical');
// Standard
dispatch(new StandardJob())->onQueue('default');
// Low priority
dispatch(new ReportJob())->onQueue('background');
And pick a profile per tier:
use Cbox\LaravelQueueAutoscale\Configuration\Profiles\CriticalProfile;
use Cbox\LaravelQueueAutoscale\Configuration\Profiles\BalancedProfile;
use Cbox\LaravelQueueAutoscale\Configuration\Profiles\BackgroundProfile;
'queues' => [
'critical' => CriticalProfile::class, // 10s SLA, 5-50 workers
'default' => BalancedProfile::class, // 30s SLA, 1-10 workers
'background' => BackgroundProfile::class, // 300s SLA, 0-5 workers
],
Scaling Patterns
Pattern 1: Predictable Daily Traffic
For traffic with daily patterns (business hours):
use Cbox\LaravelQueueAutoscale\AutoscaleManager; // assumed location within the package namespace
use Illuminate\Support\Facades\Schedule;
// Scale up before business hours
Schedule::call(function () {
app(AutoscaleManager::class)->overrideMinWorkers('default', 10);
})->weekdays()->at('08:30');
// Scale down after business hours
Schedule::call(function () {
app(AutoscaleManager::class)->overrideMinWorkers('default', 2);
})->weekdays()->at('18:00');
Or use a time-based strategy:
'strategy' => \App\Strategies\TimeBasedStrategy::class,
Pattern 2: Event-Driven Spikes
For predictable events (sales, releases):
// Before major event
Event::listen(MajorEventStarting::class, function () {
app(AutoscaleManager::class)->scaleToCapacity('orders', percentage: 80);
});
// After event
Event::listen(MajorEventEnded::class, function () {
app(AutoscaleManager::class)->resetToNormal('orders');
});
Pattern 3: Gradual Ramp-Up
For smooth scaling during increases:
'options' => [
'max_scale_up_percent' => 50, // Max 50% increase per evaluation
'max_scale_down_percent' => 25, // Max 25% decrease per evaluation
]
Implementation in custom strategy:
$targetWorkers = $this->calculateTarget($metrics, $config);
$currentWorkers = $metrics->activeWorkerCount;
// Limit increase
if ($targetWorkers > $currentWorkers) {
$maxIncrease = (int) ceil($currentWorkers * 0.5); // 50%
$targetWorkers = min($targetWorkers, $currentWorkers + $maxIncrease);
}
// Limit decrease
if ($targetWorkers < $currentWorkers) {
$maxDecrease = (int) ceil($currentWorkers * 0.25); // 25%
$targetWorkers = max($targetWorkers, $currentWorkers - $maxDecrease);
}
Cost Optimization
Calculate Cost Per Job
$workerCostPerHour = 0.50;
$averageJobDuration = 10; // seconds
$jobsPerWorkerPerHour = 3600 / $averageJobDuration; // 360 jobs
$costPerJob = $workerCostPerHour / $jobsPerWorkerPerHour; // $0.00139
Optimize Worker Utilization
Target: 70-90% utilization
// Calculate current utilization
$processingTime = $averageJobDuration * $jobsProcessedPerHour;
$availableTime = $workers * 3600;
$utilization = $processingTime / $availableTime;
if ($utilization < 0.7) {
// Underutilized: Reduce workers
} elseif ($utilization > 0.9) {
// Overutilized: Add workers
}
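With illustrative numbers: 4 workers processing 600 jobs per hour at 10s each gives utilization = 6000 / 14400 ≈ 0.42, well under the 70% floor; dropping to 2 workers raises it to 6000 / 7200 ≈ 0.83, inside the target band.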
Cost-Aware Strategy
Implement budget constraints:
class CostAwareStrategy implements ScalingStrategyContract
{
public function calculateTargetWorkers(object $metrics, QueueConfiguration $config): int
{
// Calculate ideal workers
$idealWorkers = $this->calculateIdeal($metrics, $config);
// Apply budget constraint
$hourlyBudget = 100.00;
$workerCost = 0.50;
$maxAffordableWorkers = (int) floor($hourlyBudget / $workerCost);
return min($idealWorkers, $maxAffordableWorkers);
}
}
Spot Instance Strategy
For cloud deployments, use spot instances for cost savings:
'worker_spawn_strategy' => 'spot', // Use spot instances
'worker_fallback_strategy' => 'on_demand', // Fallback to on-demand
'max_spot_workers' => 15, // Most workers on spot
'min_on_demand_workers' => 3, // Guarantee with on-demand
Troubleshooting Performance
Issue: Slow Scaling Response
Symptoms:
- Jobs pile up before workers scale
- Slow reaction to traffic spikes
Diagnosis: run the manager in -vv mode and watch the time between evaluation cycles and the current → target transitions. If several cycles pass with current < target and no spawn, the cooldown or a policy is blocking the scale-up.
Solutions:
- Reduce evaluation_interval_seconds (default 30s)
- Reduce scaling.cooldown_seconds (default 60s)
- Swap to a profile with a more aggressive forecast policy (CriticalProfile or BurstyProfile)
- Raise workers.min so cold-start latency is not a factor
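A minimal config sketch combining the first two solutions (values illustrative):
'evaluation_interval_seconds' => 15,        // evaluate twice as often
'scaling' => ['cooldown_seconds' => 30],    // shorter anti-flap window
'queues' => [
    'default' => ['workers' => ['min' => 3, 'max' => 20]], // warm floor hides cold starts
],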
Issue: Worker Oscillation
Symptoms:
- Worker count rapidly changing
- Inefficient resource usage
Diagnosis: run the manager in -vv mode during the oscillation window. The log shows every decision with reasoning. If you see scaled UP and scaled DOWN for the same queue within one cooldown window, anti-flapping didn't help — the strategy itself is oscillating.
Alternatively listen on the WorkersScaled event and count direction reversals per queue per minute (see Cookbook → Alert via Log).
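A rough sketch of that reversal counter, assuming the WorkersScaled event sits in the package's Events namespace and exposes queue and direction properties (verify both against your version):
use Cbox\LaravelQueueAutoscale\Events\WorkersScaled; // assumed namespace
use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Facades\Event;
use Illuminate\Support\Facades\Log;

Event::listen(WorkersScaled::class, function ($event) {
    $dirKey = "autoscale:last_dir:{$event->queue}";      // assumed property
    $lastDirection = Cache::get($dirKey);

    if ($lastDirection !== null && $lastDirection !== $event->direction) {
        // Count reversals in a per-minute bucket
        $cntKey = 'autoscale:reversals:' . $event->queue . ':' . now()->format('YmdHi');
        Cache::add($cntKey, 0, 120);

        if (Cache::increment($cntKey) >= 3) {
            Log::warning("Worker oscillation detected on queue {$event->queue}");
        }
    }

    Cache::put($dirKey, $event->direction, 300);
});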
Solutions:
- Increase scaling.cooldown_seconds
- Use a profile with a higher sla.min_samples (a larger p95 window smooths noise)
- Consider a custom policy that rejects small scale-down steps (see ConservativeScaleDownPolicy)
Issue: High Costs
Symptoms:
- Worker count consistently at or near workers.max
- High cloud bills
Diagnosis: listen on the ScalingDecisionMade event and record how often the manager reports limitingFactor === 'config' — that means the configured max is the bottleneck, not capacity or demand. A single log listener with a counter suffices.
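A minimal listener sketch, assuming the event class lives in the package's Events namespace and exposes queue alongside the limitingFactor named above:
use Cbox\LaravelQueueAutoscale\Events\ScalingDecisionMade; // assumed namespace
use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Facades\Event;

Event::listen(ScalingDecisionMade::class, function ($event) {
    if ($event->limitingFactor === 'config') {
        // Daily counter per queue: how often workers.max was the bottleneck
        Cache::increment('autoscale:config_capped:' . $event->queue . ':' . now()->format('Ymd'));
    }
});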
Solutions:
- Optimize job performance: faster jobs need fewer workers
- Relax the SLA: swap to BalancedProfile or BackgroundProfile, or raise sla.target_seconds
- Lower workers.max if the high count is driving cost faster than it's helping the SLA
- Use queue prioritization (critical vs. best-effort queues on separate profiles)
- Batch similar small jobs together
Issue: SLA Breaches
Symptoms:
- Jobs waiting longer than the SLA target
- SlaBreached events firing
Diagnosis: listen on SlaBreached / SlaRecovered and aggregate breach durations. Or run php artisan queue:autoscale:debug --queue=X during a breach to see pickup-time percentiles and backlog.
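A small sketch of that aggregation, assuming both events live in the package's Events namespace and carry a queue property:
use Cbox\LaravelQueueAutoscale\Events\SlaBreached;   // assumed namespace
use Cbox\LaravelQueueAutoscale\Events\SlaRecovered;  // assumed namespace
use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Facades\Event;
use Illuminate\Support\Facades\Log;

Event::listen(SlaBreached::class, function ($event) {
    Cache::put("sla:breach_start:{$event->queue}", now()->timestamp, 3600);
});

Event::listen(SlaRecovered::class, function ($event) {
    if ($start = Cache::pull("sla:breach_start:{$event->queue}")) {
        $duration = now()->timestamp - $start;
        Log::info("SLA breach on {$event->queue} lasted {$duration}s");
    }
});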
Solutions:
- Increase workers.max (you may be capacity-constrained)
- Increase workers.min (cold-start latency at scale-up may be the culprit)
- Tighten sla.target_seconds: counter-intuitive, but a stricter target triggers earlier backlog-drain scaling
- Check for stuck workers via ps aux | grep queue:work: a hung worker consumes a slot without draining
- Lower limits.max_cpu_percent if the host is starving workers
Performance Benchmarks
Expected Performance
| Traffic Pattern | SLA Compliance | Avg Utilization | Scaling Latency |
|---|---|---|---|
| Steady | >99% | 75-85% | N/A (stable) |
| Gradual increase | >98% | 70-80% | 30-60s |
| Sudden spike | >95% | 60-90% | 15-45s |
| Burst traffic | >90% | 50-95% | 10-30s |
Tuning for Your Workload
Measure and optimize iteratively:
// 1. Baseline measurement (1 week)
$this->measureBaseline();
// 2. Identify bottlenecks
$this->analyzeMetrics();
// 3. Apply optimizations
$this->tuneConfiguration();
// 4. Measure improvement
$this->comparePerformance();
// 5. Repeat
See Also
- Configuration - Detailed configuration options
- Custom Strategies - Custom strategy development
- Monitoring - Performance monitoring
- How It Works - Algorithm explanation