Performance Tuning
Optimize Laravel Queue Autoscale for maximum efficiency and cost-effectiveness.
Table of Contents
- Overview
- Configuration Tuning
- Strategy Optimization
- Resource Efficiency
- Scaling Patterns
- Cost Optimization
- Troubleshooting Performance
Overview
Performance tuning focuses on:
- Response Time: How quickly autoscaling reacts to load changes
- Resource Efficiency: Minimizing wasted capacity
- Cost Effectiveness: Balancing performance and expenses
- SLA Compliance: Meeting service level agreements consistently
Performance Metrics
Key Indicators:
- SLA compliance rate (target: >99%)
- Average worker utilization (target: 70-90%)
- Scaling latency (time to adjust workers)
- Cost per job processed
- Oscillation rate (unnecessary scaling events)
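Two of these indicators can be computed directly from recorded samples. The sketch below is illustrative and not part of the package: `oscillationRate` and `slaComplianceRate` are hypothetical helpers, and "oscillation" is defined here as a scaling event that reverses the direction of the previous one, which is one reasonable definition among several.

```php
<?php
declare(strict_types=1);

/**
 * Share of scaling events that reverse the previous event's direction.
 * Hypothetical helper; $workerCounts holds the worker count at each evaluation.
 *
 * @param int[] $workerCounts
 */
function oscillationRate(array $workerCounts): float
{
    $events = 0;
    $reversals = 0;
    $lastDirection = 0;

    for ($i = 1, $n = count($workerCounts); $i < $n; $i++) {
        $direction = $workerCounts[$i] <=> $workerCounts[$i - 1]; // -1, 0 or 1
        if ($direction === 0) {
            continue; // worker count unchanged: no scaling event
        }
        $events++;
        if ($lastDirection !== 0 && $direction !== $lastDirection) {
            $reversals++; // scaled up right after scaling down, or vice versa
        }
        $lastDirection = $direction;
    }

    return $events === 0 ? 0.0 : $reversals / $events;
}

/**
 * Fraction of jobs picked up within the SLA target.
 *
 * @param array<int|float> $pickupTimes seconds each job waited before pickup
 */
function slaComplianceRate(array $pickupTimes, int $maxPickupTimeSeconds): float
{
    if ($pickupTimes === []) {
        return 1.0; // nothing waited, so nothing breached
    }
    $within = count(array_filter(
        $pickupTimes,
        fn ($seconds) => $seconds <= $maxPickupTimeSeconds
    ));

    return $within / count($pickupTimes);
}
```

A steadily growing series such as `[1, 2, 3, 4]` yields an oscillation rate of zero, while a series that alternates up and down approaches one.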
Configuration Tuning
Evaluation Interval
The evaluation_interval_seconds setting controls how often scaling decisions are made.
'evaluation_interval_seconds' => 30, // Default
Faster Intervals (10-20s):
- ✅ Quicker response to traffic spikes
- ✅ Better SLA compliance for burst traffic
- ❌ Higher CPU overhead
- ❌ More potential for oscillation
Slower Intervals (60-120s):
- ✅ Lower system overhead
- ✅ More stable, less oscillation
- ❌ Slower reaction to traffic changes
- ❌ Risk of SLA breaches during spikes
Recommendation:
// Bursty traffic: Fast response needed
'evaluation_interval_seconds' => 15,
// Steady traffic: Optimize for stability
'evaluation_interval_seconds' => 60,
// Mixed traffic: Balanced approach
'evaluation_interval_seconds' => 30,
Cooldown Period
The scale_cooldown_seconds setting enforces a minimum gap between scaling actions, which prevents rapid oscillation.
'scale_cooldown_seconds' => 60, // Default
Shorter Cooldown (30-45s):
- ✅ Faster reactions to changing load
- ✅ Better for highly variable traffic
- ❌ Risk of oscillation
- ❌ More frequent scaling operations
Longer Cooldown (90-180s):
- ✅ Very stable, minimal oscillation
- ✅ Lower scaling overhead
- ❌ Slower to adapt
- ❌ May overprovision during decreasing load
Recommendation:
// Critical queue: Balance speed and stability
'scale_cooldown_seconds' => 30,
// Standard queue: Favor stability
'scale_cooldown_seconds' => 60,
// Background queue: Maximize stability
'scale_cooldown_seconds' => 120,
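The evaluation interval and the cooldown interact: a cooldown that is not a multiple of the interval is effectively rounded up to the next evaluation. The sketch below estimates the minimum spacing between two consecutive scaling actions. It assumes the cooldown is only checked when an evaluation runs, which is an assumption about the scheduler rather than documented behavior, and `secondsBetweenScalingActions` is a hypothetical helper.

```php
<?php
declare(strict_types=1);

/**
 * Minimum time between two scaling actions, assuming the cooldown is
 * only checked at evaluation ticks (an assumption, not package behavior).
 */
function secondsBetweenScalingActions(int $evaluationIntervalSeconds, int $scaleCooldownSeconds): int
{
    if ($evaluationIntervalSeconds <= 0) {
        throw new InvalidArgumentException('Evaluation interval must be positive.');
    }

    // Number of evaluation ticks that must pass before the cooldown has expired.
    $ticks = max(1, (int) ceil($scaleCooldownSeconds / $evaluationIntervalSeconds));

    return $ticks * $evaluationIntervalSeconds;
}
```

With a 30 second interval, a 45 second cooldown behaves like a 60 second one, so under this assumption there is little benefit in choosing cooldown values between multiples of the interval.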
Worker Limits
Set min_workers and max_workers to bound the worker count for each queue.
'min_workers' => 1,
'max_workers' => 20,
Min Workers:
// Can scale to zero: Cost-optimized for idle queues
'min_workers' => 0,
// Always ready: Instant processing for critical queues
'min_workers' => 5,
// Balanced: Some baseline capacity
'min_workers' => 2,
Max Workers:
// Calculate based on:
// Take the most restrictive of the three limits, rounded down to a whole worker
$maxWorkers = (int) floor(min(
    $systemCpuCores * 2,           // System capacity
    $budgetPerHour / $workerCost,  // Cost constraints
    $maxConcurrentJobs             // Application limits
));
SLA Target
The max_pickup_time_seconds setting is the SLA target: the longest a job should wait before a worker picks it up. It drives scaling behavior.
'max_pickup_time_seconds' => 60, // Default
Aggressive SLA (5-15s):
- ✅ Very responsive system
- ✅ Excellent user experience
- ❌ Higher costs (more workers)
- ❌ May overprovision
Moderate SLA (30-90s):
- ✅ Balanced cost and performance
- ✅ Good for most applications
- ❌ Noticeable delays during spikes
Relaxed SLA (120-300s):
- ✅ Cost-optimized
- ✅ Suitable for background tasks
- ❌ Slow responsiveness
- ❌ Not suitable for user-facing features
Recommendation by Queue Type:
'queues' => [
// User-facing: Aggressive SLA
['queue' => 'notifications', 'max_pickup_time_seconds' => 10],
// Business-critical: Moderate SLA
['queue' => 'orders', 'max_pickup_time_seconds' => 30],
// Background: Relaxed SLA
['queue' => 'reports', 'max_pickup_time_seconds' => 300],
],
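To see why a tighter SLA costs more workers, it helps to estimate how many workers are needed to drain a backlog within the pickup target. The following is a back-of-the-envelope sketch, not the package's algorithm: `workersForBacklog` is a hypothetical helper that assumes all jobs take the average duration and that no new jobs arrive while the backlog drains.

```php
<?php
declare(strict_types=1);

/**
 * Rough worker count needed to start every pending job within the SLA.
 * Simplification: uniform job duration, no new arrivals (assumptions).
 */
function workersForBacklog(
    int $pendingJobs,
    float $avgJobSeconds,
    int $maxPickupTimeSeconds,
    int $minWorkers,
    int $maxWorkers
): int {
    if ($maxPickupTimeSeconds <= 0) {
        throw new InvalidArgumentException('SLA target must be positive.');
    }

    // Total work in the backlog divided by the time allowed to get through it.
    $needed = (int) ceil($pendingJobs * $avgJobSeconds / $maxPickupTimeSeconds);

    // Respect the configured worker limits.
    return max($minWorkers, min($maxWorkers, $needed));
}
```

For 120 pending jobs averaging 5 seconds each, a 60 second target needs about 10 workers, a 300 second target needs 2, and a 10 second target would need 60, which a max_workers of 20 caps.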
Strategy Optimization
Choosing the Right Strategy
HybridPredictiveStrategy (default):
- ✅ Best all-around performance
- ✅ Adapts to different traffic patterns
- ✅ Predictive capabilities
- Use for: Most production workloads
Custom Strategies:
- Consider if you have:
  - Very specific traffic patterns
  - Domain-specific knowledge
  - Unique cost constraints
  - Integration with external data
Tuning Hybrid Strategy
'strategy' => [
'class' => \Cbox\LaravelQueueAutoscale\Scaling\Strategies\HybridPredictiveStrategy::class,
'options' => [
'trend_weight' => 0.7, // How much to trust trend predictions (0-1)
'safety_margin' => 1.2, // Safety buffer (1.0 = no buffer, 1.5 = 50% buffer)
'min_trend_samples' => 3, // Samples needed for trend analysis
],
],
Aggressive Scaling (Responsive):
'options' => [
'trend_weight' => 0.8, // Trust predictions more
'safety_margin' => 1.3, // 30% safety buffer
'min_trend_samples' => 2, // React quickly
]
Conservative Scaling (Stable):
'options' => [
'trend_weight' => 0.5, // Less trust in predictions
'safety_margin' => 1.1, // 10% safety buffer
'min_trend_samples' => 5, // Wait for more data
]
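To build intuition for how trend_weight and safety_margin pull on the result, here is one plausible way such options could combine. This is an illustrative model only: the formula actually used by HybridPredictiveStrategy is not shown in this document, and `blendedWorkerTarget` and its parameters are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative blend of observed and predicted arrival rates.
 * NOT the package's formula; a sketch for reasoning about the options.
 *
 * @param float $currentRate   observed arrivals, jobs per second
 * @param float $predictedRate trend-based forecast, jobs per second
 */
function blendedWorkerTarget(
    float $currentRate,
    float $predictedRate,
    float $avgJobSeconds,
    float $trendWeight,
    float $safetyMargin
): int {
    // Weighted average: trend_weight = 0 ignores the forecast, 1 trusts it fully.
    $blendedRate = (1 - $trendWeight) * $currentRate + $trendWeight * $predictedRate;

    // Workers needed to keep up = arrival rate x time per job, plus the buffer.
    return (int) ceil($blendedRate * $avgJobSeconds * $safetyMargin);
}
```

Under this model, with 2 jobs/s observed, 4 jobs/s predicted, and 3 second jobs, the aggressive profile (0.8, 1.3) asks for 15 workers and the conservative profile (0.5, 1.1) asks for 10.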
Resource Efficiency
Worker Configuration
Optimize per-worker resource allocation:
'worker_memory' => 256, // MB per worker
'worker_timeout' => 300, // seconds
'worker_sleep' => 3, // seconds when idle
Memory:
// Measure actual usage
$averageMemory = DB::table('worker_metrics')
->avg('memory_mb');
// Set 20% above average
'worker_memory' => (int) ceil($averageMemory * 1.2),
Timeout:
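// Analyze job durations. PERCENTILE_CONT ... WITHIN GROUP is PostgreSQL syntax, and
// Laravel's stock jobs table has no duration column, so query wherever you record durations.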
$p95Duration = DB::table('jobs')
->selectRaw('PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY duration) as p95')
->value('p95');
// Set timeout at P95 + 30%
'worker_timeout' => (int) ceil($p95Duration * 1.3),
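On databases without a percentile aggregate, the same P95 can be computed in PHP from a list of recorded durations. The sketch below uses linear interpolation between neighboring values, the same approach PERCENTILE_CONT takes. `percentileCont` and `timeoutFromDurations` are hypothetical helper names.

```php
<?php
declare(strict_types=1);

/**
 * Continuous percentile with linear interpolation.
 *
 * @param array<int|float> $values   e.g. job durations in seconds
 * @param float            $fraction 0.95 for P95
 */
function percentileCont(array $values, float $fraction): float
{
    if ($values === []) {
        throw new InvalidArgumentException('At least one value is required.');
    }
    if ($fraction < 0.0 || $fraction > 1.0) {
        throw new InvalidArgumentException('Fraction must be between 0 and 1.');
    }

    sort($values);

    // Fractional position of the percentile in the sorted list (zero-based).
    $rank = $fraction * (count($values) - 1);
    $lower = (int) floor($rank);
    $upper = (int) ceil($rank);
    $weight = $rank - $lower;

    // Interpolate between the two neighboring values.
    return $values[$lower] + $weight * ($values[$upper] - $values[$lower]);
}

/** Timeout at P95 plus headroom (30% by default), rounded up to whole seconds. */
function timeoutFromDurations(array $durations, float $headroom = 1.3): int
{
    return (int) ceil(percentileCont($durations, 0.95) * $headroom);
}
```

For very large samples it is usually cheaper to compute the percentile in the database or on a sampled subset than to load every duration into memory.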
Sleep:
// High-frequency queue: Check often
'worker_sleep' => 1,
// Standard queue: Balance
'worker_sleep' => 3,
// Low-frequency queue: Save CPU
'worker_sleep' => 10,
System Resource Limits
Prevent resource exhaustion:
'resource_limits' => [
'max_total_workers' => 100, // Global worker cap
'max_memory_percent' => 80, // Max system memory usage
'max_cpu_percent' => 90, // Max CPU usage
'reserved_memory_mb' => 1024, // Reserve for system
],
Calculate Optimal Max Workers:
$systemMemoryMb = 16384; // 16 GB
$reservedMemoryMb = 2048; // 2 GB reserved
$workerMemoryMb = 256; // Per worker
$maxWorkersByMemory = (int) floor(
($systemMemoryMb - $reservedMemoryMb) / $workerMemoryMb
); // 56 workers
$cpuCores = 8;
$maxWorkersByCpu = $cpuCores * 2; // 16 workers
// Use the more conservative limit
$maxWorkers = min($maxWorkersByMemory, $maxWorkersByCpu); // 16
Queue Prioritization
Route jobs to appropriate queues:
// High priority: Fast SLA, dedicated workers
dispatch(new CriticalJob())->onQueue('critical');
// Normal priority: Standard processing
dispatch(new StandardJob())->onQueue('default');
// Low priority: Batch processing
dispatch(new ReportJob())->onQueue('background');
Configure different performance profiles:
'queues' => [
[
'queue' => 'critical',
'max_pickup_time_seconds' => 5,
'min_workers' => 5,
'max_workers' => 30,
'scale_cooldown_seconds' => 20,
],
[
'queue' => 'default',
'max_pickup_time_seconds' => 60,
'min_workers' => 1,
'max_workers' => 15,
'scale_cooldown_seconds' => 60,
],
[
'queue' => 'background',
'max_pickup_time_seconds' => 300,
'min_workers' => 0,
'max_workers' => 5,
'scale_cooldown_seconds' => 120,
],
],
Scaling Patterns
Pattern 1: Predictable Daily Traffic
For traffic with daily patterns (business hours):
use Illuminate\Support\Facades\Schedule;
// Scale up before business hours
Schedule::call(function () {
app(AutoscaleManager::class)->overrideMinWorkers('default', 10);
})->weekdays()->at('08:30');
// Scale down after business hours
Schedule::call(function () {
app(AutoscaleManager::class)->overrideMinWorkers('default', 2);
})->weekdays()->at('18:00');
Or use time-based strategy:
'strategy' => \App\Strategies\TimeBasedStrategy::class,
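The core of a time-based approach is a pure function from the current time to a worker floor. The sketch below shows only that decision; how it plugs into a strategy class depends on the package's strategy contract, which is not reproduced here. `minWorkersAt` is a hypothetical helper, and the window and worker counts mirror the schedule above.

```php
<?php
declare(strict_types=1);

/**
 * Worker floor for a given moment: higher on weekdays between 08:30 and 18:00.
 * Hypothetical helper; times are interpreted in the object's own timezone.
 */
function minWorkersAt(DateTimeImmutable $now, int $businessMin = 10, int $offHoursMin = 2): int
{
    $isWeekday = (int) $now->format('N') <= 5; // ISO day of week: 1 (Mon) to 7 (Sun)

    // Minutes since midnight, to compare against the business window.
    $minutes = (int) $now->format('G') * 60 + (int) $now->format('i');
    $inWindow = $minutes >= 8 * 60 + 30 && $minutes < 18 * 60;

    return ($isWeekday && $inWindow) ? $businessMin : $offHoursMin;
}
```

Keeping the decision in a function that receives the time as an argument makes it straightforward to test the boundaries without waiting for the clock.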
Pattern 2: Event-Driven Spikes
For predictable events (sales, releases):
// Before major event
Event::listen(MajorEventStarting::class, function () {
app(AutoscaleManager::class)->scaleToCapacity('orders', percentage: 80);
});
// After event
Event::listen(MajorEventEnded::class, function () {
app(AutoscaleManager::class)->resetToNormal('orders');
});
Pattern 3: Gradual Ramp-Up
For smooth scaling during increases:
'options' => [
'max_scale_up_percent' => 50, // Max 50% increase per evaluation
'max_scale_down_percent' => 25, // Max 25% decrease per evaluation
]
Implementation in custom strategy:
$targetWorkers = $this->calculateTarget($metrics, $config);
$currentWorkers = $metrics->activeWorkerCount;
// Limit increase
if ($targetWorkers > $currentWorkers) {
$maxIncrease = max(1, (int) ceil($currentWorkers * 0.5)); // 50%, but at least 1 so a queue at zero workers can still scale up
$targetWorkers = min($targetWorkers, $currentWorkers + $maxIncrease);
}
// Limit decrease
if ($targetWorkers < $currentWorkers) {
$maxDecrease = (int) ceil($currentWorkers * 0.25); // 25%
$targetWorkers = max($targetWorkers, $currentWorkers - $maxDecrease);
}
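The same limits can be packaged as a standalone function, which makes the edge cases easy to check. This is a sketch with a hypothetical name (`limitScalingStep`). It always allows a step of at least one worker in either direction, so a queue sitting at zero workers is not stuck there by a percentage of zero.

```php
<?php
declare(strict_types=1);

/**
 * Clamp a target worker count to a maximum percentage change per evaluation.
 * A step of at least one worker is always allowed.
 */
function limitScalingStep(
    int $currentWorkers,
    int $targetWorkers,
    float $maxScaleUpPercent = 50,
    float $maxScaleDownPercent = 25
): int {
    if ($targetWorkers > $currentWorkers) {
        $maxIncrease = max(1, (int) ceil($currentWorkers * $maxScaleUpPercent / 100));
        return min($targetWorkers, $currentWorkers + $maxIncrease);
    }

    if ($targetWorkers < $currentWorkers) {
        $maxDecrease = max(1, (int) ceil($currentWorkers * $maxScaleDownPercent / 100));
        return max($targetWorkers, $currentWorkers - $maxDecrease);
    }

    return $targetWorkers; // already at target
}
```

Starting from 10 workers, a target of 30 is reached in steps of 15, 23, 30 across three evaluations, while a drop to 2 begins with a step down to 7.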
Cost Optimization
Calculate Cost Per Job
$workerCostPerHour = 0.50;
$averageJobDuration = 10; // seconds
$jobsPerWorkerPerHour = 3600 / $averageJobDuration; // 360 jobs
$costPerJob = $workerCostPerHour / $jobsPerWorkerPerHour; // $0.00139
Optimize Worker Utilization
Target: 70-90% utilization
// Calculate current utilization
$processingTime = $averageJobDuration * $jobsProcessedPerHour;
$availableTime = $workers * 3600;
$utilization = $processingTime / $availableTime;
if ($utilization < 0.7) {
// Underutilized: Reduce workers
} elseif ($utilization > 0.9) {
// Overutilized: Add workers
}
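The utilization formula can also be inverted to suggest a worker count for a chosen target. The sketch below assumes a target of 80%, the midpoint of the 70-90% band. `workerUtilization` and `workersForUtilization` are hypothetical helper names.

```php
<?php
declare(strict_types=1);

/** Fraction of available worker time spent processing jobs over one hour. */
function workerUtilization(float $averageJobDuration, int $jobsProcessedPerHour, int $workers): float
{
    if ($workers <= 0) {
        throw new InvalidArgumentException('Worker count must be positive.');
    }

    return ($averageJobDuration * $jobsProcessedPerHour) / ($workers * 3600);
}

/** Smallest worker count that keeps utilization at or below the target. */
function workersForUtilization(
    float $averageJobDuration,
    int $jobsProcessedPerHour,
    float $targetUtilization = 0.8
): int {
    if ($targetUtilization <= 0.0 || $targetUtilization > 1.0) {
        throw new InvalidArgumentException('Target utilization must be in (0, 1].');
    }
    $busySeconds = $averageJobDuration * $jobsProcessedPerHour;

    return (int) ceil($busySeconds / (3600 * $targetUtilization));
}
```

For example, 2,160 jobs per hour at 10 seconds each is 21,600 busy seconds. Ten workers would run at 60% utilization, and the function suggests 8, which puts utilization at 75%.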
Cost-Aware Strategy
Implement budget constraints:
class CostAwareStrategy implements ScalingStrategyContract
{
public function calculateTargetWorkers(object $metrics, QueueConfiguration $config): int
{
// Calculate ideal workers
$idealWorkers = $this->calculateIdeal($metrics, $config);
// Apply budget constraint
$hourlyBudget = 100.00;
$workerCost = 0.50;
$maxAffordableWorkers = (int) floor($hourlyBudget / $workerCost);
return min($idealWorkers, $maxAffordableWorkers);
}
}
Spot Instance Strategy
For cloud deployments, use spot instances for cost savings:
'worker_spawn_strategy' => 'spot', // Use spot instances
'worker_fallback_strategy' => 'on_demand', // Fallback to on-demand
'max_spot_workers' => 15, // Most workers on spot
'min_on_demand_workers' => 3, // Guarantee with on-demand
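One way to read these settings is as a rule for splitting a target worker count between the two instance types. The sketch below is an interpretation, not package behavior: `splitWorkers` is a hypothetical helper that fills the on-demand guarantee first, then uses spot capacity up to its cap, and assumes any overflow falls back to on-demand.

```php
<?php
declare(strict_types=1);

/**
 * Split a target worker count into spot and on-demand workers.
 * Assumption: demand beyond the spot cap is served by on-demand instances.
 *
 * @return array{spot: int, on_demand: int}
 */
function splitWorkers(int $targetWorkers, int $maxSpotWorkers = 15, int $minOnDemandWorkers = 3): array
{
    $targetWorkers = max(0, $targetWorkers);

    // Guaranteed baseline runs on on-demand instances.
    $onDemand = min($targetWorkers, $minOnDemandWorkers);

    // Everything above the baseline goes to spot, up to the cap.
    $spot = min($targetWorkers - $onDemand, $maxSpotWorkers);

    // Whatever the spot cap could not absorb falls back to on-demand.
    $onDemand += $targetWorkers - $onDemand - $spot;

    return ['spot' => $spot, 'on_demand' => $onDemand];
}
```

With the values above, a target of 10 workers becomes 7 spot and 3 on-demand, and a target of 25 becomes 15 spot and 10 on-demand.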
Troubleshooting Performance
Issue: Slow Scaling Response
Symptoms:
- Jobs pile up before workers scale
- Slow reaction to traffic spikes
Diagnosis:
SELECT
timestamp,
pending_jobs,
current_workers,
target_workers,
TIMESTAMPDIFF(SECOND, LAG(timestamp) OVER (ORDER BY timestamp), timestamp) as seconds_between_evals
FROM autoscale_metrics
WHERE queue = 'default'
ORDER BY timestamp DESC
LIMIT 20;
Solutions:
- Reduce evaluation_interval_seconds
- Reduce scale_cooldown_seconds
- Increase trend_weight for more predictive scaling
- Raise min_workers for baseline capacity
Issue: Worker Oscillation
Symptoms:
- Worker count rapidly changing
- Inefficient resource usage
Diagnosis:
SELECT
COUNT(*) as scaling_events,
SUM(ABS(worker_change)) as total_churn
FROM autoscale_decisions
WHERE timestamp >= DATE_SUB(NOW(), INTERVAL 1 HOUR)
AND worker_change != 0;
Solutions:
- Increase scale_cooldown_seconds
- Add safety margin in strategy
- Implement gradual scaling limits
- Smooth out metric noise
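A common way to smooth metric noise is an exponential moving average applied to the input signal (for example pending job counts) before it reaches the scaling decision. The sketch below is a generic technique, not something the package is documented to provide. `smoothSeries` is a hypothetical helper and the default alpha of 0.3 is an arbitrary starting point.

```php
<?php
declare(strict_types=1);

/**
 * Exponential moving average over a series of samples.
 * Lower alpha = smoother but slower to react; higher alpha = closer to raw data.
 *
 * @param array<int|float> $samples
 * @return float[]
 */
function smoothSeries(array $samples, float $alpha = 0.3): array
{
    if ($alpha <= 0.0 || $alpha > 1.0) {
        throw new InvalidArgumentException('Alpha must be in (0, 1].');
    }

    $smoothed = [];
    $previous = null;

    foreach ($samples as $sample) {
        // Seed with the first sample, then blend each new sample with the running value.
        $previous = $previous === null
            ? (float) $sample
            : $alpha * $sample + (1 - $alpha) * $previous;
        $smoothed[] = $previous;
    }

    return $smoothed;
}
```

A single-evaluation spike from 10 to 100 pending jobs is damped to 55 at an alpha of 0.5, which reduces the chance of scaling up and then straight back down. The trade-off is slower reaction to genuine load changes.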
Issue: High Costs
Symptoms:
- Worker count consistently at or near max
- High cloud bills
Diagnosis:
SELECT
AVG(current_workers) as avg_workers,
MAX(current_workers) as peak_workers,
COUNT(*) as evaluations,
SUM(CASE WHEN current_workers = max_workers THEN 1 ELSE 0 END) / COUNT(*) * 100 as percent_at_max
FROM autoscale_metrics
WHERE timestamp >= DATE_SUB(NOW(), INTERVAL 24 HOUR);
Solutions:
- Optimize job performance (faster jobs = fewer workers)
- Increase max_pickup_time_seconds (relax SLA)
- Implement cost-aware strategy
- Use queue prioritization
- Batch similar jobs together
Issue: SLA Breaches
Symptoms:
- Jobs waiting longer than target
- Poor user experience
Diagnosis:
SELECT
AVG(oldest_job_age) as avg_age,
MAX(oldest_job_age) as max_age,
AVG(max_pickup_time_seconds) as sla_target,
SUM(CASE WHEN oldest_job_age > max_pickup_time_seconds THEN 1 ELSE 0 END) / COUNT(*) * 100 as breach_rate
FROM autoscale_metrics
WHERE timestamp >= DATE_SUB(NOW(), INTERVAL 24 HOUR);
Solutions:
- Increase max_workers
- Reduce max_pickup_time_seconds (stricter SLA triggers earlier scaling)
- Optimize job performance
- Check for stuck workers
- Implement worker health checks
Performance Benchmarks
Expected Performance
| Traffic Pattern | SLA Compliance | Avg Utilization | Scaling Latency |
|---|---|---|---|
| Steady | >99% | 75-85% | N/A (stable) |
| Gradual increase | >98% | 70-80% | 30-60s |
| Sudden spike | >95% | 60-90% | 15-45s |
| Burst traffic | >90% | 50-95% | 10-30s |
Tuning for Your Workload
Measure and optimize iteratively:
// 1. Baseline measurement (1 week)
$this->measureBaseline();
// 2. Identify bottlenecks
$this->analyzeMetrics();
// 3. Apply optimizations
$this->tuneConfiguration();
// 4. Measure improvement
$this->comparePerformance();
// 5. Repeat
See Also
- Configuration - Detailed configuration options
- Custom Strategies - Custom strategy development
- Monitoring - Performance monitoring
- How It Works - Algorithm explanation