Architecture
    Your app / Cbox packages
            │
            │  Telemetry facade / TelemetryProvider contract
            ▼
    TelemetryManager
      ├── Registry ──► MetricStore (redis | apcu | array)   ← push instruments
      │       └────► ObservableGauge callbacks              ← pull instruments
      ├── Tracer ──► in-memory span buffer (capped)
      └── events ──► in-memory event buffer
            │
            ▼
    Exporters (filtered by SignalSet)
      ├── Prometheus: scrape route renders store + observables on demand
      ├── OTLP: direct HTTP JSON: /v1/traces /v1/metrics /v1/logs
      └── Null / custom
The two metric mechanisms
Push (counter, gauge()->set(), histogram): written at event time
to the shared store with atomic operations. This is what makes metrics
correct under shared-nothing PHP — every FPM worker, queue worker and node
increments the same Redis series.
Pull (gauge($name, $callback)): evaluated at scrape/flush time,
nothing stored. Use for values the source of truth can answer cheaply
(queue depth, user count, workers running).
These are deliberately distinct API shapes. A counter you inc() and a
gauge backed by a callback work in different ways, and the API never blurs
the two.
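To make the push/pull split concrete, here is a minimal, self-contained sketch in plain PHP (8.0+). It does not use the package's real classes: `ArrayStore`, `SketchRegistry`, `inc()`, `observe()` and `collect()` are hypothetical names chosen for illustration, and the in-process array only stands in for a shared store such as Redis.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the shared MetricStore (array flavour).
// A Redis-backed store would perform the same increment atomically.
final class ArrayStore
{
    /** @var array<string, float> */
    private array $series = [];

    public function increment(string $name, float $by = 1.0): void
    {
        $this->series[$name] = ($this->series[$name] ?? 0.0) + $by;
    }

    /** @return array<string, float> */
    public function all(): array
    {
        return $this->series;
    }
}

// Hypothetical registry: push values live in the store, pull values live in callbacks.
final class SketchRegistry
{
    /** @var array<string, callable(): float> */
    private array $observables = [];

    public function __construct(private ArrayStore $store) {}

    // Push: the value is written at event time.
    public function inc(string $name, float $by = 1.0): void
    {
        $this->store->increment($name, $by);
    }

    // Pull: only the callback is kept; no value is stored.
    public function observe(string $name, callable $callback): void
    {
        $this->observables[$name] = $callback;
    }

    /**
     * Scrape: read the stored series, then evaluate every callback on demand.
     *
     * @return array<string, float>
     */
    public function collect(): array
    {
        $out = $this->store->all();
        foreach ($this->observables as $name => $callback) {
            $out[$name] = (float) $callback();
        }
        return $out;
    }
}

$queue = ['job-a', 'job-b', 'job-c'];
$registry = new SketchRegistry(new ArrayStore());

$registry->inc('orders_total');
$registry->inc('orders_total');
$registry->observe('queue_depth', function () use (&$queue): float {
    return (float) count($queue);
});

array_pop($queue);                 // the source of truth changes after registration
$snapshot = $registry->collect();  // orders_total = 2.0, queue_depth = 2.0
```

The pull gauge reports 2.0 rather than 3.0 because the callback runs at collection time, not at registration time, which is the behaviour the text describes for scrape/flush.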
Span lifecycle
- span() starts a span as a child of the current span (or of a remote parent continued from a traceparent header / queue payload).
- The sample decision is made once at the trace root and inherited.
- Finished sampled spans buffer in memory (default cap 5000 — the buffer force-flushes when full).
- The buffer flushes once per request (terminable middleware), after each queue job, and after commands.
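The lifecycle above can be sketched in plain PHP (8.0+). The `traceparent` layout (`version-traceid-parentid-flags`, with the sampled bit in the lowest flag bit) follows the W3C Trace Context format; everything else, including `parseTraceparent()`, `SpanBuffer` and its counters, is a hypothetical illustration rather than the package's implementation. The sketch skips some validation the spec requires (for example, rejecting all-zero ids), and whether the real buffer flushes exactly at the cap or just past it is an assumption.

```php
<?php
declare(strict_types=1);

// Parse a W3C traceparent header: version-traceid-parentid-flags.
// Returns null for a malformed header, in which case a new trace would be started.
function parseTraceparent(string $header): ?array
{
    $pattern = '/^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/';
    if (!preg_match($pattern, $header, $m)) {
        return null;
    }
    return [
        'traceId'  => $m[2],
        'parentId' => $m[3],
        'sampled'  => (hexdec($m[4]) & 0x01) === 0x01, // bit 0 carries the sample decision
    ];
}

// Hypothetical capped buffer: flushes itself when full, and on demand
// at the end of a request, queue job, or command.
final class SpanBuffer
{
    private array $spans = [];
    public int $flushes = 0;
    public int $exported = 0;

    public function __construct(private int $cap = 5000) {}

    public function add(array $span, bool $sampled): void
    {
        if (!$sampled) {
            return; // unsampled spans are never buffered
        }
        $this->spans[] = $span;
        if (count($this->spans) >= $this->cap) {
            $this->flush(); // force-flush when the cap is reached
        }
    }

    public function flush(): void
    {
        if ($this->spans === []) {
            return;
        }
        $this->exported += count($this->spans); // a real exporter would send them here
        $this->spans = [];
        $this->flushes++;
    }
}

$parent = parseTraceparent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01');
$buffer = new SpanBuffer(cap: 2);

foreach (['db.query', 'cache.get', 'http.call'] as $name) {
    // Child spans reuse the trace id and inherit the root's sample decision.
    $buffer->add(['name' => $name, 'traceId' => $parent['traceId']], $parent['sampled']);
}

$buffer->flush(); // end-of-request flush picks up the remainder
```

With a cap of 2, the third span is still in the buffer when the request ends, so the run produces one forced flush and one end-of-request flush.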
Export timing
| Signal | Prometheus | OTLP |
|---|---|---|
| Metrics | rendered at scrape | pushed by scheduled telemetry:flush |
| Traces | — | at terminate (after the response is sent) |
| Events | — | at terminate, as log records |
OTLP metrics are exported with cumulative temporality read from the shared store — sidestepping the per-process delta-state problem that breaks SDK-based metrics under FPM.
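A small sketch of why cumulative temporality fits a shared store, in plain PHP. The array stands in for the shared store, and the field names in `cumulativePoint()` are illustrative only; they are not the OTLP JSON schema and not the package's exporter.

```php
<?php
declare(strict_types=1);

// Hypothetical shared store: every worker adds to the same running total.
$shared = ['http_requests_total' => 0];

// Three short-lived workers handle requests and exit; none keeps state.
foreach ([4, 1, 7] as $handledByWorker) {
    $shared['http_requests_total'] += $handledByWorker;
}

// Cumulative export: report the running total plus the window it covers.
// No "last exported value" is needed, so any process can run the flush.
function cumulativePoint(array $store, string $name, int $startUnix, int $nowUnix): array
{
    return [
        'name'        => $name,
        'temporality' => 'cumulative',
        'start'       => $startUnix,
        'time'        => $nowUnix,
        'value'       => $store[$name],
    ];
}

$first = cumulativePoint($shared, 'http_requests_total', 1700000000, 1700000060);

$shared['http_requests_total'] += 5; // more traffic before the next flush

$second = cumulativePoint($shared, 'http_requests_total', 1700000000, 1700000120);

// The backend derives the change from consecutive points: 17 - 12 = 5.
$delta = $second['value'] - $first['value'];
```

A delta exporter would instead have to remember what it last sent, and under FPM that memory would live in a process that may no longer exist at the next flush.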
Failure policy
Telemetry never throws into the application at runtime. The one deliberate
exception is instrument creation: an invalid name or a conflicting type
throws (programmer error, caught in tests). Every recording and export path
is wrapped and reports through Telemetry::handleExceptionsUsing() (default:
Laravel's report()). A failing observable-gauge callback drops only its own
family, never the whole scrape.
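The wrapping policy can be illustrated with a short, self-contained PHP (8.0+) sketch. The `$handler` closure only stands in for whatever is registered through Telemetry::handleExceptionsUsing(); `guarded()` and the gauge names are hypothetical, and the package's real wrapping may look different.

```php
<?php
declare(strict_types=1);

// Hypothetical handler slot, standing in for the registered exception handler.
$reported = [];
$handler = function (Throwable $e) use (&$reported): void {
    $reported[] = $e->getMessage();
};

// Wrap a recording or export step so a failure is reported, never rethrown.
function guarded(callable $step, callable $handler): mixed
{
    try {
        return $step();
    } catch (Throwable $e) {
        $handler($e);
        return null;
    }
}

// Two observable gauges; one of them has a broken callback.
$gauges = [
    'queue_depth' => function (): float {
        return 3.0;
    },
    'workers_running' => function (): float {
        throw new RuntimeException('supervisor unreachable');
    },
];

$scrape = [];
foreach ($gauges as $name => $callback) {
    $value = guarded($callback, $handler);
    if ($value !== null) {
        $scrape[$name] = $value; // only healthy families make it into the scrape
    }
}
```

The scrape still returns `queue_depth`, the failure of `workers_running` is handed to the handler, and nothing propagates to the caller.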
Design decisions
The full reasoning — including the prior-art survey of the PHP telemetry
ecosystem — lives in docs/adr and
docs/research/prior-art.md.