Live system throughput from eggs to bonded spaceducks.
Backend: GET /beak/system/status · Fields: database.eggs/ducklings/birth_certificates, agents.total_bonded + GET /beak/metrics for trend
Database + agent telemetry · Last checked: —
Eggs: — → Ducklings: — → Certified: — → Spaceducks: —
Total certs issued: — · Waiting for metrics…
Last cert issued: — · Timestamp pending…
Trend: — · Weekly movement pending…
Cert Issuance Latency
Loading…
Last-issued age plus rolling 24h certificate volume from live metrics and audit signals.
Backend: GET /beak/metrics · Fields: last_cert_issued_at (+ fallback POST /beak/audit for duck.cert_issued / duck.hatched events)
Latest issuance timestamp + last 24h issuance count · Last checked: —
Last issued age: — · Waiting for issuance telemetry…
Issued in last 24h: — · Waiting for audit volume…
Latest cert timestamp: — · Using /beak/metrics when present, then audit fallback.
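A minimal sketch of how the two values on this card could be derived. It assumes timestamps are ISO 8601 strings in UTC and that the audit response uses the entries[].event_type / entries[].timestamp shape listed for the audit route. The function names are illustrative, not the page's actual code.

```python
from datetime import datetime, timedelta, timezone

CERT_EVENT = "duck.cert_issued"  # event type named on this card


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp; assume UTC when no offset is given."""
    dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def latest_cert_timestamp(metrics: dict, audit: dict):
    """Prefer metrics.last_cert_issued_at, then fall back to audit entries."""
    ts = metrics.get("last_cert_issued_at")
    if ts:
        return parse_utc(ts)
    issued = [parse_utc(e["timestamp"]) for e in audit.get("entries", [])
              if e.get("event_type") == CERT_EVENT]
    return max(issued) if issued else None


def issued_last_24h(audit: dict, now: datetime) -> int:
    """Count cert-issued audit entries inside the rolling 24h window."""
    cutoff = now - timedelta(hours=24)
    return sum(1 for e in audit.get("entries", [])
               if e.get("event_type") == CERT_EVENT
               and parse_utc(e["timestamp"]) >= cutoff)
```

The last-issued age is then now minus latest_cert_timestamp(...). When both sources are empty the function returns None, which the card would keep showing as a placeholder.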
Mission Control API
Loading…
Request health for the operator-facing APIs used by this page.
Backend: GET /beak/system/status (latency + fields), GET /beak/metrics, POST /beak/audit — all three fetched each refresh cycle
Live API reachability · Last checked: —
Status route: /beak/system/status
Metrics route: /beak/metrics
Audit route: /beak/audit
Observed latency: —
Health: —
Governance Actions
Manual approval required
Production changes move through the governance lane: open a request ID, record the target version cycle, and capture approval before any alias change. Frozen surfaces still require T-JOSH sign-off. See GOVERNANCE-LOG.md for the full audit trail.
Alert Thresholds
Reference
Static operator thresholds for quick review in the control surface.
API error rate: alert if > 5%
Lambda memory: alert if > 100MB
Cert age: alert if > 60 days
Peck failures: alert if > 10 in 24h
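As an illustration, the four static thresholds above could be checked against a set of readings like this. The metric key names are hypothetical; only the limits come from the card.

```python
# Limits copied from the card; the metric keys are hypothetical names.
THRESHOLDS = {
    "api_error_rate_pct": 5,
    "lambda_memory_mb": 100,
    "cert_age_days": 60,
    "peck_failures_24h": 10,
}


def breached(readings: dict) -> list:
    """Return metrics strictly above their limit; missing readings are skipped."""
    return [name for name, limit in THRESHOLDS.items()
            if name in readings and readings[name] > limit]
```

The comparison is strict, matching the "alert if >" wording, so a reading exactly at the limit does not alert.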
Failure Budget
Within budget
24h SLA targets — Galaxy 1.1
Auth flow (signup/signin): 99.9% target · est. OK
Cert issuance: 99.5% target · est. OK
Peck approval flow: 99.0% target · est. OK
SMS delivery: sandbox only · limited
Manual review — update after incidents
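A rough worked example of what these 24h targets allow, assuming the budget is measured as failed time within the window (the card does not say whether it is time-based or request-based). The dictionary keys and function names are illustrative.

```python
DAY_SECONDS = 24 * 60 * 60

# Targets copied from the card; the keys are illustrative.
SLA_TARGETS_PCT = {
    "auth_flow": 99.9,
    "cert_issuance": 99.5,
    "peck_approval": 99.0,
}


def budget_seconds(target_pct: float, window_s: int = DAY_SECONDS) -> float:
    """Seconds of failure the target tolerates within the window."""
    return window_s * (100.0 - target_pct) / 100.0


def within_budget(target_pct: float, failed_seconds: float) -> bool:
    """True while observed failure time has not exceeded the budget."""
    return failed_seconds <= budget_seconds(target_pct)
```

Under that assumption, 99.9% leaves about 86.4 seconds per 24 hours, 99.5% about 432 seconds, and 99.0% about 864 seconds.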
CloudWatch Alarms
No alarms
Configured alarm thresholds (checked manually via AWS Console)
Lambda errors > 5%: OK
Lambda duration > 5s: OK
API Gateway 5xx > 1%: OK
DynamoDB throttle: OK
Last verified: 2026-03-21 · Update manually after incidents
Recent Deploys
Static snapshot
Last 5 Lambda deployments
2026-03-21 05:40 UTC · v39 · prod
2026-03-21 05:33 UTC · v38 · prod
2026-03-21 05:31 UTC · v37 · prod
2026-03-21 05:24 UTC · v36 · prod
2026-03-21 05:18 UTC · v35 · prod
This card is a one-time static snapshot from coordination/DEPLOY-LOG.md and does not refresh on its own. Update it after each deploy.
Lambda Version History
Static snapshot
Last 5 prod promotions from coordination/DEPLOY-LOG.md.
v39 · 2026-03-21 05:40 UTC | DG-033 /beak/pageview route added
v37 · 2026-03-21 05:31 UTC | DC-038 lambda_alias_version in status response
v36 · 2026-03-21 05:24 UTC | SD-034 AgentMail cert email wired into /beak/cert/issue
v35 · 2026-03-21 05:18 UTC | Birth certificate issuance now captures ip_address, user_agent, and country metadata at issuance time; prod alias promoted to v35
Auth Dependencies
Loading…
Mixed live + operator-confirmed auth provider posture.
Static sections: Communication Status and Webhook Health are explicit operator-truth tiles until dedicated API fields exist. Quick links are utility navigation, not health tiles.
Removed: Decorative/dead-weight tiles from the previous layout were replaced by real data surfaces or explicit static operator state.
Recent Events
Loading…
Last five audit entries from POST /beak/audit.
Backend: POST /beak/audit (body: {}) · Fields: entries[].event_type, entries[].timestamp
Live audit feed · Last checked: —
Loading recent events…
Route Health
Checking
Live probe of key API routes
Backend: direct HTTP probes — GET /beak, GET /beak/metrics, GET /beak/system/status — independent of main refresh cycle
GET /beak: —
GET /beak/metrics: —
GET /beak/system/status: —
Last checked: —
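A minimal sketch of an independent probe over the three routes listed above. The base URL is whatever the operator supplies; the 1000 ms "slow" cutoff is borrowed from the drill checklist on this page, and the ok/slow/error labels are illustrative. The fetch and clock parameters exist only so the logic can be exercised without a network.

```python
import time
import urllib.error
import urllib.request

ROUTES = ["/beak", "/beak/metrics", "/beak/system/status"]
SLOW_MS = 1000  # degradation cutoff taken from the drill checklist


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of a GET, or 0 when the request fails outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def probe(base_url: str, fetch=http_status, clock=time.monotonic) -> dict:
    """Probe each route once and classify it as ok, slow, or error."""
    results = {}
    for route in ROUTES:
        start = clock()
        status = fetch(base_url + route)
        elapsed_ms = (clock() - start) * 1000
        if not 200 <= status < 300:
            state = "error"
        elif elapsed_ms > SLOW_MS:
            state = "slow"
        else:
            state = "ok"
        results[route] = {"status": status, "ms": round(elapsed_ms), "state": state}
    return results
```

Because the probe takes its own measurements, it can run on a separate timer from the main refresh cycle, as the card describes.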
Recent Activity
Loading…
Last five entries from the recent_audit field of GET /beak/system/status, when present.
Fallback panel until Lambda log lines are exposed by the status route · Last checked: —
Loading recent activity…
Cert Inventory
Loading…
Duckling and certificate state distribution computed from live database counts.
Backend: GET /beak/system/status · Fields: database.eggs, database.ducklings, database.birth_certificates, agents.total_bonded
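One way the distribution could be computed from those four fields is as stage-to-stage conversion percentages. This is a sketch under the assumption that the counts are cumulative totals and that the status response nests them as database.* and agents.*; the function name and output keys are hypothetical.

```python
def stage_conversion(status: dict) -> dict:
    """Percentage of each stage's count that reached the next stage."""
    db = status.get("database", {})
    agents = status.get("agents", {})
    stages = [
        ("eggs", db.get("eggs", 0)),
        ("ducklings", db.get("ducklings", 0)),
        ("birth_certificates", db.get("birth_certificates", 0)),
        ("total_bonded", agents.get("total_bonded", 0)),
    ]
    out = {}
    for (prev_name, prev), (name, count) in zip(stages, stages[1:]):
        # None marks an undefined ratio when the earlier stage is empty.
        out[f"{prev_name}->{name}"] = round(100 * count / prev, 1) if prev else None
    return out
```

An empty earlier stage yields None rather than a division error, so the card can show a placeholder for that step.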
3. After each deploy: update coordination/DEPLOY-LOG.md and record version + alias promotion.
4. After incidents: log to coordination/GOVERNANCE-LOG.md with request ID + approver.
Next Priorities
Wire signup/cert audit events in Lambda so the Audit Activity strip shows real counts.
Expose agents.stale_list in /beak/system/status for the Stale Agent Spotlight card.
Expose last_cert_issued_at in /beak/metrics to enable real cert latency display.
Source: coordination/OPERATOR-NOTES.md · Last updated: 2026-03-21 UTC · Batch DC-055–DC-057
Operator Shift Log
Ready
Local shift notes for handoff continuity. Saves the latest three UTC-stamped notes to your browser only.
Collapsible card · localStorage only · keeps last 3 notes · Last checked: —
Newest first · UTC timestamps
Last saved: never
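The retention rule (newest first, last three notes only) can be modeled as a bounded queue. The page itself stores notes in browser localStorage; the Python below only sketches the retention logic, and the note shape and function name are assumptions.

```python
from collections import deque
from datetime import datetime, timezone

MAX_NOTES = 3  # the card keeps the last three notes


def add_note(notes: deque, text: str, now=None) -> deque:
    """Put the new note at the front; deque(maxlen=...) drops the oldest."""
    now = now or datetime.now(timezone.utc)
    notes.appendleft({"utc": now.strftime("%Y-%m-%d %H:%M UTC"), "text": text})
    return notes


notes = deque(maxlen=MAX_NOTES)
```

Adding a fourth note silently discards the oldest one, so anything that must survive a handoff should be exported before the next note is saved.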
Stale Agent Spotlight
Loading…
Dead or never-pulsed agent summary from GET /beak/system/status. Individual agent records are shown when the API exposes them; otherwise summary counts are rendered.
Live Runtime Outranks Board State: Board task status is text; live runtime state is truth. When in doubt, trust the live API and function configuration over any board entry.
⌨️ Operator Command Palette
One-click copyable diagnostics for rapid operator investigation — curl and aws CLI commands pre-filled for prod.
Commands pre-loaded · Click Copy to grab
📊 Metrics Delta Strip
Waiting…
Change since last refresh — green for growth, red for drops, grey for no change.
Last checked: —
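A small sketch of the delta-and-color rule described above, assuming both polls are flat dictionaries of numeric metrics. Treating a metric with no previous value as grey is an assumption, as are the output keys.

```python
def metric_deltas(previous: dict, current: dict) -> dict:
    """Compare two polls and tag each metric green, red, or grey."""
    deltas = {}
    for name, new in current.items():
        old = previous.get(name)
        if old is None:
            # No baseline yet: shown as grey (assumption).
            deltas[name] = {"old": None, "new": new, "delta": None, "color": "grey"}
            continue
        change = new - old
        color = "green" if change > 0 else "red" if change < 0 else "grey"
        deltas[name] = {"old": old, "new": new, "delta": change, "color": color}
    return deltas
```

Keeping the old and new values alongside the delta lets the same result feed an "old → new" display.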
⚠️ Degraded-Service Impact Matrix
Loading…
Translates live sandbox state, dead-agent count, and peck failures into plain-English operator risk.
Last checked: —
🔗 Connection Pressure
Loading…
Connections per bonded agent, peck pending vs failed ratio, and overload thresholds.
Last checked: —
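The ratios on this card could be computed roughly as follows. The parameter names and the max_per_agent default are hypothetical, since the page does not state the overload threshold; the peck ratio is expressed here as the failed share of pending plus failed.

```python
def connection_pressure(connections: int, bonded_agents: int,
                        peck_pending: int, peck_failed: int,
                        max_per_agent: float = 50.0) -> dict:
    """Summarize load per bonded agent and the failed share of pecks."""
    per_agent = connections / bonded_agents if bonded_agents else None
    total = peck_pending + peck_failed
    failed_share = peck_failed / total if total else None
    return {
        "connections_per_agent": per_agent,
        "peck_failed_share": failed_share,
        # max_per_agent is a placeholder threshold, not a documented value.
        "overloaded": per_agent is not None and per_agent > max_per_agent,
    }
```

Both ratios fall back to None when their denominator is zero, which avoids reporting a misleading 0 before any agents are bonded.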
🔄 What Changed Since Last Refresh
Metrics that moved on the latest live poll with old → new values.
Waiting for first refresh comparison…
—
📋 Route Failure Journal
Last route probe failures persisted in localStorage — UTC stamp, endpoint, status code.
Exports diagnostics JSON, anomaly summary, operator notes, route health, and parity state as a downloadable Markdown + JSON pair.
🚨 Operator Drill Checklist
Incident triage, alias verification, route probe review, and rollback preparation steps.
Step 1 — Confirm live status: Run curl .../beak/system/status and verify that lambda.version matches the prod alias reported by aws lambda get-alias.
Step 2 — Check route health: Use the Route Latency strip. Any route over 1000 ms, or any error, indicates degradation. Check the Route Failure Journal for recent failures.
Step 3 — Review anomaly summary: Check the Anomaly Banner and Impact Matrix. Classify as Degraded / Partial Outage / Full Outage.
Step 4 — Rollback readiness: Check the Deploy Readiness Checklist. If rollback is needed, list versions with aws lambda list-versions-by-function, then use the Promote Alias command from the palette.
Step 5 — Handoff: Export Incident Handoff Pack v2 (Markdown + JSON) for shift handoff or an incident ticket. Save a Shift Log note with the current state.
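Step 1 can be checked mechanically once both outputs are saved. The sketch below compares the lambda.version field of the status response with the FunctionVersion field that aws lambda get-alias prints as JSON. Whether lambda.version carries a leading "v" is not stated on this page, so the code tolerates both; the function name is hypothetical.

```python
import json


def alias_matches_status(status_json: str, alias_json: str) -> bool:
    """True when the status route and the prod alias report the same version."""
    status = json.loads(status_json)
    alias = json.loads(alias_json)
    # Accept "39" or "v39" from the status route (format is an assumption).
    live = str(status.get("lambda", {}).get("version", "")).lstrip("v")
    promoted = str(alias.get("FunctionVersion", ""))
    return bool(live) and live == promoted
```

A False result means either a mismatch or a missing field; in both cases the live runtime should be inspected before moving on to Step 2.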
📈 Route Latency History
Last few probe latencies per route — movement over time for degradation detection.
No latency history yet — waiting for route probes.
🕵️ Stale Data Detector
Checking…
Cards that have not refreshed within the last 120 seconds are flagged here so operators don't mistake cached UI for live truth.
Waiting for first refresh…
—
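The 120-second rule can be sketched as a comparison between each card's last refresh time and the current time. The card names and the use of epoch seconds are assumptions; a card that has never refreshed is treated as stale.

```python
STALE_AFTER_S = 120  # window stated on this card


def stale_cards(last_refresh: dict, now: float) -> list:
    """Return card names whose last refresh is missing or older than the window."""
    return sorted(name for name, ts in last_refresh.items()
                  if ts is None or now - ts > STALE_AFTER_S)
```

A card refreshed exactly 120 seconds ago is still considered fresh under this reading of "within the last 120 seconds".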
🚧 Operational Blockers
Loading…
Precise operational constraints for SES, SNS, cert pipeline, and GitHub — not vague waiting states.
Last checked: —
🔄 Board / Runtime Reconciliation
Compare board task state against live runtime truth — exports a compact Markdown handoff for operator review.