ProdOps Service Dashboard

How It Works

This production dashboard runs at svcdash.cloud.pyebarkerfs.com as a containerized frontend and backend. The backend polls vendor health sources every 5 minutes, keeps the latest results in memory, and serves them through the authenticated BFF/APIM route.

How data flows

1. The browser loads the static dashboard from the frontend nginx container.
2. The dashboard requests /bff/status. The bff-auth gateway handles the session, and APIM rewrites the call to the backend's /api/status.
3. The FastAPI backend container returns the latest cached records from memory.
4. APScheduler runs each vendor poller every 5 minutes, plus once when the backend starts.

Status legend:
OK: all clear
Degraded: partial outage or slow
Outage: service down
Unknown: no data yet
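
The four statuses in the legend could be modelled as a small enumeration. This is a minimal sketch only: the names Status, LEGEND, and parse_status are hypothetical and are not taken from the repository.

```python
from enum import Enum


class Status(str, Enum):
    """The four dashboard statuses, in legend order."""
    OK = "OK"
    DEGRADED = "Degraded"
    OUTAGE = "Outage"
    UNKNOWN = "Unknown"


# Legend text as shown on the dashboard.
LEGEND = {
    Status.OK: "all clear",
    Status.DEGRADED: "partial outage or slow",
    Status.OUTAGE: "service down",
    Status.UNKNOWN: "no data yet",
}


def parse_status(value: str) -> Status:
    """Return the matching Status; unrecognized text falls back to UNKNOWN."""
    try:
        return Status(value)
    except ValueError:
        return Status.UNKNOWN
```

Falling back to Unknown for unrecognized text is a design choice of this sketch; it keeps a malformed record from being displayed as healthy.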

Vendor monitoring methods

Microsoft (Microsoft Graph API)

Microsoft service health is read from Graph using the service announcement APIs. The backend fetches health overviews and active unresolved issues, maps Microsoft statuses to OK / Degraded / Outage / Unknown, and attaches incident detail for the dashboard's incident modal and admin-center links.
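
The status mapping might look something like the sketch below. The input strings are values of the Graph serviceHealthStatus enumeration, as returned by /admin/serviceAnnouncement/healthOverviews and /admin/serviceAnnouncement/issues. How they are grouped into dashboard statuses here is an assumption for illustration, not the project's confirmed mapping, and map_microsoft_status is a hypothetical name.

```python
from typing import Optional

# ASSUMPTION: this grouping is illustrative, not the project's actual mapping.
_OK = {
    "serviceOperational",
    "serviceRestored",
    "falsePositive",
    "postIncidentReviewPublished",
    "resolved",
}
_DEGRADED = {
    "serviceDegradation",
    "investigating",
    "restoringService",
    "verifyingService",
    "extendedRecovery",
}
_OUTAGE = {"serviceInterruption"}


def map_microsoft_status(graph_status: Optional[str]) -> str:
    """Translate a Graph serviceHealthStatus string into a dashboard status."""
    if graph_status in _OK:
        return "OK"
    if graph_status in _DEGRADED:
        return "Degraded"
    if graph_status in _OUTAGE:
        return "Outage"
    # Missing or unrecognized values (e.g. "unknownFutureValue") stay Unknown.
    return "Unknown"
```

Keeping unrecognized values at Unknown means a new Microsoft status string would surface visibly instead of being silently treated as healthy.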

Forte (Statuspage API)

Forte is checked through its public Atlassian Statuspage summary endpoint at status.forte.net/api/v2/summary.json. Active incidents become the dashboard headline.
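
A Statuspage summary.json payload carries an overall status.indicator (none, minor, major, or critical) and a list of unresolved incidents. The sketch below shows one way to reduce that payload to a dashboard status and headline. The indicator-to-status mapping, the fallback to the status description, and the name summarize_statuspage are assumptions for illustration.

```python
from typing import Any, Dict

# ASSUMPTION: illustrative mapping from Statuspage indicators to dashboard statuses.
INDICATOR_TO_STATUS = {
    "none": "OK",
    "minor": "Degraded",
    "major": "Outage",
    "critical": "Outage",
}


def summarize_statuspage(summary: Dict[str, Any]) -> Dict[str, str]:
    """Reduce a parsed summary.json payload to a status and a headline."""
    page_status = summary.get("status") or {}
    status = INDICATOR_TO_STATUS.get(page_status.get("indicator"), "Unknown")

    incidents = summary.get("incidents") or []
    if incidents:
        # An active incident supplies the headline.
        headline = incidents[0].get("name", "")
    else:
        # ASSUMPTION: with no incident, fall back to the page-level description.
        headline = page_status.get("description", "")
    return {"status": status, "headline": headline}
```

For example, a payload with indicator "minor" and one incident named "Delayed settlements" would yield a Degraded status with that incident name as the headline.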

Billtrust (Auth pending)

Billtrust is configured to check status.billtrust.com/api/v2/summary.json, but authentication requirements are still unresolved. If the endpoint requires login in production, the app currently cannot read it automatically.

The current poller sends an unauthenticated request. Billtrust monitoring should be treated as unverified until a public endpoint or supported authenticated integration is confirmed.
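
Until the authentication question is settled, a poller could at least distinguish a usable summary from an auth wall. The sketch below is hypothetical and does not describe the current poller: the function name, the checks, and the reason strings are all assumptions.

```python
import json
from typing import Tuple


def is_readable_summary(status_code: int, content_type: str, body: str) -> Tuple[bool, str]:
    """Decide whether an unauthenticated response looks like a usable summary.

    Returns (readable, reason). All checks here are illustrative assumptions.
    """
    if status_code in (401, 403):
        return False, "authentication required"
    if "json" not in content_type.lower():
        # A login page is typically served as HTML rather than JSON.
        return False, "non-JSON response"
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False, "invalid JSON"
    if not isinstance(payload, dict) or "status" not in payload:
        return False, "unexpected payload shape"
    return True, "ok"
```

A poller using a check like this could report Unknown with the reason attached, which makes an unverified integration visible on the dashboard.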

ServiceTrade (Synthetic / parsed status)

ServiceTrade is checked from its Uptrends-hosted status page. The backend parses the confirmed and unconfirmed probe error counts. Confirmed errors are reported as Outage, unconfirmed errors as Degraded, and an absence of probe errors as OK.
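
The count-to-status rule can be sketched as a small function. Two points are assumptions of this sketch: confirmed errors take precedence when both counts are non-zero, and a count that could not be parsed (None) is reported as Unknown. The function name is hypothetical, and parsing the page itself is not shown.

```python
from typing import Optional


def servicetrade_status(confirmed_errors: Optional[int],
                        unconfirmed_errors: Optional[int]) -> str:
    """Map parsed probe error counts to a dashboard status."""
    if confirmed_errors is None or unconfirmed_errors is None:
        # ASSUMPTION: a failed parse is reported as Unknown rather than OK.
        return "Unknown"
    if confirmed_errors > 0:
        # ASSUMPTION: confirmed errors win over unconfirmed ones.
        return "Outage"
    if unconfirmed_errors > 0:
        return "Degraded"
    return "OK"
```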

Profitzoom (Synthetic check)

Profitzoom has no known public status API. The backend sends an HTTP GET to app.profitzoom.net/pyebarker/. A response with a status code below 400 is reported as OK; a timeout, connection failure, or HTTP 4xx/5xx response is reported as Outage.

This check confirms that the app responds, but it won't detect degraded performance or backend errors that still return a status code below 400. No public status page has been found for Profitzoom.
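
A synthetic check of this kind can be sketched with the standard library. This is not the project's poller: the function names and the 10-second timeout are assumptions, and urllib is used here only to keep the example self-contained.

```python
import urllib.error
import urllib.request
from typing import Optional


def classify_http_result(status_code: Optional[int]) -> str:
    """Apply the rule: below 400 is OK, anything else is Outage.

    None stands for "no HTTP response" (timeout or connection failure).
    """
    if status_code is not None and status_code < 400:
        return "OK"
    return "Outage"


def synthetic_check(url: str, timeout: float = 10.0) -> str:
    """Send one GET request and classify the outcome."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify_http_result(response.status)
    except urllib.error.HTTPError as exc:
        # urllib raises HTTPError for 4xx/5xx responses.
        return classify_http_result(exc.code)
    except OSError:
        # Covers URLError, timeouts, and other connection failures.
        return classify_http_result(None)
```

Note that urlopen follows redirects by default, so the status code classified is that of the final response.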

Polling and cache behavior

The backend runs five pollers: Microsoft, Forte, Billtrust, ServiceTrade, and Profitzoom. Each poller replaces that vendor's records in the shared in-memory cache. There is no SharePoint list in the current implementation.
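
The per-vendor replace behavior can be illustrated with a small in-memory store. This is a sketch under assumptions: the class name StatusCache, its methods, and the use of a lock are hypothetical and not taken from the repository.

```python
import threading
from typing import Dict, List


class StatusCache:
    """Shared in-memory cache keyed by vendor name."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._records: Dict[str, List[dict]] = {}

    def replace_vendor(self, vendor: str, records: List[dict]) -> None:
        """Replace all records for one vendor; other vendors are untouched."""
        with self._lock:
            self._records[vendor] = list(records)

    def all_records(self) -> List[dict]:
        """Return a flat list of the latest records across all vendors."""
        with self._lock:
            return [r for recs in self._records.values() for r in recs]
```

Replacing a vendor's records wholesale, rather than merging, means a resolved incident disappears on the next successful poll of that vendor.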

Because the cache is in memory, a backend restart clears all status records until the startup poll completes. The startup poll runs immediately so /api/status can return fresh data as soon as the service is ready.
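
The "poll once at startup, then every 5 minutes" pattern can be sketched with asyncio. The backend itself uses APScheduler; this standard-library loop is a stand-in for illustration only. The names run_pollers and max_cycles are hypothetical, and max_cycles exists just so the loop can be stopped in a demo.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

Poller = Callable[[], Awaitable[None]]


async def run_pollers(pollers: Sequence[Poller],
                      interval_seconds: float = 300.0,
                      max_cycles: Optional[int] = None) -> None:
    """Run every poller immediately, then again after each interval."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        # ASSUMPTION: return_exceptions=True keeps one failing vendor
        # from cancelling the other pollers in the same cycle.
        await asyncio.gather(*(poll() for poll in pollers),
                             return_exceptions=True)
        cycles += 1
        if max_cycles is not None and cycles >= max_cycles:
            break
        await asyncio.sleep(interval_seconds)
```

Because the first cycle runs before any sleep, the cache is populated as soon as the first round of polls finishes instead of one full interval after startup.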

Deployment

The repository is pbfs/prodops-service-dashboard. There is no dev or stage environment for this app: changes move from a feature branch to main, and every merge to main deploys to production.

The multi-container deployment is defined in .github/containers-spec.json. GitHub Actions calls the shared pbfs/platform-shared-tooling workflow to build and push separate frontend and backend images, then updates the production Azure Container Apps with the commit-SHA image tags.

Target              Details
Public URL          svcdash.cloud.pyebarkerfs.com
Access group        sg-svcdash-users
Frontend container  svcdash-prod-frontend-30rg, built from frontend/Dockerfile, nginx, serves index.html and about.html
Backend container   svcdash-prod-backend-fzd4, built from backend/Dockerfile, Python 3.12, FastAPI, exposes /health and /api/status
Auth gateway        svcdash-prod-bffauth-q13r, handles sign-in and forwards authenticated dashboard calls through APIM
Release path        Feature branch to reviewed PR to main; main deploys directly to the single production environment