Status & health
Three ways to check Memoire is healthy, in increasing order of detail.
1. Public health badge
For uptime monitors, status pages, or just a sanity check:
curl -s https://api.trymemoire.com/readyz | jq .ready
Returns true when all subsystems are green, false otherwise. The HTTP status code follows the same rule (200 vs 503), so you can wire it into any readiness-check-aware tool without parsing JSON.
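Because the status code mirrors the ready flag, a probe can skip JSON parsing entirely. The following is a minimal sketch, assuming curl is installed. The function names classify_readyz and check_readyz are illustrative and not part of Memoire, and treating every code other than 200 or 503 as "unknown" is a choice made for this example.

```shell
# Map an HTTP status code from /readyz to a label.
# 200 and 503 are the two codes documented for this endpoint;
# anything else (timeouts, proxy errors) is reported as "unknown".
classify_readyz() {
  case "$1" in
    200) echo "ready" ;;
    503) echo "degraded" ;;
    *)   echo "unknown" ;;
  esac
}

# Probe the endpoint and succeed (exit status 0) only when it reports ready.
check_readyz() {
  local url="${1:-https://api.trymemoire.com/readyz}"
  local code
  local status
  # -o /dev/null discards the body; -w prints only the status code.
  # If curl itself fails (DNS, timeout), fall back to "000".
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url") || code="000"
  status=$(classify_readyz "$code")
  echo "$status (HTTP $code)"
  [ "$status" = "ready" ]
}
```

Calling check_readyz from a cron job or a container health check then gives a non-zero exit status whenever the gateway is degraded or unreachable.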
2. Dashboard status page
Sign in and visit /dashboard/status. You get:
- Live ready/degraded banner with uptime
- Build version, commit, build time
- Fly.io machine ID, region, and image ref
- Per-subsystem status (memory, runners, LLM router, Slack, watchdog, activity bus)
- Last 20 errors with stack traces — useful when something flaked and you want to know if it was a known issue
- Live event stream — Slack status-emoji transitions, runner spawns, procedure phase changes — auto-refreshing every 10s
3. Raw diagnostics endpoint
For programmatic access or to grep with tools you already use:
curl 'https://api.trymemoire.com/diagnostics?limit=50&category=slack-status' | jq
See the full schema on the Gateway API page.
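If you query the endpoint from scripts, a small helper keeps the query string in one place. This is a sketch only: diagnostics_url is a hypothetical helper, the limit and category parameters are the two shown in the command above, and treating category as optional is an assumption that the Gateway API page should confirm. The category value is not URL-encoded, so it is assumed to be a plain token such as slack-status.

```shell
# Build a /diagnostics URL from a limit and an optional category.
# Assumption: category may be omitted; check the Gateway API page.
diagnostics_url() {
  local limit="${1:-50}"
  local category="${2:-}"
  local url

  # Reject anything that is not a non-negative integer.
  case "$limit" in
    ''|*[!0-9]*)
      echo "limit must be a non-negative integer" >&2
      return 1
      ;;
  esac

  url="https://api.trymemoire.com/diagnostics?limit=${limit}"
  if [ -n "$category" ]; then
    url="${url}&category=${category}"
  fi
  printf '%s\n' "$url"
}
```

For example, curl -s "$(diagnostics_url 50 slack-status)" | jq . requests the same URL as the command above.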
What the subsystem checks mean
- memory — Core memory blocks loaded. Failing here usually means the org directory wasn't initialised (rare).
- runners — At least one coding agent (Claude or Codex) is reachable. Both being down means new code-shipping tasks will fail; planning/research still works.
- llm_router — At least one LLM provider has a healthy key. A direct API key counts as healthy even when the multi-account router is offline.
- slack — Number of connected workspaces. Zero is fine if you're not using Slack; non-zero means Socket Mode is live.
- watchdog — How many subprocess runners the watchdog is currently tracking. Non-zero is normal; it means a task is in flight.
- activity_bus — How many SSE clients are subscribed (the dashboard's live activity feed).
What we monitor on our side
Internally, the gateway pages on:
- /readyz non-200 for > 60s (uptime monitor)
- Runner mortality rate > 5% (process watchdog metric)
- Webhook signature failures > 0 (security alert)
- Stripe webhook backlog > 30s (billing alert)
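The first two thresholds in the list above reduce to simple comparisons, sketched here for anyone who wants to mirror them in their own monitoring. This is not Memoire's internal alerting code. The function names are hypothetical, and defining the mortality rate as dead runners divided by runners spawned in the same window is an assumption.

```shell
# Page when /readyz has been non-200 for more than `threshold` seconds.
#   $1: epoch seconds of the first consecutive non-200 response ("" if healthy)
#   $2: current epoch seconds
#   $3: threshold in seconds (defaults to 60)
readyz_should_page() {
  local first_fail="$1"
  local now="$2"
  local threshold="${3:-60}"
  [ -n "$first_fail" ] && [ $(( now - first_fail )) -gt "$threshold" ]
}

# Alert when the runner mortality rate exceeds `max_percent`.
# Compares dead*100 against max_percent*total to stay in integer arithmetic.
#   $1: runners that died   $2: runners spawned   $3: percent (defaults to 5)
runner_mortality_exceeded() {
  local dead="$1"
  local total="$2"
  local max_percent="${3:-5}"
  [ "$total" -gt 0 ] && [ $(( dead * 100 )) -gt $(( max_percent * total )) ]
}
```

Both functions report through their exit status, so they compose with if and && in a polling loop. Both comparisons are strict, so exactly 60 seconds of failure or a rate of exactly 5% does not trigger.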
A public status page at status.trymemoire.com is planned.