A process can be “running” and still broken. Postgres is down, Redis dropped the connection, or the worker pool deadlocked: the agent is still bound to port 3773 and answering TCP, but the next real request fails.

So /health in Bindu does more than return 200 OK. It walks the runtime (storage, scheduler, task manager) and returns 503 Service Unavailable when any of them is missing or not running. /metrics exposes Prometheus text-format counters, gauges, histograms, and summaries for request traffic, latency, task state, errors, and per-agent task completion.

You do not configure either. Both endpoints are registered on BinduApplication startup, both are on the auth allowlist, and both are exempt from the startup gate so Kubernetes can probe the pod while storage and the scheduler are still coming up. Point your load balancer at /health and your scrapers at /metrics, and you have visibility from request one.

Liveness, readiness, metrics

Bindu does not split liveness and readiness across multiple paths. There is one health endpoint and it makes the strict choice for you: 200 means every dependency is ready, 503 means at least one is not.
| Question | Endpoint | Status code | Use for |
| --- | --- | --- | --- |
| Is the process alive at all? | /health reachable | TCP-level | Liveness probe |
| Is the agent ready to take work? | /health returns 200 | 200 vs 503 | Readiness probe, load balancer health checks |
| How has it been behaving over time? | /metrics | 200 | Prometheus scrape, Grafana dashboards |
Strict readiness means storage + scheduler + task manager are all initialized and the task manager’s run loop is active. Anything less and /health returns HTTP 503 with "status": "degraded", "ready": false, "health": "degraded".
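The strict-readiness rule above can be sketched in a few lines of Python. This is a minimal illustration, not Bindu's actual handler: the function names `strict_ready` and `health_response` and the `FakeTaskManager` stand-in are hypothetical, and only the `is_running` attribute and the response field values come from this page.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class FakeTaskManager:
    """Hypothetical stand-in for the task manager; only is_running matters here."""
    is_running: bool


def strict_ready(storage: Optional[Any], scheduler: Optional[Any],
                 task_manager: Optional[Any]) -> bool:
    """True only when all three components exist and the run loop is active."""
    return (
        storage is not None
        and scheduler is not None
        and task_manager is not None
        and bool(getattr(task_manager, "is_running", False))
    )


def health_response(storage: Optional[Any], scheduler: Optional[Any],
                    task_manager: Optional[Any]) -> Tuple[int, dict]:
    """Map strict readiness onto the HTTP status code and top-level fields."""
    ready = strict_ready(storage, scheduler, task_manager)
    body = {
        "status": "ok" if ready else "degraded",
        "ready": ready,
        "health": "healthy" if ready else "degraded",
    }
    return (200 if ready else 503), body
```

A missing storage object, a missing scheduler, or a stopped task manager each yield the same 503 with the degraded fields, which is what makes the single endpoint safe to use as a readiness signal.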

Readable

/health returns a single JSON document with runtime, application, and system blocks.

Scrapeable

/metrics is served in the Prometheus text exposition format (version=0.0.4), so any scraper that reads that format, Prometheus included, can collect it.

Probe-safe

Both paths are exempt from the startup gate and are on the auth middleware allowlist, so probes never get a 500 or 401.

Request flow

The metrics middleware runs on every request except /metrics itself (skipped to avoid feedback loops), sanitizes path parameters (UUIDs and numeric IDs are rewritten to :id to keep cardinality bounded), and feeds the global PrometheusMetrics singleton.
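To make the cardinality argument concrete, here is a rough Python sketch of path sanitization. It is an assumption-laden illustration rather than the middleware's real code: the name `sanitize_path` is hypothetical, the full UUID pattern is assumed (this page only shows its first group), and matching whole path segments is a design choice of the sketch.

```python
import re

# Assumed full UUID pattern; the documentation only shows the first group.
_UUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
_DIGITS = re.compile(r"\d+")


def sanitize_path(path: str) -> str:
    """Replace UUID and all-digit path segments with ':id'."""
    segments = path.split("/")
    cleaned = [
        ":id" if _UUID.fullmatch(seg) or _DIGITS.fullmatch(seg) else seg
        for seg in segments
    ]
    return "/".join(cleaned)
```

With this approach every task URL lands in the same `/tasks/:id` label value, so the number of `endpoint` series grows with the number of routes rather than the number of tasks.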

/health

1. Probe it

curl -i http://localhost:3773/health
2. Read the fields

| Field | Type | Meaning |
| --- | --- | --- |
| status | string | "ok" when strict_ready=true, else "degraded" |
| ready | bool | Mirrors runtime.strict_ready. Use this for readiness probes |
| health | string | "healthy" or "degraded" (same trigger as status) |
| uptime_seconds | float | Seconds since the process imported health.py (monotonic clock) |
| version | string | Value of bindu.__version__ |
| runtime.storage_backend | string or null | Class name of app._storage (e.g. PostgresStorage, InMemoryStorage) |
| runtime.scheduler_backend | string or null | Class name of app._scheduler (e.g. RedisScheduler, MemoryScheduler) |
| runtime.task_manager_running | bool | True only if app.task_manager.is_running |
| runtime.strict_ready | bool | True when all of storage, scheduler, task manager are present and running |
| application.penguin_id | string | UUID identifying this server process |
| application.agent_did | string or null | Agent DID from the manifest, if present |
| system.python_version | string | sys.version.split()[0] |
| system.platform | string | platform.system(), e.g. Linux, Darwin |
| system.environment | string | Value of the $ENV environment variable, defaults to "development" |
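If you read these fields from a script rather than curl, one detail is easy to miss: Python's `urllib` raises `HTTPError` on a 503, yet the degraded JSON body is still attached to the exception. The sketch below assumes the default `http://localhost:3773/health` URL from this page. The helper names `fetch_health` and `is_ready` are hypothetical.

```python
import json
import urllib.error
import urllib.request
from typing import Tuple


def fetch_health(url: str = "http://localhost:3773/health",
                 timeout: float = 2.0) -> Tuple[int, dict]:
    """Return (status_code, parsed_body) for a /health request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as err:
        # A 503 raises, but the degraded JSON document is still readable.
        return err.code, json.loads(err.read())


def is_ready(status_code: int, doc: dict) -> bool:
    """Treat the agent as ready only on HTTP 200 with ready == true."""
    return status_code == 200 and doc.get("ready") is True
```

Calling `is_ready(*fetch_health())` gives a single boolean. A connection error (the process is not listening at all) still propagates as `urllib.error.URLError`, which keeps "unreachable" distinct from "degraded".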
3. Wire it to your orchestrator

Kubernetes treats any status code from 200 through 399 as a passing probe; 503 fails it. That maps directly to strict_ready, so a single config covers both liveness and readiness:
livenessProbe:
  httpGet:
    path: /health
    port: 3773
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health
    port: 3773
  initialDelaySeconds: 2
  periodSeconds: 5
During boot — before storage and scheduler initialize inside the ASGI lifespan — /health returns 503 without raising. The startup gate in BinduApplication.__call__ whitelists /health, /healthz, and /metrics, so probes get a proper response code instead of a 500.
Only /health is registered as a route. The /healthz path is whitelisted in the startup gate (so it cannot 500 mid-boot) but no handler exists for it — a request to /healthz returns 404 once the app is running. Use /health everywhere.

/metrics

curl http://localhost:3773/metrics
Returns text/plain; version=0.0.4; charset=utf-8 with Cache-Control: no-cache. Before generating output, the endpoint refreshes the per-agent active-task gauge by counting tasks in submitted, working, and input-required states from storage.
# HELP http_requests_total Total number of HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="GET",endpoint="/health",status="200"} 42
http_requests_total{method="POST",endpoint="/",status="200"} 17

# HELP http_request_duration_seconds HTTP request latency
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.1"} 55
http_request_duration_seconds_bucket{le="0.5"} 58
http_request_duration_seconds_bucket{le="1.0"} 59
http_request_duration_seconds_bucket{le="+Inf"} 59
http_request_duration_seconds_sum 4.2
http_request_duration_seconds_count 59

# HELP agent_tasks_active Currently active tasks
# TYPE agent_tasks_active gauge
agent_tasks_active{agent_id="did:bindu:abc123"} 3

# HELP agent_tasks_completed_total Total completed tasks
# TYPE agent_tasks_completed_total counter
agent_tasks_completed_total{agent_id="did:bindu:abc123",status="success"} 128

# HELP http_requests_in_flight Current number of HTTP requests being processed
# TYPE http_requests_in_flight gauge
http_requests_in_flight 1
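The latency buckets in the sample above are cumulative, which is what lets Prometheus estimate quantiles from them. The following Python sketch mirrors the linear interpolation that `histogram_quantile` applies to classic histograms. It is a simplification: Prometheus works on `rate()` values rather than raw counters, and the function below is an illustrative re-implementation, not Prometheus code.

```python
import math
from typing import List, Tuple


def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate a quantile from cumulative (upper_bound, count) buckets.

    The bucket list must include the +Inf bucket and at least one finite one.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need finite buckets plus a +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # lowest bucket is assumed to start at 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile falls past the last finite bound: report that bound.
                return buckets[-2][0]
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Cumulative counts copied from the sample output above.
SAMPLE = [(0.1, 55), (0.5, 58), (1.0, 59), (math.inf, 59)]
```

For the sample, `histogram_quantile(0.95, SAMPLE)` comes out at about 0.24 seconds: the 95th-percentile rank (56.05 of 59) falls in the 0.1 to 0.5 bucket, and the estimate is interpolated within it. With only three finite buckets, the result is coarse, so treat it as a trend signal rather than a precise latency.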

Metric reference

http_requests_total

Total HTTP requests handled by MetricsMiddleware, labelled by method, sanitized endpoint, and status code. Path parameters are normalized: UUIDs (matching [0-9a-f]{8}-…) and runs of digits both collapse to /:id. That keeps cardinality bounded even on high-traffic task endpoints like /tasks/<uuid>.
rate(http_requests_total[5m])
sum by (status) (rate(http_requests_total[1m]))
http_request_duration_seconds

Request latency in seconds. Buckets: 0.1, 0.5, 1.0, +Inf. Exposes _bucket, _sum, _count. The histogram is global, not per-endpoint: there are no method or path labels on the buckets. If you need per-route p95, build it from request count buckets in your aggregator, or alert on the global p95.
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
agent_tasks_active

Tasks currently in submitted, working, or input-required state for the agent. Updated lazily: only when /metrics is scraped, by querying storage.count_tasks(status=…) for each active state. Only emitted when at least one agent has reported.
agent_tasks_completed_total

Cumulative finished tasks per (agent_id, status). Status values come from the task manager and are typically success, failed, or canceled. Only emitted when at least one task has completed.
sum by (status) (rate(agent_tasks_completed_total[5m]))
task_duration_seconds

End-to-end task execution time, labelled by agent_id and status. Buckets: 1, 5, 10, 30, 60, +Inf seconds. Exposes _bucket, _sum, _count. Only emitted when at least one task has finished.
histogram_quantile(0.95,
  sum by (le, agent_id) (rate(task_duration_seconds_bucket[5m]))
)
agent_errors_total

Errors raised during task execution, labelled by agent_id and error_type (e.g. timeout, validation, execution). Only emitted when at least one error has been recorded.
Request and response body size summaries

Aggregate byte totals and counts for request and response bodies. Exposed as Prometheus summaries with _sum and _count only (no quantiles). Sizes are read from the Content-Length header, so chunked responses without a length register as zero.
http_requests_in_flight

Number of concurrent in-flight HTTP requests. Incremented in the middleware before call_next, decremented in the finally block, so it always reflects current concurrency even on exceptions. Always emitted, even at zero, which is useful for “agent is alive but idle” dashboards.
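The increment-before, decrement-in-finally pattern described for http_requests_in_flight can be sketched as follows. This is a framework-free illustration: `InFlightGauge`, `track_in_flight`, and `demo` are hypothetical names, and the `(request, call_next)` signature is only modelled loosely on the `call_next` mentioned above.

```python
import asyncio
from typing import Any, Awaitable, Callable, Tuple


class InFlightGauge:
    """Tiny stand-in for a Prometheus gauge."""

    def __init__(self) -> None:
        self.value = 0

    def inc(self) -> None:
        self.value += 1

    def dec(self) -> None:
        self.value -= 1


IN_FLIGHT = InFlightGauge()


async def track_in_flight(request: Any,
                          call_next: Callable[[Any], Awaitable[Any]]) -> Any:
    """Count the request as in flight for exactly the duration of the handler."""
    IN_FLIGHT.inc()
    try:
        return await call_next(request)
    finally:
        # Runs on success and on exceptions, so the gauge cannot leak upward.
        IN_FLIGHT.dec()


async def demo() -> Tuple[int, int]:
    """Return (gauge value seen inside a handler, gauge value after a failure)."""

    async def ok_handler(request: Any) -> int:
        return IN_FLIGHT.value

    async def failing_handler(request: Any) -> None:
        raise RuntimeError("handler failed")

    during = await track_in_flight(None, ok_handler)
    try:
        await track_in_flight(None, failing_handler)
    except RuntimeError:
        pass
    return during, IN_FLIGHT.value
```

Running `asyncio.run(demo())` shows the gauge at 1 while a handler executes and back at 0 after a handler raises, which is the property that makes the metric trustworthy during error spikes.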

Prometheus scrape config

scrape_configs:
  - job_name: bindu-agents
    metrics_path: /metrics
    scrape_interval: 15s
    static_configs:
      - targets:
          - localhost:3773
        labels:
          service: bindu
          env: production
The metrics middleware skips collection for /metrics itself, so scrape calls do not inflate http_requests_total. Your dashboards measure user traffic, not Prometheus polling.

Use cases

Use /health for both probes. The endpoint returns 503 while storage or scheduler is initializing, so the readiness probe fails and Kubernetes keeps a half-booted pod out of Service endpoints. It returns 200 only once task_manager.is_running is true.
Point your LB at GET /health. Anything other than 200 should drain the instance. status: "degraded" is your signal that the process is up but should not receive traffic.
Scrape /metrics every 15s. Build dashboards on rate(http_requests_total), histogram_quantile(0.95, …) for p95 latency, agent_tasks_active for concurrency, and agent_errors_total for fault rates.
If an agent appears live but rejects work, hit /health and inspect the runtime block. A null storage_backend or scheduler_backend means initialization never completed; task_manager_running: false means the loop exited or never started.
curl -s http://localhost:3773/health | jq .runtime
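The same triage can be scripted. The sketch below turns a parsed `runtime` block into a list of human-readable findings, following the rules in the paragraph above. The function name `diagnose_runtime` and the message wording are illustrative assumptions, while the field names come from the /health field table.

```python
from typing import List


def diagnose_runtime(runtime: dict) -> List[str]:
    """Explain why strict_ready is false, based on the /health runtime block."""
    problems: List[str] = []
    if runtime.get("storage_backend") is None:
        problems.append("storage_backend is null: storage initialization never completed")
    if runtime.get("scheduler_backend") is None:
        problems.append("scheduler_backend is null: scheduler initialization never completed")
    if not runtime.get("task_manager_running", False):
        problems.append("task_manager_running is false: the loop exited or never started")
    return problems
```

An empty list means all three components look healthy from the endpoint's point of view, so the fault is more likely in the agent's own task handling than in the runtime.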
Bindu’s Sentry integration filters /health, /healthz, and /metrics out of transaction traces by default (SentrySettings.filter_transactions), so probe and scrape traffic does not eat your Sentry quota or pollute performance dashboards.

Best practices

Probe readiness, not just the port

A TCP check on :3773 will pass while storage is still connecting. Use HTTP /health and treat any non-200 as not-ready.

Scrape every 15s

A 15s scrape interval is enough for agent_tasks_active to track real concurrency. Below 10s mostly buys noise.

Alert on degraded, not down

By the time /health is unreachable you are already on fire. Alert when /health responds with status: "degraded" (or health: "degraded") to get an early warning.

Keep both paths public

Both /health and /metrics are in the auth allowlist by default. Do not put them behind your OAuth gateway — probes and scrapers do not carry tokens.

With Bindu, agent health is like a field of sunflowers: each one standing on its own, yet easy to observe and trust.