# Frequently Asked Questions

> Answers to common questions about Bindu, setup, architecture, and troubleshooting

## General Concepts

Bindu is an AI agent framework that speaks the **A2A** protocol (Agent-to-Agent communication) and the **X402** micropayment extension. It handles infrastructure, identity (DIDs + Hydra OAuth2 + optional mTLS), observability, and payments, turning any local agent script into a production-ready microservice without rewriting code. Native AP2 support is on the roadmap, not shipped.

A2A is a JSON-RPC 2.0 protocol that defines task lifecycle, context sharing, agent cards, push notifications, and extensions. Bindu implements **A2A v0.3.0**. The reason Bindu uses it instead of rolling something custom: agents written in different languages, on different frameworks, deployed by different teams can talk to each other without a translation layer. The task-first execution model (every interaction becomes a tracked task with state) is what makes orchestration possible across that boundary.

Use a **Workflow** for sequential, step-by-step processing within a single agent (e.g., search -> extract -> summarize). Use a **Team** when you need multiple specialized agents with different identities and tools collaborating on a complex problem.

No. With auth off, the middleware never inspects the request and no DID is required. As soon as you turn auth on, the bearer token's `client_id` becomes a DID (the Bindu default), and the agent demands a matching `X-DID` header + signature on every call. The cleanest path is to generate a DID once when you `bindufy()` your agent: `bindufy` writes it to `.bindu/oauth_credentials.json` and the same DID survives across restarts.

No. The seed **is** the private key.
Without it, you cannot sign requests as that DID and you cannot prove ownership. Generate a fresh seed, derive a new DID, and re-register it with Hydra (with the new public key in `metadata.public_key`). Anything that referenced the lost DID (allowlists, peer agent configs) needs to be updated to point at the new one. Treat the seed file with `.ssh/id_rsa`-level care: chmod 600, never commit, back up to a password manager or HSM.

No. The DID's last segment is `sha256(public_key)[0..32]` formatted as a UUID, so changing the keypair changes the derived DID too. This is intentional: the DID is a fingerprint of the public key, so "rotating keys but keeping the DID" would defeat the verification chain. If you need to rotate, mint a new DID, register the new public key in Hydra, and update consumers.

***

## Getting Started Confusion

Three layers in the same stack, doing different jobs:

* **Bindu**: the agent framework itself. You wrap your handler with `bindufy()` and get a Starlette HTTP server speaking A2A on port 3773 by default.
* **Gateway**: a separate service (`gateway/`) that plans and fans out work across multiple Bindu agents. You call `POST /plan` with a high-level intent; the gateway decides which agents to invoke and orchestrates the multi-agent flow.
* **Inbox**: the UI (`inbox/`) that bootstraps a personal agent, registers it with Hydra, signs outbound messages, and renders agents as Gmail-shaped addresses (`fleet+agent@getbindu.com`). It's the easiest way to talk to a fleet without writing a caller.

You can run Bindu standalone without the gateway or inbox. The other two are convenience layers.

Yes. `bindufy()` starts a uvicorn server and blocks. That's the intended behavior for a single-agent process.
If you want to do other work after starting the server, run it in a background thread, use `asyncio.create_task` on the underlying coroutine, or run multiple agents as separate processes orchestrated by a process manager like `supervisord` or `concurrently`.

Technically yes, but it's almost always the wrong call. Each agent occupies its own port and has its own DID, manifest, and storage. Running them as separate processes is simpler, makes scaling and crash isolation cleaner, and matches how the `examples/gateway_test_fleet/` pattern works. If you really need in-process composition, build one Bindu agent whose **handler** orchestrates several underlying framework agents (Agno team, LangGraph nodes, CrewAI crew).

Default is `http://localhost:3773` (`default_host=localhost`, `default_port=3773` in `settings.py`). Override with `BINDU_PORT=4000 python agent.py` or `BINDU_HOST=0.0.0.0 BINDU_PORT=4000 python agent.py`. If you need a different public URL than the bind address (e.g., behind a reverse proxy), set `BINDU_DEPLOYMENT_URL=https://my-agent.example.com`; that's what the agent card advertises.

`BinduApplication` defaults `url` to `"http://localhost"` and `bindufy()` only overrides it when you pass `deployment.url` through the manifest. Set `BINDU_DEPLOYMENT_URL` in your env, or pass an explicit `url=...` into the deployment dict you give to `bindufy()`. Peers fetching `/.well-known/agent.json` will then see the real address.

Three options, ordered by use case:

1. **Development / demo:** `bindufy(launch=True, ...)` creates an FRP tunnel and publishes a public URL. Convenient, not durable.
2. **Production:** put the agent behind your own load balancer + TLS terminator and set `BINDU_DEPLOYMENT_URL` to the public address.
3. **Peer-to-peer over public internet:** turn on mTLS so the agent serves HTTPS directly with a step-ca cert. See the mTLS section below.

Only if auth is on and you point at the hosted Hydra (`https://hydra.getbindu.com`).
With `AUTH__ENABLED=false`, `STORAGE_TYPE=memory`, `SCHEDULER_TYPE=memory`, and a local LLM (Ollama, LM Studio), the agent runs fully offline. Hydra is the only mandatory external dependency when auth is on, and you can self-host Hydra in an air-gapped network.

***

## Installation & Setup

Bindu requires **Python 3.12 or higher**.

Bindu is optimized for the `uv` package manager. You can install the core framework by running `uv add bindu`.

Using UV, run `uv venv --python 3.12.9`. Then activate it using `source .venv/bin/activate` (macOS/Linux) or `.venv\Scripts\activate` (Windows) before installing dependencies.

Yes, but UV is strongly recommended. If you use Conda, create your environment (`conda create -n bindu-env python=3.12`), activate it, and then run `pip install uv` to manage your Bindu dependencies.

Run `uv add bindu --upgrade`. Bindu follows semantic versioning, meaning minor and patch updates are backward-compatible.

Partially. Bindu can run entirely locally using `InMemoryStorage` and local LLMs (like Ollama). However, initial setup (downloading the LLM weights and installing Python packages) requires an internet connection.

***

## Environment Variables

Yes. Bindu automatically loads variables from a `.env` file in your project root.

At a minimum, you must set an API key for your chosen LLM provider (e.g., `OPENROUTER_API_KEY`, `OPENAI_API_KEY`, or `MINIMAX_API_KEY`).

Use the `export` command in your terminal (e.g., `export BINDU_PORT=4000`), or add them to your `~/.bashrc` or `~/.zshrc` file.

In PowerShell, use the `$env:` prefix (e.g., `$env:BINDU_PORT="4000"`).

Create separate files (e.g., `.env.staging` and `.env.production`) and load the specific file in your deployment environment, or inject the variables directly via your CI/CD pipeline.

Yes. You can use the `ENV` instruction in your Dockerfile, pass them via `docker run -e`, or use the `environment` mapping in a `docker-compose.yml` file.

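
The per-environment file approach mentioned above (`.env.staging`, `.env.production`) can be sketched in plain Python. This is a minimal illustration, not Bindu's own loader: the `APP_ENV` variable and the helper names `parse_env_file` and `load_env_for` are hypothetical, and the parser only handles simple `KEY=VALUE` lines.

```python
import os
from pathlib import Path


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of optional quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_for(app_env: str, directory: Path, environ=os.environ) -> dict[str, str]:
    """Load `.env.<app_env>` into `environ` (hypothetical helper).

    Variables already present in `environ` are left untouched, so values
    injected by a CI/CD pipeline or `docker run -e` take precedence.
    """
    loaded = parse_env_file(directory / f".env.{app_env}")
    for key, value in loaded.items():
        environ.setdefault(key, value)
    return loaded
```

A caller might run `load_env_for(os.environ.get("APP_ENV", "staging"), Path("."))` at the very top of `agent.py`. Letting existing process variables win keeps the file as a fallback rather than an override, which is one reasonable choice among several.
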

Bindu delegates API call execution to your driver framework (Agno, LangChain). Most frameworks read the API key at initialization, meaning a restart is usually required unless you implement a custom dynamic key loader in your handler function.

***

## Switching Models

Bindu is framework-agnostic: it doesn't talk to the LLM directly; your driver framework does. Switch providers by changing the model config in your framework (e.g., Agno: swap `OpenAIChat` → `Claude`; LangChain: swap `ChatOpenAI` → `ChatAnthropic`; CrewAI: swap the `llm=` argument) and update the API keys in your `.env`. The Bindu wrapper around your handler stays unchanged.

Configure your driver framework to point at your local endpoint (`http://localhost:11434` for Ollama, `http://localhost:1234/v1` for LM Studio) and pass that agent to `bindufy()`. Bindu sees a normal handler and doesn't care that the LLM happens to be local.

Yes. Instantiate each agent with its own model in your driver framework (Agno multi-agent team, LangGraph nodes, CrewAI crew, AutoGen group chat), wrap the orchestration logic in a single handler, and pass that handler to `bindufy()`. Bindu treats the whole composition as one agent with one DID; the model choices live entirely inside your handler.

Implement a standard `try/except` fallback loop, or use a retry library like Tenacity inside your handler function, to catch API errors and retry the prompt with a secondary model instance.

***

## Agent Communication

Because Bindu uses the standard A2A JSON-RPC protocol, you can use any HTTP client (like `httpx` or `requests`) to send a formatted POST request to the external agent's URL from inside your handler.

Every Bindu agent has a unique Decentralized Identifier (DID). Agents expose their capabilities and DID via a `.well-known/agent.json` endpoint, allowing for cryptographically verifiable discovery.

Yes. You simply write your LangChain or LangGraph execution logic inside the `handler` function that you pass to `bindufy()`.

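
To make "a formatted POST request" concrete, here is a minimal sketch that builds a `message/send` JSON-RPC body and wraps it in a standard-library request object. The helper names `build_message_send` and `build_request` are hypothetical, and generating UUIDs for the IDs is an assumption made for illustration. The field layout follows the A2A payload shape used elsewhere in this FAQ. The sketch only covers the unauthenticated case plus an optional bearer header; it does not produce the DID signature an auth-enabled agent also expects.

```python
import json
import urllib.request
import uuid


def build_message_send(text, context_id=None, task_id=None):
    """Build a JSON-RPC 2.0 `message/send` body (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
                "messageId": str(uuid.uuid4()),
                # Reuse an existing contextId to stay in the same thread.
                "contextId": context_id or str(uuid.uuid4()),
                "taskId": task_id or str(uuid.uuid4()),
            },
            "configuration": {"acceptedOutputModes": ["application/json"]},
        },
    }


def build_request(agent_url, payload, token=None):
    """Wrap the payload in a POST request without sending it."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(agent_url, data=body, headers=headers, method="POST")
```

Sending is then a matter of `urllib.request.urlopen(build_request("http://localhost:3773/", payload))`, or passing the same dictionary to `httpx` or `requests`.
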

Use the `contextId` provided in the incoming A2A message payload. Passing this ID to subsequent agents ensures they all read and write to the same conversational memory thread.

The task state transitions to `failed` and becomes immutable; refinements create a new task instead of reopening the old one. Bindu's internal scheduler and storage layers have retry-with-backoff for transient infrastructure failures (Redis hiccups, Postgres locks), but **your handler is not wrapped in automatic retries**. If you want LLM-call retries, wrap them yourself with a library like `tenacity` inside your handler. See the [Retry overview](/bindu/learn/retry/overview) for what the framework retries on your behalf.

***

## Tasks & A2A Protocol Mechanics

Three IDs, three scopes:

* **`messageId`**: one inbound or outbound message. Cheap, single-use.
* **`taskId`**: one unit of work the agent is tracking. Has a lifecycle (`submitted` → `working` → `input-required` or `completed`). The task is the thing you poll, cancel, and reference.
* **`contextId`**: a conversation thread that may contain many tasks. Use the same `contextId` across calls when you want the agent to see prior turns.

Rule of thumb: every task lives inside exactly one context. Every message lives inside exactly one task.

A2A's task immutability rule. Once a task reaches a terminal state (`completed`, `failed`, `canceled`, `rejected`), it cannot be reopened. To continue the conversation, create a **new task** in the same `contextId`. If the new task should build on prior outputs, include the old task's ID in `referenceTaskIds`. That's the explicit dependency edge an orchestrator uses to chain work.

Send another `message/send` call with the same `taskId` (still in `input-required`, not terminal yet) and the user's answer in the message body. The agent's handler will be called again with the new message appended to the task history.
Once your handler returns a non-`input-required` result, the task transitions to its terminal state.

It's how you express dependencies between tasks. When you create Task4 and need Task2 and Task3's outputs available, pass `referenceTaskIds: [task2_id, task3_id]` in the message. The agent (and any orchestrator like Sapthami) can then read the referenced tasks' artifacts when planning Task4's work. Without `referenceTaskIds`, dependencies live only in your application logic and orchestrators can't reason about them.

Every `message/send` request must include `params.configuration.acceptedOutputModes`, even if it's just `["application/json"]`. The JSON-RPC schema validator rejects the request before auth or the handler even runs. Minimum valid params:

```json
{
  "message": {
    "role": "user",
    "parts": [{"kind":"text","text":"..."}],
    "messageId": "...",
    "contextId": "...",
    "taskId": "..."
  },
  "configuration": {
    "acceptedOutputModes": ["application/json"]
  }
}
```

Yes. A2A defines a `FilePart` with two flavors: `FileWithBytes` (inline base64 payload) for small files, `FileWithUri` (presigned URL) for large ones. Your handler returns a part with `kind: "file"`, and the artifact attached to the completed task carries the file part. Clients fetch it via `tasks/get` after the task reaches `completed`.

Yes. Call `message/stream` instead of `message/send`. The agent returns a Server-Sent Events (SSE) stream of intermediate status updates and the final result. Use this when your handler produces incremental output (token-by-token LLM responses, multi-step tool calls) and you want clients to render progress instead of waiting for the whole task.

Use push notifications. Call `tasks/pushNotificationConfig/set` with a webhook URL on a task, and Bindu will POST a notification to that URL on every status transition. Webhook payloads are DID-signed by the agent so the receiver can verify origin.
See the [Notifications](/bindu/learn/notification/overview) page for the webhook contract.

With `STORAGE_TYPE=memory`, tasks live for the lifetime of the process. With `STORAGE_TYPE=postgres`, tasks persist forever unless you implement a retention policy yourself; Bindu does not auto-purge completed tasks. For production, set up a periodic job to archive or delete tasks older than your retention window (e.g., 30 or 90 days).

***

## Multi-Agent Topologies

Yes. Inside your handler, use any HTTP client (`httpx`, `requests`) to POST a JSON-RPC `message/send` to the other agent's URL. If the called agent has auth on, you need a bearer token + DID signature on the outbound call. The easiest route is to use the gateway as a signing proxy, or to use the helper in `bindu/utils/did/signature.py` to build the signed headers. Pass `contextId` through if you want the called agent to see your task's conversation history.

Every Bindu agent serves its public manifest at `/.well-known/agent.json`. To discover an agent, fetch that URL to get the agent's DID, advertised skills, supported protocols, and capabilities. There is no central registry: a fleet "exists" because each agent's URL is known to its callers (registered in the inbox, configured in the gateway, hardcoded by your orchestrator). For more sophisticated discovery, the negotiation extension (`/agent/negotiation`) lets agents bid on capability requests.

Use the **gateway** when you need plan-then-execute multi-agent workflows from outside (a client sends a high-level intent; the gateway decides who runs what). Use **handler-to-handler** calls when one agent has a deterministic dependency on another (an orchestrator agent that always delegates summarization to a summary agent, for example). Rule of thumb: gateway for dynamic, intent-driven fan-out; direct calls for static, code-driven composition.

Two patterns:

1. **Gateway-driven:** POST a single `/plan` request to the gateway with the high-level intent.
   The gateway parses, dispatches to several agents in parallel, and aggregates the results.
2. **Handler-driven:** in your orchestrator agent's handler, use `asyncio.gather` to call N peer agents concurrently, then synthesize the responses. Each peer call produces its own `taskId`; you can include all of them in `referenceTaskIds` on the synthesis task so the dependency graph is explicit.

Yes. `PostgresStorage` namespaces every row by the agent's DID. Pointing N agents at the same `DATABASE_URL` is safe and is the recommended way to run a horizontally scaled fleet: each agent has its own slice, but ops only manages one database. Tasks, contexts, and artifacts never leak across DIDs.

Yes. There's a parallel gRPC transport in `bindu/grpc/` for language-agnostic agent clients. The gRPC port is separate from the HTTP port (3773 by default for HTTP). Use gRPC when you need an agent driver in a non-Python language and don't want to hand-roll the A2A JSON-RPC contract. See the [Multi-Language Sidecar](/bindu/grpc/overview) docs.

***

## Memory & State

Bindu supports Task History (A2A message arrays), Context Memory (threaded conversations), and Agent State. It provides `InMemoryStorage` for testing and `PostgresStorage` for production.

Set `STORAGE_TYPE=postgres` and provide a `DATABASE_URL` in your environment variables. Bindu will automatically persist task histories and contexts.

Bindu delegates Retrieval-Augmented Generation (RAG) to your driver framework. Use tools like Agno's `PDFKnowledgeBase` or LangChain's vector store retrievers inside your handler logic.

Context window management (like sliding windows or token summarization) must be handled by your driver framework before passing the message array to the LLM.

***

## Tools & Integrations

Define your custom tools as standard Python functions and provide them to your driver framework's tool array.

Yes.
A skill is a directory containing either a `skill.yaml` or a `SKILL.md` file describing the capability (id, name, description, input/output schema). You pass the directory path to `load_skills([...])`, and the agent advertises each loaded skill at `/agent/skills/{skill_id}`. There is no implicit `skills/` folder the framework auto-scans; paths are registered explicitly via `bindufy()`.

Use the database or vector integrations provided by your driver framework (e.g., LangChain's `SQLDatabaseToolkit` or Agno's `PgVector`).

If your agent needs human approval, have your handler return a dictionary with `"state": "input-required"` and a `"prompt"` asking for clarification. Bindu will pause the task until the user responds.

Bindu delegates this to the driver framework. You can easily attach tools like `DuckDuckGoTools` or `ScrapeGraph` to your agent logic.

***

## Structured Outputs

Enforce JSON schemas using your driver framework (e.g., by passing a Pydantic `response_model`) and ensure your client sends `acceptedOutputModes: ["application/json"]` in the A2A request configuration.

If the underlying model does not support native JSON mode, you must prompt the model to return JSON text and manually parse/extract it in your handler function.

Use a retry library like `Tenacity` inside your handler to catch JSON parsing errors or Pydantic `ValidationError`s, and feed the error back to the LLM as a retry prompt.

***

## Rate Limiting & Cost Management

TPM limits are enforced by your LLM provider (OpenAI, Anthropic). You can resolve this by adding retry logic with exponential backoff to your handler, or switching to an enterprise tier.

Use Bindu's Redis integration by setting `SCHEDULER_TYPE=redis`. This allows you to queue tasks and control worker concurrency across distributed instances.

Request caching and token counting must be implemented inside your driver framework's handler logic before the API call is made.

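
The "retry logic with exponential backoff" suggested above for provider rate limits can be sketched with the standard library alone. This is an illustrative helper, not part of Bindu: the name `retry_with_backoff` and its parameters are hypothetical, and in real code you would narrow `retry_on` to your provider's rate-limit exception instead of the broad default.

```python
import random
import time


def retry_with_backoff(
    call,
    *,
    attempts=4,
    base_delay=1.0,
    max_delay=30.0,
    retry_on=(Exception,),
    sleep=time.sleep,
):
    """Call `call()` until it succeeds or `attempts` is exhausted.

    The wait doubles after each failure (1s, 2s, 4s, ...) up to `max_delay`,
    plus up to 10% random jitter so parallel workers don't retry in lockstep.
    """
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))
```

Inside a handler this might wrap the model call, for example `retry_with_backoff(lambda: agent.run(prompt))`, where `agent.run` stands in for whatever your driver framework exposes. The `sleep` parameter is injectable mainly so the behavior can be tested without real waiting.
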

***

## Debugging & Common Errors

Set the environment variable `LOGGING__DEFAULT_LEVEL=DEBUG` before starting your agent.

The real error class is `AuthenticationRequiredError` (code `-32009`), and it means the bearer token is missing, expired, or not recognized by the Hydra introspection endpoint the agent is pointing at. Note: Bindu uses **opaque Hydra tokens** introspected at runtime, not JWTs. The legacy error message mentions "JWT", but the actual flow goes through `/admin/oauth2/introspect`. Quick fixes in order: (1) mint a fresh token, (2) confirm your agent's `HYDRA__ADMIN_URL` matches the Hydra that issued the token, (3) for local development, set `AUTH__ENABLED=false` to bypass auth entirely. See [Making Authenticated Requests](/bindu/learn/authentication/making-requests) for the 4-gate decoder table.

These are the three DID-signature failure modes the auth middleware reports. They only appear when the bearer token's `client_id` is a DID (the Bindu default):

* **`did_mismatch`**: the `X-DID` header doesn't equal the token's `client_id`. Mint the token with the same DID you send.
* **`public_key_unavailable`**: Hydra has no `metadata.public_key` for that DID. `GET /admin/clients/` and patch the metadata.
* **`invalid_signature`**: body bytes changed between sign and send, JSON canonicalization mismatch (the JS-vs-Python whitespace gotcha), wrong seed, or clock skew >300s.

The full troubleshooting table and a canonical cross-language fixture live at [Making Authenticated Requests](/bindu/learn/authentication/making-requests).

Run `curl http://localhost:3773/health`. If this returns `200` without an `Authorization` header, the agent is up, but health is intentionally public.
To check enforcement, POST a JSON-RPC method without a token: `curl -X POST http://localhost:3773/ -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":"1","method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":"hi"}],"messageId":"x","contextId":"y","taskId":"z"},"configuration":{"acceptedOutputModes":["application/json"]}}}'`. If you get `-32009`, auth is on. If you get a normal task response, `AUTH__ENABLED` is false.

Tool looping is a model behavior issue. You can prevent it by setting a maximum tool iteration limit in your driver framework (e.g., `max_tool_iterations=5`).

This happens when system prompts are ambiguous. Ensure your tool descriptions are highly specific, and use validation inside your handler to catch and reject hallucinated tool calls.

Use Python's `unittest.mock.patch` to mock the LLM provider's response, or build a simple mock handler that returns hardcoded text if a `TEST_MODE` environment variable is true.

Set `TELEMETRY_ENABLED=true` and configure `OLTP_ENDPOINT`, `OLTP_SERVICE_NAME`, and `OLTP_HEADERS` in your environment. Bindu will automatically export traces to platforms like Langfuse, Arize, or any OTLP-compatible collector.

The env var prefix is `OLTP_*` (not `OTLP_*`) in the current code. Yes, that's a typo of the OpenTelemetry Protocol acronym, but it's load-bearing in [`bindu/utils/config/enricher.py`](https://github.com/getbindu/Bindu/blob/main/bindu/utils/config/enricher.py), so use `OLTP_*` when configuring your `.env`.

***

## Deployment

Deploy Bindu inside a Docker container, set `STORAGE_TYPE=postgres` with a `DATABASE_URL` and `SCHEDULER_TYPE=redis` with a `REDIS_URL`, and call `bindufy()` with `launch=False` (the default) so no public FRP tunnel is created. Front the container with your own load balancer + TLS, and turn auth on with `AUTH__ENABLED=true` pointing at a production Hydra.
For wire-level encryption between agents, see the [Security Stack](/bindu/learn/authentication/security-stack) page for the full mTLS env block.

Deploy multiple instances of your Bindu agent container behind a load balancer. Point all instances to the same Redis instance (`REDIS_URL`) and PostgreSQL database (`DATABASE_URL`) so they can share the task queue and memory state.

Prometheus-formatted metrics including request rate, request latency histograms, active task counts grouped by state (`submitted`, `working`, `input-required`), worker utilization, queue depth, and storage operation durations. A 15s scrape interval is a reasonable default. Combine with the OTLP traces (`TELEMETRY_ENABLED=true`) for full request-path observability.

Run at least two replicas behind a load balancer, both pointed at the same `DATABASE_URL` and `REDIS_URL`. Roll one replica at a time: drain in-flight HTTP requests via a `SIGTERM`-triggered graceful shutdown (uvicorn handles this), let the second replica pick up new tasks from the shared Redis queue, then bring the new version up. Because tasks live in Postgres and the queue lives in Redis, the rolling restart doesn't drop work in flight.

`GET /health` returns 200 with a JSON body including `application.penguin_id`, `application.agent_did`, runtime version, and storage/scheduler readiness. The `/healthz` endpoint is a stricter k8s-style readiness probe that returns 200 only when storage and scheduler are both reachable. Use `/health` for liveness, `/healthz` for readiness gates.

Yes. A typical deployment is a `Deployment` with N replicas, a `Service` for in-cluster traffic, an `Ingress` for external traffic, plus a `ConfigMap` for non-secret env (`STORAGE_TYPE=postgres`, `SCHEDULER_TYPE=redis`) and a `Secret` for credentials (`DATABASE_URL`, `REDIS_URL`, Hydra client secrets). Use `/healthz` as the readiness probe and `/health` as the liveness probe.
For mTLS, store cert files in a `Secret` mounted at `~/.bindu/` or let the agent fetch them from step-ca on boot.

***

## mTLS & Wire Security

You don't need mTLS for a single-tenant agent behind your own TLS terminator: Hydra + DID signing already prove who the caller is and that the body wasn't tampered with. You **do** need mTLS when peer agents talk to each other directly over the open internet (no shared load balancer), when bearer tokens shouldn't traverse the wire in cleartext at any layer, or when you want a cryptographic bind between the TCP socket and the DID. See [Security Stack](/bindu/learn/authentication/security-stack) for the three-layer model.

```bash
export AUTH__ENABLED=true
export AUTH__PROVIDER=hydra
export HYDRA__ADMIN_URL=https://hydra-admin.getbindu.com
export HYDRA__PUBLIC_URL=https://hydra.getbindu.com

export MTLS__ENABLED=true
export MTLS__MODE=hybrid                 # mtls + Hydra both checked
export MTLS__REQUIRE_CLIENT_CERT=false   # set true for strict mTLS
export MTLS__CA_URL=https://ca.getbindu.com
export MTLS__CA_ROOT_URL=https://ca.getbindu.com/roots.pem
```

The agent then registers with Hydra, exchanges an OIDC token at step-ca for a 24h X.509 cert, and serves uvicorn over HTTPS. Cert TTL is 24h; renewal kicks in 8h before expiry.

The #1 cause is `load_dotenv` ordering. Bindu's `app_settings` is constructed at module-import time. If your `agent.py` imports `bindu` before calling `load_dotenv()`, your `MTLS__*` env vars land in `os.environ` but never reach the settings singleton, and the agent silently falls back to HTTP. **Fix:** call `load_dotenv()` first, before any `bindu` import. Confirm by grepping the boot log for `Bootstrapping mTLS`; if it's missing, the settings never saw your env block.

Your Hydra client was registered before mTLS was enabled, so its `audience` array doesn't include `step-ca`. Recent Bindu builds reconcile this drift on every boot, so restart the agent and the registration flow patches the audience.
If you're on an older build, delete `.bindu/oauth_credentials.json` and restart to force a fresh registration.

```bash
openssl x509 \
  -in ~/.bindu/personal/.bindu/tls_cert.pem \
  -noout -subject -dates -ext subjectAltName
```

You should see the agent's DID in the SAN URI as `https://hydra.getbindu.com#did:bindu:...`. Issuer should be `CN=Bindu Intermediate CA`. Cert validity should be 24h from the renewal timestamp.

Delete the cert files and restart the agent; it regenerates them on next boot:

```bash
rm ~/.bindu/personal/.bindu/tls_*.pem ~/.bindu/personal/.bindu/ca_bundle.pem
```

There is no CRL or OCSP in this design: short TTL + renewal **is** the revocation strategy.
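
Because short TTL plus renewal stands in for revocation, the timing is worth making concrete. The sketch below works through the numbers stated in this section (24h TTL, renewal starting 8h before expiry). It is only arithmetic: the function names are hypothetical, it assumes the renewal window is measured back from the expiry time, and it is not Bindu's renewal code.

```python
from datetime import datetime, timedelta, timezone

CERT_TTL = timedelta(hours=24)            # cert lifetime stated in this FAQ
RENEW_BEFORE_EXPIRY = timedelta(hours=8)  # renewal window stated in this FAQ


def renewal_time(issued_at: datetime) -> datetime:
    """Earliest moment renewal should start: expiry minus the renewal window."""
    return issued_at + CERT_TTL - RENEW_BEFORE_EXPIRY


def needs_renewal(issued_at: datetime, now: datetime) -> bool:
    """True once `now` has entered the renewal window."""
    return now >= renewal_time(issued_at)


issued = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
print(renewal_time(issued).isoformat())  # 16 hours after issuance
```

Under these assumptions a healthy agent replaces its cert roughly every 16 hours, and a leaked cert stops being accepted at most 24 hours after it was issued, which is the exposure window this design trades for not running a CRL or OCSP responder.
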