# Open a plan; stream SSE of the orchestration.

> Accepts a user question + agent catalog, starts (or resumes) a
session, and streams Server-Sent Events as the planner runs.

### Session continuation

Pass `session_id` to resume an existing session: history persists and
the planner sees prior turns. Omit it to start a fresh session. The
server returns the resolved `session_id` in the first SSE frame
(`event: session`), even for new sessions, so clients can cache it.
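
A minimal client-side sketch of the continuation flow, in Python. The helper names `build_plan_body` and `read_session_frame` are hypothetical (they are not part of any Bindu SDK), and the frame parsing assumes the `event:` / `data:` layout shown in the `happyPath` example of the spec on this page.

```python
import json


def build_plan_body(question, agents=None, session_id=None):
    """Build a /plan JSON body. `session_id` is included only when resuming."""
    body = {"question": question, "agents": agents or []}
    if session_id is not None:
        body["session_id"] = session_id
    return body


def _field_value(line, name):
    # Drop "<name>:" and, per the SSE format, at most one leading space.
    value = line[len(name) + 1:]
    return value[1:] if value.startswith(" ") else value


def read_session_frame(frame):
    """Parse one SSE frame; return its JSON payload if it is a `session` event."""
    event, data_lines = None, []
    for line in frame.splitlines():
        if line.startswith("event:"):
            event = _field_value(line, "event")
        elif line.startswith("data:"):
            data_lines.append(_field_value(line, "data"))
    if event != "session":
        return None
    return json.loads("\n".join(data_lines))
```

A client could pass the first frame of each stream to `read_session_frame` and store the returned identifiers next to the conversation, then supply the cached value to `build_plan_body` on the next turn.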

### Catalog refresh per session

The `agents` catalog is stored on first plan and refreshed on each
subsequent call; agents added or removed between plans take effect
immediately but don't retroactively change prior turns' tool sets.
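
The refresh semantics can be pictured with a small in-memory model. This is only an illustration of the behavior described above, not the gateway's storage code: `SessionModel` and `tool_ids` are hypothetical names, and the `call_<name>_<skillId>` pattern is taken from the `AgentRequest.name` description in the spec on this page, applied here without any normalization.

```python
def tool_ids(agents):
    """Derive one tool id per (agent, skill) pair in the supplied catalog."""
    return sorted(
        f"call_{agent['name']}_{skill['id']}"
        for agent in agents
        for skill in agent.get("skills", [])
    )


class SessionModel:
    """Toy model: each plan records an immutable snapshot of its tool set."""

    def __init__(self):
        self.turns = []  # one tuple of tool ids per plan, oldest first

    def plan(self, agents):
        # The catalog sent with this call decides this turn's tools.
        snapshot = tuple(tool_ids(agents))
        self.turns.append(snapshot)
        return snapshot
```

Calling `plan` twice with different catalogs leaves the first entry of `turns` untouched, which mirrors the rule that later catalog changes do not rewrite earlier turns.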

### Streaming & abort

Closing the HTTP connection aborts the plan — in-flight A2A calls
receive an `AbortSignal` and the planner loop terminates. Clients
that want a partial result should buffer `text.delta` frames
client-side rather than relying on `final`.
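
One way to keep a partial result is to accumulate deltas as they arrive. The sketch below assumes the caller has already split the stream into `(event, data)` pairs and that `text.delta` payloads carry a `delta` string, as in the `happyPath` example of the spec on this page; `PlanBuffer` is a hypothetical helper name.

```python
import json


class PlanBuffer:
    """Accumulates streamed planner text so an aborted plan still yields output."""

    def __init__(self):
        self.chunks = []   # text.delta fragments, in arrival order
        self.final = None  # payload of the `final` event, if one arrived
        self.done = False  # True once `event: done` was seen

    def feed(self, event, data):
        payload = json.loads(data) if data else {}
        if event == "text.delta":
            self.chunks.append(payload.get("delta", ""))
        elif event == "final":
            self.final = payload
        elif event == "done":
            self.done = True

    @property
    def answer(self):
        """Text received so far; it is complete only when `done` is True."""
        return "".join(self.chunks)
```

If the connection closes before `done` arrives, `answer` still holds whatever text was streamed, and `done` being `False` signals that the result is partial.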




## OpenAPI

````yaml /gateway-openapi.yaml post /plan
openapi: 3.1.0
info:
  title: Bindu Gateway API
  version: 1.0.0
  summary: >-
    External HTTP surface of the Bindu Gateway — a task-first orchestrator that
    plans over a caller-supplied catalog of A2A agents.
  description: >
    # Bindu Gateway API


    The **Bindu Gateway** sits between an external system (your app, a custom

    frontend, another service) and one or more **Bindu A2A agents**. It takes

    a user question + an agent catalog and returns a streaming plan: the

    gateway's planner LLM decomposes the request, invokes A2A agents via the

    polling protocol, and emits Server-Sent Events in real time.


    Distinct from the per-agent **Bindu Agent API** (see the repo-root

    `openapi.yaml`), which describes what a single `bindufy()`-built agent

    exposes. This spec documents the **gateway** — the orchestrator sitting

    one layer up.


    ---


    ## Mental model: one endpoint, many turns


    Every orchestration goes through `POST /plan`. Inside, the planner LLM

    runs an agentic loop — it calls A2A agents as tools, the results feed

    back into the LLM, and the loop continues up to `max_steps` or until the

    plan resolves.


    Two auxiliary endpoints support health probing and DID-based peer

    authentication:


    | Path | Purpose |

    |---|---|

    | `POST /plan` | Open a new plan or resume an existing session. Streams SSE.
    |

    | `GET /health` | Liveness + cheap config probe. |

    | `GET /.well-known/did.json` | The gateway's own DID document (only when a
    DID identity is configured via env). |


    ---


    ## Request shape


    A `/plan` request carries three things:


    1. **`question`** — the user's natural-language input.

    2. **`agents[]`** — the catalog of A2A peers the planner may call, each
       with an endpoint, authentication descriptor, and list of skills.
       The gateway does **not** host agents; the caller is always the
       source of truth for "what can we reach."
    3. **`preferences`** and **`session_id`** (both optional) — caps and
       continuation handles.

    The shape is stable and additive; unknown top-level keys are accepted

    (forward-compatible `.passthrough()`), but `preferences` keys are strict

    snake_case. Clients sending camelCase preferences will have them

    silently dropped — match the schema below.


    ---


    ## Response shape — Server-Sent Events


    The happy path returns `200 OK` with `Content-Type: text/event-stream`.

    Errors surface differently depending on when they occur:


    - **Before streaming starts** (auth failure, invalid JSON, malformed
      request, session creation failure): `401`/`400`/`500` with a JSON
      `{ error, detail? }` body.
    - **During streaming** (planner or tool failure): a single
      `event: error` SSE frame, followed by `event: done`.
    - **Never silent**: every plan that streams, including one that
      emitted `event: error`, closes with `event: done` (empty payload).
      Consumers should treat the absence of `done` as an incomplete
      stream.

    SSE events emitted during a plan, in typical order:


    | Event | When | Purpose |

    |---|---|---|

    | `session` | Once, before the plan starts | Carries session identifiers so
    clients can correlate. |

    | `plan` | Once, when the planner starts its first turn | Announces
    `plan_id`. |

    | `text.delta` | Many (streaming planner output) | Incremental text chunks
    for the final assistant message. |

    | `task.started` | Per A2A tool call | The planner decided to call a peer
    agent. |

    | `task.artifact` | Per A2A tool call | The peer returned an artifact,
    wrapped in a `<remote_content>` envelope. |

    | `task.finished` | Per A2A tool call | Terminal state of the peer call. |

    | `final` | Once, at the end | Stop reason + usage counters. |

    | `error` | Only on failure during streaming | Human-readable message. |

    | `done` | Always last | Empty marker so clients can close cleanly. |


    ---


    ## Recipes (internal)


    The gateway supports **progressive-disclosure recipes** — markdown

    playbooks the planner lazy-loads when a task matches (e.g.,

    "multi-agent research", "payment-required flow"). Recipes are

    operator-authored and not part of this HTTP API surface: they live in

    `gateway/recipes/` and are injected automatically into the planner's

    system prompt as metadata, with the body fetched on demand via an

    internal `load_recipe` tool.


    You cannot upload, list, or invoke recipes via the HTTP API; they

    influence the planner's behavior transparently. See the gateway README

    §Recipes for authoring details.


    ---


    ## A2A protocol pass-through


    The gateway speaks A2A (JSON-RPC 2.0 over HTTP) to every peer in

    `agents[]` — `message/send` + `tasks/get` polling, with DID signature

    verification when configured. A2A task states (`submitted`, `working`,

    `input-required`, `auth-required`, `payment-required`, `completed`,

    `failed`, `canceled`) flow through to the planner; terminal states

    become `task.finished` events, non-terminal states can surface as

    planner text or trigger recipe-based handling (e.g., surfacing a

    `payment-required` URL to the user).


    See the Bindu Agent API spec (`openapi.yaml` at the repo root) for the

    full A2A protocol surface.
  contact:
    name: Bindu Team
    url: https://docs.getbindu.com/
  license:
    name: Apache-2.0
servers:
  - url: http://localhost:3774
    description: Local development (default port)
  - url: https://gateway.example.com
    description: Production deployment (replace with your host)
security: []
tags:
  - name: Plan
    description: |
      Open a new plan or resume an existing session. Server-Sent Events
      stream back the planner's turn-by-turn output, tool calls, and
      final answer.
  - name: Health
    description: Liveness and basic configuration probes.
  - name: Identity
    description: |
      The gateway's self-published DID document, for A2A peers that
      need to verify `did_signed` outbound calls. Only exposed when
      the gateway has a DID identity configured via env.
paths:
  /plan:
    post:
      tags:
        - Plan
      summary: Open a plan; stream SSE of the orchestration.
      description: |
        Accepts a user question + agent catalog, starts (or resumes) a
        session, and streams Server-Sent Events as the planner runs.

        ### Session continuation

        Pass `session_id` to resume an existing session: history persists and
        the planner sees prior turns. Omit it to start a fresh session. The
        server returns the resolved `session_id` in the first SSE frame
        (`event: session`), even for new sessions, so clients can cache it.

        ### Catalog refresh per session

        The `agents` catalog is stored on first plan and refreshed on each
        subsequent call; agents added or removed between plans take effect
        immediately but don't retroactively change prior turns' tool sets.

        ### Streaming & abort

        Closing the HTTP connection aborts the plan — in-flight A2A calls
        receive an `AbortSignal` and the planner loop terminates. Clients
        that want a partial result should buffer `text.delta` frames
        client-side rather than relying on `final`.
      operationId: postPlan
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/PlanRequest'
            examples:
              minimal:
                summary: Simplest possible plan (no agents)
                value:
                  question: What's the capital of France?
              singleAgent:
                summary: One agent with one skill, no auth
                value:
                  question: Find 3 recent papers on LLM evaluation.
                  agents:
                    - name: research
                      endpoint: http://localhost:3773
                      auth:
                        type: none
                      skills:
                        - id: search
                          description: Web search.
              multiAgentDIDSigned:
                summary: Two agents, DID-signed auth, session continuation
                value:
                  session_id: client-session-42
                  question: >-
                    Compare AWS and GCP pricing for a 5-node Kubernetes cluster;
                    then summarize for a non-technical audience.
                  agents:
                    - name: pricing
                      endpoint: https://pricing.example.com
                      auth:
                        type: did_signed
                      trust:
                        verifyDID: true
                        pinnedDID: did:bindu:pricing-agent-key-1
                      skills:
                        - id: compare
                          description: Compare cloud pricing.
                          inputSchema:
                            type: object
                            properties:
                              provider_a:
                                type: string
                              provider_b:
                                type: string
                              workload:
                                type: string
                            required:
                              - provider_a
                              - provider_b
                              - workload
                    - name: summarizer
                      endpoint: https://summarize.example.com
                      auth:
                        type: bearer_env
                        envVar: SUMMARIZER_TOKEN
                      skills:
                        - id: summarize
                          description: Summarize text for a target audience.
                  preferences:
                    max_steps: 8
                    timeout_ms: 60000
      responses:
        '200':
          description: |
            SSE stream of the plan. Each event is one of the types
            documented under `SSEEvent` below. The stream closes after
            `event: done`.
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/SSEStream'
              examples:
                happyPath:
                  summary: Plan with one tool call and a final answer
                  value: >
                    event: session

                    data:
                    {"session_id":"s_01H...","external_session_id":"client-session-42","created":true}


                    event: plan

                    data: {"plan_id":"m_01H...","session_id":"s_01H..."}


                    event: task.started

                    data:
                    {"task_id":"call_01H...","agent":"research","agent_did":null,"skill":"search","input":{"input":"Find
                    3 recent papers on LLM evaluation."}}


                    event: task.artifact

                    data:
                    {"task_id":"call_01H...","agent":"research","agent_did":null,"content":"<remote_content
                    agent=\"research\" verified=\"unknown\">Paper A ...\nPaper B
                    ...\nPaper C
                    ...</remote_content>","title":"@research/search"}


                    event: task.finished

                    data:
                    {"task_id":"call_01H...","agent":"research","agent_did":null,"state":"completed"}


                    event: text.delta

                    data:
                    {"session_id":"s_01H...","part_id":"p_01H...","delta":"Here
                    are three recent papers on LLM evaluation:\n\n"}


                    event: final

                    data:
                    {"session_id":"s_01H...","stop_reason":"stop","usage":{"inputTokens":1820,"outputTokens":312,"totalTokens":2132,"cachedInputTokens":0}}


                    event: done

                    data: {}
        '400':
          description: |
            Malformed JSON, missing required fields, schema validation
            failure, or a catalog that would produce colliding tool ids
            (two entries whose `<agent>_<skill>` combination normalizes
            to the same value). Before this guard existed, such collisions
            were silently swallowed, which let one peer mask another.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              examples:
                missingField:
                  summary: Schema validation failure
                  value:
                    error: invalid_request
                    detail: 'question: Required; question must be a non-empty string'
                collidingToolIds:
                  summary: Two catalog entries produce the same normalized tool id
                  value:
                    error: invalid_request
                    detail: >-
                      agents catalog has colliding tool ids — toolId
                      "call_research_search" produced by: research/search,
                      research/search
        '401':
          description: Missing or invalid bearer token.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error: unauthorized
        '500':
          description: |
            Session creation failed (database unreachable, Supabase row
            insertion error, etc.). Only emitted **before** the SSE stream
            opens — once streaming starts, errors surface as `event: error`
            on the stream.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error: session_failed
                detail: 'Supabase insert failed: connection refused'
      security:
        - bearerAuth: []
components:
  schemas:
    PlanRequest:
      type: object
      additionalProperties: true
      required:
        - question
      properties:
        question:
          type: string
          minLength: 1
          description: |
            The user's natural-language question. Must be non-empty: some
            LLM providers (Anthropic) reject empty user messages with a 400
            mid-stream, which surfaces as a vague "Provider returned
            error". Validating at the API boundary gives a clean 400 with
            `invalid_request` instead.
          example: Summarize the latest quarterly results for Apple.
        agents:
          type: array
          default: []
          description: |
            Catalog of A2A peers the planner may call. Empty array =
            planner runs with no tools (useful for questions the
            configured planner LLM can answer on its own, e.g.,
            general knowledge).
          items:
            $ref: '#/components/schemas/AgentRequest'
        preferences:
          $ref: '#/components/schemas/PlanPreferences'
        session_id:
          type: string
          description: |
            Opaque external session identifier. If provided AND a session
            row exists with the matching `external_session_id`, that
            session is resumed (history persists). If omitted or
            unmatched, a new session is created and its server-assigned
            id is surfaced in the first SSE `session` event.
          example: client-session-42
    SSEStream:
      type: string
      description: |
        The `text/event-stream` body is a sequence of `event:` / `data:`
        pairs. Each `data:` value is a JSON object matching one of the
        `SSEEvent_*` schemas below. OpenAPI doesn't model SSE natively;
        `$ref` the per-event schemas to generate typed consumers.
    ErrorResponse:
      type: object
      required:
        - error
      properties:
        error:
          type: string
          enum:
            - unauthorized
            - invalid_request
            - session_failed
          description: Machine-readable error code.
        detail:
          type: string
          description: >-
            Human-readable explanation. Absent for `unauthorized` (don't leak
            whether a token matched any configured value).
    AgentRequest:
      type: object
      required:
        - name
        - endpoint
      properties:
        name:
          type: string
          description: |
            Display name of the peer. Used to derive the tool id exposed
            to the planner LLM (`call_<name>_<skillId>`) and to correlate
            SSE events back to the catalog entry. Operator-chosen and
            potentially collision-prone — use `trust.pinnedDID` for a
            cryptographically stable identifier.
          example: research
        endpoint:
          type: string
          format: uri
          description: |
            Absolute HTTP(S) URL where the peer's A2A endpoint is
            reachable. The gateway POSTs JSON-RPC envelopes here for
            `message/send` and `tasks/get`.
          example: http://localhost:3773
        auth:
          $ref: '#/components/schemas/PeerAuth'
        trust:
          $ref: '#/components/schemas/PeerTrust'
        skills:
          type: array
          default: []
          description: |
            Peer capabilities the planner may invoke. Each becomes one
            dynamic tool scoped to this request. The gateway does NOT
            discover skills from the peer's `AgentCard` — the caller
            declares them, ensuring the planner sees only capabilities
            the caller vouches for.
          items:
            $ref: '#/components/schemas/SkillRequest'
    PlanPreferences:
      type: object
      additionalProperties: true
      description: |
        Caps and shaping hints. All keys are **snake_case**. An earlier
        draft declared them camelCase, which caused docs-compliant clients
        to silently lose the caps. The schema is now strict on casing;
        unknown keys pass through via `additionalProperties: true`
        for forward compatibility.
      properties:
        response_format:
          type: string
          description: |
            Advisory hint for the planner's final-message format
            (`"markdown"`, `"plain"`, `"json"`, etc.). Not enforced by
            the gateway; the planner may honor or ignore it.
        max_hops:
          type: integer
          minimum: 1
          description: |
            Maximum number of A2A hops (recursive peer-to-peer calls)
            the gateway allows. Enforced from Phase 2 onward; currently
            informational.
        timeout_ms:
          type: integer
          minimum: 1000
          maximum: 21600000
          description: |
            Overall wall-clock budget for the `/plan` call, in
            milliseconds. Applies to the entire planner loop including
            LLM calls, compaction, and every downstream peer call
            combined. When the budget expires, in-flight peer polls
            are aborted and a best-effort `tasks/cancel` is dispatched
            to each peer; the gateway then returns
            `BinduError(-32040, AbortedByCaller)` with
            `data.reason = "deadline"`.

            Default when unset: **1,800,000** ms (30 minutes).
            Minimum: 1,000 ms. Maximum: 21,600,000 ms (6 hours).
            Requests above the ceiling are rejected at the API
            boundary as `invalid_request`. Callers with genuine
            multi-hour workloads should set the value explicitly.
          example: 1800000
        max_steps:
          type: integer
          minimum: 1
          description: |
            Maximum agentic loop steps. Overrides the planner agent's
            default (`agent.steps`). A "step" is one LLM call — tool
            calls inside a step don't count.
          example: 8
    PeerAuth:
      description: |
        How the gateway authenticates its outbound calls to this peer.
        Discriminated on `type`:

        - `none` — anonymous; peer must accept unauthenticated calls.
        - `bearer` — static token passed literally in `Authorization`.
          Caller includes the secret in the request, so only use over TLS.
        - `bearer_env` — gateway reads the token from the named env var.
          Keeps secrets off the wire; rotating the token requires a restart.
        - `did_signed` — gateway signs the request body with its
          configured Ed25519 identity and attaches an OAuth2 token. By
          default uses the gateway's own auto-acquired Hydra token;
          pass `tokenEnvVar` to use a per-peer federated token.
      oneOf:
        - $ref: '#/components/schemas/PeerAuth_None'
        - $ref: '#/components/schemas/PeerAuth_Bearer'
        - $ref: '#/components/schemas/PeerAuth_BearerEnv'
        - $ref: '#/components/schemas/PeerAuth_DidSigned'
      discriminator:
        propertyName: type
        mapping:
          none:
            $ref: '#/components/schemas/PeerAuth_None'
          bearer:
            $ref: '#/components/schemas/PeerAuth_Bearer'
          bearer_env:
            $ref: '#/components/schemas/PeerAuth_BearerEnv'
          did_signed:
            $ref: '#/components/schemas/PeerAuth_DidSigned'
    PeerTrust:
      type: object
      description: |
        Per-peer trust policy. Both fields are optional; omitting both
        means "trust the peer's identity at face value — don't verify."
      properties:
        verifyDID:
          type: boolean
          description: |
            When true, the gateway verifies every Ed25519 signature on
            artifacts returned by this peer. Mismatched signatures fail
            the task. Requires a resolvable DID on the peer.
        pinnedDID:
          type: string
          description: |
            DID the peer is expected to present. Used both for
            correlation (SSE `agent_did`) and, when `verifyDID` is true,
            to reject responses signed by a different key.
          example: did:bindu:research-agent-key-1
    SkillRequest:
      type: object
      required:
        - id
      properties:
        id:
          type: string
          description: |
            The skill id the A2A peer recognizes. Passed back to the
            peer inside `message/send` so it can route to the right
            internal handler.
          example: search
        description:
          type: string
          description: |
            Human-readable description. The planner LLM relies heavily
            on this to decide whether to invoke the skill — write 3–4
            sentences covering intent, inputs, outputs, and when to use
            it. Descriptions under 120 chars are auto-padded server-side
            with agent/skill context so the LLM still gets enough
            signal.
          example: Search the open web and return a ranked list of passages.
        inputSchema:
          description: |
            Optional JSON Schema for structured inputs. When present,
            the planner LLM emits a JSON object matching this shape
            and the gateway forwards it as the message text (serialized).
            When omitted, the planner sends a plain-text `input` string.
          type: object
          additionalProperties: true
        outputModes:
          type: array
          items:
            type: string
          description: |
            Advisory list of output MIME-like hints the peer may return
            (e.g., `text/plain`, `application/json`). Surfaced in the
            tool description so the planner knows what to expect back.
          example:
            - text/plain
            - application/json
        tags:
          type: array
          items:
            type: string
          description: |
            Free-form tags — helps the planner disambiguate when
            multiple peers expose similarly-named skills.
          example:
            - research
            - web
    PeerAuth_None:
      type: object
      required:
        - type
      properties:
        type:
          type: string
          enum:
            - none
    PeerAuth_Bearer:
      type: object
      required:
        - type
        - token
      properties:
        type:
          type: string
          enum:
            - bearer
        token:
          type: string
          description: 'Literal bearer token to include in `Authorization: Bearer <token>`.'
    PeerAuth_BearerEnv:
      type: object
      required:
        - type
        - envVar
      properties:
        type:
          type: string
          enum:
            - bearer_env
        envVar:
          type: string
          description: >-
            Name of the env var on the gateway process whose value is the bearer
            token.
          example: PEER_A_TOKEN
    PeerAuth_DidSigned:
      type: object
      required:
        - type
      properties:
        type:
          type: string
          enum:
            - did_signed
        tokenEnvVar:
          type: string
          description: |
            Optional. Env var name for a pre-acquired OAuth2 token to pair
            with the DID signature. Omit to use the gateway's own Hydra
            auto-acquired token (requires `BINDU_GATEWAY_HYDRA_*` env).
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: opaque
      description: >
        Shared-secret bearer token(s) configured via
        `config.gateway.auth.tokens`.

        Validated in constant time against a SHA-256 hash of each configured

        token, so neither timing nor length leaks which token matched. Set

        `gateway.auth.mode: "none"` in config to disable bearer auth

        (not recommended outside of localhost).

````
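
Because camelCase preference keys are silently ignored and out-of-range values are rejected, a client may want to check `preferences` before sending. The following is a sketch of such a pre-flight check in Python, using only the bounds stated in the `PlanPreferences` schema above. The function name `check_preferences` and the camelCase heuristic are assumptions, not gateway behavior.

```python
import re

# Bounds copied from the PlanPreferences schema.
TIMEOUT_MIN_MS = 1_000
TIMEOUT_MAX_MS = 21_600_000
KNOWN_KEYS = {"response_format", "max_hops", "timeout_ms", "max_steps"}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def check_preferences(prefs):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key in prefs:
        # Heuristic: a lowercase letter followed by an uppercase one.
        if key not in KNOWN_KEYS and re.search(r"[a-z][A-Z]", key):
            problems.append(f"{key}: looks camelCase; the schema uses snake_case")
    timeout = prefs.get("timeout_ms")
    if timeout is not None and not (
        _is_int(timeout) and TIMEOUT_MIN_MS <= timeout <= TIMEOUT_MAX_MS
    ):
        problems.append(
            f"timeout_ms: must be an integer in [{TIMEOUT_MIN_MS}, {TIMEOUT_MAX_MS}]"
        )
    for key in ("max_steps", "max_hops"):
        value = prefs.get(key)
        if value is not None and not (_is_int(value) and value >= 1):
            problems.append(f"{key}: must be an integer >= 1")
    return problems
```

Running the check on the client keeps a mistyped `maxSteps` from being dropped unnoticed; the gateway remains the authority on what is actually accepted.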