Budget about 45 minutes if you’re reading straight through and running the commands. If you skip the commands and just read, ~15 minutes. No prior knowledge of AI agents required — we’ll introduce each idea when you need it, and never before.
You’ve heard the words. Agent. Planner. A2A. Multi-agent orchestration. By the end of this section you’ll have run all of those things yourself, watched them talk to each other, and taught them a new trick.

The roadmap

  • Why a gateway exists: The problem an orchestrator solves — and what it is not.
  • Hello, gateway: Seven steps. One question. One agent. One streaming response.
  • Adding a second agent: Three agents, one question — watch the planner choose the order.
  • Recipes: Teach the planner reusable patterns in plain markdown.
  • DID signing: Cryptographic identity for production peer calls.
  • Going to production: What to read, what to try, what to pin down.

Why a gateway exists

Imagine you’ve built three AI agents. Each is a small program that listens on an HTTP port and answers specific kinds of questions:

  • Research: Searches the web for facts.
  • Math: Solves numerical problems.
  • Poet: Writes short verse.
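Before going further, it helps to see how small such an agent can be. This is a toy sketch, not Bindu's actual API: just a plain HTTP handler that accepts a JSON body and returns a JSON answer. The `solve` helper and the `{"question": ...}` payload shape are made up for this illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def solve(expression: str) -> float:
    # Extremely naive "solver" for the demo: bare arithmetic only, no builtins.
    return eval(expression, {"__builtins__": {}})

class MathAgent(BaseHTTPRequestHandler):
    """Toy 'math' agent: POST {"question": "<arithmetic>"} in, {"answer": <number>} out."""
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        payload = json.dumps({"answer": solve(body["question"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)
    def log_message(self, *args):  # keep the demo quiet
        pass

# Serve on any free port in a background thread, then call it once.
server = HTTPServer(("127.0.0.1", 0), MathAgent)
threading.Thread(target=server.serve_forever, daemon=True).start()

req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}",
    data=json.dumps({"question": "36950000 * 0.5 / 100"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["answer"])  # 184750.0
```

Three of these, each on its own port with its own specialty, is the whole roster.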
Now a user asks:
“Look up the population of Tokyo, then calculate 0.5% of it, then write a four-line poem about that number of people.”
Without a gateway, you — the programmer — have to:
  1. Decide the question needs all three agents. Read intent, split the work.
  2. Call the research agent first. Wait for the reply.
  3. Parse the answer to extract '36.95 million'. Fragile string munging.
  4. Pass that to the math agent. Wait again.
  5. Parse '184,750'. Hope the format didn't change.
  6. Pass that to the poet agent. Wait one last time.
  7. Collect and return the final poem. Write the user-facing response yourself.
That’s not hard for one question. But what about the next hundred questions? Each one needs its own chain, its own parsing, its own error handling. And as soon as a new agent joins the roster, every existing chain might want to use it.
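To make the brittleness concrete, here is the manual chain written out, with the three agents stubbed as local functions (in reality each call would be an HTTP request, but the parsing problems are identical). Every name and reply format below is invented for the sketch.

```python
# The seven manual steps, with the three agents stubbed as local functions.

def research_agent(q):  # returns free-form prose you then have to parse
    return "The population of Tokyo is about 36.95 million."

def math_agent(q):
    return "184,750"

def poet_agent(q):
    return "A four-line poem about 184,750 people..."

# Step 1: you, the programmer, hard-code the chain for this one question.
# Step 2: call the research agent first.
prose = research_agent("population of Tokyo")
# Step 3: fragile string munging to pull out the number.
population = float(prose.split("about ")[1].split(" million")[0]) * 1_000_000
# Step 4: hand the number to the math agent.
count = math_agent(f"what is 0.5% of {population:.0f}?")
# Step 5: parse again, hoping the format holds.
people = int(count.replace(",", ""))
# Steps 6-7: ask the poet, then assemble the user-facing response yourself.
poem = poet_agent(f"write a four-line poem about {people} people")
print(people)  # 184750
```

If the research agent ever says "roughly 37 million" instead, step 3 breaks. That fragility, multiplied across every question, is what the gateway removes.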
The gateway is the thing that does steps 1–7 for you. You hand it a question and a list of agents. It figures out which agents to call, in what order, with what input. You get back a stream of what happened and, at the end, a final answer.

How does it “figure it out”?

The gateway has one trick: it uses an LLM — a large language model, like Claude or GPT — as a planner. The planner sees:
  • The user’s question.
  • A short description of each available agent.
  • Its own system prompt (general instructions the gateway operator wrote).
Then it decides, turn by turn, which agent to call next. The output of each call feeds back into the planner’s context, and it decides whether to call another agent, write a final answer, or ask the user a clarifying question.
Modern LLMs are surprisingly good at this. Anthropic calls it tool use, OpenAI calls it “function calling” — same idea. The gateway wires your agents up as “tools” the planner can invoke and lets the LLM drive.
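The loop this describes can be sketched in a few lines. The planner below is faked with a canned script so the example runs offline; in a real gateway that function would be an LLM tool-use call, passing the agent descriptions and the transcript so far. All names here are illustrative, not gateway API.

```python
# Sketch of the planner loop: ask the planner what to do, run the chosen
# agent, feed the result back, repeat until the planner emits a final answer.

AGENTS = {
    "research": lambda q: "Tokyo's population is about 36.95 million.",
    "math":     lambda q: "0.5% of 36,950,000 is 184,750.",
    "poet":     lambda q: "One hundred eighty-four thousand souls...",
}

# Canned "planner decisions"; a real planner would be an LLM deciding live.
SCRIPT = iter([
    ("call", "research", "population of Tokyo"),
    ("call", "math", "0.5% of 36,950,000"),
    ("call", "poet", "poem about 184,750 people"),
    ("final", None, "Here is your poem: ..."),
])

def fake_planner(transcript):
    return next(SCRIPT)

transcript = [("user", "Look up Tokyo's population, take 0.5%, write a poem.")]
while True:
    kind, agent, text = fake_planner(transcript)
    if kind == "final":
        print(text)
        break
    result = AGENTS[agent](text)        # invoke the chosen agent ("tool")
    transcript.append((agent, result))  # its output feeds the next decision

print(len(transcript))  # 4: the user's question plus three agent results
```

Note that the control flow lives in the loop, not in your code: add a fourth agent to `AGENTS` and the planner can start using it without any chain being rewritten.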

What the gateway is not

It doesn’t generate answers itself; it orchestrates the agents you already have.
You give it a list of agents per request. The agents run wherever they run — your laptop, a cluster, a third-party service. The gateway just calls them.
As long as each agent speaks A2A (a small JSON-RPC 2.0 protocol), the gateway can call it. The Bindu team authored A2A, and bindufy()-built agents speak it out of the box.
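To give a feel for what "speaks A2A" means on the wire, here is a rough JSON-RPC 2.0 envelope of the kind a gateway might POST to an agent. The method name and params layout are illustrative guesses, not the authoritative A2A schema; consult the spec for the real field names.

```python
import json

# Rough shape of a JSON-RPC 2.0 call to an agent. Field names inside
# "params" are illustrative only.
request = {
    "jsonrpc": "2.0",
    "id": "req-1",
    "method": "message/send",  # assumed method name for illustration
    "params": {
        "message": {
            "role": "user",
            "parts": [{"kind": "text", "text": "What is 0.5% of 36,950,000?"}],
        }
    },
}
print(json.dumps(request, indent=2))
```

The JSON-RPC 2.0 part is the fixed scaffolding (`jsonrpc`, `method`, `params`, `id`); A2A's contribution is standardizing what goes inside `params` and the result, so any compliant gateway can call any compliant agent.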

What you’ll build by the end

  • Chapter 3: Three agents running locally, chained automatically to answer a multi-part question.
  • Chapter 4: A recipe — a short markdown file that teaches the planner a reusable pattern without writing any code.
  • Chapter 5: A cryptographic identity — outbound calls get signed so downstream agents can verify the calls are really from your gateway.
Let’s go. Next up: Hello, gateway.