The Problem
You built a great agent in TypeScript. It uses the OpenAI SDK, calls GPT-4o, and handles multi-turn conversations. But turning it into a real microservice means rebuilding identity, authentication, payments, scheduling, storage, and the A2A protocol. That is the expensive part. And if the next team wants Kotlin or Rust, you do it all again.
The Sidecar Model
Bindu solves that with a sidecar architecture. You write the agent logic, the driver of the agent. The Bindu Core runs beside it like the engine, handling infrastructure you should not have to reimplement. You call bindufy(), write a handler, and get the full Bindu stack.
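To make the split concrete, here is a minimal sketch of the developer's side, written in Python for illustration (the docs show TypeScript and Kotlin SDKs with the same shape). The handler signature, message format, and the commented `bindufy` call are assumptions, not the real API.

```python
# Hypothetical sketch: the developer owns only the handler.
# The message shape [{"role": ..., "content": ...}] mirrors the
# example payload shown later in this document.

def handler(messages: list[dict]) -> str:
    """Take the conversation so far, return a reply."""
    last = messages[-1]["content"]
    return f"Echo: {last}"

# In a real agent, one call would hand everything else (DID, auth,
# x402, scheduling, storage, A2A) to the Bindu Core, e.g.:
# bindufy(config={"name": "echo-agent"}, handler=handler)
```

Everything below the handler boundary is the engine's job, which is the point of the sidecar split.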
The Big Picture
Once the sidecar idea clicks, the rest of the model becomes easier to follow. One process owns the agent logic. The other owns the infrastructure.
Why Two Processes?
Because the alternative is worse. Option A is to rewrite the Bindu Core in every language: DID, auth, x402, scheduler, storage, and the A2A protocol in TypeScript, then Kotlin, then Rust. Every bug gets fixed multiple times. Option B is to keep one core, connect to it over a wire, and let thin SDKs translate between the developer and the core. Bindu chooses Option B. The sidecar is the boundary, and gRPC is the wire.
What Actually Happens
At runtime, the sidecar model turns into a short sequence of concrete steps.
1. The SDK starts the Bindu Core as a child process
The Python core handles DID, auth, x402, scheduling, storage, and the HTTP server. The developer does not run a separate service manually. The SDK detects how to launch it and spawns it.
2. The SDK registers the agent over gRPC
It sends the config, skills, and callback address to the core. The core runs the full bindufy pipeline and starts an A2A HTTP server.
3. When messages arrive, the core calls the SDK’s handler over gRPC
A client sends an A2A message to :3773. The core receives it, builds the task context, and calls manifest.run(messages). For a gRPC agent, that becomes a HandleMessages call back to the SDK process.
Two Services, Two Directions
The sidecar pattern only works because calls move both ways. The SDK talks to the core during registration. The core talks back to the SDK during execution.
BinduService (lives in the Python core on :3774)
The SDK calls this to register and manage its agent:
| Method | What it does |
|---|---|
| RegisterAgent | “Here is my config, skills, and callback address. Turn me into a microservice.” |
| Heartbeat | “I am still alive.” (every 30 seconds) |
| UnregisterAgent | “I am shutting down. Clean up.” |
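The 30-second Heartbeat above implies a liveness loop on the SDK side. Here is a sketch of what that loop could look like; `send` stands in for the real gRPC stub call, and everything except the 30-second default is an assumption.

```python
import threading

class HeartbeatLoop:
    """Sketch of an SDK-side liveness loop: call the core's Heartbeat
    RPC every `interval` seconds until stopped. `send` is any callable
    standing in for the generated gRPC stub method."""

    def __init__(self, send, interval=30.0):
        self.send = send
        self.interval = interval
        self._stop = threading.Event()

    def run(self, max_beats=None):
        """Send heartbeats until stop() is called (or max_beats for tests)."""
        sent = 0
        while not self._stop.wait(self.interval):
            self.send()  # "I am still alive."
            sent += 1
            if max_beats is not None and sent >= max_beats:
                break
        return sent

    def stop(self):
        self._stop.set()
```

In production this would run on a background thread so the handler stays responsive.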
AgentHandler (lives in the SDK on a dynamic port)
The core calls this when work arrives:
| Method | What it does |
|---|---|
| HandleMessages | “A user sent this message. Run your handler and give me the response.” |
| GetCapabilities | “What can you do?” |
| HealthCheck | “Are you still there?” |
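The three RPCs above can be sketched as a plain Python class. In a real SDK these would be generated gRPC servicer methods taking protobuf messages; here the payloads are dicts and the wiring is illustrative.

```python
class AgentHandlerSketch:
    """Duck-typed sketch of the AgentHandler service the SDK exposes.
    Method names mirror the table above; request/response shapes are
    stand-ins for the real protobuf messages."""

    def __init__(self, handler, skills):
        self.handler = handler  # the developer's function
        self.skills = skills    # sent as data during registration

    def HandleMessages(self, request):
        # "A user sent this message. Run your handler and give me the response."
        reply = self.handler(request["messages"])
        return {"response": reply}

    def GetCapabilities(self, request):
        # "What can you do?"
        return {"skills": self.skills}

    def HealthCheck(self, request):
        # "Are you still there?"
        return {"ok": True}
```

The SDK's job is mostly this thin: translate the wire format, call the handler, translate back.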
Message Flow
Now we can follow one request from the outside world to your code and back again. A user sends “What is the capital of France?” to a TypeScript agent that has already been bindufied:
1. ManifestWorker picks up the task. It builds the conversation history from storage and calls manifest.run(messages).
2. manifest.run is a GrpcAgentClient. It converts the messages to protobuf and calls HandleMessages on the SDK’s gRPC server.
3. The TypeScript SDK receives the call. It deserializes the messages into [{role: "user", content: "What is the capital of France?"}], calls the developer’s handler function, and returns the handler’s result over the same gRPC call.
4. ManifestWorker processes the result. ResultProcessor normalizes it, ResponseDetector determines the task state (“completed”), and ArtifactBuilder creates a DID-signed artifact.
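Step 4 can be sketched as three small functions. The names ResultProcessor, ResponseDetector, and ArtifactBuilder come from the flow above; their internals here are guesses, and the DID is a placeholder rather than a real signature.

```python
def normalize_result(raw):
    """ResultProcessor step (sketch): coerce whatever the handler
    returned into a plain text payload."""
    return raw if isinstance(raw, str) else str(raw)

def detect_state(text):
    """ResponseDetector step (sketch): a non-empty answer maps to
    'completed'. The real detector also recognizes states like
    'input-required', per the feature table later in this document."""
    return "completed" if text else "failed"

def build_artifact(text, state, did="did:example:agent"):
    """ArtifactBuilder step (sketch): the real artifact carries a DID
    signature; here we only attach the DID string."""
    return {"did": did, "state": state, "content": text}
```

Chained together, handler output in, signed artifact out.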
The Invisible Bridge
Inside the core, one component keeps this model elegant: GrpcAgentClient.
It is a Python class that behaves like a handler function. ManifestWorker calls it the same way it would call a local Python handler.
For a Python agent, manifest.run is a local function call. For a gRPC agent, it is a GrpcAgentClient instance. The rest of the pipeline does not need to care.
That is a key design win. The sidecar changes the transport, not the downstream architecture.
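The interchangeability can be shown with a small duck-typing sketch. `LocalManifest`, `GrpcManifestSketch`, and `manifest_worker` are illustrative stand-ins for the real classes; `call_sdk` plays the role of the HandleMessages RPC.

```python
class LocalManifest:
    """Python agent: run() is a direct in-process function call."""
    def __init__(self, handler):
        self._handler = handler

    def run(self, messages):
        return self._handler(messages)

class GrpcManifestSketch:
    """gRPC agent: a stand-in for GrpcAgentClient. `call_sdk` plays
    the role of the HandleMessages RPC back to the SDK process."""
    def __init__(self, call_sdk):
        self._call_sdk = call_sdk

    def run(self, messages):
        return self._call_sdk(messages)

def manifest_worker(manifest, messages):
    """The downstream pipeline only ever sees manifest.run()."""
    return manifest.run(messages)
```

Both objects satisfy the same implicit interface, so the worker code never branches on transport.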
Startup Lifecycle
The request path explains runtime. The startup path explains how the sidecar comes alive in the first place.
When the developer presses Ctrl+C, the SDK kills the Python child process and exits cleanly.
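The spawn-and-cleanup lifecycle can be sketched with the standard library. How the real SDK builds the launch command is not specified here, so `cmd` is an assumption; the sleeping child below just stands in for the core process.

```python
import subprocess
import sys

def spawn_core(cmd):
    """Sketch: the SDK launches the Bindu Core as a child process.
    `cmd` is whatever launch command the SDK detected."""
    return subprocess.Popen(cmd)

def shutdown(core):
    """On Ctrl+C, terminate the child and wait for it to exit."""
    core.terminate()
    core.wait(timeout=10)

# Demo: spawn a stand-in "core" that would run for 60s, then kill it.
proc = spawn_core([sys.executable, "-c", "import time; time.sleep(60)"])
shutdown(proc)
```

A production SDK would hook this into its signal handling so the child never outlives the parent.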
Python vs gRPC Agents
From the outside, both models look the same. Inside, the execution path is different.
| | Python Agent | gRPC Agent |
|---|---|---|
| Developer calls | bindufy(config, handler) | bindufy(config, handler) (identical) |
| Handler runs in | Same process as core | Separate process |
| Core started by | bindufy() directly | SDK spawns as child process |
| Communication | In-process function call | gRPC over localhost |
| Latency overhead | 0ms | 1-5ms |
| Language | Python only | Any language with gRPC |
| DID, auth, x402 | Full support | Full support (identical) |
| Skills | Loaded from filesystem | Sent as data during registration |
| Streaming | Supported | Not yet implemented |
From the outside, there is no visible difference. The agent card looks the same. The DID is generated the same way. The A2A responses have the same structure. The artifacts carry the same DID signatures. A client cannot tell whether the agent behind
:3773 is Python, TypeScript, or Kotlin.
Real Examples
The sidecar model is easiest to trust when you can see it in real projects.
- TypeScript + OpenAI: a GPT-4o agent with one bindufy() call
- TypeScript + LangChain: a LangChain.js research assistant
- Kotlin + OpenAI: a Kotlin agent with the same pattern
Quick Test
If you want to validate the transport layer before writing SDK code, you can test the core directly with grpcurl. Assuming the core enables gRPC reflection, grpcurl -plaintext localhost:3774 list should show the registered services, including BinduService.
A successful response confirms the bindufy pipeline ran: config validation, DID key generation, manifest creation, and HTTP server startup.
Ports
| Port | Protocol | Purpose |
|---|---|---|
| :3773 | HTTP | A2A protocol (clients connect here) |
| :3774 | gRPC | Agent registration (SDKs connect here) |
| :XXXXX | gRPC | Handler execution (core calls SDKs here, dynamic port) |
Known Limitations
The sidecar model already gives you full parity for core infrastructure. The current gaps are about transport features, resilience, and operations.
Streaming Responses
Status: not implemented. The proto defines HandleMessagesStream, a server-side streaming RPC where the SDK yields response chunks incrementally. But GrpcAgentClient does not call it. Remote agents can only return complete responses.
For short answers, this usually does not matter. For long answers, the UX is worse because users wait for the whole response at once.
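The difference between the two paths can be sketched with a generator. HandleMessagesStream is named in the proto per the text above; the word-by-word chunking here is purely illustrative.

```python
def handle_messages(messages):
    """Unary path (today): the full response returns at once."""
    return "The capital of France is Paris."

def handle_messages_stream(messages):
    """What HandleMessagesStream would enable (sketch): yield chunks
    as they are produced, so a client can render incrementally
    instead of waiting for the whole response."""
    for word in handle_messages(messages).split(" "):
        yield word + " "
```

Joining the streamed chunks reproduces the unary answer, so the two paths differ only in perceived latency, not content.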
Workaround
Return complete responses. Most agents do this anyway. The gap matters most in chat-like interfaces where perceived latency matters.
No TLS
gRPC connections use grpc.insecure_channel. Traffic between the core and SDK is unencrypted.
Why it is okay for now: the core and SDK run on the same machine (localhost). The SDK spawns the core as a child process. There is no network exposure.
No Automatic Reconnection
If the SDK process crashes mid-execution, the GrpcAgentClient does not retry. The task fails, and the agent must be re-registered.
ManifestWorker catches the gRPC UNAVAILABLE error and marks the task as failed. The user gets an error response. On restart, the SDK calls RegisterAgent again.
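The described failure path can be sketched as follows. `UnavailableError` is a stand-in for gRPC's UNAVAILABLE status, and `run_task` is an illustrative reduction of ManifestWorker's error handling, not its real code.

```python
class UnavailableError(Exception):
    """Stand-in for a gRPC UNAVAILABLE status (SDK process gone)."""

def run_task(call, task):
    """Sketch of the no-retry policy: on UNAVAILABLE the task is
    marked failed and the user gets an error response; there is no
    reconnection attempt."""
    try:
        task["result"] = call(task["messages"])
        task["state"] = "completed"
    except UnavailableError:
        task["state"] = "failed"
        task["error"] = "agent process unavailable"
    return task
```

Recovery happens out of band: when the SDK restarts, it calls RegisterAgent again.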
No Connection Pooling
Each GrpcAgentClient creates a single gRPC channel. Under high concurrency, all calls share one channel.
That is fine for most agents because gRPC multiplexes well. At very high concurrency, connection pooling would reduce contention.
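What pooling could look like, reduced to its core idea: round-robin calls across several channels instead of sharing one. This is not Bindu code; `channels` here are plain callables standing in for gRPC channels or stubs.

```python
import itertools

class ChannelPool:
    """Sketch of client-side pooling: rotate calls across N channels
    to reduce contention on any single connection."""

    def __init__(self, channels):
        self._cycle = itertools.cycle(channels)

    def call(self, *args, **kwargs):
        channel = next(self._cycle)
        return channel(*args, **kwargs)
```

Because gRPC already multiplexes requests on one HTTP/2 connection, this only pays off at very high concurrency, as the text notes.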
No gRPC-Specific Metrics
The /metrics endpoint reports HTTP request metrics but not gRPC call metrics. You cannot see HandleMessages latency, error rates, or call counts in the dashboard.
Workaround: check the core log output, which includes timing information for each handler call.
No Load Balancing
If you run two instances of the same TypeScript agent, each one registers separately with a different callback address. There is no built-in routing to spread load across instances. Workaround: use a reverse proxy such as Envoy in front of the SDK instances, and register the proxy address as the callback.
Feature Comparison
| Feature | Python Agents | gRPC Agents |
|---|---|---|
| Unary responses | works | works |
| Streaming responses | works | not implemented |
| DID identity | works | works |
| x402 payments | works | works |
| Skills | works | works |
| State transitions (input-required) | works | works |
| Health checks | works | works |
| Multi-language | Python only | any language |
| Latency overhead | 0ms | 1-5ms |
| TLS | N/A (in-process) | not implemented |
| Auto-reconnection | N/A (in-process) | not implemented |
Bottom line: the driver/engine split is already solid. gRPC agents have parity with Python agents for identity, auth, payments, skills, and the A2A protocol. The missing pieces are streaming, security hardening, and resilience.