Streaming Responses
Status: Not implemented The proto definesHandleMessagesStream — a server-side streaming RPC where the SDK yields response chunks incrementally. But GrpcAgentClient doesn’t call it. Remote agents can only return complete responses.
What this means in practice
You’re building a TypeScript agent with GPT-4o. In a Python agent, you could stream tokens back to the user as they’re generated — they see the response forming word by word. With a gRPC agent, the user waits for the entire response, then sees it all at once. For short answers (< 2 seconds), this doesn’t matter. For long responses (analysis, code generation, research), the UX is noticeably worse.Workaround
Return complete responses. Most agents do this anyway — the streaming gap only matters for chat-like interfaces where perceived latency matters.What needs to happen
No TLS
gRPC connections usegrpc.insecure_channel. Traffic between the core and SDK is unencrypted.
Why it’s okay for now: The core and SDK run on the same machine (localhost). The SDK spawns the core as a child process. There’s no network exposure.
No Automatic Reconnection
If the SDK process crashes mid-execution, theGrpcAgentClient doesn’t retry. The task fails, and the agent must be re-registered.
What happens
ManifestWorker catches the gRPCUNAVAILABLE error and marks the task as failed. The user gets an error response. On restart, the SDK calls RegisterAgent again and the agent is back.
What would be better
Automatic reconnection with exponential backoff, so transient failures (SDK restart, brief network blip) recover without re-registration.No Connection Pooling
EachGrpcAgentClient creates a single gRPC channel. Under high concurrency (many simultaneous tasks), all calls share one channel.
For most agents this is fine — gRPC channels handle multiplexing well. But for agents processing hundreds of concurrent requests, a connection pool would reduce contention.
No gRPC-Specific Metrics
The/metrics endpoint (Prometheus) reports HTTP request metrics but not gRPC call metrics. You can’t see HandleMessages latency, error rates, or call counts in the dashboard.
Workaround: Check the core’s log output, which includes timing information for each handler call.
No Load Balancing
If you run two instances of the same TypeScript agent, each one registers separately with a different callback address. There’s no built-in routing to spread load across instances. Workaround: Use a reverse proxy (like Envoy) in front of the SDK instances, and register the proxy address as the callback.Feature Comparison
| Feature | Python Agents | gRPC Agents |
|---|---|---|
| Unary responses | ✅ works | ✅ works |
| Streaming responses | ✅ works | ❌ not implemented |
| DID identity | ✅ works | ✅ works |
| x402 payments | ✅ works | ✅ works |
| Skills | ✅ works | ✅ works |
| State transitions (input-required) | ✅ works | ✅ works |
| Health checks | ✅ works | ✅ works |
| Multi-language | Python only | ✅ any language |
| Latency overhead | 0ms | 1-5ms |
| TLS | N/A (in-process) | ❌ not implemented |
| Auto-reconnection | N/A (in-process) | ❌ not implemented |
The bottom line: gRPC agents have full feature parity with Python agents for the core functionality (DID, auth, payments, skills, A2A protocol). The gaps are in streaming, security, and resilience — all planned for future releases.