| Severity | critical |
| Status | fixed |
| Found | 2026-04-18 |
| Fixed | 2026-04-18 |
| Area | bindu/server |
| Commit | d664e1e |
Symptom
In any Bindu deployment with more than one authenticated caller (e.g. two customers running their own agents behind a shared Bindu instance, or a SaaS provider hosting one agent for many tenants), a valid bearer token was sufficient to read, cancel, or destroy another caller’s tasks and contexts over the A2A JSON-RPC API. Concretely, Bob (authenticated with his own valid Hydra token) could:
- Call tasks/list and receive every task stored by the server, including Alice’s prompts, her agent’s replies, and any attached artifacts. No UUID enumeration required: the response paginated over the entire table.
- Call tasks/get with any task UUID he learned or guessed and receive Alice’s full task body.
- Call tasks/cancel on Alice’s in-flight task.
- Call contexts/clear on Alice’s context, destroying the full conversation thread including its task rows.
- Call message/send referencing Alice’s context_id, splicing his own messages into her conversation.
- Register a webhook on Alice’s task via tasks/pushNotification/config/set and receive all lifecycle events for her execution.
Root cause
The bug was not a single buggy line; it was a missing concept. The storage layer had no notion of task ownership, and the handlers never consulted the caller’s identity when serving a request. That was the pre-fix state at bindu/server/handlers/task_handlers.py:51-62, and the same pattern repeated in cancel_task, list_tasks, task_feedback, list_contexts, clear_context, and the four push-notification handlers. storage.list_tasks(length) was a global query with no WHERE clause.
Why did it ship this way? The server was designed around two
orthogonal concepts that never intersected:
- Authentication (“who is this caller?”): implemented as a Hydra ASGI middleware that attached user_info to scope.state.user.
- Task routing (“dispatch by method name”): implemented in the A2A endpoint as getattr(task_manager, handler_name) with no user context passed along.
Scope-based authorization existed only behind the optional
auth.require_permissions flag. Row-level authorization, the part
every multi-tenant system needs, was simply missing.
No bug in any one file; a bug in the contract between files.
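The gap between authentication and routing can be sketched in a few lines. This is a minimal illustration, not the Bindu source: the TaskManager class, both dispatch helpers, the in-memory task dict, and the test DIDs are hypothetical. Only the getattr-by-method-name dispatch and the caller identity living on scope.state.user are taken from the description above.

```python
from types import SimpleNamespace


class TaskManager:
    """Hypothetical stand-in; not the real Bindu TaskManager."""

    def __init__(self):
        # task_id -> (owner DID, body). Recording the owner here is an
        # assumption so the post-fix path has something to compare against.
        self.tasks = {"t1": ("did:test:alice", "alice's prompt")}

    def get_task(self, params, caller_did=None):
        owner, body = self.tasks[params["task_id"]]
        if caller_did is not None and owner != caller_did:
            raise LookupError("task not found")
        return body


def dispatch_pre_fix(task_manager, scope, handler_name, params):
    # Authentication has already populated scope.state.user, but nothing
    # here reads it: the handler never learns who is calling.
    handler = getattr(task_manager, handler_name)
    return handler(params)


def dispatch_post_fix(task_manager, scope, handler_name, params):
    # Same dispatch, with the caller identity threaded into the handler.
    caller_did = scope.state.user.client_id
    handler = getattr(task_manager, handler_name)
    return handler(params, caller_did=caller_did)


# Bob is authenticated, yet the pre-fix path hands him Alice's task.
bob_scope = SimpleNamespace(
    state=SimpleNamespace(user=SimpleNamespace(client_id="did:test:bob"))
)
manager = TaskManager()
leaked = dispatch_pre_fix(manager, bob_scope, "get_task", {"task_id": "t1"})
```

Calling dispatch_post_fix with the same arguments raises LookupError instead, because the handler can finally compare the row's owner with the caller.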
Fix
Landed in four phases on branch fix/task-ownership-idor, each
self-contained and deployable independently so the rollout could
be incremental:
Phase 1 — plumbing (commits 2101d6d, bb97d13)
- New nullable owner_did column + index on tasks and contexts (bindu/server/storage/schema.py)
- Alembic migration 20260418_0001_add_owner_did.py
- Storage ABC gained get_task_owner / get_context_owner and submit_task(caller_did=...)
- A2A endpoint resolves caller_did from scope.state.user.client_id and threads it through every TaskManager handler method.

Every handler now carries the caller’s identity; enforcement landed separately.
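A rough sketch of what the plumbing adds to the storage interface, under stated assumptions: the method names get_task_owner, get_context_owner, and the caller_did keyword on submit_task come from the list above, but the remaining parameters, the return types, and the InMemoryStorage class are hypothetical and chosen only to keep the example self-contained.

```python
from abc import ABC, abstractmethod
from typing import Optional


class Storage(ABC):
    """Hypothetical slice of the storage ABC, limited to the ownership additions."""

    @abstractmethod
    def submit_task(self, task_id: str, context_id: str,
                    caller_did: Optional[str] = None) -> None: ...

    @abstractmethod
    def get_task_owner(self, task_id: str) -> Optional[str]: ...

    @abstractmethod
    def get_context_owner(self, context_id: str) -> Optional[str]: ...


class InMemoryStorage(Storage):
    """Illustrative backend; owner values are nullable, mirroring the new column."""

    def __init__(self) -> None:
        self._task_owner: dict[str, Optional[str]] = {}
        self._context_owner: dict[str, Optional[str]] = {}

    def submit_task(self, task_id, context_id, caller_did=None):
        # Phase 1 only records the owner; no comparison happens yet.
        self._task_owner[task_id] = caller_did
        self._context_owner.setdefault(context_id, caller_did)

    def get_task_owner(self, task_id):
        return self._task_owner.get(task_id)

    def get_context_owner(self, context_id):
        return self._context_owner.get(context_id)


storage = InMemoryStorage()
storage.submit_task("t1", "c1", caller_did="did:test:alice")
```

Keeping this phase record-only is what makes it deployable on its own: existing callers keep working while ownership data starts to accumulate.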
Phase 2 (commits 9424272, d664e1e)
- storage.submit_task raises OwnershipError when the referenced context exists with a different owner; handlers translate to ContextNotFoundError on the wire so existence cannot be probed across tenants.
- list_tasks / list_contexts / list_tasks_by_context accept an optional owner_did filter that hits the new indexes.
- get_task, cancel_task, task_feedback, clear_context, and all four push-notification handlers compare get_*_owner(id) vs caller_did and return TaskNotFoundError / ContextNotFoundError on mismatch.
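The enforcement rules above might look roughly like this. The exception names and the owner_did filter come from the text; the Storage class, the function signatures, and the dict-backed tables are hypothetical simplifications, so treat this as a sketch of the pattern and not the shipped handlers.

```python
class TaskNotFoundError(Exception): ...
class ContextNotFoundError(Exception): ...
class OwnershipError(Exception): ...


class Storage:
    """Hypothetical in-memory storage with owner tracking."""

    def __init__(self):
        self.tasks = {}          # task_id -> body
        self.task_owner = {}     # task_id -> owner DID
        self.context_owner = {}  # context_id -> owner DID

    def submit_task(self, task_id, context_id, body, caller_did):
        owner = self.context_owner.get(context_id)
        if owner is not None and owner != caller_did:
            raise OwnershipError(context_id)
        self.context_owner[context_id] = caller_did
        self.task_owner[task_id] = caller_did
        self.tasks[task_id] = body

    def get_task_owner(self, task_id):
        return self.task_owner.get(task_id)

    def list_tasks(self, owner_did=None):
        # owner_did=None keeps the old global behaviour for internal callers.
        return [t for t, o in self.task_owner.items()
                if owner_did is None or o == owner_did]


def send_message(storage, task_id, context_id, body, caller_did):
    try:
        storage.submit_task(task_id, context_id, body, caller_did)
    except OwnershipError:
        # Report "not found" so a foreign context looks identical to a missing one.
        raise ContextNotFoundError(context_id) from None


def get_task(storage, task_id, caller_did):
    # A missing task and a task owned by someone else take the same branch.
    if task_id not in storage.tasks or storage.get_task_owner(task_id) != caller_did:
        raise TaskNotFoundError(task_id)
    return storage.tasks[task_id]
```

Returning the same not-found error for both cases is the design choice that matters here: a distinct "forbidden" response would confirm to Bob that Alice's ID exists.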
Phase 3 (commit 7db5945)
- scripts/backfill_owner_did.py assigns pre-existing NULL-owner rows to a designated DID before enforcement is deployed.
- alembic/README.md documents the upgrade ordering: migrate → backfill → deploy enforcement.
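The backfill step can be illustrated with a small stand-in. The real scripts/backfill_owner_did.py is not reproduced here; this sketch assumes two tables named tasks and contexts with a nullable owner_did column (as described in Phase 1), uses an in-memory SQLite database purely for self-containment, and uses did:test:legacy as a placeholder for the designated DID.

```python
import sqlite3


def backfill_owner_did(conn, default_did):
    """Assign every NULL-owner row to default_did; return rows touched per table."""
    touched = {}
    for table in ("tasks", "contexts"):  # fixed allow-list, never user input
        cur = conn.execute(
            f"UPDATE {table} SET owner_did = ? WHERE owner_did IS NULL",
            (default_did,),
        )
        touched[table] = cur.rowcount
    conn.commit()
    return touched


# Minimal schema standing in for the migrated tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, owner_did TEXT)")
conn.execute("CREATE TABLE contexts (id TEXT PRIMARY KEY, owner_did TEXT)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?)",
    [("t1", None), ("t2", "did:test:alice")],
)
conn.execute("INSERT INTO contexts VALUES ('c1', NULL)")

result = backfill_owner_did(conn, "did:test:legacy")
```

Rows that already have an owner are left alone, which is why the ordering matters: if enforcement were deployed before the backfill, every pre-existing NULL-owner row would fail the owner comparison for all callers.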
Phase 4
- Integration test tests/integration/test_task_ownership.py drives the real TaskManager with two synthetic DIDs and asserts cross-tenant denial on every public handler (10 cases).
- This postmortem.
- The idor-task-context-no-ownership-check entry removed from bugs/known-issues.md.
Why the tests didn’t catch it
Every existing handler test in tests/unit/server/handlers/
exercised a single synthetic caller with mocked storage. There
was no “two tenants” scenario — no test where task A was created
by one identity and then fetched by another. The handler returned
the row because the mock returned the row, and the test passed
because the assertion only checked the response shape.
The integration tests for gRPC
(tests/integration/grpc/test_grpc_e2e.py)
likewise ran one caller end-to-end. Auth was either disabled or
mocked out, so scope.state.user was whatever the test supplied.
Access-control bugs are particularly invisible to single-actor
tests — they only manifest when an actor touches something that
another actor created. The fix includes exactly that: every new
test in the integration suite performs an action as one DID and
asserts the opposite DID cannot see or mutate the result.
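The difference between a single-actor and a two-actor test is small in code. The sketch below is not taken from tests/integration/test_task_ownership.py, which drives the real TaskManager; FakeTaskManager, NotFound, and the two DIDs are hypothetical and exist only to show the shape of the assertions.

```python
class NotFound(Exception): ...


class FakeTaskManager:
    """Hypothetical stand-in with owner checks on every handler."""

    def __init__(self):
        self._tasks = {}  # task_id -> owner DID

    def create(self, task_id, caller_did):
        self._tasks[task_id] = caller_did

    def _check(self, task_id, caller_did):
        if self._tasks.get(task_id) != caller_did:
            raise NotFound(task_id)

    def get_task(self, task_id, caller_did):
        self._check(task_id, caller_did)
        return task_id

    def cancel_task(self, task_id, caller_did):
        self._check(task_id, caller_did)
        del self._tasks[task_id]

    def list_tasks(self, caller_did):
        return [t for t, o in self._tasks.items() if o == caller_did]


def test_cross_tenant_denial():
    alice, bob = "did:test:alice", "did:test:bob"
    tm = FakeTaskManager()
    tm.create("t1", alice)

    # Single-actor assertion: roughly what the old tests covered.
    assert tm.get_task("t1", alice) == "t1"

    # Two-actor assertions: the other DID must not see or mutate the row.
    assert tm.list_tasks(bob) == []
    for handler in (tm.get_task, tm.cancel_task):
        try:
            handler("t1", bob)
        except NotFound:
            continue
        raise AssertionError(f"{handler.__name__} leaked across tenants")

    # Alice's task survived Bob's attempts.
    assert tm.list_tasks(alice) == ["t1"]


test_cross_tenant_denial()
```

Looping over the handlers keeps the denial check uniform, so a newly added handler only needs to be appended to the tuple to get the same cross-tenant coverage.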
Class of bug — where else to watch
IDOR / missing row-level authz is a shape that can hide in any handler that accepts an ID from the request and returns data keyed by it. Audit anywhere the pattern load_by_id(user_input)
appears without an adjacent ownership check. In this codebase:
- bindu/server/endpoints/negotiation.py:220 calls app.task_manager.storage.list_tasks() with no owner filter. The negotiation endpoint is on a different auth path and wasn’t in scope for this fix, but the same row-level authz question applies: can peer A negotiate over peer B’s task inventory? Worth a follow-up audit.
- In bindu/server/endpoints/metrics.py:43-49, count_tasks(status=...) returns a global count across all tenants. For aggregate operational metrics this is acceptable, but if metrics are ever exposed per-caller (e.g. a “your usage” endpoint) the same shape would leak tenant sizes. Flag this before any such change.
- bindu/extensions/: any future extension that attaches to a task ID (x402 payment sessions, skills registration, etc.) should verify the caller owns the task before allowing the attachment. The push-notification enforcement in this fix is the reference pattern.
- The per-DID schema feature from 20260119_0001_add_schema_support.py isolates agents (each agent’s own DID gets its own schema) but still shares a schema across all the callers of that agent. The owner_did column added by this fix is required inside every DID schema; the create_bindu_tables_in_schema stored procedure has not yet been updated to include it, so existing DID schemas need a manual ALTER TABLE ... ADD COLUMN owner_did VARCHAR(255) + index. Tracked as a follow-up below.
Follow-ups
- Update create_bindu_tables_in_schema so DID-specific schemas created after this fix pick up owner_did automatically.
- Add row-level owner filtering to the negotiation endpoint (or document explicitly why global listing is correct there).
- The authz scope-enforcement flag (auth.require_permissions) remains optional; see slug authz-scope-check-behind-optional-flag in bugs/known-issues.md. Row-level ownership and scope-based authz are complementary; both belong on by default in production.