Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.getbindu.com/llms.txt

Use this file to discover all available pages before exploring further.

Picture this. Your agent is mid-task — a user asked it to summarize a 200-page document, it’s been thinking for 90 seconds — and the process restarts. Maybe a deploy, maybe an OOM, doesn’t matter. With in-memory storage, that task is gone. The user sees an error. The agent doesn’t even know it was working on something. That’s the trap: an agent that holds state in process memory is a demo, not a service. The moment uptime matters, you need persistence — and the moment you run more than one replica, you need a store every replica can see. Bindu defaults to in-memory because local laptops don’t have a Postgres handy. Switch to Postgres by setting two env vars and the same handler code keeps running — tasks, contexts, artifacts, and webhook configs all land in a durable store you can query, replay, and audit.
memory is the default. The only other backend is postgres. There is no SQLite or file-based backend. Pick via STORAGE_TYPE — no code change in your handler.

How Bindu Storage Works

Every backend implements the same abstract Storage interface in bindu/server/storage/base.py. The TaskManager, workers, and handlers call that interface; the backend behind it is chosen at startup by the factory.

What gets stored

Bindu’s storage layer persists four things — not three. The earlier model in the spec missed webhook_configs:

Tasks

One row per task in tasks. State, message history, artifacts, metadata, owner DID, timestamps.

Contexts

One row per conversation in contexts. Shared message history that spans multiple tasks, plus arbitrary context_data.

Task feedback

Optional ratings and comments per task in task_feedback. One task can have many feedback rows.

Webhook configs

One push-notification config per task in webhook_configs. Survives restarts so long-running tasks can still call back.
Artifacts are not a separate table — they live inside tasks.artifacts as a JSONB array on the task row.

Backends at a glance

MemoryPostgreSQL
ClassInMemoryStoragePostgresStorage
StoragePython dicts in processJSONB tables via SQLAlchemy async + asyncpg
PersistenceLost on restartSurvives restarts
ReplicasOne process onlyMany replicas share one DB
ConcurrencySingle event loopConnection pool (pool_min=2, pool_max=10 by default)
MigrationsNoneAlembic (manual or auto on startup)
Retries@retry_storage_operation decoratorTenacity wrapper on every call
Compare-and-swapNo await between read and writeSingle UPDATE ... WHERE state = :from
SetupNoneRequires a running Postgres
STORAGE_TYPEmemorypostgres
Both backends honour OwnershipError and caller_did checks the same way. The protocol surface is identical — the only differences are durability, concurrency, and operational cost.

Configuration

Storage is configured purely from environment variables. Credentials don’t belong in a Python config dict that ends up in git.

Memory (default)

# Default — no env vars needed. Just run your agent.
STORAGE_TYPE=memory
Tasks, contexts, feedback, and webhook configs live in a dict keyed by UUID. Gone when the process stops. Fine for uv run agent.py while you’re building, useless beyond one replica.

PostgreSQL

STORAGE_TYPE=postgres
DATABASE_URL=postgresql+asyncpg://bindu_user:secret@localhost:5432/bindu_db
DATABASE_URL is the canonical name (postgres_url also works). Internally, if you pass a bare postgresql:// URL, Bindu rewrites it to postgresql+asyncpg:// so the async driver is used — but it’s cleaner to write it correctly up front.
VariableDefaultMeaning
STORAGE_TYPEmemorymemory or postgres.
DATABASE_URLnoneFull async DSN. Required when STORAGE_TYPE=postgres.
STORAGE__POSTGRES_POOL_MIN2Minimum pooled connections (advisory).
STORAGE__POSTGRES_POOL_MAX10Max pool size. Passed to SQLAlchemy pool_size.
STORAGE__POSTGRES_TIMEOUT60Pool-checkout timeout in seconds.
STORAGE__POSTGRES_COMMAND_TIMEOUT30Per-command timeout.
STORAGE__POSTGRES_MAX_RETRIES3Retry attempts for transient errors.
STORAGE__POSTGRES_RETRY_DELAY1.0Base delay between retries (seconds).
STORAGE__RUN_MIGRATIONS_ON_STARTUPfalseRun Alembic upgrade head on connect.
DID / POSTGRES_DIDnoneIf set, isolates this agent’s tables into a per-DID Postgres schema.
run_migrations_on_startup defaults to false — the spec previously suggested tables are created on first run, but in current code production deployments run Alembic explicitly. Turn it on locally if you want zero-touch bootstrap.

The Storage Interface

Everything goes through Storage in bindu/server/storage/base.py. The methods your handler ends up exercising:
# Tasks
async def submit_task(context_id, message, caller_did=None) -> Task
async def load_task(task_id, history_length=None) -> Task | None
async def update_task(task_id, state, new_artifacts=None,
                      new_messages=None, metadata=None) -> Task
async def update_task_state_if(task_id, from_state, to_state) -> bool  # CAS
async def list_tasks(length=None, offset=0, owner_did=None) -> list[Task]
async def list_tasks_by_context(context_id, length=None, offset=0,
                                owner_did=None) -> list[Task]
async def count_tasks(status=None) -> int

# Contexts
async def load_context(context_id) -> ContextT | None
async def append_to_contexts(context_id, messages) -> None
async def update_context(context_id, context) -> None
async def list_contexts(length=None, offset=0, owner_did=None) -> list[ContextT]

# Webhooks (persisted per task for long-running callbacks)
async def save_webhook_config(task_id, config) -> None
async def load_webhook_config(task_id) -> PushNotificationConfig | None
async def delete_webhook_config(task_id) -> None
async def load_all_webhook_configs() -> dict[UUID, PushNotificationConfig]

# Ownership / lifecycle
async def get_task_owner(task_id) -> str | None
async def get_context_owner(context_id) -> str | None
async def clear_context(context_id) -> None
async def clear_all() -> None
async def close() -> None
A few things worth calling out:
  • update_task_state_if is a compare-and-swap. In Postgres it’s a single UPDATE ... WHERE state = :from; in memory it’s a comparison with no await between read and write. This is how cancel-vs-complete races stay correct.
  • OwnershipError is raised by submit_task when a context already exists with a different owner_did. Handlers catch it and convert to ContextNotFoundError so they don’t leak existence to the caller.
  • Terminal states are immutable. submit_task on a task already in completed, failed, canceled, or rejected raises ValueError — you must create a new task with reference_task_ids.

Using the factory

from bindu.server.storage.factory import create_storage, close_storage

storage = await create_storage()        # reads STORAGE_TYPE from env
# or, with DID-based schema isolation:
storage = await create_storage(did="did:bindu:alice:agent1:abc123")

task = await storage.load_task(task_id)
# ... use storage ...
await close_storage(storage)
If STORAGE_TYPE=postgres and SQLAlchemy / asyncpg aren’t installed, the factory raises a clear ValueError telling you to pip install sqlalchemy[asyncio] asyncpg.

The Storage Lifecycle

Every state transition is a write. If the process dies between working and completed, the row still exists in submitted or working — the scheduler can re-pick it up and the worker resumes from the last persisted state.

The Postgres Schema

Defined in bindu/server/storage/schema.py via SQLAlchemy Core. Bindu uses imperative mapping — the protocol TypedDicts in bindu/common/protocol/types.py are the model; the schema module is just the table definition.
id                UUID         PRIMARY KEY
context_id        UUID         FK contexts(id) ON DELETE CASCADE
owner_did         VARCHAR(255) NULL          -- authenticated caller
kind              VARCHAR(50)  NOT NULL DEFAULT 'task'
state             VARCHAR(50)  NOT NULL      -- submitted|working|...
state_timestamp   TIMESTAMPTZ  NOT NULL
history           JSONB        NOT NULL DEFAULT '[]'
artifacts         JSONB        NULL DEFAULT '[]'
metadata          JSONB        NULL DEFAULT '{}'
created_at        TIMESTAMPTZ  NOT NULL DEFAULT now()
updated_at        TIMESTAMPTZ  NOT NULL DEFAULT now() ON UPDATE now()
Indexes: context_id, state, created_at, updated_at, owner_did, and a GIN index on metadata.
id              UUID         PRIMARY KEY
owner_did       VARCHAR(255) NULL          -- write-once on first submit
context_data    JSONB        NOT NULL DEFAULT '{}'
message_history JSONB        NULL DEFAULT '[]'
created_at      TIMESTAMPTZ  NOT NULL DEFAULT now()
updated_at      TIMESTAMPTZ  NOT NULL DEFAULT now() ON UPDATE now()
Indexes: created_at, updated_at, owner_did, GIN on context_data.
id            SERIAL       PRIMARY KEY
task_id       UUID         FK tasks(id) ON DELETE CASCADE
feedback_data JSONB        NOT NULL
created_at    TIMESTAMPTZ  NOT NULL DEFAULT now()
Indexes: task_id, created_at.
task_id    UUID         PRIMARY KEY, FK tasks(id) ON DELETE CASCADE
config     JSONB        NOT NULL
created_at TIMESTAMPTZ  NOT NULL DEFAULT now()
updated_at TIMESTAMPTZ  NOT NULL DEFAULT now() ON UPDATE now()
One config per task. Loaded into memory on startup so long-running tasks can still call their callback after a restart.
A sample task row, decoded:
{
  "id": "9b7f5e10-7b8d-4d4d-8c3d-6b6a4b8c1a99",
  "context_id": "3f0e2c1a-1f4b-4f5b-8a2e-cdb8a9c3e7d1",
  "owner_did": "did:bindu:alice:research-agent:abc123",
  "kind": "task",
  "state": "completed",
  "state_timestamp": "2026-05-18T12:00:00+00:00",
  "history": [
    {"role": "user", "parts": [{"text": "Summarize this document"}]},
    {"role": "agent", "parts": [{"text": "Here is the summary..."}]}
  ],
  "artifacts": [
    {"name": "summary", "parts": [{"text": "The document covers..."}]}
  ],
  "metadata": {},
  "created_at": "2026-05-18T11:59:58+00:00",
  "updated_at": "2026-05-18T12:00:00+00:00"
}

Helpers

bindu/server/storage/helpers/ ships small utilities used by PostgresStorage:
  • validation.validate_uuid_type / normalization.normalize_uuid — coerce string UUIDs to UUID and reject bad input early.
  • normalization.normalize_message_uuids — fix up task_id, context_id, message_id, reference_task_ids on inbound messages.
  • serialization.serialize_for_jsonb — recursively convert UUID to str so JSONB stays JSON-safe.
  • security.mask_database_url — strip the password before logging (user:***@host).
  • security.sanitize_identifier — used when DID-based schema isolation rewrites search_path; rejects anything that isn’t [A-Za-z0-9_].
  • db_operations.get_current_utc_timestamp, prepare_jsonb_value, create_update_values — common building blocks for UPDATE statements.

PostgreSQL Setup

1

Run Postgres

docker run -d \
  --name bindu-postgres \
  -e POSTGRES_USER=bindu_user \
  -e POSTGRES_PASSWORD=secret \
  -e POSTGRES_DB=bindu_db \
  -p 5432:5432 \
  postgres:16
2

Install the driver

uv add 'sqlalchemy[asyncio]' asyncpg alembic
The factory imports PostgresStorage lazily — without these packages you’ll get a clear error at startup, not at runtime.
3

Set the env vars

export STORAGE_TYPE=postgres
export DATABASE_URL='postgresql+asyncpg://bindu_user:secret@localhost:5432/bindu_db'
For managed Postgres (Neon, RDS, Supabase) append ?sslmode=require.
4

Create the schema

Either let Bindu do it on connect:
export STORAGE__RUN_MIGRATIONS_ON_STARTUP=true
Or run Alembic yourself:
uv run alembic upgrade head
Other useful Alembic commands:
uv run alembic history          # see applied/pending migrations
uv run alembic downgrade -1     # rollback one revision
5

Run your agent

Your handler code doesn’t change. The factory picks up STORAGE_TYPE, PostgresStorage.connect() opens the pool (pool_pre_ping=True), and everything else is identical to memory mode.

Retries, Pooling, Durability

  • Retries. Memory wraps each call in a @retry_storage_operation decorator (3 attempts, 0.1s → 1.0s backoff). Postgres routes every call through execute_with_retry from bindu.utils.retry, governed by STORAGE__POSTGRES_MAX_RETRIES and STORAGE__POSTGRES_RETRY_DELAY. Same Tenacity primitives used elsewhere — see Retry.
  • Pooling. PostgresStorage creates a single async_engine per process with pool_size=postgres_pool_max, max_overflow=0, and pool_pre_ping=True so dead connections are detected before they bite a request.
  • Transactions. Multi-step writes (submit_task, update_task) run inside session.begin(). The status update and the history/artifacts append commit atomically.
  • Durability. Memory loses everything on restart, including queued webhook callbacks. Postgres persists every state transition; webhook configs reload on startup via load_all_webhook_configs.

Real-World Use Cases

A user starts research, the agent asks a clarifying question (state=input-required), the user replies. Postgres keeps both the task row and the context’s message_history so a follow-up tasks/send picks up exactly where the agent paused — even across a redeploy.
Task takes minutes. Client registers a push-notification config; Bindu writes it to webhook_configs. On restart, load_all_webhook_configs rehydrates the in-memory map so the worker can still call the callback when the task finishes.
Every state transition writes a row update. history and artifacts are append-only JSONB arrays. Query by owner_did, state, or created_at range — the indexes are already there.
With Postgres, N replicas of the same agent share one DB. Whoever wins update_task_state_if(submitted → working) owns the task. The compare- and-swap is a single SQL UPDATE — no two workers can both win.
Pass DID=did:bindu:alice:agent1:... and PostgresStorage rewrites search_path to a sanitized per-DID Postgres schema. Same database, isolated tables — handy for multi-tenant hosting.

Security Best Practices

Keep credentials out of code

DATABASE_URL belongs in .env (gitignored) locally and in your orchestrator’s secret manager in prod. Bindu logs the URL with the password masked via mask_database_url.

Least-privilege DB user

The agent only needs DML on its own tables (plus DDL if you let it run migrations on startup). Don’t give it SUPERUSER.

Enable TLS

Append ?sslmode=require (or stronger) when connecting to managed Postgres. asyncpg supports the full sslmode ladder.

Respect ownership

caller_did is recorded on every task and context. Pass the authenticated DID through and let OwnershipError keep cross-tenant leakage out of your error responses.

Sunflower LogoStorage is what turns a stateless script into an agent that remembers — every task, every turn, every artifact, every callback.