# Tidebase

> Tidebase is an open-source checkpoint layer for AI agents: wrap your steps, and failed runs resume from the last safe point — in your own Postgres, without moving execution into a new runtime. It does not execute your code: your app still calls your LLMs, tools, and APIs directly, while a Postgres-backed server stores checkpoints, live state, events, approval gates, and usage records around it.

Key facts:

- SDKs: TypeScript (`@tidebase/sdk`) and Python (`tidebase`).
- Deployment: self-hosted (Docker + Postgres); server on :7373, Studio dashboard on :5173.
- License: Apache-2.0.
- Guarantee: completed steps never repeat on re-invocation with the same `runId`.
- v0.5 adds durable queues (dedupe, retries/backoff, concurrency caps; pull workers or signed push webhooks), cron schedules (double-fire-proof), first-class cancellation (authoritative, one-way, observed at step/gate boundaries), run deadlines, versioned migrations, and a reconciler that requeues runs whose worker died. Tidebase decides when your code runs but never executes it.
- Status: alpha and not production-ready; API auth is opt-in via a shared `TIDEBASE_API_KEY`.
- Repo: https://github.com/BlueprintLabIO/tidebase

## Docs

- [Quickstart](docs/quickstart.md): clone-to-resumed-run in ~2 minutes; every command non-interactive and agent-executable
- [FAQ](docs/faq.md): is it safe to rerun, do I need Temporal, can two workers grab one run, where data lives, production readiness
- [How to resume a failed AI agent run](docs/how-to-resume-a-failed-ai-agent-run.md): replay semantics, leases, recovery webhooks
- [How to checkpoint AI agent workflows in Postgres](docs/checkpoint-ai-agent-workflows-postgres.md): `run.step()` usage, what gets stored, step granularity
- [The replay contract — is it safe to rerun?](docs/replay-contract-is-it-safe-to-rerun.md): side effects, idempotency keys, failure classification (`failed_retryable` / `manual_review` / `failed`)
- [Human approval gates for AI agents](docs/human-approval-gates-for-ai-agents.md): durable, exactly-once gates with webhook channels and resolve tokens
- [Fork, time travel, and snapshot agent runs](docs/fork-and-time-travel-agent-runs.md): versioned state streams; a snapshot is a labeled version
- [Queues, schedules, and cancellation](docs/queues-schedules-and-cancellation.md): durable queues, cron, retries/backoff, push/pull dispatch, authoritative cancel
- [Fan out to subagents with child runs](docs/fanout-subagents-child-runs.md): `run.fanout()`, idempotent child creation by edge name, durable joins
- [Track LLM token usage and cost per run](docs/track-llm-token-costs-per-run.md): `run.usage.record()` without an LLM proxy

## Comparisons

- [Tidebase vs Temporal](docs/compare/tidebase-vs-temporal.md): Temporal moves execution into its worker model; Tidebase wraps functions you already have
- [Tidebase vs Inngest](docs/compare/tidebase-vs-inngest.md): platform-owned invocation vs external checkpoint record
- [Tidebase vs LangGraph checkpointers](docs/compare/tidebase-vs-langgraph-checkpointer.md): framework persistence vs framework-agnostic layer
- [Tidebase vs DBOS and Restate](docs/compare/tidebase-vs-dbos-restate.md): embedded durable execution vs external coordination

## For AI assistants

- [MCP server](mcp-server/README.md): `@tidebase/mcp`, for listing runs, inspecting run detail, reading events and state versions, resolving gates, and triggering recovery
- [Agent Skill](skills/tidebase/SKILL.md): when and how a coding agent should reach for Tidebase
- [AGENTS.md snippet](templates/agents-md-snippet.md): the block `tidebase init` writes into a project so future agent sessions use Tidebase correctly

## Optional

- [llms-full.txt](llms-full.txt): all documentation concatenated in one file