How to checkpoint CrewAI crews
To make a CrewAI pipeline durable with Tidebase, wrap each crew.kickoff(...) in a checkpointed step. A multi-crew pipeline that dies between crews resumes with finished crews replaying from Postgres — no agent re-runs, no tokens re-billed — and human approval can sit between crews as a durable gate.
from crewai import Agent, Crew, Task
from tidebase import Tidebase

tide = Tidebase()  # reads TIDEBASE_URL, default http://localhost:7373

# build_research_crew, build_writing_crew, cms, run_id, and topic are
# defined elsewhere in your application.
def pipeline(run, input):
    # One checkpointed step per kickoff; the crew's inputs double as the step input.
    research = run.step(
        "research-crew",
        lambda: build_research_crew().kickoff(inputs={"topic": input["topic"]}).raw,
        input={"topic": input["topic"]},
    )
    draft = run.step(
        "writing-crew",
        lambda: build_writing_crew().kickoff(inputs={"findings": research}).raw,
        input={"findings": research},
    )

    # Durable human gate between the writing crew and the publish step.
    decision = run.gate("approve-publish", "Publish this draft?", data={"preview": draft[:200]})
    if not decision.approved:
        return {"published": False}

    # Side-effecting step: the idempotency key keeps a retry from publishing twice.
    run.step(
        "publish",
        lambda: cms.publish(draft),
        input={"draft": draft},
        side_effects=["cms"],
        idempotency_key=f"publish-{run.run_id}",
    )
    return {"published": True}

tide.run("content-pipeline", pipeline, run_id=run_id, input={"topic": topic})
Tidebase is an open-source checkpoint layer for AI agents: wrap your steps, and failed runs resume from the last safe point — in your own Postgres, without moving execution into a new runtime.
The honest tradeoff: Tidebase does not execute your code — after a crash, something (a Tidebase queue worker, a recovery webhook handler, your own cron or retry button) must re-invoke the pipeline with the same run_id. Tidebase’s guarantee is that doing so is safe: completed crews replay, the gate’s decision replays, and the publish never fires twice.
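To make the re-invocation contract concrete, here is a toy in-memory model of it. This is not Tidebase's implementation or API: `ToyRun`, `ToyStore`, and `run_with_retries` are hypothetical names, and a plain dict stands in for Postgres. It only illustrates why calling the pipeline again with the same run_id is safe once finished steps are checkpointed.

```python
# Toy model of checkpoint replay. All names here are hypothetical.

class ToyRun:
    def __init__(self, run_id, checkpoints):
        self.run_id = run_id
        self._checkpoints = checkpoints  # step name -> saved output

    def step(self, name, fn):
        # Replay a finished step instead of executing it again.
        if name in self._checkpoints:
            return self._checkpoints[name]
        result = fn()
        self._checkpoints[name] = result  # saved only after fn() completes
        return result


class ToyStore:
    def __init__(self):
        self._runs = {}  # run_id -> checkpoints dict (stands in for Postgres)

    def run(self, run_id, pipeline):
        checkpoints = self._runs.setdefault(run_id, {})
        return pipeline(ToyRun(run_id, checkpoints))


def run_with_retries(store, run_id, pipeline, attempts=3):
    """Re-invoke the pipeline with the same run_id until it finishes."""
    last_error = None
    for _ in range(attempts):
        try:
            return store.run(run_id, pipeline)
        except Exception as exc:  # models a crash partway through the pipeline
            last_error = exc
    raise last_error


calls = {"research": 0, "writing": 0}


def flaky_pipeline(run):
    def research():
        calls["research"] += 1
        return "findings"

    def writing():
        calls["writing"] += 1
        if calls["writing"] == 1:
            raise RuntimeError("process died mid-crew")
        return "draft"

    findings = run.step("research-crew", research)
    draft = run.step("writing-crew", writing)
    return {"findings": findings, "draft": draft}


result = run_with_retries(ToyStore(), "run-1", flaky_pipeline)
```

In this model the research function executes once even though the pipeline is invoked twice, while the writing function, which failed before it could be checkpointed, executes again on the second invocation.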
Step granularity: one step per kickoff
A crew.kickoff() is the natural checkpoint unit — it’s expensive, nondeterministic, and runs to completion. Checkpoint its output (.raw, or .json_dict for structured outputs — pick something JSON-serializable, not the whole result object), and pass the crew’s inputs as the step input so a changed topic invalidates the stale checkpoint loudly instead of silently replaying.
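The following sketch shows one way a checkpoint layer could tie a saved output to its step input so that a changed input fails loudly. How Tidebase actually detects the mismatch is not described here; the SHA-256 fingerprint over canonical JSON, `StaleCheckpointError`, and `checkpointed_step` are all assumptions made for illustration.

```python
import hashlib
import json


class StaleCheckpointError(Exception):
    """Raised when a saved checkpoint was produced from a different input."""


def input_fingerprint(step_input):
    # Canonical JSON (sorted keys) so dict ordering does not change the hash.
    payload = json.dumps(step_input, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def checkpointed_step(checkpoints, name, fn, step_input):
    fingerprint = input_fingerprint(step_input)
    saved = checkpoints.get(name)
    if saved is not None:
        if saved["fingerprint"] != fingerprint:
            # Fail loudly rather than silently replaying stale output.
            raise StaleCheckpointError(f"input changed for step {name!r}")
        return saved["output"]
    output = fn()
    json.dumps(output)  # raises TypeError if the output is not JSON-serializable
    checkpoints[name] = {"fingerprint": fingerprint, "output": output}
    return output
```

The `json.dumps(output)` call is also why a string such as `.raw` or a dict such as `.json_dict` is a safer thing to checkpoint than the whole result object: an arbitrary object would fail serialization at this point.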
If the process dies mid-kickoff, that crew re-runs from the start on resume — there is no finished checkpoint to replay. CrewAI agents that use tools with external side effects (post to an API, send email) deserve the same treatment as the publish step: wrap the tool’s body in a run.step with side_effects and an idempotency_key, so a re-run crew can’t double-fire what already happened. The classification rules are in the replay contract.
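As a rough sketch, the body of a side-effecting tool could look like the function below. It reuses the `run.step` keyword arguments from the publish step above; `notify_subscriber`, the `send_email` callable, the `"email"` label, and the key format are hypothetical. How the tool gets hold of `run` (for example, a closure created inside the pipeline function) and how you register it with CrewAI are left to your setup.

```python
def notify_subscriber(run, send_email, recipient, subject, body):
    """Body of a hypothetical tool that sends one email per recipient per run."""
    return run.step(
        f"notify-{recipient}",
        lambda: send_email(recipient, subject, body),
        input={"recipient": recipient, "subject": subject},
        side_effects=["email"],
        # Scoped to the run and the recipient, so a re-run crew that calls the
        # tool again with the same recipient maps to the same key.
        idempotency_key=f"notify-{run.run_id}-{recipient}",
    )
```

The key is built only from values that stay the same across a re-run (the run ID and the recipient). Anything the agent might phrase differently on a second attempt, such as the body text, is deliberately left out of it.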
A durable gate between crews
The approve-publish gate parks the run in Postgres until a human decides — from Studio, a webhook channel into Slack, or your own UI — and the decision is exactly-once and recorded with the actor. This is the natural place for editorial review in content pipelines: the expensive crews are checkpointed behind you, so the run can wait hours without holding any compute. See human approval gates.
Recording cost per run
CrewAI exposes token usage after a kickoff via the crew’s usage metrics. Record it inside the step so a replay doesn’t double-count:
metrics = crew.usage_metrics
run.usage.record(
    kind="llm",
    provider="openai",
    label="research-crew",
    input_tokens=metrics.prompt_tokens,
    output_tokens=metrics.completion_tokens,
)
Across many runs this gives you per-pipeline cost as queryable Postgres rows — see tracking LLM token costs per run.
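One way to place the recording inside the step body is sketched below. It uses only the calls shown in this article (`kickoff(inputs=...)`, `.raw`, `crew.usage_metrics`, `run.usage.record`, `run.step`); the helper names `research_step_body` and `run_research` are hypothetical.

```python
def research_step_body(run, crew, topic):
    """Kick off the crew, record its token usage, and return the raw output."""
    result = crew.kickoff(inputs={"topic": topic})
    metrics = crew.usage_metrics  # read after kickoff has finished
    run.usage.record(
        kind="llm",
        provider="openai",
        label="research-crew",
        input_tokens=metrics.prompt_tokens,
        output_tokens=metrics.completion_tokens,
    )
    return result.raw


def run_research(run, crew, topic):
    # The recording lives inside the step body, so it only happens when the
    # body actually executes, not when a finished checkpoint is replayed.
    return run.step(
        "research-crew",
        lambda: research_step_body(run, crew, topic),
        input={"topic": topic},
    )
```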
What Tidebase does not do here
- It does not orchestrate the crew. Task order, delegation, and agent collaboration stay entirely in CrewAI; Tidebase checkpoints at the boundaries you choose.
- It does not replace CrewAI’s memory. Crew memory is conversational context; Tidebase is the durable run record around it.
- Alpha, opt-in auth. This is a self-hosted alpha; set TIDEBASE_API_KEY before exposing the server beyond localhost.
Repo: https://github.com/BlueprintLabIO/tidebase · See also: How to resume a failed AI agent run