# How to add durable checkpoints to Mastra agents

To make Mastra agent calls durable with Tidebase, wrap each `agent.generate()` call in a `run.step()` inside a Tidebase workflow. If the process dies mid-pipeline, re-invoking with the same `runId` replays completed steps from Postgres and continues from the first incomplete one — no agent call runs twice.

```typescript
import { Tidebase } from '@tidebase/sdk'
import { Agent } from '@mastra/core/agent'

const tide = new Tidebase()

const researcher = new Agent({
  id: 'researcher',
  name: 'researcher',
  instructions: 'Research the topic and return key findings as bullets.',
  model: 'openai/gpt-4o-mini', // any model id your Mastra version supports
})

const writer = new Agent({
  id: 'writer',
  name: 'writer',
  instructions: 'Turn findings into a short publishable post.',
  model: 'openai/gpt-4o-mini',
})

await tide.run('research-and-publish', { runId }, async (run, input: { topic: string }) => {
  const findings = await run.step('research', { input: { topic: input.topic } }, async () => {
    const res = await researcher.generate(`Research: ${input.topic}`)
    await run.usage.record({
      kind: 'llm',
      provider: 'openai',
      model: 'gpt-4o-mini',
      inputTokens: res.usage.inputTokens,
      outputTokens: res.usage.outputTokens,
    })
    return res.text
  })

  const draft = await run.step('draft', { input: { findings } }, async () => {
    const res = await writer.generate(`Write a post from these findings:\n${findings}`)
    return res.text
  })

  await run.state.set({ phase: 'awaiting-approval', draftPreview: draft.slice(0, 200) })

  const decision = await run.gate('approve-publish', {
    prompt: 'Publish this draft to the blog?',
    data: { draft },
  })
  if (decision.decision !== 'approved') return { published: false }

  await run.step(
    'publish',
    { input: { draft }, sideEffects: ['cms'], idempotencyKey: `publish-${runId}` },
    () => publishToCms(draft)
  )
  return { published: true }
})
```

Tidebase is an open-source checkpoint layer for AI agents: wrap your steps, and failed runs resume from the last safe point — in your own Postgres, without moving execution into a new runtime.

## Why one step per agent call

Each `agent.generate()` gets its own step, with the prompt material passed as the step's `input`. Tidebase hashes that input: on replay, a completed step returns its checkpointed text without calling the model again, and if the prompt changed since the checkpoint was written, replay fails loudly instead of silently reusing a stale answer. Recording usage *inside* the step means token costs land in the per-run ledger exactly once — replayed steps don't re-record.

## Tool-using agents

If a Mastra agent has tools that touch the outside world (send email, write to a CRM, charge a card), the step wrapping that `generate()` call has external side effects. Declare them and pin an `idempotencyKey`, as in the `publish` step above. Per the [replay contract](../replay-contract-is-it-safe-to-rerun.md), a failed step with undeclared side effects is parked for `manual_review` rather than blindly retried — which is what you want when the tool may have half-fired.

The `approve-publish` gate is a durable, exactly-once [human approval gate](../human-approval-gates-for-ai-agents.md): the run parks in Postgres until someone decides, surviving restarts and deploys in between.

## Resuming a failed run

Re-invoke the same workflow with the same `runId` — from a Tidebase queue, a recovery webhook, a cron, or a retry button. `research` and `draft` replay from checkpoints; execution continues at the gate or the publish step. The honest tradeoff: Tidebase does not execute your code. Something — Tidebase's queues, or your own infrastructure — must re-invoke the workflow function; Tidebase guarantees that doing so is safe and that completed steps never repeat.

## Do you even need this? Mastra has workflows

Be honest with yourself here. Mastra ships its own workflow engine with suspend/resume and pluggable storage. If your pipeline is a pure Mastra workflow, persisted with Mastra's storage, running entirely inside Mastra — that may be enough, and adding Tidebase would be a second source of truth you don't need.

Reach for Tidebase when:

- **You want the run record in your own Postgres** — run state, approval gates, the replay contract, and the token/cost ledger as queryable rows next to your app's data, not inside a framework's storage abstraction.
- **You're framework-agnostic** — the same checkpoint layer wraps Mastra agents today and plain SDK calls, LangChain chains, or hand-rolled steps tomorrow, with one operational surface.
- **The agent call is one step in a larger pipeline** — you're calling `agent.generate()` inside ingestion jobs, ETL, or API handlers that are not Mastra workflows, and you want durability around the whole pipeline, not just the agent part.

Repo: <https://github.com/BlueprintLabIO/tidebase> · See also: [How to resume a failed AI agent run](../how-to-resume-a-failed-ai-agent-run.md)