How to checkpoint Vercel AI SDK agents

To checkpoint a Vercel AI SDK agent with Tidebase, wrap each generateText/generateObject call and each tool execute body in run.step, passing the prompt or tool arguments as input. Re-invoking the workflow with the same runId after a crash replays completed model calls and tool calls from Postgres instead of re-running them.

import { Tidebase } from '@tidebase/sdk'
import { generateText, tool } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

const tide = new Tidebase() // reads TIDEBASE_URL, defaults to http://localhost:7373

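// Assumes runId (a stable ID you choose and reuse to resume) and searchApi (your own search client) are defined elsewhere.
await tide.run('research-agent', { runId }, async (run, input) => {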
  const search = tool({
    description: 'Search the web',
    inputSchema: z.object({ query: z.string() }),
    execute: ({ query }) =>
      run.step(`search:${query}`, { input: { query } }, () => searchApi(query)),
  })

  const result = await run.step(
    'generate:answer',
    { input: { prompt: input.prompt } },
    () =>
      generateText({
        model: openai('gpt-4.1-mini'),
        tools: { search },
        prompt: input.prompt,
      })
  )

  await run.usage.record({
    kind: 'llm',
    provider: 'openai',
    model: 'gpt-4.1-mini',
    inputTokens: result.usage.inputTokens,
    outputTokens: result.usage.outputTokens,
  })

  await run.state.patch({ progress: 1 })
  return result.text
})

Tidebase is an open-source checkpoint layer for AI agents: wrap your steps, and failed runs resume from the last safe point — in your own Postgres, without moving execution into a new runtime.

The honest tradeoff: Tidebase does not execute your code. Something — a Tidebase queue worker, a recovery webhook handler, your own cron or retry button — must re-invoke the workflow after a failure. And replaying a checkpointed model call returns the recorded completion from Postgres; it does not re-run the model. That is the point — you don’t pay for the same tokens twice — but it means a replayed step reflects what the model said the first time.

Wrap the model call, key it by the prompt

The generateText call is the expensive, nondeterministic unit, so it gets its own step. Pass the prompt (or messages array) as input: the step name plus an input hash identify the checkpoint, so if you change the prompt, the stale checkpoint is rejected and the model runs again instead of silently replaying an answer to a question you no longer asked.
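The lookup rule described above can be sketched as a toy in-memory model. This is not Tidebase's implementation: the `stableStringify`, `hashInput`, `save`, and `lookup` helpers, the choice of SHA-256, and the sorted-key serialization are all assumptions made for illustration.

```typescript
import { createHash } from 'node:crypto'

// Assumption: inputs are plain JSON-like values. Keys are sorted so that
// { a, b } and { b, a } produce the same hash.
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== 'object') return JSON.stringify(value)
  if (Array.isArray(value)) return `[${value.map((item) => stableStringify(item)).join(',')}]`
  const obj = value as Record<string, unknown>
  const entries = Object.keys(obj)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`)
  return `{${entries.join(',')}}`
}

function hashInput(input: unknown): string {
  return createHash('sha256').update(stableStringify(input)).digest('hex')
}

type Checkpoint = { inputHash: string; result: unknown }

// Hypothetical store: one checkpoint per step name.
const checkpoints = new Map<string, Checkpoint>()

function save(stepName: string, input: unknown, result: unknown): void {
  checkpoints.set(stepName, { inputHash: hashInput(input), result })
}

// A checkpoint is usable only when both the step name and the input hash match.
function lookup(stepName: string, input: unknown): Checkpoint | undefined {
  const found = checkpoints.get(stepName)
  return found !== undefined && found.inputHash === hashInput(input) ? found : undefined
}

save('generate:answer', { prompt: 'Summarize the Q3 report' }, 'Q3 summary text')

const replayed = lookup('generate:answer', { prompt: 'Summarize the Q3 report' })
const stale = lookup('generate:answer', { prompt: 'Summarize the Q4 report' })

console.log(replayed?.result) // Q3 summary text
console.log(stale)            // undefined
```

With the prompt unchanged the stored result is found; once the prompt is edited the hash no longer matches, so the sketch reports a miss and the caller would run the model again.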

Wrap tool `execute` bodies for external APIs

Tools that hit external APIs are where a crashed agent loop double-fires. Wrap the body of each execute in run.step with a content-keyed name (e.g. `search:${query}`) so each distinct call gets its own checkpoint. For writes, declare sideEffects and an idempotencyKey:

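// Defined inside the tide.run callback, so run and runId are in scope; ticketApi is your own client.
const createTicket = tool({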
  description: 'Create a support ticket',
  inputSchema: z.object({ title: z.string(), body: z.string() }),
  execute: ({ title, body }) =>
    run.step(
      `create-ticket:${title}`,
      {
        input: { title, body },
        sideEffects: ['ticketing-api'],
        idempotencyKey: `ticket-${runId}-${title}`,
        retries: 2,
      },
      () => ticketApi.create({ title, body })
    ),
})

A side-effecting step that fails without an idempotency key is classified for manual review rather than blindly retried — see the replay contract for the exact rules.
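As a rough illustration of that one rule, the decision might look like the sketch below. The `classifyFailedStep` function and its return values are hypothetical names, and the `'retry'` branch is only inferred from the sentence above; the replay contract remains the authority on the actual rules.

```typescript
// Only the two options relevant to the rule are modelled here.
type StepOptions = { sideEffects?: string[]; idempotencyKey?: string }
type FailureAction = 'retry' | 'manual-review'

// Hypothetical helper: encodes only the rule stated in the text.
function classifyFailedStep(options: StepOptions): FailureAction {
  const hasSideEffects = (options.sideEffects?.length ?? 0) > 0
  // A write with no idempotency key may already have happened, so retrying
  // could duplicate it; a person should look at it first.
  if (hasSideEffects && !options.idempotencyKey) return 'manual-review'
  // Assumption: everything else is treated as safe to retry.
  return 'retry'
}

console.log(classifyFailedStep({ sideEffects: ['ticketing-api'] }))
// manual-review
console.log(classifyFailedStep({ sideEffects: ['ticketing-api'], idempotencyKey: 'ticket-1' }))
// retry
```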

If the process dies inside a generateText call, that model call re-runs on resume: it never completed, so there is no checkpoint for it. Tool steps that did complete still replay by their content-keyed names. Because the model is nondeterministic, it may or may not issue the same tool calls again; calls with the same arguments are answered from the checkpoints without re-hitting the API, while calls with new arguments run as fresh steps.

Record usage per run

The AI SDK exposes token counts on every result as result.usage.inputTokens and result.usage.outputTokens. Record them after each model step with run.usage.record(...) — add costUsd if you price tokens yourself — and you get a per-run ledger you can query later. See tracking LLM token costs per run.
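If you do price tokens yourself, the arithmetic is small enough to keep in a helper. The sketch below is one way to derive a costUsd value from the two token counts; the `costUsd` function, the `Pricing` type, and the example prices are hypothetical, and the counts are treated as optional in case a provider does not report them.

```typescript
// Mirrors the two token fields named above.
type TokenUsage = { inputTokens?: number; outputTokens?: number }

// Hypothetical prices in USD per million tokens. Replace with your provider's rates.
type Pricing = { inputPerMillion: number; outputPerMillion: number }
const EXAMPLE_PRICING: Pricing = { inputPerMillion: 0.5, outputPerMillion: 2 }

function costUsd(usage: TokenUsage, pricing: Pricing = EXAMPLE_PRICING): number {
  // Missing counts are treated as zero rather than producing NaN.
  const input = (usage.inputTokens ?? 0) * pricing.inputPerMillion
  const output = (usage.outputTokens ?? 0) * pricing.outputPerMillion
  return (input + output) / 1_000_000
}

const cost = costUsd({ inputTokens: 2_000, outputTokens: 500 })
console.log(cost) // 0.002
```

The result could then be passed as costUsd alongside the token counts when recording usage for the step.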

Resume after a crash

Re-invoke the same workflow with the same runId. The generate:answer step and every completed search:* and create-ticket:* step return their checkpointed results; execution continues at the first incomplete step. Who does the re-invoking — Tidebase queues, recovery webhooks, or your own infra — is covered in how to resume a failed AI agent run.
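The resume behaviour can be pictured with a toy in-memory model. Nothing here is the Tidebase API: the `step` helper, the `Store` type, and the simulated crash are stand-ins, and a `Map` plays the role of the Postgres table that outlives the process.

```typescript
type Store = Map<string, unknown>

// Toy stand-in for run.step: return the stored result if present, else run and store.
async function step<T>(store: Store, name: string, fn: () => T | Promise<T>): Promise<T> {
  if (store.has(name)) return store.get(name) as T
  const result = await fn()
  store.set(name, result)
  return result
}

// `calls` records which step bodies actually executed.
async function workflow(store: Store, calls: string[], crashBeforeAnswer: boolean): Promise<string> {
  const docs = await step(store, 'search:tidebase', () => {
    calls.push('search')
    return ['doc-1', 'doc-2']
  })
  if (crashBeforeAnswer) throw new Error('simulated crash')
  return step(store, 'generate:answer', () => {
    calls.push('generate')
    return `answer based on ${docs.length} docs`
  })
}

async function demo(): Promise<{ answer: string; calls: string[] }> {
  const store: Store = new Map() // survives the "crash"
  const calls: string[] = []
  await workflow(store, calls, true).catch(() => undefined) // first attempt fails
  const answer = await workflow(store, calls, false) // re-invoke with the same store
  return { answer, calls }
}

demo().then(({ answer, calls }) => {
  console.log(answer) // answer based on 2 docs
  console.log(calls) // [ 'search', 'generate' ]
})
```

The search body runs once even though the workflow is invoked twice: the second invocation finds the stored search result and continues at the first incomplete step.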

What about streaming?

Checkpoint the completed result, not the stream. With streamText, await the final text and usage inside the step so the checkpoint is written once the stream finishes; on replay, the recorded completion comes back all at once rather than token by token. Partial streams are never checkpointed: if the process dies mid-stream, that model call re-runs.
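A small simulation makes the ordering concrete. It uses an async generator as a stand-in for a provider stream and a `Map` as a stand-in for the checkpoint store; `step`, `fakeStream`, and `drain` are hypothetical helpers, not AI SDK or Tidebase functions.

```typescript
// In-memory stand-in for a checkpoint store.
const completed = new Map<string, string>()

// Toy step: the result is stored only after fn resolves.
async function step(name: string, fn: () => Promise<string>): Promise<string> {
  const cached = completed.get(name)
  if (cached !== undefined) return cached
  const result = await fn()
  completed.set(name, result)
  return result
}

// Stand-in for a provider stream; optionally fails after `failAfter` chunks.
async function* fakeStream(chunks: string[], failAfter = Infinity): AsyncGenerator<string> {
  let sent = 0
  for (const chunk of chunks) {
    if (sent >= failAfter) throw new Error('connection lost mid-stream')
    yield chunk
    sent += 1
  }
}

// Forward chunks to the caller as they arrive, but return only the full text.
async function drain(stream: AsyncIterable<string>, onChunk: (chunk: string) => void): Promise<string> {
  let text = ''
  for await (const chunk of stream) {
    onChunk(chunk)
    text += chunk
  }
  return text
}

async function streamingDemo(): Promise<{ answer: string; shown: string[]; storedAfterCrash: boolean }> {
  const shown: string[] = []
  const show = (chunk: string): void => {
    shown.push(chunk)
  }
  const chunks = ['Hel', 'lo ', 'world']
  // First attempt dies after two chunks, so nothing is stored.
  await step('generate:answer', () => drain(fakeStream(chunks, 2), show)).catch(() => undefined)
  const storedAfterCrash = completed.has('generate:answer')
  // Second attempt finishes, so the full text is stored.
  const answer = await step('generate:answer', () => drain(fakeStream(chunks), show))
  return { answer, shown, storedAfterCrash }
}

streamingDemo().then(({ answer, storedAfterCrash }) => {
  console.log(storedAfterCrash) // false
  console.log(answer) // Hello world
})
```

Chunks still reach the caller live through the callback, but only the fully drained text becomes the step result, which is why a failure mid-stream leaves nothing to replay.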

What Tidebase does not do here

No LLM proxying. Your generateText calls go straight to the provider with your keys; Tidebase only stores step results and usage you record.
No agent loop ownership. The AI SDK still drives tool calling and multi-step reasoning; Tidebase checkpoints around it.
Alpha, opt-in auth. Tidebase is a self-hosted alpha; turn auth on before exposing it beyond localhost.

Repo: https://github.com/BlueprintLabIO/tidebase · See also: How to resume a failed AI agent run