# Pathrule Pattern: AI SDK (Vercel AI SDK) (1.0.0)
# ::pathrule:package:ai-sdk

### [RULE] Call models server-side only; never expose provider keys  (path: /src/ai)
<!-- scope: folder | priority: high | strict -->

A model call carries a secret. If it runs in the browser, the key ships in the bundle, and anyone who extracts it can spend on your account.

- Run `streamText`, `generateText`, `generateObject`, and `embed` only in server code: a route handler (`app/api/.../route.ts`), a server action, or a backend service. Never import a provider or call the SDK from a client component.
- Read keys (`AI_GATEWAY_API_KEY`, `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`) from server-side env. Do not prefix any of them with `NEXT_PUBLIC_` / `VITE_` / `PUBLIC_`; that publishes the secret to the browser.
- The client talks to your own endpoint, not to the provider. The browser sends messages to `/api/chat`; your handler holds the key and streams tokens back.
- Enforce auth, rate limiting, and per-user budget on that endpoint before you call the model. An unauthenticated streaming endpoint is an open invoice.
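
The gate in the last bullet can be sketched as a pure function that the route handler calls before any model call. This is a minimal sketch under stated assumptions: `gateModelCall`, the in-memory `budgets` map, and the default token limit are hypothetical names chosen for illustration, not part of the AI SDK, and a real deployment would likely keep budgets in a shared store.

```typescript
// Hypothetical pre-call gate: none of these names come from the AI SDK.
type GateResult =
  | { ok: true }
  | { ok: false; status: 401 | 429; reason: string };

interface UserBudget {
  tokensUsed: number;
  tokenLimit: number;
}

// Assumption: in-memory store for illustration only.
const budgets = new Map<string, UserBudget>();

function gateModelCall(
  userId: string | null,
  estimatedTokens: number,
  defaultLimit: number = 10_000,
): GateResult {
  // Reject unauthenticated callers before any spend is possible.
  if (userId === null) {
    return { ok: false, status: 401, reason: 'unauthenticated' };
  }
  const budget = budgets.get(userId) ?? { tokensUsed: 0, tokenLimit: defaultLimit };
  // Refuse the call if it would push the user past their budget.
  if (budget.tokensUsed + estimatedTokens > budget.tokenLimit) {
    return { ok: false, status: 429, reason: 'budget exceeded' };
  }
  // Reserve the estimate up front so later requests see the spend.
  budget.tokensUsed += estimatedTokens;
  budgets.set(userId, budget);
  return { ok: true };
}
```

A handler would return the `status` from a failed gate as the HTTP response and only reach the model call on `{ ok: true }`.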

---

### [RULE] Stream user-facing responses; do not block on the full generation  (path: /app/api/chat)
<!-- scope: folder | priority: high | advisory -->

Latency to first token is the experience. A blocking call makes a fast model feel slow.

- For any response a user watches appear, use `streamText` and return `result.toUIMessageStreamResponse()` from the route handler. On the client, render it with `useChat` from `@ai-sdk/react`.
- Use `generateText` / `generateObject` only for server steps where you need the complete result before doing anything else (a classification, an extraction feeding the next step, a cron job). Do not `await generateText` to produce a chat reply.
- Propagate cancellation: pass the request's `AbortSignal` into the call so a user who navigates away stops the generation and the billing.
- Always render or handle the error part of the stream. A silent failed stream looks like a hang; surface it.
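
One way to make the last bullet concrete is to fold stream parts into view state with an explicit failure branch. This is a sketch only: the `StreamPart` shapes and the `ChatView` state are simplified stand-ins chosen for illustration, not the SDK's actual part types. With `useChat`, the equivalent signal would come from the error state the hook exposes.

```typescript
// Simplified stand-in part shapes (assumption); the SDK's real stream parts differ.
type StreamPart =
  | { type: 'text-delta'; text: string }
  | { type: 'error'; message: string }
  | { type: 'finish' };

interface ChatView {
  text: string;
  status: 'streaming' | 'done' | 'failed';
  errorMessage?: string;
}

const initialView: ChatView = { text: '', status: 'streaming' };

// Fold one part into the view. Once failed, the view stays failed.
function applyPart(view: ChatView, part: StreamPart): ChatView {
  if (view.status === 'failed') return view;
  switch (part.type) {
    case 'text-delta':
      return { ...view, text: view.text + part.text };
    case 'error':
      // Keep the partial text, but surface the failure instead of hanging.
      return { ...view, status: 'failed', errorMessage: part.message };
    case 'finish':
      return { ...view, status: 'done' };
  }
}
```

The UI can then render `errorMessage` whenever `status` is `'failed'`, so a broken stream never looks like a stalled one.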

---

### [RULE] Schema-validate every tool input and structured output  (path: /src/ai)
<!-- scope: folder | priority: high | strict -->

The model emits text. A schema is the only thing that turns that text into data you can trust.

- Define each tool with `tool({ description, inputSchema: z.object({...}), execute })`. The description is the model's documentation for when to call it; write it for the model. The schema is validated before `execute` runs, so the handler receives typed, checked arguments.
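- For structured extraction, use `generateObject({ schema, ... })` (or `streamObject`) rather than asking for JSON in the prompt and `JSON.parse`-ing the reply. The SDK validates the result against the schema and throws (`NoObjectGeneratedError`) when the output does not conform, so a malformed reply fails loudly instead of flowing downstream; catch that error and decide explicitly whether to retry.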
- Keep schemas tight: enums over free strings, `.describe()` on fields the model gets wrong, required fields required. A loose schema lets bad data through.
- A tool's `execute` is real code with real side effects. Validate authorization inside it; a model deciding to call a tool is not authorization to perform the action.
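
The last bullet can be sketched as an `execute` body that checks the caller before acting. This is a minimal sketch: `ToolContext`, `Order`, and `executeRefund` are hypothetical names, the context is assumed to be captured by the handler that builds the tool, and the function is synchronous only to keep the example short.

```typescript
// Hypothetical types for illustration; not part of the AI SDK.
interface ToolContext {
  userId: string;
  roles: ReadonlyArray<'customer' | 'support'>;
}

interface Order {
  id: string;
  ownerId: string;
  refunded: boolean;
}

type RefundResult =
  | { ok: true; orderId: string }
  | { ok: false; error: 'not_found' | 'forbidden' | 'already_refunded' };

// The model chooses the orderId; the server decides whether this caller may refund it.
function executeRefund(
  ctx: ToolContext,
  orders: Map<string, Order>,
  input: { orderId: string },
): RefundResult {
  const order = orders.get(input.orderId);
  if (!order) return { ok: false, error: 'not_found' };
  // Authorization lives here, not in the prompt or the tool description.
  const isSupport = ctx.roles.some((role) => role === 'support');
  if (order.ownerId !== ctx.userId && !isSupport) {
    return { ok: false, error: 'forbidden' };
  }
  if (order.refunded) return { ok: false, error: 'already_refunded' };
  order.refunded = true;
  return { ok: true, orderId: order.id };
}
```

Returning a typed refusal rather than throwing is one design choice among several; it lets the model see why the action did not happen without giving it a way around the check.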

---

### [MEMORY] Route models through the AI Gateway as provider/model strings  (path: /src/ai)

Models are configuration, not code. We keep the call site provider-agnostic so swapping or falling back is a string change, not a refactor.

- Pass the model as a string, e.g. `streamText({ model: 'anthropic/claude-sonnet-4.6', ... })`. The AI Gateway resolves it; no `@ai-sdk/anthropic` import at the call site for the default path. Reach for a provider package only when you explicitly need direct provider wiring the gateway does not expose.
- The gateway gives one API key, unified billing, request observability, and model fallbacks. Configure a fallback chain so a provider outage degrades to another model instead of erroring.
- Centralize model IDs in one module (e.g. `src/ai/models.ts`) as named constants - `CHAT_MODEL`, `FAST_MODEL`, `EMBEDDING_MODEL` - so prompts and routes never hardcode a raw string and a model upgrade is one edit.
- Pick the tier by task: a small fast model for classification and routing, a frontier model for reasoning and agent loops. Do not default everything to the most expensive model.
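
A central models module along the lines of the last two bullets might look like the sketch below. Only the `CHAT_MODEL` ID appears in this document; the other two IDs are deliberate placeholders to replace with real gateway model IDs, and `modelForTask` with its `Task` type is a hypothetical helper.

```typescript
// src/ai/models.ts (sketch). Only CHAT_MODEL's ID comes from the text above;
// the other two IDs are placeholders, not real model names.
export const CHAT_MODEL = 'anthropic/claude-sonnet-4.6';
export const FAST_MODEL = 'provider/fast-model-placeholder';
export const EMBEDDING_MODEL = 'provider/embedding-model-placeholder';

// Hypothetical task categories for tier selection.
export type Task = 'classification' | 'routing' | 'reasoning' | 'agent';

// Small fast model for cheap decisions, frontier model for reasoning and agent loops.
export function modelForTask(task: Task): string {
  switch (task) {
    case 'classification':
    case 'routing':
      return FAST_MODEL;
    case 'reasoning':
    case 'agent':
      return CHAT_MODEL;
  }
}
```

Call sites then pass `model: modelForTask('routing')` or a named constant, so a model upgrade touches this file only.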

See /src/ai for the agent loop memory and the structured-output rule.

---

### [MEMORY] Build agents with the tool loop, bounded by stopWhen  (path: /src/ai)

An "agent" in the AI SDK is not a special class. It is a normal generation given tools and allowed to take multiple steps: the model calls a tool, the SDK runs `execute`, feeds the result back, and the model decides the next step.

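- Enable multi-step by setting a stop condition: `stopWhen: stepCountIs(5)` (AI SDK v5+ replaced the older `maxSteps` number with composable `stopWhen` conditions). Always include a step-count bound among the conditions; a loop that stops only on a model-chosen event (such as a specific tool call) lets a confused model run until it burns the budget.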
- Each step is billed. More tools and more steps cost more tokens and more latency; give the model the fewest tools that cover the task and the smallest step budget that completes it.
- Make tools idempotent and side-effect-honest so a retried or repeated step is a no-op instead of a double action (charging twice, sending two emails). See the auth/billing patterns for the idempotency discipline.
- Inspect `steps` in the result (or stream parts) when debugging: it shows which tools were called with which arguments, which is where most agent bugs actually live.
- For durable, resumable agents that survive a crash mid-loop, run the loop inside a workflow/queue rather than a single request.
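
The idempotency bullet can be sketched as a wrapper that records results by key, so a repeated step returns the first result instead of acting again. `makeIdempotent`, the key function, and the in-memory cache are hypothetical and not an AI SDK API; the sketch is synchronous for brevity, and a durable agent would need the record in persistent storage.

```typescript
// Hypothetical wrapper; the cache is in-memory for illustration only.
function makeIdempotent<I, O>(
  keyOf: (input: I) => string,
  action: (input: I) => O,
): (input: I) => O {
  const completed = new Map<string, O>();
  return (input: I): O => {
    const key = keyOf(input);
    if (completed.has(key)) {
      // Repeated step: return the recorded result without re-running the side effect.
      return completed.get(key) as O;
    }
    const result = action(input);
    completed.set(key, result);
    return result;
  };
}

// Example side effect: count how many receipts are actually sent.
let emailsSent = 0;
const sendReceipt = makeIdempotent(
  (input: { orderId: string }) => `receipt:${input.orderId}`,
  (input: { orderId: string }) => {
    emailsSent += 1;
    return { sent: true, orderId: input.orderId };
  },
);
```

If the model calls `sendReceipt` twice for the same order within one loop, the second call reuses the recorded result and `emailsSent` does not increase.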

See /src/ai for the gateway routing memory and the schema-validation rule.

---

### [SKILL] ai-sdk-chat-route-review  (path: /)

---
name: ai-sdk-chat-route-review
description: Review checklist for Vercel AI SDK chat and agent endpoints. Run before merging any route handler, tool, or useChat client that calls a model.
---

# AI SDK chat/agent route review

- [ ] The model call runs on the server only; no provider import or key in client code, and no `NEXT_PUBLIC_`/`PUBLIC_`/`VITE_` prefix on any provider or gateway key.
- [ ] The endpoint enforces auth and rate limiting before calling the model; per-user budget is bounded.
- [ ] User-facing replies use `streamText` + `toUIMessageStreamResponse()` and `useChat`; `generateText`/`generateObject` are only used for fully-consumed backend steps.
- [ ] The request `AbortSignal` is passed through so navigation/cancel stops generation and billing.
- [ ] Every tool declares `description` + `inputSchema` (Zod); `generateObject`/`streamObject` declares a `schema`. No hand-parsed model JSON.
- [ ] Each tool's `execute` re-checks authorization for its action; a tool call is not authorization.
- [ ] Agent loops set `stopWhen: stepCountIs(n)` (or equivalent); tools given are the minimum needed and are idempotent.
- [ ] Model is a `provider/model` string via the gateway, sourced from a central models module; tier matches the task (cheap for routing, frontier for reasoning).
- [ ] The error part of the stream is rendered/handled; a failed stream is not silent.
- [ ] No PII or secrets are logged in prompts/traces beyond what telemetry policy allows.
