
OpenAI-Compatible API

Lattice Cloud accepts the same request shapes your OpenAI, Anthropic, or Groq client already speaks. Swap the base URL and the API key, point the model field at a catalog entry, and your existing code keeps working. A request can route across Lovelace-hosted capacity, marketplace providers, or a user's paired personal runtime.

```ts
// Before: OpenAI
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Summarize this." }],
});
```

```ts
// After: Lattice Cloud (same SDK; only the key, base URL, and model name change)
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.LOVELACE_KEY,
  baseURL: "https://lattice.uselovelace.com/v1",
});

const response = await client.chat.completions.create({
  model: "lovelace:llama-3.2-3b-instruct",
  messages: [{ role: "user", content: "Summarize this." }],
});
```
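
Because only the key, base URL, and model name differ, the switch can live in configuration. The following is a minimal sketch, not part of any SDK: the `USE_LATTICE` flag and the `clientConfig` helper are hypothetical names introduced for illustration, while the key variables, base URL, and model names are the ones used in the examples above.

```ts
// Hypothetical helper: choose client settings from environment variables.
// USE_LATTICE is an assumed flag name, not something Lattice Cloud defines.
interface ClientConfig {
  apiKey: string | undefined;
  baseURL?: string;
  model: string;
}

type Env = Record<string, string | undefined>;

function clientConfig(env: Env): ClientConfig {
  if (env.USE_LATTICE === "1") {
    return {
      apiKey: env.LOVELACE_KEY,
      baseURL: "https://lattice.uselovelace.com/v1",
      model: "lovelace:llama-3.2-3b-instruct",
    };
  }
  // No baseURL here: the OpenAI SDK then uses its default endpoint.
  return { apiKey: env.OPENAI_API_KEY, model: "gpt-4o-mini" };
}

// Usage with the OpenAI SDK (not executed here):
//   const { model, ...options } = clientConfig(process.env);
//   const client = new OpenAI(options);
//   await client.chat.completions.create({ model, messages });
```

Keeping the model name next to the endpoint settings avoids sending a `lovelace:*` name to OpenAI or a vendor model name to Lattice Cloud by accident.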

Endpoints we mirror

| OpenAI endpoint | Lattice Cloud alias | What runs |
| --- | --- | --- |
| POST /v1/chat/completions | POST /v1/chat/completions | Any text-generation canonical model |
| POST /v1/embeddings | POST /v1/embeddings | Any text-embeddings canonical model |
| POST /v1/audio/transcriptions | POST /v1/audio/transcriptions | Whisper family, Parakeet, Voxtral |
| Streaming (SSE) | Same | Identical chunk shape |

These aliases are thin wrappers around a single generic POST /v1/tasks/:task dispatch. Adding a new task class, such as realtime sessions, image understanding, or image generation, therefore takes a catalog write rather than a gateway redeploy. Apps that only need the OpenAI surface never see that layer.
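
To make the alias layer concrete, here is a sketch of how a lookup table could map mirrored paths onto the generic dispatch. It is purely illustrative: the task names (`text-generation`, `text-embeddings`, `audio-transcription`, `image-generation`) and the `resolveTaskPath` helper are assumptions made for this example, and the actual catalog schema is not described on this page.

```ts
// Illustrative only: the real alias table lives in the Lattice catalog.
// The task names below are assumptions made for this sketch.
const aliasToTask = new Map<string, string>([
  ["/v1/chat/completions", "text-generation"],
  ["/v1/embeddings", "text-embeddings"],
  ["/v1/audio/transcriptions", "audio-transcription"],
]);

// Resolve a mirrored OpenAI path to the generic dispatch path,
// or return null when no alias is registered for it.
function resolveTaskPath(
  aliasPath: string,
  table: ReadonlyMap<string, string> = aliasToTask,
): string | null {
  const task = table.get(aliasPath);
  return task === undefined ? null : `/v1/tasks/${task}`;
}

// Adding a task class is a data change: extend the table, not the resolver.
const extended = new Map(aliasToTask);
extended.set("/v1/images/generations", "image-generation");
```

The point of the sketch is that the resolver never changes when a task class is added; only the table does, which is what makes a catalog write sufficient.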

What changes when you swap the base URL

Nothing your app needs to notice: the JSON, the streaming shape, and the tool-call format are the same, and error codes are the same wherever Lattice can preserve them. The canonical model name is the only field the SDK sees that differs, because lovelace:* points at a catalog entry rather than a specific vendor's model.
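
Since the streamed chunk shape matches OpenAI's, stream-handling code written for OpenAI can be reused as-is. A minimal sketch of accumulating streamed text follows; the `ChatChunk` interface declares only the fields this example reads, and `appendDelta` and `collectText` are hypothetical helper names.

```ts
// Only the fields this sketch reads from an OpenAI-style streaming chunk.
interface ChatChunk {
  choices: Array<{ delta: { content?: string | null } }>;
}

// Append the text carried by one chunk; chunks without text add nothing.
function appendDelta(text: string, chunk: ChatChunk): string {
  const piece = chunk.choices[0]?.delta.content ?? "";
  return text + piece;
}

// Drain any async stream of chunks into the full response text.
async function collectText(stream: AsyncIterable<ChatChunk>): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text = appendDelta(text, chunk);
  }
  return text;
}

// Usage with the OpenAI SDK (not executed here):
//   const stream = await client.chat.completions.create({
//     model: "lovelace:llama-3.2-3b-instruct",
//     messages,
//     stream: true,
//   });
//   const text = await collectText(stream);
```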

Under the hood: the gateway picks a provider for the request (a peer device in your workspace, or a marketplace provider if you've opted in), mints the dispatch, streams the response back, meters the usage, and bills you on the same Stripe invoice. Your SDK doesn't see any of this.

Why drop-in matters

Every backend engineer has reflexes for the OpenAI SDK: pip install openai, const client = new OpenAI(...), messages: [...], streaming iterators. Asking someone to learn a new client just to evaluate a distributed-compute platform is asking them to do homework before they can even try it. We don't.

Once you've confirmed it works with the SDK you already have, the Lovelace SDK (@lovelace-ai/compute-client) exposes typed helpers for catalog browsing, structured errors, idempotency, and personal runtime selection. Adopt it when you need those; until then, the OpenAI client works fine. The SDK quickstart shows the typed path, and personal runtimes explains how a lattice:personal:* model maps to a paired local daemon.

Authentication

Two modes, picked by the shape of your bearer token:

  • Developer API key (sk_…): you pay for the compute, and your users see nothing. Usage is billed to your Lovelace account on your monthly invoice.
  • Sign in with Lovelace OAuth (eyJ…): your user authorizes your app once, and their compute pool serves their requests. Prompt bytes never transit our gateway on workspace-direct calls. See the Lattice Cloud SDK quickstart for the OAuth flow.
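
If your backend handles both kinds of token, it can be useful to tell them apart before sending a request, for example to decide whose account will be billed. The sketch below is a client-side illustration based only on the two prefixes listed above; `authModeFor` is a hypothetical helper, and the gateway's actual check is not documented here.

```ts
type AuthMode = "developer-key" | "oauth" | "unknown";

// Client-side illustration only: classify a bearer token by its prefix.
// This is an assumption about usage, not the gateway's real validation.
function authModeFor(bearerToken: string): AuthMode {
  if (bearerToken.startsWith("sk_")) return "developer-key";
  // "eyJ" is how a base64url-encoded JSON object begins, as in a JWT.
  if (bearerToken.startsWith("eyJ")) return "oauth";
  return "unknown";
}
```

A prefix check says nothing about whether a token is valid or unexpired; only the gateway can decide that.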

Personal runtime model IDs

When you want a request to run on a user's paired local daemon, use the personal runtime model selector:

```ts
const response = await client.chat.completions.create({
  model: "lattice:personal:lat_abc123",
  messages: [{ role: "user", content: "Run this on my local model." }],
});
```

The lat_abc123 value comes from the personal runtime resource your backend created or discovered. The gateway still receives an ordinary /v1/chat/completions request, but the selected runtime controls where the model work runs.
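
Building and inspecting these selectors by hand is error-prone, so a small helper can keep the prefixes in one place. This is a sketch under the assumption that model IDs use exactly the two prefixes shown on this page (`lovelace:` and `lattice:personal:`); `personalModel`, `parseModel`, and `ModelTarget` are hypothetical names, not part of @lovelace-ai/compute-client.

```ts
// Hypothetical helpers; the prefixes are the ones shown on this page.
const PERSONAL_PREFIX = "lattice:personal:";
const CATALOG_PREFIX = "lovelace:";

type ModelTarget =
  | { kind: "personal"; runtimeId: string }
  | { kind: "catalog"; name: string }
  | { kind: "other"; raw: string };

// Build the selector for a paired personal runtime.
function personalModel(runtimeId: string): string {
  return `${PERSONAL_PREFIX}${runtimeId}`;
}

// Classify a model ID so calling code can branch on where it will run.
function parseModel(model: string): ModelTarget {
  if (model.startsWith(PERSONAL_PREFIX)) {
    return { kind: "personal", runtimeId: model.slice(PERSONAL_PREFIX.length) };
  }
  if (model.startsWith(CATALOG_PREFIX)) {
    return { kind: "catalog", name: model.slice(CATALOG_PREFIX.length) };
  }
  return { kind: "other", raw: model };
}
```

For example, `personalModel("lat_abc123")` produces the `lattice:personal:lat_abc123` string used in the request above.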

Next steps

  • Grab a key at developers.uselovelace.com/api-keys and point your existing SDK at the base URL above.
  • Pair a local daemon with personal runtimes when the user's own machine should serve the request.
  • When you outgrow the OpenAI surface, switch to @lovelace-ai/compute-client for typed runtime selection and structured gateway errors.
  • Use the sample app to verify token-provider, runtime-selection, revocation, and private-only failure handling.
  • Use the local Lattice docs when you are configuring the daemon that serves a personal runtime.