Lattice Cloud

Lattice Cloud is the hosted gateway for AI compute on Lovelace. Your app sends a normal authenticated request to one API, and the gateway decides whether that request should run on Lovelace-hosted capacity, marketplace providers, or a specific personal Lattice runtime owned by the user.

If you are building a third-party app, start here rather than the local Lattice daemon docs. The local daemon is the runtime that can serve work on a user's machine. Lattice Cloud is the public gateway your app calls from a backend, worker, or other trusted service boundary.

The most important idea is that Lattice Cloud is a control plane. It authenticates the caller, checks scopes, chooses a runtime, meters the request, and returns the response. When you route to a personal runtime, the model work still runs on the user's machine through their outbound Lattice daemon.

What You Can Build

Use Lattice Cloud when you want a single compute API that can serve several deployment models without rewriting your application.

| If you want to... | Start with |
| --- | --- |
| Build a user-authorized third-party app | Quickstart |
| Copy the smallest checked backend integration shape | Sample app |
| Keep your existing OpenAI SDK | OpenAI-compatible API |
| Route to a user's Mac, PC, or workstation | Personal runtimes |
| Understand marketplace and workspace provider routing | Lattice Cloud provider routing |
| Understand provider device trust | Provider attestation |

Choose The Right Integration Path

Most apps choose one of these paths:

| Path | Credential | Best for | First doc |
| --- | --- | --- | --- |
| Typed Lovelace SDK | OAuth token or developer key | Runtime discovery, structured errors, retries | Quickstart |
| Existing OpenAI-compatible SDK | OAuth token or developer key | Minimal migration from an OpenAI client | OpenAI-compatible API |
| Local daemon and CLI | Local machine configuration | Personal workflows on the developer's machine | Lattice quickstart |
| Runtime setup and bootstrap tooling | Developer key | Pairing a user's daemon with the gateway | Personal runtimes |

Use the TypeScript SDK when you want the gateway concepts to be explicit in your code:

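```ts
import { ComputeClient } from "@lovelace-ai/compute-client";

// accessToken is the OAuth access token obtained through Sign in with Lovelace.
const compute = new ComputeClient({
  credential: { kind: "oauth", accessToken },
});

// Take the first personal runtime that is currently reachable, if any.
const [runtime] = await compute.listReachablePersonalRuntimes();

if (runtime !== undefined) {
  // Route the chat request to that runtime through the gateway.
  await compute.chatCompleteForPersonalRuntime(runtime, {
    messages: [{ role: "user", content: "Summarize this file." }],
  });
}
```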

The SDK owns HTTP request construction and structured gateway errors. Shared protocol types come from the compute packages, so app docs should link to the SDK instead of redefining those types inline.

Personal Runtime Mental Model

A personal runtime is a named local Lattice daemon that a Lovelace account has paired with the hosted gateway. It lets a developer build an app that feels like a normal cloud integration while the actual model inference runs on the user's machine.

The flow has four moving parts:

  1. Runtime record: Your backend creates lat_..., an account-owned record that represents one user machine.
  2. Bootstrap token: The gateway mints a short-lived one-time token for that runtime.
  3. Local daemon: The user runs lattice-ctl relay connect, which redeems the bootstrap token and starts outbound polling.
  4. Model-compatible request: Your app sends model: "lattice:personal:<latticeId>" to the gateway.

There is no inbound port on the user's machine. The daemon polls the relay outbound, receives sealed work, runs the configured local model or assistant runtime, and sends the result back through the gateway response.
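
The last step of that flow can be sketched as a small helper that builds the model string and an OpenAI-compatible request body. This is a minimal illustration, not the SDK's implementation: `personalRuntimeModel`, `buildPersonalChatRequest`, and the `lat_EXAMPLE` placeholder id are hypothetical, and the sketch assumes the `<latticeId>` segment is the runtime record's `lat_...` identifier.

```ts
// Hypothetical helpers; the names are illustrative and not part of the SDK.
const PERSONAL_MODEL_PREFIX = "lattice:personal:";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatCompletionRequest {
  model: string;
  messages: ChatMessage[];
}

// Builds the model string from step 4: "lattice:personal:<latticeId>".
function personalRuntimeModel(latticeId: string): string {
  if (latticeId.trim() === "") {
    throw new Error("latticeId must not be empty");
  }
  return `${PERSONAL_MODEL_PREFIX}${latticeId}`;
}

// Builds a chat request body that targets one personal runtime.
function buildPersonalChatRequest(
  latticeId: string,
  messages: ChatMessage[],
): ChatCompletionRequest {
  return { model: personalRuntimeModel(latticeId), messages };
}

// "lat_EXAMPLE" is a placeholder, not a real runtime id.
const exampleRequest = buildPersonalChatRequest("lat_EXAMPLE", [
  { role: "user", content: "Summarize this file." },
]);
```

Keeping the prefix in one helper means a typo in the model string fails in one place instead of surfacing as a routing error at the gateway.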

Authentication Modes

Lattice Cloud supports two integration modes:

| Mode | Credential | Who controls the compute |
| --- | --- | --- |
| Developer-paid | Developer API key from API keys | Your Lovelace developer account |
| User-authorized | Sign in with Lovelace OAuth access token | The signed-in user's authorized account |

Most server-side app backends start with a developer API key. Browser apps should call your backend rather than sending a developer key to the gateway directly. Use Sign in with Lovelace when the end user should authorize and pay for their own compute pool.

User-authorized apps can list the user's readable personal runtimes and target a reachable runtime with the same public SDK helpers used by developer-key traffic. Runtime management is separate: ordinary apps should not request personal-runtime management authority unless they are deliberately acting as a trusted setup surface.
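
One way to encode the choice between the two modes in a backend is a small selection function. The sketch below rests on several assumptions: the `AppCredential` type, the `developerKey` kind, the `CredentialInputs` shape, and `chooseCredential` are all hypothetical, and preferring the user's token whenever one exists is an illustrative policy rather than a gateway rule. Only the `{ kind: "oauth", accessToken }` shape appears in the SDK example on this page.

```ts
// Hypothetical credential union. Only the "oauth" shape is taken from the
// SDK example on this page; the "developerKey" kind is an assumption.
type AppCredential =
  | { kind: "oauth"; accessToken: string }
  | { kind: "developerKey"; apiKey: string };

interface CredentialInputs {
  // Present when the end user completed Sign in with Lovelace.
  userAccessToken?: string;
  // Present only in trusted server-side environments.
  developerApiKey?: string;
  // True when this code runs in a browser or another untrusted client.
  runningInBrowser: boolean;
}

function chooseCredential(inputs: CredentialInputs): AppCredential {
  // Illustrative policy: use the user-authorized mode when a user token exists.
  if (inputs.userAccessToken !== undefined) {
    return { kind: "oauth", accessToken: inputs.userAccessToken };
  }
  // Developer keys must never be used from a browser; call the backend instead.
  if (inputs.runningInBrowser) {
    throw new Error("Developer keys must stay on the backend");
  }
  if (inputs.developerApiKey === undefined) {
    throw new Error("No credential available");
  }
  return { kind: "developerKey", apiKey: inputs.developerApiKey };
}
```

Failing loudly in the browser case makes an accidental client-side use of a developer key a visible error instead of a leaked secret.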

REST Surface

Personal runtimes use REST resources:

| Endpoint | Purpose |
| --- | --- |
| POST /v1/personal-runtimes | Create an unpaired runtime record |
| GET /v1/personal-runtimes | List runtime records with live reachability |
| GET /v1/personal-runtimes/{latticeId} | Read one caller-owned runtime |
| DELETE /v1/personal-runtimes/{latticeId} | Revoke a runtime |
| POST /v1/personal-runtimes/{latticeId}/bootstrap-tokens | Mint a one-time daemon bootstrap token |
| POST /v1/personal-runtime-bootstrap-redemptions | Redeem the bootstrap token from lattice-ctl relay connect |
| POST /v1/chat/completions | Send OpenAI-compatible chat requests, including lattice:personal:* models |
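
These resources can be mirrored in code as a small set of route builders so that paths live in one place. This is a sketch only: `GatewayRoute` and `personalRuntimeRoutes` are hypothetical names, `lat_EXAMPLE` is a placeholder id, and the gateway's base URL and the request and response bodies are left out because this page does not specify them.

```ts
type HttpMethod = "GET" | "POST" | "DELETE";

// Hypothetical description of one gateway call: method plus path.
interface GatewayRoute {
  method: HttpMethod;
  path: string;
}

const RUNTIMES_PATH = "/v1/personal-runtimes";

// Builds the path for a single runtime, escaping the id for URL safety.
function runtimePath(latticeId: string): string {
  return `${RUNTIMES_PATH}/${encodeURIComponent(latticeId)}`;
}

// One builder per documented endpoint.
const personalRuntimeRoutes = {
  create: (): GatewayRoute => ({ method: "POST", path: RUNTIMES_PATH }),
  list: (): GatewayRoute => ({ method: "GET", path: RUNTIMES_PATH }),
  read: (latticeId: string): GatewayRoute => ({
    method: "GET",
    path: runtimePath(latticeId),
  }),
  revoke: (latticeId: string): GatewayRoute => ({
    method: "DELETE",
    path: runtimePath(latticeId),
  }),
  mintBootstrapToken: (latticeId: string): GatewayRoute => ({
    method: "POST",
    path: `${runtimePath(latticeId)}/bootstrap-tokens`,
  }),
  redeemBootstrapToken: (): GatewayRoute => ({
    method: "POST",
    path: "/v1/personal-runtime-bootstrap-redemptions",
  }),
  chatCompletions: (): GatewayRoute => ({
    method: "POST",
    path: "/v1/chat/completions",
  }),
};

// "lat_EXAMPLE" is a placeholder, not a real runtime id.
const mintRoute = personalRuntimeRoutes.mintBootstrapToken("lat_EXAMPLE");
```

In practice the typed SDK already owns HTTP request construction, so a table like this is mainly useful for tooling that cannot use the SDK.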

Start Here

  1. Read Quickstart to build the user-authorized app flow.
  2. Read Sample app to see the checked TypeScript integration shape.
  3. Read Personal runtimes to pair a local daemon and route an app request through it.
  4. Read OpenAI-compatible API if your app already uses an OpenAI SDK and you want the smallest client change.

Related