
Personal Lattice Runtimes

Personal runtimes let your app use a user's local model infrastructure through the hosted Lattice Cloud gateway. Your application keeps a normal HTTPS API integration. The user keeps model execution on their own machine.

The developer experience should feel like this:

  1. Create a Lovelace account and API key.
  2. Install Lattice on the machine that should run inference.
  3. Create a personal runtime record.
  4. Run one lattice-ctl relay connect command locally.
  5. Send a chat request with model: "lattice:personal:<latticeId>".

The daemon never accepts inbound traffic from the public internet. It polls the relay outbound, processes sealed work locally, and returns the result through the same gateway request your app already made.

If you are building the ordinary third-party app flow, start with the SDK quickstart and come back here when you need to pair or troubleshoot a user's machine.

What Gets Created

[Diagram: the resources created when a personal runtime is set up.]

Prerequisites

You need:

  • A Lovelace developer account.
  • A developer API key with lattice:compute:inference and lattice:compute:personal_runtime:manage.
  • Lattice installed on the user's machine. See Lattice installation.
  • A local provider configured on that machine, such as Ollama or LM Studio. See Local models setup.

Store the developer key in trusted backend code or a setup tool:

bash
export LATTICE_GATEWAY_URL="https://lattice.uselovelace.com"
export LOVELACE_API_KEY="sk_..."

Do not put a developer API key, bootstrap token, or daemon relay token in browser code.
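One way to keep the key in trusted code is to load both values once at backend startup and fail fast when the key is missing. The sketch below is only an illustration: `loadGatewayConfig` and `GatewayConfig` are hypothetical names, not part of any Lovelace SDK, and falling back to the gateway URL from the export example is an assumption made for this sketch.

```typescript
// Hypothetical helper for illustration; not part of any Lovelace SDK.
interface GatewayConfig {
  gatewayUrl: string;
  apiKey: string;
}

// Assumption: fall back to the gateway URL used in the export example.
const DEFAULT_GATEWAY_URL = "https://lattice.uselovelace.com";

// Reads the two variables from an env-like map (for example process.env in Node).
function loadGatewayConfig(env: Record<string, string | undefined>): GatewayConfig {
  const apiKey = (env["LOVELACE_API_KEY"] || "").trim();
  if (apiKey === "") {
    throw new Error("LOVELACE_API_KEY is not set; refusing to start without a developer key.");
  }
  const rawUrl = (env["LATTICE_GATEWAY_URL"] || DEFAULT_GATEWAY_URL).trim();
  // Strip trailing slashes so callers can append paths such as /v1/personal-runtimes.
  const gatewayUrl = rawUrl.replace(/\/+$/, "");
  return { gatewayUrl, apiKey };
}
```

In a Node backend you would call `loadGatewayConfig(process.env)` once and pass the result to whatever code talks to the gateway, so the key never needs to appear in a browser bundle.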

1. Install And Start Lattice

Install Lattice on the machine that should run local inference:

bash
curl -fsSL https://install.uselovelace.com/lattice | sh

Confirm the tools are installed:

bash
lattice-ctl --version
lattice-daemon --version

Start the daemon:

bash
lattice-ctl daemon start
lattice-ctl ping

For an Ollama-backed local runtime, also confirm Ollama is reachable:

bash
curl -sS http://localhost:11434/api/version

2. Create The Runtime Record

From your app backend or setup CLI, create an account-owned runtime:

bash
curl -sS -X POST "$LATTICE_GATEWAY_URL/v1/personal-runtimes" \
  -H "x-api-key: $LOVELACE_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "displayName": "Mac Studio in the office",
    "executionBackend": "local-model"
  }'

The response contains the durable latticeId your app stores or shows in a device picker:

json
{
  "runtime": {
    "latticeId": "lat_abc123",
    "accountId": "acct_abc123",
    "displayName": "Mac Studio in the office",
    "executionBackend": "local-model",
    "status": "unpaired",
    "createdAt": "2026-06-07T00:00:00.000Z",
    "updatedAt": "2026-06-07T00:00:00.000Z"
  }
}

Use local-model when the daemon should call a local provider such as Ollama. Use assistant-runtime only when the daemon should execute through its embedded assistant runtime.
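If you build the request body in code rather than in a shell string, a small typed builder keeps the backend choice explicit. This is a minimal sketch: `buildCreateRuntimeBody` is a hypothetical helper, and both the `local-model` default and the non-empty `displayName` check are assumptions of the sketch, not documented gateway rules.

```typescript
// The two backend values named in this guide.
type ExecutionBackend = "local-model" | "assistant-runtime";

interface CreateRuntimeBody {
  displayName: string;
  executionBackend: ExecutionBackend;
}

// Hypothetical helper: builds the JSON body for POST /v1/personal-runtimes.
// Defaulting to "local-model" is a choice made for this sketch.
function buildCreateRuntimeBody(
  displayName: string,
  executionBackend: ExecutionBackend = "local-model",
): CreateRuntimeBody {
  const trimmed = displayName.trim();
  // Assumption: an empty display name is not useful in a device picker.
  if (trimmed === "") {
    throw new Error("displayName must not be empty");
  }
  return { displayName: trimmed, executionBackend };
}
```

Serializing the result with `JSON.stringify` yields the same body as the curl example.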

3. Mint A Bootstrap Token

Mint a short-lived one-time token for that runtime:

bash
export LATTICE_ID="lat_abc123"

curl -sS -X POST \
  "$LATTICE_GATEWAY_URL/v1/personal-runtimes/$LATTICE_ID/bootstrap-tokens" \
  -H "x-api-key: $LOVELACE_API_KEY" \
  -H "content-type: application/json" \
  -d '{"ttlSeconds": 600}'

The response includes a command safe to show to the user during setup:

json
{
  "latticeId": "lat_abc123",
  "bootstrapToken": "plrt_v1...",
  "expiresAt": "2026-06-07T00:10:00.000Z",
  "command": "lattice-ctl relay connect --gateway 'https://lattice.uselovelace.com' --bootstrap-token 'plrt_v1...'"
}

Bootstrap tokens are one-time-use. If the user closes the terminal or waits too long, mint another token from the same runtime record.
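A setup screen can use `expiresAt` to decide when to stop showing the command and mint a fresh token. The following is a rough sketch under stated assumptions: `isBootstrapTokenExpired` is a hypothetical helper, and the 30-second safety margin is an arbitrary value chosen for illustration.

```typescript
// Shape of the bootstrap-token response shown in this section.
interface BootstrapTokenResponse {
  latticeId: string;
  bootstrapToken: string;
  expiresAt: string; // ISO 8601 timestamp
  command: string;
}

// Hypothetical helper: true when the setup UI should mint a new token.
// marginMs treats a token as expired slightly early so the user is not
// handed a command that expires while they are pasting it.
function isBootstrapTokenExpired(
  token: BootstrapTokenResponse,
  nowMs: number,
  marginMs: number = 30000,
): boolean {
  const expiresMs = Date.parse(token.expiresAt);
  if (isNaN(expiresMs)) {
    return true; // unparseable expiry: safest to mint a new token
  }
  return nowMs + marginMs >= expiresMs;
}
```

Calling it with `Date.now()` each time the setup view renders is enough for most flows. Because tokens are also one-time-use, a token that was already redeemed still needs replacing even if this check returns false.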

4. Pair The Local Daemon

Run the returned command on the local machine. In the example below, BOOTSTRAP_TOKEN holds the bootstrapToken value from the previous response:

bash
lattice-ctl relay connect \
  --gateway "$LATTICE_GATEWAY_URL" \
  --bootstrap-token "$BOOTSTRAP_TOKEN"

The command redeems the token and writes the relay configuration the daemon needs. A successful connection gives the daemon a relay URL, the runtime id, the account id, the runtime display name, a daemon auth token, and the selected execution backend.

After the daemon starts with that configuration, it registers with the relay and polls for work. The user's network only needs outbound HTTPS access.

5. Wait For Reachability

List runtimes to show setup state in your app:

bash
curl -sS "$LATTICE_GATEWAY_URL/v1/personal-runtimes" \
  -H "x-api-key: $LOVELACE_API_KEY"

The gateway overlays live relay state onto each durable runtime record:

| Status | Meaning |
| --- | --- |
| unpaired | The runtime exists, but no daemon has registered yet. |
| reachable | The relay currently sees the daemon online. |
| offline | The daemon was paired before, but is not currently reachable. |
| revoked | The runtime is disabled and cannot mint new bootstrap credentials. |

Only send requests to runtimes that are reachable.
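In application code, the four statuses map naturally onto what the setup screen should do next. The sketch below assumes each listed runtime carries the `latticeId`, `displayName`, and `status` fields seen in the create response; the action names and both helpers are hypothetical.

```typescript
// The four statuses the gateway reports for a personal runtime.
type RuntimeStatus = "unpaired" | "reachable" | "offline" | "revoked";

// Assumed subset of the fields on each listed runtime.
interface PersonalRuntimeSummary {
  latticeId: string;
  displayName: string;
  status: RuntimeStatus;
}

// Hypothetical UI actions; the names are illustrative only.
type SetupAction =
  | "send-requests"
  | "show-pairing-command"
  | "ask-user-to-start-daemon"
  | "create-new-runtime";

// Maps a status to the next step a setup screen might offer.
function nextSetupAction(status: RuntimeStatus): SetupAction {
  switch (status) {
    case "reachable":
      return "send-requests";
    case "unpaired":
      return "show-pairing-command";
    case "offline":
      return "ask-user-to-start-daemon";
    case "revoked":
      return "create-new-runtime";
  }
}

// Keeps only the runtimes that can take a chat request right now.
function reachableRuntimes(runtimes: PersonalRuntimeSummary[]): PersonalRuntimeSummary[] {
  return runtimes.filter((r) => r.status === "reachable");
}
```

A polling loop can call the list endpoint every few seconds and stop once `nextSetupAction` returns `"send-requests"` for the runtime being paired.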

6. Send A Request

Use the normal OpenAI-compatible chat-completions endpoint and select the personal runtime as the model:

bash
curl -sS -X POST "$LATTICE_GATEWAY_URL/v1/chat/completions" \
  -H "x-api-key: $LOVELACE_API_KEY" \
  -H "content-type: application/json" \
  -d "{
    \"model\": \"lattice:personal:$LATTICE_ID\",
    \"messages\": [
      {
        \"role\": \"user\",
        \"content\": \"Summarize this using my local runtime.\"
      }
    ]
  }"

From your app's perspective this is one HTTPS request. Behind the gateway, Lattice Cloud leases the work to the paired daemon, the daemon runs the local provider, and the gateway returns an OpenAI-compatible response.
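When the request is assembled in application code instead of a shell string, serializing with `JSON.stringify` avoids the manual quote escaping the curl example needs. A minimal sketch follows; `personalRuntimeModel` and `buildChatRequestBody` are hypothetical helpers, and the whitespace check on the id is an assumption of this sketch.

```typescript
// Prefix from the model string format lattice:personal:<latticeId>.
const PERSONAL_MODEL_PREFIX = "lattice:personal:";

// Builds the model identifier that selects a personal runtime.
function personalRuntimeModel(latticeId: string): string {
  // Assumption: ids are non-empty and contain no whitespace.
  if (latticeId === "" || /\s/.test(latticeId)) {
    throw new Error("latticeId must be a non-empty id without whitespace");
  }
  return PERSONAL_MODEL_PREFIX + latticeId;
}

// Message shape used by OpenAI-compatible chat-completions endpoints.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Serializes the request body; JSON.stringify handles quotes inside content.
function buildChatRequestBody(latticeId: string, messages: ChatMessage[]): string {
  return JSON.stringify({ model: personalRuntimeModel(latticeId), messages });
}
```

The returned string can be sent as the body of a POST to `/v1/chat/completions` with the same headers as the curl example.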

TypeScript SDK

For simple app backends, use the high-level compute client:

ts
import { ComputeClient } from "@lovelace-ai/compute-client";

const compute = new ComputeClient({
  credential: { kind: "apiKey", apiKey: process.env.LOVELACE_API_KEY! },
});

const runtimes = await compute.listReachablePersonalRuntimes();
const runtime = runtimes[0];

if (!runtime) {
  throw new Error("No reachable personal runtime is available.");
}

const completion = await compute.chatCompleteForPersonalRuntime(runtime, {
  messages: [
    {
      role: "user",
      content: "Summarize this using the user's local model.",
    },
  ],
});

console.log(completion.choices[0]?.message.content);

If you need to create and bootstrap runtimes from code, use the typed REST helpers from @lovelace-ai/compute:

ts
import { createLatticeCloudClient } from "@lovelace-ai/compute";

const lattice = createLatticeCloudClient({
  baseUrl: process.env.LATTICE_GATEWAY_URL!,
  apiKey: process.env.LOVELACE_API_KEY!,
});

const runtime = await lattice.personalRuntimes.create({
  displayName: "Mac Studio in the office",
  executionBackend: "local-model",
});

const bootstrap = await lattice.personalRuntimes.createBootstrapToken(
  runtime.latticeId,
  { ttlSeconds: 600 },
);

console.log(bootstrap.command);

For Sign in with Lovelace flows, construct the client with the user's OAuth access token instead of a developer key:

ts
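import { ComputeClient } from "@lovelace-ai/compute-client";

// accessToken is the OAuth access token issued to the signed-in user.
const compute = new ComputeClient({
  credential: { kind: "oauth", accessToken },
});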

That OAuth grant can read and use personal runtimes when it carries lattice:compute:inference. It should not create, revoke, or bootstrap runtimes unless the user granted lattice:compute:personal_runtime:manage. The SDK quickstart covers the ordinary third-party app path.
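If your backend already knows which scopes a grant carries, it can check them before calling the gateway and avoid a round trip that would fail with insufficient_scopes. The sketch below encodes the rule from the paragraph above. The operation names and helper functions are hypothetical, and how your app learns the granted scopes is outside this sketch.

```typescript
const SCOPE_INFERENCE = "lattice:compute:inference";
const SCOPE_MANAGE = "lattice:compute:personal_runtime:manage";

// Hypothetical operation names for the actions described in this guide.
type RuntimeOperation = "list" | "chat" | "create" | "revoke" | "bootstrap";

// Reading and using runtimes needs the inference scope;
// creating, revoking, and bootstrapping need the manage scope.
function requiredScope(operation: RuntimeOperation): string {
  switch (operation) {
    case "list":
    case "chat":
      return SCOPE_INFERENCE;
    case "create":
    case "revoke":
    case "bootstrap":
      return SCOPE_MANAGE;
  }
}

// True when the granted scopes include the one the operation needs.
// This sketch does not assume that the manage scope implies inference.
function isOperationAllowed(grantedScopes: string[], operation: RuntimeOperation): boolean {
  return grantedScopes.indexOf(requiredScope(operation)) !== -1;
}
```

A local check like this is a convenience only; the gateway remains the authority on what a credential may do.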

Backend Integration Pattern

The clean application shape is:

  1. Your backend stores the selected latticeId for the signed-in user.
  2. The browser sends ordinary chat input to your backend.
  3. Your backend calls Lattice Cloud with the developer key or the user's bearer token.
  4. Your backend returns the completion response to the browser.

Keep runtime creation and bootstrap-token minting behind trusted code. The browser can select a runtime and send prompts, but it should not handle the developer key or any daemon relay credential.
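The four steps can be sketched as a pure function that turns browser input plus server-side state into the gateway call, so the credential is attached only on the backend. All names here are hypothetical. The `x-api-key` header matches the curl examples; sending the OAuth token as `Authorization: Bearer` is an assumption based on the standard bearer-token convention and is not confirmed by this guide.

```typescript
// Mirrors the two credential kinds used by the SDK examples.
type GatewayCredential =
  | { kind: "apiKey"; apiKey: string }
  | { kind: "oauth"; accessToken: string };

// Chooses the auth header for the credential in use.
function gatewayAuthHeaders(credential: GatewayCredential): Record<string, string> {
  if (credential.kind === "apiKey") {
    return { "x-api-key": credential.apiKey };
  }
  // Assumption: bearer tokens travel in the standard Authorization header.
  return { authorization: "Bearer " + credential.accessToken };
}

// What the browser sends: chat input only, never a credential.
interface BrowserChatInput {
  prompt: string;
}

interface GatewayCall {
  path: string;
  headers: Record<string, string>;
  body: string;
}

// Combines the stored latticeId, the browser input, and the server-held credential.
function planGatewayCall(
  storedLatticeId: string | undefined,
  input: BrowserChatInput,
  credential: GatewayCredential,
): GatewayCall {
  if (!storedLatticeId) {
    throw new Error("No personal runtime is selected for this user.");
  }
  const headers = gatewayAuthHeaders(credential);
  headers["content-type"] = "application/json";
  return {
    path: "/v1/chat/completions",
    headers,
    body: JSON.stringify({
      model: "lattice:personal:" + storedLatticeId,
      messages: [{ role: "user", content: input.prompt }],
    }),
  };
}
```

Keeping this step free of network code makes it easy to unit test; the actual HTTP call and the response relay to the browser wrap around it.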

Revoking A Runtime

When a user disconnects a machine, revoke it:

bash
curl -sS -X DELETE "$LATTICE_GATEWAY_URL/v1/personal-runtimes/$LATTICE_ID" \
  -H "x-api-key: $LOVELACE_API_KEY"

Revocation prevents new bootstrap credentials from being minted for that runtime. The runtime remains visible for account history and auditability.
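Because revoked runtimes stay in the list, a device picker usually needs to separate them from runtimes the user can still choose. One possible shape is sketched below. The type and function names are hypothetical, and showing revoked runtimes in a separate history section is a UI choice made for this sketch, not something the gateway requires.

```typescript
type PickerStatus = "unpaired" | "reachable" | "offline" | "revoked";

// Assumed subset of the fields on each listed runtime.
interface PickerRuntime {
  latticeId: string;
  displayName: string;
  status: PickerStatus;
}

interface PickerSections {
  selectable: PickerRuntime[]; // can still be paired or used
  history: PickerRuntime[]; // revoked; shown for reference only
}

// Splits the list so revoked runtimes never appear as selectable options.
function splitForPicker(runtimes: PickerRuntime[]): PickerSections {
  return {
    selectable: runtimes.filter((r) => r.status !== "revoked"),
    history: runtimes.filter((r) => r.status === "revoked"),
  };
}
```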

Troubleshooting

| Symptom | Check |
| --- | --- |
| Runtime stays unpaired | Confirm the user ran the latest lattice-ctl relay connect command before the bootstrap token expired. |
| Runtime becomes offline | Confirm the daemon is running with lattice-ctl ping and the machine has outbound HTTPS access. |
| Chat request times out | Confirm the local provider is running and the model can answer locally. |
| insufficient_scopes | Recreate the developer key or OAuth grant with lattice:compute:inference and, for setup actions, lattice:compute:personal_runtime:manage. |
| revoked runtime | Create a new runtime record and pair the daemon again. |
