Personal Lattice Runtimes
Personal runtimes let your app use a user's local model infrastructure through the hosted Lattice Cloud gateway. Your application keeps a normal HTTPS API integration. The user keeps model execution on their own machine.
The developer experience should feel like this:
- Create a Lovelace account and API key.
- Install Lattice on the machine that should run inference.
- Create a personal runtime record.
- Run one lattice-ctl relay connect command locally.
- Send a chat request with model: "lattice:personal:<latticeId>".
The daemon never accepts inbound traffic from the public internet. It polls the relay outbound, processes sealed work locally, and returns the result through the same gateway request your app already made.
If you are building the ordinary third-party app flow, start with the SDK quickstart and come back here when you need to pair or troubleshoot a user's machine.
What Gets Created
Prerequisites
You need:
- A Lovelace developer account.
- A developer API key with lattice:compute:inference and lattice:compute:personal_runtime:manage.
- Lattice installed on the user's machine. See Lattice installation.
- A local provider configured on that machine, such as Ollama or LM Studio. See Local models setup.
Store the developer key in trusted backend code or a setup tool:
export LATTICE_GATEWAY_URL="https://lattice.uselovelace.com"
export LOVELACE_API_KEY="sk_..."
Do not put a developer API key, bootstrap token, or daemon relay token in browser code.
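One way to keep the key in trusted code is to read and validate both variables once at backend startup. The following is a minimal sketch; `GatewayConfig` and `loadGatewayConfig` are hypothetical names, not part of any Lattice SDK, and the HTTPS check is an assumption based on the gateway URL shown above.

```typescript
// Hypothetical helper for illustration; not part of any Lattice SDK.
interface GatewayConfig {
  gatewayUrl: string;
  apiKey: string;
}

// Accepts the environment as a plain record so it can be exercised without a real process.
function loadGatewayConfig(env: Record<string, string | undefined>): GatewayConfig {
  // Trim trailing slashes so paths such as /v1/personal-runtimes join cleanly.
  const gatewayUrl = (env["LATTICE_GATEWAY_URL"] ?? "").replace(/\/+$/, "");
  const apiKey = env["LOVELACE_API_KEY"] ?? "";
  if (gatewayUrl.indexOf("https://") !== 0) {
    throw new Error("LATTICE_GATEWAY_URL must be an https:// URL");
  }
  if (apiKey.length === 0) {
    throw new Error("LOVELACE_API_KEY is not set");
  }
  return { gatewayUrl, apiKey };
}
```

In a Node backend this would typically be called once with process.env, so a missing key fails at startup rather than on the first user request.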
1. Install And Start Lattice
Install Lattice on the machine that should run local inference:
curl -fsSL https://install.uselovelace.com/lattice | sh
Confirm the tools are installed:
lattice-ctl --version
lattice-daemon --version
Start the daemon:
lattice-ctl daemon start
lattice-ctl ping
For an Ollama-backed local runtime, also confirm Ollama is reachable:
curl -sS http://localhost:11434/api/version
2. Create The Runtime Record
From your app backend or setup CLI, create an account-owned runtime:
curl -sS -X POST "$LATTICE_GATEWAY_URL/v1/personal-runtimes" \
  -H "x-api-key: $LOVELACE_API_KEY" \
  -H "content-type: application/json" \
  -d '{
    "displayName": "Mac Studio in the office",
    "executionBackend": "local-model"
  }'
The response contains the durable latticeId your app stores or shows in a
device picker:
{
  "runtime": {
    "latticeId": "lat_abc123",
    "accountId": "acct_abc123",
    "displayName": "Mac Studio in the office",
    "executionBackend": "local-model",
    "status": "unpaired",
    "createdAt": "2026-06-07T00:00:00.000Z",
    "updatedAt": "2026-06-07T00:00:00.000Z"
  }
}
Use local-model when the daemon should call a local provider such as Ollama.
Use assistant-runtime only when the daemon should execute through its embedded
assistant runtime.
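If your backend calls the REST endpoint directly, it can help to type the response and fail loudly when the shape is unexpected. This is a sketch under stated assumptions: the field list is inferred only from the sample response above (real responses may carry more fields), and `PersonalRuntime` and `parseCreateRuntimeResponse` are illustrative names rather than SDK exports.

```typescript
// Types inferred from the sample request and response; names are illustrative.
type ExecutionBackend = "local-model" | "assistant-runtime";

interface PersonalRuntime {
  latticeId: string;
  accountId: string;
  displayName: string;
  executionBackend: ExecutionBackend;
  status: string;
  createdAt: string; // ISO 8601 timestamp
  updatedAt: string; // ISO 8601 timestamp
}

// Parses the raw JSON body and returns the runtime, throwing on an unexpected shape.
function parseCreateRuntimeResponse(body: string): PersonalRuntime {
  const parsed = JSON.parse(body) as { runtime?: Partial<PersonalRuntime> };
  const runtime = parsed.runtime;
  if (!runtime || typeof runtime.latticeId !== "string" || runtime.latticeId.length === 0) {
    throw new Error("Response is missing runtime.latticeId");
  }
  if (
    runtime.executionBackend !== "local-model" &&
    runtime.executionBackend !== "assistant-runtime"
  ) {
    throw new Error("Unexpected executionBackend: " + String(runtime.executionBackend));
  }
  return runtime as PersonalRuntime;
}
```

The returned latticeId is the value to persist alongside the signed-in user.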
3. Mint A Bootstrap Token
Mint a short-lived one-time token for that runtime:
export LATTICE_ID="lat_abc123"
curl -sS -X POST \
  "$LATTICE_GATEWAY_URL/v1/personal-runtimes/$LATTICE_ID/bootstrap-tokens" \
  -H "x-api-key: $LOVELACE_API_KEY" \
  -H "content-type: application/json" \
  -d '{"ttlSeconds": 600}'
The response includes a command safe to show to the user during setup:
{
  "latticeId": "lat_abc123",
  "bootstrapToken": "plrt_v1...",
  "expiresAt": "2026-06-07T00:10:00.000Z",
  "command": "lattice-ctl relay connect --gateway 'https://lattice.uselovelace.com' --bootstrap-token 'plrt_v1...'"
}
Bootstrap tokens are one-time-use. If the user closes the terminal or waits too long, mint another token from the same runtime record.
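A setup screen can use expiresAt to decide when to offer a fresh command. The sketch below is illustrative only: `secondsUntilExpiry` and `shouldMintNewToken` are hypothetical helpers, and the 30-second safety margin is an arbitrary assumption. It cannot detect that a token was already redeemed, only that it is about to expire.

```typescript
// Shape taken from the sample bootstrap-token response above.
interface BootstrapTokenResponse {
  latticeId: string;
  bootstrapToken: string;
  expiresAt: string; // ISO 8601 timestamp
  command: string;
}

// Whole seconds left before the token expires; 0 once it has expired.
function secondsUntilExpiry(
  token: Pick<BootstrapTokenResponse, "expiresAt">,
  now: Date,
): number {
  const expiresMs = Date.parse(token.expiresAt);
  if (isNaN(expiresMs)) {
    throw new Error("expiresAt is not a valid timestamp");
  }
  return Math.max(0, Math.floor((expiresMs - now.getTime()) / 1000));
}

// Assumed policy: mint a new token once the remaining time drops to the margin,
// so the user still has time to paste and run the command.
function shouldMintNewToken(
  token: Pick<BootstrapTokenResponse, "expiresAt">,
  now: Date,
  marginSeconds: number = 30,
): boolean {
  return secondsUntilExpiry(token, now) <= marginSeconds;
}
```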
4. Pair The Local Daemon
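Run the returned command on the local machine. With the gateway URL and the bootstrap token exported as LATTICE_GATEWAY_URL and BOOTSTRAP_TOKEN, it has this form:
lattice-ctl relay connect \
  --gateway "$LATTICE_GATEWAY_URL" \
  --bootstrap-token "$BOOTSTRAP_TOKEN"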
The command redeems the token and writes the relay configuration the daemon needs. A successful connection gives the daemon a relay URL, the runtime id, the account id, the runtime display name, a daemon auth token, and the selected execution backend.
After the daemon starts with that configuration, it registers with the relay and polls for work. The user's network only needs outbound HTTPS access.
5. Wait For Reachability
List runtimes to show setup state in your app:
curl -sS "$LATTICE_GATEWAY_URL/v1/personal-runtimes" \
  -H "x-api-key: $LOVELACE_API_KEY"
The gateway overlays live relay state onto each durable runtime record:
| Status | Meaning |
|---|---|
| unpaired | The runtime exists, but no daemon has registered yet. |
| reachable | The relay currently sees the daemon online. |
| offline | The daemon was paired before, but is not currently reachable. |
| revoked | The runtime is disabled and cannot mint new bootstrap credentials. |
Only send requests to runtimes that are reachable.
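During setup, an app might poll the runtime list until the newly paired machine shows up as reachable. The following is a sketch, not the SDK's behavior: `listRuntimes` and `sleep` are injected by the caller (so nothing is assumed about the list response envelope), the attempt count and interval are arbitrary, and stopping early on revoked is an assumption drawn from the status table.

```typescript
type RuntimeStatus = "unpaired" | "reachable" | "offline" | "revoked";

interface RuntimeSummary {
  latticeId: string;
  status: RuntimeStatus;
}

// Returns the runtime only if it is present and currently reachable.
function findReachable(
  runtimes: RuntimeSummary[],
  latticeId: string,
): RuntimeSummary | undefined {
  for (const runtime of runtimes) {
    if (runtime.latticeId === latticeId && runtime.status === "reachable") {
      return runtime;
    }
  }
  return undefined;
}

// Polls the injected lister until the runtime is reachable or attempts run out.
async function waitForReachable(
  listRuntimes: () => Promise<RuntimeSummary[]>,
  sleep: (ms: number) => Promise<void>,
  latticeId: string,
  maxAttempts: number = 30,
  intervalMs: number = 2000,
): Promise<RuntimeSummary | undefined> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const runtimes = await listRuntimes();
    const hit = findReachable(runtimes, latticeId);
    if (hit) {
      return hit;
    }
    // Assumption: a revoked runtime will not become reachable, so stop waiting.
    const revoked = runtimes.filter(
      (r) => r.latticeId === latticeId && r.status === "revoked",
    );
    if (revoked.length > 0) {
      return undefined;
    }
    if (attempt < maxAttempts - 1) {
      await sleep(intervalMs);
    }
  }
  return undefined;
}
```

Injecting the lister keeps the loop independent of whether the list comes from the REST endpoint or an SDK client.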
6. Send A Request
Use the normal OpenAI-compatible chat-completions endpoint and select the personal runtime as the model:
curl -sS -X POST "$LATTICE_GATEWAY_URL/v1/chat/completions" \
  -H "x-api-key: $LOVELACE_API_KEY" \
  -H "content-type: application/json" \
  -d "{
    \"model\": \"lattice:personal:$LATTICE_ID\",
    \"messages\": [
      {
        \"role\": \"user\",
        \"content\": \"Summarize this using my local runtime.\"
      }
    ]
  }"
From your app's perspective this is one HTTPS request. Behind the gateway, Lattice Cloud leases the work to the paired daemon, the daemon runs the local provider, and the gateway returns an OpenAI-compatible response.
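When the request is built in code rather than in a shell, serializing the body with JSON.stringify avoids the manual quote escaping the curl example needs. This sketch only assembles the request described above; `GatewayRequest`, `personalRuntimeModel`, and `buildChatRequest` are hypothetical names, and sending the result is left to whatever HTTP client your backend already uses.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Plain description of an HTTP request; hand it to any HTTP client.
interface GatewayRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Builds the model selector for a personal runtime.
function personalRuntimeModel(latticeId: string): string {
  if (latticeId.length === 0) {
    throw new Error("latticeId is required");
  }
  return "lattice:personal:" + latticeId;
}

// Assembles the chat-completions request with the headers used in the curl example.
function buildChatRequest(
  gatewayUrl: string,
  apiKey: string,
  latticeId: string,
  messages: ChatMessage[],
): GatewayRequest {
  return {
    method: "POST",
    url: gatewayUrl.replace(/\/+$/, "") + "/v1/chat/completions",
    headers: {
      "x-api-key": apiKey,
      "content-type": "application/json",
    },
    body: JSON.stringify({ model: personalRuntimeModel(latticeId), messages }),
  };
}
```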
TypeScript SDK
For simple app backends, use the high-level compute client:
import { ComputeClient } from "@lovelace-ai/compute-client";
const compute = new ComputeClient({
  credential: { kind: "apiKey", apiKey: process.env.LOVELACE_API_KEY! },
});
const runtimes = await compute.listReachablePersonalRuntimes();
const runtime = runtimes[0];
if (!runtime) {
  throw new Error("No reachable personal runtime is available.");
}
const completion = await compute.chatCompleteForPersonalRuntime(runtime, {
  messages: [
    {
      role: "user",
      content: "Summarize this using the user's local model.",
    },
  ],
});
console.log(completion.choices[0]?.message.content);
If you need to create and bootstrap runtimes from code, use the typed REST
helpers from @lovelace-ai/compute:
import { createLatticeCloudClient } from "@lovelace-ai/compute";
const lattice = createLatticeCloudClient({
  baseUrl: process.env.LATTICE_GATEWAY_URL!,
  apiKey: process.env.LOVELACE_API_KEY!,
});
const runtime = await lattice.personalRuntimes.create({
  displayName: "Mac Studio in the office",
  executionBackend: "local-model",
});
const bootstrap = await lattice.personalRuntimes.createBootstrapToken(
  runtime.latticeId,
  { ttlSeconds: 600 },
);
console.log(bootstrap.command);
For Sign in with Lovelace flows, construct the client with the user's OAuth access token instead of a developer key:
const compute = new ComputeClient({
  credential: { kind: "oauth", accessToken },
});
That OAuth grant can read and use personal runtimes when it carries
lattice:compute:inference. It should not create, revoke, or bootstrap
runtimes unless the user granted lattice:compute:personal_runtime:manage.
The SDK quickstart covers the ordinary third-party app
path.
Backend Integration Pattern
The clean application shape is:
- Your backend stores the selected latticeId for the signed-in user.
- The browser sends ordinary chat input to your backend.
- Your backend calls Lattice Cloud with the developer key or the user's bearer token.
- Your backend returns the completion response to the browser.
Keep runtime creation and bootstrap-token minting behind trusted code. The browser can select a runtime and send prompts, but it should not handle the developer key or any daemon relay credential.
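The four steps above can be sketched as a single backend handler. Everything here is hypothetical scaffolding rather than a Lattice API: the in-memory `selections` record stands in for your own storage, `complete` stands in for the gateway call made with server-held credentials, and the status codes are an arbitrary choice.

```typescript
// Hypothetical shapes for illustration; none of these names come from a Lattice SDK.
interface ChatInput {
  prompt: string;
}

interface HandlerResult {
  status: number;
  body: { content?: string; error?: string };
}

// Stand-in for the gateway call; the credential lives inside this function's closure.
type CompleteFn = (latticeId: string, prompt: string) => Promise<string>;

// Step 1: the backend, not the browser, resolves which runtime a user may call.
function selectedRuntimeFor(
  selections: Record<string, string>,
  userId: string,
): string | undefined {
  return Object.prototype.hasOwnProperty.call(selections, userId)
    ? selections[userId]
    : undefined;
}

// Steps 2 to 4: accept plain chat input, call the gateway, return the completion.
async function handleChat(
  userId: string,
  input: ChatInput,
  selections: Record<string, string>,
  complete: CompleteFn,
): Promise<HandlerResult> {
  const latticeId = selectedRuntimeFor(selections, userId);
  if (latticeId === undefined) {
    return { status: 409, body: { error: "No personal runtime selected." } };
  }
  if (input.prompt.trim().length === 0) {
    return { status: 400, body: { error: "Prompt is empty." } };
  }
  const content = await complete(latticeId, input.prompt);
  return { status: 200, body: { content } };
}
```

Because the browser only ever supplies the prompt, neither the developer key nor the latticeId lookup is exposed to client code.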
Revoking A Runtime
When a user disconnects a machine, revoke it:
curl -sS -X DELETE "$LATTICE_GATEWAY_URL/v1/personal-runtimes/$LATTICE_ID" \
  -H "x-api-key: $LOVELACE_API_KEY"
Revocation prevents new bootstrap credentials from being minted for that runtime. The runtime remains visible for account history and auditability.
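Since revoked runtimes stay visible, an app will probably want to keep them out of its device picker while still showing them in history. The sketch below assumes the runtime list includes revoked records with the status values described in this guide; `splitForDisplay` is a hypothetical helper name.

```typescript
type RuntimeStatus = "unpaired" | "reachable" | "offline" | "revoked";

interface RuntimeSummary {
  latticeId: string;
  displayName: string;
  status: RuntimeStatus;
}

// Separates runtimes a user can still pick from those kept only for history.
function splitForDisplay(runtimes: RuntimeSummary[]): {
  active: RuntimeSummary[];
  revoked: RuntimeSummary[];
} {
  const active: RuntimeSummary[] = [];
  const revoked: RuntimeSummary[] = [];
  for (const runtime of runtimes) {
    if (runtime.status === "revoked") {
      revoked.push(runtime);
    } else {
      active.push(runtime);
    }
  }
  return { active, revoked };
}
```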
Troubleshooting
| Symptom | Check |
|---|---|
| Runtime stays unpaired | Confirm the user ran the latest lattice-ctl relay connect command before the bootstrap token expired. |
| Runtime becomes offline | Confirm the daemon is running with lattice-ctl ping and the machine has outbound HTTPS access. |
| Chat request times out | Confirm the local provider is running and the model can answer locally. |
| insufficient_scopes | Recreate the developer key or OAuth grant with lattice:compute:inference and, for setup actions, lattice:compute:personal_runtime:manage. |
| revoked runtime | Create a new runtime record and pair the daemon again. |