Cloud Providers Setup

While local models offer privacy and offline capability, cloud LLM providers bring tremendous power and convenience. Cloud APIs give you access to the most capable models without requiring powerful local hardware. The models stay current as providers continuously improve them. You never worry about running out of local resources, and you can scale workloads far beyond what your machine could handle alone.

The tradeoff, of course, is that your data leaves your machine and travels to the provider's infrastructure. You pay per request rather than having unlimited local inference. Internet connectivity becomes a hard requirement. But for many use cases—especially when you need maximum capability or are working with workloads that exceed local resources—cloud providers are the right choice.

Anthropic (Claude)

Anthropic's Claude models are known for strong reasoning, nuanced understanding, and a helpful tone. Claude does particularly well at complex analysis, coding assistance, and maintaining context over long conversations. Claude Sonnet (claude-sonnet-4-6 in the examples below) offers a balance of speed, cost, and intelligence that suits most production use cases.

Getting Your API Key

Before you can use Anthropic's models, you need an API key that authorizes your requests. Visit console.anthropic.com and create an account if you don't already have one. Once logged in, navigate to the "API Keys" section of the console. You'll see options to create new keys and manage existing ones.

Click "Create Key" to generate a new API key. Anthropic will display the key once—copy it immediately because you won't be able to see it again. API keys follow the format sk-ant- followed by a long string of characters. Treat this key like a password: anyone with access to it can make requests on your behalf and incur charges to your account.

Store the key securely. The recommended approach is to set it as an environment variable in your shell configuration file:

bash
# Add to ~/.bashrc or ~/.zshrc
export ANTHROPIC_API_KEY="sk-ant-api03-..."

# Reload your shell configuration (use ~/.zshrc if your shell is zsh)
source ~/.bashrc

Setting it in your shell configuration ensures the variable is available every time you open a terminal. The daemon inherits this environment when it starts, giving it access to the key without storing it in configuration files that might be shared or committed to version control.
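Before starting the daemon, it can help to confirm that the variable is actually exported, since a child process only inherits exported variables. The following is a minimal sketch: `check_api_key` is a hypothetical helper, not part of Lattice, and it relies on `printenv`, which only reports exported variables.

```shell
# check_api_key VAR_NAME PREFIX
# Hypothetical helper (not part of Lattice). Prints "ok", "missing", or
# "unexpected-format" without ever echoing the secret itself.
check_api_key() {
  local var_name="$1" prefix="$2" value
  # printenv only sees exported variables, which is what a child
  # process such as the daemon would inherit.
  value="$(printenv "$var_name" || true)"
  if [ -z "$value" ]; then
    echo "missing"
    return 1
  fi
  case "$value" in
    "$prefix"*) echo "ok" ;;
    *) echo "unexpected-format"; return 1 ;;
  esac
}

check_api_key ANTHROPIC_API_KEY "sk-ant-" || true
```

A result of "missing" in a terminal where `echo` shows a value usually means the variable was assigned without `export`.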

Configuring Lattice

With your API key secured in an environment variable, configure Lattice to use Anthropic as a provider. Edit your configuration file at ~/.lovelace/lattice/config.toml:

toml
[orchestrator]
default_provider = "anthropic"

[providers.anthropic]
api_key_env = "ANTHROPIC_API_KEY"
default_model = "claude-sonnet-4-6"
enabled = true

The api_key_env field tells Lattice to read the API key from the ANTHROPIC_API_KEY environment variable. This level of indirection keeps credentials out of configuration files. The default_model specifies which Claude model to use when you don't explicitly request a different one. The enabled field activates this provider for use.

After saving your configuration, restart the daemon to load the new settings:

bash
lattice-ctl daemon stop
lattice-ctl daemon start

Verify the provider is configured correctly and accessible:

bash
lattice-ctl provider test anthropic

If the test succeeds, Lattice can authenticate with Anthropic and make requests. If it fails, check that your API key is set correctly and that you have internet connectivity.

Understanding Claude Models

Anthropic offers several Claude models, each optimized for different priorities. Claude Sonnet (claude-sonnet-4-6) provides the best balance for most users—it's fast, capable, and reasonably priced. Use Sonnet as your default unless you have specific needs that require a different tier.

Claude Opus (claude-opus-4-8) represents the most powerful option, delivering the highest quality reasoning and output. Opus excels at extremely complex tasks, nuanced analysis, and scenarios where absolute quality matters more than speed or cost. However, it's significantly more expensive per request and slower to respond than Sonnet.

Claude Haiku (claude-haiku-4-5) optimizes for speed and cost. Haiku responds quickly and costs less per request, making it ideal for simple tasks, high-volume workloads, or scenarios where speed trumps sophistication. Use Haiku for straightforward tasks like classification, simple summarization, or cases where you need rapid responses.

All Claude models support a 200,000 token context window, allowing them to work with extremely long inputs or maintain extensive conversation history. This large context makes Claude particularly suitable for analyzing lengthy documents, maintaining complex multi-turn conversations, or tasks requiring substantial reference material.
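To get a rough sense of whether a document fits in that window, a byte count can stand in for a token count. This sketch assumes about 4 bytes per token, which is only a rule of thumb for English text and not the model's real tokenizer. `estimate_tokens` and `fits_context` are hypothetical helper names.

```shell
# estimate_tokens FILE
# Rough estimate only: assumes about 4 bytes per token (rounded up).
estimate_tokens() {
  local bytes
  bytes="$(wc -c < "$1")"
  echo $(( (bytes + 3) / 4 ))
}

# fits_context FILE [LIMIT]
# Succeeds if the estimate is within LIMIT tokens
# (default 200000, the context window described above).
fits_context() {
  local limit="${2:-200000}"
  [ "$(estimate_tokens "$1")" -le "$limit" ]
}
```

For example, `fits_context report.txt && echo "probably fits"` gives a quick check before sending a large file. Leave headroom for the prompt and the response, since both count against the same window.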

Managing Costs

Cloud APIs charge per request based on token usage—both input tokens (what you send) and output tokens (what the model generates). Costs add up quickly if you're not mindful of usage patterns.
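The arithmetic behind a single request's cost can be sketched as follows. It assumes prices are quoted in dollars per million tokens, with separate input and output rates. The prices in the example call are placeholders chosen to make the arithmetic easy to follow, not real rates, and `estimate_cost` is a hypothetical helper.

```shell
# estimate_cost INPUT_TOKENS OUTPUT_TOKENS INPUT_PRICE OUTPUT_PRICE
# Prices are dollars per million tokens and must come from the provider's
# current pricing page; nothing is hardcoded here.
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" -v in_price="$3" -v out_price="$4" \
    'BEGIN { printf "%.4f\n", (in_tok * in_price + out_tok * out_price) / 1000000 }'
}

# Placeholder prices: 2.00 (input) and 10.00 (output) per million tokens.
estimate_cost 50000 2000 2.00 10.00   # prints 0.1200
```

Multiplying the result by the expected number of requests per day gives a first-order budget figure.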

Monitor your spending through the Anthropic console's usage dashboard. Set up billing alerts to notify you when spending approaches your budget limits. Many users find it helpful to set monthly spending caps that prevent runaway costs from unexpected usage spikes.

Different tasks warrant different models based on their cost-performance tradeoffs. Use Haiku for simple, high-volume tasks where sophisticated reasoning isn't required. Reserve Sonnet for general-purpose work requiring solid capabilities. Only escalate to Opus when you need maximum quality and the cost is justified by the task's importance.

Keep prompts concise and focused to minimize token usage. Long, rambling prompts waste tokens without improving output quality. Similarly, when you don't need lengthy responses, specify desired output length to avoid paying for verbose answers you don't need.

OpenAI

Get API Key

  1. Go to platform.openai.com
  2. Sign up or log in
  3. Go to "API Keys"
  4. Click "Create new secret key"
  5. Copy the key (starts with sk-)

Configuration

Environment Variable:

bash
export OPENAI_API_KEY="sk-..."

Config File:

toml
# ~/.lovelace/lattice/config.toml

[orchestrator]
default_provider = "openai"

[providers.openai]
api_key_env = "OPENAI_API_KEY"
default_model = "gpt-4"
enabled = true

Restart and test:

bash
lattice-ctl daemon stop
lattice-ctl daemon start
lattice-ctl provider test openai

Available Models

bash
lattice-ctl provider models openai

GPT-4 Turbo

  • Model: gpt-4-turbo
  • Latest GPT-4 with improvements
  • 128K context window

GPT-4

  • Model: gpt-4
  • Original GPT-4
  • 8K context window

GPT-3.5 Turbo

  • Model: gpt-3.5-turbo
  • Fast and capable
  • 16K context window
  • Lower cost

Pricing

See openai.com/pricing

Google Gemini

Get API Key

  1. Go to ai.google.dev
  2. Click "Get API Key"
  3. Follow setup instructions
  4. Copy your API key

Configuration

Environment Variable:

bash
export GOOGLE_API_KEY="..."

Config File:

toml
# ~/.lovelace/lattice/config.toml

[orchestrator]
default_provider = "google"

[providers.google]
api_key_env = "GOOGLE_API_KEY"
default_model = "gemini-pro"
enabled = true

Restart and test:

bash
lattice-ctl daemon stop
lattice-ctl daemon start
lattice-ctl provider test google

Multiple Providers

Configure multiple providers and switch between them:

toml
# ~/.lovelace/lattice/config.toml

[orchestrator]
default_provider = "anthropic"  # Default

[providers.anthropic]
api_key_env = "ANTHROPIC_API_KEY"
default_model = "claude-sonnet-4-6"
enabled = true

[providers.openai]
api_key_env = "OPENAI_API_KEY"
default_model = "gpt-4-turbo"
enabled = true

[providers.google]
api_key_env = "GOOGLE_API_KEY"
default_model = "gemini-pro"
enabled = true

Switch provider at runtime:

bash
# Start daemon with specific provider
lattice-ctl daemon start --provider openai --model gpt-4-turbo

# Or change in config and restart
lattice-ctl daemon stop
# Edit config.toml: default_provider = "openai"
lattice-ctl daemon start

API Key Security

Best Practices

✅ DO:

  • Store keys in environment variables
  • Set them in your shell config or an untracked .env file
  • Add .env to .gitignore
  • Rotate keys regularly

❌ DON'T:

  • Hardcode keys in config files
  • Commit keys to version control
  • Share keys publicly
  • Reuse the same key across projects and environments

Storage Locations

bash
# Shell config (recommended)
~/.bashrc
~/.zshrc

# Lattice config (reference env var only)
~/.lovelace/lattice/config.toml

# Project-specific (not committed)
.env

Example .bashrc:

bash
# ~/.bashrc or ~/.zshrc

# Lattice API Keys
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
export GOOGLE_API_KEY="..."

# Reload after editing:
# source ~/.bashrc

Rate Limits & Quotas

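Anthropic

  • Limits vary by usage tier
  • Check current limits: console.anthropic.com

OpenAI

  • Tiered limits that rise with account usage
  • Check current limits: platform.openai.com
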
Google

  • Free tier: Generous limits
  • Paid: Based on usage
  • Check: ai.google.dev

Error Handling

Invalid API key

bash
lattice-ctl provider test anthropic
# Error: Invalid API key

# Fix:
echo $ANTHROPIC_API_KEY  # Check it is set correctly
export ANTHROPIC_API_KEY="sk-ant-..."  # Set if missing

# Restart the daemon so it picks up the new environment
lattice-ctl daemon stop
lattice-ctl daemon start

Rate limit exceeded

bash
# Error: Rate limit exceeded

# Solutions:
# 1. Wait and retry
# 2. Upgrade tier (check provider console)
# 3. Switch to different provider temporarily
lattice-ctl daemon start --provider openai

Network timeout

bash
# Error: Request timeout

# Check connectivity to the provider's API endpoint
curl -I https://api.anthropic.com

# Increase the timeout in your Lattice config (TOML, not a shell command):
# [daemon]
# ipc_timeout_ms = 30000  # 30 seconds

Cost Management

Monitor Usage

Anthropic:

  • Dashboard: console.anthropic.com → Usage

OpenAI:

  • Dashboard: platform.openai.com → Usage

Google:

  • Dashboard: ai.google.dev → Usage

Set Limits

Configure in provider dashboards to prevent runaway costs:

  • Monthly spending limits
  • Rate limits
  • Email alerts

Optimize Costs

  1. Use appropriate models:

    • Haiku (claude-haiku-4-5) / GPT-3.5 Turbo for simple tasks
    • Sonnet (claude-sonnet-4-6) / GPT-4 for complex tasks
    • Opus (claude-opus-4-8) only when maximum quality is required
  2. Local models for development:

    toml
    # Use Ollama locally, cloud in production
    [orchestrator]
    default_provider = "ollama"  # Development
    # default_provider = "anthropic"  # Production
    
  3. Monitor token usage:

    bash
    # Check request sizes in logs
    lattice-ctl daemon start --foreground
    

OpenAI

OpenAI pioneered widely-accessible large language models and continues to push the frontier with GPT-4 and its variants. OpenAI's models excel at creative tasks, code generation, and general-purpose reasoning. The extensive ecosystem around OpenAI's API and broad model selection make it a versatile choice for many scenarios.

Obtaining Your API Key

Navigate to platform.openai.com and sign in with your OpenAI account, or create one if you're new to the platform. Once authenticated, go to the "API Keys" section from your account dashboard. You'll see a list of any existing keys and an option to create new ones.

Click "Create new secret key" to generate a fresh API key. OpenAI displays the complete key immediately after creation—this is your only opportunity to copy it. The key starts with sk- followed by a long alphanumeric string. Store this securely, as anyone with access can charge requests to your account.

Set the API key in your shell environment for secure, persistent access:

bash
export OPENAI_API_KEY="sk-..."

Add this export to your shell configuration file (~/.bashrc or ~/.zshrc) to ensure it's available in every new terminal session.

Configuring Lattice for OpenAI

Edit your Lattice configuration to include OpenAI as a provider:

toml
[orchestrator]
default_provider = "openai"

[providers.openai]
api_key_env = "OPENAI_API_KEY"
default_model = "gpt-4-turbo"
enabled = true

This configuration tells Lattice to read the API key from your environment variable, use GPT-4 Turbo as the default model, and enable OpenAI for use. After saving the configuration file, restart the daemon and test connectivity:

bash
lattice-ctl daemon stop
lattice-ctl daemon start
lattice-ctl provider test openai

Successful test output confirms that Lattice can authenticate with OpenAI and make requests.

Navigating OpenAI's Model Lineup

OpenAI offers several model tiers, each with different capabilities and price points. GPT-4 Turbo (gpt-4-turbo) represents the latest iteration of GPT-4 with improved performance and a larger 128K context window. It balances capability with reasonable cost and speed, making it suitable for most production use cases.

The original GPT-4 (gpt-4) model remains available with an 8K context window. While slightly less capable than Turbo, it's well-tested and reliable for tasks that don't require the extended context or latest improvements.

GPT-3.5 Turbo (gpt-3.5-turbo) offers a cost-effective option for less demanding tasks. It's significantly faster and cheaper than GPT-4 variants while still handling many common use cases competently. Use GPT-3.5 Turbo for simple classification, straightforward summarization, or high-volume tasks where cost matters more than maximum quality.

Managing OpenAI Costs

OpenAI's pricing is based on token consumption—both input and output tokens count toward your bill. The platform provides detailed usage tracking through the usage dashboard at platform.openai.com/usage, where you can see daily and monthly spending broken down by model.

Set spending limits in your account settings to prevent unexpected charges. OpenAI allows both hard limits (which stop requests when reached) and soft limits (which trigger notifications). Configure these based on your budget and risk tolerance.

Choose models appropriately for each task. GPT-3.5 Turbo costs a fraction of what GPT-4 does—using it for simple tasks dramatically reduces expenses. Reserve GPT-4 for complex reasoning, creative work, or tasks where quality justifies the additional cost.

Google Gemini

Google's Gemini models bring multimodal capabilities and Google's AI research prowess to the LLM landscape. Gemini excels at tasks requiring integration of text and other modalities, and offers competitive pricing with generous free tiers for development use.

Getting Started with Gemini

Visit ai.google.dev to access Google's AI platform. If you don't have a Google Cloud account, you'll need to create one—the process is straightforward and Google provides free credits for new users. Once you're logged in, navigate to the API keys section and click "Get API Key."

Google generates an API key immediately. Copy this key for use in your Lattice configuration. Google's API keys don't use the sk- style prefix of the other providers (they typically begin with AIza), so verify you've copied the complete key.

Set the API key in your environment:

bash
export GOOGLE_API_KEY="your-api-key-here"

Configuring Gemini in Lattice

Add Gemini to your provider configuration:

toml
[orchestrator]
default_provider = "google"

[providers.google]
api_key_env = "GOOGLE_API_KEY"
default_model = "gemini-pro"
enabled = true

Restart the daemon and verify connectivity:

bash
lattice-ctl daemon stop
lattice-ctl daemon start
lattice-ctl provider test google

Understanding Gemini's Offerings

Gemini Pro represents Google's capable general-purpose model, handling a wide range of tasks with good performance. It supports text generation, analysis, and reasoning tasks competently. Gemini's multimodal capabilities—processing text, images, and other inputs together—make it particularly valuable for tasks requiring integration across different data types.

Google's pricing for Gemini is competitive with generous free tiers that make it accessible for development and experimentation. Check ai.google.dev for current pricing and tier details, as Google frequently updates offerings.

Using Multiple Providers Simultaneously

Lattice doesn't limit you to a single provider—configure all the providers you want access to and switch between them as needs dictate. Having multiple providers configured gives you flexibility to choose the best tool for each job, provides redundancy if one provider has outages, and lets you balance costs across different services.

A multi-provider configuration looks like this:

toml
[orchestrator]
default_provider = "anthropic"  # Default for general use

[providers.anthropic]
api_key_env = "ANTHROPIC_API_KEY"
default_model = "claude-sonnet-4-6"
enabled = true

[providers.openai]
api_key_env = "OPENAI_API_KEY"
default_model = "gpt-4-turbo"
enabled = true

[providers.google]
api_key_env = "GOOGLE_API_KEY"
default_model = "gemini-pro"
enabled = true

With this setup, Anthropic serves as your default provider for most work. When you need OpenAI's specific strengths or want to use Gemini's multimodal capabilities, those providers are available. You can specify which provider to use for specific sessions or agents, overriding the default when appropriate.
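With several providers configured, one missing environment variable is easy to overlook. The sketch below lists every variable named by an `api_key_env` line and reports whether it is exported. It is a plain text match rather than a real TOML parser, it ignores the `enabled` field, and both function names are hypothetical.

```shell
# list_key_vars CONFIG_FILE
# Prints the variable name from every `api_key_env = "..."` line.
list_key_vars() {
  sed -n 's/^[[:space:]]*api_key_env[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p' "$1"
}

# check_key_vars CONFIG_FILE
# Reports each variable as "set" or "MISSING";
# returns non-zero if any variable is missing.
check_key_vars() {
  local status=0 name
  for name in $(list_key_vars "$1"); do
    if [ -n "$(printenv "$name" || true)" ]; then
      echo "$name: set"
    else
      echo "$name: MISSING"
      status=1
    fi
  done
  return "$status"
}
```

Running `check_key_vars ~/.lovelace/lattice/config.toml` in the shell that will start the daemon shows which keys that daemon would inherit.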

Securing Your API Keys

API keys are credentials that authorize spending on your account, so protecting them is crucial. Never commit API keys to version control; they should live only in environment variables or secure secret management systems. Add .env and any other files that contain keys to your .gitignore to prevent accidental commits.
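Adding the ignore rule can be made repeatable so that running it twice does not duplicate the entry. This is a minimal sketch, and `ensure_ignored` is a hypothetical helper name.

```shell
# ensure_ignored PATTERN [GITIGNORE_FILE]
# Appends PATTERN to the ignore file only if that exact line is absent.
ensure_ignored() {
  local pattern="$1" file="${2:-.gitignore}"
  touch "$file"
  if grep -qxF -- "$pattern" "$file"; then
    return 0
  fi
  # If the file does not end with a newline, add one so the new
  # pattern is not glued onto the last existing line.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  printf '%s\n' "$pattern" >> "$file"
}
```

After `ensure_ignored .env`, running `git check-ignore -v .env` inside the repository confirms which rule matches. Ignoring a file does not untrack it if it was already committed.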

Rotate keys periodically as a security best practice. Most providers let you create multiple keys—generate a new one, update your configuration to use it, verify everything works, then revoke the old key. This rotation limits exposure if a key is ever compromised.

Store keys in your shell configuration files (~/.bashrc, ~/.zshrc) for local development. For production deployments, inject keys at runtime through a secret manager, your deployment platform's environment variable injection, or a configuration management tool that handles credentials securely.

Never share API keys directly. If you need to collaborate with others, each person should obtain their own keys from the provider. Sharing keys makes it impossible to track who's responsible for usage and complicates revocation if someone leaves the project or a key is compromised.

Understanding Rate Limits

Cloud providers impose rate limits to prevent abuse and ensure fair resource allocation. These limits typically restrict how many requests you can make per minute, per day, or per month. When you exceed rate limits, providers return error responses instead of processing your requests.

Anthropic's rate limits vary by usage tier—new accounts have lower limits than established, high-volume customers. You can see your current limits in the Anthropic console and request increases if needed. OpenAI follows a similar tiered approach, with limits increasing as you demonstrate responsible usage over time.

When building applications that might hit rate limits, implement retry logic with exponential backoff. If a request fails due to rate limiting, wait progressively longer before retrying—this prevents hammering the API and gives the limit window time to reset. Most provider client libraries include retry logic, but verify this when using direct API access.
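Exponential backoff can be sketched in a few lines of shell. The wrapper below retries on any non-zero exit status, so it cannot tell a rate limit from other failures. It also omits jitter, which production retry logic often adds. `retry_with_backoff` is a hypothetical helper name.

```shell
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the wait between attempts.
retry_with_backoff() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

A call such as `retry_with_backoff 5 2 lattice-ctl provider test anthropic` would wait 2, 4, 8, and 16 seconds between its five attempts. It assumes the wrapped command exits non-zero on failure, which this guide does not confirm for `lattice-ctl`.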

Distributing load across multiple providers helps work around rate limits organically. If you're pushing the limits on Anthropic, some requests can go to OpenAI or Google instead. This natural load balancing keeps all providers within their limits while maintaining total throughput.
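A simple form of this is ordered failover: probe providers in order of preference and use the first one that responds. The sketch below is failover rather than true load balancing. `first_available` and `lattice_probe` are hypothetical names, and the probe assumes `lattice-ctl provider test` exits non-zero when a provider is unavailable.

```shell
# first_available PROBE PROVIDER...
# Prints the first provider for which `PROBE PROVIDER` exits successfully.
# PROBE is the name of any command or shell function.
first_available() {
  local probe="$1" provider
  shift
  for provider in "$@"; do
    if "$probe" "$provider" >/dev/null 2>&1; then
      echo "$provider"
      return 0
    fi
  done
  return 1
}

# Probe built on the provider test command used in this guide.
# Assumption: it exits non-zero on failure.
lattice_probe() {
  lattice-ctl provider test "$1"
}
```

For example, `provider="$(first_available lattice_probe anthropic openai google)"` selects a provider that can then be passed to `lattice-ctl daemon start --provider "$provider"`.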

Monitoring Usage and Costs

Every cloud provider offers usage dashboards where you can track request volumes, token consumption, and resulting costs. Check these dashboards regularly to ensure spending aligns with expectations and to catch anomalies early.

Set up billing alerts in each provider's console. Configure alerts at levels that give you early warning—perhaps at 50%, 75%, and 90% of your monthly budget. These notifications prevent surprise bills and give you time to investigate unexpected usage spikes.
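The dollar amounts for those alert levels follow directly from the budget. A small sketch, with `alert_thresholds` as a hypothetical helper and 200 as an example budget:

```shell
# alert_thresholds MONTHLY_BUDGET
# Prints the amounts at 50%, 75%, and 90% of the budget.
alert_thresholds() {
  awk -v budget="$1" 'BEGIN {
    n = split("50 75 90", pct, " ")
    for (i = 1; i <= n; i++) printf "%d%%: %.2f\n", pct[i], budget * pct[i] / 100
  }'
}

alert_thresholds 200
# 50%: 100.00
# 75%: 150.00
# 90%: 180.00
```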

Track which parts of your application or which workflows consume the most tokens. This attribution helps optimize costs by identifying inefficient prompts, unnecessary requests, or opportunities to use cheaper models without sacrificing quality. Many providers offer detailed breakdowns by API key, making it easy to track usage across different projects or environments.

Troubleshooting Cloud Provider Issues

When cloud provider requests fail, several common issues are worth checking first. Invalid or missing API keys cause authentication failures—verify the key is set correctly with echo $ANTHROPIC_API_KEY or the equivalent for your provider. Ensure the daemon restarted after you set the key, as it needs to inherit the environment variable.
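Echoing the variable prints the full secret to the terminal and its scrollback. When sharing a screen or pasting output into a bug report, a masked view is usually enough to confirm that the right key is loaded. `mask_secret` below is a hypothetical helper, not a Lattice command.

```shell
# mask_secret VALUE
# Prints the first 7 characters and the total length, never the full value.
mask_secret() {
  local value="$1"
  if [ -z "$value" ]; then
    echo "(not set)"
    return 1
  fi
  printf '%s... (%d chars)\n' "$(printf '%s' "$value" | cut -c1-7)" "${#value}"
}

mask_secret "${ANTHROPIC_API_KEY:-}" || true
```

Seven characters is just enough to show the sk-ant- prefix. Adjust the count for providers with other key formats.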

Rate limit errors indicate you're exceeding allowed request volumes. Check your usage in the provider's dashboard to confirm you're hitting limits. If legitimate usage exceeds your limits, contact the provider to request an increase. For development spikes causing temporary limit issues, spread requests over time or route some of them to another configured provider.

Network timeouts suggest connectivity problems between your machine and the provider. Test basic connectivity with curl to the provider's API endpoint. If that works but Lattice still times out, check your IPC timeout settings—increase ipc_timeout_ms in the daemon configuration for slow network conditions.
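curl's exit code helps separate DNS, connection, and timeout problems. The sketch below maps a few well-known curl exit codes to a likely cause. `explain_curl_exit` and `probe_endpoint` are hypothetical helper names, and the 10-second limit is an arbitrary choice.

```shell
# explain_curl_exit CODE
# Maps a few well-known curl exit codes to a likely cause.
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable: the endpoint answered" ;;
    6)  echo "DNS failure: could not resolve host" ;;
    7)  echo "connection failed: host unreachable or port blocked" ;;
    28) echo "timeout: no response within the time limit" ;;
    *)  echo "curl exited with code $1; see the curl man page" ;;
  esac
}

# probe_endpoint URL
# Any HTTP response, even 401 or 404, counts as reachable here, because
# curl without --fail only reports transport-level errors.
probe_endpoint() {
  local code=0
  curl -sS -o /dev/null --max-time 10 "$1" || code=$?
  explain_curl_exit "$code"
}
```

For Anthropic, `probe_endpoint https://api.anthropic.com/` is a reasonable first check. Substitute the API host of whichever provider is timing out.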

Model not found errors mean you're requesting a model that doesn't exist or you don't have access to. Verify the model name matches exactly what the provider expects—model names are case-sensitive and version-specific. Check your API tier to ensure you have access to the requested model.
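An exact, case-sensitive comparison against a model list catches most naming mistakes. The sketch below reads a list on stdin and assumes one model name per line. The output format of `lattice-ctl provider models` is not documented here, so treat that as an assumption. `model_listed` is a hypothetical helper name.

```shell
# model_listed NAME
# Reads a model list on stdin (assumed one name per line, possibly with
# surrounding whitespace) and succeeds only on an exact match.
model_listed() {
  local wanted="$1"
  awk -v wanted="$wanted" '
    { gsub(/^[ \t]+|[ \t]+$/, "") }
    $0 == wanted { found = 1 }
    END { exit found ? 0 : 1 }
  '
}
```

For example, `lattice-ctl provider models openai | model_listed gpt-4-turbo || echo "not listed"` would flag a typo such as GPT-4-Turbo or a model the account cannot access.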

Next Steps

Cloud provider configuration is one piece of the Lattice puzzle. The local models guide covers setting up Ollama and LM Studio for offline, private AI. The configuration overview explains how all configuration pieces fit together. For testing and troubleshooting providers, see the provider CLI commands guide. Finally, return to the Lattice overview to understand how cloud and local providers complement each other in a complete AI setup.