
Every `exe.dev` VM has access to the LLM Gateway, a built-in proxy to
Anthropic, OpenAI, and Fireworks APIs. Your subscription includes a monthly
token allocation, and you can purchase additional tokens at
[https://exe.dev/user](https://exe.dev/user).

See the [full list of supported models](/llm-gateway-models) ([JSON](/llm-gateway-models.json)).

The gateway is available inside your VM at
`http://169.254.169.254/gateway/llm/<provider>`, where `<provider>` is one of
`anthropic`, `openai`, or `fireworks`. No API keys are necessary.

[Shelley](/docs/shelley/intro) uses the LLM Gateway by default, but you can
also use it directly from any program running on your VM.
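For example, because the gateway is a plain HTTP endpoint, any language's HTTP client works. Here is a minimal sketch using only Python's standard library; the `build_request` helper is illustrative, and the request only succeeds inside an exe.dev VM:

```python
# Minimal sketch: call the gateway's Anthropic endpoint with Python's
# standard library. No API key is needed; the gateway authenticates the VM.
import json
import urllib.request

GATEWAY = "http://169.254.169.254/gateway/llm"

def build_request(provider: str, path: str, body: dict) -> urllib.request.Request:
    """Build a POST request against the gateway for the given provider."""
    return urllib.request.Request(
        f"{GATEWAY}/{provider}{path}",
        data=json.dumps(body).encode(),
        headers={
            "content-type": "application/json",
            "anthropic-version": "2023-06-01",
        },
    )

req = build_request("anthropic", "/v1/messages", {
    "model": "claude-sonnet-4-6",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}],
})

# Sending the request only works from inside a VM:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```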

## Using the gateway with Codex

Add an OpenAI-compatible provider to `~/.codex/config.toml`:

```toml
model_provider = "exe-openai"

[model_providers.exe-openai]
name = "exe.dev LLM Gateway"
base_url = "http://169.254.169.254/gateway/llm/openai/v1"
requires_openai_auth = false
```

Then run Codex normally:

```sh
$ codex
```

The `base_url` ends at `/v1`. Codex adds the Responses API path when it
makes model requests.

## Using the gateway with Claude Code

Add the Anthropic gateway base URL to `~/.claude/settings.json`:

```json
{
  "apiKeyHelper": "printf exe-gateway",
  "env": {
    "ANTHROPIC_BASE_URL": "http://169.254.169.254/gateway/llm/anthropic"
  }
}
```

Claude Code expects an API key source, so `apiKeyHelper` returns a harmless
placeholder. The gateway authenticates the VM; you do not need an Anthropic
API key.
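If you prefer to script the edit rather than hand-edit the file, a small stdlib sketch can merge these keys into an existing `settings.json` without clobbering other settings (the `add_gateway_settings` helper is illustrative):

```python
# Sketch: idempotently merge the gateway settings into Claude Code's
# settings.json, preserving any keys that are already there.
import json
import pathlib

def add_gateway_settings(path: pathlib.Path) -> None:
    # Start from the existing settings if the file exists.
    settings = json.loads(path.read_text()) if path.exists() else {}
    settings["apiKeyHelper"] = "printf exe-gateway"
    settings.setdefault("env", {})["ANTHROPIC_BASE_URL"] = (
        "http://169.254.169.254/gateway/llm/anthropic"
    )
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n")

add_gateway_settings(pathlib.Path.home() / ".claude" / "settings.json")
```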

Then run Claude Code normally:

```sh
$ claude
```

The `ANTHROPIC_BASE_URL` ends at `/anthropic`. Claude Code adds the Anthropic
API paths when it makes model requests.

## Using the gateway with curl

Point your requests at the gateway URL instead of the provider's own API endpoint:

```sh
$ curl -s http://169.254.169.254/gateway/llm/anthropic/v1/messages \
    -H "content-type: application/json" \
    -H "anthropic-version: 2023-06-01" \
    -d '{
      "model": "claude-sonnet-4-6",
      "max_tokens": 256,
      "messages": [{"role": "user", "content": "Hello!"}]
    }'
```

OpenAI and Fireworks work the same way:

```sh
$ curl -s http://169.254.169.254/gateway/llm/openai/v1/chat/completions \
    -H "content-type: application/json" \
    -d '{
      "model": "gpt-5.5",
      "messages": [{"role": "user", "content": "Hello!"}]
    }'
```

```sh
$ curl -s http://169.254.169.254/gateway/llm/fireworks/inference/v1/chat/completions \
    -H "content-type: application/json" \
    -d '{
      "model": "accounts/fireworks/models/llama-v3p1-8b-instruct",
      "messages": [{"role": "user", "content": "Hello!"}]
    }'
```
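When calling the gateway from a script, note that the response bodies follow the upstream APIs: the Anthropic Messages API returns a list of content blocks, while the OpenAI-style chat completions endpoints (OpenAI and Fireworks) return a list of choices. A small helper can extract the generated text from either shape; this sketch assumes the response has already been decoded from JSON into a dict:

```python
# Sketch: pull the generated text out of a decoded gateway response.
# Anthropic responses carry content blocks; OpenAI-style responses
# (OpenAI, Fireworks) carry choices with a message.
def extract_text(response: dict) -> str:
    if "content" in response:
        # Anthropic Messages API: concatenate the text blocks.
        return "".join(
            block["text"]
            for block in response["content"]
            if block.get("type") == "text"
        )
    # OpenAI / Fireworks chat completions.
    return response["choices"][0]["message"]["content"]
```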
