# Documentation Index

Fetch the complete documentation index at: https://mintlify.com/iamngoni/heimdall/llms.txt

Use this file to discover all available pages before exploring further.
Heimdall uses large language models (LLMs) for threat modeling, vulnerability discovery, and code analysis. You bring your own API keys (BYOK).

## Supported Providers

Heimdall supports three AI providers:
| Provider | Models | Use Case |
|---|---|---|
| Anthropic (Claude) | Claude Sonnet 4, Claude Opus | Recommended for security analysis (native tool use) |
| OpenAI | GPT-4o, o1, o3-mini | Alternative with function calling |
| Ollama | Llama 3.3, Mistral, DeepSeek, Qwen | Local inference (no API key required) |
## Quick Start

Set at least one provider in your `.env` file:

```bash
# Anthropic (recommended)
ANTHROPIC_API_KEY=sk-ant-...

# Or OpenAI
OPENAI_API_KEY=sk-...

# Or Ollama (local)
OLLAMA_URL=http://localhost:11434

# Optional: override default model
DEFAULT_AI_MODEL=claude-sonnet-4-20250514
```

Restart Heimdall:

```bash
docker compose restart heimdall
```
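Before restarting, you can sanity-check that at least one provider variable is visible to the shell. The helper below is hypothetical (not part of Heimdall, which performs its own check at startup); it just mirrors the "at least one provider" rule:

```bash
# Hypothetical pre-flight check: succeed if at least one provider
# variable is set in the current environment.
check_provider() {
  if [ -n "${ANTHROPIC_API_KEY}${OPENAI_API_KEY}${OLLAMA_URL}" ]; then
    echo "provider configured"
  else
    echo "No AI provider configured" >&2
    return 1
  fi
}

OLLAMA_URL=http://localhost:11434
check_provider   # prints "provider configured"
```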
## Anthropic (Claude)

### Getting an API Key

- Sign up at console.anthropic.com
- Go to API Keys → Create Key
- Copy the key (starts with `sk-ant-`)

### Configuration

Add to `.env`:

```bash
ANTHROPIC_API_KEY=sk-ant-api03-...
DEFAULT_AI_MODEL=claude-sonnet-4-20250514
```
### Supported Models

| Model ID | Description | Context | Cost (per 1M tokens) |
|---|---|---|---|
| `claude-sonnet-4-20250514` | Balanced performance (default) | 200k | Input: $3 / Output: $15 |
| `claude-opus-4-20250514` | Highest capability | 200k | Input: $15 / Output: $75 |
| `claude-3-5-sonnet-20241022` | Previous generation | 200k | Input: $3 / Output: $15 |
Claude is recommended because it supports the native tool-use format, which the Hunt agent relies on for code analysis.
### Rate Limits

Anthropic enforces per-minute rate limits:

- Free tier: 5 RPM (requests per minute)
- Paid tiers: 50-1,000+ RPM, depending on usage tier

If you hit rate limits, configure a fallback provider (see Fallback Provider Chain below).
## OpenAI

### Getting an API Key

- Sign up at platform.openai.com
- Go to API Keys → Create new secret key
- Copy the key (starts with `sk-`)

### Configuration

Add to `.env`:

```bash
OPENAI_API_KEY=sk-...
DEFAULT_AI_MODEL=gpt-4o
```
### Supported Models

| Model ID | Description | Context | Cost (per 1M tokens) |
|---|---|---|---|
| `gpt-4o` | Optimized GPT-4 (default) | 128k | Input: $5 / Output: $15 |
| `o1-preview` | Reasoning model | 128k | Input: $15 / Output: $60 |
| `o3-mini` | Fast, cost-efficient | 128k | Input: $1 / Output: $4 |
### Rate Limits

- Free tier: 3 RPM
- Tier 1 ($5+ spent): 500 RPM
- Tier 5 ($1,000+ spent): 10,000 RPM

See OpenAI rate limits for details.
## Ollama (Local)

Ollama runs LLMs locally — no API key or internet connection required.

### Installation

macOS / Linux:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

Docker:

```bash
docker run -d -p 11434:11434 --name ollama ollama/ollama
```
### Pull a Model

Recommended models for security analysis:

- `llama3.3` (70B parameters, best quality)
- `deepseek-coder-v2` (optimized for code)
- `qwen2.5-coder` (fast, lightweight)
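Assuming the Ollama CLI is installed, each model is fetched with `ollama pull` before first use (downloads range from a few GB to tens of GB):

```bash
# Fetch the recommended models (requires a running Ollama install)
ollama pull llama3.3
ollama pull deepseek-coder-v2
ollama pull qwen2.5-coder

# Confirm what is available locally
ollama list
```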
### Configuration

Add to `.env`:

```bash
OLLAMA_URL=http://localhost:11434
DEFAULT_AI_MODEL=llama3.3
```

For Docker deployments, use the container name:

```bash
OLLAMA_URL=http://ollama:11434
```
### Hardware Requirements

| Model | Parameters | RAM | VRAM (GPU) |
|---|---|---|---|
| `llama3.3` | 70B | 64 GB | 24 GB |
| `qwen2.5-coder:32b` | 32B | 32 GB | 16 GB |
| `deepseek-coder-v2:16b` | 16B | 16 GB | 8 GB |

Large models like Llama 3.3 (70B) require significant hardware. For smaller deployments, use quantized models (e.g., `llama3.3:8b-q4_0`).
## Fallback Provider Chain

When multiple providers are configured, Heimdall automatically chains them in priority order:

Priority: Anthropic → OpenAI → Ollama

If the primary provider fails with a retryable error, the request falls through to the next provider.
### Retryable Errors

- HTTP 429: Rate limit exceeded
- HTTP 500/502/503/529: Server errors
- Network errors: Connection timeout, DNS failure
- Billing errors: Insufficient credits, quota exceeded

### Non-Retryable Errors

These propagate immediately (no fallback):

- HTTP 401: Invalid API key
- HTTP 400: Malformed request
- HTTP 404: Model not found
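The status-code part of this decision can be sketched as a small predicate (an illustration of the rules above, not the actual implementation in src/ai/fallback.rs, which also handles network and billing errors):

```bash
# Illustrative sketch: decide whether an HTTP status triggers provider fallback.
is_retryable() {
  case "$1" in
    429|500|502|503|529) return 0 ;;  # rate limits and server errors: try next provider
    401|400|404)         return 1 ;;  # auth/request/model errors: propagate immediately
    *)                   return 1 ;;  # default: do not fall back
  esac
}

is_retryable 429 && echo "fall back"     # prints "fall back"
is_retryable 401 || echo "propagate"     # prints "propagate"
```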
### Example Configuration

```bash
# Primary: Claude (fastest, best quality)
ANTHROPIC_API_KEY=sk-ant-...

# Fallback 1: OpenAI (if Claude rate-limited)
OPENAI_API_KEY=sk-...

# Fallback 2: Ollama (if both cloud providers fail)
OLLAMA_URL=http://localhost:11434
```

Result: every LLM call attempts Claude first. If it fails with HTTP 429, OpenAI is tried. If OpenAI also fails, Ollama is used as a last resort.
## Observability

Every LLM call records which provider and model were actually used:

```sql
SELECT provider, model, tool_name, created_at
FROM agent_tool_calls
WHERE scan_id = '<your_scan_id>'
ORDER BY created_at DESC;
```

This lets you audit which provider served each request — especially useful when fallback kicks in.

See `src/ai/fallback.rs` for the implementation.
## Model Selection

Heimdall infers the provider from the model name:

| Model Name Contains | Provider |
|---|---|
| claude | Anthropic |
| gpt, o1, o3, o4 | OpenAI |
| llama, mistral, qwen, deepseek, phi, gemma, codellama | Ollama |
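The inference rule amounts to substring matching, which can be sketched as follows (an illustration of the table above, not the actual code in src/ai/mod.rs):

```bash
# Sketch: map a model name to its provider by substring match.
infer_provider() {
  case "$1" in
    *claude*)             echo "anthropic" ;;
    *gpt*|*o1*|*o3*|*o4*) echo "openai" ;;
    *llama*|*mistral*|*qwen*|*deepseek*|*phi*|*gemma*|*codellama*) echo "ollama" ;;
    *)                    echo "unknown" ;;
  esac
}

infer_provider "claude-sonnet-4-20250514"   # anthropic
infer_provider "gpt-4o"                     # openai
infer_provider "qwen2.5-coder"              # ollama
```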
### Override Default Model

Set `DEFAULT_AI_MODEL` in `.env`. If the model doesn't match the provider, Heimdall falls back to a safe default:

- Anthropic → `claude-sonnet-4-20250514`
- OpenAI → `gpt-4o`
- Ollama → `llama3.3`

See `src/ai/mod.rs:98-106` for the fallback logic.
## BYOK (Bring Your Own Key) Approach

Heimdall never provides or manages API keys. You maintain full control:

- User-level keys: set via Settings UI after registration
- System-level keys: set in `.env` for all users

User keys override system keys.
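The precedence rule can be sketched like this (hypothetical helper; the real lookup happens server-side when a request is made):

```bash
# Sketch: a non-empty user-level key takes precedence over the system-level key.
resolve_key() {
  user_key="$1"
  system_key="$2"
  echo "${user_key:-$system_key}"
}

resolve_key "" "sk-system"          # sk-system (no user key set)
resolve_key "sk-user" "sk-system"   # sk-user (user key wins)
```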
### Encryption at Rest

API keys stored in the database are encrypted with AES-256-GCM:

```bash
# Generate a 32-byte encryption key
openssl rand -hex 32

# Add to .env
ENCRYPTION_KEY=<generated_key>
```

If `ENCRYPTION_KEY` is not set, keys are stored as hex-encoded plaintext. Always configure encryption for production.

See `src/crypto.rs` for the encryption implementation.
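AES-256 needs exactly 32 bytes of key material, which `openssl rand -hex 32` emits as 64 hex characters. A quick sanity check before putting the value in `.env`:

```bash
# Generate a key and verify it is 64 hex characters (32 bytes).
KEY=$(openssl rand -hex 32)
if [ ${#KEY} -eq 64 ]; then
  echo "key length ok"
fi
```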
## Testing Connections

Verify your provider configuration:

### Via Settings UI

- Navigate to Settings → AI Providers
- Click Test Connection next to each provider
- A successful test shows a green checkmark

### Via API

```bash
curl -X POST http://localhost:8080/api/settings/test-connection \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your_session_token>" \
  -d '{"provider": "anthropic", "api_key": "sk-ant-..."}'
```
Response:

```json
{
  "status": "ok",
  "data": {
    "provider": "anthropic",
    "model": "claude-sonnet-4-20250514",
    "working": true
  }
}
```
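In a script, the `working` flag can be checked without extra tooling, for example with `grep` (a sketch; `jq -r '.data.working'` is the cleaner option if jq is installed):

```bash
# Sketch: check the "working" flag in a test-connection response.
RESPONSE='{"status":"ok","data":{"provider":"anthropic","model":"claude-sonnet-4-20250514","working":true}}'

if echo "$RESPONSE" | grep -q '"working": *true'; then
  echo "provider is working"
fi
```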
## Troubleshooting

### No AI Provider Configured

Symptom: Scans fail with "No AI provider configured"

Solution: Set at least one provider in `.env` and restart Heimdall.
### Invalid API Key

Symptom: HTTP 401 errors in logs

Solutions:

- Verify the key format:
  - Anthropic: `sk-ant-...`
  - OpenAI: `sk-...`
- Check for extra whitespace in `.env`
- Regenerate the key in your provider console
### Rate Limit Exceeded

Symptom: HTTP 429 errors, scans pause frequently

Solutions:

- Upgrade to a paid tier with higher limits
- Configure a fallback provider:

  ```bash
  ANTHROPIC_API_KEY=sk-ant-...  # Primary
  OPENAI_API_KEY=sk-...         # Fallback
  ```

- Reduce concurrent scans
### Ollama Connection Refused

Symptom: "Failed to connect to Ollama at http://localhost:11434"

Solutions:

- Verify Ollama is running: `curl http://localhost:11434/api/tags`
- Check Docker networking (if using containers): use the container name instead of localhost, i.e. `OLLAMA_URL=http://ollama:11434`
- Ensure the model is pulled: `ollama pull llama3.3`
### Model Not Found

Symptom: "Model 'xyz' not found"

Solutions:

- Verify the model ID is correct (check provider docs)
- For Ollama, pull the model: `ollama pull <model_id>`
- For cloud providers, ensure your API tier has access to the model
## Cost Optimization

- Use Claude Sonnet (not Opus) for most scans — it's 5× cheaper
- Configure Ollama as a fallback to avoid overage charges
- Monitor usage:

  ```sql
  SELECT provider, model, COUNT(*) AS calls,
         AVG(input_tokens) AS avg_input, AVG(output_tokens) AS avg_output
  FROM agent_tool_calls
  WHERE created_at > NOW() - INTERVAL '30 days'
  GROUP BY provider, model;
  ```
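Multiplying the token counts from that query by the per-million prices gives a back-of-envelope monthly cost. The volumes below are hypothetical; the $3/$15 prices are Claude Sonnet's from the table above:

```bash
# Back-of-envelope cost estimate with hypothetical monthly volumes.
INPUT_TOKENS=20000000    # 20M input tokens
OUTPUT_TOKENS=5000000    # 5M output tokens
INPUT_PRICE=3            # $ per 1M input tokens (Claude Sonnet)
OUTPUT_PRICE=15          # $ per 1M output tokens

COST=$(( (INPUT_TOKENS * INPUT_PRICE + OUTPUT_TOKENS * OUTPUT_PRICE) / 1000000 ))
echo "estimated monthly cost: \$${COST}"   # estimated monthly cost: $135
```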
## API Reference

| Endpoint | Method | Description |
|---|---|---|
| `/api/settings` | GET | Get AI provider status |
| `/api/settings/api-keys` | POST | Store user API key |
| `/api/settings/api-keys/{id}` | DELETE | Delete user API key |
| `/api/settings/test-connection` | POST | Test provider connection |

See API Reference for full details.