# Documentation Index

Fetch the complete documentation index at: https://mintlify.com/iamngoni/heimdall/llms.txt

Use this file to discover all available pages before exploring further.
Heimdall uses large language models (LLMs) for threat modeling, vulnerability discovery, and code analysis. You bring your own API keys (BYOK).

## Supported Providers

Heimdall supports three AI providers:
| Provider | Models | Use Case |
|---|---|---|
| Anthropic (Claude) | Claude Sonnet 4, Claude Opus | Recommended for security analysis (native tool use) |
| OpenAI | GPT-4o, o1, o3-mini | Alternative with function calling |
| Ollama | Llama 3.3, Mistral, DeepSeek, Qwen | Local inference (no API key required) |
## Quick Start

Set at least one provider in your `.env` file:

```bash
# Anthropic (recommended)
ANTHROPIC_API_KEY=sk-ant-...

# Or OpenAI
OPENAI_API_KEY=sk-...

# Or Ollama (local)
OLLAMA_URL=http://localhost:11434

# Optional: override default model
DEFAULT_AI_MODEL=claude-sonnet-4-20250514
```

Restart Heimdall:

```bash
docker compose restart heimdall
```
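Before restarting, you can sanity-check that at least one provider variable is visible to the shell. The helper below is hypothetical (not part of Heimdall, which performs its own check at startup); it just mirrors the "at least one provider" rule:

```bash
# Hypothetical pre-flight check: succeed if at least one provider
# variable is set in the current environment.
check_provider() {
  if [ -n "${ANTHROPIC_API_KEY}${OPENAI_API_KEY}${OLLAMA_URL}" ]; then
    echo "provider configured"
  else
    echo "No AI provider configured" >&2
    return 1
  fi
}

OLLAMA_URL=http://localhost:11434
check_provider   # prints "provider configured"
```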
## Anthropic (Claude)

### Getting an API Key

- Sign up at console.anthropic.com
- Go to API Keys → Create Key
- Copy the key (starts with `sk-ant-`)

### Configuration

Add to `.env`:

```bash
ANTHROPIC_API_KEY=sk-ant-api03-...
DEFAULT_AI_MODEL=claude-sonnet-4-20250514
```
### Supported Models

| Model ID | Description | Context | Cost (per 1M tokens) |
|---|---|---|---|
| `claude-sonnet-4-20250514` | Balanced performance (default) | 200k | Input: $3 / Output: $15 |
| `claude-opus-4-20250514` | Highest capability | 200k | Input: $15 / Output: $75 |
| `claude-3-5-sonnet-20241022` | Previous generation | 200k | Input: $3 / Output: $15 |
Claude is recommended because it supports the native tool-use format, which the Hunt agent relies on for code analysis.
### Rate Limits

Anthropic enforces per-minute rate limits:

- Free tier: 5 RPM (requests per minute)
- Paid tiers: 50-1,000+ RPM, depending on usage tier

If you hit rate limits, configure a fallback provider (see Fallback Provider Chain below).
## OpenAI

### Getting an API Key

- Sign up at platform.openai.com
- Go to API Keys → Create new secret key
- Copy the key (starts with `sk-`)

### Configuration

Add to `.env`:

```bash
OPENAI_API_KEY=sk-...
DEFAULT_AI_MODEL=gpt-4o
```
### Supported Models

| Model ID | Description | Context | Cost (per 1M tokens) |
|---|---|---|---|
| `gpt-4o` | Optimized GPT-4 (default) | 128k | Input: $5 / Output: $15 |
| `o1-preview` | Reasoning model | 128k | Input: $15 / Output: $60 |
| `o3-mini` | Fast, cost-efficient | 128k | Input: $1 / Output: $4 |
### Rate Limits

- Free tier: 3 RPM
- Tier 1 ($5+ spent): 500 RPM
- Tier 5 ($1,000+ spent): 10,000 RPM

See OpenAI rate limits for details.
## Ollama (Local)

Ollama runs LLMs locally — no API key or internet connection required.

### Installation

macOS / Linux:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

Docker:

```bash
docker run -d -p 11434:11434 --name ollama ollama/ollama
```
### Pull a Model

Recommended models for security analysis:

- `llama3.3` (70B parameters, best quality)
- `deepseek-coder-v2` (optimized for code)
- `qwen2.5-coder` (fast, lightweight)
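Assuming the Ollama CLI is installed, each model is fetched with `ollama pull` before first use (downloads range from a few GB to tens of GB):

```bash
# Fetch the recommended models (requires a running Ollama install)
ollama pull llama3.3
ollama pull deepseek-coder-v2
ollama pull qwen2.5-coder

# Confirm what is available locally
ollama list
```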
### Configuration

Add to `.env`:

```bash
OLLAMA_URL=http://localhost:11434
DEFAULT_AI_MODEL=llama3.3
```

For Docker deployments, use the container name:

```bash
OLLAMA_URL=http://ollama:11434
```
### Hardware Requirements

| Model | Parameters | RAM | VRAM (GPU) |
|---|---|---|---|
| `llama3.3` | 70B | 64 GB | 24 GB |
| `qwen2.5-coder:32b` | 32B | 32 GB | 16 GB |
| `deepseek-coder-v2:16b` | 16B | 16 GB | 8 GB |

Large models like Llama 3.3 (70B) require significant hardware. For smaller deployments, use quantized models (e.g., `llama3.3:8b-q4_0`).
## Fallback Provider Chain

When multiple providers are configured, Heimdall automatically chains them in priority order:

Priority: Anthropic → OpenAI → Ollama

If the primary provider fails with a retryable error, the request falls through to the next provider.
### Retryable Errors

- HTTP 429: Rate limit exceeded
- HTTP 500/502/503/529: Server errors
- Network errors: Connection timeout, DNS failure
- Billing errors: Insufficient credits, quota exceeded

### Non-Retryable Errors

These propagate immediately (no fallback):

- HTTP 401: Invalid API key
- HTTP 400: Malformed request
- HTTP 404: Model not found
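The status-code part of this decision can be sketched as a small predicate (an illustration of the rules above, not the actual implementation in src/ai/fallback.rs, which also handles network and billing errors):

```bash
# Illustrative sketch: decide whether an HTTP status triggers provider fallback.
is_retryable() {
  case "$1" in
    429|500|502|503|529) return 0 ;;  # rate limits and server errors: try next provider
    401|400|404)         return 1 ;;  # auth/request/model errors: propagate immediately
    *)                   return 1 ;;  # default: do not fall back
  esac
}

is_retryable 429 && echo "fall back"     # prints "fall back"
is_retryable 401 || echo "propagate"     # prints "propagate"
```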
### Example Configuration

```bash
# Primary: Claude (fastest, best quality)
ANTHROPIC_API_KEY=sk-ant-...

# Fallback 1: OpenAI (if Claude rate-limited)
OPENAI_API_KEY=sk-...

# Fallback 2: Ollama (if both cloud providers fail)
OLLAMA_URL=http://localhost:11434
```

Result: every LLM call attempts Claude first. If it fails with HTTP 429, OpenAI is tried. If OpenAI also fails, Ollama is used as a last resort.
## Observability

Every LLM call records which provider and model were actually used:

```sql
SELECT provider, model, tool_name, created_at
FROM agent_tool_calls
WHERE scan_id = '<your_scan_id>'
ORDER BY created_at DESC;
```

This lets you audit which provider served each request — especially useful when fallback kicks in.

See `src/ai/fallback.rs` for the implementation.
## Model Selection

Heimdall infers the provider from the model name:

| Model Name Contains | Provider |
|---|---|
| claude | Anthropic |
| gpt, o1, o3, o4 | OpenAI |
| llama, mistral, qwen, deepseek, phi, gemma, codellama | Ollama |
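The inference rule amounts to substring matching, which can be sketched as follows (an illustration of the table above, not the actual code in src/ai/mod.rs):

```bash
# Sketch: map a model name to its provider by substring match.
infer_provider() {
  case "$1" in
    *claude*)             echo "anthropic" ;;
    *gpt*|*o1*|*o3*|*o4*) echo "openai" ;;
    *llama*|*mistral*|*qwen*|*deepseek*|*phi*|*gemma*|*codellama*) echo "ollama" ;;
    *)                    echo "unknown" ;;
  esac
}

infer_provider "claude-sonnet-4-20250514"   # anthropic
infer_provider "gpt-4o"                     # openai
infer_provider "qwen2.5-coder"              # ollama
```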
### Override Default Model

Set `DEFAULT_AI_MODEL` in `.env`. If the model doesn't match the provider, Heimdall falls back to a safe default:

- Anthropic → `claude-sonnet-4-20250514`
- OpenAI → `gpt-4o`
- Ollama → `llama3.3`

See `src/ai/mod.rs:98-106` for the fallback logic.
## BYOK (Bring Your Own Key) Approach

Heimdall never provides or manages API keys. You maintain full control:

- User-level keys: set via Settings UI after registration
- System-level keys: set in `.env` for all users

User keys override system keys.
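The precedence rule can be sketched like this (hypothetical helper; the real lookup happens server-side when a request is made):

```bash
# Sketch: a non-empty user-level key takes precedence over the system-level key.
resolve_key() {
  user_key="$1"
  system_key="$2"
  echo "${user_key:-$system_key}"
}

resolve_key "" "sk-system"          # sk-system (no user key set)
resolve_key "sk-user" "sk-system"   # sk-user (user key wins)
```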
### Encryption at Rest

API keys stored in the database are encrypted with AES-256-GCM:

```bash
# Generate a 32-byte encryption key
openssl rand -hex 32

# Add to .env
ENCRYPTION_KEY=<generated_key>
```

If `ENCRYPTION_KEY` is not set, keys are stored as hex-encoded plaintext. Always configure encryption for production.

See `src/crypto.rs` for the encryption implementation.
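AES-256 needs exactly 32 bytes of key material, which `openssl rand -hex 32` emits as 64 hex characters. A quick sanity check before putting the value in `.env`:

```bash
# Generate a key and verify it is 64 hex characters (32 bytes).
KEY=$(openssl rand -hex 32)
if [ ${#KEY} -eq 64 ]; then
  echo "key length ok"
fi
```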
## Testing Connections

Verify your provider configuration:

### Via Settings UI

- Navigate to Settings → AI Providers
- Click Test Connection next to each provider
- A successful test shows a green checkmark

### Via API

```bash
curl -X POST http://localhost:8080/api/settings/test-connection \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your_session_token>" \
  -d '{"provider": "anthropic", "api_key": "sk-ant-..."}'
```
Response:

```json
{
  "status": "ok",
  "data": {
    "provider": "anthropic",
    "model": "claude-sonnet-4-20250514",
    "working": true
  }
}
```
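In a script, the `working` flag can be checked without extra tooling, for example with `grep` (a sketch; `jq -r '.data.working'` is the cleaner option if jq is installed):

```bash
# Sketch: check the "working" flag in a test-connection response.
RESPONSE='{"status":"ok","data":{"provider":"anthropic","model":"claude-sonnet-4-20250514","working":true}}'

if echo "$RESPONSE" | grep -q '"working": *true'; then
  echo "provider is working"
fi
```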
## Troubleshooting

### No AI Provider Configured

Symptom: Scans fail with "No AI provider configured"

Solution: Set at least one provider in `.env` and restart Heimdall.
### Invalid API Key

Symptom: HTTP 401 errors in logs

Solutions:

- Verify the key format:
  - Anthropic: `sk-ant-...`
  - OpenAI: `sk-...`
- Check for extra whitespace in `.env`
- Regenerate the key in your provider console
### Rate Limit Exceeded

Symptom: HTTP 429 errors, scans pause frequently

Solutions:

- Upgrade to a paid tier with higher limits
- Configure a fallback provider:

  ```bash
  ANTHROPIC_API_KEY=sk-ant-...  # Primary
  OPENAI_API_KEY=sk-...         # Fallback
  ```

- Reduce concurrent scans
### Ollama Connection Refused

Symptom: "Failed to connect to Ollama at http://localhost:11434"

Solutions:

- Verify Ollama is running: `curl http://localhost:11434/api/tags`
- Check Docker networking (if using containers): use the container name instead of localhost, i.e. `OLLAMA_URL=http://ollama:11434`
- Ensure the model is pulled: `ollama pull llama3.3`
### Model Not Found

Symptom: "Model 'xyz' not found"

Solutions:

- Verify the model ID is correct (check provider docs)
- For Ollama, pull the model: `ollama pull <model_id>`
- For cloud providers, ensure your API tier has access to the model
## Cost Optimization

- Use Claude Sonnet (not Opus) for most scans — it's 5× cheaper
- Configure Ollama as a fallback to avoid overage charges
- Monitor usage:

  ```sql
  SELECT provider, model, COUNT(*) AS calls,
         AVG(input_tokens) AS avg_input, AVG(output_tokens) AS avg_output
  FROM agent_tool_calls
  WHERE created_at > NOW() - INTERVAL '30 days'
  GROUP BY provider, model;
  ```
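Multiplying the token counts from that query by the per-million prices gives a back-of-envelope monthly cost. The volumes below are hypothetical; the $3/$15 prices are Claude Sonnet's from the table above:

```bash
# Back-of-envelope cost estimate with hypothetical monthly volumes.
INPUT_TOKENS=20000000    # 20M input tokens
OUTPUT_TOKENS=5000000    # 5M output tokens
INPUT_PRICE=3            # $ per 1M input tokens (Claude Sonnet)
OUTPUT_PRICE=15          # $ per 1M output tokens

COST=$(( (INPUT_TOKENS * INPUT_PRICE + OUTPUT_TOKENS * OUTPUT_PRICE) / 1000000 ))
echo "estimated monthly cost: \$${COST}"   # estimated monthly cost: $135
```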
## API Reference

| Endpoint | Method | Description |
|---|---|---|
| `/api/settings` | GET | Get AI provider status |
| `/api/settings/api-keys` | POST | Store user API key |
| `/api/settings/api-keys/{id}` | DELETE | Delete user API key |
| `/api/settings/test-connection` | POST | Test provider connection |

See API Reference for full details.