AI Models
ChatRAG supports 100+ AI models from multiple providers with reasoning capabilities, multi-modal generation, and flexible configuration.
Model Providers
OpenRouter (Recommended)
100+ models. Unified API for accessing models from multiple providers with competitive pricing.
OpenAI
GPT-4, GPT-4o, o1, o3
Anthropic
Claude 3.5, 4.1 Opus
Google
Gemini 2.5 Flash, Thinking
Meta
Llama 4 Maverick
Direct Provider APIs
Connect directly to individual providers for specific features
- OpenAI: Required for embeddings, optional for chat models
- Anthropic: Direct Claude API access
- Google: Gemini models
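As a sketch, direct-provider keys would go in .env.local alongside the other variables shown in this guide. The variable names below are assumptions for illustration; check ChatRAG's env template for the canonical names.

```
# Hypothetical variable names -- verify against ChatRAG's env template
OPENROUTER_API_KEY=sk-or-...   # recommended: unified access to 100+ models
OPENAI_API_KEY=sk-...          # required for embeddings
ANTHROPIC_API_KEY=sk-ant-...   # direct Claude API access
GOOGLE_API_KEY=...             # Gemini models
```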
Pre-configured Models
ChatRAG comes with these models ready to use:
GPT-4.1 Mini
Ideal for general chat and quick responses
Claude Sonnet 4.5
Extended thinking, excellent for complex tasks
Gemini 2.5 Flash
Ultra-fast responses, large context window
Llama 4 Maverick
Open source, privacy-focused
Venice: Uncensored
Uncensored model, free tier available
Reasoning / Thinking Models
Advanced models that use extended thinking for complex problem-solving:
OpenAI o1/o3 Series
Reasoning depth controlled via effort levels (low, medium, high)
NEXT_PUBLIC_REASONING_ENABLED=true
NEXT_PUBLIC_DEFAULT_REASONING_EFFORT=medium
Claude 3.7+ Extended Thinking
Token-based reasoning (up to 32k reasoning tokens)
NEXT_PUBLIC_MAX_REASONING_TOKENS=8000
NEXT_PUBLIC_SHOW_REASONING_BY_DEFAULT=false
DeepSeek R1
Dual-method reasoning (supports both effort levels and token budgets)
Gemini Thinking
Token-based reasoning with configurable limits
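The three reasoning styles above (effort-based, token-based, and dual) can be sketched as a small helper that derives per-request parameters. This is an illustrative sketch, not ChatRAG's internals; the names `ReasoningMethod` and `reasoningParams` are assumptions, and the defaults mirror the env values shown above.

```typescript
// Sketch: derive per-request reasoning settings from a model's method.
// "effort" = o1/o3 style, "tokens" = Claude/Gemini style, "both" = DeepSeek R1.
type ReasoningMethod = "effort" | "tokens" | "both" | "none";

interface ReasoningParams {
  effort?: "low" | "medium" | "high";
  maxTokens?: number;
}

function reasoningParams(
  method: ReasoningMethod,
  effort: "low" | "medium" | "high" = "medium", // NEXT_PUBLIC_DEFAULT_REASONING_EFFORT
  maxTokens = 8000                              // NEXT_PUBLIC_MAX_REASONING_TOKENS
): ReasoningParams {
  switch (method) {
    case "effort": // effort levels only
      return { effort };
    case "tokens": // token budget only
      return { maxTokens };
    case "both":   // either method works
      return { effort, maxTokens };
    default:       // non-reasoning models send nothing extra
      return {};
  }
}
```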
When to Use Reasoning Models
- Complex problem-solving and analysis
- Mathematical or logical reasoning
- Multi-step planning and strategy
- Code debugging and optimization
Adding Models
Add new models through the Config UI or manually:
Via Config UI (Recommended)
- Run npm run config to open the Config UI
- Navigate to the Models section
- Click "Fetch Models" to get latest from OpenRouter
- Or manually add model with ID and display name
- Save configuration
- Restart dev server
Manual Configuration
Important: Sync 5 Locations
- .env.local
- scripts/init-env.js
- src/lib/env.ts
- scripts/config-server.js
- scripts/config-ui/index.html (3 fallback arrays)
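Each entry you sync across these files follows the model schema shown below. As a sketch, the corresponding TypeScript shape might look like this; the interface name `ModelEntry` is illustrative, not ChatRAG's actual type name.

```typescript
// Sketch of a model entry; field names taken from the schema below.
interface ModelEntry {
  id: string;            // provider-prefixed, e.g. "openai/gpt-4o"
  displayName: string;
  isFree: boolean;
  isOpenSource: boolean;
  supportsReasoning: boolean;
  reasoningMethod: "none" | "effort" | "tokens" | "both";
  contextLength: number;
  description: string;
}

const gpt4o: ModelEntry = {
  id: "openai/gpt-4o",
  displayName: "GPT-4o",
  isFree: false,
  isOpenSource: false,
  supportsReasoning: false,
  reasoningMethod: "none",
  contextLength: 128000,
  description: "Latest GPT-4 model",
};
```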
Model schema:
{
"id": "openai/gpt-4o",
"displayName": "GPT-4o",
"isFree": false,
"isOpenSource": false,
"supportsReasoning": false,
"reasoningMethod": "none",
"contextLength": 128000,
"description": "Latest GPT-4 model"
}
Model Selection Guide
For General Chat
Use GPT-4o-mini or Claude Sonnet 4.5 for balanced performance and cost
For RAG Applications
Use Claude Sonnet 4.5 for best context understanding and citation accuracy
For WhatsApp
Use GPT-4o-mini or Gemini Flash for fast, concise mobile responses
For Complex Tasks
Use o1, o3, or Claude with extended thinking for reasoning
For Cost Optimization
Use free tier models like Venice Uncensored or Llama 4 Maverick
Model Configuration
Default Models
# Set default model for chat
NEXT_PUBLIC_DEFAULT_MODEL=anthropic/claude-sonnet-4.5
# WhatsApp specific
WHATSAPP_DEFAULT_MODEL=openai/gpt-4o-mini
# Embed widget
NEXT_PUBLIC_EMBED_MODEL=openai/gpt-4o-mini
Temperature & Parameters
Temperature controls model creativity and randomness; it is typically set per request in the UI rather than via environment variables.
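Per-request parameters can be sketched as an OpenAI-compatible request body, which is the format OpenRouter accepts. The helper name `buildChatRequest` is illustrative, not ChatRAG's own code.

```typescript
// Sketch: an OpenAI-compatible chat completion request with a
// per-request temperature. 0 = deterministic, ~1 = more creative.
interface ChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  temperature?: number;
  max_tokens?: number;
}

function buildChatRequest(
  model: string,
  userMessage: string,
  temperature = 0.7
): ChatRequest {
  return {
    model,
    messages: [{ role: "user", content: userMessage }],
    temperature,
  };
}

// A request like this would then be POSTed to
// https://openrouter.ai/api/v1/chat/completions with an
// Authorization: Bearer <OPENROUTER_API_KEY> header.
```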
Model Icons in UI
Models display these indicators:
- 🧠 = Supports reasoning/thinking (supportsReasoning: true)
- 🎁 = Free tier available (isFree: true)
- 🔓 = Open source model (isOpenSource: true)
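The flag-to-icon mapping above can be sketched as a small helper; `iconsFor` is an illustrative name, not ChatRAG's actual UI code.

```typescript
// Sketch: map model-entry flags to the indicator icons shown in the UI.
interface ModelFlags {
  supportsReasoning: boolean;
  isFree: boolean;
  isOpenSource: boolean;
}

function iconsFor(m: ModelFlags): string[] {
  const icons: string[] = [];
  if (m.supportsReasoning) icons.push("🧠");
  if (m.isFree) icons.push("🎁");
  if (m.isOpenSource) icons.push("🔓");
  return icons;
}
```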