NOMOS supports multiple LLM providers, allowing you to choose the best model for your use case.
OpenAI:

```python
from nomos.llms import OpenAI

llm = OpenAI(model="gpt-4o-mini")

# Other supported models:
llm = OpenAI(model="gpt-4o")
llm = OpenAI(model="gpt-4-turbo")
llm = OpenAI(model="gpt-3.5-turbo")
```

Installation:

```bash
pip install nomos[openai]
```

Environment Variable:

```bash
export OPENAI_API_KEY=your-api-key-here
```
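A quick way to catch a missing key before the agent starts is to check the environment up front. This snippet is plain Python, independent of NOMOS:

```python
import os

# Fail fast with an actionable message instead of a mid-run auth error.
if not os.environ.get("OPENAI_API_KEY"):
    raise RuntimeError(
        "OPENAI_API_KEY is not set; export it before constructing OpenAI()."
    )
```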
Mistral:

```python
from nomos.llms import Mistral

llm = Mistral(model="ministral-8b-latest")

# Other supported models:
llm = Mistral(model="mistral-small")
llm = Mistral(model="mistral-medium")
llm = Mistral(model="mistral-large")
```

Installation:

```bash
pip install nomos[mistralai]
```

Environment Variable:

```bash
export MISTRAL_API_KEY=your-api-key-here
```
Gemini:

```python
from nomos.llms import Gemini

llm = Gemini(model="gemini-2.0-flash-exp")

# Other supported models:
llm = Gemini(model="gemini-1.5-pro")
llm = Gemini(model="gemini-1.5-flash")
llm = Gemini(model="gemini-1.0-pro")
```

Installation:

```bash
pip install nomos[gemini]
```

Environment Variable:

```bash
export GOOGLE_API_KEY=your-api-key-here
```
Ollama:

```python
from nomos.llms import Ollama

llm = Ollama(model="llama3.3")

# Other popular models:
llm = Ollama(model="qwen2.5:14b")
llm = Ollama(model="codestral")
llm = Ollama(model="deepseek-coder-v2")
llm = Ollama(model="phi4")
```

Installation:

```bash
pip install nomos[ollama]
```

Prerequisites:

- Install Ollama
- Pull the desired model:

```bash
ollama pull llama3.3
```
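Before wiring Ollama into NOMOS, you can confirm that the local server is running and the model is pulled. This check talks directly to Ollama's HTTP API (which listens on port 11434 by default) and does not involve NOMOS:

```python
import json
import urllib.request

# Ollama lists locally pulled models at /api/tags.
with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    names = [m["name"] for m in json.load(resp)["models"]]

print("Pulled models:", names)  # e.g. ['llama3.3:latest', ...]
```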
HuggingFace:

```python
from nomos.llms import HuggingFace

llm = HuggingFace(model="meta-llama/Meta-Llama-3-8B-Instruct")
# or
llm = HuggingFace(model="microsoft/DialoGPT-large")
```

Installation:

```bash
pip install nomos[huggingface]
```

Environment Variable:

```bash
export HUGGINGFACE_API_TOKEN=your-token-here
```
Anthropic:

```python
from nomos.llms import Anthropic

llm = Anthropic(model="claude-3-5-sonnet-20241022")

# Other supported models:
llm = Anthropic(model="claude-3-5-haiku-20241022")
llm = Anthropic(model="claude-3-opus-20240229")
llm = Anthropic(model="claude-3-sonnet-20240229")
llm = Anthropic(model="claude-3-haiku-20240307")
```

Installation:

```bash
pip install nomos[anthropic]
```

Environment Variable:

```bash
export ANTHROPIC_API_KEY=your-api-key-here
```
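Because every provider follows the same constructor pattern, switching models is a one-line change. One way to exploit that is a small fallback helper that picks whichever provider has credentials available; the helper itself is illustrative, not part of NOMOS:

```python
import os

from nomos.llms import Anthropic, OpenAI

def default_llm():
    """Return an LLM for whichever API key is present in the environment."""
    if os.environ.get("OPENAI_API_KEY"):
        return OpenAI(model="gpt-4o-mini")
    if os.environ.get("ANTHROPIC_API_KEY"):
        return Anthropic(model="claude-3-5-haiku-20241022")
    raise RuntimeError("No LLM API key found in the environment.")
```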
You can also specify LLM configuration in your YAML config file:

```yaml
llm:
  provider: openai
  model: gpt-4o-mini
```

```yaml
llm:
  provider: mistral
  model: mistral-medium
```

```yaml
llm:
  provider: gemini
  model: gemini-2.0-flash-exp
```

```yaml
llm:
  provider: ollama
  model: llama3.3
  base_url: http://localhost:11434  # Optional: custom Ollama URL
```

```yaml
llm:
  provider: huggingface
  model: meta-llama/Meta-Llama-3-8B-Instruct
```

```yaml
llm:
  provider: anthropic
  model: claude-3-5-sonnet-20241022
```
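NOMOS reads this block from your config file for you, but if you ever need to build a provider by hand from the same YAML, a small dispatch table keeps the mapping in one place. A minimal sketch, assuming PyYAML is installed; `llm_from_config` is an illustrative helper, not NOMOS API:

```python
import yaml

from nomos.llms import Anthropic, Gemini, HuggingFace, Mistral, Ollama, OpenAI

PROVIDERS = {
    "openai": OpenAI,
    "mistral": Mistral,
    "gemini": Gemini,
    "ollama": Ollama,
    "huggingface": HuggingFace,
    "anthropic": Anthropic,
}

def llm_from_config(path):
    """Build an LLM instance from the `llm:` block of a YAML config file."""
    with open(path) as f:
        cfg = yaml.safe_load(f)["llm"]
    cls = PROVIDERS[cfg.pop("provider")]
    # Remaining keys (model, base_url, temperature, ...) pass through as kwargs.
    return cls(**cfg)
```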
You can pass additional parameters to LLM providers:

```python
llm = OpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    max_tokens=1000,
    top_p=0.9,
)

llm = Anthropic(
    model="claude-3-5-sonnet-20241022",
    temperature=0.3,
    max_tokens=2048,
    top_p=0.8,
)
```

The same parameters can be set in YAML:

```yaml
llm:
  provider: openai
  model: gpt-4o-mini
  temperature: 0.7
  max_tokens: 1000
  top_p: 0.9
```

```yaml
llm:
  provider: anthropic
  model: claude-3-5-sonnet-20241022
  temperature: 0.3
  max_tokens: 2048
  top_p: 0.8
```
Model Recommendations:

For maximum capability:

- OpenAI GPT-4o: Best overall performance, most reliable
- Anthropic Claude 3.5 Sonnet: Excellent reasoning and coding capabilities
- Mistral Large: Strong performance, competitive pricing
- Google Gemini 2.0 Flash: Fast and capable for most tasks

For cost efficiency:

- OpenAI GPT-4o-mini: Fast and cost-effective
- Anthropic Claude 3.5 Sonnet: Good balance of capability and speed
- Mistral Small: Affordable option with good performance
- Ollama: Local models, no API costs

By use case:

- Code Generation: GPT-4o, Claude 3.5 Sonnet, Codestral (via Ollama)
- Conversational: GPT-4o-mini, Claude 3.5 Haiku, Mistral Medium
- Reasoning & Analysis: Claude 3.5 Sonnet, GPT-4o, Claude 3 Opus
- Multilingual: Gemini 2.0 Flash, GPT-4o
Troubleshooting:

- API Key Not Found: Ensure environment variables are set correctly
- Model Not Available: Check that the model name is correct and available
- Rate Limits: Implement retry logic (see the sketch after this list) or use a different model
- Local Models (Ollama): Ensure Ollama is running and model is pulled
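For the rate-limit case, a simple exponential-backoff wrapper is usually enough. This is a generic sketch in plain Python; `call_llm` stands in for whatever function you use to invoke the model and is not a NOMOS API:

```python
import random
import time

def call_with_retry(call_llm, prompt, max_attempts=5):
    """Retry a rate-limited call with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call_llm(prompt)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # Out of attempts; surface the original error.
            # Sleep 1s, 2s, 4s, ... plus up to 1s of random jitter.
            time.sleep(2 ** attempt + random.random())
```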
NOMOS includes built-in error handling and retry mechanisms:

```yaml
name: my-agent
llm:
  provider: openai
  model: gpt-4o-mini
  max_errors: 3  # Retry up to 3 times on LLM errors
```
Best Practices:

- Choose the Right Model: Use smaller models for simple tasks
- Configure Temperature: Lower values (0.1-0.3) for consistent responses
- Set Max Tokens: Limit response length to control costs and latency
- Use Local Models: Ollama for development or when data privacy is important
For the most up-to-date list of available models, refer to the official documentation:
- Anthropic: Claude Models Overview
- OpenAI: OpenAI Models
- Google Gemini: Vertex AI Generative AI Models
- Mistral AI: Mistral Models Overview
- Ollama: Ollama Model Library