Best For Your Use Case

Configure your workload, pick a use case, and get rule-based recommendations with transparent scoring. No hype.

Selected use case: Agents (agent loops, tool-calling, multi-step automation)

Configure Workload

Input tokens per request: 10,000
Output tokens per request: 1,000
Cost ↔ Context slider (current weights: 0.490 input cost, 0.360 output cost, 0.150 context)
#1

GPT-4.1 nano

OpenAI
$0.001400/req · Score: 100/100
✅ Eligible · 1048K context
  • Best value at $0.001400 per request for Agents
  • Large context window (1048K tokens)
  • Supports function calling, vision

Input: $0.001000/req (score: 99.5, weight: 0.490). Output: $0.000400/req (score: 99.5, weight: 0.360). Context: 1,047,576 (score: 99.9, weight: 0.150). Total: $0.001400/req.

Source verified: 2026-02-18
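
For reference, each card's per-request cost is just the configured workload multiplied by per-million-token prices. Below is a minimal sketch of that arithmetic in Python; the $0.10/M input and $0.40/M output figures are back-derived from GPT-4.1 nano's displayed per-request costs, not quoted from a price sheet.

```python
# Workload from "Configure Workload" above.
INPUT_TOKENS = 10_000   # input tokens per request
OUTPUT_TOKENS = 1_000   # output tokens per request

def cost_per_request(input_price_per_m: float, output_price_per_m: float) -> float:
    """Per-request cost given prices quoted per million tokens."""
    return (INPUT_TOKENS * input_price_per_m
            + OUTPUT_TOKENS * output_price_per_m) / 1_000_000

# GPT-4.1 nano (assumed prices, back-derived from the card above):
# $0.001000 / 10,000 input tokens -> $0.10/M; $0.000400 / 1,000 output tokens -> $0.40/M.
print(f"{cost_per_request(0.10, 0.40):.6f}")  # 0.001400, matching $0.001400/req
```

The same check works for every card; e.g. Gemini 2.0 Flash-Lite's $0.001050/req implies $0.075/M input and $0.30/M output.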
#2

Gemini 2.0 Flash

Google
$0.001400/req · Score: 99/100
✅ Eligible · 1000K context
  • Competitive cost at $0.001400 per request
  • Large context window (1000K tokens)
  • Supports function calling, vision

Input: $0.001000/req (score: 99.5, weight: 0.490). Output: $0.000400/req (score: 99.5, weight: 0.360). Context: 1,000,000 (score: 95.4, weight: 0.150). Total: $0.001400/req.

Source verified: 2026-02-18
#3

Gemini 2.0 Flash-Lite

Google
$0.001050/req · Score: 99/100
✅ Eligible · 1000K context
  • Competitive cost at $0.001050 per request
  • Large context window (1000K tokens)
  • Supports function calling, vision

Input: $0.000750/req (score: 99.6, weight: 0.490). Output: $0.000300/req (score: 99.6, weight: 0.360). Context: 1,000,000 (score: 95.4, weight: 0.150). Total: $0.001050/req.

Source verified: 2026-02-18
#4

Gemini 2.5 Flash-Lite

Google
$0.002100/req · Score: 99/100
✅ Eligible · 1049K context
  • $0.002100 per request
  • Large context window (1049K tokens)
  • Supports function calling, vision, reasoning

Input: $0.001500/req (score: 99.3, weight: 0.490). Output: $0.000600/req (score: 99.3, weight: 0.360). Context: 1,048,576 (score: 100.0, weight: 0.150). Total: $0.002100/req.

Source verified: 2026-02-20
#5

GPT-4.1 mini

OpenAI
$0.005600/req · Score: 98/100
✅ Eligible · 1048K context
  • $0.005600 per request
  • Large context window (1048K tokens)
  • Supports function calling, vision

Input: $0.004000/req (score: 98.0, weight: 0.490). Output: $0.001600/req (score: 98.0, weight: 0.360). Context: 1,047,576 (score: 99.9, weight: 0.150). Total: $0.005600/req.

Source verified: 2026-02-18
#6

Gemini 2.5 Flash

Google
$0.005500/req · Score: 98/100
✅ Eligible · 1049K context
  • $0.005500 per request
  • Large context window (1049K tokens)
  • Supports function calling, vision, reasoning

Input: $0.003000/req (score: 98.5, weight: 0.490). Output: $0.002500/req (score: 96.9, weight: 0.360). Context: 1,048,576 (score: 100.0, weight: 0.150). Total: $0.005500/req.

Source verified: 2026-02-18
How scoring works

Each model is scored using three weighted components controlled by the Cost vs Context slider:

final_score = 100 × [(1 − input_cost/max_input) × 0.490 + (1 − output_cost/max_output) × 0.360 + (context/max_context) × 0.150]

  • Input cost (weight: 0.490): Lower input cost = higher score.
  • Output cost (weight: 0.360): Lower output cost = higher score.
  • Context window (weight: 0.150): Larger context window = higher score.

Models that don't meet the minimum context requirement are ineligible. The slider shifts weight between cost optimization and context window preference.
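
To make the scoring concrete, here is a minimal sketch of the ranking pass, assuming the maxima are taken over the full eligible pool and component scores are scaled to 0–100; `Model`, `rank`, and `min_context` are illustrative names, not the tool's actual API.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    input_cost: float   # $/request at the configured workload
    output_cost: float  # $/request
    context: int        # context window in tokens

# Slider weights at the current (cost-leaning) position.
W_IN, W_OUT, W_CTX = 0.490, 0.360, 0.150

def rank(models: list[Model], min_context: int) -> list[tuple[Model, float]]:
    # Models below the minimum context requirement are ineligible.
    eligible = [m for m in models if m.context >= min_context]
    max_in = max(m.input_cost for m in eligible)
    max_out = max(m.output_cost for m in eligible)
    max_ctx = max(m.context for m in eligible)

    def score(m: Model) -> float:
        # Cheaper costs and a larger context window both push the score toward 100.
        return 100 * ((1 - m.input_cost / max_in) * W_IN
                      + (1 - m.output_cost / max_out) * W_OUT
                      + (m.context / max_ctx) * W_CTX)

    return sorted(((m, score(m)) for m in eligible),
                  key=lambda pair: pair[1], reverse=True)
```

The context component reproduces the cards exactly (1,000,000 / 1,048,576 ≈ 95.4), and back-solving GPT-4.1 nano's 99.5 input score suggests the most expensive eligible model runs about $0.20/request on input, i.e. the pool is much larger than the six models shown here.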

Our ranking is based on pricing and context only. Check the benchmark references below for model quality data.

Evaluate Quality Before Choosing

Rankings here reflect pricing and context only; use these independent benchmarks to compare model accuracy for this use case:

LMSYS Chatbot Arena

Overall model quality rankings via human preference voting

Berkeley Function Calling Leaderboard

Tool-calling and function-calling accuracy across models