Best For Your Use Case

Configure your workload, pick a use case, and get rule-based recommendations with transparent scoring. No hype.

Selected use case: Agents (agent loops, tool-calling, multi-step automation)

Configure Workload

Input tokens per request: 10,000
Output tokens per request: 1,000
Cost ↔ Context slider (current weights: 0.490 input cost, 0.360 output cost, 0.150 context)
#1

GPT-4.1 nano

OpenAI
$0.001400/req · Score: 100/100
✅ Eligible · 1048K context
  • Best value at $0.001400 per request for Agents
  • Large context window (1048K tokens)
  • Supports function calling, vision

Input: $0.001000/req (score: 99.5, weight: 0.490). Output: $0.000400/req (score: 99.5, weight: 0.360). Context: 1,047,576 (score: 99.9, weight: 0.150). Total: $0.001400/req.

Source verified: 2026-02-18
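
For reference, each card's per-request cost is just the configured workload multiplied by per-million-token prices. Below is a minimal sketch of that arithmetic in Python; the $0.10/M input and $0.40/M output figures are back-derived from GPT-4.1 nano's displayed per-request costs, not quoted from a price sheet.

```python
# Workload from "Configure Workload" above.
INPUT_TOKENS = 10_000   # input tokens per request
OUTPUT_TOKENS = 1_000   # output tokens per request

def cost_per_request(input_price_per_m: float, output_price_per_m: float) -> float:
    """Per-request cost given prices quoted per million tokens."""
    return (INPUT_TOKENS * input_price_per_m
            + OUTPUT_TOKENS * output_price_per_m) / 1_000_000

# GPT-4.1 nano (assumed prices, back-derived from the card above):
# $0.001000 / 10,000 input tokens -> $0.10/M; $0.000400 / 1,000 output tokens -> $0.40/M.
print(f"{cost_per_request(0.10, 0.40):.6f}")  # 0.001400, matching $0.001400/req
```

The same check works for every card; e.g. Gemini 2.0 Flash-Lite's $0.001050/req implies $0.075/M input and $0.30/M output.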
#2

Gemini 2.0 Flash

Google
$0.001400/req · Score: 99/100
✅ Eligible · 1000K context
  • Competitive cost at $0.001400 per request
  • Large context window (1000K tokens)
  • Supports function calling, vision

Input: $0.001000/req (score: 99.5, weight: 0.490). Output: $0.000400/req (score: 99.5, weight: 0.360). Context: 1,000,000 (score: 95.4, weight: 0.150). Total: $0.001400/req.

Source verified: 2026-02-18
#3

Gemini 2.0 Flash-Lite

Google
$0.001050/req · Score: 99/100
✅ Eligible · 1000K context
  • Competitive cost at $0.001050 per request
  • Large context window (1000K tokens)
  • Supports function calling, vision

Input: $0.000750/req (score: 99.6, weight: 0.490). Output: $0.000300/req (score: 99.6, weight: 0.360). Context: 1,000,000 (score: 95.4, weight: 0.150). Total: $0.001050/req.

Source verified: 2026-02-18
#4

Gemini 2.5 Flash-Lite

Google
$0.002100/req · Score: 99/100
✅ Eligible · 1049K context
  • $0.002100 per request
  • Large context window (1049K tokens)
  • Supports function calling, vision, reasoning

Input: $0.001500/req (score: 99.3, weight: 0.490). Output: $0.000600/req (score: 99.3, weight: 0.360). Context: 1,048,576 (score: 100.0, weight: 0.150). Total: $0.002100/req.

Source verified: 2026-02-20
#5

GPT-4.1 mini

OpenAI
$0.005600/req · Score: 98/100
✅ Eligible · 1048K context
  • $0.005600 per request
  • Large context window (1048K tokens)
  • Supports function calling, vision

Input: $0.004000/req (score: 98.0, weight: 0.490). Output: $0.001600/req (score: 98.0, weight: 0.360). Context: 1,047,576 (score: 99.9, weight: 0.150). Total: $0.005600/req.

Source verified: 2026-02-18
#6

Gemini 2.5 Flash

Google
$0.005500/req · Score: 98/100
✅ Eligible · 1049K context
  • $0.005500 per request
  • Large context window (1049K tokens)
  • Supports function calling, vision, reasoning

Input: $0.003000/req (score: 98.5, weight: 0.490). Output: $0.002500/req (score: 96.9, weight: 0.360). Context: 1,048,576 (score: 100.0, weight: 0.150). Total: $0.005500/req.

Source verified: 2026-02-18
How scoring works

Each model is scored using three weighted components controlled by the Cost vs Context slider:

final_score = 100 × [(1 − input_cost/max_input) × 0.490 + (1 − output_cost/max_output) × 0.360 + (context/max_context) × 0.150]

  • Input cost (weight: 0.490): Lower input cost = higher score.
  • Output cost (weight: 0.360): Lower output cost = higher score.
  • Context window (weight: 0.150): Larger context window = higher score.

Models that don't meet the minimum context requirement are ineligible. The slider shifts weight between cost optimization and context window preference.
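
To make the scoring concrete, here is a minimal sketch of the ranking pass, assuming the maxima are taken over the full eligible pool and component scores are scaled to 0–100; `Model`, `rank`, and `min_context` are illustrative names, not the tool's actual API.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    input_cost: float   # $/request at the configured workload
    output_cost: float  # $/request
    context: int        # context window in tokens

# Slider weights at the current (cost-leaning) position.
W_IN, W_OUT, W_CTX = 0.490, 0.360, 0.150

def rank(models: list[Model], min_context: int) -> list[tuple[Model, float]]:
    # Models below the minimum context requirement are ineligible.
    eligible = [m for m in models if m.context >= min_context]
    max_in = max(m.input_cost for m in eligible)
    max_out = max(m.output_cost for m in eligible)
    max_ctx = max(m.context for m in eligible)

    def score(m: Model) -> float:
        # Cheaper costs and a larger context window both push the score toward 100.
        return 100 * ((1 - m.input_cost / max_in) * W_IN
                      + (1 - m.output_cost / max_out) * W_OUT
                      + (m.context / max_ctx) * W_CTX)

    return sorted(((m, score(m)) for m in eligible),
                  key=lambda pair: pair[1], reverse=True)
```

The context component reproduces the cards exactly (1,000,000 / 1,048,576 ≈ 95.4), and back-solving GPT-4.1 nano's 99.5 input score suggests the most expensive eligible model runs about $0.20/request on input, i.e. the pool is much larger than the six models shown here.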

Our ranking is based on pricing and context only. Check the benchmark references below for model quality data.

Evaluate Quality Before Choosing

Rankings here reflect pricing and context only; use these independent benchmarks to compare model accuracy for this use case:

LMSYS Chatbot Arena

Overall model quality rankings via human preference voting

Berkeley Function Calling Leaderboard

Tool-calling and function-calling accuracy across models