Calculate and compare API costs for GPT-4, Claude 3, Gemini and other LLMs. Enter input/output tokens to estimate costs, compare models, and project monthly expenses.
Claude Sonnet 4.5 (Anthropic)
Input: $5.00/1M • Output: $20.00/1M • Best balance of capability and cost
Large language model APIs charge per token, making cost estimation essential for budgeting. Our calculator computes costs for GPT-4, Claude 3, Gemini, and other models, helping you compare pricing and optimize your AI spending.
AI providers charge separately for input tokens (your prompts) and output tokens (model responses). Input tokens are typically cheaper than output tokens. Costs are quoted per million tokens, so a 1,000-token prompt with GPT-4o costs about $0.0025 at an input rate of $2.50 per million tokens.
Cost Calculation Formula
Total Cost = (Input Tokens ÷ 1M × Input Rate) + (Output Tokens ÷ 1M × Output Rate)
Estimate monthly API expenses before scaling your application. A chatbot handling 10,000 conversations/day can cost hundreds to thousands of dollars monthly.
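The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the function name `estimate_cost`, the per-conversation token counts, and the two rates are assumptions chosen for the example, so substitute current published rates before relying on the numbers.

```python
def estimate_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Return the cost in dollars; rates are dollars per 1M tokens."""
    return (input_tokens / 1_000_000 * input_rate
            + output_tokens / 1_000_000 * output_rate)


# Placeholder rates in dollars per 1M tokens (assumed for illustration).
INPUT_RATE = 2.50
OUTPUT_RATE = 10.00

# Hypothetical workload: 10,000 conversations/day, each using
# 1,000 input tokens and 500 output tokens (assumed sizes).
per_conversation = estimate_cost(1_000, 500, INPUT_RATE, OUTPUT_RATE)
monthly = per_conversation * 10_000 * 30  # 30-day month

print(f"Per conversation: ${per_conversation:.4f}")
print(f"Monthly estimate: ${monthly:,.2f}")
```

Under these assumed numbers the monthly figure lands in the low thousands of dollars, which matches the range described above. Changing the token counts or rates shifts it proportionally.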
Compare costs across providers. GPT-4o Mini is 17x cheaper than GPT-4o, while Claude 3 Haiku is 60x cheaper than Opus. Choose the right model for your quality/cost tradeoff.
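As a rough sketch of how such ratios are derived, the snippet below divides one model's input rate by another's. The rates in `INPUT_RATES` are illustrative assumptions picked to be consistent with the ratios quoted above, and `price_ratio` is a hypothetical helper. Check provider pricing pages for current figures.

```python
# Illustrative input rates in dollars per 1M tokens (assumptions, not
# authoritative pricing; verify against each provider's pricing page).
INPUT_RATES = {
    "GPT-4o": 2.50,
    "GPT-4o Mini": 0.15,
    "Claude 3 Opus": 15.00,
    "Claude 3 Haiku": 0.25,
}


def price_ratio(expensive, cheap, rates=INPUT_RATES):
    """How many times more the expensive model costs per input token."""
    return rates[expensive] / rates[cheap]


for big, small in [("GPT-4o", "GPT-4o Mini"), ("Claude 3 Opus", "Claude 3 Haiku")]:
    print(f"{small} is about {price_ratio(big, small):.0f}x cheaper than {big}")
```

The ratio for output tokens can differ from the input ratio, so it is worth running the same comparison on output rates for generation-heavy workloads.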
Shorter prompts cost less. System prompts that repeat with every request add up quickly: a 500-token system prompt costs $1.25 per 1,000 requests with GPT-4o.
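The system prompt arithmetic can be checked with a short sketch. The helper name `system_prompt_overhead` is hypothetical, and the $2.50 per 1M input rate is the one implied by the GPT-4o figures in the text.

```python
def system_prompt_overhead(prompt_tokens, requests, input_rate):
    """Dollars spent re-sending the same system prompt on every request.

    input_rate is in dollars per 1M input tokens.
    """
    return prompt_tokens * requests / 1_000_000 * input_rate


# 500-token system prompt, 1,000 requests, $2.50 per 1M input tokens.
cost = system_prompt_overhead(500, 1_000, 2.50)
print(f"System prompt overhead: ${cost:.2f}")
```

Because the overhead scales linearly with both prompt length and request count, halving the system prompt halves this part of the bill.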
API costs can spike unexpectedly. Understanding your baseline costs helps set usage alerts and prevent budget overruns.
Output tokens require the model to generate new content through an expensive autoregressive process, computing probabilities for each token sequentially. Input tokens are processed in parallel and only require encoding, not generation.
Estimates are based on official published API pricing. Actual costs may vary due to: volume discounts, cached prompts (up to 90% off), batch API usage (50% off), or enterprise agreements. Always verify current pricing on provider websites.
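To see how these discounts might change an estimate, here is a simplified sketch. It assumes cached input tokens are billed at a 90% discount, that the batch discount is 50% of the whole bill, and that the two stack. Real billing rules vary by provider (cache writes may carry a surcharge, for example), and `adjusted_cost` and its parameters are hypothetical names.

```python
def adjusted_cost(input_tokens, output_tokens, input_rate, output_rate,
                  cached_fraction=0.0, cache_discount=0.90,
                  batch=False, batch_discount=0.50):
    """Estimate cost in dollars after cache and batch discounts.

    Simplified model (assumption): cached input tokens are billed at
    (1 - cache_discount) of the input rate, and the batch discount
    applies to the entire bill. Rates are dollars per 1M tokens.
    """
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    billable_input = uncached + cached * (1 - cache_discount)
    total = (billable_input / 1_000_000 * input_rate
             + output_tokens / 1_000_000 * output_rate)
    return total * (1 - batch_discount) if batch else total


# Example: 1M input tokens, half of them served from cache,
# at an assumed $2.50 per 1M input rate and no output tokens.
baseline = adjusted_cost(1_000_000, 0, 2.50, 10.00)
with_cache = adjusted_cost(1_000_000, 0, 2.50, 10.00, cached_fraction=0.5)
with_both = adjusted_cost(1_000_000, 0, 2.50, 10.00, cached_fraction=0.5, batch=True)
print(f"Baseline: ${baseline:.4f}")
print(f"With caching: ${with_cache:.4f}")
print(f"With caching and batch: ${with_both:.4f}")
```

Treat the output as an upper bound on savings, since the 90% figure is described as a maximum and not every request will hit the cache.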
The best model depends on your task. For simple tasks, GPT-4o Mini or Claude 3 Haiku offer excellent quality at low cost. For complex reasoning, GPT-4o or Claude 3 Sonnet balance capability and price. Gemini 1.5 Flash is extremely affordable for high-volume applications.
Key strategies for reducing API costs: 1) Use smaller models for simpler tasks, 2) Implement prompt caching for repeated system prompts, 3) Use the batch API for non-urgent requests (50% discount), 4) Optimize prompts to reduce token count, 5) Set max_tokens limits to prevent runaway responses.
This calculator covers inference costs only and does not include fine-tuning. Fine-tuning has separate training costs (typically $8-25 per million training tokens) plus higher inference costs for fine-tuned models. Hosted fine-tuned models may also incur storage fees.