AI API Cost Calculator

Instant token cost estimates for every major AI model. Free, no login required.

Model
Input tokens (prompt): 1,000 tokens ≈ 750 words
Output tokens (response): 500 tokens ≈ 375 words
API calls per month: 1,000 calls
Per Request: £0.0000
Per Day: £0.00
Per Month: £0.00
Per Year: £0.00
Input: £0.00 (0%) | Output: £0.00 (0%)
Your prompt uses 0.1% of the Claude Sonnet 4 200k context window
⚠ Prices update frequently — verify at provider pricing pages before making financial decisions.

Common use cases

Click any scenario to auto-fill the calculator

💬 Customer Support Bot (E-commerce): 800 in / 300 out / 50k calls
📄 Document Summariser (Business tool): 8k in / 500 out / 5k calls
💻 Code Assistant (Developer): 2k in / 1.5k out / 20k calls
✉️ Email Rewriter (ClearNextAI): 400 in / 200 out / 10k calls
✍️ Content Generator (Marketing): 500 in / 2k out / 1k calls
🔍 RAG Search Query (Enterprise): 3k in / 400 out / 100k calls

Compare all models

Side-by-side cost for your exact usage — sorted by monthly cost

Provider | Model | Monthly Cost | Per Request | Input $/1M | Output $/1M | Context

How many users can you afford?

Calculate your breakeven point and gross margin on AI costs

Leave blank to use the per-request cost × estimated calls per user

Enter your revenue per user to calculate breakeven.

Gross margin on AI costs
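
As a rough illustration of the margin figure, here is a minimal Python sketch. It assumes that AI cost per user is the per-request cost multiplied by the estimated calls per user, and that gross margin is revenue minus AI cost, divided by revenue. The function names and the sample numbers are hypothetical and are not taken from the calculator itself.

```python
def ai_cost_per_user(per_request_cost, calls_per_user):
    """AI spend for one user: per-request cost x estimated calls per user."""
    return per_request_cost * calls_per_user


def gross_margin(revenue_per_user, cost_per_user):
    """Fraction of revenue left after AI costs (assumed definition)."""
    if revenue_per_user <= 0:
        raise ValueError("revenue_per_user must be positive")
    return (revenue_per_user - cost_per_user) / revenue_per_user


# Hypothetical inputs: 0.0105 per request, 200 calls per user per month,
# and 10.00 revenue per user per month.
cost = ai_cost_per_user(0.0105, 200)
margin = gross_margin(10.00, cost)
print(f"AI cost per user: {cost:.2f}, gross margin: {margin:.0%}")
```

Under these assumptions, a user breaks even on AI costs when the cost per user equals the revenue per user, which is the point where the margin reaches zero.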

How AI API token pricing works

Large language models process text as tokens: small chunks of text that, in English, average about 0.75 words each. When you send a prompt to an AI API, your input is broken into tokens and processed by the model, and the response is generated as output tokens. Input and output tokens are billed separately.

Output tokens are more expensive than input tokens because generating new text requires significantly more computation than reading existing text. Most providers charge 3 to 5 times more per output token than per input token. Because of that premium, long responses can dominate the bill, so keeping prompts concise and setting an appropriate max_tokens limit are two of the most direct ways to lower your monthly spend.

To estimate your monthly cost, multiply your average input tokens per request by the input price per token, add your average output tokens per request multiplied by the output price per token, then multiply by your total monthly API calls. The calculator above does this maths instantly for every model, so you can compare costs before writing a single line of code.
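
The formula described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name is made up, and the prices (3.00 per million input tokens, 15.00 per million output tokens) are placeholder values rather than any provider's real rates.

```python
def monthly_cost(input_tokens, output_tokens, calls_per_month,
                 input_price_per_million, output_price_per_million):
    """Return (cost per request, cost per month) in the currency of the prices."""
    input_cost = input_tokens * input_price_per_million / 1_000_000
    output_cost = output_tokens * output_price_per_million / 1_000_000
    per_request = input_cost + output_cost
    return per_request, per_request * calls_per_month


# Calculator defaults: 1,000 input tokens, 500 output tokens, 1,000 calls.
# The two prices are hypothetical placeholders.
per_request, per_month = monthly_cost(1_000, 500, 1_000, 3.00, 15.00)
print(f"per request: {per_request:.4f}, per month: {per_month:.2f}")
```

Prices are passed per million tokens because that is how the comparison table quotes them. Dividing by 1,000,000 converts them to the per-token price used in the formula.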

Frequently asked questions

How many tokens is 1,000 words?

Roughly 1,333 tokens. The general rule is 1 token ≈ 0.75 words, so divide your word count by 0.75 to get an approximate token count. This varies by language and content type: code tends to use more tokens per word than plain English.
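
The rule of thumb is easy to express in Python. This is only an approximation based on the 0.75 words-per-token figure above. Exact counts depend on each provider's tokenizer, and the helper names are illustrative.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb from the text, not an exact figure


def estimate_tokens(word_count, words_per_token=WORDS_PER_TOKEN):
    """Approximate token count for a given number of words."""
    return round(word_count / words_per_token)


def estimate_words(token_count, words_per_token=WORDS_PER_TOKEN):
    """Approximate word count for a given number of tokens."""
    return round(token_count * words_per_token)


print(estimate_tokens(1_000))  # 1,000 words
print(estimate_words(500))     # 500 tokens
```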

Why does output cost more than input?

Generating tokens requires significantly more compute than reading them. When a model processes your input, it reads tokens in parallel. When generating output, it produces tokens one at a time, each requiring a full forward pass through the neural network. This is why most providers charge 3 to 5 times more per output token.

How do I reduce my AI API costs?

Use a smaller model for simple tasks: gpt-4o-mini and claude-haiku-3.5 cost a small fraction of their flagship counterparts, in some cases around 95% less, and handle many routine tasks well. Cache common system prompts, batch requests where possible, and set appropriate max_tokens limits to avoid paying for unnecessarily long responses.
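
To see how much a shorter response can save, the sketch below compares the Code Assistant scenario (2k in / 1.5k out / 20k calls) with the same workload when responses average 1,000 output tokens. The prices are hypothetical placeholders and the function names are illustrative. Note that a max_tokens limit truncates a response rather than making it more concise, so the cap should match what the task actually needs.

```python
def monthly_cost(input_tokens, output_tokens, calls,
                 input_price_per_million, output_price_per_million):
    """Monthly cost for a workload, with prices quoted per million tokens."""
    per_request = (input_tokens * input_price_per_million
                   + output_tokens * output_price_per_million) / 1_000_000
    return per_request * calls


def saving_percent(baseline, optimised):
    """Percentage reduction from baseline to optimised cost."""
    return (baseline - optimised) / baseline * 100


# Hypothetical prices per million tokens (not real provider rates).
INPUT_PRICE, OUTPUT_PRICE = 3.00, 15.00

baseline = monthly_cost(2_000, 1_500, 20_000, INPUT_PRICE, OUTPUT_PRICE)
capped = monthly_cost(2_000, 1_000, 20_000, INPUT_PRICE, OUTPUT_PRICE)
print(f"baseline: {baseline:.2f}, capped: {capped:.2f}, "
      f"saving: {saving_percent(baseline, capped):.1f}%")
```

The same comparison works for a model switch: keep the token counts fixed and pass the smaller model's prices instead.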

Are these prices accurate?

Prices are correct as of the build date, but AI providers update pricing regularly, sometimes reducing prices significantly. Always verify at the provider's official pricing page before making financial decisions. We update this calculator periodically to reflect the latest changes.

What's the difference between context window and tokens used?

The context window is the maximum number of tokens a model can process in a single request (input + output combined). You are only charged for the tokens you actually send and receive — not the full context window. Think of it like a bucket: the context window is the size of the bucket, and your tokens are the water you put in.
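
A quick way to check how full the bucket is: the sketch below computes the share of a context window that a request uses, counting input and output together as described above. The 200,000-token window matches the 200k figure shown in the calculator. The function name is illustrative.

```python
def context_usage(input_tokens, output_tokens, context_window):
    """Fraction of the context window used by one request (input + output)."""
    used = input_tokens + output_tokens
    if used > context_window:
        raise ValueError("request does not fit in the context window")
    return used / context_window


# Calculator defaults (1,000 in, 500 out) against a 200k-token window.
usage = context_usage(1_000, 500, 200_000)
print(f"{usage:.2%} of the context window")
```

Billing follows the tokens actually used (1,500 here), not the 200,000-token capacity.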