LLM Tokens Explained: What They Are and How They Affect Your API Bill
Every LLM API bill is denominated in tokens. But what exactly is a token, how many tokens does your content use, and how does token count translate to real dollars? This guide covers what you need to understand and control your LLM API costs.
What Is a Token?
A token is a chunk of text, roughly 3–4 characters in English or about 0.75 words. Tokenization is how an LLM breaks text into pieces its neural network can process. The exact split depends on the model's tokenizer, so the same text can produce different token counts on different models.
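The 3–4 characters-per-token heuristic can be turned into a rough range estimate. The sketch below is only an approximation for English text, not a real tokenizer; the function name is hypothetical, and an exact count requires the tokenizer of the specific model.

```python
import math


def estimate_tokens_from_chars(text: str) -> tuple[int, int]:
    """Return a rough (low, high) token estimate for English text.

    Assumes 4 characters per token for the low bound and
    3 characters per token for the high bound (heuristic only).
    """
    n_chars = len(text)
    low = math.ceil(n_chars / 4)   # longer tokens -> fewer of them
    high = math.ceil(n_chars / 3)  # shorter tokens -> more of them
    return low, high


low, high = estimate_tokens_from_chars("Every LLM API bill is denominated in tokens.")
print(f"Estimated tokens: between {low} and {high}")
```

An estimate like this is useful for budgeting before a request is sent; for billing-accurate numbers, rely on the token counts the API reports.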
Input Tokens vs Output Tokens
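Every API call has two token counts that are priced separately: input tokens (the prompt and any context you send) and output tokens (the text the model generates).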
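Because the two counts are priced separately, the cost of a single call can be written as a sum of two terms. The formula below assumes prices are quoted per million tokens, which is a common convention but not guaranteed for every provider; the symbols are introduced here for illustration only.

```latex
\text{cost} \;=\; \frac{T_{\text{in}}}{10^{6}} \, p_{\text{in}} \;+\; \frac{T_{\text{out}}}{10^{6}} \, p_{\text{out}}
```

Here $T_{\text{in}}$ and $T_{\text{out}}$ are the input and output token counts for the call, and $p_{\text{in}}$ and $p_{\text{out}}$ are the input and output prices per million tokens.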
Tokens to Words to Cost: The Math
Rule of thumb: 1 token ≈ 0.75 words, so 1,000 words ≈ 1,333 tokens. Here's how common content types translate:
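The rule of thumb can be expressed as a pair of conversion helpers. This is a minimal sketch: the 0.75 ratio is the approximation stated above, and the function names are hypothetical.

```python
WORDS_PER_TOKEN = 0.75  # rule-of-thumb ratio for English text


def words_to_tokens(words: int) -> int:
    """Approximate token count for a given word count."""
    return round(words / WORDS_PER_TOKEN)


def tokens_to_words(tokens: int) -> int:
    """Approximate word count for a given token count."""
    return round(tokens * WORDS_PER_TOKEN)


print(words_to_tokens(1_000))  # 1333
print(tokens_to_words(1_000))  # 750
```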
Token Cost Examples Across Models
Cost to process one article (1,000 input tokens, about 750 words) and generate a 500-word summary (667 output tokens):
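The same request can be priced against several models with a few lines of code. In the sketch below, the model names and per-million-token prices are placeholders invented for illustration, not real rates; substitute the current figures from each provider's pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


def request_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Cost in USD of one call, with input and output priced separately."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Hypothetical prices, for illustration only.
HYPOTHETICAL_PRICES = {
    "small-model": Pricing(input_per_million=0.50, output_per_million=1.50),
    "large-model": Pricing(input_per_million=5.00, output_per_million=15.00),
}

# The example from the text: 1,000 input tokens and 667 output tokens.
costs = {
    name: request_cost(1_000, 667, pricing)
    for name, pricing in HYPOTHETICAL_PRICES.items()
}

for name, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{name}: ${cost:.6f} per request")
```

Per-request costs look tiny, so it often helps to multiply by expected monthly request volume before comparing models.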
Reasoning Tokens: The Hidden Cost
Reasoning models (OpenAI o3 and o4-mini, Claude with extended thinking, DeepSeek R1) generate additional “thinking” tokens before their final response. These reasoning tokens are billed at the standard output rate and can be 3–10x the length of the visible response, so the billed output can be several times larger than the text you actually see.
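To see what the 3–10x range means for a bill, the sketch below adds reasoning tokens on top of the visible output. The multiplier is an assumption taken from the range quoted above; actual reasoning length varies by model and prompt, and the function name is hypothetical.

```python
def billed_output_tokens(visible_tokens: int, reasoning_multiplier: float) -> int:
    """Total output tokens billed: visible response plus reasoning tokens.

    reasoning_multiplier is the assumed ratio of reasoning tokens to
    visible tokens (the text quotes a range of roughly 3 to 10).
    """
    reasoning_tokens = round(visible_tokens * reasoning_multiplier)
    return visible_tokens + reasoning_tokens


visible = 667  # a 500-word response
low = billed_output_tokens(visible, 3)
high = billed_output_tokens(visible, 10)
print(f"Billed output tokens: {low} to {high} for {visible} visible tokens")
```

Under these assumptions the billed output is 4x to 11x the visible response, which is why output cost estimates based only on the visible text can be far too low for reasoning models.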
Context Window: The Token Limit
Every model has a context window — the maximum number of tokens it can process in a single API call (input + output combined). Exceeding this limit causes an error; approaching it can degrade quality.
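A simple pre-flight check can catch requests that would exceed the window before they are sent. This sketch assumes input and output share one combined limit, as described above; the 128,000-token window is a placeholder value, and the function names are hypothetical.

```python
def fits_context_window(input_tokens: int, max_output_tokens: int,
                        context_window: int) -> bool:
    """True if the input plus the reserved output fits in the window."""
    return input_tokens + max_output_tokens <= context_window


def max_output_budget(input_tokens: int, context_window: int) -> int:
    """Tokens left for the response after the input is accounted for."""
    return max(context_window - input_tokens, 0)


CONTEXT_WINDOW = 128_000  # placeholder; check your model's documented limit

print(fits_context_window(120_000, 4_000, CONTEXT_WINDOW))   # True
print(fits_context_window(126_000, 4_000, CONTEXT_WINDOW))   # False
print(max_output_budget(126_000, CONTEXT_WINDOW))            # 2000
```

When the check fails, the usual options are to shorten or summarize the input, or to lower the maximum output length.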