Token Counter

Count GPT and Claude tokens as you type. Your text never leaves your browser.

About the Token Counter

A token counter tells you how many tokens a piece of text becomes when a large language model reads it. Tokens are the unit that GPT-5, GPT-4o, the o-series, and Claude actually process. They are also what you pay for on every API call and what fills a model's context window. This tool counts them exactly for OpenAI models, right in your browser, and estimates the input cost as you type. Nothing is uploaded.

What a token actually is

A token is a chunk of text, usually a short run of characters rather than a whole word. A tokenizer breaks your text into tokens with a process called byte pair encoding (BPE), and the model then works with the numeric IDs of those tokens. Common English words are often a single token. Longer or rarer words split into several. Punctuation and whitespace count too, and the leading space before a word is usually folded into that word's token. As a rough guide, English runs about four characters per token, so 1,000 tokens is roughly 750 words, but the real number depends on the exact text and the model.
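To make the merge idea concrete, here is a minimal sketch of how byte pair encoding applies a ranked merge table to a piece of text. It is a toy under stated assumptions: the merge table below is invented for illustration, it works on characters rather than bytes, and real encodings such as o200k use a far larger learned table.

```python
from typing import List, Tuple

# Hypothetical merge table, highest priority first. Real tables are learned
# from data and contain on the order of a hundred thousand entries or more.
TOY_MERGES: List[Tuple[str, str]] = [
    ("t", "h"),
    ("th", "e"),
    (" ", "the"),
    ("i", "n"),
    ("in", "g"),
]


def bpe_encode(text: str, merges: List[Tuple[str, str]]) -> List[str]:
    """Split text into characters, then repeatedly apply the best-ranked merge."""
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    symbols = list(text)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        # Pick the adjacent pair with the lowest rank (highest priority).
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no known merge applies any more
        merged: List[str] = []
        i = 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols
```

With this toy table, `bpe_encode(" the", TOY_MERGES)` collapses to the single token `" the"`, while `bpe_encode("thing", TOY_MERGES)` stops at two tokens, `"th"` and `"ing"`, because no merge joins them.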

Why token count matters

Three reasons. First, cost: API pricing is quoted per million tokens, separately for input and output, so your prompt's token count sets the input side of the bill. Second, context windows: every model has a maximum number of tokens it can hold at once, and a prompt that exceeds the window is rejected or silently truncated. Third, latency: longer prompts take longer to process. If you build with LLMs, token count is the number you budget against on all three axes.

Exact counts for OpenAI, estimates for Claude and Gemini

OpenAI publishes its tokenizer, so this tool gives exact counts for OpenAI models. GPT-5, GPT-4o, GPT-4.1, and the o-series use the o200k encoding. GPT-4 Turbo and GPT-3.5 Turbo use the older cl100k encoding. The tool runs the real tiktoken byte pair encoding for both, so the count matches what OpenAI bills for that text, down to the token.
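The model-to-encoding mapping described above can be sketched as a small prefix lookup. This is an illustration, not the tool's actual code: the model ID strings and the `encoding_for_model` helper are assumptions, and only the pairings named in the text are included.

```python
from typing import List, Optional, Tuple

# Assumed model ID prefixes, paired with the encodings named in the text.
ENCODING_BY_PREFIX: List[Tuple[str, str]] = [
    ("gpt-4-turbo", "cl100k_base"),
    ("gpt-3.5", "cl100k_base"),
    ("gpt-5", "o200k_base"),
    ("gpt-4o", "o200k_base"),
    ("gpt-4.1", "o200k_base"),
    ("o1", "o200k_base"),  # o-series
    ("o3", "o200k_base"),
    ("o4", "o200k_base"),
]


def encoding_for_model(model: str) -> Optional[str]:
    """Return the encoding name for a model ID, or None if it is not listed."""
    name = model.lower()
    for prefix, encoding in ENCODING_BY_PREFIX:
        if name.startswith(prefix):
            return encoding
    return None  # unknown model: the caller should fall back to an estimate
```

Returning `None` for anything unlisted keeps the lookup honest: a model that is not in the table gets an estimate rather than a count that only looks exact.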

Claude and Gemini use their own tokenizers, which are not published as browser libraries. For Claude the tool shows a calibrated approximation: the cl100k count multiplied by a per-model factor we measured against Anthropic's official count-tokens endpoint. The factors are 1.07 for Sonnet 4.6 and Haiku 4.5 and 1.49 for Opus 4.8. These are medians for prose from our tokenizer data study, so code and CJK text will vary. For Gemini the tool shows the plain cl100k count as a proxy. The label switches to "approx" so you always know which number is exact and which is an estimate. Treat approximations as planning figures, not billing figures.
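The Claude approximation is a single multiplication, sketched below with the factors quoted above. The dictionary keys and the function name are hypothetical, and rounding to the nearest whole token is an assumption, since the text does not say how the tool rounds.

```python
# Prose-median factors quoted in the text; the keys are illustrative labels,
# not official model IDs.
CLAUDE_FACTORS = {
    "sonnet-4.6": 1.07,
    "haiku-4.5": 1.07,
    "opus-4.8": 1.49,
}


def approx_claude_tokens(cl100k_count: int, model: str) -> int:
    """Scale an exact cl100k count into an approximate Claude token count."""
    if cl100k_count < 0:
        raise ValueError("token count cannot be negative")
    factor = CLAUDE_FACTORS[model]  # raises KeyError for an unlisted model
    # Assumption: round to the nearest whole token.
    return round(cl100k_count * factor)
```

For a prompt that measures 1,000 tokens under cl100k, this gives about 1,070 tokens for the 1.07 factor and about 1,490 for the 1.49 factor. Both remain planning figures.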

How the cost estimate works

The estimate multiplies your token count by the price per million input tokens for the selected model. Each model preset loads a representative input price, and you can edit the rate field to match your exact contract, a cached or batch rate, or output pricing. The estimate covers input tokens only. A real API call also bills the model's response, so add your expected output tokens at the output rate to budget a full round trip.
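The arithmetic above, written out as a small sketch. The function names are illustrative, and the rates in the usage note are placeholder numbers rather than any provider's actual prices.

```python
def input_cost(tokens: int, usd_per_million: float) -> float:
    """Cost of `tokens` tokens at a price quoted per one million tokens."""
    return tokens * usd_per_million / 1_000_000


def round_trip_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate: float,
    output_rate: float,
) -> float:
    """Input plus output cost for one call; both rates are USD per 1M tokens."""
    return input_cost(input_tokens, input_rate) + input_cost(output_tokens, output_rate)


def forecast(cost_per_call: float, calls: int) -> float:
    """Projected spend for a given call volume."""
    return cost_per_call * calls
```

With placeholder rates of $2.50 per million input tokens and $10.00 per million output tokens, a 2,000-token prompt with a 500-token response costs $0.005 + $0.005 = $0.01 per call, so 100,000 such calls come to about $1,000.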

Real use cases

Staying inside a context window. Before sending a long document to a model, paste it here to confirm it fits. If GPT-4o gives you a 128,000-token window and your document is 140,000 tokens, you know to chunk it before the call fails.

Estimating API cost before you build. Paste a representative prompt, pick the model, and read the cost. Multiply by your expected call volume to forecast spend before writing a line of code.

Trimming prompts. System prompts and few-shot examples are billed on every single call. Counting tokens shows which instructions are expensive and lets you cut the ones that do not earn their place.

Comparing models. The same text becomes a different number of tokens under o200k and cl100k. Switching the model preset shows the difference, which matters when you are choosing between models on cost.

Chunking for embeddings and RAG. Embedding models and retrieval pipelines work in fixed token windows. Counting tokens lets you size chunks so each one fits with room for overlap.
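Two of the use cases above, the context-window check and chunking with overlap, reduce to a few lines once you have a token count or a list of token IDs. This is a minimal sketch: the function names and the `reserved_output` parameter are assumptions for illustration, and the token IDs can come from any tokenizer.

```python
from typing import List


def fits_context(token_count: int, window: int, reserved_output: int = 0) -> bool:
    """True if the prompt, plus any room reserved for the reply, fits the window."""
    return token_count + reserved_output <= window


def chunk_tokens(tokens: List[int], window: int, overlap: int) -> List[List[int]]:
    """Split token IDs into chunks of at most `window`, sharing `overlap` tokens."""
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must be at least 0 and smaller than window")
    step = window - overlap
    chunks: List[List[int]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # this chunk already reaches the end of the input
    return chunks
```

`fits_context(140_000, 128_000)` returns `False`, which is the signal to chunk before sending. Chunking ten tokens with a window of 4 and an overlap of 1 yields three chunks, each starting on the last token of the previous one.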

Tokens vs words vs characters

Word count and character count answer different questions. A word counter tells you how long a piece reads to a human. A character counter tells you whether a post fits a platform limit. A token counter tells you what a language model sees and charges. The three rarely match: punctuation-heavy text, code, and non-English scripts all shift the token-to-word ratio. We measured those ratios across seven languages in our tokens-per-word data study. Code in particular uses more tokens per character than prose because symbols, indentation, and identifiers fragment into many small tokens.
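The three measures can be computed side by side, as in the sketch below. It assumes words are whitespace-separated runs and characters are Unicode code points, which may differ from how the tool counts them. The token count is passed in, because it has to come from a real tokenizer.

```python
from typing import Dict, Union


def text_stats(text: str, token_count: int) -> Dict[str, Union[int, float]]:
    """Characters, words, and characters per token for one piece of text."""
    characters = len(text)      # Unicode code points, not bytes
    words = len(text.split())   # assumption: whitespace-separated runs
    # Guard against division by zero for empty input.
    chars_per_token = characters / token_count if token_count > 0 else 0.0
    return {
        "characters": characters,
        "words": words,
        "tokens": token_count,
        "chars_per_token": chars_per_token,
    }
```

For "Hello, world!" this reports 13 characters and 2 words. If a tokenizer splits that string into 4 tokens, the ratio is 3.25 characters per token.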

Why browser-local matters for prompts

Prompts are often sensitive. They can contain proprietary instructions, customer data, unreleased copy, or internal context. Most online token counters send your text to a server to count it. This one does not. The tiktoken encoding runs entirely in your browser, so the prompt you are measuring never leaves your device. You can confirm this by opening your browser's network tab and watching it stay silent as you type.

How the tool works

When you choose an OpenAI model, the tool loads the matching tiktoken encoding once and caches it. Every keystroke is encoded locally and the exact token count appears, along with characters, words, the characters-per-token ratio, and the estimated cost. The encoding files are served as static assets from this site, so there is no API call and no third-party request. Until the encoding finishes loading on first use, the tool shows a quick estimate, then upgrades to the exact count automatically.
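The load-once, estimate-until-ready behavior described above can be sketched as follows. This is not the tool's implementation, which runs in the browser: the `TokenCounter` class, its loader callbacks, and the four-characters-per-token fallback are assumptions chosen to illustrate the pattern.

```python
import math
from typing import Callable, Dict, List, Tuple

Encoder = Callable[[str], List[int]]


class TokenCounter:
    """Caches each encoder after its first load; estimates until it is ready."""

    def __init__(self, loaders: Dict[str, Callable[[], Encoder]]) -> None:
        self._loaders = loaders            # encoding name -> function that loads it
        self._cache: Dict[str, Encoder] = {}

    def load(self, name: str) -> Encoder:
        # Run the (possibly slow) loader only the first time a name is requested.
        if name not in self._cache:
            self._cache[name] = self._loaders[name]()
        return self._cache[name]

    def count(self, text: str, name: str) -> Tuple[int, bool]:
        """Return (token_count, is_exact)."""
        encoder = self._cache.get(name)
        if encoder is None:
            # Assumed fallback: roughly four characters per token.
            return math.ceil(len(text) / 4), False
        return len(encoder(text)), True
```

A caller would show the estimate from `count` immediately, start `load` in the background, and call `count` again once loading finishes. The boolean tells the interface whether to label the number as exact or approximate.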

Embed this token counter on your site

The counter is free to embed in any blog post, docs page, or internal wiki. Paste the snippet below where you want the widget to appear. It runs the same o200k encoding with exact counts, entirely in the browser, and inherits light or dark mode from the visitor's system (force a theme with ?theme=dark or ?theme=light on the iframe URL). Please keep the credit line under the iframe visible; it is the only thing we ask for in return.

<iframe src="https://textkit.tech/embed/token-counter"
        width="100%" height="380" loading="lazy"
        style="border:1px solid #e5e7eb;border-radius:12px"
        title="Token Counter by TextKit"></iframe>
<p>Token counter by <a href="https://textkit.tech/token-counter">TextKit</a></p>

Frequently asked questions

Are the token counts exact?

Yes, for OpenAI models. The tool runs the real tiktoken byte pair encoding for the o200k and cl100k encodings, so the count matches what OpenAI bills. Claude counts are calibrated approximations, measured against Anthropic's official count-tokens endpoint; Gemini counts are close approximations. Both are clearly labeled as such, because those tokenizers are not available as browser libraries.

Which models use which encoding?

GPT-5, GPT-4o, GPT-4.1, and the o-series use o200k. GPT-4 Turbo and GPT-3.5 Turbo use cl100k. The tool picks the right encoding automatically when you select a model.

Does the cost include the model's response?

No. The estimate covers input tokens only. A full API call also bills the output the model generates, at a separate output rate. Add your expected output tokens at the output price to budget a complete round trip.

Is my text uploaded anywhere?

No. The encoding runs entirely in your browser. Your prompt is never sent to a server, logged, or stored. The tokenizer data is a static file served from this site, not an API.

Why is a token roughly four characters?

For typical English, byte pair encoding merges common letter sequences into single tokens, which works out to about four characters per token on average. Code, rare words, and non-English text change that ratio, which is why the tool shows the live characters-per-token figure for your specific text.

Can I edit the price per token?

Yes. Each model preset loads a representative input price, and the rate field is editable. Set it to your negotiated rate, a batch or cached rate, or the output rate to estimate response cost.
