Does the model's reply count against the context window?

Yes. The window covers the whole exchange, so the generated response shares the same token budget as the input. Leave room for the answer.

Is a bigger context window always better?

Not necessarily. A larger window lets a model consider more at once, but it does not guarantee that the model uses every part equally well, and processing more tokens can be slower and more expensive.

Does the token counter send my text to a model?

No. It estimates token usage locally in your browser and contacts no model or server.

Explainer · 1 min read · Updated June 25, 2026

What is a context window?

Short answer

A context window is the maximum number of tokens a language model can take into account at one time. It covers everything in the exchange: the system instructions, your prompt, any attached content, and the model's own reply.

What counts toward the window

The context window is a budget measured in tokens. Everything the model sees in a turn draws from it:

System or developer instructions
The conversation history so far
Your current prompt and any pasted or attached content
The response the model generates, which also consumes tokens
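The budget described above can be sketched as simple arithmetic. This is a minimal illustration, not any particular model's accounting: the function names `remaining_for_reply` and `fits` are hypothetical, and the token counts are assumed to have been measured already.

```python
def remaining_for_reply(window: int, system: int, history: int, prompt: int) -> int:
    """Return how many tokens are left for the reply once the input is counted.

    All arguments are token counts. A negative result means the input
    alone already exceeds the window.
    """
    used = system + history + prompt
    return window - used


def fits(window: int, system: int, history: int, prompt: int, reply_reserve: int) -> bool:
    """Check whether the input plus a reserved reply size fits in the window."""
    return remaining_for_reply(window, system, history, prompt) >= reply_reserve


# Illustrative numbers only: an 8,000-token window with 4,700 tokens of input.
left = remaining_for_reply(window=8000, system=200, history=3000, prompt=1500)
print(left)  # 3300
print(fits(8000, 200, 3000, 1500, reply_reserve=1000))  # True
print(fits(8000, 200, 3000, 4500, reply_reserve=1000))  # False
```

Reserving space for the reply up front makes the shared budget explicit: a prompt that technically fits can still leave too little room for a useful answer.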

When you exceed it

If the total goes over the window, something has to give: earlier messages may be dropped or truncated, or the request is rejected. That is why very long inputs can cause a model to lose track of the start.
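One way to picture the "earlier messages may be dropped" case is the sketch below. It assumes a simple oldest-first policy and precomputed token counts; the name `trim_oldest` and the `(text, tokens)` message shape are hypothetical, and real systems may truncate or reject the request instead.

```python
from typing import List, Tuple

# (text, token count); the counts are assumed to be measured elsewhere.
Message = Tuple[str, int]


def trim_oldest(history: List[Message], fixed_tokens: int, window: int) -> List[Message]:
    """Drop the oldest messages until history plus fixed_tokens fits the window.

    fixed_tokens covers everything that cannot be dropped: system
    instructions, the current prompt, and room reserved for the reply.
    """
    if fixed_tokens > window:
        # Nothing in the history can help, so this models a rejected request.
        raise ValueError("request does not fit even with no history")

    kept = list(history)  # copy so the caller's list is not modified
    total = fixed_tokens + sum(tokens for _, tokens in kept)
    while kept and total > window:
        _, dropped = kept.pop(0)  # remove the earliest message first
        total -= dropped
    return kept


history = [("first", 3000), ("second", 2500), ("third", 2000)]
print(trim_oldest(history, fixed_tokens=2500, window=8000))
# [('second', 2500), ('third', 2000)]
```

In this example the first message is the one removed, which illustrates why the start of a long conversation is the part most likely to be lost.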

How big are they

Context windows have grown quickly. Different models support very different sizes, ranging from a few thousand tokens to well over a million. Larger windows let a model work with more material at once, but the relevant input still has to fit alongside the reply.

Try it: Token Counter

Estimate your prompt size and compare it against common context windows from 8K to 1M tokens.