Context Window


Also: Context Length · Token Limit · Input Window

Usable context = Total tokens available minus system prompt tokens minus output tokens reserved

What it is: How much text an AI model can hold in memory at once

Unit: Tokens, not words or characters

Watch for: Forgetting content beyond the limit

Grows with: Each new model generation

Quick definition

A context window is the maximum amount of text an AI language model can read and remember during a single session. It is measured in tokens, which are chunks of roughly three to four characters. Content beyond the limit is simply not seen by the model, which affects the quality and consistency of its output.

Run the numbers

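Usable input tokens = total context window (tokens) minus system prompt size (tokens) minus expected output size (tokens). For a 128,000-token window, an 800-token system prompt and a 2,000-token expected output, that is 125,200 usable input tokens.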

A 1,000-word document is roughly 1,300 tokens. Divide your usable input by 1,300 to estimate how many such documents fit in one session.

How it varies across Australia

Context windows have grown rapidly across model generations. What counted as a large context window one year ago is now standard, and the gap between consumer-grade models and enterprise-grade models has narrowed sharply. For most marketing and content tasks, current limits are large enough that hitting the ceiling is a workflow problem rather than a model problem.


The building blocks of context

Token

The unit AI models use to measure text. One token is roughly three to four characters or about three-quarters of a word.

System prompt

Instructions you give the model before the conversation starts. These consume tokens from the same fixed budget.

Input tokens

All text the model reads: your prompt, uploaded documents, conversation history, and retrieved content.

Output tokens

The model's response. These also count against the total limit, so longer outputs shrink the space for inputs.

Context overflow

What happens when input exceeds the limit. Older content is dropped, truncated or ignored depending on the model.

What it actually means

Think of the context window like a whiteboard in a meeting room. Everything written on it (your question, the document you pasted in, the conversation history and the model's previous answers) is visible to the model while it fits. The moment you run out of whiteboard space, older content gets erased to make room, and the model doesn't know what it can no longer see.

This has real consequences for marketing and content work. If you paste a long brand document, a content brief and a previous draft into the same session, you may be pushing critical instructions out the back of the window without realising it. The model's response starts drifting because its instructions have literally disappeared.

Context windows are measured in tokens, not words. One token is roughly three to four characters, so a thousand-word document is roughly 1,300 to 1,500 tokens. A large model might offer a context window of 100,000 to 200,000 tokens, which sounds enormous until you factor in system prompts, retrieved documents, full conversation history and the space the model needs to write its response.

The practical skill is learning to budget tokens the way you budget time. Know what is essential context, what is nice-to-have context, and what can be left out or broken into a separate session. Generative search tools like Perplexity and AI overviews in Google operate on the same principle: they retrieve content from the web and fit it into a context window before generating an answer. The content that fits gets considered. The content that doesn't fit doesn't.
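One way to picture token budgeting is a simple packing rule: include everything essential first, then add nice-to-have items only while they still fit. The sketch below is illustrative only. The function name `pack_context`, the item names and the token counts are all assumptions made up for the example, not part of any tool.

```python
def pack_context(items: list[tuple[str, int, bool]], budget: int) -> list[str]:
    """Choose what goes into the window: essentials first, then optional items that fit.

    Each item is (name, token_count, is_essential). All values are hypothetical.
    """
    essentials = [item for item in items if item[2]]
    optional = [item for item in items if not item[2]]

    needed = sum(tokens for _, tokens, _ in essentials)
    if needed > budget:
        # Nothing can be trimmed safely here; the task needs splitting instead.
        raise ValueError("Essential context alone exceeds the budget")

    chosen = [name for name, _, _ in essentials]
    remaining = budget - needed
    for name, tokens, _ in optional:
        if tokens <= remaining:  # skip anything that would overflow the budget
            chosen.append(name)
            remaining -= tokens
    return chosen


items = [
    ("brand guide", 6_000, True),
    ("content brief", 1_500, True),
    ("previous draft", 4_000, False),
    ("competitor article", 9_000, False),
]
print(pack_context(items, budget=12_000))
```

With a 12,000-token budget, the two essential items take 7,500 tokens, the previous draft fits in what is left, and the competitor article is left out for a separate session.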

The context window is the model's working memory. Fill it badly and it forgets the things that mattered.

How to calculate it

Usable input tokens = Total context window minus system prompt tokens minus reserved output tokens

Worked example. A model has a 128,000-token context window. Your system prompt is 800 tokens. You want the model to produce a 2,000-token response. Usable input space = 128,000 minus 800 minus 2,000 = 125,200 tokens. A 10,000-word document is roughly 13,000 tokens, so you could fit approximately nine such documents before hitting the limit.
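The same arithmetic can be written as a short script. This is a minimal sketch of the formula above using the worked example's figures; the function names are chosen for illustration.

```python
def usable_input_tokens(total_window: int, system_prompt: int, reserved_output: int) -> int:
    """Tokens left for input after the system prompt and the reserved output."""
    usable = total_window - system_prompt - reserved_output
    if usable < 0:
        raise ValueError("System prompt and reserved output exceed the context window")
    return usable


def documents_that_fit(usable_tokens: int, tokens_per_document: int) -> int:
    """Whole documents of a given token size that fit in the usable space."""
    return usable_tokens // tokens_per_document


usable = usable_input_tokens(128_000, 800, 2_000)
print(usable)                              # 125200
print(documents_that_fit(usable, 13_000))  # 9
```

Integer division rounds down, which is why roughly 9.6 documents' worth of space is reported as nine whole documents.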

The Australian context

Australian businesses adopting AI tools for content production, customer service and search are encountering context window limits in practical workflows before they understand what the limit actually is. Enterprise tools sold into Australia often carry the same context limits as global versions, but Australian legal, compliance and privacy documentation tends to be verbose. Financial services disclosures, privacy policies under the Privacy Act, and regulatory guidance from ASIC and the ACCC all run long. Teams feeding these documents into AI workflows for summarisation or drafting hit the ceiling faster than equivalent overseas teams working with shorter regulatory texts.

Where people get this wrong

Assuming the model read everything you pasted in. If your input exceeds the context window, older content is dropped silently. The model doesn't warn you. It just answers with less information than you intended to give it.

Confusing context window size with model intelligence. A larger context window means more information fits in one session, not that the model reasons better. A model can have a huge context window and still produce weak output if the context is poorly structured.

Using the same long session for unrelated tasks. Conversation history from an earlier task in the same session consumes tokens that could hold relevant context for the new task. Starting a fresh session for distinct tasks is often more effective than one very long session.

Common questions

What happens when you exceed the context window?

In most chat tools, older content is dropped to make room for new input. Exactly what gets dropped depends on the model and the tool around it: most drop from the beginning of the conversation, some compress or summarise older turns, and some APIs reject an oversized request with an error instead. Where content is dropped, it usually happens without warning, so content you intended the model to use may be missing from its working memory.
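To make "dropping from the beginning" concrete, here is a minimal sketch of one possible trimming strategy: remove the oldest conversation turns until everything fits. It is not how any particular model or product works. The four-characters-per-token estimate and the function names are assumptions for illustration.

```python
def rough_token_count(text: str) -> int:
    """Very rough estimate, assuming about four characters per token."""
    return max(1, (len(text) + 3) // 4)


def trim_history(system_prompt: str, turns: list[str], budget: int) -> list[str]:
    """Drop the oldest turns until the system prompt plus the remaining turns fit."""
    kept = list(turns)
    used = rough_token_count(system_prompt) + sum(rough_token_count(t) for t in kept)
    while kept and used > budget:
        dropped = kept.pop(0)  # the oldest turn is removed first
        used -= rough_token_count(dropped)
    return kept


turns = ["brand guidelines " * 50, "content brief " * 50, "latest draft " * 50]
kept = trim_history("You are a copywriter.", turns, budget=400)
print(len(kept))  # 2
```

In this example the brand guidelines are the first thing to go, which is exactly the failure described above: the earliest instructions vanish while the session carries on.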

How do tokens relate to words?

One token is roughly three to four characters, or about three-quarters of a word. A thousand-word document is roughly 1,300 to 1,500 tokens. Common words are often a single token. Rare words, long words and non-English text can consume more tokens per word than plain English.
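These rules of thumb can be turned into a quick estimator. The sketch below only applies the ratios quoted above (1,300 to 1,500 tokens per 1,000 words, three to four characters per token). It is not a real tokenizer, so actual counts from a given model will differ, especially for rare words and non-English text. The function names are illustrative.

```python
def estimate_tokens_from_words(word_count: int) -> tuple[int, int]:
    """Return a (low, high) token estimate for plain English text."""
    low = (word_count * 1300 + 999) // 1000   # 1,300 tokens per 1,000 words, rounded up
    high = (word_count * 1500 + 999) // 1000  # 1,500 tokens per 1,000 words, rounded up
    return low, high


def estimate_tokens_from_chars(text: str) -> tuple[int, int]:
    """Return a (low, high) estimate using three to four characters per token."""
    chars = len(text)
    low = (chars + 3) // 4   # four characters per token gives the lower count
    high = (chars + 2) // 3  # three characters per token gives the higher count
    return low, high


print(estimate_tokens_from_words(1_000))  # (1300, 1500)
```

Planning against the higher figure leaves a safety margin when a document sits close to the limit.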

Does a bigger context window mean better AI output?

Not automatically. A larger window lets more information fit in one session, which helps for tasks like document summarisation or long-form drafting. But the model's reasoning quality and the quality of your instructions matter more than window size for most marketing tasks.

How should I manage context window limits in practice?

Keep system prompts concise. Trim documents to the sections actually relevant to your task. Start fresh sessions for unrelated tasks rather than continuing an existing long conversation. For workflows that need large documents, look at retrieval-augmented generation tools that selectively pull only the relevant passages into the window.
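The idea of pulling only relevant passages into the window can be sketched with a toy example. Real retrieval-augmented generation tools typically rank passages with embeddings or a search index. The version below swaps in plain word overlap purely to show the shape of the idea, and the function name, sample passages and four-characters-per-token estimate are all assumptions.

```python
def select_passages(query: str, passages: list[str], budget: int) -> list[str]:
    """Pick the passages sharing the most words with the query until the budget is spent."""
    query_words = set(query.lower().split())

    def score(passage: str) -> int:
        # Number of distinct query words that also appear in the passage.
        return len(query_words & set(passage.lower().split()))

    selected: list[str] = []
    used = 0
    for passage in sorted(passages, key=score, reverse=True):
        if score(passage) == 0:
            continue  # nothing in common with the query, so leave it out
        cost = (len(passage) + 3) // 4  # rough token estimate
        if used + cost <= budget:
            selected.append(passage)
            used += cost
    return selected


passages = [
    "refund policy applies within 30 days",
    "our office is in Sydney",
    "refund requests need a receipt",
]
print(select_passages("refund policy", passages, budget=20))
```

Only the two refund passages are selected here. The office location never reaches the window, so it costs no tokens.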


About New Rebellion

New Rebellion is a marketing intelligence consultancy. We build tools, score Australian businesses on how their marketing actually performs, and publish Debrief every day. This dictionary is part of how we work in the open.
