Handling Context Window Limits in Fifteen: Token Tracking & Preventing Overflow

February 10, 2026
12 min read
Yuvraj Shanmugam • Research Systems
Context windows are finite. Here is how Fifteen tracks tokens, surfaces risk, and keeps long-running research agents stable.

Why Token Tracking Matters

Context window management is more than a technical detail. It is the difference between a memo that lands and one that fails mid-stream.

Fifteen makes usage visible so teams can design workflows that hold up under real research load.

  • Cost control with predictable token usage
  • User experience that avoids mid-draft failures
  • Reliability for multi-step research workflows

Understanding Fifteen Metrics

Fifteen exposes token usage per run and per session so teams can audit cost and reliability without guessing.

  • input_tokens: prompt + conversation history
  • output_tokens: model response tokens
  • total_tokens: combined request total
  • cache_read_tokens: cache hits for repeated prompts
Reading usage after a run looks like this:

from fifteen.agent import Agent
from fifteen.models import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    instructions="Summarize the datapack for an IC memo.",
)

# Per-run metrics are attached to the response object.
response = agent.run("Draft the risk section.")
print(response.metrics.total_tokens)
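Per-run totals become useful once you accumulate them across a session and compare against the model's context window. Here is a minimal budget-guard sketch in plain Python, independent of the Fifteen API; the `TokenBudget` class, the 128,000-token window, and the 0.8 warning ratio are illustrative assumptions, not Fifteen internals:

```python
# Minimal token-budget guard: accumulate per-run usage and flag when a
# session approaches the context window. Window size and warning ratio
# below are illustrative assumptions, not Fifteen defaults.

class TokenBudget:
    def __init__(self, context_window: int = 128_000, warn_ratio: float = 0.8):
        self.context_window = context_window
        self.warn_ratio = warn_ratio
        self.used = 0

    def record(self, total_tokens: int) -> None:
        """Add one run's total_tokens to the session tally."""
        self.used += total_tokens

    @property
    def remaining(self) -> int:
        """Tokens left before the window is exhausted."""
        return max(self.context_window - self.used, 0)

    def near_limit(self) -> bool:
        """True once usage crosses the warning threshold."""
        return self.used >= self.warn_ratio * self.context_window


budget = TokenBudget()
budget.record(90_000)   # e.g. a run's response.metrics.total_tokens
budget.record(20_000)
print(budget.near_limit(), budget.remaining)  # → True 18000
```

Surfacing `near_limit()` to the workflow lets an agent summarize or hand off before a run fails mid-draft, rather than after.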

Best Practices for Long Contexts

Keep only what is decision-critical in the active context. Summarize tool outputs, and push raw sources into the datapack instead of the prompt.

  • Cap history length per run
  • Summarize tool outputs before reuse
  • Split research into staged agents
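The first practice, capping history, can be sketched without any Fifteen-specific API. This sketch keeps the most recent messages that fit a token budget, dropping the oldest first; `trim_history` and `estimate_tokens` are hypothetical helpers, and the whitespace word count is a crude proxy for a real tokenizer:

```python
# Cap history length: keep the newest messages that fit a token budget,
# dropping the oldest first. A whitespace word count stands in for a
# real tokenizer here; production code would use the model's tokenizer.

def estimate_tokens(text: str) -> int:
    # Crude proxy: one token per whitespace-separated word.
    return len(text.split())

def trim_history(messages: list[str], budget: int) -> list[str]:
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):   # walk newest-first
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break                    # oldest messages fall off
        kept.append(msg)
        used += cost
    return list(reversed(kept))      # restore chronological order

history = ["old finding " * 50, "key risk: churn up 12%", "draft the risk section"]
print(trim_history(history, budget=20))
```

Dropping from the oldest end preserves the instructions and findings the current step actually depends on, which is usually the recent tail of the conversation.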