Handling Context Window Limits in Fifteen: Token Tracking & Preventing Overflow

February 10, 2026
12 min read
Yuvraj Shanmugam • Research Systems
Context windows are finite. Here is how Fifteen tracks tokens, surfaces risk, and keeps long-running research agents stable.

Why Token Tracking Matters

Context window management is more than a technical detail. It is the difference between a memo that lands and one that fails mid-stream.

Fifteen makes usage visible so teams can design workflows that hold up under real research load.

  • Cost control with predictable token usage
  • User experience that avoids mid-draft failures
  • Reliability for multi-step research workflows

Understanding Fifteen Metrics

Fifteen exposes token usage per run and per session so teams can audit cost and reliability without guessing.

  • input_tokens: prompt + conversation history
  • output_tokens: model response tokens
  • total_tokens: combined request total
  • cache_read_tokens: cache hits for repeated prompts
Reading usage after a run looks like this:

from fifteen.agent import Agent
from fifteen.models import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    instructions="Summarize the datapack for an IC memo.",
)

# Per-run metrics are attached to the response object.
response = agent.run("Draft the risk section.")
print(response.metrics.total_tokens)
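Per-run totals become useful once you accumulate them across a session and compare against the model's context window. Here is a minimal budget-guard sketch in plain Python, independent of the Fifteen API; the `TokenBudget` class, the 128,000-token window, and the 0.8 warning ratio are illustrative assumptions, not Fifteen internals:

```python
# Minimal token-budget guard: accumulate per-run usage and flag when a
# session approaches the context window. Window size and warning ratio
# below are illustrative assumptions, not Fifteen defaults.

class TokenBudget:
    def __init__(self, context_window: int = 128_000, warn_ratio: float = 0.8):
        self.context_window = context_window
        self.warn_ratio = warn_ratio
        self.used = 0

    def record(self, total_tokens: int) -> None:
        """Add one run's total_tokens to the session tally."""
        self.used += total_tokens

    @property
    def remaining(self) -> int:
        """Tokens left before the window is exhausted."""
        return max(self.context_window - self.used, 0)

    def near_limit(self) -> bool:
        """True once usage crosses the warning threshold."""
        return self.used >= self.warn_ratio * self.context_window


budget = TokenBudget()
budget.record(90_000)   # e.g. a run's response.metrics.total_tokens
budget.record(20_000)
print(budget.near_limit(), budget.remaining)  # → True 18000
```

Surfacing `near_limit()` to the workflow lets an agent summarize or hand off before a run fails mid-draft, rather than after.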

Best Practices for Long Contexts

Keep only what is decision-critical in the active context. Summarize tool outputs, and push raw sources into the datapack instead of the prompt.

  • Cap history length per run
  • Summarize tool outputs before reuse
  • Split research into staged agents
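The first practice, capping history, can be sketched without any Fifteen-specific API. This sketch keeps the most recent messages that fit a token budget, dropping the oldest first; `trim_history` and `estimate_tokens` are hypothetical helpers, and the whitespace word count is a crude proxy for a real tokenizer:

```python
# Cap history length: keep the newest messages that fit a token budget,
# dropping the oldest first. A whitespace word count stands in for a
# real tokenizer here; production code would use the model's tokenizer.

def estimate_tokens(text: str) -> int:
    # Crude proxy: one token per whitespace-separated word.
    return len(text.split())

def trim_history(messages: list[str], budget: int) -> list[str]:
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):   # walk newest-first
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break                    # oldest messages fall off
        kept.append(msg)
        used += cost
    return list(reversed(kept))      # restore chronological order

history = ["old finding " * 50, "key risk: churn up 12%", "draft the risk section"]
print(trim_history(history, budget=20))
```

Dropping from the oldest end preserves the instructions and findings the current step actually depends on, which is usually the recent tail of the conversation.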