Handling Context Window Limits in Fifteen: Token Tracking & Preventing Overflow
February 10, 2026
12 min read
Yuvraj Shanmugam • Research Systems
Context windows are finite. Here is how Fifteen tracks tokens, surfaces risk, and keeps long-running research agents stable.
Why Token Tracking Matters
Context window management is more than a technical detail. It is the difference between a memo that lands and one that fails mid-stream.
Fifteen makes usage visible so teams can design workflows that hold up under real research load.
- Cost control with predictable token usage
- User experience that avoids mid-draft failures
- Reliability for multi-step research workflows
Understanding Fifteen Metrics
Fifteen exposes token usage per run and per session so teams can audit cost and reliability without guessing.
- input_tokens: prompt + conversation history
- output_tokens: model response tokens
- total_tokens: combined request total
- cache_read_tokens: cache hits for repeated prompts
from fifteen.agent import Agent
from fifteen.models import OpenAIChat
agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    instructions="Summarize the datapack for an IC memo.",
)
response = agent.run("Draft the risk section.")
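# The per-field metrics below are assumed to be flat attributes on
# response.metrics, mirroring the fields listed above; adjust the names
# if Fifteen nests or renames them.
print(response.metrics.input_tokens)       # prompt + conversation history
print(response.metrics.output_tokens)      # model response tokens
print(response.metrics.cache_read_tokens)  # cache hits for repeated prompts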
print(response.metrics.total_tokens)
Best Practices for Long Contexts
Keep only what is decision-critical in the active context. Summarize tool outputs, and push raw sources into the datapack instead of the prompt; the first two practices below are sketched in code after the list.
- Cap history length per run
- Summarize tool outputs before reuse
- Split research into staged agents
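Below is a minimal sketch of the first two practices, not Fifteen's own API: estimate_tokens, cap_history, and summarize_tool_output are illustrative helper names, the four-characters-per-token estimate is a rough heuristic, and the run response is assumed to expose its text as content (only response.metrics is confirmed by the example above).
from fifteen.agent import Agent
from fifteen.models import OpenAIChat

def estimate_tokens(text: str) -> int:
    # Rough heuristic (~4 characters per token); swap in a real tokenizer
    # when exact budgets matter.
    return max(1, len(text) // 4)

def cap_history(messages: list[dict], budget: int = 8_000) -> list[dict]:
    # Keep only the most recent messages that fit inside the token budget.
    kept, used = [], 0
    for message in reversed(messages):
        cost = estimate_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))

def summarize_tool_output(summarizer: Agent, raw_output: str, limit: int = 2_000) -> str:
    # Compress long tool results before they re-enter the active context.
    if estimate_tokens(raw_output) <= limit:
        return raw_output
    response = summarizer.run(
        "Summarize this tool output for reuse in a memo:\n\n" + raw_output
    )
    return response.content  # assumed attribute for the response text

summarizer = Agent(
    model=OpenAIChat(id="gpt-4o"),
    instructions="Summarize tool outputs; keep key figures and caveats.",
)
In a staged setup, each downstream agent would receive the capped history plus summaries rather than raw tool output, which keeps per-stage token usage predictable.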