Context Management
Fermi’s context management is the core feature that enables long sessions. Instead of hitting context limits and performing a blind reset, the system monitors usage, compresses strategically, and only resets as a last resort. The agent can inspect its own context distribution and surgically summarize selected blocks — down to a single tool call result.
Three Layers
1. Hint Compression
As context grows, the system injects guidance at two thresholds:
| Level | Default trigger | What the agent sees |
|---|---|---|
| Level 1 | 60% of budget | A nudge to call show_context and consider summarizing older groups |
| Level 2 | 80% of budget | A stronger prompt to summarize immediately before auto-compact triggers |
Hysteresis prevents oscillation — once a hint fires, context must drop meaningfully before the hint can fire again.
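The two thresholds and the hysteresis rule can be sketched as a small state machine. This is only an illustration, not Fermi's implementation: the `HintTracker` class and the 10-point `REARM_MARGIN` are assumptions, since the text says only that context must "drop meaningfully" before a hint can fire again.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Default triggers from the table above, as percent of the context budget.
THRESHOLDS = {1: 60, 2: 80}
# Hypothetical re-arm margin; the real value is not documented here.
REARM_MARGIN = 10


@dataclass
class HintTracker:
    """Tracks which hint levels are armed, i.e. allowed to fire."""
    armed: Dict[int, bool] = field(default_factory=lambda: {1: True, 2: True})

    def update(self, usage_pct: float) -> Optional[int]:
        """Return the hint level to inject at this usage, or None."""
        # Re-arm a level only once usage has fallen well below its trigger.
        for level, trigger in THRESHOLDS.items():
            if usage_pct < trigger - REARM_MARGIN:
                self.armed[level] = True
        # Fire the highest armed level whose trigger has been reached.
        for level in (2, 1):
            if usage_pct >= THRESHOLDS[level] and self.armed[level]:
                # Disarm this level and any weaker one below it.
                for lower in range(1, level + 1):
                    self.armed[lower] = False
                return level
        return None
```

With these assumed values, a Level 1 hint that fired at 62% does not fire again at 63% after a brief dip to 58%. It fires again only after usage has dropped below 50% and then climbed back past 60%.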
2. Agent-Initiated Summarization
The agent has two tools for fine-grained context control:
show_context
Returns a self-contained context map listing every context group with its ID, size, and type. The call injects nothing into the existing conversation, which preserves the prompt cache; the map itself tells the agent what each context ID covers.
summarize_context
Operates on groups of spatially contiguous context IDs. For each group, the agent writes a distilled summary that preserves decisions, key facts, code references, and unresolved issues. The summary then stands in for the original content in the visible context.
The key property: this is append-only. Original content is never deleted — summaries are appended, and the system dynamically determines what is visible based on what has been summarized. This means summarization is safe and reversible at the system level.
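A minimal sketch of the append-only idea follows, assuming a simple in-memory log. The `ContextLog` class, its method names, and the rule that the most recent covering summary wins are all hypothetical; they only illustrate how a visible view can be derived without deleting originals.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Summary:
    first_id: int  # inclusive start of the contiguous range covered
    last_id: int   # inclusive end of the range
    text: str


class ContextLog:
    """Hypothetical append-only store: entries and summaries are only added."""

    def __init__(self) -> None:
        self.entries: List[str] = []        # original content, never deleted
        self.summaries: List[Summary] = []  # appended by summarize()

    def append(self, text: str) -> int:
        """Store new content and return its context ID."""
        self.entries.append(text)
        return len(self.entries) - 1

    def summarize(self, first_id: int, last_id: int, text: str) -> None:
        """Record a summary for a contiguous ID range; originals stay stored."""
        if not 0 <= first_id <= last_id < len(self.entries):
            raise ValueError("range must cover existing context IDs")
        self.summaries.append(Summary(first_id, last_id, text))

    def visible(self) -> List[str]:
        """Derive the visible view: covered entries collapse into a summary."""
        out: List[str] = []
        ctx_id = 0
        while ctx_id < len(self.entries):
            covering = [s for s in self.summaries
                        if s.first_id <= ctx_id <= s.last_id]
            if covering:
                latest = covering[-1]  # assumption: the newest summary wins
                out.append(f"[summary {latest.first_id}-{latest.last_id}] {latest.text}")
                ctx_id = latest.last_id + 1
            else:
                out.append(self.entries[ctx_id])
                ctx_id += 1
        return out
```

Because `summarize` only appends, dropping a summary record in this sketch would restore the original view, which is the sense in which the operation is reversible at the system level.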
summarize_context targets specific ranges. For whole-window reset when the context limit is reached, the system uses auto-compact (a separate mechanism, also exposed as the /compact user command).
3. Auto-Compact
When context reaches critical levels despite hints and summarization:
| Trigger | Default threshold | When it fires |
|---|---|---|
| Before-turn | 85% | Before processing the next user message |
| Mid-turn | 90% | After a tool call result pushes context over the limit |
Auto-compact produces a continuation prompt — a comprehensive briefing that lets the agent resume exactly where it left off. After compact, the context window contains only:
- The continuation prompt
- The current plan snapshot (if a plan is active)
- AGENTS.md files (persistent memory)
Before-turn compact is interruptible: pressing Ctrl+C cancels the compact and preserves the original context.
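The two triggers can be read as a simple check against the context budget. The sketch below assumes the comparison is "greater than or equal to" and that usage is measured in tokens; the function name `compact_trigger` is hypothetical.

```python
from typing import Optional

BEFORE_TURN_PCT = 85  # default threshold from the table above
MID_TURN_PCT = 90     # default threshold from the table above


def compact_trigger(used_tokens: int, budget_tokens: int,
                    mid_turn: bool) -> Optional[str]:
    """Return which auto-compact trigger applies, or None if neither does."""
    pct = 100 * used_tokens / budget_tokens
    if mid_turn:
        # Checked after a tool call result has been added to the context.
        return "mid-turn" if pct >= MID_TURN_PCT else None
    # Checked before the next user message is processed.
    return "before-turn" if pct >= BEFORE_TURN_PCT else None
```

Under these assumptions, a context at 86% of budget compacts before the next user message but not in the middle of a turn, where the higher 90% threshold applies.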
User vs. Agent Summarization
| | /summarize (user) | summarize_context tool (agent) |
|---|---|---|
| Trigger | User runs the slash command | Agent decides autonomously (or prompted by hints) |
| Selection | Interactive picker: choose start/end turn range | Agent picks context IDs after inspecting the map |
| Focus | Optional focus prompt (“Keep the auth details”) | Agent writes the summary directly |
| Granularity | Turn-level ranges | Can target individual tool results |
Context Budget
The effective context size can be restricted without switching models. In ~/.fermi/settings.json (or <project>/.fermi/settings.json for a per-project override):

    { "context_budget_percent": 70 }

This sets the effective budget to 70% of the model’s maximum context length. All threshold calculations (hints, compact) operate against this budget. This is useful when you want to leave headroom for large tool results.
You can also set it per-session via the CLI: fermi -c context_budget_percent=70.
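As a worked example of how the budget scales the thresholds, the sketch below converts the default percentages into token counts. The 200,000-token maximum is a hypothetical model size chosen for the arithmetic, and `token_thresholds` is an illustrative helper, not part of Fermi.

```python
from typing import Dict

# Default percentages from the tables above.
DEFAULT_TRIGGERS = {
    "hint_level_1": 60,
    "hint_level_2": 80,
    "compact_before_turn": 85,
    "compact_mid_turn": 90,
}


def token_thresholds(max_context_tokens: int,
                     context_budget_percent: int = 100) -> Dict[str, int]:
    """Convert percentage triggers into token counts for a given budget."""
    budget = max_context_tokens * context_budget_percent // 100
    result = {"budget": budget}
    for name, pct in DEFAULT_TRIGGERS.items():
        # Every trigger is a percentage of the budget, not of the model maximum.
        result[name] = budget * pct // 100
    return result


print(token_thresholds(200_000, context_budget_percent=70))
```

For this hypothetical model, a 70% budget is 140,000 tokens, so the first hint fires at 84,000 tokens and mid-turn compact at 126,000, leaving the rest of the model's window as headroom.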
Manual Intervention
/summarize
Opens an interactive picker:
- Select start turn — pick where summarization begins
- Select end turn — pick where it ends
- Focus prompt (optional) — instructions about what to preserve
The selected range is converted to context IDs and summarized.
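One way to picture the conversion from a turn range to context IDs is shown below. The data model is an assumption made for illustration: each turn is represented as the list of context IDs it produced, and `turn_range_to_context_ids` is a hypothetical helper.

```python
from typing import List


def turn_range_to_context_ids(turns: List[List[int]],
                              start_turn: int, end_turn: int) -> List[int]:
    """Flatten the context IDs of every turn in [start_turn, end_turn]."""
    if not 0 <= start_turn <= end_turn < len(turns):
        raise ValueError("turn range is out of bounds")
    # Turns are ordered, so the flattened IDs form one contiguous run.
    return [cid for turn in turns[start_turn:end_turn + 1] for cid in turn]


# Four turns; the third one includes two tool results (IDs 4 and 5).
turns = [[0, 1], [2], [3, 4, 5], [6]]
selected = turn_range_to_context_ids(turns, 1, 2)
```

Here `selected` covers IDs 2 through 5. This also shows why the user command works at turn granularity, while the agent's tool can pick out a single ID such as one tool result.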
    /summarize

/compact
Full context reset with a continuation summary. Optionally provide instructions:

    /compact
    /compact Preserve the DB schema decisions

AGENTS.md: Persistent Memory
Two AGENTS.md files are folded into the system prompt (so they’re present every turn) and survive compact:
- ~/.fermi/AGENTS.md: Global preferences across all projects
- <project>/AGENTS.md: Project-specific patterns and conventions
They’re read at session init and on reload (e.g. after editing AGENTS.md or running /reload). The agent reads these for context and can write to them to save long-term knowledge. Use AGENTS.md to store architectural decisions, coding conventions, known constraints, and preferred approaches.
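The sketch below shows one way the two files could be gathered into a single block of system prompt text. The two paths come from the list above; the function name, the section labels, and the global-then-project ordering are assumptions.

```python
from pathlib import Path


def build_memory_section(project_dir: Path, home: Path) -> str:
    """Concatenate whichever AGENTS.md files exist, global first."""
    candidates = [
        ("Global", home / ".fermi" / "AGENTS.md"),
        ("Project", project_dir / "AGENTS.md"),
    ]
    sections = []
    for label, path in candidates:
        # A missing file is simply skipped; neither file is required here.
        if path.is_file():
            body = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {label} AGENTS.md\n{body}")
    return "\n\n".join(sections)
```

Calling `build_memory_section(Path.cwd(), Path.home())` at session start and again on reload would mirror the read points described above.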
Practical Tips
- Let the system work. In most sessions the three automatic layers handle everything.
- Use /summarize after exploration. Once you have conclusions from a long investigation, summarize the exploration to free space for execution.
- Provide a focus prompt. Telling the summarizer what matters makes compression more effective.
- Adjust context_budget_percent if you routinely hit limits with large files or many tool results.
- Write to AGENTS.md for knowledge that should persist across sessions; the agent can do this on your behalf.