Memory & compaction
Conversations grow. Eventually they hit the model's context window. The Memory view is where Kenaz manages that pressure — automatically and visibly — without losing the thread.
Memory sits on the lower rungs of Kenaz's context ladder: working memory, compacted history, and long-term memory all live on this page. The rungs above (personal, team, and org context packs) live on Context. Read both pages together if you're trying to figure out where a given piece of information should live.
What "memory" means here
In Kenaz, a session has three layers of state the model sees on every turn:
- The system prompt — set per-project or per-session.
- The compacted history — a summary the harness wrote when the conversation got long.
- Recent messages — the last N turns, verbatim.
When you send a new prompt, the harness assembles system + compacted + recent + your prompt and sends that to the provider. The Memory view shows you exactly what each layer contains.
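To make the assembly step concrete, here is a minimal TypeScript sketch of what a turn-level request might look like. The type and function names are illustrative, not Kenaz's internal API.

```ts
// Illustrative types -- not Kenaz's actual internals.
type Message = { role: "system" | "user" | "assistant"; content: string };

interface SessionState {
  systemPrompt: string;      // layer 1: set per-project or per-session
  compactedHistory: string;  // layer 2: harness-written summary
  recentMessages: Message[]; // layer 3: last N turns, verbatim
}

// Assemble what the provider actually receives on each turn:
// system + compacted + recent + the new prompt.
function buildRequest(state: SessionState, userPrompt: string): Message[] {
  return [
    { role: "system", content: state.systemPrompt },
    { role: "system", content: `Conversation so far (compacted):\n${state.compactedHistory}` },
    ...state.recentMessages,
    { role: "user", content: userPrompt },
  ];
}
```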
Automatic compaction
When a session approaches the active model's context limit (default trigger: 70% of the window), Kenaz:
- Picks the oldest stretch of messages that haven't been compacted.
- Sends them to the model with a "summarize this conversation chunk" prompt.
- Replaces those messages in the visible history with a collapsed [Summary] card.
- Stores the original messages out-of-band — they're still in the audit log and you can expand the card to see them.
This is invisible during normal use; the chat just keeps working. The Memory view is where to look when something goes wrong.
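For intuition, here is a rough sketch of the trigger-and-replace flow, assuming the harness tracks token usage per turn. All names (maybeCompact, summarize, archive) and the "older half of the history" heuristic are illustrative, not Kenaz's actual code.

```ts
// Illustrative sketch of the compaction flow, not Kenaz source.
const TRIGGER_RATIO = 0.7; // default trigger: 70% of the model's context window

interface Turn { id: string; content: string; isSummary: boolean }

async function maybeCompact(
  turns: Turn[],
  tokensUsed: number,
  contextWindow: number,
  summarize: (text: string) => Promise<string>, // the one extra model call
  archive: (originals: Turn[]) => Promise<void>, // out-of-band store / audit log
): Promise<Turn[]> {
  if (tokensUsed < TRIGGER_RATIO * contextWindow) return turns;

  // Oldest stretch of messages that hasn't been compacted yet
  // (here: the older half of the non-summary turns).
  const uncompacted = turns.filter((t) => !t.isSummary);
  const oldest = uncompacted.slice(0, Math.ceil(uncompacted.length / 2));
  if (oldest.length === 0) return turns;

  const summary = await summarize(oldest.map((t) => t.content).join("\n"));
  await archive(oldest); // originals stay expandable behind the [Summary] card

  // Replace the summarized stretch with a collapsed card.
  const card: Turn = { id: `summary-${Date.now()}`, content: summary, isSummary: true };
  const insertAt = turns.indexOf(oldest[0]);
  const remaining = turns.filter((t) => !oldest.includes(t));
  remaining.splice(insertAt, 0, card);
  return remaining;
}
```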
Manual compaction
Memory view → Compact now. Forces compaction on the current session immediately. Useful when:
- You're about to attach a long document and want headroom.
- You're switching to a smaller-context model.
- You want to "reset" the conversation tone without losing the thread.
You can also pick a specific message range to compact, leaving newer history intact.
Editing memory
The Memory view is a live editor for the compacted layer:
- Edit a summary — click into a [Summary] card and rewrite. The next turn, the model sees your version, not Kenaz's.
- Pin a fact — promote any line to "always include verbatim." Survives further compaction.
- Drop a chunk — delete a section the model doesn't need to remember.
Edits to memory are audited the same way prompts and responses are — the original Kenaz-written summary stays in the audit log even after you overwrite it.
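As a mental model, a [Summary] card behaves roughly like the sketch below; the field names are assumptions, not Kenaz's schema.

```ts
// Illustrative only; not Kenaz's actual data model.
interface SummaryCard {
  summary: string;       // editable: the model sees your version on the next turn
  pinnedFacts: string[]; // promoted lines, always included verbatim
}

// What the model sees for this card: the (possibly hand-edited) summary,
// with pinned facts carried verbatim through any further compaction.
function renderCard(card: SummaryCard): string {
  return [card.summary, ...card.pinnedFacts].join("\n");
}
```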
Long-term memory (cross-session)
Memory has its own durable layer separate from the conversation. When you mark something to remember — right-click any line in the chat → Remember this, or pin a line from a [Summary] card — Kenaz writes a Chunk to its local memory store, embeds it, and gives it a scope:
| Scope | Visible in |
|---|---|
| session | Just the originating session (default) |
| project | Every session inside a project |
| global | Every session, period |
On future turns, Kenaz runs a k-NN search against your chunks for the active query and pulls in any with similarity above threshold. The chunks ride along as a small "long-term memory" block in the system prompt — the model can use them but isn't required to.
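Mechanically, retrieval amounts to embedding the active query and keeping the chunks whose similarity clears the threshold. A minimal sketch, assuming a cosine metric and an illustrative 0.75 cutoff:

```ts
// Illustrative retrieval sketch; the Chunk fields and threshold are assumptions.
interface Chunk {
  text: string;
  embedding: number[];
  scope: "session" | "project" | "global";
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Keep chunks whose similarity to the active query clears the threshold;
// they ride along as a "long-term memory" block in the system prompt.
function recallChunks(chunks: Chunk[], queryEmbedding: number[], threshold = 0.75): string {
  const hits = chunks
    .map((c) => ({ c, score: cosine(c.embedding, queryEmbedding) }))
    .filter(({ score }) => score >= threshold)
    .sort((a, b) => b.score - a.score)
    .map(({ c }) => c.text);
  return hits.length ? `Long-term memory:\n- ${hits.join("\n- ")}` : "";
}
```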
The Memory view → Long-term tab is where you browse, edit, and delete chunks. Use it as your personal scratchpad of durable facts; over time it evolves into a model of how you work.
Memory vs. context packs
Memory chunks come back via retrieval — only when relevant. Context packs (Context) are injected — always present. The trade-off:
- Pin a memory chunk when the fact might come up again but doesn't need to be in front of the model every turn.
- Promote to a context pack when you've noticed yourself pinning the same kind of fact repeatedly, or when a teammate would benefit from seeing it too.
When you're ready to make the jump, Memory → multi-select chunks → Promote to context pack drops the selection into a draft personal pack. From there the same path covers personal → team → org. See Context for the full elevation flow.
MCP-based memory servers
The MCP catalog includes alternative memory backends (e.g. @modelcontextprotocol/server-memory). Connect one and Kenaz treats its tools as additional memory — the model can add_memory / search_memory per the server's tool schema. Useful when you want memory to live somewhere specific (a Notion page, a vector DB you already run, a teammate's shared store).
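If you want to poke at such a server directly, a minimal sketch using the MCP TypeScript SDK looks roughly like this. The tool names and argument shapes are whatever the server's schema declares, so the sketch discovers them at runtime rather than assuming particular names; the "query" argument is an assumption.

```ts
// Minimal MCP client sketch. Requires @modelcontextprotocol/sdk; tool names
// and argument shapes are defined by the connected server's schema.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-memory"],
});
const client = new Client({ name: "memory-probe", version: "0.1.0" }, { capabilities: {} });
await client.connect(transport);

// See which memory tools the server actually exposes.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));

// Call a search-style tool if the server provides one.
const search = tools.find((t) => /search/i.test(t.name));
if (search) {
  const result = await client.callTool({
    name: search.name,
    arguments: { query: "release checklist" }, // argument shape is server-specific
  });
  console.log(result);
}

await client.close();
```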
External memory servers do not auto-elevate into Kenaz context packs; if you want both, pin into Kenaz's local memory and promote from there.
Compaction with vision / attachments
Image attachments don't compact — Kenaz drops them from older turns when the context shrinks, but keeps a [Image dropped — was: <filename>] placeholder so the model knows it referenced an image earlier.
PDF attachments are summarized text-side (the model gets a one-paragraph synopsis) but the original PDF stays attached to the message — re-attach in a later turn if you need the full content again.
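A sketch of how that substitution might look in code, with illustrative attachment types:

```ts
// Illustrative handling of attachments on older turns; types are assumptions.
type Attachment =
  | { kind: "image"; filename: string }
  | { kind: "pdf"; filename: string; synopsis: string };

function compactAttachment(a: Attachment): string {
  switch (a.kind) {
    case "image":
      // Images aren't summarized -- the model just gets a marker that one was here.
      return `[Image dropped — was: ${a.filename}]`;
    case "pdf":
      // PDFs keep a one-paragraph text synopsis; the original stays attached to the message.
      return `[PDF: ${a.filename}] ${a.synopsis}`;
  }
}
```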
Performance
Compaction makes one extra model call per trigger event. On Sonnet-class models that's ~2–4 seconds and a few hundred tokens. On the Usage view you'll see a compaction event accounting for the spend.
If compaction is firing too often (e.g. you're doing tool-heavy work and tool results are bloating history), bump the trigger threshold in Settings → Memory → Compaction trigger, or pick a longer-context model.
Privacy
Compaction prompts go to whichever provider is currently configured for the session — same provider that handles your regular turns. The summarization prompt explicitly tells the model to omit secrets, paths, and tokens, but you should treat the compacted summary as having the same privacy posture as the original messages.
To change the provider used specifically for compaction (e.g. local Ollama for compaction, hosted Sonnet for chat), use Settings → Memory → Compaction provider.