pi-privacy-filter is a pi extension that runs every outbound prompt through a local PII / secret classifier before it leaves your machine. Account numbers, emails, names, keys — they get swapped for stable `[ENTITY_TYPE_N]` placeholders. The model only ever sees the placeholders. The original values are restored in the response, locally, before you see it.
```
// You type:
▶ My AWS account number is 22922829292 and my email is harry@example.com.
🔒 Redacted 2 values before sending: 22922829292, harry@example.com

// What the LLM sees:
My AWS account number is [ACCOUNT_NUMBER_1] and my email is [EMAIL_1].

// LLM responds (with placeholders):
Got it — I'll use account [ACCOUNT_NUMBER_1] and contact [EMAIL_1].

// What you see (restored locally):
Got it — I'll use account 22922829292 and contact harry@example.com.

▶ /privacy-mapping
[ACCOUNT_NUMBER_1] → 22922829292
[EMAIL_1] → harry@example.com
```
Coding agents are a productivity multiplier — and a steady, quiet
pipeline of whatever you happen to paste into the prompt: account IDs,
env vars, names, customer emails, internal hostnames, the contents of
.env files you grepped half a second ago. Every one of
those bytes ends up in some provider's logs.
pi-privacy-filter puts a tiny, local, deterministic redaction step in between you and the wire. A token-classification model from `openai/privacy-filter` runs on your CPU via `@huggingface/transformers`. Sensitive spans get swapped for stable `[ENTITY_TYPE_N]` placeholders. The mapping lives in memory only. The LLM works against symbols; the user works against real values; nobody else sees either.
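Loading the classifier is a few lines with transformers.js. A minimal sketch: the model ID is the one named above, and the task name is standard transformers.js, but treat the exact shape of the results as an assumption, since it depends on the model's config.

```ts
import { pipeline } from "@huggingface/transformers";

// First run downloads the weights into the local cache; afterwards this
// loads entirely offline. "openai/privacy-filter" is the model named above.
const classify = await pipeline("token-classification", "openai/privacy-filter");

// Result entries carry a label (e.g. "B-EMAIL") plus the matched token text;
// the exact field names depend on the model config (assumption).
const hits = await classify("my email is harry@example.com");
console.log(hits);
```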
Two pi event hooks bracket every LLM call. On the way out, sensitive spans become placeholders. On the way back, placeholders become real values again — locally. The model is none the wiser, and the on-disk session contains your original text exactly as you typed it.
Pi's context event fires before each request. Every text part in every message — user prompts, tool results, prior assistant replies — runs through the local classifier. New secrets mint new placeholders; known secrets reuse their existing ones via a deterministic, longest-first pre-replace pass.
Only the redacted payload leaves the machine. The system prompt is appended with a one-liner explaining the placeholder convention so the model uses tokens verbatim instead of guessing the originals. Same value → same placeholder, every turn, forever.
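The notice itself can be a single sentence. An illustrative wording (not the extension's actual phrasing):

```
Some values in this conversation were replaced with placeholders like
[EMAIL_1]. Use placeholders verbatim; never guess the original values.
```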
When the assistant message lands, pi's message_end hook walks every text and thinking block and swaps placeholders back to the original values from the in-memory mapping — before you see the response and before pi persists it.
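Put together, the two hooks are just a redact pass and a restore pass around in-memory maps. A minimal sketch: the hook names come from the description above, but the payload shapes and registration mechanism are illustrative assumptions, not pi's actual extension API.

```ts
// Hook names ("context", "message_end") come from the description above;
// the payload shapes here are assumptions for the sake of the sketch.
type TextBlock = { text: string };
type Message = { textBlocks: TextBlock[]; thinkingBlocks: TextBlock[] };

declare const toOriginal: Map<string, string>;          // [ENTITY_TYPE_N] → original
declare function redact(text: string): Promise<string>; // sketched further below

const hooks = {
  // Outbound: fires before each request; every text part gets classified
  // and rewritten, so only the redacted payload leaves the machine.
  async context(messages: Message[]) {
    for (const msg of messages) {
      for (const block of msg.textBlocks) {
        block.text = await redact(block.text);
      }
    }
  },

  // Inbound: fires when the assistant message lands; placeholders are
  // swapped back before the response is displayed or persisted.
  message_end(message: Message) {
    for (const block of [...message.textBlocks, ...message.thinkingBlocks]) {
      for (const [placeholder, original] of toOriginal) {
        block.text = block.text.split(placeholder).join(original);
      }
    }
  },
};
```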
Everything that matters for a redact / restore pipeline that has to be invisible to the user:

- **Local-only.** The classifier runs on your CPU via `@huggingface/transformers`. The placeholder mapping lives in process memory only — never persisted, never shipped over the wire.
- **Stable placeholders.** Same value → same placeholder, across turns, across messages, across thinking blocks. The LLM can reason about `[ACCOUNT_NUMBER_1]` as a coherent entity for the entire session.
- **Deterministic pre-replace.** Once a value is mapped, every future occurrence is replaced via deterministic string substitution before the classifier runs (see the sketch after this list). Classifier instability or fragmentation can't leak the same secret twice.
- **Thinking blocks covered.** Reasoning content from Anthropic / OpenAI / Gemini / Mistral all arrives in a `thinking` field. We redact and restore those alongside regular text — so chain-of-thought never leaks raw secrets either.
- **Memoized.** Redaction results are memoized per exact-text input, with O(1) original → placeholder lookups. After the first turn, history replay is essentially free; only new bytes hit the model.
- **Visible when it matters.** Every time a brand-new secret enters the conversation, you get a local-only toast like 🔒 Redacted 1 value before sending: 22922829292 — so you can see exactly what got hidden.
- **Live footer.** Pi's footer shows 🔒 privacy-filter · N redacted, updating after every turn. Ambient awareness with zero clutter in the transcript.
- **Inspectable.** `/privacy-mapping` dumps the current placeholder → original mapping. Handy for debugging the classifier, auditing what got redacted, or just satisfying curiosity.
- **Drop-in.** Single TypeScript file, loaded via pi's normal extension mechanism. Works with every provider pi supports — the model never knows it's behind a redaction layer beyond the one-line system-prompt notice.
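The pre-replace and memoization bullets boil down to a few maps and a length-sorted loop. A sketch of the core pass, where `classifySpans` is a hypothetical wrapper around the token-classification pipeline, not the extension's actual code:

```ts
type Span = { type: string; value: string }; // e.g. { type: "EMAIL", value: "a@b.c" }

// Hypothetical: runs the classifier and merges token hits into spans.
declare function classifySpans(text: string): Promise<Span[]>;

const toPlaceholder = new Map<string, string>(); // original → [ENTITY_TYPE_N]
const toOriginal = new Map<string, string>();    // [ENTITY_TYPE_N] → original
const counters = new Map<string, number>();      // entity type → next N
const memo = new Map<string, string>();          // exact input → redacted output

async function redact(text: string): Promise<string> {
  const cached = memo.get(text);
  if (cached !== undefined) return cached; // history replay is essentially free

  // Pre-replace known secrets first, longest first, so a secret that is a
  // substring of another can never clobber the longer match.
  let out = text;
  const known = [...toPlaceholder.keys()].sort((a, b) => b.length - a.length);
  for (const secret of known) {
    out = out.split(secret).join(toPlaceholder.get(secret)!);
  }

  // Classifier pass over whatever is left: new secrets mint new placeholders,
  // known ones reuse their existing placeholder.
  for (const span of await classifySpans(out)) {
    let placeholder = toPlaceholder.get(span.value);
    if (placeholder === undefined) {
      const n = (counters.get(span.type) ?? 0) + 1;
      counters.set(span.type, n);
      placeholder = `[${span.type}_${n}]`; // e.g. [ACCOUNT_NUMBER_1]
      toPlaceholder.set(span.value, placeholder);
      toOriginal.set(placeholder, span.value);
    }
    out = out.split(span.value).join(placeholder);
  }

  memo.set(text, out);
  return out;
}
```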
The classifier weights are pulled the first time you start pi (cached under `~/.cache/huggingface-transformers-js/`) — after that it's all offline.
```sh
# install via pi
pi install npm:@codingcoffee/pi-privacy-filter

# or try without installing
pi -e npm:@codingcoffee/pi-privacy-filter
```
Or run it straight from a clone:

```sh
git clone https://github.com/codingCoffee/pi-privacy-filter.git
cd pi-privacy-filter
bun install  # or: npm install

# load it for one session
pi -e .
```
Or wire it permanently as a project-local / global pi extension:
```sh
# project-local — applies in the current repo only
mkdir -p .pi/extensions
ln -s "$PWD" .pi/extensions/pi-privacy-filter

# global — applies everywhere
mkdir -p ~/.pi/agent/extensions
ln -s "$PWD" ~/.pi/agent/extensions/pi-privacy-filter
```
First run will download the openai/privacy-filter weights (~tens of MB). Subsequent runs use the local cache and start instantly.
The extension is intentionally low-config: there's nothing to tune, nothing to point at, no API keys to manage. You get one inspection command and two ambient UI affordances.
| Surface | What | When |
|---|---|---|
| `/privacy-mapping` | Slash command — dumps current placeholder → original mapping. | Any time, on demand. |
| Footer status | 🔒 privacy-filter · N redacted — live count of distinct values redacted in this session. | Always visible; updates after every turn. |
| Toast notification | 🔒 Redacted N values before sending: … — shows the actual values (locally, never sent anywhere). | Only when a brand-new secret enters the conversation. |
The extension sets `OMP_NUM_THREADS=1`, configures the model session with `intraOpNumThreads=1` / `interOpNumThreads=1`, and filters the handful of `pthread_setaffinity_np` warnings off stderr so they don't pollute the pi TUI. Real errors still surface.
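The stderr filtering is ordinary Node: wrap `process.stderr.write`, drop the known-noisy lines, and pass everything else through. A sketch of the idea, not necessarily the extension's exact code:

```ts
// Keep the ONNX thread pool quiet (mirrors OMP_NUM_THREADS=1 above).
process.env.OMP_NUM_THREADS = "1";

// Swallow only the known-noisy affinity warnings; everything else,
// including real errors, still reaches the terminal.
const rawWrite = process.stderr.write.bind(process.stderr);
process.stderr.write = ((chunk: unknown, ...rest: unknown[]) => {
  if (String(chunk).includes("pthread_setaffinity_np")) return true;
  return rawWrite(chunk as never, ...(rest as never[]));
}) as typeof process.stderr.write;
```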
Install once. Forget about it. Watch the toasts.