Projects

2026 · Solo · LLM / retrieval system

Clearance-aware GraphRAG

An agentic GraphRAG system where a user's clearance is enforced inside the retrieval path — the model never sees a chunk the caller isn't allowed to read, so it cannot leak it.

The problem

Corporate RAG has a quiet failure mode: the retriever fetches a chunk the user was never cleared to read, and the model dutifully summarises the secret. Most systems "fix" this by scrubbing the answer afterwards — too late, the fact has already reached the model. Strata enforces clearance inside retrieval.

Approach & tradeoffs

A user's clearance is a ceiling on an ordered scale: public < internal < confidential < restricted. That ceiling is applied as a filter inside the Qdrant vector search and in the WHERE clause of the Neo4j graph query, before any context reaches the generator, so a fact above the caller's clearance can't be retrieved in the first place, let alone leaked.
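A minimal sketch of the clearance ceiling as a retrieval-time filter. It is an illustration, not the project's actual code: the payload field `clearance_level`, the graph labels `Entity` and `Chunk`, the relationship `MENTIONED_IN`, and all function names are assumptions. The dictionary follows Qdrant's JSON filter shape (a `must` list with a `range` condition), and the Cypher string takes the ceiling as a query parameter.

```python
from enum import IntEnum


class Clearance(IntEnum):
    """Ordered clearance levels; a higher value means more sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def qdrant_filter(ceiling: Clearance) -> dict:
    """Build a filter in Qdrant's JSON filter shape.

    Assumes each point stores a hypothetical integer payload field
    `clearance_level`; only points at or below the ceiling match.
    """
    return {"must": [{"key": "clearance_level", "range": {"lte": int(ceiling)}}]}


# Hypothetical graph schema: (:Entity)-[:MENTIONED_IN]->(:Chunk).
# The ceiling is a query parameter, never interpolated into the string.
CYPHER_CHUNKS_FOR_ENTITY = (
    "MATCH (e:Entity {id: $entity_id})-[:MENTIONED_IN]->(c:Chunk) "
    "WHERE c.clearance_level <= $ceiling "
    "RETURN c.id AS id, c.text AS text"
)


def cypher_params(entity_id: str, ceiling: Clearance) -> dict:
    """Parameters to send alongside CYPHER_CHUNKS_FOR_ENTITY."""
    return {"entity_id": entity_id, "ceiling": int(ceiling)}


def readable(chunks: list[dict], ceiling: Clearance) -> list[dict]:
    """In-memory equivalent of both filters, handy for checking the logic."""
    return [c for c in chunks if c["clearance_level"] <= ceiling]


chunks = [
    {"id": "a", "clearance_level": Clearance.PUBLIC},
    {"id": "b", "clearance_level": Clearance.CONFIDENTIAL},
]
visible = readable(chunks, Clearance.INTERNAL)
```

Because the comparison happens in the query itself, a caller with internal clearance never receives chunk "b", so there is nothing for the generator to redact afterwards.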

On top of that boundary sits the retrieval engineering:

Results

What I'd flag

The golden sets are small (10–20 items, single run, on a local 8 GB GPU), so the deltas between single-pass and agent runs are within noise. That is exactly why every eval report stores the raw answers: it's how I caught a refusal-matcher bug that was the metric's fault, not the model's.
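A rough way to see why deltas on sets this small sit within noise is the binomial standard error of an accuracy score. The sketch below is illustrative only: the accuracy of 0.5 is an assumed worst case, not a measured result, and the function names are hypothetical.

```python
import math


def standard_error(accuracy: float, n_items: int) -> float:
    """Binomial standard error of an accuracy measured on n_items."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_items)


def one_item_delta(n_items: int) -> float:
    """Change in accuracy caused by flipping a single item."""
    return 1.0 / n_items


for n in (10, 20):
    # 0.5 is the assumed accuracy that maximises the standard error.
    se = standard_error(0.5, n)
    print(f"n={n}: one item moves the score by {one_item_delta(n):.3f}, "
          f"standard error is about {se:.3f}")
```

At 20 items, one changed answer moves the score by 0.05 while the standard error under this assumption is roughly 0.11, so a difference of one or two items between two systems cannot be separated from run-to-run variation.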