01 · The Problem

Four Limits of Standard AI

Off-the-shelf AI and RAG are probabilistic. They work for general documents, but they are dangerous for legal analysis: they hallucinate connections, miss nuance, and corrupt timelines.

GAP 01

🪪

Identity

"James" vs "Jamie Harlow" vs "The Accused": a vector DB treats these as three different people. This kills recall.

Phonetic GUID Binding

GAP 02

⚖️

Logic

A police officer writes "Knife." The Penal Code says "Weapon." Standard vector search misses this entirely.

Knowledge Graph Ontology

GAP 03

⏱

Chronology

In a Case Diary, sequence is the evidence. Standard AI retrieves by similarity, not by time. It corrupts timelines.

Time-Series Validation Engine

GAP 04

🕸

Complexity

Financial crimes are mathematical graph patterns, not keywords. A chatbot cannot "see" the flow of money.

Deterministic Graph Algorithms

02 · Architecture

Split-Brain System

Most systems only use the neural half. This architecture adds the symbolic layer: a rigid logic engine that acts as a chaperone for the creative AI.

Ingest & Normalisation

Reasoning & Retrieval

Security & Generation

INGEST VLM · Diarization · MDM

REASONING VecDB ↔ KG · Rules Engine

OUTPUT Citable Verdict · Live


Neural / LLM layer

Vector Database

Semantic similarity engine. Answers: "What does this text look like?"

Dense embeddings for narrative context
BM25 sparse search for Case IDs and passport numbers
Cross-encoder reranking on UK legal text
HyDE bridging natural questions to legal evidence
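The dense and sparse retrievers listed above each return their own ranked list, and those lists have to be merged before reranking. The source does not say how the merge is done; the sketch below uses reciprocal rank fusion as one common option. The function name, the constant k=60, and the document IDs are all illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Assumption: the source does not name a fusion method; RRF is used
    here only as a well-known example.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list earn a larger share.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers.
dense_hits = ["stmt-12", "report-442", "stmt-07"]      # embedding search
sparse_hits = ["report-442", "case-idx-3", "stmt-12"]  # BM25 search

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

A document that both retrievers rank highly rises to the top, which is the behaviour a cross-encoder reranker would then refine.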

Symbolic / Logic layer

Knowledge Graph

Deterministic logic engine. Answers: "What does this mean legally?"

UK Penal Code ontology: statutes, sections, definitions
Temporal graph: timeline of events and causality
Ontology mapper: enforces legal inheritance
Enforces the law before generating the answer

03 · Identity

Variable Name, Constant Identity

100% recall across the entire archive, regardless of spelling. Every name variation is bound to a single Global Unique ID before any storage happens.

Double Metaphone · Levenshtein Distance · Fuzzy Matching · Coreference Resolution

Coreference resolution: When a transcript says "He entered the room", the system looks back, identifies "He" as ENT-99284, and tags it. No evidence is orphaned.
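The binding step can be sketched as a small alias registry. This is a minimal illustration, not the system's implementation: it covers only the Levenshtein part of the technique list (Double Metaphone and coreference resolution are omitted), and the class name, the distance cut-off of 2, and the ID format for new entities are assumptions.

```python
import uuid


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


class EntityRegistry:
    """Hypothetical registry mapping name variants to one entity ID."""

    def __init__(self, max_distance: int = 2):
        self.max_distance = max_distance  # assumed fuzzy-match cut-off
        self.aliases = {}                 # normalised name -> entity ID

    @staticmethod
    def _normalise(name: str) -> str:
        return " ".join(name.lower().split())

    def bind(self, name: str, entity_id: str) -> None:
        """Explicitly bind an alias (e.g. a role such as 'The Accused')."""
        self.aliases[self._normalise(name)] = entity_id

    def resolve(self, name: str) -> str:
        """Return the ID for a name, minting a new ID if nothing is close."""
        key = self._normalise(name)
        if key in self.aliases:
            return self.aliases[key]
        best = min(self.aliases, key=lambda known: levenshtein(key, known),
                   default=None)
        if best is not None and levenshtein(key, best) <= self.max_distance:
            entity_id = self.aliases[best]
        else:
            entity_id = f"ENT-{uuid.uuid4().hex[:8]}"
        self.aliases[key] = entity_id  # remember the variant for next time
        return entity_id


registry = EntityRegistry()
registry.bind("Jamie Harlow", "ENT-99284")
registry.bind("The Accused", "ENT-99284")  # role alias, bound by hand

print(registry.resolve("Jaimie Harlow"))  # misspelling, one edit away
print(registry.resolve("the accused"))
```

Edit distance alone cannot link "The Accused" to a name, which is why the role alias is bound explicitly here; in the described system that link would come from coreference resolution.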

04 · Logic

Think Like a Prosecutor

Standard RAG retrieves text. GraphRAG retrieves logic, traversing the Penal Code ontology to find the legal classification. The LLM receives the inference path, not just raw text.

GraphRAG · Penal Code Ontology Traversal

INPUT

"kitchen knife"

HOP DEPTH

4 traversals

STATUTE

S.47 OAPA 1861

METHOD

Ontology map

Zero-shot adaptability: If a new weapon like a "3D Printed Spike" appears tomorrow, simply update the graph taxonomy. No retraining needed. The logic propagates instantly across the entire system.
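The traversal and the taxonomy update can be sketched with a plain dictionary standing in for the graph. Only the endpoints ("kitchen knife", "Weapon", S.47 OAPA 1861) and the 4-hop depth come from the text above; the intermediate node "bladed article", the edge layout, and the function name are illustrative assumptions.

```python
# Each key points to its parent concept; the final edge links the
# top-level concept to a statute. Intermediate nodes are assumed.
EDGES = {
    "kitchen knife": "knife",
    "knife": "bladed article",
    "bladed article": "weapon",
    "weapon": "S.47 OAPA 1861",
}


def inference_path(term, edges, max_hops=4):
    """Follow parent edges from a term and return every node visited.

    The hop limit also protects against accidental cycles in the graph.
    """
    path = [term]
    while path[-1] in edges and len(path) - 1 < max_hops:
        path.append(edges[path[-1]])
    return path


path = inference_path("kitchen knife", EDGES)
print(" -> ".join(path))

# Adding a new weapon is a single graph edit; no model is retrained.
EDGES["3d printed spike"] = "weapon"
new_path = inference_path("3d printed spike", EDGES)
print(" -> ".join(new_path))
```

The returned path is what would be handed to the LLM as the inference chain, so the generated answer can cite each hop.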

05 · Chronology

Alibi Physics

Time is the only variable that matters in a Case Diary. Evidence is extracted as structured tuples: (Entity, Action, Location, Timestamp), so math can be run on the narrative.

LIVE ANALYSIS Alibi conflict detection

DISTANCE

100 km

TIME WINDOW

15 min

REQ. SPEED

400 km/h
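The figures above follow from dividing distance by elapsed time: 100 km in 15 minutes (0.25 h) requires 400 km/h. A minimal sketch of that check over the (Entity, Action, Location, Timestamp) tuples is shown below. The distance is passed in directly rather than derived from the locations, and the 130 km/h plausibility ceiling, the dates, and the location names are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    entity: str
    action: str
    location: str
    timestamp: datetime


def required_speed_kmh(distance_km, first, second):
    """Speed needed to travel distance_km between two sightings."""
    if distance_km == 0:
        return 0.0
    hours = abs((second.timestamp - first.timestamp).total_seconds()) / 3600
    if hours == 0:
        return float("inf")  # two places at the same instant
    return distance_km / hours


def is_alibi_conflict(distance_km, first, second, max_speed_kmh=130.0):
    """Flag the pair if no plausible journey connects the two events.

    max_speed_kmh is an assumed ceiling, not a value from the source.
    """
    return required_speed_kmh(distance_km, first, second) > max_speed_kmh


seen_a = Event("ENT-99284", "seen", "Location A", datetime(2024, 1, 1, 10, 0))
seen_b = Event("ENT-99284", "seen", "Location B", datetime(2024, 1, 1, 10, 15))

speed = required_speed_kmh(100, seen_a, seen_b)
print(speed, is_alibi_conflict(100, seen_a, seen_b))
```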

06 · Financial Crimes

Math Lives in Graphs, Not LLMs

Financial crimes are mathematical patterns, not keywords. The LLM parses bank statements. The graph algorithm proves the crime deterministically, with 100% confidence.
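The deterministic half of that split can be illustrated with cycle detection on a directed transfer graph: money that leaves an account and eventually returns to it forms a cycle, and a depth-first search either finds one or proves none exists. This is a textbook sketch under assumed inputs (account labels and the tuple format are hypothetical), not the system's own algorithm.

```python
def find_cycle(transfers):
    """Return one cycle of accounts as a list, or None if the graph has none.

    transfers: iterable of (source_account, destination_account) pairs.
    """
    graph = {}
    for src, dst in transfers:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNVISITED for node in graph}
    stack = []  # accounts on the current DFS path

    def visit(node):
        state[node] = IN_PROGRESS
        stack.append(node)
        for nxt in graph[node]:
            if state[nxt] == IN_PROGRESS:
                # Back edge: the path from nxt to node closes a loop.
                return stack[stack.index(nxt):] + [nxt]
            if state[nxt] == UNVISITED:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        state[node] = DONE
        return None

    for node in graph:
        if state[node] == UNVISITED:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical transfers: A -> B -> C -> A is a round trip.
cycle = find_cycle([("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")])
print(cycle)
```

The recursive form is fine for small illustrative graphs; a large ledger would need an iterative traversal or a graph database's built-in algorithm.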

GRAPH ANALYSIS Smurfing / structuring pattern detection

Cycle Detection · PageRank

FEEDER ACCOUNTS

4 nodes

TOTAL DETECTED

CONFIDENCE

100%

STRUCTURING / SMURFING 2-HOP GRAPH TRAVERSAL


Hybrid query: "Show all individuals within 2 hops of Suspect who transferred >50k". Standard search cannot do multi-hop relationship analysis.
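A hybrid query of this kind combines a bounded graph traversal with an attribute filter. The sketch below uses breadth-first search and treats relationships as undirected; the data, the function names, and the choice to filter on the sender of each transfer are assumptions, and the currency of the 50k threshold is left unspecified as in the text.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Map every node reachable within max_hops of start to its distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[start]
    return distance


def flagged_individuals(links, transfers, suspect,
                        max_hops=2, min_amount=50_000):
    """People within max_hops of the suspect who sent more than min_amount."""
    graph = {}
    for a, b in links:  # relationships treated as undirected (assumption)
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    nearby = within_hops(graph, suspect, max_hops)
    big_senders = {sender for sender, _receiver, amount in transfers
                   if amount > min_amount}
    return sorted(nearby.keys() & big_senders)


# Hypothetical relationship chain and transfers.
links = [("Suspect", "P1"), ("P1", "P2"), ("P2", "P3")]
transfers = [("P1", "X", 60_000), ("P2", "Y", 10_000), ("P3", "Z", 90_000)]

hits = flagged_individuals(links, transfers, "Suspect")
print(hits)
```

P3 made a large transfer but sits three hops away, so it is excluded; P2 is close enough but below the amount threshold.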

07 · Sovereign Security

Air-gapped intelligence

Sensitive data cannot leave the building. A semantic router acts as a traffic cop, routing general queries to the cloud and case-specific data to an on-premise secure vault.

☁ Public Cloud LLM

General legal theory

Definitions, legal precedents, general interpretations of law. No case-specific data. Zero PII exposure.

🔒 On-Premise SLM

Case-specific evidence

Any query involving a case number, victim name, or sensitive PII is physically routed to a locally hosted model. Sovereign data never leaves the CPS perimeter.
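The routing decision can be illustrated with a rule-based stand-in. A semantic router would normally classify queries by meaning; the sketch below only checks for a case-number pattern and a list of protected names, which is a deliberate simplification. The CASE-nnnn format, the function name, and the route labels are assumptions.

```python
import re

# Hypothetical case-number format; the source does not specify one.
CASE_NUMBER = re.compile(r"\bCASE-\d{4,}\b", re.IGNORECASE)


def route(query, protected_names):
    """Return 'on_premise' for case-specific queries, otherwise 'cloud'."""
    if CASE_NUMBER.search(query):
        return "on_premise"
    lowered = query.lower()
    if any(name.lower() in lowered for name in protected_names):
        return "on_premise"
    return "cloud"


names = {"Jamie Harlow"}
print(route("What is the definition of actual bodily harm?", names))
print(route("Summarise the witness statements in CASE-20417", names))
print(route("Where was jamie harlow on Friday evening?", names))
```

This sketch defaults to the cloud when nothing matches. A deployment handling sensitive material might prefer the opposite default, sending anything ambiguous to the on-premise model.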

🛡

ABAC clearance filter

Attribute-Based Access Control checks user clearance before data is even retrieved. If a user lacks "Juvenile" clearance, that data effectively does not exist for them.
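A minimal sketch of that pre-retrieval filter follows: documents carry the clearances they require, and anything the user is not cleared for is dropped before the retriever ever sees it. The Document type, its field names, and the example IDs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    required_clearances: frozenset = frozenset()


def visible_documents(documents, user_clearances):
    """Keep only documents whose required clearances the user fully holds."""
    granted = set(user_clearances)
    return [doc for doc in documents if doc.required_clearances <= granted]


corpus = [
    Document("D1"),                                   # no restriction
    Document("D2", frozenset({"Juvenile"})),          # juvenile record
]

general_user = [doc.doc_id for doc in visible_documents(corpus, set())]
cleared_user = [doc.doc_id for doc in visible_documents(corpus, {"Juvenile"})]
print(general_user, cleared_user)
```

Because filtering happens before retrieval, a user without the clearance cannot surface the restricted document through any query wording.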

✂️

Dynamic redaction

Even if the secure model finds the answer, a post-generation filter scrubs all PII from the output based on the requesting user's role. Names, IDs, and phone numbers are removed before delivery.
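A post-generation scrubber along these lines could be sketched with regular expressions. The role table, the placeholder strings, and the phone pattern are assumptions for illustration; the ENT-nnnnn pattern mirrors the entity ID format used elsewhere in this document. Real PII detection needs far more than three patterns.

```python
import re

ENTITY_ID = re.compile(r"\bENT-\d+\b")
# Loose phone pattern (assumption): 10+ characters of digits, spaces, hyphens.
PHONE = re.compile(r"(?<!\d)\+?\d[\d\s-]{8,}\d(?!\d)")

# Hypothetical mapping of roles to the PII fields they may see.
ROLE_VISIBLE_FIELDS = {
    "lead_prosecutor": {"name", "id", "phone"},
    "analyst": {"id"},
    "external": set(),
}


def redact(text, role, known_names):
    """Scrub PII the given role may not see; unknown roles see none."""
    visible = ROLE_VISIBLE_FIELDS.get(role, set())
    if "name" not in visible:
        for name in known_names:
            text = re.sub(re.escape(name), "[REDACTED NAME]", text,
                          flags=re.IGNORECASE)
    if "id" not in visible:
        text = ENTITY_ID.sub("[REDACTED ID]", text)
    if "phone" not in visible:
        text = PHONE.sub("[REDACTED PHONE]", text)
    return text


draft = "ENT-99284 (Jamie Harlow) called +44 20 7946 0958."
print(redact(draft, "external", ["Jamie Harlow"]))
print(redact(draft, "analyst", ["Jamie Harlow"]))
```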

🚫

Adversarial defense

Prompt injection protection layer. If someone tries to trick the AI into leaking data through crafted inputs, this layer intercepts and blocks it before any retrieval occurs.

08 · Trust

No Black Boxes

Every claim is a click away from its source. Every reasoning step is logged. If a defense attorney challenges a finding, the entire chain of thought can be printed and presented.

🎯

Hallucination guard: negative logic

The system is programmed to prefer silence over lies. Below 95% confidence, the AI is forced to respond "I do not know." It will not guess. Ever.

Confidence scale: 0% to 100% · Threshold: 95%
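The guard itself reduces to a threshold comparison applied after generation. The sketch below assumes a confidence score between 0 and 1 is already available; how that score is computed is not described in the source, and the function name is hypothetical.

```python
CONFIDENCE_THRESHOLD = 0.95  # the 95% threshold stated above
REFUSAL = "I do not know."


def guarded_answer(answer, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Return the answer only if confidence meets the threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return answer if confidence >= threshold else REFUSAL


print(guarded_answer("Classified under S.47 OAPA 1861.", 0.973))
print(guarded_answer("Classified under S.47 OAPA 1861.", 0.90))
```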

📋

Immutable audit log: chain of thought

The entire reasoning path is logged: which documents were accessed, which logic path was taken, and why the conclusion was reached. All entries are timestamped and immutable.

✓ [10:42:01] Entity resolved: ENT-99284
→ [10:42:01] GraphRAG traversal: Knife → Weapon → S.47 OAPA
→ [10:42:02] Confidence: 97.3% · threshold passed
✓ [10:42:02] Citation: Forensic Report #442, p.12 ¶4
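One way to make such a log tamper-evident is to chain entries by hash, so that altering any earlier entry invalidates everything after it. The source does not say how immutability is enforced; hash chaining with SHA-256 is an assumption here, as are the class and field names.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64


def _digest(body):
    """Stable SHA-256 over an entry's timestamp, message, and prev_hash."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class AuditLog:
    """Append-only log in which each entry commits to the previous one."""

    def __init__(self):
        self._entries = []

    def append(self, message, timestamp=None):
        ts = timestamp or datetime.now(timezone.utc).isoformat()
        prev = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        body = {"timestamp": ts, "message": message, "prev_hash": prev}
        entry = {**body, "hash": _digest(body)}
        self._entries.append(entry)
        return entry

    def verify(self):
        """Recompute every hash; False means an entry was altered."""
        prev = GENESIS_HASH
        for entry in self._entries:
            body = {key: entry[key]
                    for key in ("timestamp", "message", "prev_hash")}
            if entry["prev_hash"] != prev or entry["hash"] != _digest(body):
                return False
            prev = entry["hash"]
        return True


log = AuditLog()
log.append("Entity resolved: ENT-99284")
log.append("GraphRAG traversal: Knife -> Weapon -> S.47 OAPA")
log.append("Citation: Forensic Report #442, p.12 para 4")
print(log.verify())
```

Hash chaining detects tampering rather than preventing it; storage that physically cannot be rewritten is a separate concern.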

🗂

Tenant isolation

Vector indexes are physically separated. Prosecutor strategy notes are isolated from general evidence, so the AI can never accidentally leak internal strategy into a general query.

🔄

Stale data pruning: vector lifecycle

Witnesses change statements. If a witness recants, set is_active=False on the affected vectors immediately, without waiting for a weekend index rebuild. Retrieval stays accurate in real time.
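The is_active flag amounts to a soft delete: the vector stays in the index but is filtered out at query time. A minimal in-memory sketch follows, with a plain dot product standing in for the vector database's similarity search. Only the flag name is_active comes from the text; the record type, the embeddings, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VectorRecord:
    record_id: str
    text: str
    embedding: list
    is_active: bool = True


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def search(records, query_embedding, top_k=3):
    """Rank only active records by similarity to the query."""
    live = [r for r in records if r.is_active]
    live.sort(key=lambda r: dot(r.embedding, query_embedding), reverse=True)
    return live[:top_k]


records = [
    VectorRecord("stmt-1", "Original witness statement", [0.9, 0.1]),
    VectorRecord("stmt-2", "Forensic summary", [0.2, 0.8]),
]

# The witness recants: flip the flag, no index rebuild required.
records[0].is_active = False

results = [r.record_id for r in search(records, [1.0, 0.0])]
print(results)
```

Keeping the recanted record, rather than deleting it, also preserves it for the audit trail.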

NEURO-SYMBOLIC AI

Reasoning Engine, Not Search Box.

It doesn't find similar text. It proves the case, traversing law, enforcing chronology, detecting patterns, and citing every claim.

Transitioning from probabilistic search → deterministic legal reasoning

Public Prosecution · Proof over Probability