Case Study 024

Eliminating Paper.
Deploying On-Premise AI.

A full-stack, AI-powered CRM engineered for a mid-size Italian law firm. Replacing manual workflows with local LLMs and semantic search.

Industry: Legal Services
Deployment: On-Premise (Air-gapped)
Timeline: Q3 2024 – Q1 2025
Stack: FastAPI, Qdrant, Ollama

01. Context

The Paper Problem

Legal workflows are historically document-heavy, and strict privacy laws make them resistant to digitization. This firm operated entirely on paper, creating bottlenecks in case retrieval and version control.

Firm Profile

  • 7-10 Senior Lawyers
  • 400+ Active Cases
  • Zero existing digital infrastructure

Initial State

Paper-based, manual routing, no audit trail.

Status: CRITICAL INEFFICIENCY

Privacy Constraint

Absolute requirement: zero external data transmission. Cloud AI providers (OpenAI, Anthropic) were legally disqualified because GDPR-sensitive client data could not leave the firm's premises.

Manual Flow

Physical folders moved manually between desks. Loss of documents during transit was a recurring operational risk.

Search Failure

Finding a specific precedent required physically browsing archives, taking hours instead of seconds.

02. Architecture

Three Portals. One Backend.

A monorepo deploying containerized services via Docker Compose. The system separates concerns between lawyer operations, client interactions, and administrative oversight; a minimal gateway sketch follows the stack overview below.

Frontend
  • Lawyer Portal: React + TypeScript
  • Client Portal: React + TypeScript
  • Admin Portal: React + TypeScript

API Layer (REST / WebSockets)
  • FastAPI Gateway: Python 3.11, async

Data Layer
  • PostgreSQL: relational data
  • Qdrant: vector search
  • Redis: cache & queues
  • MinIO: object storage
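
To make the separation of concerns concrete, here is a minimal sketch of the gateway wiring. The route prefixes and handlers are hypothetical illustrations, not the production routes.

gateway.py

# Minimal sketch of the gateway's portal separation.
# Route prefixes and handler names are hypothetical.
from fastapi import APIRouter, FastAPI

app = FastAPI(title="Firm CRM Gateway")

lawyer = APIRouter(prefix="/api/lawyer", tags=["lawyer"])
client = APIRouter(prefix="/api/client", tags=["client"])
admin = APIRouter(prefix="/api/admin", tags=["admin"])

@lawyer.get("/cases")
async def list_cases():
    # Would query PostgreSQL for the lawyer's active cases.
    return {"cases": []}

@client.get("/documents")
async def list_client_documents():
    # Would list documents the client may view, backed by MinIO.
    return {"documents": []}

@admin.get("/audit")
async def audit_trail():
    # Would return the append-only audit trail.
    return {"events": []}

for router in (lawyer, client, admin):
    app.include_router(router)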

Tech Stack Details

Desktop Packaging
  • Electron
  • Docker Compose

Security
  • Cloudflare Tunnel
  • JWT Auth
  • TOTP MFA

DevOps
  • GitHub Actions
  • Poetry
  • Terraform

All services are contained within the local VPC.
Core Intelligence

Fully Local LLM Architecture

To meet strict privacy requirements, we engineered a Retrieval-Augmented Generation (RAG) pipeline that runs entirely on the firm's on-premise server hardware (NVIDIA RTX 4090).

  • Zero External Data Transmission: no data leaves the local network. 100% GDPR compliant.
  • Custom RAG Pipeline: LangChain orchestration with semantic chunking for legal texts.
config.yaml
inference_engine: "Ollama"
model: "Qwen2.5:14b-instruct"
embeddings: "BAAI/bge-m3"
context_window: 32768
quantization: "Q4_K_M"
vector_db: "Qdrant (Local)"
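
As a minimal sketch of the retrieval-then-generate step under this configuration, assuming bge-m3 and qwen2.5:14b-instruct are pulled into the local Ollama instance and a hypothetical case_documents collection exists in Qdrant. The production pipeline orchestrates this through LangChain with semantic chunking.

rag_query.py

# Sketch of the local RAG query path: embed, retrieve, generate.
# Collection name and prompts are illustrative.
import ollama
from qdrant_client import QdrantClient

qdrant = QdrantClient(url="http://localhost:6333")  # local instance only

def answer(question: str, collection: str = "case_documents") -> str:
    # 1. Embed the question on-premise via Ollama.
    vector = ollama.embeddings(model="bge-m3", prompt=question)["embedding"]

    # 2. Retrieve the most similar chunks from the local vector store.
    hits = qdrant.search(collection_name=collection, query_vector=vector, limit=5)
    context = "\n\n".join(hit.payload["text"] for hit in hits)

    # 3. Generate a grounded answer with the local Qwen model.
    reply = ollama.chat(
        model="qwen2.5:14b-instruct",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return reply["message"]["content"]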

System Capabilities

Case Lifecycle

Full digital tracking from intake to archiving, with automated status updates.

Semantic Search

Find documents by meaning rather than by exact keywords, using vector embeddings (see the indexing sketch after this list).

Version Control

Immutable history of every document edit, with author attribution.

Drafting Assistant

AI-assisted generation of legal briefs based on historical precedents.
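
The semantic search capability rests on an indexing step that embeds document chunks locally and stores them in Qdrant. A minimal sketch follows, with naive fixed-size chunking standing in for the production semantic chunker and a hypothetical collection name.

index_documents.py

# Sketch of the indexing side of semantic search: chunk, embed locally,
# upsert into Qdrant. Fixed-size chunking is a stand-in for the
# production semantic chunker.
import uuid

import ollama
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

qdrant = QdrantClient(url="http://localhost:6333")

def index_document(text: str, collection: str = "case_documents") -> None:
    # bge-m3 produces 1024-dimensional embeddings.
    if not qdrant.collection_exists(collection):
        qdrant.create_collection(
            collection_name=collection,
            vectors_config=VectorParams(size=1024, distance=Distance.COSINE),
        )
    chunks = [text[i : i + 1000] for i in range(0, len(text), 1000)]
    points = [
        PointStruct(
            id=str(uuid.uuid4()),
            vector=ollama.embeddings(model="bge-m3", prompt=chunk)["embedding"],
            payload={"text": chunk},
        )
        for chunk in chunks
    ]
    qdrant.upsert(collection_name=collection, points=points)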

Security & Compliance

  • Authentication: JWT + TOTP MFA (see the login sketch after this list)
  • Data Encryption: AES-256 (at rest)
  • Compliance: GDPR + ISO 27001
  • Backup Strategy: encrypted offsite (cold)
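
A minimal sketch of the JWT + TOTP login flow using pyjwt and pyotp; the secret handling and user lookup are placeholders, not the firm's actual implementation.

auth.py

# Sketch of JWT session issuance gated by a TOTP second factor.
# Secret handling and user lookup are placeholders.
import datetime

import jwt    # pip install pyjwt
import pyotp  # pip install pyotp

JWT_SECRET = "change-me"  # loaded from a secrets store in practice

def login(user_id: str, totp_secret: str, totp_code: str) -> str:
    # Second factor: verify the 6-digit TOTP code against the user's secret.
    if not pyotp.TOTP(totp_secret).verify(totp_code):
        raise PermissionError("invalid TOTP code")
    # First factor (password check) omitted; issue a short-lived session JWT.
    expiry = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(hours=1)
    return jwt.encode({"sub": user_id, "exp": expiry}, JWT_SECRET, algorithm="HS256")

def verify_token(token: str) -> str:
    # Raises jwt.InvalidTokenError on tampering or expiry.
    claims = jwt.decode(token, JWT_SECRET, algorithms=["HS256"])
    return claims["sub"]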
03. Results

  • 40% reduction in case processing time
  • 60% less administrative workload
  • 0% external data leakage (air-gapped)

From Paper to AI-Driven Infrastructure.

This project demonstrates that high-compliance industries do not need to sacrifice modern AI capabilities. With the right architecture, privacy and intelligence can coexist.