Building a Private AI Infrastructure From Scratch
Ramone — Local AI System // Case Study
How I designed and built a fully self-hosted AI system that runs five LLMs locally on consumer hardware, with zero cloud dependency and zero data egress. This case study covers the full stack: WSL2 as the Linux virtualisation layer, Ollama for model serving, Docker for containerisation, and Open WebUI as the interface. It also covers the boot automation system, the RAG pipeline architecture, and the hardware portability decisions that make the whole system rebuildable in under 30 minutes.
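As a rough illustration of how a stack like this can be checked after a rebuild, the sketch below asks a local Ollama server which models it has pulled and reports any that are missing. It is not taken from the Ramone codebase. It assumes Ollama is listening on its default address (http://localhost:11434) and uses its /api/tags endpoint; the helper names and the model names in the usage note are hypothetical placeholders.

```python
import json
import urllib.request

# Assumption: Ollama is running on its default local address.
OLLAMA_URL = "http://localhost:11434"


def installed_models(tags_payload: dict) -> set[str]:
    """Extract model names from the JSON body returned by Ollama's /api/tags."""
    return {model["name"] for model in tags_payload.get("models", [])}


def missing_models(tags_payload: dict, expected: list[str]) -> list[str]:
    """Return the expected model names that are not installed, in input order."""
    have = installed_models(tags_payload)
    return [name for name in expected if name not in have]


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> dict:
    """Fetch the list of locally available models from a running Ollama server.

    Raises OSError (including urllib.error.URLError) if the server is unreachable.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)
```

A post-rebuild check could call missing_models(fetch_tags(), ["example-model-a:latest", "example-model-b:latest"]), substituting the five models actually in use for the placeholder names, and treat an empty result as "model serving is ready".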
System Architecture
Model Selection
RAG Pipeline Design
Boot Automation
Hardware Decisions
Workbot Engineering