
Librarius

active

August 2025

Python PostgreSQL pgvector Ollama Kubernetes
ai rag python self-hosted

An end-to-end Retrieval-Augmented Generation (RAG) system for Warhammer 40K rulebooks. Fully self-hosted on my homelab K3s cluster — no cloud APIs, no subscriptions, just local models and local infrastructure.

Architecture

The system is built as a three-phase pipeline, each named after Librarian specializations from the lore:

Phase 1: Lexicanium — Data Ingestion

  • Extracts PDFs from archives using a strict filename schema
  • Partitions documents using semantic chunking that preserves structure (sections, tables, stat blocks)
  • Handles both native-text and scanned PDFs, with OCR via Tesseract
  • Stores chunks in PostgreSQL with rich metadata (faction, edition, category, page numbers)
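The core of this phase is structure-aware chunking. A minimal sketch of the idea, assuming section headings can be detected as all-caps lines (the real pipeline does proper layout analysis) and using hypothetical names like `Chunk` and `chunk_by_section`:

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)

def chunk_by_section(text: str, faction: str, edition: str) -> list[Chunk]:
    """Split rulebook text on section headings, carrying metadata with
    every chunk so retrieval can later filter and cite precisely.

    Simplification: a heading is any non-empty all-caps line."""
    chunks, current, heading = [], [], None
    for line in text.splitlines():
        if line.strip() and line.isupper():
            if current:  # close out the previous section
                chunks.append(Chunk("\n".join(current),
                                    {"faction": faction, "edition": edition,
                                     "section": heading}))
            heading, current = line.strip(), []
        else:
            current.append(line)
    if current:  # final section
        chunks.append(Chunk("\n".join(current),
                            {"faction": faction, "edition": edition,
                             "section": heading}))
    return chunks
```

Keeping the section name in each chunk's metadata is what lets answers cite "Shooting Phase, p. 22" instead of an anonymous text fragment.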

Phase 2: Epistolary — Embedding

  • Converts text chunks to 1024-dimensional vector embeddings using intfloat/multilingual-e5-large-instruct
  • GPU-accelerated batch processing
  • Stored in PostgreSQL with the pgvector extension
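A sketch of the batch-embedding loop, assuming a `chunks` table with `id` and `embedding vector(1024)` columns (the table and column names are illustrative, not the real schema). The heavy dependencies are imported lazily so the pure helpers stay importable on their own:

```python
def batched(items, size):
    """Yield successive slices so the GPU sees fixed-size batches."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def to_pgvector(vec):
    """Format a float sequence as a pgvector literal, e.g. '[0.100000,0.200000]'."""
    return "[" + ",".join(f"{x:.6f}" for x in vec) + "]"

def embed_chunks(conn, chunks, batch_size=32):
    """Embed chunk texts in batches and store them via pgvector.

    Assumes `sentence-transformers` and a psycopg2 connection to a
    database with the pgvector extension installed."""
    from sentence_transformers import SentenceTransformer  # lazy: heavy dependency
    model = SentenceTransformer("intfloat/multilingual-e5-large-instruct")  # 1024-dim
    with conn.cursor() as cur:
        for batch in batched(chunks, batch_size):
            vecs = model.encode([c["text"] for c in batch],
                                normalize_embeddings=True)
            for chunk, vec in zip(batch, vecs):
                cur.execute(
                    "UPDATE chunks SET embedding = %s::vector WHERE id = %s",
                    (to_pgvector(vec), chunk["id"]),
                )
    conn.commit()
```

Normalizing the embeddings means cosine similarity and inner product agree, which keeps the choice of pgvector distance operator flexible later.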

Phase 3: Codicier — Retrieval & Generation

  • Embeds user queries and performs k-NN vector search
  • Provides retrieved context to a local LLM via Ollama
  • Supports interactive chat mode with source citations
  • Filterable by game system (40K, 30K, Kill Team, etc.)
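The retrieval step can be sketched as a pgvector k-NN query plus a grounded prompt. The schema details (a `chunks` table with a JSONB `metadata` column holding a `game_system` key) and the model names are assumptions for illustration:

```python
def knn_sql(game_system=None):
    """Build a cosine-distance k-NN query, optionally filtered by game system.
    The caller appends the query-vector literal and LIMIT as parameters."""
    sql = "SELECT text, metadata FROM chunks"
    params = []
    if game_system:
        sql += " WHERE metadata->>'game_system' = %s"
        params.append(game_system)
    sql += " ORDER BY embedding <=> %s::vector LIMIT %s"
    return sql, params

def build_prompt(question, hits):
    """Assemble a prompt that forces the LLM to answer from numbered sources."""
    context = "\n\n".join(
        f"[{i+1}] ({h['metadata'].get('source', '?')}, "
        f"p.{h['metadata'].get('page', '?')})\n{h['text']}"
        for i, h in enumerate(hits)
    )
    return ("Answer using ONLY the rules excerpts below, citing sources like [1].\n\n"
            f"{context}\n\nQuestion: {question}\nAnswer:")

def ask(conn, question, k=5, game_system=None, llm="llama3.1"):
    """End-to-end retrieval + generation (lazy imports; model names are assumptions)."""
    from sentence_transformers import SentenceTransformer
    import ollama
    model = SentenceTransformer("intfloat/multilingual-e5-large-instruct")
    # e5-instruct expects queries (not passages) to carry a task instruction.
    query = ("Instruct: Retrieve rules passages that answer the question\n"
             f"Query: {question}")
    vec = model.encode(query, normalize_embeddings=True)
    literal = "[" + ",".join(str(x) for x in vec) + "]"
    sql, params = knn_sql(game_system=game_system)
    with conn.cursor() as cur:
        cur.execute(sql, params + [literal, k])
        hits = [{"text": t, "metadata": m} for t, m in cur.fetchall()]
    reply = ollama.chat(model=llm,
                        messages=[{"role": "user", "content": build_prompt(question, hits)}])
    return reply["message"]["content"]
```

The `<=>` operator is pgvector's cosine distance; the numbered `[1]`-style context blocks are what make the chat mode's source citations possible.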

Why

Off-the-shelf LLMs hallucinate game rules constantly. By grounding responses in actual rulebook text retrieved via semantic search, the system gives accurate, sourced answers about game mechanics.