🏗️ Architecture

DocGen.AI pairs locally hosted LLMs with a local-first, real-time developer experience, designed for performance, modularity, and privacy.

Components

Frontend: React + shadcn/ui + Tailwind CSS. A clean, performant UI built with Vite, supporting dark mode, dynamic theming, and a responsive layout.
Backend: FastAPI. Handles authentication, chat sessions, and codebase management, and serves LLM requests over secured routes.
LLM Runtime: Ollama (locally hosted models). Runs models such as Llama 3 and Mistral in a local, containerized environment and streams their responses.
Task Queue: ARQ. Runs background jobs such as codebase chunking, embedding, token counting, and asynchronous file processing, with Redis as the queue backend.
Realtime Engine: WebSockets (FastAPI). Streams chat messages, background task updates, and live system feedback.
Reverse Proxy: Caddy + Docker Compose. Caddy routes traffic across the containers (frontend, backend, Ollama), terminates TLS, and applies proxy rules for API calls; Docker Compose defines and runs the containers.
Embeddings + Storage: PostgreSQL + pgvector. Stores tokenized code chunks, embedding vectors, and model and chat metadata used for retrieval-augmented generation (RAG) prompts.
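
To make the chunking and retrieval steps above more concrete, here is a minimal sketch in plain Python. It is not DocGen.AI's actual code: the names `Chunk`, `chunk_lines`, `cosine_distance`, and `top_k` are hypothetical, the line-based chunking strategy and its default sizes are assumptions, and the in-memory ranking only stands in for what pgvector would do inside PostgreSQL (pgvector exposes cosine distance through its `<=>` operator).

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical record for one stored code chunk."""
    path: str               # source file the chunk came from
    text: str               # raw chunk text
    embedding: list[float]  # vector produced by an embedding model


def chunk_lines(source: str, size: int = 40, overlap: int = 5) -> list[str]:
    """Split text into windows of `size` lines that overlap by `overlap` lines."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    lines = source.splitlines()
    step = size - overlap
    chunks = []
    for start in range(0, len(lines), step):
        chunks.append("\n".join(lines[start:start + size]))
        if start + size >= len(lines):
            break  # this window already reached the end of the file
    return chunks


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def top_k(query: list[float], chunks: list[Chunk], k: int = 3) -> list[Chunk]:
    """Return the k chunks closest to the query embedding."""
    return sorted(chunks, key=lambda c: cosine_distance(query, c.embedding))[:k]
```

In a setup like the one described, a background job would call something like `chunk_lines` on each file, embed every chunk, and store the rows; at chat time the backend would embed the user's question and fetch the nearest chunks to build the RAG prompt. The overlap between windows is a common way to avoid cutting a function in half at a chunk boundary, but the right sizes depend on the embedding model's context limit.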