DnD Campaign Chatbot

A self-hosted RAG (Retrieval-Augmented Generation) chatbot for querying D&D campaign notes written in Polish.

Features

  • Hybrid search: BM25 (keyword) + vector (semantic) with RRF fusion and optional cross-encoder reranking
  • Contextual prefixes: LLM-generated chunk context prepended before embedding for better retrieval (Anthropic's contextual-retrieval technique); see the sketch after this list
  • Auto-indexing: Watches content directory, incrementally indexes new/modified markdown files
  • REST API: Fast context endpoint for external integrations (used by Nanobot Telegram bot)
  • Web UI: Chainlit chat interface with streaming responses
  • Polish language: Optimized for Polish content with wiki-link parsing
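
A minimal sketch of the contextual-prefix idea, calling Ollama's /api/generate endpoint directly. The prompt wording and helper name are illustrative, not the exact implementation in contextual.py:

import requests

OLLAMA_BASE_URL = "http://your-ollama-host:11434"   # matches OLLAMA_BASE_URL in .env
LLM_MODEL = "gemma3:12b"                             # default LLM_MODEL

def contextual_prefix(document: str, chunk: str) -> str:
    """Ask the LLM for one short sentence situating `chunk` within `document`."""
    prompt = (
        "Here is a document:\n"
        f"{document}\n\n"
        "Here is a chunk from that document:\n"
        f"{chunk}\n\n"
        "Write one short sentence describing where this chunk fits in the document, "
        "to improve search retrieval. Answer with the sentence only."
    )
    resp = requests.post(
        f"{OLLAMA_BASE_URL}/api/generate",
        json={"model": LLM_MODEL, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"].strip()

# The prefix is prepended to the chunk text before embedding, e.g.:
# embedded_text = contextual_prefix(doc_text, chunk_text) + "\n" + chunk_text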

Architecture

Markdown files (dnd-summaries/content/)
        |
        v
   +----------+   contextual        +----------+
   |  ingest  |---prefixes (LLM)----|  Ollama  |
   |          |   embeddings        | (remote) |
   +----+-----+                     +----------+
        |
        v
   +----------+
   | ChromaDB |  (vector store, port 8100 external / 8000 internal)
   +----+-----+
        |
        v
   +----------+   hybrid search     +----------+
   |   chat   |---LLM response------|  Ollama  |
   |(Chainlit |   (streaming)       | (remote) |
   |+FastAPI) |                     +----------+
   +----------+
     port 7860

Three Docker Compose services:

Service    Container      Port        Role
chromadb   dnd-chromadb   8100:8000   Vector store (persistent on NAS)
ingest     dnd-ingest     -           Indexes markdown, watches for changes
chat       dnd-chat       7860:7860   Web UI + REST API
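
Once the stack is up, a quick sanity check from the host (ports as in the table above; the ChromaDB heartbeat path is an assumption about a release serving the v2 API, consistent with the reindex URL under Commands):

import requests

# Chat service health endpoint (listed under API Endpoints below)
r = requests.get("http://localhost:7860/api/health", timeout=5)
print("chat:", r.status_code)        # expect 200

# ChromaDB heartbeat; assumes the v2 HTTP API is available
r = requests.get("http://localhost:8100/api/v2/heartbeat", timeout=5)
print("chromadb:", r.status_code)    # expect 200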

Quick Start

1. Configure

cp .env.example .env

Edit .env:

CONTENT_PATH=/path/to/your/markdown/content
OLLAMA_BASE_URL=http://your-ollama-host:11434
LLM_MODEL=SpeakLeash/bielik-11b-v3.0-instruct:Q4_K_M
EMBEDDING_MODEL=bge-m3

2. Start

docker compose up -d

3. Use

Open http://localhost:7860 in a browser for the Chainlit chat UI, or call the REST API described under API Endpoints below.

Configuration

All settings via .env file:

Variable           Default               Description
CONTENT_PATH       ./_content            Path to markdown files
OLLAMA_BASE_URL    http://ollama:11434   Ollama server URL
LLM_MODEL          gemma3:12b            Model for chat and contextual prefixes
EMBEDDING_MODEL    bge-m3                Model for embeddings
RETRIEVER_K        6                     Number of context chunks to retrieve
CHUNK_SIZE         1000                  Text chunk size for indexing
CHUNK_OVERLAP      200                   Overlap between chunks
RERANKER_ENABLED   false                 Enable cross-encoder reranking (slower but more precise; useful for large corpora)
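
A sketch of how these variables can be read in Python; config.py may structure this differently, and under docker compose the values arrive via the container environment rather than a .env file:

import os

from dotenv import load_dotenv  # python-dotenv, assumed here for local runs only

load_dotenv()

CONTENT_PATH = os.getenv("CONTENT_PATH", "./_content")
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://ollama:11434")
LLM_MODEL = os.getenv("LLM_MODEL", "gemma3:12b")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "bge-m3")
RETRIEVER_K = int(os.getenv("RETRIEVER_K", "6"))
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "1000"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "200"))
RERANKER_ENABLED = os.getenv("RERANKER_ENABLED", "false").lower() == "true"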

RAG Pipeline

  1. Ingestion (ingest.py): Markdown → chunks → contextual prefixes via LLM (cached) → embeddings via bge-m3 → ChromaDB
  2. Retrieval (retriever.py): BM25 + vector search → RRF fusion (see the sketch after this list) → optional cross-encoder reranking (bge-reranker-v2-m3, off by default)
  3. Generation (chat.py): Retrieved chunks as context → LLM streaming response
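
A minimal sketch of Reciprocal Rank Fusion as used in step 2. The constant k=60 is the common default from the original RRF paper; the function is illustrative rather than a copy of retriever.py:

def rrf_fuse(bm25_ids: list[str], vector_ids: list[str], k: int = 60, top_n: int = 6) -> list[str]:
    """Merge two ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in, so chunks
    ranked highly by either BM25 or vector search float to the top.
    """
    scores: dict[str, float] = {}
    for ranking in (bm25_ids, vector_ids):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    fused = sorted(scores, key=scores.get, reverse=True)
    return fused[:top_n]   # top_n corresponds to RETRIEVER_K

# Optional reranking would re-score these top_n chunks against the query with a
# cross-encoder (bge-reranker-v2-m3) before they are passed to the LLM.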

API Endpoints

Endpoint                 Speed     Description
GET /api/context?q=...   ~1.5s     RAG chunks only, no LLM call; for external integrations
GET /api/query?q=...     ~30-40s   Full LLM-generated answer with sources
GET /api/health          instant   Health check
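
For example, fetching context chunks from Python (the exact JSON shape of the response is not documented here, so inspect what your deployment returns):

import requests

resp = requests.get(
    "http://localhost:7860/api/context",
    params={"q": "Who is Thok Darkhide?"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())   # retrieved chunks; shape depends on the chat service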

Content Format

Pattern         Type            Example
DD.MM.YYYY.md   Session notes   15.11.2025.md
postacie/*.md   Character       Thok Darkhide.md
lokacje/*.md    Location        Luskan.md

Wiki-links ([[Name]] and [[Name|Display]]) are resolved during ingestion.
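
A small illustration of wiki-link resolution; the regex and replacement policy are an assumption about how ingest.py handles it (it may also map targets to file paths):

import re

WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

def resolve_wikilinks(text: str) -> str:
    """Replace [[Name]] with Name and [[Name|Display]] with Display."""
    return WIKILINK.sub(lambda m: m.group(2) or m.group(1), text)

print(resolve_wikilinks("Meeting with [[Thok Darkhide|Thok]] in [[Luskan]]."))
# -> Meeting with Thok in Luskan.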

Commands

# Start
docker compose up -d

# Logs
docker compose logs -f chat
docker compose logs -f ingest

# Rebuild after code changes
docker compose build && docker compose up -d

# Force full reindex (delete ChromaDB collection)
curl -X DELETE http://localhost:8100/api/v2/tenants/default_tenant/databases/default_database/collections/dnd_campaign
docker compose restart ingest

# Stop
docker compose down

Development

# Run locally (needs Ollama + ChromaDB running)
uv run python ingest.py
uv run python chat.py

# Lint
uv run ruff check .

License

MIT