# DnD Campaign Chatbot

A self-hosted RAG (Retrieval-Augmented Generation) chatbot for querying D&D campaign notes written in Polish.
## Features
- Hybrid search: BM25 (keyword) + vector (semantic) with RRF fusion and optional cross-encoder reranking
- Contextual prefixes: LLM-generated chunk context for better retrieval (Anthropic's contextual retrieval technique)
- Auto-indexing: Watches content directory, incrementally indexes new/modified markdown files
- REST API: Fast context endpoint for external integrations (used by Nanobot Telegram bot)
- Web UI: Chainlit chat interface with streaming responses
- Polish language: Optimized for Polish content with wiki-link parsing
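The fusion step in hybrid search can be sketched in a few lines. Below is a minimal Reciprocal Rank Fusion (RRF) implementation, assuming two ranked lists of document IDs (one from BM25, one from vector search); the `k=60` constant is the value commonly used for RRF, and the function name is illustrative — the actual `retriever.py` may differ:

```python
def rrf_fuse(rankings, k=60):
    """Combine ranked lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["luskan", "thok", "session-15"]
vector_hits = ["thok", "luskan", "tavern"]
print(rrf_fuse([bm25_hits, vector_hits]))
```

Documents ranked highly by both retrievers float to the top even when neither retriever alone puts them first.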
## Architecture

```
Markdown files (dnd-summaries/content/)
        |
        v
   +--------+   contextual prefixes (LLM)   +--------+
   | ingest |-------------------------------| Ollama |
   +---+----+          embeddings           |(remote)|
       |                                    +--------+
       v
  +----------+
  | ChromaDB |  (vector store, port 8100 external / 8000 internal)
  +----+-----+
       |
       v
  +-----------+      hybrid search      +--------+
  |   chat    |------LLM response-------| Ollama |
  | (Chainlit |       (streaming)       |(remote)|
  | +FastAPI) |                         +--------+
  +-----------+
    port 7860
```
Three Docker Compose services:
| Service | Container | Port | Role |
|---|---|---|---|
| `chromadb` | `dnd-chromadb` | `8100:8000` | Vector store (persistent on NAS) |
| `ingest` | `dnd-ingest` | — | Indexes markdown, watches for changes |
| `chat` | `dnd-chat` | `7860:7860` | Web UI + REST API |
## Quick Start

### 1. Configure

```bash
cp .env.example .env
```

Edit `.env`:

```env
CONTENT_PATH=/path/to/your/markdown/content
OLLAMA_BASE_URL=http://your-ollama-host:11434
LLM_MODEL=SpeakLeash/bielik-11b-v3.0-instruct:Q4_K_M
EMBEDDING_MODEL=bge-m3
```

### 2. Start

```bash
docker compose up -d
```

### 3. Use
- Web UI: http://localhost:7860
- Context API: http://localhost:7860/api/context?q=your+question
- Query API: http://localhost:7860/api/query?q=your+question
- Health: http://localhost:7860/api/health
## Configuration

All settings via `.env` file:

| Variable | Default | Description |
|---|---|---|
| `CONTENT_PATH` | `./_content` | Path to markdown files |
| `OLLAMA_BASE_URL` | `http://ollama:11434` | Ollama server URL |
| `LLM_MODEL` | `gemma3:12b` | Model for chat and contextual prefixes |
| `EMBEDDING_MODEL` | `bge-m3` | Model for embeddings |
| `RETRIEVER_K` | `6` | Number of context chunks to retrieve |
| `CHUNK_SIZE` | `1000` | Text chunk size for indexing |
| `CHUNK_OVERLAP` | `200` | Overlap between chunks |
| `RERANKER_ENABLED` | `false` | Enable cross-encoder reranking (slower, more precise; useful for large corpora) |
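These settings boil down to plain environment-variable reads. A hypothetical sketch in the style of `config.py` — the variable names and defaults match the table above, but the real module may differ:

```python
import os

# Paths and endpoints
CONTENT_PATH = os.getenv("CONTENT_PATH", "./_content")
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://ollama:11434")

# Models
LLM_MODEL = os.getenv("LLM_MODEL", "gemma3:12b")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "bge-m3")

# Retrieval and chunking
RETRIEVER_K = int(os.getenv("RETRIEVER_K", "6"))
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "1000"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "200"))

# Booleans arrive as strings, so compare explicitly
RERANKER_ENABLED = os.getenv("RERANKER_ENABLED", "false").lower() == "true"
```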
## RAG Pipeline

1. Ingestion (`ingest.py`): Markdown → chunks → contextual prefixes via LLM (cached) → embeddings via `bge-m3` → ChromaDB
2. Retrieval (`retriever.py`): BM25 + vector search → RRF fusion → optional cross-encoder reranking (`bge-reranker-v2-m3`, off by default)
3. Generation (`chat.py`): Retrieved chunks as context → LLM streaming response
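The chunking in step 1 can be sketched as fixed-size splitting with overlap, using the `CHUNK_SIZE`/`CHUNK_OVERLAP` defaults. This is an illustrative version; the real `ingest.py` may use a library text splitter instead:

```python
def chunk_text(text, size=1000, overlap=200):
    """Split text into overlapping windows of `size` characters."""
    chunks, step = [], size - overlap
    # Stop once the remaining tail is already covered by the previous chunk
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + size])
    return chunks
```

With the defaults, each chunk shares its last 200 characters with the start of the next, so sentences cut at a boundary still appear whole in one of the two chunks.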
## API Endpoints

| Endpoint | Speed | Description |
|---|---|---|
| `GET /api/context?q=...` | ~1.5 s | RAG chunks only, no LLM; for external integrations |
| `GET /api/query?q=...` | ~30-40 s | Full LLM-generated answer with sources |
| `GET /api/health` | instant | Health check |
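The fast context endpoint can be called from Python with only the standard library. A hypothetical client — the endpoint path and `q` parameter come from the table above, but the shape of the JSON response is not specified here, so the function simply returns the parsed body:

```python
import json
import urllib.parse
import urllib.request

def get_context(question, base_url="http://localhost:7860"):
    """Fetch RAG chunks for a question from the /api/context endpoint."""
    query = urllib.parse.urlencode({"q": question})  # URL-encodes spaces etc.
    with urllib.request.urlopen(f"{base_url}/api/context?{query}") as resp:
        return json.load(resp)
```

Usage: `get_context("Kim jest Thok?")` returns the parsed JSON payload in ~1.5 s, which is fast enough to call synchronously from a Telegram bot handler.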
## Content Format

| Pattern | Type | Example |
|---|---|---|
| `DD.MM.YYYY.md` | Session notes | `15.11.2025.md` |
| `postacie/*.md` | Character | `Thok Darkhide.md` |
| `lokacje/*.md` | Location | `Luskan.md` |

Wiki-links (`[[Name]]` and `[[Name|Display]]`) are resolved during ingestion.
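Wiki-link resolution can be sketched with a single regular expression that keeps the display name when one is given. This is an illustrative version; the real `ingest.py` may resolve links differently (e.g. to file paths):

```python
import re

# [[Name]] or [[Name|Display]] — group 1 is the target, group 2 the display text
WIKI_LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

def resolve_wiki_links(text):
    """Replace wiki-links with their display name, falling back to the target."""
    return WIKI_LINK.sub(lambda m: m.group(2) or m.group(1), text)

print(resolve_wiki_links("Spotkanie z [[Thok Darkhide|Thokiem]] w [[Luskan]]"))
# → Spotkanie z Thokiem w Luskan
```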
## Commands

```bash
# Start
docker compose up -d

# Logs
docker compose logs -f chat
docker compose logs -f ingest

# Rebuild after code changes
docker compose build && docker compose up -d

# Force full reindex (delete the ChromaDB collection)
curl -X DELETE http://localhost:8100/api/v2/tenants/default_tenant/databases/default_database/collections/dnd_campaign
docker compose restart ingest

# Stop
docker compose down
```
## Development

```bash
# Run locally (needs Ollama + ChromaDB running)
uv run python ingest.py
uv run python chat.py

# Lint
uv run ruff check .
```
## License

MIT