# DnD Campaign Chatbot

A self-hosted RAG (Retrieval-Augmented Generation) chatbot for querying D&D campaign notes written in Polish.
## Features
- Hybrid search: BM25 (keyword) + vector (semantic) with RRF fusion and optional cross-encoder reranking
- Contextual prefixes: LLM-generated chunk context for better retrieval (Anthropic's contextual retrieval technique)
- Auto-indexing: Watches content directory, incrementally indexes new/modified markdown files
- REST API: Fast context endpoint for external integrations (used by Nanobot Telegram bot)
- Web UI: Chainlit chat interface with streaming responses
- Polish language: Optimized for Polish content with wiki-link parsing
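The fusion step in hybrid search can be sketched in a few lines. Below is a minimal Reciprocal Rank Fusion (RRF) implementation, assuming two ranked lists of document IDs (one from BM25, one from vector search); the `k=60` constant is the value commonly used for RRF, and the function name is illustrative — the actual `retriever.py` may differ:

```python
def rrf_fuse(rankings, k=60):
    """Combine ranked lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["luskan", "thok", "session-15"]
vector_hits = ["thok", "luskan", "tavern"]
print(rrf_fuse([bm25_hits, vector_hits]))
```

Documents ranked highly by both retrievers float to the top even when neither retriever alone puts them first.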
## Architecture

```
Markdown files (dnd-summaries/content/)
        |
        v
   +--------+   contextual prefixes (LLM)   +--------+
   | ingest |-------------------------------| Ollama |
   +---+----+          embeddings           |(remote)|
       |                                    +--------+
       v
  +----------+
  | ChromaDB |  (vector store, port 8100 external / 8000 internal)
  +----+-----+
       |
       v
  +-----------+      hybrid search      +--------+
  |   chat    |------LLM response-------| Ollama |
  | (Chainlit |       (streaming)       |(remote)|
  | +FastAPI) |                         +--------+
  +-----------+
    port 7860
```
Three Docker Compose services:
| Service | Container | Port | Role |
|---|---|---|---|
| `chromadb` | `dnd-chromadb` | `8100:8000` | Vector store (persistent on NAS) |
| `ingest` | `dnd-ingest` | — | Indexes markdown, watches for changes |
| `chat` | `dnd-chat` | `7860:7860` | Web UI + REST API |
## Quick Start

### 1. Configure

```bash
cp .env.example .env
```

Edit `.env`:

```env
CONTENT_PATH=/path/to/your/markdown/content
OLLAMA_BASE_URL=http://your-ollama-host:11434
LLM_MODEL=SpeakLeash/bielik-11b-v3.0-instruct:Q4_K_M
EMBEDDING_MODEL=bge-m3
```

### 2. Start

```bash
docker compose up -d
```

### 3. Use
- Web UI: http://localhost:7860
- Context API: http://localhost:7860/api/context?q=your+question
- Query API: http://localhost:7860/api/query?q=your+question
- Health: http://localhost:7860/api/health
## Configuration

All settings via `.env` file:

| Variable | Default | Description |
|---|---|---|
| `CONTENT_PATH` | `./_content` | Path to markdown files |
| `OLLAMA_BASE_URL` | `http://ollama:11434` | Ollama server URL |
| `LLM_MODEL` | `gemma3:12b` | Model for chat and contextual prefixes |
| `EMBEDDING_MODEL` | `bge-m3` | Model for embeddings |
| `RETRIEVER_K` | `6` | Number of context chunks to retrieve |
| `CHUNK_SIZE` | `1000` | Text chunk size for indexing |
| `CHUNK_OVERLAP` | `200` | Overlap between chunks |
| `RERANKER_ENABLED` | `false` | Enable cross-encoder reranking (slower, more precise; useful for large corpora) |
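These settings boil down to plain environment-variable reads. A hypothetical sketch in the style of `config.py` — the variable names and defaults match the table above, but the real module may differ:

```python
import os

# Paths and endpoints
CONTENT_PATH = os.getenv("CONTENT_PATH", "./_content")
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://ollama:11434")

# Models
LLM_MODEL = os.getenv("LLM_MODEL", "gemma3:12b")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "bge-m3")

# Retrieval and chunking
RETRIEVER_K = int(os.getenv("RETRIEVER_K", "6"))
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "1000"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "200"))

# Booleans arrive as strings, so compare explicitly
RERANKER_ENABLED = os.getenv("RERANKER_ENABLED", "false").lower() == "true"
```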
## RAG Pipeline

1. Ingestion (`ingest.py`): Markdown → chunks → contextual prefixes via LLM (cached) → embeddings via `bge-m3` → ChromaDB
2. Retrieval (`retriever.py`): BM25 + vector search → RRF fusion → optional cross-encoder reranking (`bge-reranker-v2-m3`, off by default)
3. Generation (`chat.py`): Retrieved chunks as context → LLM streaming response
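The chunking in step 1 can be sketched as fixed-size splitting with overlap, using the `CHUNK_SIZE`/`CHUNK_OVERLAP` defaults. This is an illustrative version; the real `ingest.py` may use a library text splitter instead:

```python
def chunk_text(text, size=1000, overlap=200):
    """Split text into overlapping windows of `size` characters."""
    chunks, step = [], size - overlap
    # Stop once the remaining tail is already covered by the previous chunk
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + size])
    return chunks
```

With the defaults, each chunk shares its last 200 characters with the start of the next, so sentences cut at a boundary still appear whole in one of the two chunks.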
## API Endpoints

| Endpoint | Speed | Description |
|---|---|---|
| `GET /api/context?q=...` | ~1.5 s | RAG chunks only, no LLM; for external integrations |
| `GET /api/query?q=...` | ~30-40 s | Full LLM-generated answer with sources |
| `GET /api/health` | instant | Health check |
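The fast context endpoint can be called from Python with only the standard library. A hypothetical client — the endpoint path and `q` parameter come from the table above, but the shape of the JSON response is not specified here, so the function simply returns the parsed body:

```python
import json
import urllib.parse
import urllib.request

def get_context(question, base_url="http://localhost:7860"):
    """Fetch RAG chunks for a question from the /api/context endpoint."""
    query = urllib.parse.urlencode({"q": question})  # URL-encodes spaces etc.
    with urllib.request.urlopen(f"{base_url}/api/context?{query}") as resp:
        return json.load(resp)
```

Usage: `get_context("Kim jest Thok?")` returns the parsed JSON payload in ~1.5 s, which is fast enough to call synchronously from a Telegram bot handler.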
## Content Format

| Pattern | Type | Example |
|---|---|---|
| `DD.MM.YYYY.md` | Session notes | `15.11.2025.md` |
| `postacie/*.md` | Character | `Thok Darkhide.md` |
| `lokacje/*.md` | Location | `Luskan.md` |

Wiki-links (`[[Name]]` and `[[Name|Display]]`) are resolved during ingestion.
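Wiki-link resolution can be sketched with a single regular expression that keeps the display name when one is given. This is an illustrative version; the real `ingest.py` may resolve links differently (e.g. to file paths):

```python
import re

# [[Name]] or [[Name|Display]] — group 1 is the target, group 2 the display text
WIKI_LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

def resolve_wiki_links(text):
    """Replace wiki-links with their display name, falling back to the target."""
    return WIKI_LINK.sub(lambda m: m.group(2) or m.group(1), text)

print(resolve_wiki_links("Spotkanie z [[Thok Darkhide|Thokiem]] w [[Luskan]]"))
# → Spotkanie z Thokiem w Luskan
```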
## Commands

```bash
# Start
docker compose up -d

# Logs
docker compose logs -f chat
docker compose logs -f ingest

# Rebuild after code changes
docker compose build && docker compose up -d

# Force full reindex (delete the ChromaDB collection)
curl -X DELETE http://localhost:8100/api/v2/tenants/default_tenant/databases/default_database/collections/dnd_campaign
docker compose restart ingest

# Stop
docker compose down
```
## Development

```bash
# Run locally (needs Ollama + ChromaDB running)
uv run python ingest.py
uv run python chat.py

# Lint
uv run ruff check .
```
## License

MIT