feat: migrate ChromaDB to HTTP server mode #1

Merged
fr0stykiller merged 2 commits from feat/chroma-server into main 2026-03-13 23:52:34 +01:00

Summary

  • Replace the local persistent-directory ChromaDB with a dedicated chromadb/chroma container exposing an HTTP API
  • Switch ingest.py and chat.py to use chromadb.HttpClient instead of a local directory
  • Expose ChromaDB on host port 8100 for external access
  • Simplify entrypoint.sh and config.py

Test plan

  • [x] ChromaDB server starts and accepts connections
  • [x] Ingest service indexes 77 chunks from 16 files
  • [x] Chat service connects and serves Gradio UI
  • [x] Verify chat response quality
Replace local persistent directory ChromaDB with a dedicated chromadb/chroma
container exposing HTTP API. This enables external services (like Nanobot)
to query the vector store over the network.

Changes:
- Add chromadb service to docker-compose
- Switch ingest.py and chat.py to use chromadb.HttpClient
- Update config.py: replace CHROMA_DIR with CHROMA_URL
- Simplify entrypoint.sh with ChromaDB health wait
- Remove chroma_data volume mount from ingest/chat containers
- Expose ChromaDB on host port 8100
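
The move from CHROMA_DIR to CHROMA_URL could look roughly like the sketch below. It is an illustration only: the helper names `parse_chroma_url` and `get_client` and the default URL `http://chromadb:8000` (compose service name plus the container's usual port) are assumptions, not taken from the repository. The only piece named in the change list is `chromadb.HttpClient`, which accepts `host`, `port`, and `ssl` arguments.

```python
from __future__ import annotations

import os
from urllib.parse import urlparse


def parse_chroma_url(url: str) -> tuple[str, int, bool]:
    """Split a URL such as http://chromadb:8000 into (host, port, use_ssl)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        # A bare filesystem path (the old CHROMA_DIR style) ends up here.
        raise ValueError(f"unsupported CHROMA_URL: {url!r}")
    use_ssl = parsed.scheme == "https"
    port = parsed.port or (443 if use_ssl else 80)
    return parsed.hostname, port, use_ssl


def get_client(url: str | None = None):
    """Return a chromadb.HttpClient for `url`, or for CHROMA_URL from the environment."""
    import chromadb  # imported lazily so the URL helper works without the package

    # Assumed default: compose service name "chromadb" on container port 8000.
    url = url or os.environ.get("CHROMA_URL", "http://chromadb:8000")
    host, port, use_ssl = parse_chroma_url(url)
    return chromadb.HttpClient(host=host, port=port, ssl=use_ssl)
```

From the host, an external service would presumably call `get_client("http://localhost:8100")` to reach the published port mentioned above.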

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three RAG quality improvements:

1. Hybrid search (BM25 + Vector + RRF fusion)
   - BM25 catches exact keyword matches (proper nouns, names)
   - Vector search catches semantic similarity
   - RRF merges both rankings

2. Contextual chunk prefixes (Anthropic's Contextual Retrieval technique)
   - LLM generates context prefix per chunk at ingest time
   - Prefix describes: source document, characters, locations, events
   - Anthropic reports ~49% fewer failed retrievals when contextual
     embeddings are combined with contextual BM25

3. Cross-encoder reranking (bge-reranker-v2-m3)
   - Retrieves top-20 candidates, reranks to top-k
   - Cross-encoder scores (query, doc) pairs together
   - Typically more accurate than cosine similarity alone, at the cost
     of one model pass per candidate
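
A minimal sketch of how items 1 and 3 fit together, under stated assumptions: `rrf_fuse` implements standard Reciprocal Rank Fusion with the commonly used constant k = 60, and `rerank` takes the scoring function as a parameter so the example runs without downloading a model. The function names, the `docs` mapping, and the `top_k` default are hypothetical and not taken from retriever.py.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists with RRF: score(d) = sum of 1 / (k + rank), rank from 1."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def rerank(
    query: str,
    docs: dict[str, str],
    candidates: Sequence[str],
    score_pairs: Callable[[list[tuple[str, str]]], Sequence[float]],
    top_k: int = 5,
) -> list[str]:
    """Score each (query, document) pair jointly and keep the top_k candidate IDs."""
    pairs = [(query, docs[doc_id]) for doc_id in candidates]
    scores = score_pairs(pairs)
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]


# Example: fuse a BM25 ranking with a vector ranking, then rerank with a stub scorer.
bm25_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = rrf_fuse([bm25_hits, vector_hits])
```

With sentence-transformers installed, `score_pairs` could be the `predict` method of `CrossEncoder("BAAI/bge-reranker-v2-m3")`, and `candidates` would be the first 20 fused IDs.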

New files: retriever.py, contextual.py
New deps: rank-bm25, sentence-transformers
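
The contextual prefix step from item 2 might be sketched as follows. Everything here is an assumption for illustration: the prompt wording, the function name `contextualize_chunks`, and the fallback prefix are hypothetical, and the LLM call is passed in as a plain callable (`generate`) rather than tied to a specific provider.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical prompt; the actual wording used in contextual.py is not shown in this PR.
PROMPT_TEMPLATE = (
    "<document>\n{document}\n</document>\n"
    "Here is a chunk from that document:\n<chunk>\n{chunk}\n</chunk>\n"
    "Write one or two sentences situating this chunk in the document: "
    "name the source, characters, locations, and events it involves. "
    "Answer with the context only."
)


def contextualize_chunks(
    document: str,
    chunks: list[str],
    generate: Callable[[str], str],
    source_name: str,
) -> list[str]:
    """Prepend an LLM-written context prefix to each chunk before indexing."""
    contextualized = []
    for chunk in chunks:
        prompt = PROMPT_TEMPLATE.format(document=document, chunk=chunk)
        prefix = generate(prompt).strip()
        if not prefix:
            # Fall back to a plain source note if the model returns nothing.
            prefix = f"From {source_name}."
        contextualized.append(f"{prefix}\n\n{chunk}")
    return contextualized
```

The prefixed text would then be what gets embedded and BM25-indexed at ingest time, so both retrieval paths see the added context.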

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fr0stykiller force-pushed feat/chroma-server from 8570b3384e to 30bc7b927b 2026-03-13 23:16:37 +01:00 Compare
fr0stykiller deleted branch feat/chroma-server 2026-03-13 23:52:34 +01:00
Reference
fr0stykiller/dnd-chatbot!1