feat: add qdrant doc store backend#13486
JosefAschauer wants to merge 5 commits into infiniflow:main
Conversation
|
Related PRs for this workstream:
Those two fixes are intentionally submitted as separate PRs so they can be reviewed and merged independently of the Qdrant backend integration. |
|
Suggested review order:
The three PRs are related, but #13484 and #13485 are independent generic fixes. This PR is the Qdrant backend integration. |
|
Added upstream-facing Qdrant documentation to this PR:
This keeps the Qdrant-specific setup and operational notes in the repo instead of only in the PR discussion. |
|
RAGFlow relies heavily on keyword-based full-text search. I'm uncertain whether Qdrant offers capabilities comparable to Elasticsearch in this regard. Additionally, the use of sparse vector search differs fundamentally from traditional full-text search, which raises further questions about suitability. |
|
Thank you for the review — this is an important point and I want to address it directly. You're right: Qdrant's full-text capabilities are not equivalent to Elasticsearch's BM25. Sparse vector search (SPLADE-style) captures semantic term importance but doesn't replace ES's mature lexical features like language-specific analyzers, stemming, phrase matching, or the fine-grained BM25 scoring that RAGFlow's "multiple recall" pipeline depends on. I don't want to pretend otherwise.

The Qdrant backend is not intended to replace Elasticsearch for text-heavy RAG workloads. ES remains the right choice there, and this PR doesn't change anything about the ES path.

The motivation for Qdrant is a different use case: vision-based document retrieval using ColPali/ColQwen multi-vector embeddings. ColPali produces ~1,024 patch embeddings per page using late-interaction (MaxSim scoring), which requires native multi-vector support in the vector database. Neither Elasticsearch nor Infinity supports this. Qdrant does — it's one of very few databases with first-class multi-vector and MaxSim support.

The roadmap is:
The Vision template would appear as an additional dataset option alongside Naive, QA, Table, etc. — not a replacement for any existing template. Users working with visually rich documents (charts, diagrams, infographics, complex tables) would select it. Users with text-heavy documents would continue using ES with existing templates. To make the scope clearer, I'm happy to:
Would it help if I also opened an RFC issue describing the ColPali Vision template design? That would give more context on why multi-vector support matters and what the complete picture looks like. I have the RFC drafted and ready to post if that would be useful for the discussion. |
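To make the late-interaction discussion above concrete: MaxSim keeps one embedding per query token and one per page patch, and scores a page by summing, over query tokens, the similarity to the best-matching patch. This is an illustrative sketch of the scoring math, not RAGFlow or ColPali code:

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, patch_vecs: np.ndarray) -> float:
    """ColPali-style late-interaction (MaxSim) score.

    query_vecs: (num_query_tokens, dim) token embeddings
    patch_vecs: (num_patches, dim) page-patch embeddings (~1,024/page for ColPali)
    """
    sim = query_vecs @ patch_vecs.T       # pairwise token-patch similarities
    return float(sim.max(axis=1).sum())   # best patch per query token, summed

def rank_pages(query_vecs: np.ndarray, pages: list) -> list:
    """Return page indices ordered by MaxSim score, best first."""
    scores = [maxsim_score(query_vecs, p) for p in pages]
    return sorted(range(len(pages)), key=lambda i: scores[i], reverse=True)
```

Because the score is a max over patches per token, the database must store and compare all patch vectors per page, which is why first-class multi-vector support matters.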
|
Support for late-interaction models represents a key capability that RAGFlow plans to offer, serving as one of the distinguishing features of Infinity. |
|
Thanks for sharing the roadmap — it's great to hear that late-interaction retrieval is planned for Infinity. That's exactly the capability RAGFlow needs for visually rich documents. A couple of thoughts from a contributor and user perspective: Many RAGFlow users already run Qdrant or Milvus in their existing infrastructure. Adding a third-party vector database option would lower the barrier for adoption — users wouldn't need to migrate their vector infrastructure to try RAGFlow. This is similar to how RAGFlow already supports multiple LLM providers rather than locking users into one. |
|
Good catches. Pushed a cleanup:
Net: -273 lines, all 20 focused tests pass. |
|
Future work: the current implementation uses a single global DOC_ENGINE setting. A sensible next step would be per-KB engine routing:
That would require a routing layer above the current global DOC_ENGINE setting. |
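A per-KB routing layer of the kind described above could be sketched as a small factory that lazily creates one shared connection per engine. This is hypothetical: the class names, registry, and function are illustrative, not RAGFlow's actual APIs.

```python
# Hypothetical per-KB engine routing sketch. ESConnection, QdrantConnection,
# and doc_store_for are illustrative names, not actual RAGFlow classes.
class ESConnection:
    pass

class QdrantConnection:
    pass

_FACTORIES = {"elasticsearch": ESConnection, "qdrant": QdrantConnection}
_cache = {}

def doc_store_for(engine: str):
    """Return the shared connection for the engine a KB is configured to use."""
    if engine not in _cache:
        _cache[engine] = _FACTORIES[engine]()  # lazily create one per engine
    return _cache[engine]
```

Each knowledge base would then carry an engine field, and retrieval code would call the router instead of a module-level singleton.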
Add Qdrant as an upstream-supported RAGFlow document/vector backend with optional sparse hybrid retrieval.

Dense retrieval works by default using the existing embedding flow. Sparse hybrid retrieval (dense+sparse+RRF) activates only when a sparse embedding model is configured. Qdrant text matching uses tokenized payload matching, not Elasticsearch BM25 parity. Elasticsearch and Infinity behavior are unchanged.

Key changes:
- rag/utils/qdrant_conn.py: main adapter implementing DocStoreConnection
- rag/utils/sparse_vector.py: token-based sparse embedding model
- common/settings.py: DOC_ENGINE=qdrant routing and config
- rag/nlp/search.py: sparse query path with RRF fusion
- rag/llm/embedding_model.py: sparse encoding methods on Base class
- api/db/services/llm_service.py: LLMBundle sparse delegation
- api/db/services/dialog_service.py: SQL fallback disabled for Qdrant
- api/db/services/doc_metadata_service.py: generic docStoreConn usage
- Manual chunk paths (chunk_app, sdk/doc, document_service): sparse vectors
- rag/svr/task_executor.py: sparse vector generation during ingestion
- Docker: qdrant service, env vars, service_conf template
- Tests: 15 unit tests covering dense, hybrid, CRUD, and SQL fallback

Schema is left extensible for future multivector/late-interaction retrieval (ColPali/ColQwen). This commit does not implement ColPali/ColQwen.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
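The RRF fusion mentioned in the key changes merges the dense and sparse result lists by rank rather than by raw score, so the two scoring scales never need calibrating. A generic sketch (the constant k=60 is the common default from the RRF literature, not necessarily the value RAGFlow uses):

```python
def rrf_fuse(rankings: list, k: int = 60) -> list:
    """Fuse several ranked ID lists via reciprocal rank fusion.

    Each document's fused score is the sum of 1/(k + rank) over every
    list it appears in; higher fused score ranks first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document found by both the dense and the sparse retriever accumulates two reciprocal-rank contributions, so agreement between retrievers is rewarded even when their individual scores are incomparable.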
c087e8b to e9b2dab
|
Qdrant is not just "vector-only" anymore. It supports sparse vector indexing alongside dense vectors in the same collection, which means lexical scoring can be modeled there as sparse retrieval instead of relying on a separate inverted-index engine. That makes it a credible backend for RAGFlow’s primary chunk-retrieval path, especially because hybrid dense+sparse fusion is native. At the same time, I’m not claiming full Elasticsearch feature parity from this PR: Elasticsearch’s native lexical core is still inverted-index/BM25, while Qdrant reaches similar behavior through sparse vectors, and RAGFlow still has separate gaps such as SQL retrieval and message-store support outside the core retrieval path. |
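A token-based sparse embedding of the kind the PR's rag/utils/sparse_vector.py describes can be as simple as hashing tokens to stable indices with term-frequency weights; matching then becomes a dot product over shared indices. This is an illustrative simplification (the function names are mine, and real sparse models typically add IDF or learned term weights):

```python
import zlib
from collections import Counter

def sparse_encode(text: str) -> dict:
    """Map each token to a stable sparse index with a term-frequency weight."""
    counts = Counter(text.lower().split())
    return {zlib.crc32(tok.encode()): float(n) for tok, n in counts.items()}

def sparse_dot(a: dict, b: dict) -> float:
    """Lexical-style score: dot product over the token indices both share."""
    return sum(w * b[i] for i, w in a.items() if i in b)
```

This shows why sparse vectors can approximate lexical retrieval inside a vector database: only exact token overlaps contribute to the score, much like an inverted index, but stored and queried through the same collection as the dense vectors.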
|
I think the cleanest framing for #13486 is as an opt-in experimental backend addition rather than as a claim of full Elasticsearch parity. The code path is intentionally backward-compatible for existing backends, and the larger roadmap question is really about whether RAGFlow should support a Qdrant-first multimodal/Vision-RAG direction. I wrote up that larger argument separately as an RFC so the PR can stay focused on the technical review of the backend groundwork. |
Summary
Add Qdrant as an opt-in experimental document store backend for RAGFlow, including dense retrieval, optional sparse hybrid retrieval, Docker/config wiring, and follow-up hardening from live validation.
Included
DOC_ENGINE=qdrant
Follow-up Hardening Included In This PR
- available_int handled like the existing backend behavior during filtering
- Elasticsearch/Infinity error text replaced with the active backend name in affected ingestion paths
- qdrant service config inspection
Validation
python3 -m py_compile admin/server/config.py api/utils/health_utils.py rag/graphrag/general/index.py rag/graphrag/utils.py rag/svr/task_executor.py rag/utils/qdrant_conn.py test/unit_test/rag/utils/test_qdrant_conn.py test/unit_test/admin/server/test_config_qdrant.py
.venv/bin/python -m pytest test/unit_test/rag/utils/test_qdrant_conn.py test/unit_test/admin/server/test_config_qdrant.py
Notes
DOC_ENGINE=qdrant.