feat: add qdrant doc store backend#13486
JosefAschauer wants to merge 5 commits into infiniflow:main
Conversation
|
Related PRs for this workstream:
Those two fixes are intentionally submitted as separate PRs so they can be reviewed and merged independently of the Qdrant backend integration. |
|
Suggested review order:
The three PRs are related, but #13484 and #13485 are independent generic fixes. This PR is the Qdrant backend integration. |
|
Added upstream-facing Qdrant documentation to this PR:
This keeps the Qdrant-specific setup and operational notes in the repo instead of only in the PR discussion. |
|
RAGFlow relies heavily on keyword-based full-text search. I'm uncertain whether Qdrant offers capabilities comparable to Elasticsearch in this regard. Additionally, the use of sparse vector search differs fundamentally from traditional full-text search, which raises further questions about suitability. |
|
Thank you for the review — this is an important point and I want to address it directly. You're right: Qdrant's full-text capabilities are not equivalent to Elasticsearch's BM25. Sparse vector search (SPLADE-style) captures semantic term importance but doesn't replace ES's mature lexical features like language-specific analyzers, stemming, phrase matching, or the fine-grained BM25 scoring that RAGFlow's "multiple recall" pipeline depends on. I don't want to pretend otherwise.

The Qdrant backend is not intended to replace Elasticsearch for text-heavy RAG workloads. ES remains the right choice there, and this PR doesn't change anything about the ES path.

The motivation for Qdrant is a different use case: vision-based document retrieval using ColPali/ColQwen multi-vector embeddings. ColPali produces ~1,024 patch embeddings per page using late-interaction (MaxSim scoring), which requires native multi-vector support in the vector database. Neither Elasticsearch nor Infinity supports this. Qdrant does — it's one of very few databases with first-class multi-vector and MaxSim support.

The roadmap is:
The Vision template would appear as an additional dataset option alongside Naive, QA, Table, etc. — not a replacement for any existing template. Users working with visually rich documents (charts, diagrams, infographics, complex tables) would select it. Users with text-heavy documents would continue using ES with existing templates. To make the scope clearer, I'm happy to:
Would it help if I also opened an RFC issue describing the ColPali Vision template design? That would give more context on why multi-vector support matters and what the complete picture looks like. I have the RFC drafted and ready to post if that would be useful for the discussion. |
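To make the late-interaction discussion above concrete: MaxSim keeps one embedding per query token and one per page patch, and scores a page by summing, over query tokens, the similarity to the best-matching patch. This is an illustrative sketch of the scoring math, not RAGFlow or ColPali code:

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, patch_vecs: np.ndarray) -> float:
    """ColPali-style late-interaction (MaxSim) score.

    query_vecs: (num_query_tokens, dim) token embeddings
    patch_vecs: (num_patches, dim) page-patch embeddings (~1,024/page for ColPali)
    """
    sim = query_vecs @ patch_vecs.T       # pairwise token-patch similarities
    return float(sim.max(axis=1).sum())   # best patch per query token, summed

def rank_pages(query_vecs: np.ndarray, pages: list) -> list:
    """Return page indices ordered by MaxSim score, best first."""
    scores = [maxsim_score(query_vecs, p) for p in pages]
    return sorted(range(len(pages)), key=lambda i: scores[i], reverse=True)
```

Because the score is a max over patches per token, the database must store and compare all patch vectors per page, which is why first-class multi-vector support matters.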
|
Support for late-interaction models represents a key capability that RAGFlow plans to offer, serving as one of the distinguishing features of Infinity. |
|
Thanks for sharing the roadmap — it's great to hear that late-interaction retrieval is planned for Infinity. That's exactly the capability RAGFlow needs for visually rich documents. A couple of thoughts from a contributor and user perspective: Many RAGFlow users already run Qdrant or Milvus in their existing infrastructure. Adding a third-party vector database option would lower the barrier for adoption — users wouldn't need to migrate their vector infrastructure to try RAGFlow. This is similar to how RAGFlow already supports multiple LLM providers rather than locking users into one. |
|
Good catches. Pushed a cleanup:
Net: -273 lines, all 20 focused tests pass. |
|
Future work: the current implementation uses a single global DOC_ENGINE setting. A sensible next step would be per-KB engine routing:
That would require a routing layer above the current global DOC_ENGINE setting. |
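A per-KB routing layer of the kind described above could be sketched as a small factory that lazily creates one shared connection per engine. This is hypothetical: the class names, registry, and function are illustrative, not RAGFlow's actual APIs.

```python
# Hypothetical per-KB engine routing sketch. ESConnection, QdrantConnection,
# and doc_store_for are illustrative names, not actual RAGFlow classes.
class ESConnection:
    pass

class QdrantConnection:
    pass

_FACTORIES = {"elasticsearch": ESConnection, "qdrant": QdrantConnection}
_cache = {}

def doc_store_for(engine: str):
    """Return the shared connection for the engine a KB is configured to use."""
    if engine not in _cache:
        _cache[engine] = _FACTORIES[engine]()  # lazily create one per engine
    return _cache[engine]
```

Each knowledge base would then carry an engine field, and retrieval code would call the router instead of a module-level singleton.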
Add Qdrant as an upstream-supported RAGFlow document/vector backend with optional sparse hybrid retrieval.

Dense retrieval works by default using the existing embedding flow. Sparse hybrid retrieval (dense+sparse+RRF) activates only when a sparse embedding model is configured. Qdrant text matching uses tokenized payload matching, not Elasticsearch BM25 parity. Elasticsearch and Infinity behavior are unchanged.

Key changes:
- rag/utils/qdrant_conn.py: main adapter implementing DocStoreConnection
- rag/utils/sparse_vector.py: token-based sparse embedding model
- common/settings.py: DOC_ENGINE=qdrant routing and config
- rag/nlp/search.py: sparse query path with RRF fusion
- rag/llm/embedding_model.py: sparse encoding methods on Base class
- api/db/services/llm_service.py: LLMBundle sparse delegation
- api/db/services/dialog_service.py: SQL fallback disabled for Qdrant
- api/db/services/doc_metadata_service.py: generic docStoreConn usage
- Manual chunk paths (chunk_app, sdk/doc, document_service): sparse vectors
- rag/svr/task_executor.py: sparse vector generation during ingestion
- Docker: qdrant service, env vars, service_conf template
- Tests: 15 unit tests covering dense, hybrid, CRUD, and SQL fallback

Schema is left extensible for future multivector/late-interaction retrieval (ColPali/ColQwen). This commit does not implement ColPali/ColQwen.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
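The RRF fusion mentioned in the key changes merges the dense and sparse result lists by rank rather than by raw score, so the two scoring scales never need calibrating. A generic sketch (the constant k=60 is the common default from the RRF literature, not necessarily the value RAGFlow uses):

```python
def rrf_fuse(rankings: list, k: int = 60) -> list:
    """Fuse several ranked ID lists via reciprocal rank fusion.

    Each document's fused score is the sum of 1/(k + rank) over every
    list it appears in; higher fused score ranks first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document found by both the dense and the sparse retriever accumulates two reciprocal-rank contributions, so agreement between retrievers is rewarded even when their individual scores are incomparable.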
c087e8b to e9b2dab
|
Qdrant is not just "vector-only" anymore. It supports sparse vector indexing alongside dense vectors in the same collection, which means lexical scoring can be modeled there as sparse retrieval instead of relying on a separate inverted-index engine. That makes it a credible backend for RAGFlow’s primary chunk-retrieval path, especially because hybrid dense+sparse fusion is native. At the same time, I’m not claiming full Elasticsearch feature parity from this PR: Elasticsearch’s native lexical core is still inverted-index/BM25, while Qdrant reaches similar behavior through sparse vectors, and RAGFlow still has separate gaps such as SQL retrieval and message-store support outside the core retrieval path. |
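A token-based sparse embedding of the kind the PR's rag/utils/sparse_vector.py describes can be as simple as hashing tokens to stable indices with term-frequency weights; matching then becomes a dot product over shared indices. This is an illustrative simplification (the function names are mine, and real sparse models typically add IDF or learned term weights):

```python
import zlib
from collections import Counter

def sparse_encode(text: str) -> dict:
    """Map each token to a stable sparse index with a term-frequency weight."""
    counts = Counter(text.lower().split())
    return {zlib.crc32(tok.encode()): float(n) for tok, n in counts.items()}

def sparse_dot(a: dict, b: dict) -> float:
    """Lexical-style score: dot product over the token indices both share."""
    return sum(w * b[i] for i, w in a.items() if i in b)
```

This shows why sparse vectors can approximate lexical retrieval inside a vector database: only exact token overlaps contribute to the score, much like an inverted index, but stored and queried through the same collection as the dense vectors.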
|
I think the cleanest framing for #13486 is as an opt-in experimental backend addition rather than as a claim of full Elasticsearch parity. The code path is intentionally backward-compatible for existing backends, and the larger roadmap question is really about whether RAGFlow should support a Qdrant-first multimodal/Vision-RAG direction. I wrote up that larger argument separately as an RFC so the PR can stay focused on the technical review of the backend groundwork. |
Summary
Add Qdrant as an opt-in experimental document store backend for RAGFlow, including dense retrieval, optional sparse hybrid retrieval, Docker/config wiring, and follow-up hardening from live validation.
Included
DOC_ENGINE=qdrant
Follow-up Hardening Included In This PR
- available_int handled like the existing backend behavior during filtering
- Elasticsearch/Infinity error text replaced with the active backend name in affected ingestion paths
- qdrant service config inspection
Validation
python3 -m py_compile admin/server/config.py api/utils/health_utils.py rag/graphrag/general/index.py rag/graphrag/utils.py rag/svr/task_executor.py rag/utils/qdrant_conn.py test/unit_test/rag/utils/test_qdrant_conn.py test/unit_test/admin/server/test_config_qdrant.py
.venv/bin/python -m pytest test/unit_test/rag/utils/test_qdrant_conn.py test/unit_test/admin/server/test_config_qdrant.py
Notes
DOC_ENGINE=qdrant.