
feat: add qdrant doc store backend #13486

Open
JosefAschauer wants to merge 5 commits into infiniflow:main from JosefAschauer:feat/qdrant-docstore-pr

Conversation

@JosefAschauer
Contributor

@JosefAschauer JosefAschauer commented Mar 9, 2026

Summary

Add Qdrant as an opt-in experimental document store backend for RAGFlow, including dense retrieval, optional sparse hybrid retrieval, Docker/config wiring, and follow-up hardening from live validation.

Included

  • Qdrant document-store adapter with interface parity for the current storage contract
  • dense retrieval and optional sparse hybrid retrieval with RRF fusion
  • factory/settings/config/Docker wiring for DOC_ENGINE=qdrant
  • Qdrant-aware admin health/config integration
  • live-validation hardening for Qdrant point IDs, availability defaults, and backend-specific error reporting
  • unit coverage for adapter behavior and search flow
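For reviewers less familiar with RRF, the fusion step mentioned above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function name `rrf_fuse`, the constant `k=60`, and the chunk IDs are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists with Reciprocal Rank Fusion.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


dense_hits = ["c1", "c2", "c3"]   # hypothetical dense-retrieval order
sparse_hits = ["c3", "c1", "c4"]  # hypothetical sparse-retrieval order
fused = rrf_fuse([dense_hits, sparse_hits])
```

Because RRF only looks at ranks, it does not need the dense and sparse scores to be on a comparable scale, which is why it is a common choice for this kind of hybrid search.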

Follow-up Hardening Included In This PR

  • map legacy string chunk IDs to valid Qdrant point UUIDs while preserving the external RAGFlow IDs
  • treat missing available_int like the existing backend behavior during filtering
  • replace hardcoded Elasticsearch/Infinity error text with the active backend name in affected ingestion paths
  • add admin/server support for qdrant service config inspection
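Qdrant point IDs must be unsigned integers or UUIDs, which is why legacy string chunk IDs need a mapping. One way to do this is sketched below, assuming a deterministic UUIDv5 derivation. The namespace choice and the helper names `to_point_id` and `to_point` are hypothetical and are not taken from the PR.

```python
import uuid

# Assumed namespace: any fixed UUID works as long as it never changes.
POINT_ID_NAMESPACE = uuid.NAMESPACE_URL


def to_point_id(chunk_id: str) -> str:
    """Derive a stable UUID string from an arbitrary chunk ID."""
    return str(uuid.uuid5(POINT_ID_NAMESPACE, chunk_id))


def to_point(chunk_id: str, payload: dict) -> dict:
    """Build a point-like dict that keeps the external ID in the payload."""
    # The original RAGFlow ID is stored alongside the data so callers
    # can keep using it unchanged when results are read back.
    return {"id": to_point_id(chunk_id), "payload": {**payload, "id": chunk_id}}


point = to_point("legacy-chunk-42", {"content": "example text"})
```

A deterministic mapping means re-indexing the same chunk overwrites the same point instead of creating a duplicate.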

Validation

  • python3 -m py_compile admin/server/config.py api/utils/health_utils.py rag/graphrag/general/index.py rag/graphrag/utils.py rag/svr/task_executor.py rag/utils/qdrant_conn.py test/unit_test/rag/utils/test_qdrant_conn.py test/unit_test/admin/server/test_config_qdrant.py
  • .venv/bin/python -m pytest test/unit_test/rag/utils/test_qdrant_conn.py test/unit_test/admin/server/test_config_qdrant.py
  • live manual validation against a running Qdrant container for indexing, chunk browsing, and chat retrieval

Notes

  • This PR is intentionally scoped as an opt-in experimental backend addition, not as a claim of full Elasticsearch feature parity.
  • Existing Elasticsearch/Infinity behavior is unchanged unless users explicitly select DOC_ENGINE=qdrant.
  • The broader Qdrant-first multimodal / Vision-RAG direction is a separate roadmap discussion; this PR is only the backend groundwork for that path.
  • The generic chat retrieval fix and the generic chunk raw-content fix are submitted separately in dedicated PRs.

@dosubot dosubot bot added labels Mar 9, 2026: size:XXL (This PR changes 1000+ lines, ignoring generated files.), 🐖api (The modified files are located under directory 'api/apps/sdk'), 💞 feature (Feature request, pull request that fulfills a new feature.)
@JosefAschauer
Contributor Author

Related PRs for this workstream:

Those two fixes are intentionally submitted as separate PRs so they can be reviewed and merged independently of the Qdrant backend integration.

@JosefAschauer
Contributor Author

Suggested review order:

  1. fix: avoid empty doc filter in knowledge retrieval #13484
  2. fix: preserve raw chunk content in chunk APIs #13485
  3. this PR

The three PRs are related, but #13484 and #13485 are independent generic fixes. This PR is the Qdrant backend integration.

@JosefAschauer
Contributor Author

Added upstream-facing Qdrant documentation to this PR:

  • QDRANT_README.md documents the backend design, dense and sparse setup, validation steps, troubleshooting, current limitations, and future multivector extension points.
  • README.md now links to it from the doc-engine/configuration section and the documentation list.

This keeps the Qdrant-specific setup and operational notes in the repo instead of only in the PR discussion.

@yingfeng
Member

RAGFlow relies heavily on keyword-based full-text search. I'm uncertain whether Qdrant offers capabilities comparable to Elasticsearch in this regard. Additionally, the use of sparse vector search differs fundamentally from traditional full-text search, which raises further questions about suitability.

@JosefAschauer
Contributor Author

Thank you for the review — this is an important point and I want to address it directly.

You're right: Qdrant's full-text capabilities are not equivalent to Elasticsearch's BM25. Sparse vector search (SPLADE-style) captures semantic term importance but doesn't replace ES's mature lexical features like language-specific analyzers, stemming, phrase matching, or the fine-grained BM25 scoring that RAGFlow's "multiple recall" pipeline depends on. I don't want to pretend otherwise.

The Qdrant backend is not intended to replace Elasticsearch for text-heavy RAG workloads. ES remains the right choice there, and this PR doesn't change anything about the ES path.

The motivation for Qdrant is a different use case: vision-based document retrieval using ColPali/ColQwen multi-vector embeddings. ColPali produces ~1,024 patch embeddings per page using late-interaction (MaxSim scoring), which requires native multi-vector support in the vector database. Neither Elasticsearch nor Infinity supports this. Qdrant does — it's one of very few databases with first-class multi-vector and MaxSim support.
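To make the late-interaction idea concrete, MaxSim scoring can be sketched in a few lines. This is a toy illustration with two-dimensional vectors, not ColPali or Qdrant code. The function names and the example vectors are assumptions.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim(query_vecs, doc_vecs):
    """Late-interaction score.

    For each query vector, take the similarity of its best-matching
    document vector, then sum those maxima over the whole query.
    """
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


query = [[1.0, 0.0], [0.0, 1.0]]             # two toy query token vectors
page = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # three toy page patch vectors
score = maxsim(query, page)
```

The key point is that a page is represented by many vectors rather than one, so the store has to keep and compare whole sets of vectors per document.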

The roadmap is:

  1. This PR: Qdrant as a dense vector backend with optional sparse hybrid search (current)
  2. Follow-up: A "Vision" dataset template that uses ColPali for visual document retrieval, storing multi-vector patch embeddings in Qdrant

The Vision template would appear as an additional dataset option alongside Naive, QA, Table, etc. — not a replacement for any existing template. Users working with visually rich documents (charts, diagrams, infographics, complex tables) would select it. Users with text-heavy documents would continue using ES with existing templates.

To make the scope clearer, I'm happy to:

  • Add explicit documentation that the Qdrant backend prioritizes vector retrieval and does NOT offer full BM25 equivalence
  • Mark the Qdrant backend as experimental/beta in the UI
  • Add a note in the dataset configuration guiding users on when to choose Qdrant vs ES

Would it help if I also opened an RFC issue describing the ColPali Vision template design? That would give more context on why multi-vector support matters and what the complete picture looks like.

I have the RFC drafted and ready to post if that would be useful for the discussion.

@yingfeng
Member

Support for late-interaction models is a key capability that RAGFlow plans to offer, and it is one of the distinguishing features of Infinity.
Actually, a production-ready implementation of late-interaction-based cross-modal retrieval requires a series of configuration choices, such as whether to use full-text + late interaction as hybrid retrieval, or full-text + dense retrieval with a late-interaction-based reranker, etc.
Therefore, if what you want is multi-vector-based cross-modal retrieval, just wait~

@JosefAschauer
Contributor Author

Thanks for sharing the roadmap — it's great to hear that late-interaction retrieval is planned for Infinity. That's exactly the capability RAGFlow needs for visually rich documents.

A couple of thoughts from a contributor and user perspective:

Many RAGFlow users already run Qdrant or Milvus in their existing infrastructure. Adding a third-party vector database option would lower the barrier for adoption — users wouldn't need to migrate their vector infrastructure to try RAGFlow. This is similar to how RAGFlow already supports multiple LLM providers rather than locking users into one.

@JosefAschauer
Contributor Author

Good catches. Pushed a cleanup:

  • Removed the hand-rolled fake Qdrant type system (~350 lines) and the sys.modules import hack. Tests now use real qdrant_client.models with a thin in-memory client mock.
  • Renamed MULTIVECTOR_PLACEHOLDER_NAME to MULTIVECTOR_VECTOR_NAME.
  • Trimmed the ColPali/ColQwen PR hook comment blocks to single-line future markers.
  • Removed unused imports and dead assignments.

Net: -273 lines, all 20 focused tests pass.

@JosefAschauer
Contributor Author

Future work: the current implementation uses a single global DOC_ENGINE, so it is not yet possible to mix Elasticsearch for text-heavy KBs and Qdrant for visual/multivector KBs in one deployment.

A sensible next step would be per-KB engine routing:

  • text KBs on Elasticsearch or Infinity for strong full-text retrieval
  • visual KBs on Qdrant for multivector retrieval

That would require a routing layer above the current global settings.docStoreConn / settings.retriever model, plus KB-level backend metadata and mixed-engine retrieval handling. This PR does not implement that, but it can be a useful input to that later design.
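A rough sketch of what such a routing layer might look like is below. Everything in it is hypothetical: the `KnowledgeBase` class, its `doc_engine` field, and `EngineRouter` do not exist in RAGFlow, and plain strings stand in for real connection objects.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class KnowledgeBase:
    kb_id: str
    doc_engine: Optional[str] = None  # hypothetical per-KB backend override


class EngineRouter:
    """Pick a doc-store connection per KB, falling back to a global default."""

    def __init__(self, connections: Dict[str, object], default_engine: str):
        if default_engine not in connections:
            raise ValueError(f"default engine {default_engine!r} is not configured")
        self._connections = connections
        self._default = default_engine

    def for_kb(self, kb: KnowledgeBase) -> object:
        # A KB without an override behaves as it does today (global engine).
        engine = kb.doc_engine or self._default
        if engine not in self._connections:
            raise ValueError(f"unknown doc engine {engine!r} for KB {kb.kb_id}")
        return self._connections[engine]


router = EngineRouter(
    {"elasticsearch": "es-connection", "qdrant": "qdrant-connection"},
    default_engine="elasticsearch",
)
text_kb = KnowledgeBase("kb-text")
visual_kb = KnowledgeBase("kb-visual", doc_engine="qdrant")
```

This only covers connection lookup. Retrieval that spans KBs on different engines would still need separate result merging, which the sketch does not attempt.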

JosefAschauer and others added 5 commits March 10, 2026 11:13
Add Qdrant as an upstream-supported RAGFlow document/vector backend with
optional sparse hybrid retrieval. Dense retrieval works by default using
the existing embedding flow. Sparse hybrid retrieval (dense+sparse+RRF)
activates only when a sparse embedding model is configured. Qdrant text
matching uses tokenized payload matching, not Elasticsearch BM25 parity.
Elasticsearch and Infinity behavior are unchanged.

Key changes:
- rag/utils/qdrant_conn.py: main adapter implementing DocStoreConnection
- rag/utils/sparse_vector.py: token-based sparse embedding model
- common/settings.py: DOC_ENGINE=qdrant routing and config
- rag/nlp/search.py: sparse query path with RRF fusion
- rag/llm/embedding_model.py: sparse encoding methods on Base class
- api/db/services/llm_service.py: LLMBundle sparse delegation
- api/db/services/dialog_service.py: SQL fallback disabled for Qdrant
- api/db/services/doc_metadata_service.py: generic docStoreConn usage
- Manual chunk paths (chunk_app, sdk/doc, document_service): sparse vectors
- rag/svr/task_executor.py: sparse vector generation during ingestion
- Docker: qdrant service, env vars, service_conf template
- Tests: 15 unit tests covering dense, hybrid, CRUD, and SQL fallback

Schema is left extensible for future multivector/late-interaction retrieval
(ColPali/ColQwen). This commit does not implement ColPali/ColQwen.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@JosefAschauer JosefAschauer force-pushed the feat/qdrant-docstore-pr branch from c087e8b to e9b2dab on March 10, 2026 11:22
@JosefAschauer
Contributor Author

Qdrant is not just "vector-only" anymore. It supports sparse vector indexing alongside dense vectors in the same collection, which means lexical scoring can be modeled there as sparse retrieval instead of relying on a separate inverted-index engine. That makes it a credible backend for RAGFlow’s primary chunk-retrieval path, especially because hybrid dense+sparse fusion is native. At the same time, I’m not claiming full Elasticsearch feature parity from this PR: Elasticsearch’s native lexical core is still inverted-index/BM25, while Qdrant reaches similar behavior through sparse vectors, and RAGFlow still has separate gaps such as SQL retrieval and message-store support outside the core retrieval path.
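As a rough picture of how lexical matching can be expressed as sparse retrieval, here is a minimal sketch that hashes tokens into a fixed index space and uses raw term frequency as the weight. It is an assumption-laden toy, not the encoder in this PR: `VOCAB_SIZE`, the MD5-based hashing, and the function names are all made up for illustration, and term frequency alone is not BM25.

```python
import hashlib
import re
from collections import Counter

VOCAB_SIZE = 2 ** 20  # hypothetical size of the hashed index space


def token_index(token: str) -> int:
    """Map a token to a stable integer index (stable across processes)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % VOCAB_SIZE


def sparse_encode(text: str):
    """Return (indices, values) with term-frequency weights."""
    weights = Counter()
    for token in re.findall(r"\w+", text.lower()):
        weights[token_index(token)] += 1.0
    indices = sorted(weights)
    return indices, [weights[i] for i in indices]


def sparse_dot(a, b):
    """Dot product of two (indices, values) sparse vectors."""
    b_map = dict(zip(*b))
    return sum(v * b_map.get(i, 0.0) for i, v in zip(*a))


doc_vec = sparse_encode("sparse sparse")
query_vec = sparse_encode("sparse")
overlap = sparse_dot(query_vec, doc_vec)
```

Only shared tokens contribute to the score, which is the lexical-overlap behavior being modeled. Analyzers, stemming, and phrase matching are outside what a scheme like this captures.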

@JosefAschauer
Contributor Author

I think the cleanest framing for #13486 is as an opt-in experimental backend addition rather than as a claim of full Elasticsearch parity. The code path is intentionally backward-compatible for existing backends, and the larger roadmap question is really about whether RAGFlow should support a Qdrant-first multimodal/Vision-RAG direction. I wrote up that larger argument separately as an RFC so the PR can stay focused on the technical review of the backend groundwork.

2 participants