Thank you for your interest in OpenViking! We welcome contributions of all kinds:
- Bug reports
- Feature requests
- Documentation improvements
- Code contributions
Prerequisites:
- Python: 3.10+
- Go: 1.22+ (Required for building AGFS components from source)
- C++ Compiler: GCC 9+ or Clang 11+ (Required for building core extensions, must support C++17)
- CMake: 3.12+
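As a rough aid, the prerequisites above can be checked before attempting a source build. The following is a minimal sketch, not part of OpenViking: the executable names (`go`, `g++`, `clang++`, `c++`, `cmake`) are assumptions about what is on your PATH, and the script only checks that a tool is present, not that its version meets the minimum.

```python
import shutil
import sys
from typing import Callable, Optional

# Executable names are assumptions; the guide names the required software
# (Go, a C++ compiler, CMake) but not the binaries that provide it.
REQUIRED_TOOLS = {
    "Go": ["go"],
    "C++ compiler": ["g++", "clang++", "c++"],
    "CMake": ["cmake"],
}


def python_ok(version: tuple = tuple(sys.version_info[:3])) -> bool:
    """Return True when the interpreter is Python 3.10 or newer."""
    return tuple(version[:2]) >= (3, 10)


def missing_tools(which: Callable[[str], Optional[str]] = shutil.which) -> list:
    """List required tools for which no candidate executable is on PATH."""
    return [
        name
        for name, candidates in REQUIRED_TOOLS.items()
        if not any(which(exe) for exe in candidates)
    ]


if __name__ == "__main__":
    print("Python 3.10+:", python_ok())
    print("Missing tools:", missing_tools() or "none")
```

The `which` parameter exists only so the lookup can be replaced in tests; tool versions still need to be confirmed by hand (for example with `go version` or `cmake --version`).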
OpenViking provides pre-compiled Wheel packages for the following environments:
- Windows: x86_64
- macOS: x86_64, arm64 (Apple Silicon)
- Linux: x86_64 (manylinux)
For other platforms (e.g., Linux ARM64, FreeBSD), the package will be automatically compiled from source during installation via pip. Ensure you have the Prerequisites installed.
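To see which path pip is likely to take on your machine, the platform list above can be expressed as a small lookup. This is an illustrative sketch only: the table mirrors the list in this guide, while the alias handling (`AMD64`, `aarch64`) is an assumption about what `platform.machine()` reports on different systems. It does not inspect PyPI and ignores details such as the manylinux requirement on Linux.

```python
import platform

# Mirrors the pre-compiled wheel list in this guide.
PREBUILT = {
    "Windows": {"x86_64"},
    "Darwin": {"x86_64", "arm64"},
    "Linux": {"x86_64"},
}

# Assumed spellings reported by platform.machine() on some systems.
_ALIASES = {"amd64": "x86_64", "aarch64": "arm64"}


def has_prebuilt_wheel(system: str, machine: str) -> bool:
    """Return True if the guide lists a pre-compiled wheel for this platform."""
    arch = _ALIASES.get(machine.lower(), machine.lower())
    return arch in PREBUILT.get(system, set())


if __name__ == "__main__":
    current = (platform.system(), platform.machine())
    if has_prebuilt_wheel(*current):
        print(current, "-> a pre-compiled wheel should be available")
    else:
        print(current, "-> expect a source build; install the prerequisites first")
```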
git clone https://github.com/YOUR_USERNAME/openviking.git
cd openviking

We recommend using uv for Python environment management:
# Install uv (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Sync dependencies and create virtual environment
uv sync --all-extras
source .venv/bin/activate # Linux/macOS
# or .venv\Scripts\activate # Windows

OpenViking defaults to binding-client mode for AGFS, which requires a pre-built shared library. If you modify the AGFS (Go) code or the C++ extensions, or if the pre-built library is not found, you need to recompile and reinstall them. Run the following command in the project root:
uv pip install -e . --force-reinstall

This command forces setup.py to run again, which triggers compilation of the AGFS and C++ components.
Create a configuration file ~/.openviking/ov.conf:
{
  "embedding": {
    "dense": {
      "provider": "volcengine",
      "api_key": "your-api-key",
      "model": "doubao-embedding-vision-250615",
      "api_base": "https://ark.cn-beijing.volces.com/api/v3",
      "dimension": 1024,
      "input": "multimodal"
    }
  },
  "vlm": {
    "api_key": "your-api-key",
    "model": "doubao-seed-2-0-pro-260215",
    "api_base": "https://ark.cn-beijing.volces.com/api/v3"
  }
}

Set the environment variable:
export OPENVIKING_CONFIG_FILE=~/.openviking/ov.conf

Verify the installation with a short script:
import asyncio
import openviking as ov

async def main():
    # Create a client backed by a local data directory
    client = ov.AsyncOpenViking(path="./test_data")
    await client.initialize()
    print("OpenViking initialized successfully!")
    # Release resources before exiting
    await client.close()

asyncio.run(main())

The Rust CLI (ov) provides a high-performance command-line client for interacting with OpenViking Server.
Prerequisites: Rust >= 1.88
# Build and install from source
cargo install --path crates/ov_cli
# Or use the quick install script (downloads pre-built binary)
curl -fsSL https://raw.githubusercontent.com/volcengine/OpenViking/main/crates/ov_cli/install.sh | bash

After installation, run ov --help to see all available commands. The CLI connection config goes in ~/.openviking/ovcli.conf.
openviking/
├── pyproject.toml # Project configuration
├── Cargo.toml # Rust workspace configuration
├── third_party/ # Third-party dependencies
│ └── agfs/ # AGFS filesystem
│
├── openviking/ # Python SDK
│ ├── async_client.py # AsyncOpenViking client
│ ├── sync_client.py # SyncOpenViking client
│ │
│ ├── core/ # Core data models
│ │ ├── context.py # Context base class
│ │ └── directories.py # Directory definitions
│ │
│ ├── parse/ # Resource parsers
│ │ ├── parsers/ # Parser implementations
│ │ ├── tree_builder.py
│ │ └── registry.py
│ │
│ ├── retrieve/ # Retrieval system
│ │ ├── retriever.py # Main retriever
│ │ ├── reranker.py # Reranking
│ │ └── intent_analyzer.py
│ │
│ ├── session/ # Session management
│ │ ├── session.py # Session core
│ │ └── compressor.py # Compression
│ │
│ ├── server/ # HTTP server
│ │ ├── app.py # FastAPI app factory
│ │ ├── bootstrap.py # Entry point (openviking-server)
│ │ └── routers/ # API routers
│ │
│ ├── storage/ # Storage layer
│ │ ├── viking_fs.py # VikingFS
│ │ └── vectordb/ # Vector database
│ │
│ ├── utils/ # Utilities
│ │ └── config/ # Configuration
│ │
│ └── prompts/ # Prompt templates
│
├── crates/ # Rust components
│ └── ov_cli/ # Rust CLI client
│ ├── src/ # CLI source code
│ └── install.sh # Quick install script
│
├── src/ # C++ extensions (pybind11)
│
├── tests/ # Test suite
│ ├── client/ # Client tests
│ ├── server/ # Server tests
│ ├── session/ # Session tests
│ ├── parse/ # Parser tests
│ ├── vectordb/ # Vector database tests
│ └── integration/ # Integration tests
│
└── docs/ # Documentation
    ├── en/ # English docs
    └── zh/ # Chinese docs
We use the following tools to maintain code consistency:
| Tool | Purpose | Config |
|---|---|---|
| Ruff | Linting, Formatting, Import sorting | pyproject.toml |
| mypy | Type checking | pyproject.toml |
We use pre-commit to automatically run these checks before every commit. This ensures your code always meets the standards without manual effort.
- Install pre-commit:
  pip install pre-commit
- Install the git hooks:
  pre-commit install
Now Ruff (check and format) runs automatically whenever you run git commit. If a check fails, Ruff may fix the file automatically; stage the fixed files with git add and commit again.
# Format code
ruff format openviking/
# Lint
ruff check openviking/
# Type check
mypy openviking/

- Line width: 100 characters
- Indentation: 4 spaces
- Strings: Prefer double quotes
- Type hints: Encouraged but not required
- Docstrings: Required for public APIs (1-2 lines max)
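As an illustration of these guidelines, here is a small function written in that style: 4-space indentation, double-quoted strings, type hints, and a one-line docstring. The function itself (`top_words`) is a made-up example and is not part of the OpenViking codebase.

```python
from collections import Counter


def top_words(text: str, limit: int = 3) -> list[str]:
    """Return the most frequent lowercase words in text, most common first."""
    counts = Counter(word.lower() for word in text.split())
    return [word for word, _ in counts.most_common(limit)]


if __name__ == "__main__":
    print(top_words("Viking viking fs fs fs parser", limit=2))
```

Running ruff format and ruff check on a file like this should leave it unchanged if your local configuration matches the project settings in pyproject.toml.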
# Run all tests
pytest
# Run specific test module
pytest tests/client/ -v
pytest tests/server/ -v
pytest tests/parse/ -v
# Run specific test file
pytest tests/client/test_lifecycle.py
# Run specific test
pytest tests/client/test_lifecycle.py::TestClientInitialization::test_initialize_success
# Run by keyword
pytest -k "search" -v
# Run with coverage
pytest --cov=openviking --cov-report=term-missing

Tests are organized in subdirectories under tests/. The project uses asyncio_mode = "auto", so async tests do not need the @pytest.mark.asyncio decorator:
# tests/client/test_example.py
from openviking import AsyncOpenViking


class TestAsyncOpenViking:
    async def test_initialize(self, uninitialized_client: AsyncOpenViking):
        await uninitialized_client.initialize()
        assert uninitialized_client._service is not None
        await uninitialized_client.close()

    async def test_add_resource(self, client: AsyncOpenViking):
        result = await client.add_resource(
            "./test.md",
            reason="test document"
        )
        assert result["status"] == "success"
        assert "root_uri" in result

Common fixtures are defined in tests/conftest.py, including client (initialized AsyncOpenViking), uninitialized_client, temp_dir, etc.
git checkout main
git pull origin main
git checkout -b feature/your-feature-name

Branch naming conventions:
- feature/xxx - New features
- fix/xxx - Bug fixes
- docs/xxx - Documentation updates
- refactor/xxx - Code refactoring
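The branch naming conventions above can be checked mechanically. The sketch below is hypothetical and not a project tool; in particular, the set of characters allowed after the slash (letters, digits, `.`, `_`, `-`) is an assumption, since the guide only specifies the prefixes.

```python
import re

# Prefixes come from the naming conventions; the allowed character set
# after the slash is an assumption made for this example.
BRANCH_PATTERN = re.compile(r"(feature|fix|docs|refactor)/[A-Za-z0-9._-]+")


def is_valid_branch(name: str) -> bool:
    """Return True if the branch name uses one of the documented prefixes."""
    return BRANCH_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for branch in ("feature/your-feature-name", "main", "hotfix/urgent"):
        print(branch, "->", is_valid_branch(branch))
```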
When making changes:
- Follow code style guidelines
- Add tests for new functionality
- Update documentation as needed
git add .
git commit -m "feat: add new parser for xlsx files"
git push origin feature/your-feature-name

Then create a Pull Request on GitHub.
We follow Conventional Commits:
<type>(<scope>): <subject>
<body>
<footer>
| Type | Description |
|---|---|
| feat | New feature |
| fix | Bug fix |
| docs | Documentation |
| style | Code style (no logic change) |
| refactor | Code refactoring |
| perf | Performance improvement |
| test | Tests |
| chore | Build/tooling |
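The header line of a commit message, `<type>(<scope>): <subject>`, can be validated with a short script. This is a sketch under stated assumptions rather than a project tool: it accepts only the types in the table above, treats the scope as optional, and requires a single space after the colon. The Conventional Commits specification allows more (for example a `!` breaking-change marker), which this sketch deliberately ignores.

```python
import re
from typing import Optional

# Types taken from the table in this guide.
COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "perf", "test", "chore")

HEADER = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"  # optional scope in parentheses
    r": (?P<subject>\S.*)$"          # colon, one space, non-empty subject
)


def parse_header(line: str) -> Optional[dict]:
    """Split a commit header into type, scope, and subject, or return None."""
    match = HEADER.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_header("feat(parser): add support for xlsx files"))
    print(parse_header("update stuff"))
```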
# New feature
git commit -m "feat(parser): add support for xlsx files"
# Bug fix
git commit -m "fix(retrieval): fix score calculation in rerank"
# Documentation
git commit -m "docs: update quick start guide"
# Refactoring
git commit -m "refactor(storage): simplify interface methods"

Pull request titles use the same format as commit messages.
## Summary
Brief description of the changes and their purpose.
## Type of Change
- [ ] New feature (feat)
- [ ] Bug fix (fix)
- [ ] Documentation (docs)
- [ ] Refactoring (refactor)
- [ ] Other
## Testing
Describe how to test these changes:
- [ ] Unit tests pass
- [ ] Manual testing completed
## Related Issues
- Fixes #123
- Related to #456
## Checklist
- [ ] Code follows project style guidelines
- [ ] Tests added for new functionality
- [ ] Documentation updated (if needed)
- [ ] All tests pass

We use GitHub Actions for Continuous Integration and Continuous Deployment. Our workflows are designed to be modular and tiered.
| Event | Workflow | Description |
|---|---|---|
| Pull Request | pr.yml | Runs Lint (Ruff, Mypy) and Test Lite (integration tests on Linux + Python 3.10). Provides fast feedback for contributors. (Displayed as 01. Pull Request Checks) |
| Push to Main | ci.yml | Runs Test Full (all OS: Linux/Win/Mac; all Python versions: 3.10-3.13) and CodeQL (security scan). Ensures main branch stability. (Displayed as 02. Main Branch Checks) |
| Release Published | release.yml | Triggered when you create a Release on GitHub. Automatically builds the source distribution and wheels, determines the version from the Git Tag, and publishes to PyPI. (Displayed as 03. Release) |
| Weekly Cron | schedule.yml | Runs the CodeQL security scan every Sunday. (Displayed as 04. Weekly Security Scan) |
Maintainers can manually trigger the following workflows from the "Actions" tab to perform specific tasks or debug issues.
Runs code style checks (Ruff) and type checks (Mypy). No arguments required.
Tip: It is recommended to install pre-commit locally to run these checks automatically before committing (see Automated Checks section above).
Runs fast integration tests, supports custom matrix configuration.
- Inputs:
  - os_json: JSON string array of OS to run on (e.g., ["ubuntu-24.04"]).
  - python_json: JSON string array of Python versions (e.g., ["3.10"]).
Runs the full test suite on all supported platforms (Linux/Mac/Win) and Python versions (3.10-3.13). Supports custom matrix configuration when triggered manually.
- Inputs:
  - os_json: List of OS to run on (Default: ["ubuntu-24.04", "macos-14", "windows-latest"]).
  - python_json: List of Python versions (Default: ["3.10", "3.11", "3.12", "3.13"]).
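To estimate how many jobs a given pair of os_json and python_json inputs produces, the matrix can be expanded locally. This is a minimal sketch that assumes the workflow runs every OS against every Python version; the guide does not show how the workflow files consume these inputs, so treat `expand_matrix` as a hypothetical helper. It also shows why versions should be JSON strings: as a bare number, 3.10 would parse as 3.1.

```python
import json
from itertools import product


def expand_matrix(os_json: str, python_json: str) -> list:
    """Return every (os, python) pair described by the two JSON array inputs."""
    os_list = json.loads(os_json)
    py_list = json.loads(python_json)
    for name, value in (("os_json", os_list), ("python_json", py_list)):
        # Reject non-arrays and non-string items, e.g. [3.10] parsed as 3.1.
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"{name} must be a JSON array of strings")
    return list(product(os_list, py_list))


if __name__ == "__main__":
    jobs = expand_matrix(
        '["ubuntu-24.04", "macos-14", "windows-latest"]',
        '["3.10", "3.11", "3.12", "3.13"]',
    )
    print(len(jobs), "jobs, first:", jobs[0])
```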
Runs CodeQL security analysis. No arguments required.
Builds Python wheel packages only, does not publish.
- Inputs:
  - os_json: List of OS to build on (Default: ["ubuntu-24.04", "ubuntu-24.04-arm", "macos-14", "macos-15-intel", "windows-latest"]).
  - python_json: List of Python versions (Default: ["3.10", "3.11", "3.12", "3.13"]).
  - build_sdist: Whether to build source distribution (Default: true).
  - build_wheels: Whether to build wheel distribution (Default: true).
Publishes built packages (requires build Run ID) to PyPI.
- Inputs:
  - target: Select publish target (testpypi, pypi, both).
  - build_run_id: Build Workflow Run ID (Required, get it from the Build run URL).
One-stop build and publish (includes build and publish steps).
Version Numbering & Tag Convention: This project uses setuptools_scm to automatically extract version numbers from Git Tags.
- Tag Naming Convention: Must follow the vX.Y.Z format (e.g., v0.1.0, v1.2.3). Tags must comply with Semantic Versioning.
- Release Build: When a Release event is triggered, the version number corresponds directly to the Git Tag (e.g., v0.1.0 -> 0.1.0).
- Manual/Non-Tag Build: The version number includes the commit count since the last Tag (e.g., 0.1.1.dev3).
- Confirm Version: After the publish job completes, the published version appears in the Notifications area at the top of the Workflow Summary page (e.g., Successfully published to PyPI with version: 0.1.8). You can also verify it in the logs or in the Artifacts filenames.
- Inputs:
  - target: Select publish target.
    - none: Build artifacts only (no publish). Used for verifying build capability.
    - testpypi: Publish to TestPyPI. Used for Beta testing.
    - pypi: Publish to official PyPI.
    - both: Publish to both.
  - os_json: Build platforms (Default includes all).
  - python_json: Python versions (Default includes all).
  - build_sdist: Whether to build source distribution (Default: true).
  - build_wheels: Whether to build wheel distribution (Default: true).
Publishing Notes:
- Test First: It is strongly recommended to publish to TestPyPI for verification before publishing to official PyPI. Note that PyPI and TestPyPI are completely independent environments, and accounts and package data are not shared.
- No Overwrites: Neither PyPI nor TestPyPI allows overwriting an existing package with the same name and version. To republish, you must bump the version number (e.g., tag a new version or generate a new dev version). If you try to publish an existing version, the workflow will fail.
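Because a rejected upload cannot be overwritten, it can help to sanity-check a tag before creating the release. The sketch below encodes only the vX.Y.Z convention described above and the documented mapping v0.1.0 -> 0.1.0; `version_from_tag` is a hypothetical helper, not how setuptools_scm derives versions internally, and the rejection of leading zeros follows Semantic Versioning rather than anything stated in this guide.

```python
import re
from typing import Optional

# vX.Y.Z with numeric parts and no leading zeros (per Semantic Versioning).
TAG_PATTERN = re.compile(r"v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def version_from_tag(tag: str) -> Optional[str]:
    """Return the release version for a vX.Y.Z tag, or None if it is malformed."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        return None
    return ".".join(match.groups())


if __name__ == "__main__":
    for tag in ("v0.1.0", "v1.2.3", "1.2.3", "v1.2"):
        print(tag, "->", version_from_tag(tag))
```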
Please provide:
- Environment
  - Python version
  - OpenViking version
  - Operating system
- Steps to Reproduce
  - Detailed steps
  - Code snippets
- Expected vs Actual Behavior
- Error Logs (if any)
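The environment details requested above can be collected with a few lines of Python and pasted into the issue. This is a convenience sketch, not an official tool; it assumes the installed distribution is named `openviking`, and it reports "not installed" when no such distribution is found.

```python
import platform
from importlib import metadata


def environment_report(package: str = "openviking") -> dict:
    """Collect the environment details requested for bug reports."""
    try:
        # The distribution name "openviking" is an assumption.
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = "not installed"
    return {
        "Python version": platform.python_version(),
        "OpenViking version": version,
        "Operating system": f"{platform.system()} {platform.release()} ({platform.machine()})",
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```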
Please describe:
- Problem: What problem are you trying to solve?
- Solution: What solution do you propose?
- Alternatives: Have you considered other approaches?
Documentation is in Markdown format under docs/:
- docs/en/ - English documentation
- docs/zh/ - Chinese documentation
When writing documentation:
- Code examples must be runnable
- Keep documentation in sync with code
- Use clear, concise language
By participating in this project, you agree to:
- Be respectful: Maintain a friendly and professional attitude
- Be inclusive: Welcome contributors from all backgrounds
- Be constructive: Provide helpful feedback
- Stay focused: Keep discussions technical
If you have questions:
Thank you for contributing!