Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Updated Oct 30, 2025 - Python
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
Using Segment-Anything and CLIP to generate pixel-aligned semantic features.
[Official] [IROS 2024] Goal-oriented planning to lift VLN performance for closed-loop navigation: simple, yet effective.
Clipora is a powerful toolkit for fine-tuning OpenCLIP models using Low Rank Adapters (LoRA).
A simple open-source SigLIP model fine-tuned on Genshin Impact image-text pairs.
Text-to-image search with OpenCLIP, Docker, Flask, Faiss, etc. and a basic front-end.
Mori_Cloud is a web platform that allows users to store, manage, and share memorable moments through images and text. Powered by artificial intelligence for smart image search, Mori_Cloud delivers a personalized, secure, and modern user experience.
Uses SAM and OpenCLIP to perform zero-shot object detection on the COCO 2017 val split.
[GENERIC] API for practical large models
VALORA AI is a multimodal price prediction model that uses textual and visual data to predict product prices.
Masked Multi-Component Gated Decomposition Architecture
Group images by provided labels using OpenAI/CLIP
CLIP-based zero-shot instance segmentation
An Edge-AI security pipeline for real-time violence detection and automated forensic reporting using YOLO, CoCa, and Qwen2.5-VL.
Turn any YouTube video into viral clips for free!
An AI-powered computer vision system that automatically selects the best wedding photos from thousands of images.
A redesigned CLIP architecture replacing the ViT encoder with modern CNN backbones (ConvNeXt V2) to improve efficiency and inference speed while maintaining strong vision-language alignment, enabling real-time use in robotics and edge applications.