FiftyOne Plugins 🔌🚀

FiftyOne provides a powerful plugin framework that allows for extending and customizing the functionality of the tool.

With plugins, you can add new functionality to the FiftyOne App, create integrations with other tools and APIs, render custom panels, and add custom buttons to menus.

With FiftyOne Teams, you can even write plugins that allow users to execute long-running tasks from within the App that run on a connected compute cluster.

For example, here's a taste of what you can do with the @voxel51/brain plugin!

[Video: brain.mp4, a demo of the @voxel51/brain plugin]

Table of Contents

This repository contains a curated collection of FiftyOne Plugins, organized into the following categories:

  • Core Plugins: core functionality that all FiftyOne users will likely want to install. These plugins are maintained by the FiftyOne team
  • Voxel51 Plugins: non-core plugins that are officially maintained by the FiftyOne team
  • Example Plugins: these plugins exist to inspire and educate you to create your own plugins! Each emphasizes a different aspect of the plugin system
  • Community Plugins: third-party plugins that are contributed and maintained by the community. These plugins are not officially supported by the FiftyOne team, but they're likely awesome!

🔌🤝 Contribute Your Own Plugin 🚀🚀

Want to showcase your own plugin here? See the contributing section for instructions!

Core Plugins

Name Tags Description
@voxel51/annotation annotation ✏️ Utilities for integrating FiftyOne with annotation tools
@voxel51/brain curation visualization 🧠 Utilities for working with the FiftyOne Brain
@voxel51/dashboard visualization 📊 Create your own custom dashboards from within the App
@voxel51/evaluation evaluation ✅ Utilities for evaluating models with FiftyOne
@voxel51/io io 📁 A collection of import/export utilities
@voxel51/indexes utils 📈 Utilities for working with FiftyOne database indexes
@voxel51/plugins utils 🧩 Utilities for managing and building FiftyOne plugins
@voxel51/delegated utils 📡 Utilities for managing your delegated operations
@voxel51/runs utils 🏃 Utilities for managing your custom runs
@voxel51/utils utils ⚒️ Call your favorite SDK utilities from the App
@voxel51/zoo model dataset 🌎 Download datasets and run inference with models from the FiftyOne Zoo, all without leaving the App

Voxel51 Plugins

Name Tags Description
@voxel51/voxelgpt examples 🤖 An AI assistant that can query visual datasets, search the FiftyOne docs, and answer general computer vision questions
@voxel51/mlflow training 📋 Track model training experiments on your FiftyOne datasets with MLflow!
@voxel51/huggingface_hub dataset huggingface 🤗 Push FiftyOne datasets to the Hugging Face Hub, and load datasets from the Hub into FiftyOne!
@voxel51/transformers model huggingface 🤗 Run inference on your datasets using Hugging Face Transformers models!
@voxel51/rerun-plugin visualization 🎥 Visualize Rerun data files (.rrd) inside the FiftyOne App

Example Plugins

Name Tags Description
@voxel51/hello-world examples 👋 An example plugin that contains both Python and JavaScript components
@voxel51/operator-examples examples ⚙️ A collection of example operators showing how to use the operator type system to build custom FiftyOne operations
@voxel51/panel-examples examples 📊 A collection of example panels demonstrating common patterns for building Python panels

Community Plugins

Name Tags Description
@landingai/ade ocr model text 📄 Parse, extract, and split documents using LandingAI's Agentic Document Extraction (ADE) API. Converts PDFs, images, spreadsheets, and Office files into structured Markdown with spatial bounding box grounding.
@parva101/video2dataset utils 🎬 Convert YouTube URLs or local videos into FiftyOne image datasets with uniform/scene-change/hybrid frame sampling, perceptual deduplication, and source metadata.
@harpreetsahota/molmo_point model tracking 🫵🏻 Integrating MolmoPoint, a model that locates and tracks objects in images and videos by pointing and returning precise pixel coordinates
@harpreetsahota/online_video_depth_anything model depth 🏙️ Integrating Online Video Depth Anything (oVDA), a temporally consistent monocular depth estimator for videos that runs in an online setting with low VRAM consumption.
@harpreetsahota/vlm_prompt_lab model vlm 👨🏽‍🔬 Experiment with any VLM that can be run in a Hugging Face image-text-to-text pipeline right in the FiftyOne App!
@ehofesmann/envi-spetral-viewer visualization hyperspectral 🌈 Explore hyperspectral image datasets, interactively visualize pixel-level spectra, and dynamically recolor images.
@harpreetsahota/qwen3_5_vl model 👀 Implementing Qwen3.5VL as a Remote Source Zoo Model for FiftyOne.
@harpreetsahota/FiftyComfy visualization ☺️ A FiftyOne Panel for modular node-based workflows which takes inspiration from ComfyUI.
@harpreetsahota/hf_fine_tuner_plugin model huggingface 🎚️ A plugin to fine-tune Hugging Face models on your FiftyOne Dataset.
@harpreetsahota/image_editing_panel model huggingface 🎑 Chat-based image editing powered by HuggingFace image-to-image Inference API.
@harpreetsahota/qwen_image_edit model 🏞️ Chat-based image editing powered by drbaph/Qwen-Image-Edit-2511-FP8
@mgustineli/roi-patches curation 🪟 Tile images into a configurable grid of ROI patches with adjustable overlap for region-based analysis, using FiftyOne's native patches view.
@harpreetsahota/LightOnOCR-2 model vlm 📑 LightOnOCR-2-1B is a compact multilingual VLM that converts document images into clean, naturally ordered text without brittle multi-stage OCR pipelines.
@harpreetsahota/glm_ocr model vlm 📄 GLM-OCR is a lightweight 0.9B vision language model achieving state-of-the-art document understanding, including formula recognition, table recognition, and structured information extraction.
@harpreetsahota/cradiov4 model embeddings 📻 CRADIOv4 performs visual feature extraction whose image embeddings can be used by a downstream model for various tasks. This implementation also produces attention maps.
@perceptron-ai-inc/isaac-0_2 model vlm 🤖 Isaac-0.2 is Perceptron AI's hybrid-reasoning vision language model supporting object detection, keypoint detection, OCR, instance segmentation, visual question answering, and UI understanding. Includes thinking and tool use for improving detection in complex scenes.
@harpreetsahota/medgemma_1_5 model medical 🩻 Implementing MedGemma 1.5 as a Remote Zoo Model for FiftyOne
@harpreetsahota/qwen3vl_embeddings model embeddings 📼 Qwen3-VL-Embedding maps text, images, and video into a unified representation space, enabling powerful cross-modal retrieval and understanding.
@ardamamur/egoexor dataset medical 🏥 EgoExOR is an Operating Room dataset fusing egocentric and exocentric perspectives for surgical procedures. See here to load it with FiftyOne.
@harpreetsahota/molmo2 model vlm 📹 Molmo2 is a family of open vision language models developed by the Allen Institute for AI (Ai2) that support image, video, and multi-image understanding and grounding.
@harpreetsahota/apple_sharp model 3d 🧊 SHARP is Apple's state-of-the-art model for predicting 3D Gaussian Splats from a single RGB image. This integration brings SHARP to FiftyOne, enabling batch inference on image datasets with 3D visualization.
@harpreetsahota/sam3_images model segmentation 🖼️ Integration of Meta's SAM3 (Segment Anything Model 3) into FiftyOne, with full support of text prompts, keypoint prompts, bounding box prompts, auto segmentation, and image embeddings.
@harpreetsahota/qwen3vl_video model vlm 🎥 A FiftyOne zoo model integration for Qwen3-VL that enables comprehensive video understanding with multiple label types in a single forward pass and for computing video embeddings.
@harpreetsahota/text_evaluation_metrics model evaluation text 🔡 This plugin provides five text evaluation metrics for comparing predictions against ground truth: ANLS, Exact Match, Normalized Similarity, Character Error Rate, and Word Error Rate.
@harpreetsahota/mineru_2_5 model vlm 📜 MinerU2.5 is a 1.2B-parameter vision language model for efficient high-resolution document parsing. This model can support grounding OCR as well as free text OCR.
@harpreetsahota/nomic-embed-multimodal model embeddings 📜 Nomic Embed Multimodal is a family of vision language models built on Qwen2.5-VL that generates high-dimensional embeddings for both images and text in a shared vector space.
@harpreetsahota/bimodernvbert model embeddings 🗂️ BiModernVBert is a vision language model built on the ModernVBert architecture that generates embeddings for both images and text in a shared 768-dimensional vector space.
@harpreetsahota/colmodernvbert model embeddings 📑 ColModernVBert is a multi-vector vision language model built on the ModernVBert architecture that generates ColBERT-style embeddings for both images and text.
@harpreetsahota/deepseek_ocr model vlm 🐳 DeepSeek-OCR is a vision language model designed for optical character recognition with a focus on "contextual optical compression."
@harpreetsahota/olmOCR-2 model ocr 📊 olmOCR-2 is a state-of-the-art OCR model built on Qwen2.5-VL architecture that extracts text from document images with high accuracy.
@harpreetsahota/jina_embeddings_v4 model embeddings 📑 Jina Embeddings v4 is a state-of-the-art vision language model that generates embeddings for both images and text in a shared vector space.
@harpreetsahota/colqwen2_5_v0_2 model embeddings 🗃️ ColQwen2.5 is a vision language model based on Qwen2.5-VL-3B-Instruct that generates ColBERT-style multi-vector representations for efficient document retrieval. This version takes dynamic image resolutions (up to 768 image patches) and doesn't resize them, preserving aspect ratios for better accuracy.
@harpreetsahota/nanonets_ocr2 model ocr 📄 Nanonets-OCR2 transforms documents into structured markdown with intelligent content recognition and semantic tagging, making it ideal for downstream processing by Large Language Models (LLMs).
@harpreetsahota/colpali_v1_3 model embeddings 📃 ColPali is a vision language model based on PaliGemma-3B that generates ColBERT-style multi-vector representations for efficient document retrieval.
@harpreetsahota/kosmos2_5 model ocr 📑 Kosmos-2.5 excels at two core tasks: generating spatially-aware text blocks (OCR) and producing structured markdown output from images.
@harpreetsahota/moondream3 model vlm 🌝 Moondream 3 (Preview) is a vision language model with a mixture-of-experts architecture (9B total parameters, 2B active). This model makes no compromises, delivering state-of-the-art visual reasoning while still retaining our efficient and deployment-friendly ethos.
@harpreetsahota/caption_viewer visualization vlm 🖥️ A plugin that intelligently displays and formats vision language model outputs and text fields. Perfect for viewing OCR results, receipt analysis, document processing, and any text-heavy computer vision workflows.
@harpreetsahota/fiftyone_wandb_plugin utils evaluation 📉 This plugin connects FiftyOne datasets with Weights & Biases to enable reproducible, data-centric ML workflows.
@harpreetsahota/isaac0_1 model vlm 🤖 Isaac-0.1 is the first in Perceptron AI's family of models built to be the intelligence layer for the physical world. This integration supports various computer vision tasks including object detection, classification, OCR, visual question answering, and more.
@vlm-run/vlmrun-voxel51-plugin model vlm 🎯 Extract structured data from visual and audio sources including documents, images, and videos
@harpreetsahota/minicpm-v model vlm 👁️ Integrating MiniCPM-V 4.5 as a Remote Source Zoo Model in FiftyOne
@harpreetsahota/fast_vlm model vlm 💨 Integrating FastVLM as a Remote Source Zoo Model for FiftyOne
@harpreetsahota/gui_actor model vlm 🖥️ Implementing Microsoft's GUI Actor as a Remote Zoo Model for FiftyOne
@harpreetsahota/synthetic_gui_samples_plugins model vlm 🧪 A FiftyOne plugin for generating synthetic samples for datasets in COCO4GUI format
@harpreetsahota/coco4gui_fiftyone io 💽 Implementing the COCO4GUI dataset type in FiftyOne with importers and exporters
@harpreetsahota/fiftyone_lerobot_importer io 🤖 Import your LeRobot format dataset into FiftyOne format
@harpreetsahota/medsiglip model medical 🩻 Implementing MedSigLIP as a Remote Zoo Model for FiftyOne
@harpreetsahota/florence2 model vlm 🏛️ Implementing Florence2 as a Remote Zoo Model for FiftyOne
@harpreetsahota/medgemma model medical 🩻 Implementing MedGemma as a Remote Zoo Model for FiftyOne
@harpreetsahota/moondream2 model vlm 🌔 Moondream2 implementation as a remotely sourced zoo model for FiftyOne
@harpreetsahota/qwen2_5_vl model vlm 👀 Implementing Qwen2.5-VL as a Remote Zoo Model for FiftyOne
@harpreetsahota/paligemma2 model vlm 💎 Implementing PaliGemma-2-Mix as a Remote Zoo Model for FiftyOne
@harpreetsahota/siglip2 model vlm 🔎 A FiftyOne Remotely Sourced Zoo Model integration for Google's SigLIP2 model enabling natural language search across images in your FiftyOne Dataset
@harpreetsahota/os_atlas model vlm 🖥️ Integrating OS-Atlas Base into FiftyOne as a Remote Source Zoo Model
@harpreetsahota/Nemotron_Nano_VL model vlm 👁️ Implementing Llama-3.1-Nemotron-Nano-VL-8B-V1 as a Remote Zoo Model for FiftyOne
@harpreetsahota/UI_TARS model vlm 🖥️ Implementing UI-TARS-1.5 as a Remote Zoo Model for FiftyOne
@harpreetsahota/MiMo_VL model vlm 🎨 Implementing MiMo-VL as a Remote Zoo Model for FiftyOne
@harpreetsahota/Kimi_VL_A3B model vlm 👀 FiftyOne Remotely Sourced Zoo Model integration for Moonshot AI's Kimi-VL-A3B models enabling object detection, keypoint localization, and image classification with strong GUI and document understanding capabilities.
@harpreetsahota/vggt model 3d 🎲 Implementing Meta AI's VGGT as a FiftyOne Remote Zoo Model
@harpreetsahota/NVLabs_CRADIOV3 model embeddings 📻 Implementing NVLabs C-RADIOv3 Embeddings Model as Remotely Sourced Zoo Model for FiftyOne
@harpreetsahota/nemo_retriever_parse_plugin model ocr 📜 Implementing NVIDIA NeMo Retriever Parse as a FiftyOne Plugin
@harpreetsahota/visual_document_retrieval model ocr 📄 A FiftyOne Remotely Sourced Zoo Model integration for LlamaIndex's VDR model enabling natural language search across document images, screenshots, and charts in your datasets.
@harpreetsahota/ShowUI model vlm 🖥️ Integrating ShowUI into FiftyOne as a Remote Source Zoo Model
@harpreetsahota/vitpose model pose 🧘🏽 Run ViTPose Models from Hugging Face on your FiftyOne Dataset
@harpreetsahota/depth_pro_plugin model depth 🥽 Perform zero-shot metric monocular depth estimation using the Apple Depth Pro model
@harpreetsahota/janus_vqa model vlm 🐋 Run the Janus Pro Models from DeepSeek on your FiftyOne Dataset
@harpreetsahota/hiera_video_embeddings model video 🎥 Compute embeddings for video using Facebook Hiera Models
@segmentsai/segments-voxel51-plugin annotation ✏️ Integrate FiftyOne with the Segments.ai annotation tool!
@jacobmarks/image_issues curation 🌩️ Find common image quality issues in your datasets
@jacobmarks/concept_interpolation curation 📈 Find images that best interpolate between two text-based extremes!
@jacobmarks/text_to_image model vlm 🎨 Add synthetic data from prompts with text-to-image models and FiftyOne!
@jacobmarks/twilio_automation data 📲 Automate data ingestion with Twilio!
@wayofsamu/line2d visualization 📉 Visualize x,y-Points as a line chart.
@jacobmarks/vqa-plugin model vqa ❔ Ask (and answer) open-ended visual questions about your images!
@jacobmarks/youtube_panel_plugin visualization 📺 Play YouTube videos in the FiftyOne App!
@jacobmarks/image_deduplication curation 🪞 Find exact and approximate duplicates in your dataset!
@jacobmarks/keyword_search search 🔑 Perform keyword search on a specified field!
@jacobmarks/pytesseract_ocr model ocr 👓 Run optical character recognition with PyTesseract!
@brimoor/pdf-loader io 📄 Load your PDF documents into FiftyOne as per-page images
@jacobmarks/zero_shot_prediction model 🔮 Run zero-shot (open vocabulary) prediction on your data!
@jacobmarks/active_learning annotation 🏃 Accelerate your data labeling with Active Learning!
@jacobmarks/reverse_image_search curation ⏪ Find the images in your dataset most similar to an image from your filesystem or the internet!
@jacobmarks/concept_space_traversal embeddings 🌌 Navigate concept space with CLIP, vector search, and FiftyOne!
@jacobmarks/audio_retrieval audio 🔊 Find the images in your dataset most similar to an audio file!
@jacobmarks/semantic_document_search search 🔎 Perform semantic search on text in your documents!
@allenleetc/model-comparison evaluation ⚖️ Compare two object detection models!
@ehofesmann/filter_values search 🔎 Filter a field of your FiftyOne dataset by one or more values.
@jacobmarks/gpt4_vision model vlm 🤖 Chat with your images using GPT-4 Vision!
@swheaton/anonymize curation 🥸 Anonymize/blur images based on a FiftyOne Detections field.
@jacobmarks/double_band_filter search Filter on two numeric ranges simultaneously!
@danielgural/semantic_video_search model search Semantically search through your video datasets using FiftyOne Brain and Twelve Labs!
@jacobmarks/emoji_search examples 😏 Semantically search emojis and copy to clipboard!
@danielgural/img_to_video video 🦋 Bring images to life with image to video!
@ehofesmann/edit_label_attributes annotation ✏️ Edit attributes of your labels directly in the FiftyOne App!
@danielgural/audio_loader audio visualization 🎧 Import your audio datasets as spectrograms into FiftyOne!
@jacobmarks/albumentations_augmentation data 🪞 Test out any Albumentations data augmentation transform with FiftyOne!
@jacobmarks/image_captioning model vlm 🖋️ Caption all your images with state of the art vision language models!
@jacobmarks/multimodal_rag search embeddings 🦙 Create and test multimodal RAG pipelines with LlamaIndex, Milvus, and FiftyOne!
@danielgural/optimal_confidence_threshold evaluation 🔍 Find the optimal confidence threshold for your detection models automatically!
@danielgural/outlier_detection curation ❌ Find those troublesome outliers in your dataset automatically!
@danielgural/clustering_algorithms curation 🕵️ Find the clusters in your data using some of the best algorithms available!
@jacobmarks/clustering curation 🍇 Cluster your images using embeddings with FiftyOne and scikit-learn!
@mmoollllee/fiftyone-tile visualization ⬜ Tile your high resolution images to squares for training small object detection models
@mmoollllee/fiftyone-timestamps curation 🕒 Compute datetime-related fields (sunrise, dawn, evening, weekday, ...) from your samples' filenames or creation dates
@allenleetc/plotly-map-panel visualization 🌎 Plotly-based Map Panel with adjustable marker cosmetics!
@madave94/multi_annotator_toolkit annotation 🧹 Tackle noisy annotation! Find and analyze annotation issues in datasets with multiple annotators per image.
@AdonaiVera/fiftyone-vlm-efficient model vlm curation 🪄 Improve VLM training data quality with state-of-the-art dataset pruning and quality techniques
@AdonaiVera/bddoia-fiftyone dataset 🚗 Load and explore the BDDOIA Safe/Unsafe Action dataset via the FiftyOne Zoo
@AdonaiVera/fiftyone-agents vlm evaluation 🤖 A comprehensive FiftyOne plugin for testing and evaluating multiple vision language models with dynamic prompts and built-in evaluation capabilities
@AdonaiVera/gemini-vision-plugin model vlm 🔮 This plugin integrates Google Gemini's multimodal vision models (e.g., gemini-2.5-flash) into your FiftyOne workflows. Prompt with text and one or more images; receive a text response grounded in visual inputs
@allenleetc/sample-inspector curation 🔎 Adjust image brightness and contrast and filter semantic masks by class in a sample detail view!
@Burhan-Q/fiftyone-vllm model vlm 🎯 Run inference using an online vLLM instance for image captioning, classification, object detection, VQA, and OCR.
@Burhan-Q/fo-doom examples 👾 Play the classic DOOM (1993) shareware game directly within the FiftyOne App.
@voxel51/davis-2017 dataset video 🚲 Load and explore the DAVIS-2017 video segmentation dataset via the FiftyOne Zoo.
@voxel51/mose-v2 dataset video ⚽ Load and explore the MOSE complex video object segmentation dataset via the FiftyOne Zoo.
@sherpan/torchvision-classifier-finetuner model training 🎛️ Fine-tune pretrained torchvision backbones (ResNet-50, EfficientNet-B2, MobileNetV3) on FiftyOne datasets with Classification labels and run inference directly from the App.

Using Plugins

Install FiftyOne

If you haven't already, install FiftyOne:

pip install fiftyone

Installing a plugin

In general, you can install all of the plugins in a GitHub repository by running:

fiftyone plugins download https://github.com/path/to/repo

For instance, to install all plugins in this repository, you can run:

fiftyone plugins download https://github.com/voxel51/fiftyone-plugins

You can also install a specific plugin using the --plugin-names flag:

fiftyone plugins download \
    https://github.com/voxel51/fiftyone-plugins \
    --plugin-names <name>

💡 Pro tip: Some plugins require additional setup. Click the plugin's link and navigate to the project's README for instructions.
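
If you install plugins from a script, the download commands above can be assembled programmatically. The following is a minimal sketch, not part of FiftyOne itself: the helper names `build_download_command` and `download_plugins` are hypothetical, and passing several names after `--plugin-names` is an assumption you should check against `fiftyone plugins download --help`. By default it only prints the command (a dry run) so you can inspect it before executing anything.

```python
import shlex
import subprocess
from typing import List, Optional, Sequence


def build_download_command(
    repo_url: str, plugin_names: Optional[Sequence[str]] = None
) -> List[str]:
    """Build the argument list for `fiftyone plugins download`.

    Hypothetical helper: it only mirrors the CLI usage shown above.
    """
    cmd = ["fiftyone", "plugins", "download", repo_url]
    if plugin_names:
        # Assumption: the flag accepts one or more space-separated names
        cmd += ["--plugin-names", *plugin_names]
    return cmd


def download_plugins(
    repo_url: str,
    plugin_names: Optional[Sequence[str]] = None,
    dry_run: bool = True,
) -> List[str]:
    """Print the command (dry run) or execute it with the fiftyone CLI."""
    cmd = build_download_command(repo_url, plugin_names)
    if dry_run:
        # Show the shell-quoted command without running it
        print(shlex.join(cmd))
    else:
        # Requires the `fiftyone` CLI on PATH; raises if the command fails
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    download_plugins(
        "https://github.com/voxel51/fiftyone-plugins",
        plugin_names=["@voxel51/brain"],
    )
```

Call `download_plugins(..., dry_run=False)` once the printed command looks right.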

Plugin management

You can use the CLI commands below to manage your downloaded plugins:

# List all plugins you've downloaded
fiftyone plugins list

# List the available operators and panels
fiftyone operators list

# Disable a particular plugin
fiftyone plugins disable <name>

# Enable a particular plugin
fiftyone plugins enable <name>

Local development

If you plan to develop plugins locally, you can clone the repository and symlink it into your FiftyOne plugins directory like so:

cd /path/to/fiftyone-plugins
ln -s "$(pwd)" "$(fiftyone config plugins_dir)/fiftyone-plugins"
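
On systems where `ln -s` is not convenient, the same symlink step can be sketched in Python. This is an illustrative helper only: `link_plugin_repo` is a hypothetical name, and it assumes you pass in the plugins directory yourself (for example, the path printed by `fiftyone config plugins_dir`). It refuses to overwrite an existing entry and is safe to run twice.

```python
from pathlib import Path
from typing import Optional, Union

PathLike = Union[str, Path]


def link_plugin_repo(
    repo_dir: PathLike, plugins_dir: PathLike, link_name: Optional[str] = None
) -> Path:
    """Symlink a local plugin repository into a plugins directory.

    Hypothetical helper mirroring the `ln -s` command above.
    """
    repo = Path(repo_dir).resolve()
    if not repo.is_dir():
        raise NotADirectoryError(f"Not a directory: {repo}")

    plugins = Path(plugins_dir)
    plugins.mkdir(parents=True, exist_ok=True)

    # Default to the repository's own folder name, as in the shell command
    link = plugins / (link_name or repo.name)

    if link.is_symlink():
        if link.resolve() == repo:
            return link  # already linked to this repo; nothing to do
        raise FileExistsError(f"{link} already points somewhere else")
    if link.exists():
        raise FileExistsError(f"{link} already exists and is not a symlink")

    link.symlink_to(repo, target_is_directory=True)
    return link
```

For example, `link_plugin_repo("/path/to/fiftyone-plugins", plugins_dir)` creates `plugins_dir/fiftyone-plugins`. On Windows, creating symlinks may require elevated permissions or developer mode.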

Contributing

Showcasing your plugin 🤝

Have a plugin you'd like to share with the community? Awesome! 🎉🎉🎉

Just follow these steps to add your plugin to this repository:

  1. Make sure your plugin repo has a README.md file that describes the plugin and how to install it
  2. Fork this repository
  3. Add an entry for your plugin to the Community Plugins table above
  4. Submit a pull request into this repository

Contributing to this repository 🙌

You're also welcome to contribute to the plugins that live natively in this repository. Check out the contributions guide for instructions.

Join the Community

If you want to join a fast-growing community of engineers, researchers, and practitioners who love computer vision, join the FiftyOne Discord community! 🚀🚀🚀

💡 Pro tip: the #plugins channel is a great place to discuss plugins!

About FiftyOne

If you've made it this far, we'd greatly appreciate it if you'd take a moment to check out FiftyOne and give us a star!

FiftyOne is an open source library for building high-quality datasets and computer vision models. It's the engine that powers this project.

Thanks for visiting! 😊