ComfyUI-YoloWorld-EfficientSAM

Documentation

ComfyUI YoloWorld-EfficientSAM

This repository provides custom ComfyUI nodes for running YOLO-World (object detection) and EfficientSAM (segmentation) models on both images and videos.

Installation

Recommended Method

  • Installation through the ComfyUI Manager is the recommended method once it becomes available (coming soon). Until then, use the manual installation below.

Manual Installation

  1. Navigate to the custom_nodes directory of your ComfyUI installation:

    cd custom_nodes
    
  2. Clone the repository:

    git clone https://github.com/ZHO-ZHO-ZHO/ComfyUI-YoloWorld-EfficientSAM
    
  3. Change to the repository directory (you are already inside custom_nodes):

    cd ComfyUI-YoloWorld-EfficientSAM
    
  4. Install the required dependencies:

    pip install -r requirements.txt
    
  5. Restart ComfyUI.

Model Download

  • To use the EfficientSAM nodes, download the efficient_sam_s_cpu.jit and efficient_sam_s_gpu.jit model files from the EfficientSAM project and place them in the custom_nodes/ComfyUI-YoloWorld-EfficientSAM directory.
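
A quick way to confirm the model files ended up in the right place is a small check script. The sketch below is not part of the repository; the function name missing_models is hypothetical, and the path in the usage line assumes the script is run from the ComfyUI root directory.

```python
from pathlib import Path

# File names taken from the Model Download step above.
REQUIRED_MODELS = ("efficient_sam_s_cpu.jit", "efficient_sam_s_gpu.jit")


def missing_models(node_dir):
    """Return the required model file names that are not present in node_dir.

    Hypothetical helper for illustration; the node pack does not ship it.
    """
    node_dir = Path(node_dir)
    return [name for name in REQUIRED_MODELS if not (node_dir / name).is_file()]


if __name__ == "__main__":
    # Assumed location: run this from the ComfyUI root directory.
    missing = missing_models("custom_nodes/ComfyUI-YoloWorld-EfficientSAM")
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All EfficientSAM model files are in place.")
```

If the script reports a missing file, download it and place it in the directory before restarting ComfyUI.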

Purpose of the Repository

This repository extends ComfyUI with object detection and segmentation. YOLO-World detects objects and EfficientSAM segments them, on both images and videos, which supports a variety of content analysis and processing tasks.

Provided Nodes

The repository provides the following custom nodes for ComfyUI:

  • Yoloworld_ModelLoader_Zho: Loads YOLO-World models. Three model types are supported: yolo_world/l, yolo_world/m, and yolo_world/s.

  • ESAM_ModelLoader_Zho: Loads EfficientSAM models for either CUDA or CPU execution.

  • Yoloworld_ESAM_Zho: Performs detection and segmentation using the loaded YOLO-World and EfficientSAM models. Configurable options include categories, confidence threshold, and IoU threshold.

  • Yoloworld_ESAM_DetectorProvider_Zho: Provides detection and segmentation capabilities for integration into other workflows, with support for suppressing overlapping bounding boxes.
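
To illustrate how a confidence threshold and an IoU threshold typically interact, here is a minimal pure-Python sketch of confidence filtering followed by greedy non-maximum suppression. It is a generic illustration, not the code these nodes run; the function names, the (x1, y1, x2, y2) box format, and the parameter names are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, confidence_threshold, iou_threshold):
    """Return indices of boxes kept after confidence filtering and greedy NMS.

    Hypothetical helper: boxes scoring below confidence_threshold are dropped,
    then boxes are visited from highest to lowest score and a box is kept only
    if its IoU with every already-kept box is at most iou_threshold.
    """
    candidates = [i for i, s in enumerate(scores) if s >= confidence_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
    scores = [0.9, 0.8, 0.7, 0.05]
    # Box 3 fails the confidence check; box 1 overlaps box 0 too much.
    print(filter_detections(boxes, scores, 0.1, 0.5))  # [0, 2]
```

In this sketch, raising the confidence threshold discards more low-scoring detections, while lowering the IoU threshold suppresses overlapping boxes more aggressively.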

Special Features and Capabilities

  • Model Support: Automatically downloads and loads YOLO-World models, and lets you choose CUDA or CPU for EfficientSAM models.
  • Detection and Segmentation: Combines object detection and segmentation in a single workflow for image and video processing.
  • Parameter Configurations: Users can adjust the confidence and IoU thresholds, as well as bounding box and text properties, for finer control over detection and segmentation results.
  • Mask Handling: Supports mask separation and extraction, so specific masks can be output independently or combined.
  • Video and Image Workflow Support: Both video and image processing are supported.
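
The mask handling described above, outputting selected masks separately or merged into one, can be pictured with a short sketch. This is an illustration under assumptions, not the nodes' implementation: masks are represented as lists of rows of 0/1 values, and select_masks with its indices and combine parameters is a hypothetical interface.

```python
def combine_masks(masks):
    """Union of equally sized binary masks (each a list of rows of 0/1)."""
    if not masks or not masks[0]:
        return []
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [int(any(m[y][x] for m in masks)) for x in range(width)]
        for y in range(height)
    ]


def select_masks(masks, indices=None, combine=False):
    """Pick masks by index and optionally merge them into a single mask.

    Hypothetical helper: indices=None selects every mask; combine=True
    returns a one-element list holding the union of the selected masks.
    """
    chosen = list(masks) if indices is None else [masks[i] for i in indices]
    return [combine_masks(chosen)] if combine else chosen


if __name__ == "__main__":
    a = [[1, 0], [0, 0]]
    b = [[0, 0], [0, 1]]
    c = [[0, 1], [0, 0]]
    print(select_masks([a, b, c], indices=[2]))                  # [[[0, 1], [0, 0]]]
    print(select_masks([a, b, c], indices=[0, 1], combine=True))  # [[[1, 0], [0, 1]]]
```

Keeping the masks separate is useful when each detected object is processed on its own; combining them gives a single mask covering every selected object.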

Utility in ComfyUI Workflows

In ComfyUI workflows, these nodes add YOLO-World object detection and EfficientSAM segmentation for both image and video data, enabling content analysis and flexible manipulation of the results. The configurable parameters and the mask operations make them suited to projects that need precise refinement of detection and segmentation output.