ComfyUI-YoloWorld-EfficientSAM
Yoloworld_ESAM_DetectorProvider_Zho
Overview
The Yoloworld_ESAM_DetectorProvider_Zho node is a ComfyUI component for object detection and segmentation using the YOLO-World and EfficientSAM models. It is intended for users who need object detection and segmentation on images or videos within ComfyUI workflows, and it exposes options for the categories to detect, the non-maximum suppression behavior, and optional mask output.
Functionality
This node provides two main functionalities:
- Bounding Box Detection: It uses the YOLO-World model to detect objects within images and provides their bounding boxes.
- Segmentation with Detection: Optionally, it integrates the EfficientSAM model to perform detailed segmentation of detected objects, producing precise masks beyond simple bounding boxes.
Inputs
The node accepts the following inputs:
- YOLO-World Model (yolo_world_model): A required input that selects the pre-trained YOLO-World model used for detection. The model can be one of the supported sizes: l, m, or s.
- Categories (categories): A string input that specifies the objects to be detected, separated by commas. For instance, "cat, dog, person".
- IOU Threshold (iou_threshold): A float input that sets the Intersection over Union (IoU) threshold used during non-maximum suppression. It helps filter out overlapping bounding boxes. The value ranges from 0 to 1.
- Class Agnostic NMS (with_class_agnostic_nms): A boolean input that specifies whether Non-Maximum Suppression (NMS) should ignore class distinctions and consider only box overlap.
- EfficientSAM Model (Optional, esam_model_opt): An optional input that selects the EfficientSAM model for additional segmentation. When provided, detected objects are segmented into masks rather than reported only as bounding boxes.
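To make the categories, iou_threshold, and with_class_agnostic_nms inputs concrete, here is a minimal pure-Python sketch of how a comma-separated category string can be split and how greedy NMS behaves with and without class awareness. The function names (parse_categories, iou, nms) and the (x1, y1, x2, y2) box layout are illustrative assumptions, not the node's actual implementation.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def parse_categories(categories: str) -> List[str]:
    """Split a string such as "cat, dog, person" into clean labels."""
    return [name.strip() for name in categories.split(",") if name.strip()]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        class_ids: Sequence[int], iou_threshold: float,
        class_agnostic: bool) -> List[int]:
    """Greedy NMS; returns indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        suppressed = False
        for j in kept:
            # Class-aware NMS only compares boxes that share a class.
            comparable = class_agnostic or class_ids[i] == class_ids[j]
            if comparable and iou(boxes[i], boxes[j]) > iou_threshold:
                suppressed = True
                break
        if not suppressed:
            kept.append(i)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
class_ids = [0, 1, 0]  # the two overlapping boxes belong to different classes

class_aware = nms(boxes, scores, class_ids, 0.5, class_agnostic=False)
class_agnostic = nms(boxes, scores, class_ids, 0.5, class_agnostic=True)
```

With class-aware NMS the two overlapping boxes both survive because they carry different labels; with class-agnostic NMS the lower-scoring one is dropped. A lower iou_threshold suppresses more aggressively in either mode.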
Outputs
This node produces the following outputs:
- Bounding Box Detector (BBOX_DETECTOR): A detector that provides bounding box outputs for the detected objects using the YOLO-World model.
- Segmentation Detector (SEGM_DETECTOR): When an EfficientSAM model is provided, this output gives access to a segmentation-based detector, which provides masks for detected objects.
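The relationship between the optional input and the two outputs can be sketched as follows. Everything here is hypothetical: the BboxDetector and SegmDetector classes, the provide_detectors function, and in particular the choice to return None for the segmentation detector when no EfficientSAM model is supplied are assumptions made for illustration, since the page does not say what the node emits in that case.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class BboxDetector:
    """Hypothetical stand-in for the BBOX_DETECTOR output."""
    yolo_world_model: str
    categories: str


@dataclass
class SegmDetector:
    """Hypothetical stand-in for the SEGM_DETECTOR output."""
    bbox_detector: BboxDetector
    esam_model: str


def provide_detectors(
    yolo_world_model: str,
    categories: str,
    esam_model_opt: Optional[str] = None,
) -> Tuple[BboxDetector, Optional[SegmDetector]]:
    """Always build a box detector; add a mask detector only if ESAM is given."""
    bbox = BboxDetector(yolo_world_model, categories)
    # Assumption: without an EfficientSAM model there is no segmentation detector.
    segm = SegmDetector(bbox, esam_model_opt) if esam_model_opt is not None else None
    return bbox, segm


boxes_only = provide_detectors("l", "cat, dog")
boxes_and_masks = provide_detectors("l", "cat, dog", esam_model_opt="esam")
```

Downstream code in this sketch would check whether the second element is None before asking for masks.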
Usage in ComfyUI Workflows
In ComfyUI workflows, the Yoloworld_ESAM_DetectorProvider_Zho node can be integrated for robust object detection and segmentation tasks. It is beneficial for workflows that require detailed object understanding and additional mask-based outputs for further processing. Users can utilize this node in conjunction with other ComfyUI nodes to build comprehensive pipelines for image analysis, augmentation, and feature extraction.
Example Workflow:
- Load Image: Use an image input node to load an image into the workflow.
- Model Configuration: Configure the YOLO-World and optionally the EfficientSAM models.
- Run Detection: Use the Yoloworld_ESAM_DetectorProvider_Zho node to detect the desired objects, either as boxes only or as boxes with segmentation masks.
- Output Processing: Use the resulting bounding boxes and/or masks in further nodes for visualization, analysis, or transformation.
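The first three steps can be sketched as a graph in ComfyUI's API-style workflow format, where each node has a class_type and inputs, and a link is written as [source_node_id, output_index]. The provider's input names come from the Inputs section above. The loader class names (Yoloworld_ModelLoader_Zho, ESAM_ModelLoader_Zho), their input names and values, and the image filename are assumptions and should be checked against the installed node pack. The node that consumes the image and the detector outputs is left out because this page does not name it.

```python
workflow = {
    # Step 1: load an image (LoadImage is a ComfyUI core node).
    "1": {"class_type": "LoadImage", "inputs": {"image": "example.png"}},
    # Step 2: model configuration. Class names and inputs below are assumed.
    "2": {"class_type": "Yoloworld_ModelLoader_Zho",
          "inputs": {"yolo_world_model": "yolo_world/l"}},
    "3": {"class_type": "ESAM_ModelLoader_Zho",
          "inputs": {"device": "CUDA"}},
    # Step 3: the detector provider, wired to both loaders.
    "4": {"class_type": "Yoloworld_ESAM_DetectorProvider_Zho",
          "inputs": {
              "yolo_world_model": ["2", 0],
              "categories": "cat, dog, person",
              "iou_threshold": 0.5,
              "with_class_agnostic_nms": False,
              "esam_model_opt": ["3", 0],  # omit for boxes only
          }},
}


def dangling_links(graph):
    """Return (node_id, input_name) pairs whose link points at a missing node."""
    missing = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            is_link = (isinstance(value, list) and len(value) == 2
                       and isinstance(value[0], str) and isinstance(value[1], int))
            if is_link and value[0] not in graph:
                missing.append((node_id, name))
    return missing


problems = dangling_links(workflow)
```

A small check like dangling_links is a convenient way to catch wiring mistakes before submitting a graph; it is a helper written for this sketch, not part of ComfyUI.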
Special Features and Considerations
- Customizable Detection: Users can tune how overlapping boxes are filtered by adjusting the IoU threshold and the class-agnostic NMS option (and a confidence threshold, where the consuming node exposes one), giving control over false positives and duplicate boxes.
- Optional Segmentation: With the optional use of an EfficientSAM model, users can obtain per-object masks in addition to bounding boxes.
- Integration with Impact-Pack: The node is designed to be compatible with the Impact-Pack, which adds further functionality for larger detection tasks.
- Model Size Support: YOLO-World is available in multiple model sizes (l, m, s), allowing users to choose a model based on their use case or computational resources.
In summary, the Yoloworld_ESAM_DetectorProvider_Zho node is a versatile and powerful component of the ComfyUI suite, offering detailed object detection and segmentation capabilities with a focus on efficiency and usability.