Overview

The Generic Detector node detects objects in a video frame by measuring the similarity between text prompts and the image regions that contain objects. It is useful for applications that need to identify and categorize objects in real time.

Inputs & Outputs

Inputs: 1, Media Format: Raw Video
Outputs: 1, Media Format: Raw Video
Output Metadata: Objects

Properties

Property	Description	Type	Default	Required
model_id	The type of model to use. Options: Yolo-Small (yolov8s-world), Yolo-Medium (yolov8m-world), Yolo-Large (yolov8l-world), OWL-Medium (owlvit-base-patch32).	enum	yolov8s-world	Yes
class_list	A comma-separated list of objects to detect, each optionally followed by an alternate label in the form prompt=label. For example: a person=person, a blue car=blue_car.	string	null	Yes
interval	The inference interval: run inference on every nth frame. A value of 1 runs inference on every frame. The minimum value is 1.	number	1	Yes
confidence_threshold	This property allows you to override the default minimum inference threshold for all classes. The minimum value is 0, the maximum value is 1.0, the step is 0.1, and the scale is 0.1.	slider-optional	0.1	No
per_class_thresholds	A comma-separated list of per-class thresholds. Leave this field empty to use the default threshold for all classes. For example, 0.1,0.2,0.3.	string	null	No
iou_threshold	This property allows you to increase the threshold to reduce potential duplicate detections of a single object. The minimum value is 0, the maximum value is 1.0, the step is 0.1, and the scale is 0.1.	slider-optional	0.5	No
min_object_size	The minimum size of the object to detect. The size should be specified as width x height.	string	null	No
enable_max_optimizations	Enable advanced optimizations to improve performance. This currently increases deployment start time.	bool	false	No
clear_cache	Set to true to clear model cache. This will increase deployment start time.	bool	false	No
source_objects	Look for objects only within objects of these types. Leave empty to detect objects within the entire frame. For example: `car`, `person`, `car.red`.	string	null	No
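
The string-valued properties above follow simple formats, and the sketch below shows one way they could be interpreted. It is an illustration only, not the node's actual implementation. The helper names (`parse_class_list`, `parse_thresholds`, `parse_min_object_size`, `should_infer`) are hypothetical, and several behaviors are assumptions: that a prompt without `=label` uses the prompt itself as its label, that per-class thresholds are matched to classes by position, that `min_object_size` looks like `32x24`, and that frames are counted from 0.

```python
def parse_class_list(class_list):
    """Map each text prompt to its output label.

    Assumption: an entry without '=label' uses the prompt as its label.
    """
    mapping = {}
    for entry in class_list.split(","):
        entry = entry.strip()
        if not entry:
            continue
        prompt, sep, label = entry.partition("=")
        prompt = prompt.strip()
        mapping[prompt] = label.strip() if sep else prompt
    return mapping


def parse_thresholds(per_class_thresholds, num_classes, default):
    """Return one threshold per class.

    Assumption: thresholds are matched to class_list entries by position.
    An empty value falls back to the default for every class.
    """
    if not per_class_thresholds or not per_class_thresholds.strip():
        return [default] * num_classes
    values = [float(v) for v in per_class_thresholds.split(",")]
    if len(values) != num_classes:
        raise ValueError("expected one threshold per class")
    return values


def parse_min_object_size(size):
    """Parse a 'width x height' string such as '32x24' into (width, height)."""
    width, height = size.lower().split("x")
    return int(width), int(height)


def should_infer(frame_index, interval):
    """True on every nth frame (assumption: frames are counted from 0)."""
    return frame_index % interval == 0


classes = parse_class_list("a person=person, a blue car=blue_car")
thresholds = parse_thresholds("0.1,0.2", len(classes), default=0.1)
```

Here `classes` maps each prompt to the label reported in the output, and `thresholds` holds one confidence cutoff per class in the same order.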

Metadata

The output metadata includes information about the detected objects within the frame. This includes object types, confidence scores, and bounding box coordinates.
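
As a rough illustration of how a downstream consumer might use this metadata, the sketch below filters detected objects by confidence. The field names (`label`, `confidence`, `bbox`) and the sample values are hypothetical placeholders chosen for this example. They are not the node's documented schema, so check the actual output before relying on them.

```python
def filter_objects(objects, min_confidence):
    """Keep only objects whose confidence meets the cutoff."""
    return [obj for obj in objects if obj["confidence"] >= min_confidence]


# Hypothetical sample metadata; field names and values are assumptions.
sample_objects = [
    {"label": "person", "confidence": 0.82, "bbox": [10, 20, 110, 220]},
    {"label": "blue_car", "confidence": 0.07, "bbox": [200, 40, 360, 150]},
]

kept = filter_objects(sample_objects, min_confidence=0.1)
kept_labels = [obj["label"] for obj in kept]
```

With the default `confidence_threshold` of 0.1, only the first sample object would pass this filter.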
