OCR

Perform OCR on objects in an ROI, or on an ROI in the frame.

Overview

The OCR node performs Optical Character Recognition (OCR) either on objects within a Region of Interest (ROI) or on an ROI within the frame. Use it when an application needs to recognize and extract text from video feeds.

Properties

| Property | Description | Type | Default | Required |
| --- | --- | --- | --- | --- |
| roi_labels | Regions of interest labels | hidden | | Yes |
| rois | Regions of interest. Conditional on roi_labels. | polygon | null | Yes |
| processing_mode | Processing mode. Options: ROIs, at an Interval (rois_interval); ROIs, upon a Trigger (rois_trigger); Objects in an ROI (objects). | enum | rois_interval | Yes |
| trigger | Queue ROI for OCR when this condition evaluates to true. Conditional on processing_mode being rois_trigger. | trigger-condition | null | Yes |
| objects_to_process | Object types to process. ex. car,person,car.red. Conditional on processing_mode being objects. | model-label | null | Yes |
| min_obj_size_pixels | Min. width and height of an object. Conditional on processing_mode being objects. | number | 64 | Yes |
| obj_lookup_size_change_threshold | Object size ratio change threshold. Slider min: 0.1, max: 2.0, step: 0.2. Conditional on processing_mode being objects. | slider | 0.1 | Yes |
| max_lookups_per_obj | Max. OCR attempts per object. Conditional on processing_mode being objects. | number | 5 | Yes |
| group_ocr_results | Group OCR texts? | bool | false | No |
| ocr_match_pattern | OCR match pattern. Only retain OCR results that match this Regular Expression pattern. Leave blank to keep unfiltered results. | string | null | No |
| min_confidence | OCR confidence threshold. Slider min: 0, max: 1, step: 0.05. | slider | 0.7 | Yes |
| additional_orientations | Comma-separated string with additional orientations to consider for OCR. Valid orientation values: 90, 180, 270. This is useful for recognizing text that is rotated at 90, 180, or 270 degrees. | enum-multi | null | No |
| ocr_interval | OCR lookups interval in seconds. | number | 1 | No |
| display_roi | Display ROI on video? | bool | true | No |
| display_objinfo | Display OCR info on video? Options: Disabled, Bottom left, Bottom right, Top left, Top right. | enum | bottom_left | No |
| debug | Log debugging information? | bool | false | No |
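
The combined effect of ocr_match_pattern and min_confidence can be sketched in Python. This is an illustration under stated assumptions, not the node's actual implementation: the `(text, confidence)` result shape, the `filter_ocr_results` helper, the inclusive threshold, and the use of `re.search` (match anywhere in the text) are all hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical result shape: (recognized text, confidence between 0 and 1).
# The node's internal result format is not documented here.
OcrResult = Tuple[str, float]


def filter_ocr_results(
    results: List[OcrResult],
    min_confidence: float = 0.7,
    ocr_match_pattern: Optional[str] = None,
) -> List[OcrResult]:
    """Keep results that meet the confidence threshold and match the pattern.

    A blank or missing pattern keeps all results that pass the threshold.
    """
    pattern = re.compile(ocr_match_pattern) if ocr_match_pattern else None
    kept: List[OcrResult] = []
    for text, confidence in results:
        # Assumption: the threshold is inclusive.
        if confidence < min_confidence:
            continue
        # Assumption: the pattern may match anywhere in the text.
        if pattern is not None and pattern.search(text) is None:
            continue
        kept.append((text, confidence))
    return kept


sample = [("123-45", 0.92), ("EXIT", 0.95), ("678-90", 0.40)]
filtered = filter_ocr_results(
    sample, min_confidence=0.7, ocr_match_pattern=r"\d{3}-\d{2}"
)
print(filtered)
```

In this sketch "EXIT" is dropped because it does not match the pattern, and "678-90" is dropped because its confidence is below the threshold.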

Output Metadata

| Metadata Property | Description |
| --- | --- |
| nodes.[node_id].rois.[roi_id].label_changed_delta | Indicates if there has been a change in the label of the ROI. |
| nodes.[node_id].rois.[roi_id].label_available | Indicates if a label is available for the ROI. |
| nodes.[node_id].recognized_obj_count | The count of recognized objects. |
| nodes.[node_id].recognized_obj_delta | The change in the count of recognized objects. |
| nodes.[node_id].label_changed_obj_delta | The change in the count of objects with changed labels. |
| nodes.[node_id].unrecognized_obj_count | The count of unrecognized objects. |
| nodes.[node_id].unrecognized_obj_delta | The change in the count of unrecognized objects. |

Example metadata:

{
  "nodes": {
    "[node_id]": {
      "rois": {
        "[roi_id]": {
          "label_changed_delta": true,
          "label_available": true
        }
      },
      "recognized_obj_count": 5,
      "recognized_obj_delta": 1,
      "label_changed_obj_delta": 2,
      "unrecognized_obj_count": 0,
      "unrecognized_obj_delta": 0
    }
  }
}
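
A downstream consumer might read these fields as follows. This is a minimal sketch that assumes the metadata is available as a JSON string with the structure shown above. The node ID `ocr1`, the ROI ID `plate_roi`, and the `summarize_ocr_metadata` helper are hypothetical names used only for illustration.

```python
import json
from typing import Any, Dict


def summarize_ocr_metadata(metadata: Dict[str, Any], node_id: str) -> Dict[str, Any]:
    """Collect the OCR node's per-ROI flags and object counts into one dict."""
    node = metadata.get("nodes", {}).get(node_id, {})
    rois = node.get("rois", {})
    return {
        # ROI IDs for which a label is currently available.
        "rois_with_label": sorted(
            roi_id for roi_id, roi in rois.items() if roi.get("label_available")
        ),
        # ROI IDs whose label changed.
        "rois_with_changed_label": sorted(
            roi_id for roi_id, roi in rois.items() if roi.get("label_changed_delta")
        ),
        "recognized": node.get("recognized_obj_count", 0),
        "unrecognized": node.get("unrecognized_obj_count", 0),
    }


# Hypothetical payload with the same shape as the example above.
raw = """
{
  "nodes": {
    "ocr1": {
      "rois": {
        "plate_roi": {"label_changed_delta": true, "label_available": true}
      },
      "recognized_obj_count": 5,
      "recognized_obj_delta": 1,
      "label_changed_obj_delta": 2,
      "unrecognized_obj_count": 0,
      "unrecognized_obj_delta": 0
    }
  }
}
"""

summary = summarize_ocr_metadata(json.loads(raw), "ocr1")
print(summary)
```

Using `.get` with defaults keeps the helper tolerant of frames where the node has not yet produced some of the fields.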

Example Usage

An example of how to use the OCR node in a video processing pipeline:

<VideoProcessingPipeline>
  <OCR
    roi_labels="['license_plate']"
    rois="[{ x: 10, y: 20, width: 100, height: 50 }]"
    processing_mode="objects"
    objects_to_process="['car']"
    min_obj_size_pixels={64}
    obj_lookup_size_change_threshold={0.3}
    max_lookups_per_obj={3}
    group_ocr_results={true}
    ocr_match_pattern="\d{3}-\d{2}"
    min_confidence={0.85}
    ocr_interval={5}
    display_roi={true}
    display_objinfo="bottom_left"
    debug={false}
  />
</VideoProcessingPipeline>
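
The Properties table marks several properties as conditional on processing_mode. The sketch below flags conditional properties that are set for a mode they do not apply to. The mapping is taken from the table, but the `inapplicable_properties` helper and the plain-dictionary configuration format are assumptions made for illustration and are not part of the node's API.

```python
from typing import Any, Dict, List

# Conditional properties per processing_mode, as listed in the Properties table.
CONDITIONAL_PROPERTIES: Dict[str, List[str]] = {
    "rois_interval": [],
    "rois_trigger": ["trigger"],
    "objects": [
        "objects_to_process",
        "min_obj_size_pixels",
        "obj_lookup_size_change_threshold",
        "max_lookups_per_obj",
    ],
}


def inapplicable_properties(config: Dict[str, Any]) -> List[str]:
    """Return conditional properties that are set but unused in the chosen mode."""
    # rois_interval is the documented default for processing_mode.
    mode = config.get("processing_mode", "rois_interval")
    if mode not in CONDITIONAL_PROPERTIES:
        raise ValueError(f"unknown processing_mode: {mode!r}")
    allowed = set(CONDITIONAL_PROPERTIES[mode])
    all_conditional = {
        name for names in CONDITIONAL_PROPERTIES.values() for name in names
    }
    return sorted(
        name for name in config if name in all_conditional and name not in allowed
    )


# Hypothetical configuration: objects mode with a trigger also set.
config = {
    "processing_mode": "objects",
    "trigger": "motion_detected",
    "objects_to_process": "car",
    "max_lookups_per_obj": 3,
}
print(inapplicable_properties(config))  # ['trigger']
```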