Bulk Clip Processing Guide

This guide walks you through using Lumeo to process a large number of files in a scalable manner.

Overview

Bulk clip processing is a common scenario that Lumeo supports directly. Typical use cases include video search and indexing, alerting, and extracting metadata for reporting or dashboards.

There are three key steps to bulk process clips:

  1. Build and test the pipeline that will process each clip
  2. Set up one or more gateways in your workspace that will process these clips
  3. Queue clips for processing using the deployment_queues API, the Universal Bridge, or the lumeo-bulk-deploy script

1. Build and Test Your Pipeline

Any pipeline can process a clip, so this step is no different from building and testing a typical pipeline. See the Getting Started Guide as a starting point.

2. Set Up Gateways

The preferred way to process a large number of clips is to set up Lumeo Gateways in the cloud as a Kubernetes cluster. You can also request Lumeo-managed Cloud Gateways in your account, which saves you from setting up your own.

Running gateways in the cloud lets you scale gateway capacity up and down to match your processing volume. If you are only experimenting with this capability, note that bulk processing works with any gateway in your workspace.

To set up Lumeo Gateways as a Kubernetes cluster, follow the Gateway Setup Guide or the GCP - Kubernetes Guide. A guide for Azure is coming soon.

📘

Recommendation: Set up a separate workspace for clip processing

We recommend setting up a separate workspace to house the Gateways to be used for bulk processing, so that you have complete control over which gateways are used for processing.

The deployment_queues API used to queue clips in the next step picks the first available gateway in the workspace, regardless of whether it is an edge device or in the cloud. You generally do not want those clips routed to a gateway that is being used for other tasks.

You can control the maximum number of clips each Gateway processes concurrently by setting the Max deployments property in Gateway settings.
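As a rough planning aid, the relationship between the number of gateways, the Max deployments setting, and the time needed to work through a queue can be sketched as simple arithmetic. The helper name, the assumption that every clip takes about the same time to process, and the sample numbers are all illustrative; they are not Lumeo-provided figures.

```python
import math


def estimate_drain_minutes(num_clips, num_gateways, max_deployments, minutes_per_clip):
    """Rough time to work through a queue, assuming uniform clip durations.

    Hypothetical helper: it ignores upload time, scheduling overhead and
    gateway start-up time, so treat the result as an optimistic estimate.
    """
    if num_gateways < 1 or max_deployments < 1:
        raise ValueError("need at least one gateway and one deployment slot")
    # Total clips that can be processed at the same time across all gateways.
    concurrent_slots = num_gateways * max_deployments
    # Clips are processed in "waves" of at most concurrent_slots clips.
    waves = math.ceil(num_clips / concurrent_slots)
    return waves * minutes_per_clip


# Example: 1000 clips, 4 gateways, Max deployments = 5, about 3 minutes per clip.
print(estimate_drain_minutes(1000, 4, 5, 3))  # prints 150
```

Raising either the gateway count or Max deployments shortens the estimate, but only the gateway's hardware determines how many concurrent deployments it can actually sustain.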

3. Process clips

Process Clips using API

The last step is to queue clips to be processed when Gateway capacity is available. You can do this using the deployment_queues API. Each Workspace comes with a default deployment queue that will create a deployment from a queued entry on a first-in-first-out basis, on the least utilized gateway.
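The queueing behaviour described above (first-in-first-out entries, each placed on the least utilized gateway that has capacity) can be pictured with a toy model. This is only an illustration of the idea under simplifying assumptions; it is not Lumeo's scheduler, and the function, gateway names and clip names are hypothetical.

```python
from collections import deque


def dispatch(queue_entries, gateways, max_deployments):
    """Assign queued entries FIFO to the least utilized gateway with free capacity.

    Toy model only. Returns (assignments, still_queued).
    """
    pending = deque(queue_entries)
    load = {name: 0 for name in gateways}
    assignments = []
    while pending:
        # Gateways that still have a free deployment slot.
        free = [g for g in gateways if load[g] < max_deployments]
        if not free:
            break  # remaining entries wait until capacity frees up
        # Least utilized gateway; ties go to the first one listed.
        target = min(free, key=lambda g: load[g])
        assignments.append((pending.popleft(), target))
        load[target] += 1
    return assignments, list(pending)


clips = ["clip-1", "clip-2", "clip-3", "clip-4", "clip-5"]
assigned, waiting = dispatch(clips, ["gw-a", "gw-b"], max_deployments=2)
print(assigned)
print(waiting)  # prints ['clip-5']
```

With two gateways and two slots each, the fifth clip stays queued until a running deployment finishes, which is why queue drain time depends on total gateway capacity.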

Pipelines that generate alerts will do so when the deployment runs. Other outputs such as clips generated using the Save Clip Node and Save Snapshot Node can be accessed using the files API, or via your own S3 bucket.

The API recipe below shows how to queue deployments and access any output files.

Process Clips using Universal Bridge

Lumeo's Universal Bridge lets you upload clips to Lumeo's cloud using SMTP, FTP, or scripts.

Uploading clips using the Universal Bridge creates a virtual camera, which can then be configured in the Console with a specific pipeline and camera-specific pipeline overrides. This pipeline is then deployed for each new clip that is uploaded.

Learn more: Universal Bridge

Process Clips using a script

The lumeo-bulk-deploy script lets you deploy pipelines using files from local storage, URLs, or S3 buckets.

Start by installing the Python package that contains the script:

pip install lumeo or pipx install lumeo

usage: lumeo-bulk-deploy [-h] --app_id APP_ID --token TOKEN [--pattern PATTERN] [--file_list FILE_LIST] [--csv_file CSV_FILE] [--s3_bucket S3_BUCKET] [--s3_access_key_id S3_ACCESS_KEY_ID]
                         [--s3_secret_access_key S3_SECRET_ACCESS_KEY] [--s3_region S3_REGION] [--s3_endpoint_url S3_ENDPOINT_URL] [--s3_prefix S3_PREFIX] [--tag TAG] [--camera_id CAMERA_ID]
                         [--camera_external_id CAMERA_EXTERNAL_ID] [--pipeline_id PIPELINE_ID] [--deployment_config DEPLOYMENT_CONFIG] [--deployment_prefix DEPLOYMENT_PREFIX]
                         [--delete_processed DELETE_PROCESSED] [--log_level LOG_LEVEL] [--batch_size BATCH_SIZE] [--queue_size]

Lumeo Bulk Deployer uploads media files to Lumeo cloud, (optionally) associates them with a virtual camera, and queues them for processing. Learn more at https://docs.lumeo.com/docs/universal-bridge

options:
  -h, --help            show this help message and exit

Authentication Args:
  --app_id APP_ID       Application (aka Workspace) ID
  --token TOKEN         Access (aka API) Token.

Source Files (one of pattern, file_list, csv_file, s3_bucket or tag is required):
  --pattern PATTERN     Glob pattern for files to upload
  --file_list FILE_LIST
                        Comma separated list of file URIs to queue
  --csv_file CSV_FILE   CSV file containing file_uri and corresponding camera_external_id or camera_id
  --s3_bucket S3_BUCKET
                        S3 bucket name to use as source for files
  --s3_access_key_id S3_ACCESS_KEY_ID
                        S3 Access key ID
  --s3_secret_access_key S3_SECRET_ACCESS_KEY
                        S3 secret access key
  --s3_region S3_REGION
                        S3 region if using AWS S3 bucket. Either s3_region or s3_endpoint_url must be specified.
  --s3_endpoint_url S3_ENDPOINT_URL
                        S3 endpoint URL. Either s3_region or s3_endpoint_url must be specified.
  --s3_prefix S3_PREFIX
                        S3 path prefix to filter files. Optional.
  --tag TAG             Tag to apply to uploaded files. Can be tag uuid, tag name or tag path (e.g. "tag1/tag2/tag3").If specified without pattern/file_list/csv_file/s3_bucket, will process existing files with
                        that tag.

Associate with Camera (gets pipeline & deployment config from camera):
  --camera_id CAMERA_ID
                        Camera ID of an existing camera, to associate with the uploaded files
  --camera_external_id CAMERA_EXTERNAL_ID
                        Use your own unique camera id to find or create a virtual camera, and associate with the uploaded files

Deployment Args (applied only when camera not specified):
  --pipeline_id PIPELINE_ID
                        Pipeline ID to queue deployment for processing. Required if camera_id / camera_external_id not specified.
  --deployment_config DEPLOYMENT_CONFIG
                        String containing a Deployment config JSON object. Video source in the config will be overridden by source files specified in this script. Ignored if camera_id or camera_external_id
                        specified. Optional.

General Args:
  --deployment_prefix DEPLOYMENT_PREFIX
                        Prefix to use for deployment name. Optional.
  --delete_processed DELETE_PROCESSED
                        Delete successfully processed files from the local folder after uploading
  --log_level LOG_LEVEL
                        Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
  --batch_size BATCH_SIZE
                        Number of concurrent uploads to process at a time. Default 5.
  --queue_size          Print the current queue size

In the following examples, you specify a set of files, the pipeline and deployment configurations as script arguments.

Upload local files using a pattern

Uploads all files that match the pattern and deploys a specific pipeline.

lumeo-bulk-deploy --app_id 'd413586b-0ccb-4aaa-9fdf-3df7404f716d' --token 'xxxxxxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --pattern '/Users/username/media/lumeo-*.mp4'

Uploads all files that match the pattern, deploys a specific pipeline, and tags all uploaded files and the resulting deployments.

lumeo-bulk-deploy --app_id 'd413586b-0ccb-4aaa-9fdf-3df7404f716d' --token 'xxxxxxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --pattern '/Users/username/media/lumeo-*.mp4' --tag 'bulk-uploads/2024-07-31'

Uploads all files that match the pattern and deploys a specific pipeline with a deployment config override.

lumeo-bulk-deploy --app_id 'd413586b-0ccb-4aaa-9fdf-3df7404f716d' --token 'xxxxxxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --pattern '/Users/username/media/lumeo-*.mp4' --deployment_config='{"overlay_meta2": {"text": "my-test-run","show_frame_count":true}}'
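Hand-writing the JSON for --deployment_config inside shell quotes is easy to get wrong. As a minimal sketch, the command can instead be assembled in Python, letting json.dumps produce the JSON and shlex.join handle shell quoting. The build_command helper is hypothetical; it only uses flags that appear in the usage text above, and the IDs are the placeholder values from the examples.

```python
import json
import shlex


def build_command(app_id, token, pipeline_id, pattern, deployment_config=None):
    """Return an argv list for lumeo-bulk-deploy (hypothetical helper)."""
    argv = [
        "lumeo-bulk-deploy",
        "--app_id", app_id,
        "--token", token,
        "--pipeline_id", pipeline_id,
        "--pattern", pattern,
    ]
    if deployment_config is not None:
        # json.dumps guarantees valid JSON (double quotes, lowercase true/false).
        argv += ["--deployment_config", json.dumps(deployment_config)]
    return argv


argv = build_command(
    "d413586b-0ccb-4aaa-9fdf-3df7404f716d",
    "xxxxxxx",
    "ee55c234-b3d5-405f-b904-cfb2bd6f2e06",
    "/Users/username/media/lumeo-*.mp4",
    {"overlay_meta2": {"text": "my-test-run", "show_frame_count": True}},
)
# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
# To run it directly instead: subprocess.run(argv, check=True)
```

Passing the argument list to subprocess.run avoids the shell entirely, so the glob pattern and the JSON reach the script unmodified.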

Upload self-hosted files using a list

Creates input streams for URLs in the list and deploys with a specific pipeline.

lumeo-bulk-deploy --app_id 'd413586b-0ccb-4aaa-9fdf-3df7404f716d' --token 'xxxxxxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --file_list 'https://assets.lumeo.com/media/parking_lot/mall-parking-1.mp4,https://assets.lumeo.com/media/sample/sample-people-car-traffic.mp4'

Upload self-hosted or local files using a CSV manifest

Creates input streams for URLs and uploads local files listed in the CSV file, then deploys each with the pipeline and deployment config specified for it in the CSV file.

lumeo-bulk-deploy --app_id 'd413586b-0ccb-4aaa-9fdf-3df7404f716d' --token 'xxxxxxx' --csv_file ./manifest.csv

Creates input streams for URLs and uploads local files listed in the CSV file, then deploys each with the pipeline and deployment config specified for it in the CSV file, falling back to the command-line options where a row does not specify them.

lumeo-bulk-deploy --app_id 'd413586b-0ccb-4aaa-9fdf-3df7404f716d' --token 'xxxxxxx' --csv_file ./manifest.csv --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --deployment_config '{"overlay_meta2": {"text": "my-test-run-default","show_frame_count":false}}'

CSV format (note the double quoted JSON when specifying deployment config in the CSV file):

file_uri, camera_external_id, camera_id, pipeline_id, deployment_config
/Users/devarshi/Downloads/warehouse2.mp4,,,ee55c234-b3d5-405f-b904-cfb2bd6f2e06
https://assets.lumeo.com/media/parking_lot/mall-parking-1.mp4,,,ee55c234-b3d5-405f-b904-cfb2bd6f2e06
https://storage.googleapis.com/lumeo-public-media/samples/mall-guest-svcs.mp4,,,ee55c234-b3d5-405f-b904-cfb2bd6f2e06
https://storage.googleapis.com/lumeo-public-media/demos/warehouse5.mp4,,,ee55c234-b3d5-405f-b904-cfb2bd6f2e06,"{""overlay_meta2"": {""text"": ""my-test-run"",""show_frame_count"":true}}"
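The doubled quotes around the JSON are standard CSV escaping, so it is usually safer to generate the manifest than to type it by hand. The sketch below uses Python's csv module, which applies that escaping automatically. The write_manifest helper is hypothetical, the column names are taken from the CSV format above, and writing the header without spaces after the commas is an assumption about what the script accepts.

```python
import csv
import io
import json

# Column names taken from the CSV format shown above.
FIELDS = ["file_uri", "camera_external_id", "camera_id", "pipeline_id", "deployment_config"]


def write_manifest(rows, out):
    """Write manifest rows to a file-like object (hypothetical helper).

    Each row is a dict; missing columns are left empty, and a dict-valued
    deployment_config is serialized to a JSON string first.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        config = row.get("deployment_config")
        if isinstance(config, dict):
            row["deployment_config"] = json.dumps(config)
        writer.writerow(row)  # the csv module doubles embedded quotes for us


rows = [
    {
        "file_uri": "https://assets.lumeo.com/media/parking_lot/mall-parking-1.mp4",
        "pipeline_id": "ee55c234-b3d5-405f-b904-cfb2bd6f2e06",
    },
    {
        "file_uri": "https://storage.googleapis.com/lumeo-public-media/demos/warehouse5.mp4",
        "pipeline_id": "ee55c234-b3d5-405f-b904-cfb2bd6f2e06",
        "deployment_config": {"overlay_meta2": {"text": "my-test-run", "show_frame_count": True}},
    },
]

buffer = io.StringIO()
write_manifest(rows, buffer)
print(buffer.getvalue())
```

To produce a file for --csv_file, pass an open file instead of the StringIO buffer, for example open("manifest.csv", "w", newline="").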

Upload self-hosted files from an S3 bucket

Creates input streams for signed S3 URLs from your S3 bucket, and deploys with a specific pipeline, tagging them in the process.

lumeo-bulk-deploy --s3_endpoint_url='https://sfo2.digitaloceanspaces.com' --s3_prefix=universal-bridge-testing --s3_bucket=lumeo-test --s3_access_key_id=xxxx --s3_secret_access_key='xxxx' --app_id=bc655947-45da-43cb-a254-a3a5e69ec084 --token='xxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --tag 's3-file-uploads/2024-03-01/run1'
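Typing credentials directly into the command leaves them in shell history. One possible approach, sketched below, is to read them from environment variables and build the argument list in Python. The build_s3_command helper and the environment variable names (LUMEO_APP_ID, LUMEO_TOKEN, S3_ACCESS_KEY_ID, S3_SECRET_ACCESS_KEY) are assumptions made for this example, not names the script reads itself; only flags from the usage text above are used. The values are still passed to the script as arguments, so they remain visible in the process list while it runs.

```python
import os


def build_s3_command(bucket, pipeline_id, region=None, endpoint_url=None,
                     prefix=None, tag=None, env=os.environ):
    """Return an argv list for an S3-sourced run (hypothetical helper)."""
    # The usage text requires one of s3_region or s3_endpoint_url.
    if not (region or endpoint_url):
        raise ValueError("either region or endpoint_url must be specified")
    argv = [
        "lumeo-bulk-deploy",
        "--app_id", env["LUMEO_APP_ID"],
        "--token", env["LUMEO_TOKEN"],
        "--pipeline_id", pipeline_id,
        "--s3_bucket", bucket,
        "--s3_access_key_id", env["S3_ACCESS_KEY_ID"],
        "--s3_secret_access_key", env["S3_SECRET_ACCESS_KEY"],
    ]
    # Optional flags are only added when a value was given.
    optional = [
        ("--s3_region", region),
        ("--s3_endpoint_url", endpoint_url),
        ("--s3_prefix", prefix),
        ("--tag", tag),
    ]
    for flag, value in optional:
        if value:
            argv += [flag, value]
    return argv


# Demo with a stand-in environment; in practice omit env= to use os.environ.
demo_env = {
    "LUMEO_APP_ID": "bc655947-45da-43cb-a254-a3a5e69ec084",
    "LUMEO_TOKEN": "xxx",
    "S3_ACCESS_KEY_ID": "xxxx",
    "S3_SECRET_ACCESS_KEY": "xxxx",
}
s3_argv = build_s3_command(
    "lumeo-test",
    "ee55c234-b3d5-405f-b904-cfb2bd6f2e06",
    endpoint_url="https://sfo2.digitaloceanspaces.com",
    prefix="universal-bridge-testing",
    tag="s3-file-uploads/2024-03-01/run1",
    env=demo_env,
)
print(s3_argv)
# To run it: subprocess.run(s3_argv, check=True)
```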

Process existing Lumeo cloud files tagged with a specific tag

Queues all files in Lumeo Cloud that carry the specified tag for processing with the given pipeline ID and, optionally, a deployment configuration.

lumeo-bulk-deploy --app_id=bc655947-45da-43cb-a254-a3a5e69ec084 --token='xxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --tag 'test-bench/person-model-testing' --deployment_config '{"overlay_meta2": {"text": "person-model-testing-2024-08-18","show_frame_count":false}}'

lumeo-bulk-deploy --app_id=bc655947-45da-43cb-a254-a3a5e69ec084 --token='xxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --tag 'test-bench/person-model-testing'

Billing and Performance Considerations

Clips that are uploaded to Lumeo's cloud and any media artifacts generated by the pipeline count towards your Lumeo cloud storage usage.