Bulk Clip Processing Guide
This guide walks you through using Lumeo to process a large number of files in a scalable manner.
Overview
Processing clips in bulk is a common scenario that Lumeo makes straightforward. Typical use cases include video search and indexing, alerting, and extracting metadata for reporting or dashboards.
There are three key steps to bulk process clips:
- Build and test the pipeline that will process each clip
- Set up one or more gateways in your workspace that will process these clips
- Queue clips to be processed using the deployment_queues API, the Universal Bridge, or the lumeo-bulk-deploy script
1. Build and Test Your Pipeline
Any pipeline can process a clip, so this step is no different from building and testing a typical pipeline. See the Getting Started Guide as a starting point.
2. Set Up Gateways
The preferred way to process a large number of clips is to set up Lumeo Gateways in the cloud as a Kubernetes cluster. You can also request Lumeo-managed Cloud Gateways in your account to save you from setting up your own.
This allows you to scale gateway capacity up and down easily to match your processing volume. If you are just experimenting, note that bulk processing works with any gateway in your workspace.
To set up Lumeo Gateways as a Kubernetes cluster, follow the Gateway Setup Guide or the GCP - Kubernetes Guide. A guide for Azure is coming soon too.
Recommendation: Set up a separate workspace for clip processing
We recommend setting up a separate workspace to house the Gateways to be used for bulk processing, so that you have complete control over which gateways are used for processing.
The deployment_queues API used to queue clips for processing in the next step picks the first available gateway in the workspace, whether it is an edge device or a cloud gateway. You generally do not want those clips routed to a gateway that is being used for other tasks.
You can control the maximum number of clips each Gateway processes concurrently by setting the Max deployments property in Gateway settings.
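To get a feel for how gateway count and the Max deployments setting affect throughput, the arithmetic can be sketched as below. This is a rough back-of-the-envelope estimate, not a Lumeo API: the function name, the uniform per-clip processing time, and the example numbers are all assumptions made for illustration.

```python
import math

def estimate_drain_minutes(num_clips, num_gateways, max_deployments, minutes_per_clip):
    """Rough wall-clock estimate for processing a backlog of clips.

    Assumes every gateway runs `max_deployments` clips at once and that each
    clip takes about `minutes_per_clip` to process (illustrative assumptions).
    """
    if min(num_gateways, max_deployments) < 1:
        raise ValueError("need at least one gateway and one deployment slot")
    concurrent_slots = num_gateways * max_deployments
    # Clips are processed in "waves" of at most `concurrent_slots` at a time.
    waves = math.ceil(num_clips / concurrent_slots)
    return waves * minutes_per_clip

# Made-up numbers: 1,000 clips, 4 gateways, 5 slots each, 3 minutes per clip.
print(estimate_drain_minutes(1000, 4, 5, 3))  # 50 waves of 3 minutes -> 150
```

Real clips vary in length and gateways vary in capacity, so treat the result as an order-of-magnitude guide when sizing a cluster.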
3. Process clips
Process Clips using API
The last step is to queue clips to be processed when Gateway capacity is available. You can do this using the deployment_queues API. Each Workspace comes with a default deployment queue that will create a deployment from a queued entry on a first-in-first-out basis, on the least utilized gateway.
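The first-in-first-out, least-utilized behavior described above can be pictured with a toy model. This is only an illustration of the idea, not Lumeo's scheduler: the `dispatch` function, the gateway names, and the single shared `max_deployments` limit are assumptions.

```python
from collections import deque

def dispatch(queue, gateways, max_deployments):
    """Assign queued clips to gateways, FIFO, preferring the least utilized one.

    `gateways` maps a gateway name to its number of running deployments.
    Returns the (clip, gateway) assignments made; clips that do not fit
    stay in `queue` until capacity frees up.
    """
    assignments = []
    while queue:
        # Least utilized gateway = fewest running deployments (ties: by name).
        name = min(gateways, key=lambda g: (gateways[g], g))
        if gateways[name] >= max_deployments:
            break  # every gateway is full; remaining clips keep waiting
        clip = queue.popleft()  # first in, first out
        gateways[name] += 1
        assignments.append((clip, name))
    return assignments

pending = deque(["a.mp4", "b.mp4", "c.mp4", "d.mp4"])
running = {"gw-1": 1, "gw-2": 0}
print(dispatch(pending, running, max_deployments=2))
print(list(pending))  # clips still waiting for capacity
```

In this toy run, three clips are placed and the fourth stays queued until a deployment finishes.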
Pipelines that generate alerts will do so when the deployment runs. Other outputs, such as clips and snapshots generated by the Save Clip Node and Save Snapshot Node, can be accessed using the files API or via your own S3 bucket.
The API recipe below shows you how to queue deployments and access any output files.
Process Clips using Universal Bridge
Lumeo's Universal Bridge lets you upload clips to Lumeo's cloud using SMTP, FTP, or scripts.
Uploading clips using the Universal Bridge creates a virtual camera which can then be configured with a specific pipeline & camera-specific pipeline overrides using the Console. This pipeline is then deployed for each new clip that is uploaded.
Learn more: Universal Bridge
Process Clips using a script
The lumeo-bulk-deploy script lets you deploy pipelines using files from local storage, URLs, or S3 buckets.
Start by installing the Python package that contains the script:
pip install lumeo
or pipx install lumeo
usage: lumeo-bulk-deploy [-h] --app_id APP_ID --token TOKEN [--pattern PATTERN] [--file_list FILE_LIST] [--csv_file CSV_FILE] [--s3_bucket S3_BUCKET] [--s3_access_key_id S3_ACCESS_KEY_ID]
[--s3_secret_access_key S3_SECRET_ACCESS_KEY] [--s3_region S3_REGION] [--s3_endpoint_url S3_ENDPOINT_URL] [--s3_prefix S3_PREFIX] [--tag TAG] [--camera_id CAMERA_ID]
[--camera_external_id CAMERA_EXTERNAL_ID] [--pipeline_id PIPELINE_ID] [--deployment_config DEPLOYMENT_CONFIG] [--deployment_prefix DEPLOYMENT_PREFIX]
[--delete_processed DELETE_PROCESSED] [--log_level LOG_LEVEL] [--batch_size BATCH_SIZE] [--queue_size]
Lumeo Bulk Deployer uploads media files to Lumeo cloud, (optionally) associates them with a virtual camera, and queues them for processing. Learn more at https://docs.lumeo.com/docs/universal-bridge
options:
-h, --help show this help message and exit
Authentication Args:
--app_id APP_ID Application (aka Workspace) ID
--token TOKEN Access (aka API) Token.
Source Files (one of pattern, file_list, csv_file, s3_bucket or tag is required):
--pattern PATTERN Glob pattern for files to upload
--file_list FILE_LIST
Comma separated list of file URIs to queue
--csv_file CSV_FILE CSV file containing file_uri and corresponding camera_external_id or camera_id
--s3_bucket S3_BUCKET
S3 bucket name to use as source for files
--s3_access_key_id S3_ACCESS_KEY_ID
S3 Access key ID
--s3_secret_access_key S3_SECRET_ACCESS_KEY
S3 secret access key
--s3_region S3_REGION
S3 region if using AWS S3 bucket. Either s3_region or s3_endpoint_url must be specified.
--s3_endpoint_url S3_ENDPOINT_URL
S3 endpoint URL. Either s3_region or s3_endpoint_url must be specified.
--s3_prefix S3_PREFIX
S3 path prefix to filter files. Optional.
--tag TAG Tag to apply to uploaded files. Can be tag uuid, tag name or tag path (e.g. "tag1/tag2/tag3").If specified without pattern/file_list/csv_file/s3_bucket, will process existing files with
that tag.
Associate with Camera (gets pipeline & deployment config from camera):
--camera_id CAMERA_ID
Camera ID of an existing camera, to associate with the uploaded files
--camera_external_id CAMERA_EXTERNAL_ID
Use your own unique camera id to find or create a virtual camera, and associate with the uploaded files
Deployment Args (applied only when camera not specified):
--pipeline_id PIPELINE_ID
Pipeline ID to queue deployment for processing. Required if camera_id / camera_external_id not specified.
--deployment_config DEPLOYMENT_CONFIG
String containing a Deployment config JSON object. Video source in the config will be overridden by source files specified in this script. Ignored if camera_id or camera_external_id
specified. Optional.
General Args:
--deployment_prefix DEPLOYMENT_PREFIX
Prefix to use for deployment name. Optional.
--delete_processed DELETE_PROCESSED
Delete successfully processed files from the local folder after uploading
--log_level LOG_LEVEL
Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
--batch_size BATCH_SIZE
Number of concurrent uploads to process at a time. Default 5.
--queue_size Print the current queue size
In the following examples, you specify the set of files, the pipeline, and the deployment configuration as script arguments.
Upload local files using a pattern
Uploads all files that match the pattern and deploys a specific pipeline.
lumeo-bulk-deploy --app_id 'd413586b-0ccb-4aaa-9fdf-3df7404f716d' --token 'xxxxxxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --pattern '/Users/username/media/lumeo-*.mp4'
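Before uploading, it can help to preview which local files a pattern will pick up. The sketch below uses Python's standard `glob` module and assumes the script's `--pattern` follows the same matching rules, which this guide does not confirm. The helper name and the demo file names are made up.

```python
import glob
import os
import tempfile

def preview_matches(pattern):
    """Return the sorted list of regular files that a glob pattern matches."""
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))

# Self-contained demo using a temporary folder with made-up file names.
with tempfile.TemporaryDirectory() as folder:
    for name in ("lumeo-1.mp4", "lumeo-2.mp4", "other.mp4", "lumeo-notes.txt"):
        open(os.path.join(folder, name), "w").close()
    matches = preview_matches(os.path.join(folder, "lumeo-*.mp4"))
    demo_names = [os.path.basename(m) for m in matches]

print(demo_names)  # ['lumeo-1.mp4', 'lumeo-2.mp4']
```

Quoting the pattern on the command line, as in the example above, keeps the shell from expanding it before the script sees it.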
Uploads all files that match the pattern, deploys a specific pipeline, and tags all uploaded files and the resulting deployments.
lumeo-bulk-deploy --app_id 'd413586b-0ccb-4aaa-9fdf-3df7404f716d' --token 'xxxxxxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --pattern '/Users/username/media/lumeo-*.mp4' --tag 'bulk-uploads/2024-07-31'
Uploads all files that match the pattern and deploys a specific pipeline with a deployment config override.
lumeo-bulk-deploy --app_id 'd413586b-0ccb-4aaa-9fdf-3df7404f716d' --token 'xxxxxxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --pattern '/Users/username/media/lumeo-*.mp4' --deployment_config='{"overlay_meta2": {"text": "my-test-run","show_frame_count":true}}'
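Hand-writing JSON inside shell quotes is error-prone, so one option is to build the command in Python and let `json.dumps` and `shlex.join` handle the quoting. This is a minimal sketch that only uses flags listed in the usage text above. The `build_command` helper is hypothetical, and the IDs are the placeholder values from the examples.

```python
import json
import shlex

def build_command(app_id, token, pipeline_id, pattern, deployment_config=None, tag=None):
    """Build the argument list for a pattern-based lumeo-bulk-deploy run."""
    argv = [
        "lumeo-bulk-deploy",
        "--app_id", app_id,
        "--token", token,
        "--pipeline_id", pipeline_id,
        "--pattern", pattern,
    ]
    if deployment_config is not None:
        # json.dumps guarantees valid JSON (double quotes, lowercase true/false).
        argv += ["--deployment_config", json.dumps(deployment_config)]
    if tag is not None:
        argv += ["--tag", tag]
    return argv

argv = build_command(
    app_id="d413586b-0ccb-4aaa-9fdf-3df7404f716d",
    token="xxxxxxx",
    pipeline_id="ee55c234-b3d5-405f-b904-cfb2bd6f2e06",
    pattern="/Users/username/media/lumeo-*.mp4",
    deployment_config={"overlay_meta2": {"text": "my-test-run", "show_frame_count": True}},
)
# shlex.join produces a copy-pasteable shell line with safe quoting.
print(shlex.join(argv))
# To run it instead of printing: subprocess.run(argv, check=True)
```

Passing the list to `subprocess.run` bypasses the shell entirely, so the pattern and the JSON reach the script exactly as written.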
Upload self-hosted files using a list
Creates input streams for URLs in the list and deploys with a specific pipeline.
lumeo-bulk-deploy --app_id 'd413586b-0ccb-4aaa-9fdf-3df7404f716d' --token 'xxxxxxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --file_list 'https://assets.lumeo.com/media/parking_lot/mall-parking-1.mp4,https://assets.lumeo.com/media/sample/sample-people-car-traffic.mp4'
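When the URLs come from a longer list, the comma-separated value can be assembled programmatically. The helper below is a hypothetical sketch. It assumes that a URI containing a comma would be split incorrectly, because `--file_list` is documented as a comma-separated list, so it rejects such URIs up front.

```python
def join_file_list(uris):
    """Join URIs into the comma-separated form that --file_list expects."""
    cleaned = [u.strip() for u in uris if u.strip()]
    bad = [u for u in cleaned if "," in u]
    if bad:
        raise ValueError(f"URIs containing commas cannot be listed: {bad}")
    return ",".join(cleaned)

file_list = join_file_list([
    "https://assets.lumeo.com/media/parking_lot/mall-parking-1.mp4",
    "https://assets.lumeo.com/media/sample/sample-people-car-traffic.mp4",
])
print(file_list)
```

For URIs that do contain commas, the CSV manifest option is likely the safer route.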
Upload self-hosted or local files using a CSV manifest
Creates input streams for URLs listed in the CSV file, uploads any local files listed there, and deploys each with the pipeline and deployment config specified in the CSV file.
lumeo-bulk-deploy --app_id 'd413586b-0ccb-4aaa-9fdf-3df7404f716d' --token 'xxxxxxx' --csv_file ./manifest.csv
Same as above, but falls back to the pipeline and deployment config given on the command line for rows that do not specify their own.
lumeo-bulk-deploy --app_id 'd413586b-0ccb-4aaa-9fdf-3df7404f716d' --token 'xxxxxxx' --csv_file ./manifest.csv --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --deployment_config '{"overlay_meta2": {"text": "my-test-run-default","show_frame_count":false}}'
CSV format (note the double quoted JSON when specifying deployment config in the CSV file):
file_uri, camera_external_id, camera_id, pipeline_id, deployment_config
/Users/devarshi/Downloads/warehouse2.mp4,,,ee55c234-b3d5-405f-b904-cfb2bd6f2e06
https://assets.lumeo.com/media/parking_lot/mall-parking-1.mp4,,,ee55c234-b3d5-405f-b904-cfb2bd6f2e06
https://storage.googleapis.com/lumeo-public-media/samples/mall-guest-svcs.mp4,,,ee55c234-b3d5-405f-b904-cfb2bd6f2e06
https://storage.googleapis.com/lumeo-public-media/demos/warehouse5.mp4,,,ee55c234-b3d5-405f-b904-cfb2bd6f2e06,"{""overlay_meta2"": {""text"": ""my-test-run"",""show_frame_count"":true}}"
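Generating the manifest with Python's `csv` module avoids hand-escaping the JSON column, since the module doubles the inner quotes automatically. This is a sketch under two assumptions: the script accepts a header without spaces after the commas, and it accepts a trailing empty `deployment_config` field. The `write_manifest` helper is hypothetical.

```python
import csv
import io
import json

COLUMNS = ["file_uri", "camera_external_id", "camera_id", "pipeline_id", "deployment_config"]
PIPELINE_ID = "ee55c234-b3d5-405f-b904-cfb2bd6f2e06"  # placeholder from the examples

def write_manifest(rows, out):
    """Write manifest rows (dicts keyed by COLUMNS) as CSV to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        row = dict(row)  # copy so the caller's dict is not modified
        config = row.get("deployment_config")
        if isinstance(config, dict):
            # Serialize to JSON; the csv writer then quotes and escapes the field.
            row["deployment_config"] = json.dumps(config)
        writer.writerow(row)  # missing columns are written as empty fields

buffer = io.StringIO()
write_manifest(
    [
        {"file_uri": "/Users/devarshi/Downloads/warehouse2.mp4", "pipeline_id": PIPELINE_ID},
        {
            "file_uri": "https://storage.googleapis.com/lumeo-public-media/demos/warehouse5.mp4",
            "pipeline_id": PIPELINE_ID,
            "deployment_config": {"overlay_meta2": {"text": "my-test-run", "show_frame_count": True}},
        },
    ],
    buffer,
)
print(buffer.getvalue())
```

To write to disk instead, open the target with `open("manifest.csv", "w", newline="")` and pass that file object as `out`.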
Upload self-hosted files from an S3 bucket
Creates input streams for signed S3 URLs from your S3 bucket, and deploys with a specific pipeline, tagging them in the process.
lumeo-bulk-deploy --s3_endpoint_url='https://sfo2.digitaloceanspaces.com' --s3_prefix=universal-bridge-testing --s3_bucket=lumeo-test --s3_access_key_id=xxxx --s3_secret_access_key='xxxx' --app_id=bc655947-45da-43cb-a254-a3a5e69ec084 --token='xxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --tag 's3-file-uploads/2024-03-01/run1'
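To keep S3 credentials out of scripts and shell history, the S3 arguments can be assembled from environment variables. The sketch below also enforces the rule from the help text that either `s3_region` or `s3_endpoint_url` must be given. The helper and the environment variable names `S3_ACCESS_KEY_ID` and `S3_SECRET_ACCESS_KEY` are assumptions, not something the script itself reads.

```python
import os

def build_s3_args(bucket, region=None, endpoint_url=None, prefix=None, env=None):
    """Build the S3-related arguments for lumeo-bulk-deploy.

    Credentials come from the (hypothetical) environment variables
    S3_ACCESS_KEY_ID and S3_SECRET_ACCESS_KEY rather than being hard-coded.
    """
    env = os.environ if env is None else env
    if not (region or endpoint_url):
        raise ValueError("specify either region or endpoint_url")
    args = [
        "--s3_bucket", bucket,
        "--s3_access_key_id", env["S3_ACCESS_KEY_ID"],
        "--s3_secret_access_key", env["S3_SECRET_ACCESS_KEY"],
    ]
    if region:
        args += ["--s3_region", region]
    if endpoint_url:
        args += ["--s3_endpoint_url", endpoint_url]
    if prefix:
        args += ["--s3_prefix", prefix]
    return args

# Demo with placeholder credentials passed in explicitly.
demo_env = {"S3_ACCESS_KEY_ID": "xxxx", "S3_SECRET_ACCESS_KEY": "xxxx"}
s3_args = build_s3_args(
    "lumeo-test",
    endpoint_url="https://sfo2.digitaloceanspaces.com",
    prefix="universal-bridge-testing",
    env=demo_env,
)
print(s3_args)
```

The returned list can be appended to the rest of the `lumeo-bulk-deploy` arguments and passed to `subprocess.run`.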
Process existing Lumeo cloud files tagged with a specific tag
Queues all files in Lumeo Cloud that carry the specified tag for processing with the specified pipeline ID and, optionally, a deployment configuration.
lumeo-bulk-deploy --app_id=bc655947-45da-43cb-a254-a3a5e69ec084 --token='xxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --tag 'test-bench/person-model-testing' --deployment_config '{"overlay_meta2": {"text": "person-model-testing-2024-08-18","show_frame_count":false}}'
lumeo-bulk-deploy --app_id=bc655947-45da-43cb-a254-a3a5e69ec084 --token='xxx' --pipeline_id ee55c234-b3d5-405f-b904-cfb2bd6f2e06 --tag 'test-bench/person-model-testing'
Billing and Performance Considerations
Clips uploaded to Lumeo's cloud and any media artifacts generated by the pipeline count towards your Lumeo cloud storage usage.