This repository contains CLI tools for performing specific automations.
- `vault-to-hub-migration`: Migrating data from a Vault database to a Hub database.
- `reprocess-media`: Re-queuing Hub media for analysis when analysis is missing.
- `audit-legacy-compat`: Auditing legacy data shape compatibility across domains.
- `migrate-legacy-media`: Backfilling legacy media documents to the current media shape.
- `generate-default-labels`: Adding labels to existing users.
You can run these tools as jobs in your Kubernetes cluster. The benefit is that you do not need to expose anything externally and can use the internal Kubernetes DNS.
kubectl apply -f jobs/vault-to-hub-migration-job.yaml
kubectl apply -f jobs/generate-default-labels-job.yaml
kubectl apply -f jobs/migrate-legacy-media-job.yaml

- Clone the repository:

    git clone https://github.com/uug-ai/cli.git
    cd cli

- Install dependencies:

    go mod tidy

- Run an example. This will execute the `vault-to-hub-migration` action. Please have a look at the various options you can provide for each action.

    go run main.go -action vault-to-hub-migration \
    ...options

This tool migrates data from a Vault database to a Hub database.
- Missing sequence: Vault media that is not present in any Hub sequence is sent through the standard pipeline (`monitor`, `sequence`, ...) using the configured `-pipeline` stages.
- Sequence present: If a Hub sequence entry exists for a media item, it will be considered for analysis-only migration when `analysis` is included in `-pipeline`.
- Analysis selection rules (analysis-only path):
  - If `analysis_id` is empty on the sequence image, the item is queued for analysis.
  - If `analysis_id` is present but the analysis document is missing, the item is queued for analysis.
  - If `analysis_id` is present and the analysis exists, the item is skipped.
  - If `-operation-count` is set and the analysis exists but has fewer resolved operations than the threshold, the analysis is deleted and the item is queued for analysis.
  - If no `analysis_id` exists but an analysis exists for the same media key, the item is skipped unless `-operation-count` is set and the resolved count is below the threshold (then it is deleted and reprocessed).
Notes:
- The analysis-only path assumes a valid `-vault-url` so thumbnail/dominantcolor/sprite workers can fetch the media.
- The analysis-only path skips monitor/sequence side effects (activity counters, sequence creation).
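
The selection rules above can be condensed into a small decision function. The following is a minimal sketch and not the tool's actual code: the `Item` type, its field names, and the `Decide` function are hypothetical, and it assumes that an item with an empty `analysis_id` is only queued when no analysis exists for the same media key either.

```go
package main

import "fmt"

// Decision is the outcome for one media item on the analysis-only path.
type Decision string

const (
	Queue          Decision = "queue"
	Skip           Decision = "skip"
	DeleteAndQueue Decision = "delete-and-queue"
)

// Item is a hypothetical, simplified view of a sequence image and its analysis.
type Item struct {
	AnalysisID      string // analysis_id on the sequence image ("" when empty)
	FoundByID       bool   // an analysis document exists for AnalysisID
	FoundByMediaKey bool   // an analysis document exists for the same media key
	ResolvedOps     int    // resolved operations on the analysis that was found
}

// Decide applies the selection rules. operationCount <= 0 disables the threshold.
func Decide(it Item, operationCount int) Decision {
	// An analysis "exists" either through its ID or, when the ID is empty,
	// through a lookup on the media key (assumption, see the lead-in).
	exists := (it.AnalysisID != "" && it.FoundByID) ||
		(it.AnalysisID == "" && it.FoundByMediaKey)
	if !exists {
		return Queue
	}
	if operationCount > 0 && it.ResolvedOps < operationCount {
		return DeleteAndQueue
	}
	return Skip
}

func main() {
	items := []Item{
		{}, // empty analysis_id, nothing found
		{AnalysisID: "a1"}, // ID present, document missing
		{AnalysisID: "a2", FoundByID: true, ResolvedOps: 5},
		{AnalysisID: "a3", FoundByID: true, ResolvedOps: 1},
		{FoundByMediaKey: true, ResolvedOps: 1},
	}
	for _, it := range items {
		fmt.Printf("%+v -> %s\n", it, Decide(it, 3))
	}
}
```

With a threshold of 3, only the third item is skipped; running the same items with a threshold of 0 would skip every item that already has an analysis.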
- `-action`: The action to take (required). For migration, use `vault-to-hub-migration`.
- `-mongodb-uri`: The MongoDB URI (optional if host and port are provided).
- `-mongodb-host`: The MongoDB host (optional if URI is provided).
- `-mongodb-port`: The MongoDB port (optional if URI is provided).
- `-mongodb-source-database`: The source database name (required).
- `-mongodb-destination-database`: The destination database name (required).
- `-mongodb-database-credentials`: The database credentials (optional).
- `-mongodb-username`: The MongoDB username (optional).
- `-mongodb-password`: The MongoDB password (optional).
- `-username`: The username to filter data (required).
- `-queue`: The integration used to transfer events to the Hub pipeline.
- `-vault-url`: Optional Vault API URL override (e.g. `https://vault.example.com/api`).
- `-start-timestamp`: The start timestamp for filtering data (required).
- `-end-timestamp`: The end timestamp for filtering data (required).
- `-timezone`: The timezone for converting timestamps (optional, default is `UTC`).
- `-pipeline`: The pipeline to execute (optional, default is `monitor,sequence`).
- `-operation-count`: Minimum resolved operations required to keep an existing analysis. If provided, analyses below this count are deleted and reprocessed (analysis-only path). Set to `0` to disable.
- `-batch-size`: The size of each batch (optional, default is `10`).
- `-batch-delay`: The delay between batches in milliseconds (optional, default is `1000`).
- `-mode`: You can choose to run a `dry-run` or `live`.
To run the Vault to Hub migration, use the following command:
go run main.go -action vault-to-hub-migration \
-mongodb-uri "mongodb+srv://<username>:<password>@<host>/<database>?retryWrites=true&w=majority&appName=<appName>" \
-mongodb-source-database=<sourceDatabase> \
-mongodb-destination-database=<destinationDatabase> \
-queue <rabbitmq-integration> \
-vault-url https://vault.kerberos.io/api \
-username <username> \
-start-timestamp <startTimestamp> \
-end-timestamp <endTimestamp> \
-timezone <timezone> \
-pipeline 'monitor,sequence,analysis' \
-operation-count 3 \
-mode dry-run \
-batch-size 100 \
-batch-delay 1000

 _ _ _ _ _____ _____ _ _
| | | | | | |/ ____| / ____(_) |
| | | | | | | | __ | | _| |
| | | | | | | | |_ | | | | | |
| |__| | |__| | |__| | | |____| | |
\____/ \____/ \_____| \_____|_|_|
Starting Vault to Hub migration...
2024/12/12 09:37:39 ====================================
2024/12/12 09:37:39 Configuration:
2024/12/12 09:37:39 MongoDB URI: mongodb+srv://xxxx
2024/12/12 09:37:39 MongoDB Host:
2024/12/12 09:37:39 MongoDB Port:
2024/12/12 09:37:39 MongoDB Source Database: KerberosStorage
2024/12/12 09:37:39 MongoDB Destination Database: Kerberos
2024/12/12 09:37:39 MongoDB Database Credentials:
2024/12/12 09:37:39 MongoDB Username:
2024/12/12 09:37:39 MongoDB Password: ************
2024/12/12 09:37:39 Queue: rabbitmq-xxxx
2024/12/12 09:37:39 Username: xxxx
2024/12/12 09:37:39 Start Time 2024-04-01 08:47:40 +0200 CEST
2024/12/12 09:37:39 End Time 2025-04-06 17:41:00 +0200 CEST
2024/12/12 09:37:39 Pipeline monitor,sequence,analysis
2024/12/12 09:37:39 ====================================
>> Please wait while we migrate the data. Press Ctrl+C to stop the process.
Vault to Hub: delta complete 42s [====================================================================] 100%
Transferring media 1s [====================================================================] 100%
2024/12/12 09:38:26
2024/12/12 09:38:26 >>Media transferred:
2024/12/12 09:38:26
2024/12/12 09:38:26 +---------------------------------------------------------------------------------------+-----------------+-----------------+-------------------------------------+
2024/12/12 09:38:26 | File Name | File Size | Timestamp | Device |
2024/12/12 09:38:26 +---------------------------------------------------------------------------------------+-----------------+-----------------+-------------------------------------+
2024/12/12 09:38:26 | xxxxxxxx/1733992176_6-967003_melle-insidegarage_200-200-400-400_25819_769.mp4 | 7261430 | 1733992211 | melle-insidegarage |
2024/12/12 09:38:26 | xxxxxxxx/1733991780_6-967003_melle-garage_200-200-400-400_256_769.mp4 | 14206272 | 1733991818 | melle-garage |
2024/12/12 09:38:26 | xxxxxxxx/1733991781_6-967003_melle-street_200-200-400-400_819_769.mp4 | 3603494 | 1733991817 | melle-street |
2024/12/12 09:38:26 | xxxxxxxx/1733991587_6-967003_gb-side_200-200-400-400_342_769.mp4 | 8167534 | 1733991611 | vSQBjrhqGGOXseLoidIBFhKeJjCjTM |
2024/12/12 09:38:26 | xxxxxxxx/1733991547_6-967003_gb-side_200-200-400-400_883_769.mp4 | 12197049 | 1733991586 | vSQBjrhqGGOXseLoidIBFhKeJjCjTM |
2024/12/12 09:38:26 | xxxxxxxx/1733991501_6-967003_gb-side_200-200-400-400_1472_769.mp4 | 12130152 | 1733991538 | vSQBjrhqGGOXseLoidIBFhKeJjCjTM |
2024/12/12 09:38:26 | xxxxxxxx/1733990945_6-967003_gb-side_200-200-400-400_6547_769.mp4 | 7097803 | 1733990966 | vSQBjrhqGGOXseLoidIBFhKeJjCjTM |
2024/12/12 09:38:26 | xxxxxxxx/1733990903_6-967003_gb-side_200-200-400-400_9313_769.mp4 | 7709073 | 1733990928 | vSQBjrhqGGOXseLoidIBFhKeJjCjTM |
2024/12/12 09:38:26 | xxxxxxxx/1733990868_6-967003_gb-frontdoor_200-200-400-400_158_769.mp4 | 1882747 | 1733990885 | lQtymLrehWpHkTavfcNTFgwfDMoSfg |
2024/12/12 09:38:26 | xxxxxxxx/1733990742_6-967003_gb-side_200-200-400-400_214198_769.mp4 | 7015592 | 1733990765 | vSQBjrhqGGOXseLoidIBFhKeJjCjTM |
2024/12/12 09:38:26 | xxxxxxxx/1733990698_6-967003_gb-frontdoor_200-200-400-400_155_769.mp4 | 1487304 | 1733990713 | lQtymLrehWpHkTavfcNTFgwfDMoSfg |
2024/12/12 09:38:26 | xxxxxxxx/1733990683_6-967003_gb-frontdoor_200-200-400-400_177_769.mp4 | 1966309 | 1733990701 | lQtymLrehWpHkTavfcNTFgwfDMoSfg |
2024/12/12 09:38:26 | xxxxxxxx/1733990499_6-967003_gb-frontdoor_200-200-400-400_176_769.mp4 | 3430326 | 1733990530 | lQtymLrehWpHkTavfcNTFgwfDMoSfg |
2024/12/12 09:38:26 | xxxxxxxx/1733990314_6-967003_gb-frontdoor_200-200-400-400_168_769.mp4 | 3667828 | 1733990347 | lQtymLrehWpHkTavfcNTFgwfDMoSfg |
2024/12/12 09:38:26 | xxxxxxxx/1733990130_6-967003_gb-frontdoor_200-200-400-400_153_769.mp4 | 3839485 | 1733990165 | lQtymLrehWpHkTavfcNTFgwfDMoSfg |
2024/12/12 09:38:26 | xxxxxxxx/1733989945_6-967003_gb-frontdoor_200-200-400-400_151_769.mp4 | 3590120 | 1733989979 | lQtymLrehWpHkTavfcNTFgwfDMoSfg |
2024/12/12 09:38:26 | xxxxxxxx/1733989761_6-967003_gb-frontdoor_200-200-400-400_164_769.mp4 | 3730173 | 1733989794 | lQtymLrehWpHkTavfcNTFgwfDMoSfg |
2024/12/12 09:38:26 | xxxxxxxx/1733989573_6-967003_gb-frontdoor_200-200-400-400_154_769.mp4 | 3878792 | 1733989606 | lQtymLrehWpHkTavfcNTFgwfDMoSfg |
2024/12/12 09:38:26 | xxxxxxxx/1733989397_6-967003_gb-frontdoor_200-200-400-400_166_769.mp4 | 3846902 | 1733989430 | lQtymLrehWpHkTavfcNTFgwfDMoSfg |
2024/12/12 09:38:26 | xxxxxxxx/1733989379_6-967003_gb-frontdoor_200-200-400-400_162_769.mp4 | 2107887 | 1733989400 | lQtymLrehWpHkTavfcNTFgwfDMoSfg |
2024/12/12 09:38:26 +---------------------------------------------------------------------------------------+-----------------+-----------------+-------------------------------------+
This tool performs a read-only compatibility audit over legacy data, and reports missing required/recommended fields by domain.
- `-action`: The action to take (required). For compatibility audit, use `audit-legacy-compat`.
- `-mongodb-uri`: The MongoDB URI (optional if host and port are provided).
- `-mongodb-host`: The MongoDB host (optional if URI is provided).
- `-mongodb-port`: The MongoDB port (optional if URI is provided).
- `-mongodb-source-database`: The source database name (required if destination database is not set).
- `-mongodb-destination-database`: The destination database name (required if source database is not set).
- `-mongodb-database-credentials`: The database credentials (optional).
- `-mongodb-username`: The MongoDB username (optional).
- `-mongodb-password`: The MongoDB password (optional).
- `-username`: Optional username used to resolve organisation scope.
- `-organisation-id`: Optional organisation/user scope ID.
- `-start-timestamp`: Optional start timestamp for time-scoped collections.
- `-end-timestamp`: Optional end timestamp for time-scoped collections.
- `-domains`: Optional comma-separated domain list. Default: `media,analysis,users,devices,groups,sites,settings`.
- `-mode`: Accepted for consistency (`dry-run`/`live`), this action is read-only.
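
As a rough illustration of how a comma-separated `-domains` value could be turned into a list with the documented default, consider the sketch below. The `parseDomains` helper is hypothetical, and the whitespace trimming and handling of empty entries are assumptions rather than confirmed behaviour of the CLI.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultDomains mirrors the default documented for -domains.
var defaultDomains = []string{"media", "analysis", "users", "devices", "groups", "sites", "settings"}

// parseDomains splits a comma-separated value, trims whitespace and drops
// empty entries. An empty result falls back to a copy of the default list.
func parseDomains(value string) []string {
	var domains []string
	for _, part := range strings.Split(value, ",") {
		if d := strings.TrimSpace(part); d != "" {
			domains = append(domains, d)
		}
	}
	if len(domains) == 0 {
		return append([]string(nil), defaultDomains...)
	}
	return domains
}

func main() {
	fmt.Println(parseDomains("media,analysis,users"))
	fmt.Println(parseDomains(""))
}
```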
go run main.go -action audit-legacy-compat \
-mode dry-run \
-mongodb-uri "mongodb+srv://<username>:<password>@<host>/<database>?retryWrites=true&w=majority&appName=<appName>" \
-mongodb-destination-database=<database> \
-organisation-id <organisationId> \
-start-timestamp <startTimestamp> \
-end-timestamp <endTimestamp> \
-domains media,analysis,users

This tool backfills missing fields on legacy media documents and can insert missing media documents from analysis-shaped records.
- `-action`: The action to take (required). For this migration, use `migrate-legacy-media`.
- `-mongodb-uri`: The MongoDB URI (optional if host and port are provided).
- `-mongodb-host`: The MongoDB host (optional if URI is provided).
- `-mongodb-port`: The MongoDB port (optional if URI is provided).
- `-mongodb-source-database`: Source database name (optional if destination database is set).
- `-mongodb-destination-database`: Destination database name (optional if source database is set).
- `-mongodb-database-credentials`: The database credentials (optional).
- `-mongodb-username`: The MongoDB username (optional).
- `-mongodb-password`: The MongoDB password (optional).
- `-organisation-id`: Recommended. Scope migration to one organisation.
- `-username`: Optional alternative to resolve organisation scope.
- `-start-timestamp`: Recommended. Use bounded windows for safer runs.
- `-end-timestamp`: Recommended. Use bounded windows for safer runs.
- `-migration-timeout-minutes`: Optional timeout for this action (default `60`). Set to `0` to disable timeout for very large datasets.
- `-skip-matched-count`: Optional performance flag (default `true`). Skips the initial `CountDocuments` pre-scan; report will show `matchedFilter: -1`.
- `-migration-version`: Optional migration-step version selector (default `1`, latest supported). Use this to branch future media migration behavior without changing CLI shape.
- `-check-migration-indexes`: Optional. Checks required indexes for this action and reports missing/existing.
- `-apply-migration-indexes`: Optional. Creates missing required indexes for this action.
- `-mode`: `dry-run` (recommended first) or `live`.
- `-generate-default-marker-options`: Optional. When set, generate default `marker_options` with category `classification`. If `-organisation-id` is provided, it targets that single org/user id. If omitted, it targets all users in `users`. This always seeds a built-in default classification list, then adds any extra discovered classifications from scoped media/analysis data.
- Run `dry-run` first with `-organisation-id` and a bounded timestamp window.
- Review the report (`Needs`, `Cases`, examples).
- Run `live` with the same scope.
- Repeat per time window until complete.
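
The "repeat per time window" step can be prepared by splitting the full range into bounded windows up front. The sketch below is one possible way to do that; it assumes Unix-second timestamps, and the `Window` type, the `splitWindows` helper and the 6-hour window size are illustrative choices, not something the CLI provides. Whether the CLI treats the end timestamp as inclusive or exclusive is not stated here, so adjacent windows may need adjusting.

```go
package main

import "fmt"

// Window is a half-open time range [Start, End) in Unix seconds.
type Window struct {
	Start, End int64
}

// splitWindows cuts [start, end) into consecutive windows of at most size
// seconds. It returns nil for an empty range or a non-positive size.
func splitWindows(start, end, size int64) []Window {
	if size <= 0 || end <= start {
		return nil
	}
	var windows []Window
	for s := start; s < end; s += size {
		e := s + size
		if e > end {
			e = end
		}
		windows = append(windows, Window{Start: s, End: e})
	}
	return windows
}

func main() {
	// One day split into 6-hour windows; the values are illustrative only.
	const dayStart = 1733961600
	for _, w := range splitWindows(dayStart, dayStart+86400, 6*3600) {
		fmt.Printf("-start-timestamp %d -end-timestamp %d\n", w.Start, w.End)
	}
}
```

Each printed pair can then be used for one `dry-run` followed by one `live` run with the same scope.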
go run main.go -action migrate-legacy-media \
-mode dry-run \
-mongodb-uri "mongodb+srv://<username>:<password>@<host>/<database>?retryWrites=true&w=majority&appName=<appName>" \
-mongodb-destination-database=<database> \
-organisation-id <organisationId> \
-migration-version 1 \
-migration-timeout-minutes 60 \
-skip-matched-count=true \
-check-migration-indexes \
-apply-migration-indexes \
-generate-default-marker-options \
-start-timestamp <startTimestamp> \
-end-timestamp <endTimestamp>

This tool adds starting labels to existing users in the database.
- `-action`: The action to take (required). For labels, use `generate-default-labels`.
- `-mongodb-uri`: The MongoDB URI (optional if host and port are provided).
- `-mongodb-host`: The MongoDB host (optional if URI is provided).
- `-mongodb-port`: The MongoDB port (optional if URI is provided).
- `-mongodb-source-database`: The source database name (required).
- `-mongodb-database-credentials`: The database credentials (optional).
- `-mongodb-username`: The MongoDB username (optional).
- `-mongodb-password`: The MongoDB password (optional).
- `-label-names`: The names of the labels to add. Comma separated. Will add predefined default values if not provided.
- `-username`: A specific user to add labels to (optional).
- `-mode`: You can choose to run a `dry-run` or `live`.
To run the default label generation, use the following command:
go run main.go -action generate-default-labels \
-mode dry-run \
-mongodb-uri "mongodb+srv://<username>:<password>@<host>/<database>?retryWrites=true&w=majority&appName=<appName>" \
-mongodb-source-database=<sourceDatabase> \
-label-names=<labelNames>

Add `-username` to add labels to just one specific user.
This tool re-queues media for analysis when analysis has not been created yet.
- `-action`: The action to take (required). For reprocessing, use `reprocess-media`.
- `-mongodb-uri`: The MongoDB URI (optional if host and port are provided).
- `-mongodb-host`: The MongoDB host (optional if URI is provided).
- `-mongodb-port`: The MongoDB port (optional if URI is provided).
- `-mongodb-source-database`: The Vault database name (required, used to fetch queue config).
- `-mongodb-destination-database`: Ignored for this action (reprocess runs within the source database).
- `-mongodb-database-credentials`: The database credentials (optional).
- `-mongodb-username`: The MongoDB username (optional).
- `-mongodb-password`: The MongoDB password (optional).
- `-queue`: The queue used to send analysis events (required).
- `-user-id`: The hub user ID to reprocess.
- `-start-timestamp`: The start timestamp for filtering media (required).
- `-end-timestamp`: The end timestamp for filtering media (required).
- `-timezone`: The timezone for converting timestamps (optional, default is `UTC`).
- `-mode`: You can choose to run a `dry-run` or `live`.
- `-batch-size`: The size of each batch (optional, default is `10`).
- `-batch-delay`: The delay between batches in milliseconds (optional, default is `1000`).
go run main.go -action reprocess-media \
-mongodb-uri "mongodb+srv://<username>:<password>@<host>/<database>?retryWrites=true&w=majority&appName=<appName>" \
-mongodb-source-database=<vaultDatabase> \
-mongodb-destination-database=<ignored> \
-queue <analysis-queue> \
-user-id <userId> \
-start-timestamp <startTimestamp> \
-end-timestamp <endTimestamp> \
-timezone <timezone> \
-mode dry-run \
-batch-size 100 \
-batch-delay 1000
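
To give an intuition for what `-batch-size` and `-batch-delay` control, the following sketch processes a list in fixed-size batches with a pause between them. It is only an illustration under assumptions: the `chunk` helper and the placeholder media keys are hypothetical, and the actual tool may batch and throttle differently.

```go
package main

import (
	"fmt"
	"time"
)

// chunk splits items into batches of at most size elements.
// A non-positive size falls back to 10, the documented default.
func chunk(items []string, size int) [][]string {
	if size <= 0 {
		size = 10
	}
	var out [][]string
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

func main() {
	media := []string{"a.mp4", "b.mp4", "c.mp4", "d.mp4", "e.mp4"} // placeholder keys
	delay := 100 * time.Millisecond                               // stands in for -batch-delay

	batches := chunk(media, 2)
	for i, batch := range batches {
		fmt.Printf("batch %d: %v\n", i+1, batch)
		if i < len(batches)-1 {
			time.Sleep(delay) // pause between batches, not after the last one
		}
	}
}
```

A larger batch size with a longer delay sends events in bursts, while a smaller batch size with a shorter delay spreads the same load more evenly over the queue.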