From Offline to Periodic Adaptation for Pose-Based Shoplifting Detection in Real-world Retail Security
This GitHub repository contains the official implementation and dataset details for the paper "From Offline to Periodic Adaptation for Pose-Based Shoplifting Detection in Real-world Retail Security," published in the IEEE Internet of Things Journal.
We present a privacy-preserving, pose-based framework for shoplifting detection designed for on-site IoT deployment. Our pipeline enables edge devices to adapt from streaming, unlabeled data through continual unsupervised learning, overcoming environmental drift and changing shopper behaviors.
- Periodic Adaptation Pipeline: A three-stage (Filtering, Collection, Training) framework designed for periodic model updates on unlabeled streaming data.
- RetailS Dataset: A large-scale, multi-camera dataset featuring nearly 20M normal frames and both staged and authentic shoplifting incidents.
- IoT-Optimized Metrics: Introduction of the $H_{PRS}$ score (harmonic mean of Precision, Recall, and Specificity) to strictly control false alarms in retail environments.
- Privacy-Preserving: Represents human activity through anonymized pose sequences (COCO17 format), removing raw pixel information.
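Since $H_{PRS}$ is defined as the harmonic mean of precision, recall, and specificity, it can be computed directly from confusion-matrix counts. A minimal sketch (the function name and example counts below are illustrative, not from the released code):

```python
def h_prs(tp, fp, tn, fn):
    """H_PRS: harmonic mean of precision, recall, and specificity.

    Because specificity enters the mean, a model cannot score well
    while raising many false alarms on normal shopper behavior.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 3.0 / (1.0 / precision + 1.0 / recall + 1.0 / specificity)


# Hypothetical evaluation counts on a test video:
score = h_prs(tp=80, fp=10, tn=900, fn=20)  # ≈ 0.886
```

The harmonic mean is dominated by the weakest of the three terms, so even a small specificity drop (many false alarms) pulls the score down sharply, which is the intended behavior for retail deployments.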
The framework is divided into three operational stages to mirror IoT deployment:
1- Filtering: Uses adaptive thresholds to pseudo-label incoming frames, discarding likely anomalies.
2- Collection: Aggregates pseudo-labeled normal frames into buffered sets.
3- Training: Periodically fine-tunes the model (half-day or daily cycles) to capture local drift.
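The three stages above can be sketched as a single loop over the pose stream. All names here (`score_frame`, `fine_tune`, `UPDATE_PERIOD`) are illustrative assumptions; the released code may organize these steps differently:

```python
# Sketch of the Filtering -> Collection -> Training cycle, assuming the
# model exposes an anomaly score per pose sequence and a fine-tune step.

UPDATE_PERIOD = 24 * 60 * 60  # one-day update cycle, in seconds


def periodic_adaptation(stream, model, threshold):
    """Continually adapt `model` on unlabeled (timestamp, pose_seq) pairs."""
    buffer, last_update = [], 0
    for frame_time, pose_seq in stream:
        # 1. Filtering: keep only frames the current model deems normal.
        if model.score_frame(pose_seq) < threshold:
            # 2. Collection: buffer pseudo-labeled normal frames.
            buffer.append(pose_seq)
        # 3. Training: periodically fine-tune on the buffered normal data.
        if frame_time - last_update >= UPDATE_PERIOD:
            model.fine_tune(buffer)
            buffer, last_update = [], frame_time
    return model
```

Because only pseudo-normal frames enter the buffer, each fine-tuning step tracks the store's current "normal" without requiring any labels at the edge.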
Fig 1: A conceptual overview of our IoT-oriented continual unsupervised anomaly detection pipeline with pseudo filtering, collection, and training. Model updates are designed to run incrementally on edge-grade devices, enabling scalable deployment across distributed surveillance nodes.
1- Privacy-Preserving: The dataset includes pose sequences derived from CCTV footage, with anonymized human identities and no raw pixel-level video data. This ensures full compliance with privacy regulations and safeguards individual privacy.
2- Environmental Consistency: Unlike prior datasets recorded in static labs, our staged shoplifting events were recorded in the same real-world retail environment as the normal and authentic shoplifting sets.
3- Pose-Based Annotations: RetailS provides bounding boxes, person IDs, and human pose annotations instead of raw videos to support privacy-preserving shoplifting detection.
4- Camera Views: The dataset utilizes videos from 6 indoor cameras positioned across various aisles and locations in a local retail store in the USA.
5- Diverse Shoplifting Behaviors: The dataset includes a wide range of normal shopping behaviors alongside real and staged shoplifting activities. The shoplifting behaviors demonstrated in these videos included actions such as placing items into pockets, placing them in bags, and hiding them under shirts, jackets, and pants.
Example staged behaviors: hiding an item in their pants, hiding an item under their T-shirt, and placing an item in their pockets.
- Pose Data Extraction: Anonymized pose data is extracted from the original videos using state-of-the-art models, including YOLOv8 for object detection, ByteTrack for person tracking, and HRNet for human pose estimation.
- Data Modifications: To address occlusions caused by store shelves, specific areas of interest for each camera were defined. Missing poses were interpolated, and data smoothing was applied for continuity.
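The interpolation and smoothing steps can be sketched with NumPy. This is an illustrative re-implementation under the stated assumptions (missing detections marked as NaN, a simple moving-average smoother), not the exact code used to build the dataset:

```python
import numpy as np


def fill_and_smooth(keypoints, window=5):
    """Interpolate missing keypoints over time, then smooth trajectories.

    keypoints: (T, 17, 3) array in XYC format (COCO17 joints); frames where
    a joint was not detected are assumed to hold NaN.
    """
    T = keypoints.shape[0]
    out = keypoints.copy()
    t = np.arange(T)
    for j in range(out.shape[1]):          # each of the 17 COCO joints
        for c in range(out.shape[2]):      # x, y, confidence channels
            col = out[:, j, c]
            mask = np.isnan(col)
            if mask.any() and (~mask).any():
                # Linear interpolation across the missing frames.
                col[mask] = np.interp(t[mask], t[~mask], col[~mask])
    # Moving-average smoothing along time for trajectory continuity.
    kernel = np.ones(window) / window
    for j in range(out.shape[1]):
        for c in range(2):                 # smooth x and y only
            out[:, j, c] = np.convolve(out[:, j, c], kernel, mode="same")
    return out
```

A production pipeline might prefer a Savitzky-Golay or Kalman smoother, but the moving average conveys the idea with no extra dependencies.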
- Annotations and Shoplifting Labels: Each video in the RetailS dataset has a corresponding annotation file in JSON format. The files are named according to the camera and video ID and provide detailed frame-level information for all detected individuals. Specifically, each annotation file includes:
  - Person ID: A unique identifier assigned to each detected individual.
  - Frame ID: The frame index within the video sequence.
  - Keypoints: Pose keypoints represented in XYC format, where X and Y denote the spatial coordinates of each joint, and C indicates the confidence score of the keypoint detection.
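A minimal reader for one annotation file might look as follows. The exact JSON layout and key names (`"person_id"`, `"frame_id"`, `"keypoints"`) are assumptions based on the fields listed above; check the released files for the actual schema:

```python
import json


def load_annotations(path):
    """Group per-person keypoints by frame from one RetailS-style JSON file.

    Assumes a list of records, each with "person_id", "frame_id", and a
    flat "keypoints" list [x1, y1, c1, ..., x17, y17, c17] in XYC format.
    """
    with open(path) as f:
        records = json.load(f)
    frames = {}
    for rec in records:
        kps = rec["keypoints"]
        # Regroup the flat XYC list into 17 (x, y, confidence) triples.
        joints = [(kps[i], kps[i + 1], kps[i + 2])
                  for i in range(0, len(kps), 3)]
        frames.setdefault(rec["frame_id"], {})[rec["person_id"]] = joints
    return frames
```

Indexing the result as `frames[frame_id][person_id]` then yields the 17 COCO joints for that person in that frame.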
Additionally, anomaly labels are provided as binary NumPy arrays (.npy) for every frame in the test sets. Each label file follows the same naming pattern as its corresponding video, ensuring easy mapping between the annotation file and its labels, and the array length equals the total number of frames in that video. A value of 0 marks a normal frame (e.g., browsing, walking); a value of 1 marks an anomalous frame in which shoplifting behavior (e.g., pocket or bag concealment) is identified.
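Reading the label arrays is a one-liner with NumPy; a small helper for summarizing them (the helper name and the file name in the usage comment are illustrative):

```python
import numpy as np


def summarize_labels(labels):
    """Count normal (0) and anomalous (1) frames in a per-frame label array."""
    labels = np.asarray(labels)
    return int((labels == 0).sum()), int((labels == 1).sum())


# Illustrative usage; the file name below is hypothetical:
# normal, anomalous = summarize_labels(np.load("camera1_video03.npy"))
```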
Table 1: Summary of Normal and Shoplifting Frames, Events, and Camera Views in RetailS dataset
| Subsets | Normal Frames | Shoplifting Frames | Shoplifting Events | Camera Views |
|---|---|---|---|---|
| RetailS Train Set | 19,971,589 | 0 | 0 | 6 |
| RetailS Real-world Test Set | 2,432 | 1,933 | 53 | 6 |
| RetailS Staged Test Set | 20,578 | 20,335 | 898 | 6 |
Our periodic adaptation framework outperforms offline baselines in 91.6% of evaluations.
Fig 2: Model performance trends at a one-day update frequency.
Table 2: Average training time (in minutes) per update for continual learning with half-day and one-day data batches across three state-of-the-art pose-based models.
| Model | Half-day data | One-day data |
|---|---|---|
| STG-NF | 3.5 | 7.3 |
| TSGAD | 26.8 | 65.0 |
| SPARTA | 2.05 | 3.2 |
To download the dataset, please use the following link:
If you find our work useful, please consider citing:
@ARTICLE{11370135,
author={Yao, Shanle and Rashvand, Narges and Pazho, Armin Danesh and Tabkhi, Hamed},
journal={IEEE Internet of Things Journal},
title={From Offline to Periodic Adaptation for Pose-Based Shoplifting Detection in Real-world Retail Security},
year={2026},
volume={},
number={},
pages={1-1},
keywords={Training;Filtering;Cameras;Internet of Things;Anomaly detection;Adaptation models;Pipelines;Image edge detection;Real-time systems;Privacy;Shoplifting;artificial intelligence;IoT;computer vision;application;continual learning;dataset;real-world;edge;anomaly},
doi={10.1109/JIOT.2026.3660205}}
If you have any questions or need assistance, please contact the authors at nrashvan@charlotte.edu and syao@charlotte.edu.