diff --git a/.assets/abstract_architecture.jpeg b/.assets/abstract_architecture.jpeg new file mode 100644 index 00000000..af8488da Binary files /dev/null and b/.assets/abstract_architecture.jpeg differ diff --git a/LICENCE b/LICENCE new file mode 100644 index 00000000..00e47292 --- /dev/null +++ b/LICENCE @@ -0,0 +1,22 @@ +GNU AFFERO GENERAL PUBLIC LICENSE + Version 3, 19 November 2007 + +Copyright (C) 2025 University of Münster, Germany + +This program is free software: you can redistribute it and/or modify +it under the terms of the GNU Affero General Public License as published +by the Free Software Foundation, either version 3 of the License, or +(at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU Affero General Public License for more details. + +You should have received a copy of the GNU Affero General Public License +along with this program. If not, see <https://www.gnu.org/licenses/>. + +--- + +This project incorporates the Ultralytics library, which is licensed +under the GNU Affero General Public License v3.0. diff --git a/NOTICE b/NOTICE new file mode 100644 index 00000000..8c83854d --- /dev/null +++ b/NOTICE @@ -0,0 +1,10 @@ +Leezencounter +An educational open-source university project. + +This project uses Ultralytics (https://github.com/ultralytics/ultralytics), +which is licensed under the GNU Affero General Public License v3.0. + +Portions of this software are © Ultralytics and other contributors. + +All modifications and additions in this repository are © 2025 University of Münster +and licensed under the GNU Affero General Public License v3.0. 
diff --git a/README.md b/README.md index 1df3044d..adf9f749 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,58 @@ -# leezencounter +# Leezencounter -### Mono Repository Structure +![Abstract Architecture of Leezencounter](.assets/abstract_architecture.jpeg) + +Leezencounter is a university project carried out by master's students at the Chair of Machine Learning and Data Engineering of the Information Systems Department, University of Münster, Germany. + +Within this project, microcontroller units (MCUs) were used in combination with quantized object detection models. + +The goal of this project was to enable real-time on-device bicycle detection in public spaces and to transmit the detections via LoRaWAN to a web application. The web application provides a detailed overview of the occupancy statistics for the locations where the hardware is deployed. + +Leezencounter is organized as a multi-component project (mono repository). See the [repository structure](#mono-repository-structure) for details. + +# Hardware + +We used the [XIAO ESP32-S3 Sense](https://wiki.seeedstudio.com/xiao_esp32s3_getting_started/) MCU with the [XIAO Wio-SX1262 LoRa module](https://www.seeedstudio.com/Wio-SX1262-with-XIAO-ESP32S3-p-5982.html). The default camera module of the ESP32-S3 was replaced by an OV5640 module (21 mm, 160 degrees), which captures wider images at higher resolutions. + +A custom 3D-printed case for the hardware was designed for mounting purposes. 
+ +# Software + +The following software stack was used: + +**Model Training** +- Python >=3.10 +- [Ultralytics](https://docs.ultralytics.com/) +- [Weights & Biases](https://docs.wandb.ai/) (experiment tracking) +- [DVC](https://dvc.org/doc) (data and model versioning) +- [Digital Ocean](https://docs.digitalocean.com/) (S3-like cloud storage) + +**Model Conversion & Deployment** +- [ESP-IDF](https://github.com/espressif/esp-idf) v5.5 (C/C++ project compilation and on-device application) +- [ESP-DL](https://github.com/espressif/esp-dl/tree/master) (on-device model inference) +- [ESP-PPQ](https://github.com/espressif/esp-ppq/tree/master) (model compression) +- Arduino + +**Web Application** + +Frontend: +- Next.js 15.3.0 +- Language: TypeScript 5 +- Styling: TailwindCSS 4 with shadcn/ui components + +Backend & Database: +- Runtime: Node.js with Next.js API Routes +- Database: PostgreSQL with pg driver + +Infrastructure: +- Vercel Platform + +**Sketches** +- Arduino IDE + +A snapshot of the images used for model training can be found [here](https://uni-muenster.sciebo.de/s/7F6Wqp4oMBHok7K). + +# Mono Repository Structure ``` . @@ -11,5 +63,33 @@ │ └── notebooks // Jupyter notebooks for local experimentation ├── webapplication // Webapp to display collected data ├── model-deployment // Model deployment files for ESP32-S3 +├── cad // CAD files for 3D-printable case └── sketches // Arduino Sketches for MCUs ``` + +Have a look at the sub-repository README files for more details. + +# Project Building +The project workflow was as follows (if you want to reproduce the results or reuse this project for your purposes): + +1. ``sketches`` (camera capture scripts) - Collect images +2. ``model-training`` - Train/Fine-tune model utilizing the images collected from `sketches` scripts +3. ``model-deployment`` (``model_conversion`` package) - Convert and compress the fine-tuned YOLO model +4. 
``model-deployment`` (``yolo11_detect``) - Build the ESP-IDF project and deploy the quantized model on the hardware; send model predictions via BLE +5. ``webapplication`` - Set up the web application for tracking the data collected from the MCUs +6. ``sketches`` (``lorawan_send`` package) - Receive model predictions via BLE and forward the results via LoRaWAN to a TTN node, from where the web application fetches them +7. (``cad`` - Use/Customize 3D-printable case (e.g., with [Tinkercad](https://www.tinkercad.com/)) to have an out-of-the-box usable mounting solution; optional) + + +# Licence +Leezencounter is licensed under the [GNU Affero General Public License v3.0](./LICENCE). + +This project makes use of [Ultralytics](https://github.com/ultralytics/ultralytics), +which is licensed under the same terms. + + +# Contributing +Contributions are welcome! Please open issues or submit pull requests. + +# Miscellaneous +Are you wondering what "Leezencounter" means? Have a look at the [trivia section in our wiki](https://github.com/SteffChef/Leezencounter/wiki). diff --git a/cad/tinayiot_case.stl b/cad/tinayiot_case.stl new file mode 100644 index 00000000..a5e78c9c Binary files /dev/null and b/cad/tinayiot_case.stl differ diff --git a/model-deployment/README.md b/model-deployment/README.md index 6c33666e..4b6385f6 100644 --- a/model-deployment/README.md +++ b/model-deployment/README.md @@ -1,12 +1,35 @@ # Model Conversion +This sub-repository contains all the code required for model conversion & compression and model deployment. Within this repository, the C/C++ project for deploying the quantized model on the ESP chip is organized in the `yolo11_detect` directory. 
+ +## Sub-Repository Structure + +```text +├── build # build directory of ESP-IDF +├── calib_images_compressed # compressed images used for model calibration during model compression +├── coco_detect # coco_detect sub-directory containing the code snippets and examples from Espressif for model conversion +│ ├── generate_onnx +│ └── models +├── data # selection of images used for model calibration +│ └── calibration_datasets +├── model_conversion # Python code for model conversion (.pt -> .onnx -> .espdl) +│ ├── core # config constants and paths +│ └── utils # auxiliary methods and functions for model conversion and data preparation +├── models -> ../model-training/models # symlink to models directory (synced by DVC) +├── notebooks # ignored directory containing .ipynb notebook files for exploration +└── yolo11_detect # ESP-IDF project directory, containing C/C++ code and IDF project files + ├── build + ├── main + └── managed_components +``` + ## Conversion Requirements - Install requirements with uv: uv sync -- Ensure you have pulled the dvc dataset -- Put a `yolo11n.pt` model in the `model-deployment/coco_detect/models/` folder +- Ensure you have pulled the dvc dataset (cf. [README in model-training sub-repo](../model-training/README.md)) +- Put a `yolo11n.pt` model (trained model) in the `model-deployment/coco_detect/models/` folder - Optional: Change parameters of paths and constants in `model-deployment/model_conversion/core/` ## Conversion @@ -29,8 +52,9 @@ - You can make a prediction on one image with this ESP32S3 script. To change the image, copy an image of size 640x640 named bikes.jpg into `model-deployment/yolo11_detect/main/`. - Optional: Change parameters for the detection thresholds in `model-deployment/coco_detect/coco_detect.cpp`. 
There, the first parameters after `m_model` represents the confidence and IoU threshold and the max detection value, in this case 25%, 70% and 100 respectively: - - new dl::detect::yolo11PostProcessor(m_model, 0.25, 0.7, 100, {{8, 8, 4, 4}, {16, 16, 8, 8}, {32, 32, 16, 16}}); +```c++ +new dl::detect::yolo11PostProcessor(m_model, 0.25, 0.7, 100, {{8, 8, 4, 4}, {16, 16, 8, 8}, {32, 32, 16, 16}}); +``` ## Deployment @@ -41,16 +65,18 @@ This deployment is tested for ESP32S3. - Move into `./yolo11_detect/` to build the program. Use the following command to build the program: - idf.py fullclean build flash monitor +```bash +idf.py fullclean build flash monitor +``` +You can also use the [VS Code ESP-IDF Extension](https://docs.espressif.com/projects/vscode-esp-idf-extension/en/latest/) for building, flashing, and monitoring. ### Bugs before build - if the build is crashing, it might be due to a too big image. Reduce the size with the help of the `model_deployment.ipynb` or some other software. The yolo11n is trained on images of size 640x640. Smaller resolutions also work. 
- make sure you are in your esp-idf virtual environment with Python 3.10 - sometimes you have to set the IDF_TARGET again: - - unset IDF_TARGET - idf.py set-target esp32s3 - - +```bash +unset IDF_TARGET +idf.py set-target esp32s3 +``` diff --git a/model-training/README.md b/model-training/README.md index 04ceb7c2..a5c860a7 100644 --- a/model-training/README.md +++ b/model-training/README.md @@ -40,8 +40,35 @@ Activate the virtual environment by running: source .venv/bin/activate ``` +The following setup instructions are optional: + +Download and start Label Studio for image labelling: +```bash +uv sync --group labeling +label-studio start +``` + +Install the Jupyter Notebook dependency group: +```bash +uv sync --group notebooks +``` + +Install dependencies for Quantization-aware Training: +```bash +uv sync --group qat +``` + +If you want to install all dependencies, just run: +```bash +uv sync --all-groups +``` + + #### DVC +> [!NOTE] +> The following instructions are relevant for development purposes only. You need the secrets to access the Digital Ocean cloud storage. If you are a maintainer or contributing developer, reach out to the authors to get the secrets. Otherwise, if you want to have a snapshot of the images used for training, you can find it [here](https://uni-muenster.sciebo.de/s/7F6Wqp4oMBHok7K). + Enable data tracking with [DVC](https://dvc.org/doc/start) by running. 
Set the cloud storage reference by running ```bash dvc remote modify tinyaiot access_key_id DIGITAL_OCEAN_ACCESS_KEY_ID --local diff --git a/opencv_imageparsing.py b/opencv_imageparsing.py deleted file mode 100644 index 75f2ed0c..00000000 --- a/opencv_imageparsing.py +++ /dev/null @@ -1,56 +0,0 @@ -import cv2 -import os -import numpy as np - -video_path = "yourdirectorypath" - -output_dir = "extracted_framesbutyoucanchangethisname" -os.makedirs(output_dir, exist_ok=True) - -cap = cv2.VideoCapture(video_path) - -if not cap.isOpened(): - print("Error: Cannot open video file.") - exit() - - -fps = int(cap.get(cv2.CAP_PROP_FPS)) -total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) - -#if you want to capture between 6:14 -6:20 you should tell 6 * 60 + 34 , 6*60+20 basically minute * 60 + seconds -time_ranges = [ - (8 * 60 + 34, 9 * 60), # 8:34 to 9:00 - (14 * 60 + 34, 15 * 60), # 14:34 to 15:00 - (25 * 60 + 34, 26 * 60), # 25:34 to 26:00 - (48 * 60 + 34, 49 * 60), # 48:34 to 49:00 - (51 * 60 + 34, 52 * 60), # 51:34 to 52:00 -] - -frames_per_range = 25 - -for i, (start_time, end_time) in enumerate(time_ranges): - start_frame = int(start_time * fps) - end_frame = int(end_time * fps) - - frame_indices = np.linspace(start_frame, end_frame, frames_per_range, dtype=int) - - for j, frame_idx in enumerate(frame_indices): - if frame_idx >= total_frames: - print(f"Frame {frame_idx} exceeds total frame count. Skipping.") - continue - - cap.set(cv2.CAP_PROP_POS_FRAMES, frame_idx) - ret, frame = cap.read() - - if not ret: - print(f"Error reading frame {frame_idx}. 
Skipping.") - continue - - #you can give it a differnt name - frame_filename = os.path.join(output_dir, f"{i+1}_{start_time}_frame{j+1:02d}.jpg") - cv2.imwrite(frame_filename, frame) - print(f"Saved: {frame_filename}") - - -cap.release() -print("Finished extracting frames.") \ No newline at end of file diff --git a/sketches/README.md b/sketches/README.md new file mode 100644 index 00000000..ac2d374d --- /dev/null +++ b/sketches/README.md @@ -0,0 +1,31 @@ +# Sketches + +This sub-repository contains all MCU scripts, in the form of Arduino sketches, that were used throughout this project. + +## Sub-Repository Structure + +```text +├── camera_capture # initial attempt for photo capturing using ESP32-S3's on-device camera +├── camera_capture_v2_bydatetime # improved version of camera_capture, but with datetime as file names for storing images on SD cards +├── camera_capture_w_calibration # final version of camera_capture incorporating camera re-calibration logic +├── camera_server # auxiliary sketch for streaming camera views to a dedicated URL +└── lorawan_send # production sketch used for LoRaWAN data transmission +``` + +[camera_capture](./camera_capture/) has some flaws, as it was our initial attempt at capturing images through the on-device camera of the ESP32-S3. Collected images are stored on a micro SD card. Image file names are numbered starting at one. Every time the sketch is re-run, the existing files on the SD card get overwritten. + +[camera_capture_v2_bydatetime](./camera_capture_v2_bydatetime/) improves this a bit by setting the image file names to the datetime the image was captured. + +[camera_capture_w_calibration](./camera_capture_w_calibration/) is the final version. It names image files with unique integer numbers, starting at the next number that does not yet exist on the SD card. Further, it incorporates camera re-calibration logic that enables consistent image capturing in varying illumination settings. 
More information about this can be found in the documentation (see this repo's wiki). + +[camera_server](./camera_server/) is an auxiliary sketch that allows streaming the camera output to a dedicated local webserver. This might be useful to test whether the mounted MCU captures a good point of view of the scene. + +[lorawan_send](./lorawan_send/) includes the data transmission logic. This sketch runs on a second MCU and includes the BLE client. The BLE client receives the model predictions from the BLE server and forwards them via LoRaWAN to TTN. The secrets for the TTN nodes have to be changed, as well as the BLE server-client credentials. + + +## SD Card Formatting + +> [!NOTE] +> Before inserting a micro SD card into the dedicated ESP32-S3 card slot, the SD card has to be formatted with the FAT32 file system. Otherwise, writing data to the SD card will fail. Have a look at the [official documentation](https://wiki.seeedstudio.com/xiao_esp32s3_sense_filesystem/#prepare-the-microsd-card) to see how to do this. + +For macOS users, [this official SD card formatting software](https://www.sdcard.org/downloads/formatter/sd-memory-card-formatter-for-mac-download/) was used. Windows and Linux users may refer to the [XIAO documentation on how to prepare the micro SD card](https://wiki.seeedstudio.com/xiao_esp32s3_sense_filesystem/#prepare-the-microsd-card).