This project addresses indoor localization for mobile robots by combining deep learning with camera image processing. Using Convolutional Neural Networks (CNNs), including YOLO (You Only Look Once), VGG (Visual Geometry Group), and AlexNet, trained on the RGB-D Dataset 7-Scenes from Microsoft, our research aims to improve the precision and efficiency of indoor localization. Beyond position alone, the project targets full pose estimation, recovering both the position and the orientation of the robot in diverse indoor environments.
- Ayça Elif Aktaş
- Mustafa Mert Gökbayrak
- Mustafa Ünel
- 12.11.2023
Reliable indoor localization is a prerequisite for deploying mobile robots in logistics, healthcare, and other domains. Our goal is to enable robots to navigate indoors accurately enough that their work is both effective and dependable. Indoor environments call for approaches beyond traditional GPS and landmark-based systems. Our research is motivated by the need for localization solutions that adapt to dynamic conditions without sacrificing accuracy or efficiency. Using the color and depth images of the RGB-D Dataset 7-Scenes from Microsoft, we aim to develop a system that not only localizes mobile robots but also accurately estimates their pose.
- To develop a localization system based on CNNs, with a focus on the YOLO, VGG, and AlexNet networks.
- To add pose estimation to the system so that it recovers the robot's orientation as well as its position within an indoor space.
- To empirically validate the proposed system's effectiveness across a variety of indoor settings, ensuring compliance with industry standards and IEEE benchmarks.
Our approach is built on the RGB-D Dataset 7-Scenes from Microsoft, which supplies our models with depth-aware spatial information. The dataset is the basis for training the selected CNN architectures for recognition and localization, and its depth channel provides the depth perception needed for accurate pose estimation.
The project utilizes the RGB-D Dataset 7-Scenes from Microsoft for its rich depth and color information, vital for training our models to accurately perceive and interpret complex indoor environments. Preprocessing steps include image normalization and augmentation to ensure model robustness and generalizability.
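As an illustration of the preprocessing described above, the following minimal Python sketch standardizes pixel values and applies a random brightness jitter. The ImageNet channel statistics, the function names, and the jitter range are assumptions made for this example; the project does not specify them. A photometric augmentation is used because geometric augmentations such as flips or rotations can invalidate the pose labels unless the labels are adjusted to match.

```python
import random

# Assumption: ImageNet channel statistics, commonly paired with pre-trained
# CNNs. The project does not state which statistics it uses.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, MEAN, STD))


def jitter_brightness(image, factor):
    """Photometric augmentation: scale intensities and clip to [0, 255]."""
    return [
        [tuple(min(255, max(0, round(c * factor))) for c in px) for px in row]
        for row in image
    ]


def preprocess(image, rng, max_jitter=0.2):
    """Apply a random brightness jitter, then normalize every pixel.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    `max_jitter` is a hypothetical parameter chosen for illustration.
    """
    factor = 1.0 + rng.uniform(-max_jitter, max_jitter)
    jittered = jitter_brightness(image, factor)
    return [[normalize_pixel(px) for px in row] for row in jittered]


if __name__ == "__main__":
    tiny_image = [[(10, 20, 30), (200, 210, 220)]]
    print(preprocess(tiny_image, random.Random(42)))
```

In practice the same steps would be applied to whole image arrays with a numerical library; nested lists are used here only to keep the sketch dependency-free.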
- Data Mastery: We use the RGB-D Dataset 7-Scenes to cover a broad range of indoor scenarios.
- Architectural Innovation: We experiment with YOLO, VGG, and AlexNet, applying transfer learning to adapt the pre-trained CNN architectures for room classification and precise localization.
- Pose Revelation: Beyond localization, we estimate the robot's pose, including its orientation.
- Prototype Realization: The work culminates in a working prototype that serves as a basis for future exploration.
YOLO, VGG, and AlexNet architectures will be evaluated for their effectiveness in processing and classifying indoor scenes. Each architecture's performance will be assessed based on its ability to accurately classify rooms, estimate coordinates, and derive the pose of the robot, giving a complete picture of the robot's location and orientation.
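The three assessment criteria above could be quantified as sketched below. The source does not name specific metrics, so classification accuracy, Euclidean position error, and the angle of the relative rotation are assumed here as common choices; the function names are hypothetical.

```python
import math


def classification_accuracy(predicted, actual):
    """Fraction of frames whose predicted room label matches the ground truth."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def translation_error(t_pred, t_true):
    """Euclidean distance between predicted and true positions (input units)."""
    return math.dist(t_pred, t_true)


def rotation_error_deg(r_pred, r_true):
    """Angle, in degrees, of the relative rotation between two 3x3 rotations."""
    # trace(R_pred^T R_true) equals the sum of element-wise products.
    trace = sum(r_pred[i][j] * r_true[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against round-off pushing the cosine outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    quarter_turn_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
    print(classification_accuracy(["office", "stairs"], ["office", "chess"]))
    print(translation_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
    print(rotation_error_deg(identity, quarter_turn_z))
```

Reporting position and orientation errors separately keeps the two quantities in their own units (for example meters and degrees) rather than merging them into a single score.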
The project adopts a two-phase experimental design:
- CNN-Based Localization: Assessing YOLO, VGG, and AlexNet for room classification, coordinate estimation, and pose estimation, evaluating the models' precision, efficiency, and adaptability.
- Pose Estimation: Integrating pose estimation methodologies to augment the spatial awareness of the localization system, focusing on the orientation and positioning accuracy of the mobile robot.
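One way to prepare regression targets for the pose estimation phase is to convert each ground-truth camera pose into a position vector plus a unit quaternion. The sketch below assumes that each frame's pose is available as a text file holding a 4x4 camera-to-world matrix, which is how the 7-Scenes data is commonly distributed. The 7-value target layout and the function names `parse_pose`, `rotation_to_quaternion`, and `pose_target` are illustrative choices, not the project's confirmed design.

```python
import math


def parse_pose(text):
    """Parse a 4x4 homogeneous matrix from whitespace-separated text."""
    values = [float(v) for v in text.split()]
    if len(values) != 16:
        raise ValueError("expected 16 values for a 4x4 pose matrix")
    return [values[r * 4:(r + 1) * 4] for r in range(4)]


def rotation_to_quaternion(m):
    """Convert the upper-left 3x3 rotation of m to a unit quaternion (w, x, y, z)."""
    trace = m[0][0] + m[1][1] + m[2][2]
    if trace > 0:
        s = 2.0 * math.sqrt(trace + 1.0)
        q = (0.25 * s, (m[2][1] - m[1][2]) / s,
             (m[0][2] - m[2][0]) / s, (m[1][0] - m[0][1]) / s)
    elif m[0][0] > m[1][1] and m[0][0] > m[2][2]:
        s = 2.0 * math.sqrt(1.0 + m[0][0] - m[1][1] - m[2][2])
        q = ((m[2][1] - m[1][2]) / s, 0.25 * s,
             (m[0][1] + m[1][0]) / s, (m[0][2] + m[2][0]) / s)
    elif m[1][1] > m[2][2]:
        s = 2.0 * math.sqrt(1.0 + m[1][1] - m[0][0] - m[2][2])
        q = ((m[0][2] - m[2][0]) / s, (m[0][1] + m[1][0]) / s,
             0.25 * s, (m[1][2] + m[2][1]) / s)
    else:
        s = 2.0 * math.sqrt(1.0 + m[2][2] - m[0][0] - m[1][1])
        q = ((m[1][0] - m[0][1]) / s, (m[0][2] + m[2][0]) / s,
             (m[1][2] + m[2][1]) / s, 0.25 * s)
    # q and -q encode the same rotation; keep w >= 0 so targets are consistent.
    return q if q[0] >= 0 else tuple(-c for c in q)


def pose_target(text):
    """Return (x, y, z, qw, qx, qy, qz) for one pose file's contents."""
    m = parse_pose(text)
    translation = (m[0][3], m[1][3], m[2][3])
    return translation + rotation_to_quaternion(m)


if __name__ == "__main__":
    sample = "0 -1 0 1.5\n1 0 0 -2.0\n0 0 1 0.25\n0 0 0 1"
    print(pose_target(sample))
```

A quaternion is used in this sketch because it needs only four numbers and is easy to renormalize after a network predicts it; other orientation encodings, such as Euler angles or rotation matrices, would be equally valid starting points.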
The project envisages ongoing refinement and expansion through broader datasets, more advanced neural architectures, and extensive real-world testing. Our aim is to advance indoor localization technology and to improve its accuracy and efficiency in mobile robotics.