In this paper, we present a sequential training methodology for improving elevator button recognition with the YOLOv5 object detection model. The methodology has three phases. In the first phase, we generate a synthetic dataset in which elevator buttons, cropped from their original context, are placed on random image backgrounds, so that the model learns to identify buttons from their own features rather than from their surroundings. In the second phase, we augment the cropped button dataset with transformations such as random flips, rotations, and scaling, which increases the diversity of the training data and helps the model generalize to variations in button appearance. In the final phase, we train the model on images of full elevator panels so that it learns the contextual placement and spatial relationships of buttons within a panel, which is essential for accurate detection in real-world scenarios. Additionally, we enhance the exposure of the real-time video input to improve visibility under varying lighting conditions, a common challenge in practical applications. For postprocessing, we integrate a Channel and Spatial Reliability Tracker (CSRT) so that, once a button is detected, its position is followed consistently across video frames. By combining synthetic data, data augmentation, and contextual training on full panel images, this approach aims to better simulate real-world scenarios. 
As a result, the proposed methodology significantly enhances the robustness and reliability of the YOLOv5 model in recognizing elevator buttons under diverse conditions.
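
To make the first phase concrete, the following is a minimal sketch of how a cropped button might be composited onto a random background together with a YOLO-format label (class index, then box center and size normalized by the image dimensions). It is an illustration under stated assumptions, not the implementation used in this work: the function name `paste_button`, the image sizes, and the random seed are hypothetical, and the crop is pasted without blending or rescaling.

```python
import numpy as np


def paste_button(background, button, rng, class_id=0):
    """Paste a button crop at a random position on a background image.

    Returns the composited image and a YOLO-style label tuple
    (class_id, x_center, y_center, width, height), with the box
    values normalized to [0, 1] by the background size.
    """
    bg_h, bg_w = background.shape[:2]
    bt_h, bt_w = button.shape[:2]
    if bt_h > bg_h or bt_w > bg_w:
        raise ValueError("button crop must fit inside the background")

    # Random top-left corner such that the crop stays fully inside the image.
    x0 = int(rng.integers(0, bg_w - bt_w + 1))
    y0 = int(rng.integers(0, bg_h - bt_h + 1))

    # Hard paste (assumption: no alpha blending or edge smoothing).
    out = background.copy()
    out[y0:y0 + bt_h, x0:x0 + bt_w] = button

    label = (
        class_id,
        (x0 + bt_w / 2) / bg_w,  # normalized box center, x
        (y0 + bt_h / 2) / bg_h,  # normalized box center, y
        bt_w / bg_w,             # normalized box width
        bt_h / bg_h,             # normalized box height
    )
    return out, label


# Hypothetical stand-ins for a real background photo and a real button crop.
rng = np.random.default_rng(0)
background = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
button = np.full((40, 40, 3), 255, dtype=np.uint8)

image, label = paste_button(background, button, rng)
```

Because the label is derived from the paste position, no manual annotation is needed for the synthetic images. In practice the backgrounds and crops would be loaded from image files, and each label would be written as one line of a text file alongside its image.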