Object recognition is an active research area in autonomous driving, driven by the need to improve safety and reliability for practical deployment. This paper proposes a Master-Slave Transformer architecture based on Multimodal Fusion to improve 3D object recognition in autonomous driving systems. The proposed architecture distributes the processing of data from multiple sensors across several Slave Transformers, and the Master Transformer integrates their outputs into a global feature map. In addition, an adaptive controller responds dynamically to real-time changes in the environment, improving the stability and reliability of object recognition. Experimental results show that the proposed architecture outperforms existing models and achieves still higher accuracy when the adaptive controller is applied.
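The data flow described above can be illustrated with a minimal sketch. It is not the paper's implementation: mean pooling stands in for each Slave Transformer, a weighted sum stands in for the Master Transformer, and the adaptive controller is assumed, purely for illustration, to be a softmax over per-sensor reliability scores. All function names, sensor names, and values are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def slave_encode(tokens: List[Vector]) -> Vector:
    """Stand-in for one per-sensor Slave Transformer (assumption):
    mean-pools the sensor's token vectors into one feature vector."""
    dim = len(tokens[0])
    return [sum(t[i] for t in tokens) / len(tokens) for i in range(dim)]


def adaptive_weights(reliability: Dict[str, float]) -> Dict[str, float]:
    """Hypothetical adaptive controller: turns per-sensor reliability
    scores into fusion weights that sum to 1."""
    names = sorted(reliability)
    weights = softmax([reliability[n] for n in names])
    return dict(zip(names, weights))


def master_fuse(features: Dict[str, Vector],
                weights: Dict[str, float]) -> Vector:
    """Stand-in for the Master Transformer (assumption): combines the
    per-sensor features into one global feature vector by weighted sum."""
    dim = len(next(iter(features.values())))
    return [sum(weights[n] * features[n][i] for n in features)
            for i in range(dim)]


def recognize_features(sensor_tokens: Dict[str, List[Vector]],
                       reliability: Dict[str, float]) -> Vector:
    """Full pipeline: encode each sensor separately, then fuse globally."""
    features = {n: slave_encode(t) for n, t in sensor_tokens.items()}
    return master_fuse(features, adaptive_weights(reliability))


# Illustrative inputs only; these are not data from the paper.
sensor_tokens = {
    "camera": [[1.0, 0.0], [3.0, 2.0]],
    "lidar": [[0.0, 4.0], [2.0, 2.0]],
}
fused_equal = recognize_features(sensor_tokens, {"camera": 0.0, "lidar": 0.0})
fused_lidar = recognize_features(sensor_tokens, {"camera": 0.0, "lidar": 2.0})
```

With equal reliability scores the two sensors contribute equally. Raising the lidar score shifts the fused feature toward the lidar encoding, which is the kind of environment-dependent reweighting an adaptive controller might perform under this assumed design.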