AUTHOR=Dai Jiahai, Fu Yunhao, Wang Songxin, Chang Yuchun
TITLE=Siamese hierarchical feature fusion transformer for efficient tracking
JOURNAL=Frontiers in Neurorobotics
VOLUME=Volume 16 - 2022
YEAR=2022
URL=https://www.frontiersin.org/journals/neurorobotics/articles/10.3389/fnbot.2022.1082346
DOI=10.3389/fnbot.2022.1082346
ISSN=1662-5218
ABSTRACT=Object tracking is a fundamental task in computer vision. Most Siamese-based trackers estimate the target position from the responses generated by comparing the template and search branches. Trackers with deeper backbones are computationally expensive and struggle to meet the real-time requirements of edge platforms. However, features extracted by a lightweight backbone are inadequate for clear discrimination in complex scenarios. In this study, we adopted a lightweight backbone and extracted features from multiple levels. A hierarchical feature fusion transformer (HFFT) was designed to mine the interdependencies of multi-level features in a novel model called SiamHFFT. As a result, our tracker can exploit comprehensive feature representations in an end-to-end manner, and the proposed model is capable of handling complex scenarios on a CPU at a rate of 29 FPS. Comprehensive experiments on the UAV123, UAV123@10fps, LaSOT, VOT2020, and GOT-10k benchmarks, with comparisons against multiple trackers, demonstrate the effectiveness and efficiency of SiamHFFT. In particular, SiamHFFT achieves good performance in both accuracy and speed, which has practical implications for improving object tracking performance in the real world.