Citation: JI Zhongping, WANG Xiangwei, HE Zhiwei, DU Chenjie, JIN Ran, CHAI Bencheng. End-to-end Multi-Object Tracking Algorithm Integrating Global Local Feature Interaction and Angular Momentum Mechanism[J]. Journal of Electronics & Information Technology, 2024, 46(9): 3703-3712. doi: 10.11999/JEIT240277
[1] ZHANG Hongying and HE Pengyi. Pedestrian tracking algorithm based on convolutional block attention module and anchor-free detection network[J]. Journal of Electronics & Information Technology, 2022, 44(9): 3299–3307. doi: 10.11999/JEIT210634.
[2] WU Han, NIE Jiahao, ZHANG Zhaowei, et al. Deep learning-based visual multiple object tracking: A review[J]. Computer Science, 2023, 50(4): 77–87. doi: 10.11896/jsjkx.220300173.
[3] BEWLEY A, GE Zongyuan, OTT L, et al. Simple online and realtime tracking[C]. 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, USA, 2016: 3464–3468. doi: 10.1109/ICIP.2016.7533003.
[4] SUN Shijie, AKHTAR N, SONG Huansheng, et al. Deep affinity network for multiple object tracking[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(1): 104–119. doi: 10.1109/TPAMI.2019.2929520.
[5] WOJKE N, BEWLEY A, and PAULUS D. Simple online and realtime tracking with a deep association metric[C]. 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 2017: 3645–3649. doi: 10.1109/ICIP.2017.8296962.
[6] WANG Zhongdao, ZHENG Liang, LIU Yixuan, et al. Towards real-time multi-object tracking[C]. 16th European Conference on Computer Vision, Glasgow, UK, 2020: 107–122. doi: 10.1007/978-3-030-58621-8_7.
[7] ZHANG Yifu, WANG Chunyu, WANG Xinggang, et al. FairMOT: On the fairness of detection and re-identification in multiple object tracking[J]. International Journal of Computer Vision, 2021, 129(11): 3069–3087. doi: 10.1007/s11263-021-01513-4.
[8] YU En, LI Zhuoling, HAN Shoudong, et al. RelationTrack: Relation-aware multiple object tracking with decoupled representation[J]. IEEE Transactions on Multimedia, 2023, 25: 2686–2697. doi: 10.1109/TMM.2022.3150169.
[9] CHU Peng, WANG Jiang, YOU Quanzeng, et al. TransMOT: Spatial-temporal graph transformer for multiple object tracking[C]. Proceedings of 2023 IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, USA, 2023: 4859–4869. doi: 10.1109/WACV56688.2023.00485.
[10] XU Yihong, BAN Yutong, DELORME G, et al. TransCenter: Transformers with dense representations for multiple-object tracking[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(6): 7820–7835. doi: 10.1109/TPAMI.2022.3225078.
[11] PENG Jinlong, WANG Changan, WAN Fangbin, et al. Chained-tracker: Chaining paired attentive regression results for end-to-end joint multiple-object detection and tracking[C]. 16th European Conference on Computer Vision, Glasgow, UK, 2020: 145–161. doi: 10.1007/978-3-030-58548-8_9.
[12] PANG Bo, LI Yizhuo, ZHANG Yifan, et al. TubeTK: Adopting tubes to track multi-object in a one-step training model[C]. Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 6307–6317. doi: 10.1109/CVPR42600.2020.00634.
[13] ZHANG Chuang, ZHENG Sifa, WU Haoran, et al. AttentionTrack: Multiple object tracking in traffic scenarios using features attention[J]. IEEE Transactions on Intelligent Transportation Systems, 2024, 25(2): 1661–1674. doi: 10.1109/TITS.2023.3315222.
[14] OGAWA T, SHIBATA T, and HOSOI T. FRoG-MOT: Fast and robust generic multiple-object tracking by IoU and motion-state associations[C]. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, USA, 2024: 6549–6558. doi: 10.1109/WACV57701.2024.00643.
[15] HUANG Huimin, XIE Shiao, LIN Lanfen, et al. ScaleFormer: Revisiting the transformer-based backbones from a scale-wise perspective for medical image segmentation[C]. Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, Vienna, Austria, 2022: 964–971. doi: 10.24963/ijcai.2022/135.
[16] MILAN A, LEAL-TAIXÉ L, REID I, et al. MOT16: A benchmark for multi-object tracking[EB/OL]. 2016.
[17] DU Dawei, QI Yuankai, YU Hongyang, et al. The unmanned aerial vehicle benchmark: Object detection and tracking[C]. Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 2018: 375–391. doi: 10.1007/978-3-030-01249-6_23.
[18] LUO Yutong, ZHONG Xinyue, ZENG Minchen, et al. CGLF-Net: Image emotion recognition network by combining global self-attention features and local multiscale features[J]. IEEE Transactions on Multimedia, 2024, 26: 1894–1908. doi: 10.1109/TMM.2023.3289762.
[19] BRASÓ G and LEAL-TAIXÉ L. Learning a neural solver for multiple object tracking[C]. Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 6246–6256. doi: 10.1109/CVPR42600.2020.00628.
[20] XIANG Jun, XU Guohan, MA Chao, et al. End-to-end learning deep CRF models for multi-object tracking[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 31(1): 275–288. doi: 10.1109/TCSVT.2020.2975842.
[21] BERGMANN P, MEINHARDT T, and LEAL-TAIXÉ L. Tracking without bells and whistles[C]. Proceedings of 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea (South), 2019: 941–951. doi: 10.1109/ICCV.2019.00103.
[22] ZHOU Yan, CHEN Junyu, WANG Dongli, et al. Multi-object tracking using context-sensitive enhancement via feature fusion[J]. Multimedia Tools and Applications, 2024, 83(7): 19465–19484. doi: 10.1007/s11042-023-16027-z.
[23] BOCHINSKI E, EISELEIN V, and SIKORA T. High-speed tracking-by-detection without using image information[C]. 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Lecce, Italy, 2017: 1–6. doi: 10.1109/AVSS.2017.8078516.
[24] ZHANG Yifu, SUN Peize, JIANG Yi, et al. ByteTrack: Multi-object tracking by associating every detection box[C]. 17th European Conference on Computer Vision, Tel Aviv, Israel, 2022: 1–21. doi: 10.1007/978-3-031-20047-2_1.
[25] LIU Songtao, HUANG Di, and WANG Yunhong. Receptive field block net for accurate and fast object detection[C]. Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 2018: 404–419. doi: 10.1007/978-3-030-01252-6_24.
[26] PAN Huihui, HONG Yuanduo, SUN Weichao, et al. Deep dual-resolution networks for real-time and accurate semantic segmentation of traffic scenes[J]. IEEE Transactions on Intelligent Transportation Systems, 2023, 24(3): 3448–3460. doi: 10.1109/TITS.2022.3228042.
[27] FU Jun, LIU Jing, TIAN Haijie, et al. Dual attention network for scene segmentation[C]. Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 3141–3149. doi: 10.1109/CVPR.2019.00326.
[28] XIE Yakun, ZHU Jun, LAI Jianbo, et al. An enhanced relation-aware global-local attention network for escaping human detection in indoor smoke scenarios[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2022, 186: 140–156. doi: 10.1016/j.isprsjprs.2022.02.006.