Recognition of Basketball Tactics Based on Vision Transformer and Track Filter

XU Guoliang; SHEN Gang; LIANG Xupeng; LUO Jiangtao

doi:10.11999/JEIT230079

Volume 46 Issue 2

Feb. 2024

Turn off MathJax

Article Contents

Article Navigation > Journal of Electronics & Information Technology > 2024 > 46(2): 615-623

XU Guoliang, SHEN Gang, LIANG Xupeng, LUO Jiangtao. Recognition of Basketball Tactics Based on Vision Transformer and Track Filter[J]. Journal of Electronics & Information Technology, 2024, 46(2): 615-623. doi: 10.11999/JEIT230079

Citation:

XU Guoliang, SHEN Gang, LIANG Xupeng, LUO Jiangtao. Recognition of Basketball Tactics Based on Vision Transformer and Track Filter[J]. Journal of Electronics & Information Technology, 2024, 46(2): 615-623. doi: 10.11999/JEIT230079

Citation:

PDF( 4321 KB)

Recognition of Basketball Tactics Based on Vision Transformer and Track Filter

doi: 10.11999/JEIT230079

1.
School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2.
School of Physical Education, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Funds: Chongqing Municipal Sports Bureau Research Key Projects (A2019002, A202113)

Received Date: 2023-02-21
Rev Recd Date: 2023-05-09

Available Online: 2023-05-17

Publish Date: 2024-02-29

Abstract

Abstract

The analysis of player trajectory data using machine learning to obtain offensive or defensive tactics is a crucial component of understanding basketball video content. Traditional machine learning methods require the setting of feature variables manually, significantly reducing flexibility. Therefore, the key issue is how to automatically obtain feature information that can be used for tactic recognition. To address this issue, a basketball Tactic Vision Transformer (TacViT) recognition model is proposed based on player trajectory data from the National Basketball Association (NBA) games. The proposed model adopts Vision Transformer (ViT) as the backbone network and multi-head attention modules to extract rich global trajectory feature information. Trajectory filters are also incorporated in order to not only enhance the feature interaction between the court lines and player trajectories, but also strengthen the representation of player position features in this study. The trajectory filters learn the long-term spatial correlations in the frequency domain with log-linear complexity. A self-built basketball tactic dataset (PlayersTrack) is created from the sequence data of the Sport Vision System (SportVU), which are converted into trajectory graphs in this work. The experiments on this dataset showed that the accuracy of TacViT reached 82.5%, which is a 16.7% improvement over the accuracy of the Vision Transformer S model (ViT-S) without modifications.
- Basketball tactics recognition,
- Players trajectory,
- Track filter,
- Log-linear complexity,
- Multi-head attention

FullText(HTML)

References(25)

References

[1]	PERŠE M, KRISTAN M, KOVAČIČ S, et al. A trajectory-based analysis of coordinated team activity in a basketball game[J]. Computer Vision and Image Understanding, 2009, 113(5): 612–621. doi: 10.1016/j.cviu.2008.03.001.
[2]	YUE Qiang and WEI Chao. Innovation of human body positioning system and basketball training system[J]. Computational Intelligence and Neuroscience, 2022, 2022: 2369925. doi: 10.1155/2022/2369925.
[3]	MILLER A C and BORNN L. Possession sketches: Mapping NBA strategies[C]. The 2017 MIT Sloan Sports Analytics Conference, Boston, USA, 2017.
[4]	TIAN Changjia, DE SILVA V, CAINE M, et al. Use of machine learning to automate the identification of basketball strategies using whole team player tracking data[J]. Applied Sciences, 2019, 10(1): 24. doi: 10.3390/app10010024.
[5]	MCINTYRE A, BROOKS J, GUTTAG J, et al. Recognizing and analyzing ball screen defense in the NBA[C]. The MIT Sloan Sports Analytics Conference, Boston, USA, 2016: 11–12.
[6]	WANG K C and ZEMEL R. Classifying NBA offensive plays using neural networks[C]. MIT Sloan Sports Analytics Conference, Boston, USA, 2016.
[7]	CHEN H T, CHOU C L, FU T S, et al. Recognizing tactic patterns in broadcast basketball video using player trajectory[J]. Journal of Visual Communication and Image Representation, 2012, 23(6): 932–947. doi: 10.1016/j.jvcir.2012.06.003.
[8]	TSAI T Y, LIN Y Y, LIAO H Y M, et al. Recognizing offensive tactics in broadcast basketball videos via key player detection[C]. 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 2017: 880–884.
[9]	TSAI T Y, LIN Y Y, JENG S K, et al. End-to-end key-player-based group activity recognition network applied to basketball offensive tactic identification in limited data scenarios[J]. IEEE Access, 2021, 9: 104395–104404. doi: 10.1109/ACCESS.2021.3098840.
[10]	CHEN C H, LIU T L, WANG Y S, et al. Spatio-temporal learning of basketball offensive strategies[C]. The 23rd ACM international conference on Multimedia, Brisbane, Australia, 2015: 1123–1126.
[11]	BORHANI Y, KHORAMDEL J, and NAJAFI E. A deep learning based approach for automated plant disease classification using vision transformer[J]. Scientific Reports, 2022, 12(1): 11554. doi: 10.1038/s41598-022-15163-0.
[12]	HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. The 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778.
[13]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]. The 31st International Conference onAdvances in Neural Information Processing Systems, Long Beach, USA, 2017: 6000–6010.
[14]	LI Shaohua, XUE Kaiping, ZHU Bin, et al. FALCON: A Fourier transform based approach for fast and secure convolutional neural network predictions[C]. The 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 8702–8711.
[15]	DING Caiwen, LIAO Siyu, WANG Yanzhi, et al. CirCNN: Accelerating and compressing deep neural networks using block-circulant weight matrices[C]. The 50th Annual IEEE/ACM International Symposium on Microarchitecture, Cambridge, USA, 2017: 395–408.
[16]	DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[C]. The 9th International Conference on Learning Representations, Vienna, Austria, 2021.
[17]	WU Kan, PENG Houwen, CHEN Minghao, et al. Rethinking and improving relative position encoding for vision transformer[C]. Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 10013–10021.
[18]	HSIEH H Y, CHEN C Y, WANG Y S, et al. BasketballGAN: Generating basketball play simulation through sketching[C]. The 27th ACM International Conference on Multimedia, Chengdu, China, 2019: 720–728.
[19]	HAN Kai, XIAO An, WU Enhua, et al. Transformer in transformer[C]. Advances in Neural Information Processing Systems, 2021: 15908–15919.
[20]	CHEN C F R, FAN Quanfu, and PANDA R. CrossViT: Cross-attention multi-scale vision transformer for image classification[C]. The 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada, 2021: 347–356.
[21]	STSTS. PERFORMANCE ANAL YSIS POWERED BYAI[OL]. https://www.statsperform.com/team-performance. 2022.7.
[22]	RAO Yongming, ZHAO Wenliang, ZHU Zheng, et al. Global filter networks for image classification[C]. Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021: 980–993.
[23]	LIU Ze, LIN Yutong, CAO Yue, et al. Swin transformer: Hierarchical vision transformer using shifted windows[C]. The 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 9992–10002.
[24]	TOUVRON H, CORD M, DOUZE M, et al. Training data-efficient image transformers & distillation through attention[C]. The 38th International Conference on Machine Learning, Vienna, Austria, 2021: 10347–10357.
[25]	TOUVRON H, BOJANOWSKI P, CARON M, et al. ResMLP: Feedforward networks for image classification with data-efficient training[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(4): 5314–5321. doi: 10.1109/TPAMI.2022.3206148.