Multi-party Human-computer Interaction Dialogue Psychology Model Based on Stackelberg Game
-
Abstract: Existing multi-party human-computer interaction systems have a poor sense of propriety, and their robots show little autonomy in deciding when to reply. Drawing on the psycholinguistic theory of dialogue psychology, this paper proposes a dialogue psychology model for multi-party human-computer interaction based on the Stackelberg game. The model simulates the psychological process of each party in human-to-human communication. To capture the communication characteristics of leaders and followers in multi-party interaction, it formalizes the interaction as a single-leader, multi-follower game in which the robot plays the role of a follower. The robot weighs the payoff that the subordinate relationship brings in the multi-party Stackelberg game and uses the result as an important basis for deciding how to reply. Experimental results show that a robot playing the follower role judges the propriety of the dialogue accurately and replies at appropriate moments when interacting with multiple parties, which further improves the rationality and autonomy of its replies.
-
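As a rough illustration of the follower-side reasoning summarized in the abstract, the sketch below lets a robot pick its best response after the leader has set the topic. Only the two follower payoffs (10 for a high-quality, on-topic utterance; 1 for a low-quality, off-topic one) come from Table 1. The `Candidate` type, the `relevance` score, the expected-payoff formula, and the `silence_payoff` threshold are all assumptions made for this example; the paper's own strategy-payoff function $Q$ and its equations are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Follower payoffs as listed in the paper's payoff table (Table 1).
HIGH_QUALITY_PAYOFF = 10.0  # high-quality, on-topic utterance
LOW_QUALITY_PAYOFF = 1.0    # low-quality, off-topic utterance


@dataclass(frozen=True)
class Candidate:
    """A candidate reply (hypothetical type, not from the paper)."""
    text: str
    relevance: float  # assumed probability in [0, 1] that the reply is on-topic


def expected_payoff(candidate: Candidate) -> float:
    """Assumed expected follower payoff: mix of the two Table 1 payoffs."""
    p = candidate.relevance
    return p * HIGH_QUALITY_PAYOFF + (1.0 - p) * LOW_QUALITY_PAYOFF


def follower_best_response(
    candidates: Sequence[Candidate],
    silence_payoff: float = 5.5,  # assumed payoff of staying silent
) -> Optional[Candidate]:
    """Return the payoff-maximizing reply, or None if silence is at least as good.

    The leader has already moved (set the topic); the follower only
    chooses its own best response to that move.
    """
    best = max(candidates, key=expected_payoff, default=None)
    if best is None or expected_payoff(best) <= silence_payoff:
        return None
    return best
```

With candidates scored 0.9 and 0.2, the robot would choose the first. If only the 0.2 candidate were available, it would stay silent, which is one simple way to model replying only at an appropriate moment.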
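Table 1 Payoff settings
Role      Payoff  Strategy
Leader    10      As the initiator of a topic, open the topic
Leader    1       As the closer of a topic, end the topic
Follower  10      Make high-quality, on-topic utterances
Follower  1       Make low-quality, off-topic utterances

Algorithm 1 Multi-party human-computer interaction dialogue psychology model based on the Stackelberg game
Input: the content entered by the parties in session $ k $, $ \left\{ {{C_1},{C_2},\cdots,{C_n}} \right\} $; the robot's role is initialized as follower ${S_0}$;
Output: the robot's reply strategy for session $ k + 1 $;
(1) Repeat:
(2) The parties enter content $ \left\{ {{C_1},{C_2},\cdots,{C_n}} \right\} $;
(3) Analyze the dialogue strategies of the parties according to Eqs. (2) to (11) and compute each party's degree of subordination;
(4) Obtain the players' strategy-payoff function $ Q $ according to Eq. (12);
(5) Compute the robot's optimal reply strategy $ \pi $ as a follower according to Eqs. (13) to (15);
(6) Derive the robot's next reply action $ A $ from the optimal reply strategy $ \pi $;
(7) When the robot needs to join the dialogue, it responds in the next session;
(8) Set $ k = k + 1 $;
(9) Until the parties stop entering content;
(10) The multi-party human-computer interaction session ends;

Table 2 Evaluation criteria for friendliness and rationality
Friendliness
Score  Criterion
+2     Content is highly relevant and coherent; language is natural, fluent, and matches how people speak
+1     Content is barely relevant; coherence is average; language is correct
0      Content is barely relevant; language contains some errors
-1     Content has very low relevance; language contains many errors
-2     Content is irrelevant; the reply does not answer the question
Rationality
Score  Criterion
+2     Reply frequency is moderate
+1     Reply frequency is somewhat high or somewhat low
0      Users must repeat their utterances before the robot understands
-1     Forcibly interrupts the interaction among users
-2     Long "awkward" silence

Table 3 Automatic evaluation results of the compared models
Model       MRR    MAP
CATD        0.525  0.548
GSN         0.517  0.545
ChatterBot  0.452  0.483
MIDS        0.639  0.674
This paper  0.656  0.687

Table 4 True positive rate (TPR) and false positive rate (FPR) results
Result  TPR   FPR
Score   0.80  0.40

Table 5 Number of interaction turns and interaction time between volunteers and each model
Model       Average interaction turns  Average interaction time (s)
CATD        4.68                       84.29
GSN         5.86                       82.98
ChatterBot  2.95                       73.68
MIDS        6.18                       110.27
This paper  9.45                       122.58

Table 6 Volunteers' friendliness and rationality scores for each model
Model       Friendliness  Rationality
CATD        1.42          0.87
GSN         1.38          1.21
ChatterBot  1.22          0.72
MIDS        1.60          1.38
This paper  1.58          1.47
-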
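Tables 3 and 4 report MRR, MAP, TPR, and FPR. This page does not state how the paper computed them, so the sketch below uses the standard textbook definitions. It assumes that each test dialogue yields a ranked candidate list with binary relevance labels, and that reply-timing decisions are summarized as confusion-matrix counts. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def reciprocal_rank(labels: Sequence[int]) -> float:
    """1 / rank of the first relevant item; labels are 1/0 in ranked order."""
    for rank, relevant in enumerate(labels, start=1):
        if relevant:
            return 1.0 / rank
    return 0.0


def average_precision(labels: Sequence[int]) -> float:
    """Mean of precision@k taken at each rank k that holds a relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(labels, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def _mean(values: List[float]) -> float:
    return sum(values) / len(values) if values else 0.0


def mean_reciprocal_rank(rankings: Sequence[Sequence[int]]) -> float:
    """MRR: reciprocal rank averaged over all queries."""
    return _mean([reciprocal_rank(r) for r in rankings])


def mean_average_precision(rankings: Sequence[Sequence[int]]) -> float:
    """MAP: average precision averaged over all queries."""
    return _mean([average_precision(r) for r in rankings])


def tpr_fpr(tp: int, fn: int, fp: int, tn: int) -> Tuple[float, float]:
    """TPR = TP / (TP + FN); FPR = FP / (FP + TN)."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr
```

Under these definitions, a higher MRR or MAP means relevant replies are ranked nearer the top, while a higher TPR paired with a lower FPR means the robot speaks when it should and stays quiet when it should not.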