Dynamic-Matching Word Lattice Generation Algorithm Based on Weighted Finite State Transducers
doi: 10.3724/SP.J.1146.2013.00422
Exact Word Lattice Generation in Weighted Finite State Transducer Framework
Abstract: Existing Weighted Finite State Transducer (WFST) decoding networks carry no exact word-end markers. As a result, the lattices produced by current lattice generation algorithms either lack exact word end times or are only state-level or phone-level lattices, and they cannot be used for keyword spotting. This paper proposes an algorithm that generates standard speech recognition word lattices within a static WFST decoder. First, the transformation relationship between WFST phone lattices and word lattices is analyzed theoretically. Next, a dynamic lexicon phone-matching method is proposed to recover word end times, which solves the word-end alignment problem in the WFST network. Finally, word lattices are generated by a token-passing traversal. To reduce computation, a pruning strategy is introduced into the token-passing process, so that converting phone lattices into word lattices adds less than 3% to the one-pass decoding time. The resulting lattices can be used for language model rescoring and, because they contain exact word end times, can also be applied directly in keyword spotting systems. Experimental results show that the proposed algorithm is computationally efficient. Compared with the lattices of an existing dynamic decoder, its lattices contain more decoding information and give better performance in both rescoring for large vocabulary continuous speech recognition and keyword spotting.
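The abstract gives no implementation detail, so the following is only a minimal sketch of the general idea of converting a phone lattice into a word lattice by token passing with lexicon matching and beam pruning. It is not the authors' algorithm. The names `build_trie` and `phone_to_word_lattice`, the arc and token layouts, the toy lexicon, the assumption that node ids are topologically ordered, and the beam value are all hypothetical choices made for illustration.

```python
from collections import defaultdict

def build_trie(lexicon):
    """Build a phone trie; each node is {'next': {phone: node}, 'words': [...]}."""
    root = {"next": {}, "words": []}
    for word, phones in lexicon.items():
        node = root
        for p in phones:
            node = node["next"].setdefault(p, {"next": {}, "words": []})
        node["words"].append(word)
    return root

def phone_to_word_lattice(arcs, times, start, lexicon, beam=10.0):
    """Turn phone arcs (src, dst, phone, log_score) into word arcs.

    Assumes integer node ids are already in topological order and that
    `times` maps every node id to its frame index. Returns a set of
    (word, start_time, end_time) tuples.
    """
    root = build_trie(lexicon)
    out = defaultdict(list)
    for src, dst, phone, score in arcs:
        out[src].append((dst, phone, score))

    # tokens[node] holds (path_score, word_start_node, trie_node) triples.
    tokens = defaultdict(list)
    tokens[start].append((0.0, start, root))
    word_arcs = set()

    for node in sorted(times):
        live = tokens.pop(node, [])
        if not live:
            continue
        best = max(t[0] for t in live)
        # Beam pruning: drop tokens that fall too far below the best one here.
        live = [t for t in live if t[0] >= best - beam]
        for path_score, word_start, trie_node in live:
            for dst, phone, score in out[node]:
                nxt = trie_node["next"].get(phone)
                if nxt is None:
                    continue  # this phone sequence matches no lexicon entry
                new_score = path_score + score
                tokens[dst].append((new_score, word_start, nxt))
                for word in nxt["words"]:
                    # A complete pronunciation ends at dst: record the word
                    # together with its exact start and end times.
                    word_arcs.add((word, times[word_start], times[dst]))
                if nxt["words"]:
                    # Restart matching at the trie root for the next word.
                    tokens[dst].append((new_score, dst, root))
    return word_arcs

# Toy input: one good path "h ay dh eh r" plus a poorly scoring "b" arc.
LEXICON = {"hi": ("h", "ay"), "high": ("h", "ay"),
           "bye": ("b", "ay"), "there": ("dh", "eh", "r")}
TIMES = {0: 0, 1: 5, 2: 12, 3: 15, 4: 20, 5: 27}
ARCS = [(0, 1, "h", -1.0), (0, 1, "b", -50.0), (1, 2, "ay", -1.0),
        (2, 3, "dh", -1.0), (3, 4, "eh", -1.0), (4, 5, "r", -1.0)]

narrow = phone_to_word_lattice(ARCS, TIMES, 0, LEXICON, beam=10.0)
wide = phone_to_word_lattice(ARCS, TIMES, 0, LEXICON, beam=100.0)
```

In this toy input the homophones "hi" and "high" both survive as parallel word arcs with the same end time, while the low-scoring "bye" hypothesis is kept only when the beam is wide enough. A real system would also carry acoustic and language model scores on the word arcs and merge equivalent tokens, which this sketch omits.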