Unsupervised Translation Disambiguation Based on Equivalent Pseudo Translation Model
doi: 10.3724/SP.J.1146.2007.01029
Abstract: This paper proposes an unsupervised translation disambiguation method based on the Equivalent Pseudo Translation (EPT). For each sense of an ambiguous source-language word, an EPT is defined and constructed from the set of monosemous synonyms of that sense's target-language translation. The EPT is then used to extract a large number of sense-tagged examples automatically from a large-scale target-language (Chinese) corpus, and the semantic knowledge obtained from these examples directly forms a sense classifier for the EPT. To apply this classifier, English examples containing the target ambiguous word are mapped into sets of Chinese words using Hownet, and the classifier then selects the translation. Evaluated on the Senseval-2 English lexical sample (ELS) task, the method outperforms the two other systems that acquire sense-tagged examples automatically, and its performance is comparable to that of the best comparable unsupervised system on Senseval-2 ELS.
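The pipeline summarized in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the classifier, so a naive Bayes model with add-alpha smoothing stands in for it; the small `lexicon` dictionary is a hypothetical stand-in for the Hownet mapping; and the toy corpus, the sense labels, and the function names (`collect_sense_contexts`, `map_context`, `classify`) are invented here, not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def collect_sense_contexts(corpus, sense_synonyms, window=2):
    """Count the words that occur near each sense's monosemous synonyms.

    corpus: list of tokenized target-language sentences.
    sense_synonyms: sense label -> set of monosemous target-language words.
    """
    contexts = defaultdict(Counter)
    for sentence in corpus:
        for i, token in enumerate(sentence):
            for sense, synonyms in sense_synonyms.items():
                if token in synonyms:
                    left = sentence[max(0, i - window):i]
                    right = sentence[i + 1:i + 1 + window]
                    contexts[sense].update(left + right)
    return contexts


def map_context(english_context, lexicon):
    """Map English context words to target-language words (hypothetical lexicon)."""
    mapped = []
    for word in english_context:
        mapped.extend(lexicon.get(word, []))
    return mapped


def classify(contexts, mapped_words, alpha=1.0):
    """Pick the sense with the highest smoothed log-likelihood (uniform prior)."""
    vocab = {w for counts in contexts.values() for w in counts}
    best_sense, best_score = None, float("-inf")
    for sense, counts in contexts.items():
        total = sum(counts.values())
        denom = total + alpha * len(vocab)
        score = sum(math.log((counts[w] + alpha) / denom) for w in mapped_words)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Toy data (assumed): two senses of the English word "bank".
sense_synonyms = {"finance": {"银行"}, "river": {"河岸", "岸边"}}
corpus = [
    ["我", "去", "银行", "存", "钱"],
    ["银行", "贷款", "利率", "上升"],
    ["他", "在", "河岸", "钓鱼"],
    ["岸边", "的", "河水", "很", "清"],
]
lexicon = {
    "money": ["钱"],
    "loan": ["贷款"],
    "fish": ["钓鱼", "鱼"],
    "water": ["河水", "水"],
}

contexts = collect_sense_contexts(corpus, sense_synonyms)
sense_a = classify(contexts, map_context(["money", "loan"], lexicon))
sense_b = classify(contexts, map_context(["fish", "water"], lexicon))
print(sense_a, sense_b)
```

The point of the sketch is only that no hand-labeled data is needed: the sense labels come for free from the monosemous synonyms, and the English test instance is compared against them after being mapped into the target language.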