USAF: Multimodal Chinese named entity recognition using synthesized acoustic features
Institution: 1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China; 2. Inner Mongolia University, Hohhot 010021, China; 3. Engineering Research Center of Ecological Big Data, Ministry of Education, China
Abstract: Because of the particularities of Chinese word formation, Chinese Named Entity Recognition (NER) has attracted extensive attention in recent years. Some researchers have recently approached the task with multimodal methods that combine acoustic features and text features. However, the text-speech data pairs these methods require are scarce in real-world scenarios, which makes the methods difficult to apply widely. To address this, we propose a multimodal Chinese NER method called USAF, which uses synthesized acoustic features instead of actual human speech. USAF aligns text and acoustic features through unique position embeddings and uses a multi-head attention mechanism to fuse the features of the two modalities, which consistently improves the performance of Chinese named entity recognition. To evaluate USAF, we implemented it on three Chinese NER datasets. Experimental results show that USAF achieves a consistent improvement over text-only methods on each dataset and outperforms the SOTA external-vocabulary-based method on two datasets. Specifically, compared with the SOTA external-vocabulary-based method, USAF improves the F1 score by 1.84 and 1.24 on CNERTA and Aishell3-NER, respectively.
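The alignment-and-fusion idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the position-embedding scheme or the attention layout, so the sketch assumes that each acoustic frame reuses the sinusoidal position embedding of the character it belongs to, and that text features act as queries over the acoustic features. All names (`fuse`, `frame_to_char`, and so on) are hypothetical, the learned projection matrices of a real multi-head attention layer are omitted, and the feature values are toy numbers, not the output of a speech synthesizer.

```python
import math

def sinusoidal_position(pos, dim):
    """Standard sinusoidal position embedding for a single position."""
    return [
        math.sin(pos / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(pos / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]

def add_positions(features, positions):
    """Add the embedding of positions[k] to the feature vector features[k]."""
    dim = len(features[0])
    return [
        [x + p for x, p in zip(vec, sinusoidal_position(pos, dim))]
        for vec, pos in zip(features, positions)
    ]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multi_head_cross_attention(queries, keys, values, num_heads):
    """Scaled dot-product attention per head (learned projections omitted)."""
    dim = len(queries[0])
    assert dim % num_heads == 0, "feature size must be divisible by num_heads"
    head_dim = dim // num_heads
    outputs = []
    for q in queries:
        fused = []
        for h in range(num_heads):
            lo, hi = h * head_dim, (h + 1) * head_dim
            scores = [
                sum(a * b for a, b in zip(q[lo:hi], k[lo:hi])) / math.sqrt(head_dim)
                for k in keys
            ]
            weights = softmax(scores)
            # Each head writes its own slice; slices are concatenated in order.
            for j in range(lo, hi):
                fused.append(sum(w * v[j] for w, v in zip(weights, values)))
        outputs.append(fused)
    return outputs

def fuse(text_feats, acoustic_feats, frame_to_char, num_heads=2):
    """Assumed alignment: frame k shares the position of character frame_to_char[k]."""
    text = add_positions(text_feats, range(len(text_feats)))
    acoustic = add_positions(acoustic_feats, frame_to_char)
    attended = multi_head_cross_attention(text, acoustic, acoustic, num_heads)
    # Residual connection: keep the text signal and add the attended acoustics.
    return [[t + a for t, a in zip(tv, av)] for tv, av in zip(text, attended)]

# Toy input: 3 characters and 5 acoustic frames, feature size 4.
text_feats = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.3, 0.3, 0.1, 0.0]]
acoustic_feats = [[0.2, 0.0, 0.1, 0.3], [0.1, 0.1, 0.2, 0.2], [0.4, 0.2, 0.0, 0.1],
                  [0.0, 0.3, 0.3, 0.1], [0.2, 0.2, 0.1, 0.0]]
frame_to_char = [0, 0, 1, 2, 2]  # hypothetical frame-to-character alignment
fused = fuse(text_feats, acoustic_feats, frame_to_char)
```

The result `fused` holds one vector per character, so it could feed a standard sequence-labeling layer in place of the text-only features.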
Keywords:
This article is indexed in ScienceDirect and other databases.