Journal of Capital Medical University, 2011, Vol. 32, Issue (6): 737-741. doi: 10.3969/j.issn.1006-7795.2011.06.005

• Advances in Otorhinolaryngology Head and Neck Surgery •

Recognition of children's Mandarin tone production by an artificial neural network

李永新1, 陈秀伍1,2, 赵小燕1,2, 周宁3, 徐立3, 刘婷1, 张国平1, 王顺成1, 崔丹墨1   

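  1. Department of Otorhinolaryngology Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Key Laboratory of Otorhinolaryngology Head and Neck Surgery, Ministry of Education, Beijing 100730; 2. Beijing Institute of Otorhinolaryngology, Beijing 100005; 3. School of Hearing, Speech and Language Sciences, Ohio University, Athens, OH 45701 USA
  • Received: 2011-09-16 Online: 2011-12-21 Published: 2011-12-21
  • Corresponding author: LI Yong-xin (李永新)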

Recognition of tone production in Chinese phonation of children with an artificial neural network

LI Yong-xin1, CHEN Xiu-wu1,2, ZHAO Xiao-yan1,2, ZHOU Ning3, XU Li3, LIU Ting1, ZHANG Guo-ping1, WANG Shun-cheng1, CUI Dan-mo1   

  1. Department of Otorhinolaryngology Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Key Laboratory of Otorhinolaryngology Head and Neck Surgery, Ministry of Education, Beijing 100730, China; 2. Beijing Institute of Otorhinolaryngology, Beijing 100005, China; 3. School of Hearing, Speech and Language Sciences, Ohio University, Athens, OH 45701, USA
  • Received: 2011-09-16 Online: 2011-12-21 Published: 2011-12-21

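Abstract (Chinese-language version): Objective Tone production is traditionally assessed through subjective judgment by normal-hearing listeners. This study aimed to examine the effectiveness of using an artificial neural network to evaluate tone production in Mandarin-speaking children. Methods Sixty-one normal-hearing children participated in the study. The fundamental frequency was first extracted from the Mandarin monosyllabic words they recorded and used as the input to a feed-forward multilayer neural network with 12 inputs and 16 hidden neurons. The output layer contained 4 neurons representing the four Mandarin tones. The tone-recognition rate of the neural network was compared with the agreement rate of adults' tone identification in a tone-perception experiment. Results The neural network successfully recognized the tones of the 61 children, with a recognition agreement rate of 85%, slightly higher than the accuracy of the adults' tone perception. Both the neural network and the adult tone-perception results showed individual differences in the children's tone production. Conclusion This study shows that an artificial neural network can successfully recognize tones produced by multiple children. The neural network can be used to evaluate the accuracy of children's tone production objectively.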

Key words (Chinese-language version): tone language, tone production, tone recognition, Mandarin tones, pattern recognition

Abstract: Objective Traditionally, tone production is evaluated subjectively by human listeners. The present study was designed to investigate the efficacy of using an artificial neural network to evaluate the tone production of Mandarin-speaking children. Methods The subject group included 61 normal-hearing children aged 3 to 9 years. The fundamental frequency (F0) of the monosyllabic words they produced was extracted. The F0 values were then used as inputs to a feed-forward backpropagation artificial neural network. The numbers of inputs and of neurons in the hidden layer were 12 and 16, respectively. The output layer consisted of 4 neurons representing the 4 Mandarin tone patterns. The tone-recognition performance of the neural network was then compared with that of native-Mandarin-speaking adult listeners. Results The neural network successfully classified the tone patterns of the 61 child speakers with an accuracy of about 85%. This score was significantly better than the perception score of the adult listeners. There was individual variability in the children's tone production accuracy, as revealed by both the tone recognition of the neural network and the tone perception of the adult listeners. Conclusion This study demonstrates that an artificial neural network can successfully classify Mandarin-Chinese tone patterns produced by multiple children. The neural network can be used as an objective way of evaluating the tone production of children.
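
The network topology described in the Methods (12 inputs, 16 hidden neurons, 4 output neurons) can be illustrated with a minimal sketch of the forward pass. This is not the authors' implementation. The abstract does not say how the 12 inputs are derived from the F0 contour, which activation functions were used, or how the network was trained, so the linear resampling, the log-scale mean-centering, the tanh hidden units, the softmax output, and the random untrained weights below are all assumptions made for illustration. The function names are hypothetical.

```python
import math
import random

# Layer sizes taken from the abstract; everything else below is an assumption.
N_INPUTS, N_HIDDEN, N_TONES = 12, 16, 4


def resample_contour(f0_hz, n_points=N_INPUTS):
    """Linearly resample an F0 contour to n_points values (assumed preprocessing)."""
    if len(f0_hz) < 2:
        raise ValueError("need at least two F0 samples")
    out = []
    for i in range(n_points):
        pos = i * (len(f0_hz) - 1) / (n_points - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(f0_hz) - 1)
        frac = pos - lo
        out.append(f0_hz[lo] * (1 - frac) + f0_hz[hi] * frac)
    return out


def normalize(contour):
    """Log-scale and mean-center the contour to reduce speaker pitch-range effects (assumption)."""
    logs = [math.log(v) for v in contour]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases; a real network would learn these by backpropagation."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, activation):
    """Apply one fully connected layer followed by an element-wise activation."""
    weights, biases = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def softmax(z):
    """Convert raw output scores into probabilities that sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def classify_tone(f0_hz, hidden, output):
    """Return (tone number 1-4, per-tone probabilities) for one F0 contour."""
    x = normalize(resample_contour(f0_hz))
    h = dense(x, hidden, math.tanh)
    scores = softmax(dense(h, output, lambda v: v))
    return scores.index(max(scores)) + 1, scores


rng = random.Random(0)
hidden_layer = init_layer(N_INPUTS, N_HIDDEN, rng)
output_layer = init_layer(N_HIDDEN, N_TONES, rng)

# Synthetic rising contour in Hz (made-up data, not from the study).
rising = [180 + 4 * t for t in range(20)]
tone, scores = classify_tone(rising, hidden_layer, output_layer)
print(tone, [round(s, 3) for s in scores])
```

Because the weights are untrained, the predicted tone here is arbitrary. The sketch only shows how a variable-length F0 contour could be reduced to 12 inputs and mapped to 4 tone outputs.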

Key words: tone language, tone production, tone recognition, Mandarin tones, pattern recognition
