3D Visualization Method for Tongue Movements in Pronunciation<sup>*</sup>

doi:10.16451/j.cnki.issn1003-6059.201605001


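Abstract: This paper studies the 3D visualization of tongue movements during Mandarin Chinese pronunciation. A detailed 3D tongue model is built from magnetic resonance imaging (MRI) data. On this basis, EMA data from three points on the surface of the tongue dorsum are taken as the driving source, and a mass-spring network is used to reproduce realistically the tongue movements that occur during Chinese pronunciation. To verify the modeling and tongue movement synthesis methods, computer graphics techniques are used to simulate the detailed tongue movements, and the results are compared with the X-ray footage "Motion Characteristics of Articulators for Mandarin Chinese" filmed by linguists. Experiments show that the 3D tongue movements produced by the method agree with real tongue movements and that the method has broad application prospects.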


Keywords: 3D visual speech animation, tongue modeling, tongue movement simulation, collision handling

Abstract: The 3D visualization of tongue movements in pronunciation is studied. First, a precise 3D tongue model is built from magnetic resonance imaging (MRI) scan data. Based on this model, electromagnetic articulometer (EMA) data collected from three points on the tongue dorsum surface are used as the driving data, and the mass-spring technique is used to produce realistic tongue movements in pronunciation. To evaluate the modeling and synthesis methods, computer graphics techniques are employed to simulate the tongue movements in detail. Finally, the simulation results are compared with an X-ray video of the motion characteristics of articulators for Mandarin Chinese recorded by pronunciation specialists. The experimental results show that the proposed method produces precise and realistic 3D tongue movements and has broad application prospects.
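The abstract's central idea, a mass-spring network whose motion is driven by a few measured surface points, can be illustrated with a toy example. The sketch below is not the authors' implementation: it uses a 2D chain of five nodes instead of a 3D tongue mesh, pins two end nodes to a made-up trajectory standing in for EMA samples, and integrates with semi-implicit Euler. The function names (`step`, `simulate`) and all parameter values (`k`, `c`, `mass`, `dt`) are assumptions chosen only for illustration.

```python
import math


def step(pos, vel, springs, driven, k=50.0, c=2.0, mass=1.0, dt=0.01):
    """Advance a 2D mass-spring network by one semi-implicit Euler step.

    pos, vel : lists of [x, y] per node (modified in place)
    springs  : list of (i, j, rest_length)
    driven   : dict {node_index: (x, y)}; these nodes are pinned to the
               given positions (a stand-in for measured EMA points)
    """
    force = [[0.0, 0.0] for _ in pos]
    for i, j, rest in springs:
        dx = pos[j][0] - pos[i][0]
        dy = pos[j][1] - pos[i][1]
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # direction undefined for coincident nodes
        f = k * (length - rest)  # Hooke's law, positive when stretched
        fx, fy = f * dx / length, f * dy / length
        force[i][0] += fx
        force[i][1] += fy
        force[j][0] -= fx
        force[j][1] -= fy
    for n in range(len(pos)):
        if n in driven:
            # Driven nodes follow the prescribed data, not the dynamics.
            pos[n][0], pos[n][1] = driven[n]
            vel[n][0] = vel[n][1] = 0.0
            continue
        for a in range(2):
            acc = (force[n][a] - c * vel[n][a]) / mass  # viscous damping
            vel[n][a] += acc * dt
            pos[n][a] += vel[n][a] * dt
    return pos, vel


def simulate(steps=2000):
    """Hypothetical demo: node 0 is fixed, node 4 is raised along a ramp."""
    pos = [[float(i), 0.0] for i in range(5)]
    vel = [[0.0, 0.0] for _ in pos]
    springs = [(i, i + 1, 1.0) for i in range(4)]
    for s in range(steps):
        # Made-up driver trajectory: node 4 rises from y=0 to y=1, then holds.
        driven = {0: (0.0, 0.0), 4: (4.0, min(1.0, s / 100))}
        step(pos, vel, springs, driven)
    return pos


if __name__ == "__main__":
    for node in simulate():
        print(round(node[0], 3), round(node[1], 3))
```

The free nodes settle on the straight line between the two driven nodes, which shows how a handful of driven points can determine the shape of the rest of the network. A real tongue model would add a volumetric mesh, anatomically motivated spring layout, and collision handling against the palate and teeth, none of which is attempted here.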

Key words: 3D Visual Speech Animation, Tongue Modeling, Tongue Movement Simulation, Collision Handling

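Received: 2015-09-10

Funding: Supported by the National Natural Science Foundation of China (No. 61472393, 61303150)

About the authors: LI Rui (李睿), female, born in 1989, Ph.D. candidate. Her research interests include computer graphics, visual speech processing, and human-computer interaction. E-mail: ruili89@mail.ustc.edu.cn.
YU Jun (於俊), male, born in 1982, Ph.D., associate researcher. His research interests include human-computer interaction, computer graphics, and visual speech processing. E-mail: harryjun@ustc.edu.cn.
LUO Changwei (罗常伟), male, born in 1985, Ph.D. His research interests include computer graphics, human-computer interaction, and video tracking. E-mail: luocw@mail.ustc.edu.cn.
WANG Zengfu (汪增福, corresponding author), male, born in 1960, Ph.D., professor. His research interests include computer vision, pattern recognition, speech visualization, human-computer interaction, and intelligent robots. E-mail: zfwang@ustc.edu.cn.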

Cite this article:

LI Rui, YU Jun, LUO Changwei, WANG Zengfu. 3D Visualization Method for Tongue Movements in Pronunciation[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2016, 29(5): 385-392.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.201605001 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2016/V29/I5/385

[1] DORAN G A, BAGGETT H. A Structural and Functional Classification of Mammalian Tongues. Journal of Mammalogy, 1971, 52(2): 427-429.
[2] HIIEMAE K M, PALMER J B. Tongue Movements in Feeding and Speech. Critical Reviews in Oral Biology and Medicine, 2003, 14(6): 413-429.
[3] WANG L, CHEN H, LI S, et al. Phoneme-Level Articulatory Animation in Pronunciation Training. Speech Communication, 2012, 54(7): 845-856.
[4] JIANG C, YU J, LUO C W, et al. Speech Visualization System Based on Physiological Tongue Model. Journal of Image and Graphics, 2015, 20(9): 1237-1246. (in Chinese)
[5] SONG C, WEI J G, FANG Q, et al. Tongue Shape Synthesis Based on Active Shape Model // Proc of the 8th International Symposium on Chinese Spoken Language Processing. Hong Kong, China, 2012: 383-386.
[6] ENGWALL O. Combining MRI, EMA and EPG Measurements in a Three-Dimensional Tongue Model. Speech Communication, 2003, 41(2/3): 303-329.
[7] COHEN M M, BESKOW J, MASSARO D W. Recent Developments in Facial Animation: An Inside View [C/OL]. [2015-08-20]. http://www.speech.kth.se/prod/publications/files/1143.pdf.
[8] WESTBURY J R. X-Ray Microbeam Speech Production Database User's Handbook: 1.0 Version [EB/OL]. [2015-08-20]. http://www.haskins.yale.edu/staff/gafos_downloads/ubdbman.pdf.
[9] WESTBURY J R, SEVERSON E J, LINDSTROM M J. Kinematic Event Patterns in Speech: Special Problems. Language and Speech, 2000, 43(4): 403-428.
[10] TASKO S M, KENT R D, WESTBURY J R. Variability in Tongue Movement Kinematics during Normal Liquid Swallowing. Dysphagia, 2002, 17(2): 126-138.
[11] GRARD J M, WILHELMS-TRICARICO R, PERRIER P, et al. A 3D Dynamical Biomechanical Tongue Model to Study Speech Motor Control. Research Developments in Biomechanics, 2003, 1: 49-64.
[12] ENGWALL O. A 3D Tongue Model Based on MRI Data // Proc of the 6th International Conference on Speech and Language Processing. Beijing, China, 2000: 901-904.
[13] SONG C. Modeling of 3D Geometry Vocal Tract in the Procession of Speech Production. Master Dissertation. Tianjin, China: Tianjin University, 2013. (in Chinese)
[14] TAKEMOTO H, KITAMURA T, NISHIMOTO H, et al. A Method of Tooth Superimposition on MRI Data for Accurate Measurement of Vocal Tract Shape and Dimensions. Acoustical Science and Technology, 2004, 25(6): 468-474.
[15] BADIN P, ELISEI F, BAILLY G, et al. An Audiovisual Talking Head for Augmented Speech Generation: Models and Animations Based on a Real Speaker's Articulatory Data // Proc of the 5th International Conference on Articulated Motion and Deformable Objects. Mallorca, Spain, 2008: 132-143.
[16] SI H. Tetgen: The Quality Tetrahedral Mesh Generator. Version 1.0 User's Manual [EB/OL]. [2015-08-20]. http://chem.skku.ac.kr/~wkpark/tutor/chem/molsurf/tetgen/UserManual.pdf.
[17] KING S A, PARENT R E. A 3D Parametric Tongue Model for Animated Speech. The Journal of Visualization and Computer Animation, 2001, 12(3): 107-115.
[18] LI R, YU J, JIANG C, et al. A Mass-Spring Tongue Model with Efficient Collision Detection and Response during Speech // Proc of the 9th International Symposium on Chinese Spoken Language Processing. Singapore, Singapore, 2014: 354-358.
[19] YU J, WANG Z F. A Video, Text, and Speech-Driven Realistic 3-D Virtual Head for Human-Machine Interface. IEEE Trans on Cybernetics, 2014, 45(5): 991-1002.
[20] BAO H Q, YANG L L. The Motion Characteristics of Articulators for Mandarin Chinese (X-Ray Video). Beijing, China: Beijing Language and Culture University Press, 1985. (in Chinese)