Visual-audio feature extraction of lip movements based on SVD (cited by: 3)

Abstract: Extracting visual speech features from lip movements is a key technique for lip-motion representation in audio- and video-driven facial animation and for lipreading research. The method first enhances lip color in the color video image, applies threshold segmentation to the enhanced grayscale image to obtain the lip bounding box, and makes a preliminary classification of the lip shape according to the visual features of pronunciation. It then normalizes the scale and gray level of the lip images and extracts singular value (SVD) features from the preprocessed images. Finally, template matching based on Euclidean distance is used to test how much visual speech information the singular value features carry. The results show that this low-dimensional feature contains a large amount of lip-movement visual speech information and can be used for lip-shape recognition of a single speaker in natural conditions.
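The last two stages described in the abstract (normalization, singular value features, Euclidean template matching) can be illustrated with a minimal sketch. It is not the authors' implementation: the 32x32 patch size, the number of retained singular values (k = 8), nearest-neighbour resampling, min-max gray scaling, and all function names are assumptions chosen for illustration, and lip color enhancement and threshold segmentation are assumed to have already produced the cropped grayscale lip region.

```python
import numpy as np

def resize_nearest(img, size):
    """Scale normalization (assumed): nearest-neighbour resample to size x size."""
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[np.ix_(rows, cols)]

def normalize_gray(img):
    """Gray-level normalization (assumed): min-max rescale to [0, 1]."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi > lo:
        return (img - lo) / (hi - lo)
    return np.zeros_like(img)

def svd_feature(lip_region, size=32, k=8):
    """Return the k largest singular values of the normalized lip patch."""
    patch = normalize_gray(resize_nearest(lip_region, size))
    # compute_uv=False returns only the singular values, in descending order
    singular_values = np.linalg.svd(patch, compute_uv=False)
    return singular_values[:k]

def match_template(feature, templates):
    """Return the label of the template nearest to feature (Euclidean distance).

    templates is a dict mapping a lip-shape label to its feature vector.
    """
    return min(templates, key=lambda label: np.linalg.norm(feature - templates[label]))

if __name__ == "__main__":
    # Synthetic stand-ins for two cropped lip regions (hypothetical data)
    rng = np.random.default_rng(0)
    shape_a = rng.random((40, 60))
    shape_b = np.outer(np.linspace(0, 1, 40), np.linspace(0, 1, 60))
    templates = {"a": svd_feature(shape_a), "b": svd_feature(shape_b)}
    print(match_template(svd_feature(shape_a), templates))
```

Keeping only the leading singular values gives a short feature vector per frame, which is the sense in which the feature is low-dimensional; how many values to keep and how templates are built per lip-shape class would need to be taken from the paper itself.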

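Authors: Zhang Jianming (张建明), Tao Hong (陶宏), Wang Liangmin (王良民), Zhan Yongzhao (詹永照), Song Shunlin (宋顺林)

Affiliation: School of Computer Science and Communication Engineering, Jiangsu University

Source: Journal of Jiangsu University: Natural Science Edition (indexed in EI and CAS), 2004, No. 5, pp. 426-429 (4 pages)

Funding: National Natural Science Foundation of China (60273040); Natural Science Foundation of Jiangsu Higher Education Institutions (02KJB520003)

Keywords: lip movements; feature extraction; SVD; lipreading

Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]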
