Title:An invariant structure approach for media representation and recognition
Speaker:Dr. Yu QIAO  (SIAT, CAS)
Time:10:30AM-12:00PM, 10 Aug. 2012
Address:Room 307, Building No. 5, School of Remote Sensing and Information Engineering, Wuhan University
This talk is divided into two parts. In the first part, I will explain our recent work on the structural representation of media, using speech as an example. One of the major challenges in speech engineering is dealing with the non-linguistic variations contained in speech signals. These variations are caused by differences in speakers, communication channels, environmental noise, etc. Modern speech approaches rely mainly on statistical methods (such as GMM and HMM) to model the distributions of acoustic features. These methods typically require a large amount of training data. It is well known that the performance of speech recognizers drops significantly when training and testing conditions are mismatched. We proposed an invariant structural representation of speech that aims to remove the non-linguistic factors from speech signals. Unlike classical speech models, the structural representation uses globally contrastive features to model the global and dynamic aspects of speech and discards the local and static features. It can be proved that these contrastive features (f-divergences) are invariant to any invertible transformation and are thus robust to non-linguistic variations. Experimental results on connected Japanese vowel utterances show that the structural approach achieves better recognition rates than HMM. In the second part, I will review several ongoing projects in the Multimedia Laboratory, Shenzhen Institutes of Advanced Technology, including image retrieval, activity classification, 3D reconstruction, and face recognition.
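The invariance property mentioned in the abstract can be checked numerically in a simple special case. The sketch below is only an illustration, not the speaker's method: it assumes each sound is modeled as a 1-D Gaussian, uses the KL divergence as one example of an f-divergence, and models a speaker or channel change as an invertible affine map x -> a*x + b. The function names (kl_gauss, affine, structure) and all numeric values are hypothetical. The talk's claim covers any invertible transformation; this sketch verifies only the affine Gaussian case.

```python
import math

def kl_gauss(m1, s1, m2, s2):
    """KL divergence KL(N(m1, s1^2) || N(m2, s2^2)) for 1-D Gaussians."""
    return math.log(s2 / s1) + (s1**2 + (m1 - m2)**2) / (2 * s2**2) - 0.5

def affine(dist, a, b):
    """Push a Gaussian (mean, std) through x -> a*x + b, with a != 0."""
    m, s = dist
    return (a * m + b, abs(a) * s)

def structure(dists):
    """Matrix of pairwise divergences between all distributions."""
    return [[kl_gauss(*p, *q) for q in dists] for p in dists]

# Three hypothetical sound distributions (mean, std); values are illustrative only.
sounds = [(0.0, 1.0), (2.0, 0.5), (-1.0, 1.5)]
original = structure(sounds)

# A hypothetical non-linguistic change, modeled as an invertible affine map.
warped = structure([affine(d, a=-3.0, b=7.0) for d in sounds])

# Largest entry-wise difference between the two divergence matrices.
max_diff = max(
    abs(x - y)
    for row1, row2 in zip(original, warped)
    for x, y in zip(row1, row2)
)
print(max_diff < 1e-9)
```

The individual means and standard deviations change under the map, but the matrix of pairwise divergences does not, which is the sense in which a representation built only from such contrasts is insensitive to the transformation.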
Yu Qiao received his Ph.D. from the University of Electro-Communications, Japan, in 2006. He was a JSPS fellow and then a project assistant professor with the University of Tokyo from 2007 to 2010. He is now a professor and a CAS BaiRen scholar with the Shenzhen Institutes of Advanced Technology, the Chinese Academy of Sciences. His research interests include pattern recognition, computer vision, multimedia, image processing, and machine learning. He has published more than 100 papers in these fields. He received the Lu Jiaxi young researcher award from the Chinese Academy of Sciences.