Improving Phoneme to Viseme Mapping for Indonesian Language
Anung Rachman(1*), Risanuri Hidayat(2), Hanung Adi Nugroho(3)
(1) Universitas Gadjah Mada; Institut Seni Indonesia (ISI) Surakarta
(2) Universitas Gadjah Mada
(3) Universitas Gadjah Mada
(*) Corresponding Author
DOI: https://doi.org/10.22146/ijitee.47577
Copyright (c) 2020 IJITEE (International Journal of Information Technology and Electrical Engineering)
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
ISSN : 2550-0554 (online)