Audio-Visual CNN using Transfer Learning for TV Commercial Break Detection

Muhammad Zha'farudin Pudya Wardana; Moh. Edi Wibowo

doi:10.22146/ijccs.76058

Muhammad Zha'farudin Pudya Wardana (1*), Moh. Edi Wibowo (2)

(1) Master Program in Computer Science, FMIPA UGM, Yogyakarta
(2) Department of Computer Science and Electronics, FMIPA UGM, Yogyakarta
(*) Corresponding Author

Abstract

Detecting TV commercials is challenging because programs and TV channels vary widely. Deep learning methods have shown good results on this problem, but they need long training runs with many epochs to reach high accuracy.

This research uses transfer learning to reduce training time, limiting training to 20 epochs. From the video data, audio features are extracted as Mel-spectrograms, and visual features are taken from a video frame. The dataset was gathered by recording programs from various TV channels in Indonesia. Pre-trained CNN models (MobileNetV2, InceptionV3, and DenseNet169) are re-trained and used to detect commercials at the shot level. Post-processing then groups the shots into commercial and non-commercial segments.
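As a rough illustration of the Mel-spectrogram representation mentioned above, the sketch below shows how frequency bands can be spaced evenly on the mel scale. It assumes the commonly used HTK-style conversion formula; the function names, the number of bands, and the frequency range are hypothetical choices for illustration, since the abstract does not state the parameters actually used.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels, f_min, f_max):
    """Center frequencies (Hz) of n_mels bands spaced evenly on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    # n_mels + 2 evenly spaced mel points; the interior points are band centers.
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_mels)]


# Hypothetical settings: 8 bands between 0 Hz and 8 kHz.
centers = mel_band_centers(n_mels=8, f_min=0.0, f_max=8000.0)
for c in centers:
    print(round(c, 1))
```

The band centers are close together at low frequencies and spread out at high frequencies, which is why a Mel-spectrogram gives finer resolution where speech and music carry most of their detail.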

The best result is obtained by the Audio-Visual CNN using transfer learning, which reaches an accuracy of 93.26% with only 20 training epochs. This is faster and more accurate than the CNN model without transfer learning, which reaches 88.17% after 77 training epochs. Adding post-processing increases the accuracy of the Audio-Visual CNN using transfer learning to 96.42%.
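The abstract does not describe how the post-processing step works. The sketch below is one plausible approach, not the authors' method: it assumes shot-level predictions are smoothed with a majority vote over neighboring shots, and consecutive shots with the same label are then merged into segments. The function names, the window size, and the sample labels are all hypothetical.

```python
from itertools import groupby


def smooth_labels(labels, window=3):
    """Majority vote over a centered window of shot labels (1 = commercial)."""
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighborhood = labels[max(0, i - half): i + half + 1]
        # Ties resolve to non-commercial (0).
        smoothed.append(1 if sum(neighborhood) * 2 > len(neighborhood) else 0)
    return smoothed


def group_segments(labels):
    """Merge consecutive equal labels into (label, first_shot, last_shot) runs."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, start + length - 1))
        start += length
    return segments


# Hypothetical shot-level predictions with a few isolated errors.
shot_labels = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0]
segments = group_segments(smooth_labels(shot_labels))
print(segments)
```

In this toy input, the isolated commercial prediction at shot 2 and the isolated non-commercial prediction at shot 7 are both overruled by their neighbors, leaving one commercial segment covering shots 5 to 9. Correcting isolated shot-level errors in this way is one way a segment-level step could raise accuracy.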

Keywords

Commercial, TV, CNN, Transfer Learning, InceptionV3, MobileNetV2, DenseNet169, Video

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Copyright of: IJCCS (Indonesian Journal of Computing and Cybernetics Systems), ISSN 1978-1520 (print); ISSN 2460-7258 (online), is a scientific journal of the results of Computing and Cybernetics Systems.
A publication of IndoCEISS. Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281. Fax: +62274 555133, email: ijccs.mipa@ugm.ac.id | http://jurnal.ugm.ac.id/ijccs
