
An Implementation of a System for Video Translation on Window Platform Using OCR

Journal of The Korea Internet of Things Society, (P)2799-4791;
2019, v.5 no.2, pp.15-20
https://doi.org/10.20465/kiots.2019.5.2.015


Abstract

As machine learning research has advanced, translation and image-analysis fields such as optical character recognition (OCR) have made great progress. However, video translation, which combines the two, has developed more slowly. In this paper, we develop an image translator that combines existing OCR technology and translation technology and verify its effectiveness. Before development, we specify which functions are needed to implement this system and how to implement them, and we then test its performance. With the application developed through this paper, users can access translation more conveniently, and the same convenience can be provided in any environment.
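The abstract describes a pipeline that captures a region of the screen, extracts its text with OCR, and hands the result to a translation engine. Below is a minimal sketch of that capture-OCR-translate loop, assuming Pillow's ImageGrab for Windows screen capture and pytesseract as the OCR front end; the translate() function is a hypothetical placeholder for whatever machine-translation backend is used, since the abstract does not name one.

```python
from PIL import ImageGrab   # screen capture; bbox grabbing works on Windows
import pytesseract          # Python wrapper for the Tesseract OCR engine


def capture_region(left, top, right, bottom):
    """Grab one frame of a rectangular desktop region (e.g. a subtitle area)."""
    return ImageGrab.grab(bbox=(left, top, right, bottom))


def recognize_text(image, lang="eng"):
    """Run OCR on the captured frame and return the recognized text."""
    return pytesseract.image_to_string(image, lang=lang).strip()


def translate(text, target="ko"):
    """Hypothetical placeholder: wire this to any machine-translation
    backend. The text is returned unchanged here so the sketch runs."""
    return text


if __name__ == "__main__":
    frame = capture_region(0, 500, 800, 600)  # region where subtitles appear
    source_text = recognize_text(frame)
    if source_text:
        print(translate(source_text, target="ko"))
```

Running this in a loop over successive frames, and overlaying the translated text on the video window, yields the kind of on-screen video translator the paper implements.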

Keywords
Machine learning, Optical Character Recognition (OCR), Image translator, Machine translation, Video translation

