As machine learning research has advanced, both machine translation and image analysis techniques such as optical character recognition (OCR) have made remarkable progress. However, image translation, which combines the two, has advanced more slowly than either field on its own. In this paper, we develop an image translator that combines existing OCR and translation technologies and verify its effectiveness. Before development, we identify which functions the system requires and which methods are available for implementing each of them, and then test the performance of each. With the application developed in this work, users can access translation more conveniently, and the work may help extend translation beyond the restricted setting of image translation so that the same convenience is available in any environment.