Deep learning algorithms have recently been introduced and widely adopted in pattern recognition research within computer science. However, these algorithms have not yet been properly utilized in connectionist modeling of language processing, a field that has long applied pattern recognition algorithms from a computational perspective. In this study, we built a model using the deep belief network, a type of deep learning algorithm, and trained it on the relationship between words and their meanings. After training, the model performed a lexical decision task adapted for models, and we statistically tested whether it produced results similar to those of behavioral experiments, focusing on the frequency effect. The model reproduced the frequency effect observed in behavioral experiments, demonstrating that it is capable of normal language processing. These results suggest that a model built with a deep learning algorithm can serve as a connectionist model and, by extension, simulate human language processing. We also discuss how the deep belief network algorithm can be applied to connectionist modeling.
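To make the modeling approach concrete, the following is a minimal sketch of a deep belief network: restricted Boltzmann machines pretrained greedily, layer by layer, with one-step contrastive divergence (CD-1), in the spirit of Hinton, Osindero, and Teh (2006). This is not the authors' actual model; the layer sizes, learning rate, and random binary "orthographic" training vectors are illustrative assumptions, not the configuration used in the study.

```python
# Minimal DBN sketch: stacked RBMs with greedy layer-wise CD-1 pretraining.
# All hyperparameters and data below are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = rng.normal(0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible-unit biases
        self.b_h = np.zeros(n_hidden)    # hidden-unit biases
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0):
        # Positive phase: hidden activations driven by the data.
        p_h0 = self.hidden_probs(v0)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        # Negative phase: one Gibbs step (reconstruct, then re-infer).
        p_v1 = self.visible_probs(h0)
        p_h1 = self.hidden_probs(p_v1)
        # CD-1 updates: data statistics minus reconstruction statistics.
        self.W += self.lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(v0)
        self.b_v += self.lr * (v0 - p_v1).mean(axis=0)
        self.b_h += self.lr * (p_h0 - p_h1).mean(axis=0)

def pretrain_dbn(data, layer_sizes, epochs=10):
    """Greedy layer-wise pretraining: train each RBM on the hidden
    activations of the layer below it, then stack."""
    rbms, x = [], data
    for n_vis, n_hid in zip(layer_sizes[:-1], layer_sizes[1:]):
        rbm = RBM(n_vis, n_hid)
        for _ in range(epochs):
            rbm.cd1_step(x)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)  # feed activations upward as the next "data"
    return rbms

# Toy usage: 100 sparse binary input vectors through two hidden layers.
words = (rng.random((100, 64)) < 0.2).astype(float)
dbn = pretrain_dbn(words, layer_sizes=[64, 128, 32])
```

After such pretraining, the top-layer representations could, under the paper's general approach, be mapped to semantic targets and probed with a model analogue of the lexical decision task.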