ISSN : 1229-067X
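The development of new technologies such as big data, machine learning, and AI changes how people think and behave, and makes it possible to observe a variety of human activities that were previously hard to access. As people use the internet extensively, individual behavior is also being stored on the internet. Because these data are very broad and diverse, analyzing them appropriately should widen the scope of our understanding of human psychology. This paper sought ways to use these newly developed technologies in psychological research. In particular, it discussed the characteristics of big data, a new kind of data made possible by technological advances, and how big data can be used in psychology. First, the paper examined the characteristics of big data and the role big data can play in psychology. It discussed the problems of the data-driven analytic approach of big data, which differs from the model-driven approach of psychology, and how such analyses can be applied to psychological research. Second, it examined data analysis methodology. In conventional psychological research, data are collected under elaborate research designs, so analysis is relatively less important, but in big data analysis the role of data analysis becomes very important. It must be possible to process vast, unstructured data and to analyze non-numeric data such as language data. In particular, the paper introduced the principles of topic modeling, ridge regression and lasso regression, support vector machines, neural networks, and deep learning, and discussed how they are applied in psychological research. Third, it examined the limitations of applying big data analysis in psychology, and finally it proposed methods for applying big data to psychological research.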
The development of new technologies such as big data, machine learning, and artificial intelligence changes human behavior and thought. Increased use of the internet makes it possible to observe various human activities that were not observable before. Huge amounts of data about various types of human activities are being stored on the internet. Analyzing this information will help extend the scope of our understanding of human behavior and psychology. The present paper attempts to find ways of applying new technology to psychological studies. Specifically, it focuses on what big data are like and how they can be used for psychological research. First, the paper reviews the characteristics of big data and their role in psychological research. In this context, it discusses the problems of the data-driven analytic scheme that big data analysis adopts and the possibility of applying such methods to psychological research. Second, it reviews the data analytic techniques used in big data analyses. These techniques should be able to deal with large, unorganized data sets and with unstructured data such as pictures, video clips, and text. Specifically, the paper reviews the basic principles of topic modeling, ridge and lasso regression, support vector machines, neural networks, and deep learning, and their application to psychological data. Third, it discusses the limitations of using big data in psychological research. Finally, it proposes ways of applying big data technology to psychological research.