Korean Journal of Psychology: General

  • KOREAN
  • P-ISSN1229-067X
  • E-ISSN2734-1127
  • KCI

Studying Psychology using Big Data

Korean Journal of Psychology: General, (P)1229-067X; (E)2734-1127
2019, v.38 no.4, pp.519-548
https://doi.org/10.22257/kjp.2019.12.38.4.519

Abstract

The development of new technologies such as big data, machine learning, and artificial intelligence is changing human behavior and thought. Increased use of the internet makes it possible to observe human activities that were not observable before, and huge amounts of data about these activities are being stored online. Analyzing this information will help extend the scope of our understanding of human behavior and psychology. The present paper seeks ways of applying these new technologies to psychological studies, focusing on what big data are and how they can be used in psychological research. First, the paper reviews the characteristics of big data and their role in psychological research. In this context, it discusses the problems of the data-driven analytic approach that big data analysis adopts and the possibility of applying such an approach to psychological research. Second, it reviews the data analytic techniques used in big data analyses. These techniques should be able to handle large, unorganized, and unstructured data such as pictures, video clips, and text. Specifically, the paper reviews the basic principles of topic modeling, ridge and lasso regression, support vector machines, neural networks, and deep learning, along with their application to psychological data. Third, it discusses the limitations of using big data in psychological research. Finally, it proposes ways of applying big data technology to psychological research.

Keywords
Big Data, Artificial Intelligence, Machine Learning, Topic Modeling, Deep Learning, Data-driven Analysis, Model-driven Analysis

