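정보관리학회지 [Journal of the Korean Society for Information Management], 한국정보관리학회 [Korean Society for Information Management]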

11

김희영(연세대학교 일반대학원 문헌정보학과) ; 박지홍(연세대학교 문헌정보학과) 2022, Vol.39, No.1, pp.1-15 https://doi.org/10.3743/KOSIM.2022.39.1.001

초록

본 연구는 약물 연구 분야에 속하는 특허 사이에 나타나는 지식의 흐름을 살펴보고 이들 간의 영향력을 파악해보기 위해 특허데이터에서 나타나는 인용 관계를 분석하였다. 특허데이터의 수집은 Google Patents에서 진행하였다. 약물 연구와 관련된 특허 문서를 검색하여 상위 25개의 출원인을 선정하였고, 이를 바탕으로 출원인 사이에서의 인용 관계를 알아보고 각 출원인의 각 문서에 대한 피인용빈도와 순위를 활용하여 h-지수와 h-지수의 파생지표들의 값을 계산하여 비교하였다. 분석 결과를 종합하면, ‘Pfizer, MIT, Abbott’ 등의 출원인이 약물 연구 분야에서 영향력이 높은 출원인으로 드러났다. 5개의 계량서지학적 지표 중에서 g-지수와 hS-지수가 서로 유사한 결과를 보여주었고, 총인용빈도, 최대인용빈도, CPP의 순위를 가장 잘 반영하는 지표로 나타났다. 또한, 총인용빈도, CPP, 최대인용빈도 순으로 5개의 계량서지학적 지표와의 상관관계가 높았다. 한편, 기존의 특허 출원인의 기술적 영향력을 나타내는 것으로 알려진 지표인 CPP만으로는 정확한 비교가 어려운 경우도 나타났다.

Abstract

This study analyzes citation relationships in patent data to understand knowledge flows and influence among patents in the field of pharmaceutical research. Patent data were collected from Google Patents. We searched for patent documents related to pharmaceutical research and selected the top 25 assignees. We then identified the citation relationships among these assignees and, using the citation count and rank of each assignee’s documents, calculated and compared the h-index and its derived indicators. The results show that assignees such as ‘Pfizer, MIT, and Abbott’ have a high impact in pharmaceutical research. Among the five bibliometric indicators, the g-index and hS-index produced similar results and best reflected the rankings by Total Citation Frequency, Maximum Citation Frequency, and Cites per Patent (CPP). In addition, Total Citation Frequency, CPP, and Maximum Citation Frequency, in that order, correlated most strongly with the five indicators. In some cases, CPP alone, which has conventionally been used to indicate the technological influence of patent assignees, did not allow an accurate comparison.
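
As an illustration of the indicators named in this abstract, the following minimal sketch computes the h-index, the g-index, and CPP from one assignee's per-patent citation counts. It uses the standard textbook definitions of these indicators, not necessarily the exact procedure of the study. The function names and the sample counts are hypothetical, and the hS-index is omitted because the abstract does not define it.

```python
def h_index(citations):
    """Largest h such that h documents each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g documents together have at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


def cites_per_patent(citations):
    """CPP: total citations divided by the number of patents."""
    return sum(citations) / len(citations) if citations else 0.0


# Hypothetical citation counts for the patents of a single assignee.
sample = [10, 8, 5, 4, 3]
print(h_index(sample), g_index(sample), cites_per_patent(sample))
```

Because the g-index accumulates citations over the ranked list, it gives more weight to a few highly cited patents than the h-index does, which is one plausible reason it tracks total and maximum citation counts closely.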

12

공공도서관 상주작가 문화프로그램 개선 방안에 관한 연구 [A Study on Improving Cultural Programs by Resident Writers in Public Libraries]

문수빈(전남대학교 대학원 문헌정보학과) ; 장우권(전남대학교 문헌정보학과) 2022, Vol.39, No.3, pp.23-50 https://doi.org/10.3743/KOSIM.2022.39.3.023

초록

본 연구는 도서관 상주작가 지원사업 문화프로그램 운영현황을 파악하고 프로그램에 대한 이용자의 인식과 선호, 사서의 인식을 조사하여 도서관 상주작가 문화프로그램의 운영개선 및 활성화 방안을 제시하는데 있다. 이를 위한 연구방법은 이론적 연구와 상주작가 문화프로그램 운영 현황조사, 도서관 이용자의 인식 및 선호에 대한 설문조사, 사서의 인식에 대한 인터뷰조사를 통한 실증적 연구를 병행하였다. 현황조사․설문조사․인터뷰조사는 2021년 5월부터 11월까지의 상주작가 지원사업 문화프로그램을 실시하고 있는 도서관을 대상으로 하였다. 문화프로그램의 어려움은 상주작가의 역량과 코로나로 인한 휴․재개관의 반복과 비대면 수업으로 나타났다. 프로그램 개선방안으로 상주작가와 다른 분야의 문인 초청과 문학프로그램 운영, 온라인 홍보에서 오프라인 홍보의 강화, 문화프로그램 운영의 사전교육, 사업 시작 시기 조정, 작가의 특성을 위한 교육, 이용자 연령을 고려한 진행방식, 이용자 선호 프로그램, 프로그램 운영시간, 정기문화프로그램 운영, 지역특성과 프로그램의 다양성 등을 고려해야 할 것이다.

Abstract

This study aims to propose ways to improve and revitalize the cultural programs of the library resident writer support project by identifying the current status of program operation, examining users’ perceptions and preferences, and investigating librarians’ perceptions of the programs. The research methods combined a literature review with a survey of program operation status, a questionnaire survey of users’ perceptions and preferences, and interviews with librarians. Data were collected from libraries running cultural programs under the resident writer support project between May and November 2021. The findings showed that operating the programs was difficult because of limitations in resident writers’ competence, repeated library closures and reopenings during the COVID-19 pandemic, and non-face-to-face classes. Suggestions for improvement include inviting writers from fields other than the resident writer’s, running literature programs, strengthening offline as well as online promotion, providing prior training on program operation, adjusting the project start date, offering training suited to the writers’ characteristics, and designing programs that consider users’ ages and preferences, program hours, regular cultural programs, regional characteristics, and program diversity.

13

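문화재 중심 기록물 서비스 개선을 위한 온톨로지 설계: 황룡사 관련 기록물 중심으로 [Ontology Design for Improving Cultural-Heritage-Centered Records Services: Focusing on Records Related to Hwangnyongsa]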

김시정(대구대학교 기록물관리 전문요원) ; 최상희(대구가톨릭대학교) 2022, Vol.39, No.4, pp.241-268 https://doi.org/10.3743/KOSIM.2022.39.4.241

초록

문화재 관련 기록물은 문화재에 대한 구체적인 증거이며 보존에 있어 중요한 근거자료 역할을 하므로 문화재만큼이나 중요한 의미가 있다. 특히 국가적이나 사회적으로 중요한 가치를 가진 특정 문화재인 경우 해당 문화재가 하나의 주제로 다양한 연구가 진행되고 문화재를 주제로 한 프로그램이 기획되는 경우가 많다. 그러나 유명한 문화재를 중심으로 생산되는 기록물은 긴 시간 동안 발생하면서 분산되어 관리되어 왔고 다양한 형태로 나타나고 있어 해당 기록물의 범위와 소재, 내용을 파악하기 어렵다. 이와 같은 문제들의 해결 방안으로, 이 연구는 황룡사와 같이 사회적, 역사적 가치를 가지는 주요 문화재를 중심으로 발생하는 관련 기록물을 11개 공공기관 및 웹서비스에서 수집하여 기록물의 유형, 기록물과 관련된 활동, 메타데이터 분석을 통해 전체 기록물의 범위와 관계를 파악할 수 있는 온톨로지 설계를 하여 특정 문화재 중심으로 기록물을 이해할 수 있도록 하고자 하였다.

Abstract

Records related to a cultural heritage are concrete evidence of that heritage and serve as an important basis for its preservation, so they are as significant as the heritage itself. For cultural heritage of particular national or social value, the heritage often becomes a research theme in its own right, and programs built around it are frequently planned. However, records produced around a well-known cultural heritage accumulate over a long period, are managed in a distributed manner, and appear in various forms, which makes it difficult to grasp their scope, location, and content. To address these problems, this study collected records related to Hwangnyongsa, a major cultural heritage with social and historical value, from 11 public institutions and web services and analyzed the record types, the activities related to the records, and their metadata. Based on this analysis, the study designed an ontology that captures the scope and relationships of the entire body of records so that the records can be understood with a focus on a specific cultural heritage.

14

1인 사서교사의 인적 네트워크 특성에 관한 탐색적 연구 [An Exploratory Study on the Characteristics of the Personal Networks of Solo Teacher Librarians]

강재연(연세대학교 대학원 문헌정보학과 박사과정) ; 박지홍(연세대학교 문헌정보학과) 2022, Vol.39, No.4, pp.215-239 https://doi.org/10.3743/KOSIM.2022.39.4.215

초록

인적 네트워크는 지식공유의 중요한 창구로서 암묵적 지식을 비롯한 다양한 정보문제에 주요한 해결 수단이 될 수 있다. 특히 조직에서 1인으로 구성된 학교도서관의 사서교사(또는 사서)의 경우, 업무를 위한 인적 네트워크는 조직 내부 구성원보다는 동일 직무를 담당하는 조직 외부인과의 관계가 효과적일 수 있다. 이에 본 연구는 1인 사서교사의 업무 관련 인적 네트워크 특성을 탐색하고 이러한 네트워크 특성과 직무 만족 및 역할 모호성 해소와의 관계성을 파악하는 것을 목적으로 한다. 서울시 지역 내 사서교사 네트워크 협의회 1곳을 대상으로 설문조사를 실시하였으며, 수집된 설문데이터는 소셜네트워크분석(SNA: social network analysis) 방법을 활용하여 분석하였다. 연구결과, 사서교사의 인적 네트워크는 네트워크 협의회 내 대표 교 이력의 사서교사를 중심으로 업무 도움 관계가 활발하였으며, 저 경력의 초임 교사에게는 업무 도움을 요청할 수 있는 대상이 적은 특징이 확인되었다. 또한 개인의 인적 네트워크 특성은 직무 만족과 역할 모호성 해소에 영향을 주지 않는 것으로 나타났다. 연구결과를 토대로 1인 사서교사의 업무 관련 정보문제 해결을 위한 사서교사 간 협력 및 네트워크의 활성화 방안의 필요성을 제안하였다.

Abstract

Human networks are an essential channel of knowledge sharing and can be an important means of solving various information problems, including those involving tacit knowledge. In particular, for a teacher librarian (or librarian) who is the only such staff member in a school library, work-related networks with people outside the organization who perform the same duties can be more effective than networks with members inside the organization. This study therefore explores the characteristics of the work-related personal networks of solo teacher librarians and examines the relationship between these network characteristics and job satisfaction and role ambiguity resolution. A survey was conducted with one teacher librarian network council in Seoul, and the collected data were analyzed using social network analysis (SNA). The results showed that work-help relationships were most active around experienced teacher librarians, while early-career teacher librarians had few people to turn to for help. In addition, individual network characteristics did not affect job satisfaction or the resolution of role ambiguity. Based on these results, the study proposes the need to promote collaboration and networking among teacher librarians to solve work-related information problems in single-person workplaces.

15

문헌정보학 분야의 리터러시 연구 동향 분석 [An Analysis of Literacy Research Trends in Library and Information Science]

장수현(중앙대학교 문헌정보학과) ; 남영준(중앙대학교) 2022, Vol.39, No.3, pp.263-292 https://doi.org/10.3743/KOSIM.2022.39.3.263

초록

본 연구는 문헌정보학 현장인 도서관에서 제공되는 서비스인 이용자 교육의 관련 개념인 리터러시가 각종 문헌정보학 연구 분야에서 어떠한 연구 주제를 다루는지 확인하는 것을 목적으로 한다. 이를 위해 WoS와 KCI 데이터베이스에서 문헌정보학 분야 리터러시 관련 논문을 수집하여 키워드 분석 및 토픽 모델링 분석 기법을 상호보완적으로 사용해 분석하였다. 분석 결과, WoS와 KCI의 문헌정보학 분야 리터러시 관련 연구 동향은 저자 키워드, 주요 주제 등에서 차이가 있는 것으로 나타났으며, 토픽 모델링을 통해 KCI의 리터러시 관련 연구를 3개의 토픽으로 분류하였다. 또한, 연구에서 확인한 국내 문헌정보학 분야 리터러시 연구 동향은 전체 리터러시 관련 연구 동향과 연구량 급증 시기, 핵심 다빈출 키워드 차이가 있음을 분석하였다. 특히, 전체 분야 리터러시 연구는 ‘리터러시’, ‘교육’, ‘미디어’, ‘디지털’ 등의 단어가 다수 도출되었지만 문헌정보학 분야의 리터러시 연구는 ‘정보활용능력’, ‘학교도서관’ 등의 키워드가 다수 등장하였다. 이를 바탕으로 향후 국내에서도 정보가 급증하는 오늘날의 정보화 환경에 맞춰 정보에 대한 평가적인 안목을 기를 수 있는 능력에 관한 연구가 필요하다는 결론을 도출하였다.

Abstract

The purpose of this study is to identify the research topics addressed under the concept of literacy, which is closely related to library user education, across research areas of Library and Information Science. Literacy-related articles in the field of Library and Information Science were collected from the WoS and KCI databases and analyzed using keyword analysis and topic modeling in a complementary manner. The findings showed that literacy-related research trends in the two databases differed in author keywords and main topics, and topic modeling classified the literacy-related research in KCI into three topics. The analysis also showed that domestic literacy research in Library and Information Science differs from literacy research across all fields in the timing of the surge in research volume and in the most frequent core keywords. In particular, words such as ‘literacy’, ‘education’, ‘media’, and ‘digital’ appeared frequently in literacy research across all fields, whereas keywords such as ‘information utilization ability’ and ‘school library’ appeared frequently in literacy research in Library and Information Science. Based on these results, the study concluded that further research is needed in Korea on the ability to evaluate information critically, in line with today’s information environment, in which the volume of information is rapidly increasing.

16

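사회과학, 자연과학기술 및 융복합 분야의 약물중독 연구에 대한 계량서지학적 비교 분석 연구 [A Comparative Bibliometric Analysis of Drug Addiction Research in the Social Sciences, Natural Science and Technology, and Multidisciplinary Fields]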

남동인(연세대학교 문헌정보학과 석사과정) ; 박지홍(연세대학교 문헌정보학과) 2022, Vol.39, No.2, pp.203-232 https://doi.org/10.3743/KOSIM.2022.39.2.203

초록

약물중독 혹은 약물사용장애(substance use disorder)는 세계적으로 그 위험성과 유행성이 지속적으로 관측 되고 있다. 이러한 배경에서 수많은 관련 연구들이 진행이 되어왔지만, 이와 관련한 계량서지학적 분석은 미진한 상황이다. 특히, 약물중독과 관련된 다양한 특성들을 종합적으로 반영한 거시적 차원의 계량서지학적 접근법을 활용한 연구는 찾아보기가 힘든 상황이다. 이 연구에서는 이러한 약물중독의 다차원적 특성을 반영하기 위해 사회과학, 자연과학기술, 융복합 분야에서의 약물중독 연구 동향을 비교 분석하였다. 이 연구는 2002년부터 2021년까지의 약물중독 연구 논문을 Web of Science로부터 검색 후 수집하였으며, SCI(E) 및 SSCI 정보를 토대로 학문 분야를 분류하였다. 저자 키워드 동시출현 분석을 수행한 결과, 자연과학기술은 신경정신약물과 보상시스템에 관한 연구가 주를 이루었고, 사회과학 분야에서는 이보다는 인구학적 특성이 반영된 약물중독 연구가 수행되어 왔음을 알 수 있었고, 융복합 분야에서는 이러한 동향을 모두 아우르고 있는 것을 확인할 수 있었다. 저자 동시인용 분석도 수행을 하였는데, 이를 통해 자연과학기술 분야는 슈퍼 저자들이 관측된 반면, 사회과학 분야에서는 개인 저자뿐 아니라 기관 저자까지도 인용이 많이 되는 것으로 확인이 되었다.

Abstract

The risks and prevalence of drug addiction, or substance use disorder, continue to be observed worldwide. Numerous studies have addressed this issue, but bibliometric analysis of drug addiction research remains insufficient. In particular, it is difficult to find research that takes a macro-level bibliometric approach comprehensively reflecting the various characteristics of drug addiction. To reflect the multidimensional nature of drug addiction, this study compared research trends in drug addiction across social science, natural science and technology, and multidisciplinary fields. Drug addiction research articles published from 2002 to 2021 were retrieved from the Web of Science, and academic disciplines were classified based on SCI(E) and SSCI information. Author keyword co-occurrence analysis showed that natural science and technology research mainly studied psychoactive substances and the reward system in the brain, while social science research focused on drug addiction studies reflecting demographic characteristics; the multidisciplinary field covered both of these trends. Author co-citation analysis showed that there are superstars (i.e., authors who receive a very large number of citations) in natural science and technology, while in social science both individual authors and institutional authors were highly cited.

17

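북한이탈주민의 정보빈곤에 관한 연구: Chatman의 정보빈곤이론을 기반으로 [A Study on Information Poverty among North Korean Refugees: Based on Chatman’s Theory of Information Poverty]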

민수진(성균관대학교 문헌정보학과) ; 이용정(성균관대학교) 2022, Vol.39, No.3, pp.241-261 https://doi.org/10.3743/KOSIM.2022.39.3.241

초록

본 연구는 Chatman(1996)의 정보빈곤이론(Theory of Information Poverty)을 바탕으로 정보 빈곤이 북한이탈주민의 한국사회적응에 미치는 영향을 알아보고자 하였다. 연구를 위해 정보빈곤이론을 기반으로 정보빈곤의 개념을 은폐(Secrecy), 기만(Deception), 위험감수(Risk-taking), 상황적 관련성(Situational relevance)에 따른 정보 수용이라는 네 가지 변인으로 구성하였고, 선행연구 분석 결과를 바탕으로 한국사회적응을 사회적 적응과 심리적 적응으로 구분하였다. 또한 생명윤리위원회(IRB)의 승인을 거쳐 2021년 8월 4일부터 8월 30일까지 북한이탈주민 지원 단체 <우리온>을 통해 국내 입국 후 최소 1년이 경과한 민법상 성년인 만 19세 이상의 북한이탈주민을 대상으로 설문조사를 실시하였다. 수집된 100개의 유효한 데이터를 빈도 분석, 신뢰도 분석, 상관관계 분석, 다중회귀분석을 통해 분석한 결과, 정보빈곤은 북한이탈주민의 사회적 적응과 심리적 적응에 유의한 영향을 미치는 것으로 나타났다. 특히, “기만” 변수는 북한이탈주민의 사회적 적응과 심리적 적응에 유의한 부(-)의 영향을 미치는 것으로 나타났다. 본 연구는 북한이탈주민을 정보빈곤층으로 정의하고, 그들의 한국사회적응을 Chatman의 정보빈곤이론을 기반으로 설명하였다는 점에서 학문적 의의가 있다. 무엇보다도, 질적 연구를 수행한 선행연구들과 달리 변수의 조작화를 통해 양적 연구를 시도하였다는 점에서 의미가 있다.

Abstract

The present study investigates the effects of information poverty on North Korean refugees’ adaptation to South Korean society based on Chatman’s Theory of Information Poverty (1996). Based on the theory, information poverty was operationalized as four variables: secrecy, deception, risk-taking, and information acceptance according to situational relevance. Based on previous studies, adaptation to South Korean society was divided into social adaptation and psychological adaptation. After approval by the IRB, a survey was conducted from August 4 to August 30, 2021, through the North Korean refugee support organization <Urion>, with North Korean refugees aged 19 or older who had lived in South Korea for at least one year. The 100 valid responses were analyzed using frequency analysis, reliability analysis, correlation analysis, and multiple linear regression analysis. The findings indicated that information poverty had significant effects on North Korean refugees’ social and psychological adaptation. In particular, the “deception” variable had a significant negative effect on both social and psychological adaptation. The study has academic significance in that it defines North Korean refugees as information poor and explains their adaptation to South Korean society based on the Theory of Information Poverty. Above all, unlike previous studies that took qualitative approaches, it attempts a quantitative approach through operationalization of the variables.

18

지적구조 규명을 위한 키워드서지결합분석 기법에 관한 연구 [A Study on Keyword Bibliographic Coupling Analysis for Identifying Intellectual Structure]

이재윤(명지대학교 문헌정보학과) ; 정은경(이화여자대학교 문헌정보학과) 2022, Vol.39, No.1, pp.309-330 https://doi.org/10.3743/KOSIM.2022.39.1.309

초록

학문의 구조, 특성, 하위 분야 등을 계량적으로 규명하는 지적구조 분석 연구가 최근 급격히 증가하는 추세이다. 지적구조 분석 연구를 수행하기 위하여 전통적으로 사용되는 분석기법은 서지결합분석, 동시인용분석, 단어동시출현분석, 저자서지결합분석 등이다. 이 연구의 목적은 키워드서지결합분석(KBCA, Keyword Bibliographic Coupling Analysis)을 새로운 지적구조 분석 방식으로 제안하고자 한다. 키워드서지결합분석 기법은 저자서지결합분석의 변형으로 저자 대신에 키워드를 표지로 하여 키워드가 공유한 참고문헌의 수를 두 키워드의 주제적 결합 정도로 산정한다. 제안된 키워드서지결합분석 기법을 사용하여 Web of Science에서 검색된 ‘Open Data’ 분야의 1,366건의 논문집합을 대상으로 분석하였다. 1,366건의 논문집합에서 추출된 7회 이상 출현한 63종의 키워드를 오픈데이터 분야의 핵심 키워드로 선정하였다. 63종의 핵심 키워드를 대상으로 키워드서지결합분석 기법으로 제시된 지적구조는 열린정부와 오픈사이언스라는 주된 영역과 10개의 소주제로 규명되었다. 이에 반해 단어동시출현분석의 지적구조 네트워크는 전체 구성과 세부 영역 구조 규명에 있어 미진한 것으로 나타났다. 이러한 결과는 키워드서지결합분석이 키워드 간의 서지결합도를 사용하여 키워드 간의 관계를 풍부하게 측정하기 때문이라고 볼 수 있다.

Abstract

Intellectual structure analysis, which quantitatively identifies the structure, characteristics, and sub-domains of academic fields, has rapidly increased in recent years. Techniques traditionally used for intellectual structure analysis include bibliographic coupling analysis, co-citation analysis, word co-occurrence analysis, and author bibliographic coupling analysis. This study proposes a novel intellectual structure analysis method, Keyword Bibliographic Coupling Analysis (KBCA). KBCA is a variation of author bibliographic coupling analysis that uses keywords instead of authors as the unit of analysis: the number of references shared by two keywords is taken as the degree of topical coupling between them. The proposed KBCA technique was applied to a set of 1,366 articles in the field of ‘Open Data’ retrieved from the Web of Science. A total of 63 keywords that appeared seven or more times in the 1,366 articles were selected as core keywords of the open data field. The intellectual structure produced by KBCA for the 63 core keywords consisted of two main areas, open government and open science, and 10 sub-areas. In contrast, the intellectual structure network produced by word co-occurrence analysis was insufficient for identifying both the overall structure and the detailed domain structure. This result can be attributed to the fact that KBCA measures the relationships between keywords more richly by using the degree of bibliographic coupling between them.
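
The coupling measure described in this abstract can be sketched roughly as follows. This is a simplified reading, assuming that each keyword's reference set is the union of the references of all articles carrying that keyword and that coupling strength is the size of the intersection of two such sets; the study itself may count shared references differently. The function name `keyword_coupling` and the sample records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def keyword_coupling(papers):
    """Return {(keyword_a, keyword_b): number of shared references}."""
    # Pool the references of every paper that carries a given keyword.
    refs_by_keyword = defaultdict(set)
    for paper in papers:
        for keyword in paper["keywords"]:
            refs_by_keyword[keyword].update(paper["references"])

    # Coupling strength = size of the overlap between two keywords' reference sets.
    coupling = {}
    for a, b in combinations(sorted(refs_by_keyword), 2):
        shared = len(refs_by_keyword[a] & refs_by_keyword[b])
        if shared:
            coupling[(a, b)] = shared
    return coupling


# Hypothetical records: each paper lists its author keywords and cited references.
papers = [
    {"keywords": ["open data", "open government"], "references": ["R1", "R2", "R3"]},
    {"keywords": ["open science"], "references": ["R3", "R4"]},
    {"keywords": ["open data"], "references": ["R4", "R5"]},
]
print(keyword_coupling(papers))
```

In this toy data, ‘open government’ and ‘open science’ never appear in the same paper, so word co-occurrence analysis would find no link between them, yet they are still coupled through one shared reference.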

19

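BERTopic을 활용한 불면증 소셜 데이터 토픽 모델링 및 불면증 경향 문헌 딥러닝 자동분류 모델 구축 [Topic Modeling of Insomnia Social Data with BERTopic and Construction of a Deep Learning Model for Automatic Classification of Insomnia-Tendency Documents]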

고영수(연세대학교 문헌정보학과 석사과정) ; 이수빈(연세대학교 문헌정보학과 박사과정) ; 차민정(연세대학교 소셜오믹스 연구센터) ; 김성덕(연세대학교 문헌정보학과 석사과정) ; 이주희(연세대학교 문헌정보학과 석사과정) ; 한지영(연세대학교 문헌정보학과 석사과정) ; 송민(연세대학교 문헌정보학과) 2022, Vol.39, No.2, pp.111-129 https://doi.org/10.3743/KOSIM.2022.39.2.111

초록

불면증은 최근 5년 새 환자가 20% 이상 증가하고 있는 현대 사회의 만성적인 질병이다. 수면이 부족할 경우 나타나는 개인 및 사회적 문제가 심각하고 불면증의 유발 요인이 복합적으로 작용하고 있어서 진단 및 치료가 중요한 질환이다. 본 연구는 자유롭게 의견을 표출하는 소셜 미디어 ‘Reddit’의 불면증 커뮤니티인 ‘insomnia’를 대상으로 5,699개의 데이터를 수집하였고 이를 국제수면장애분류 ICSD-3 기준과 정신의학과 전문의의 자문을 받은 가이드라인을 바탕으로 불면증 경향 문헌과 비경향 문헌으로 태깅하여 불면증 말뭉치를 구축하였다. 구축된 불면증 말뭉치를 학습데이터로 하여 5개의 딥러닝 언어모델(BERT, RoBERTa, ALBERT, ELECTRA, XLNet)을 훈련시켰고 성능 평가 결과 RoBERTa가 정확도, 정밀도, 재현율, F1점수에서 가장 높은 성능을 보였다. 불면증 소셜 데이터를 심층적으로 분석하기 위해 기존에 많이 사용되었던 LDA의 약점을 보완하며 새롭게 등장한 BERTopic 방법을 사용하여 토픽 모델링을 진행하였다. 계층적 클러스터링 분석 결과 8개의 주제군(‘부정적 감정’, ‘조언 및 도움과 감사’, ‘불면증 관련 질병’, ‘수면제’, ‘운동 및 식습관’, ‘신체적 특징’, ‘활동적 특징’, ‘환경적 특징’)을 확인할 수 있었다. 이용자들은 불면증 커뮤니티에서 부정 감정을 표현하고 도움과 조언을 구하는 모습을 보였다. 또한, 불면증과 관련된 질병들을 언급하고 수면제 사용에 대한 담론을 나누며 운동 및 식습관에 관한 관심을 표현하고 있었다. 발견된 불면증 관련 특징으로는 호흡, 임신, 심장 등의 신체적 특징과 좀비, 수면 경련, 그로기상태 등의 활동적 특징, 햇빛, 담요, 온도, 낮잠 등의 환경적 특징이 확인되었다.

Abstract

Insomnia is a chronic disease of modern society, with the number of new patients increasing by more than 20% in the last five years. Because the individual and social problems caused by lack of sleep are serious and the triggers of insomnia are complex, diagnosis and treatment are important. This study collected 5,699 data items from ‘insomnia’, a community on ‘Reddit’, a social media platform where users freely express opinions. Based on the International Classification of Sleep Disorders (ICSD-3) criteria and guidelines developed in consultation with a psychiatrist, the data were tagged as insomnia-tendency or non-insomnia-tendency documents to construct an insomnia corpus. Five deep learning language models (BERT, RoBERTa, ALBERT, ELECTRA, XLNet) were trained on the constructed corpus. In the performance evaluation, RoBERTa showed the highest performance, with an accuracy of 81.33%. To analyze the insomnia social data in depth, topic modeling was performed using BERTopic, a recently introduced method that addresses weaknesses of the widely used LDA. Hierarchical clustering analysis identified eight topic groups (‘Negative emotions’, ‘Advice, help, and gratitude’, ‘Insomnia-related diseases’, ‘Sleeping pills’, ‘Exercise and eating habits’, ‘Physical characteristics’, ‘Activity characteristics’, ‘Environmental characteristics’). Users expressed negative emotions and sought help and advice in the Reddit insomnia community. They also mentioned diseases related to insomnia, discussed the use of sleeping pills, and expressed interest in exercise and eating habits. The insomnia-related characteristics identified were physical characteristics such as breathing, pregnancy, and the heart; activity characteristics such as zombies, hypnic jerks, and grogginess; and environmental characteristics such as sunlight, blankets, temperature, and naps.

20

딥러닝 기반 소셜미디어 한글 텍스트 우울 경향 분석 [Deep Learning-Based Analysis of Depression Tendency in Korean Social Media Text]

박서정(연세대학교 문헌정보학과) ; 이수빈(연세대학교 문헌정보학과) ; 김우정(연세대학교 의과대학 용인세브란스병원 정신건강의학교실) ; 송민(연세대학교 문헌정보학과) 2022, Vol.39, No.1, pp.91-117 https://doi.org/10.3743/KOSIM.2022.39.1.091

초록

국내를 비롯하여 전 세계적으로 우울증 환자 수가 매년 증가하는 추세이다. 그러나 대다수의 정신질환 환자들은 자신이 질병을 앓고 있다는 사실을 인식하지 못해서 적절한 치료가 이루어지지 않고 있다. 우울 증상이 방치되면 자살과 불안, 기타 심리적인 문제로 발전될 수 있기에 우울증의 조기 발견과 치료는 정신건강 증진에 있어 매우 중요하다. 이러한 문제점을 개선하기 위해 본 연구에서는 한국어 소셜 미디어 텍스트를 활용한 딥러닝 기반의 우울 경향 모델을 제시하였다. 네이버 지식인, 네이버 블로그, 하이닥, 트위터에서 데이터 수집을 한 뒤 DSM-5 주요 우울 장애 진단 기준을 활용하여 우울 증상 개수에 따라 클래스를 구분하여 주석을 달았다. 이후 구축한 말뭉치의 클래스 별 특성을 살펴보고자 TF-IDF 분석과 동시 출현 단어 분석을 실시하였다. 또한, 다양한 텍스트 특징을 활용하여 우울 경향 분류 모델을 생성하기 위해 단어 임베딩과 사전 기반 감성 분석, LDA 토픽 모델링을 수행하였다. 이를 통해 문헌 별로 임베딩된 텍스트와 감성 점수, 토픽 번호를 산출하여 텍스트 특징으로 사용하였다. 그 결과 임베딩된 텍스트에 문서의 감성 점수와 토픽을 모두 결합하여 KorBERT 알고리즘을 기반으로 우울 경향을 분류하였을 때 가장 높은 정확률인 83.28%를 달성하는 것을 확인하였다. 본 연구는 다양한 텍스트 특징을 활용하여 보다 성능이 개선된 한국어 우울 경향 분류 모델을 구축함에 따라, 한국 온라인 커뮤니티 이용자 중 잠재적인 우울증 환자를 조기에 발견해 빠른 치료 및 예방이 가능하도록 하여 한국 사회의 정신건강 증진에 도움을 줄 수 있는 기반을 마련했다는 점에서 의의를 지닌다.

Abstract

The number of patients with depression in Korea and around the world is increasing every year. However, most people with mental illness are not aware that they have the condition, so they do not receive adequate treatment. If depressive symptoms are neglected, they can lead to suicide, anxiety, and other psychological problems, so early detection and treatment of depression are very important for improving mental health. To address this problem, this study presents a deep learning-based depression tendency model using Korean social media text. Data were collected from Naver Knowledge iN, Naver Blog, Hidoc, and Twitter, and the DSM-5 diagnostic criteria for major depressive disorder were used to annotate documents into classes according to the number of depressive symptoms. TF-IDF analysis and word co-occurrence analysis were then performed to examine the characteristics of each class in the constructed corpus. In addition, word embedding, dictionary-based sentiment analysis, and LDA topic modeling were performed to build a depression tendency classification model using various text features. Through these steps, the embedded text, sentiment score, and topic number of each document were obtained and used as text features. The highest accuracy, 83.28%, was achieved when the embedded text was combined with both the sentiment score and the topic of the document and depression tendency was classified with the KorBERT algorithm. This study is significant in that, by building a Korean depression tendency classification model with improved performance using various text features, it lays a foundation for detecting potential depression patients early among Korean online community users, enabling rapid treatment and prevention and thereby helping to promote mental health in Korean society.
