Journal of the Korean Society for Information Management, Korean Society for Information Management

31

김용환 (Yonsei University); 정영미 (Yonsei University) 2012, Vol.29, No.2, pp.155-171 https://doi.org/10.3743/KOSIM.2012.29.2.155

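Abstract (translated from Korean)

A common problem in text categorization is that even a term central to a document is not selected as a classification feature if it does not appear in the training document set, and that synonyms with different surface forms are treated as separate features. This study uses Wikipedia to convert synonyms appearing in documents into a single classification feature and to replace input-document terms that do not occur in the training document set with the most similar training-document term, with the aim of improving categorization performance. The feature selection experiments combined three conditions: (1) whether category information is used when extracting non-training terms, (2) the method of measuring term similarity (titles and body text of Wikipedia articles, category information, link information), and (3) the similarity measure (simple co-occurrence frequency, normalized co-occurrence frequency). When non-training terms were replaced with the training term having the highest similarity at or above the similarity threshold and documents were classified with a kNN classifier, categorization performance improved by 0.35% to 1.85% under every combination of conditions. Although the improvement was not large, the experiments confirmed that selecting classification features with the help of Wikipedia is effective.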

Abstract

In text categorization, core terms of an input document are unlikely to be selected as classification features if they do not occur in the training document set. In addition, synonymous terms that express the same concept are usually treated as different features. This study aims to improve text categorization performance by using Wikipedia to merge synonyms into a single feature and to replace input terms that are absent from the training document set with the most similar term occurring in the training documents. For the selection of classification features, experiments were performed in settings that combined three conditions: whether category information is used when extracting non-training terms, the part of Wikipedia used for measuring term-term similarity, and the type of similarity measure. The categorization performance of a kNN classifier improved by 0.35% to 1.85% in F1 in all experimental settings when non-training terms were replaced by the training term with the highest similarity above the threshold value. Although the improvement is not as large as expected, several semantic and structural devices of Wikipedia could be used for selecting more effective classification features.
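The replacement step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the `similarity` lookup table (standing in for a Wikipedia-derived similarity score), and the choice to leave a term unchanged when no candidate reaches the threshold are all assumptions made for illustration.

```python
def replace_untrained_terms(doc_terms, training_vocab, similarity, threshold):
    """Replace each term missing from training_vocab with its most similar training term.

    `similarity` is a hypothetical dict mapping (input_term, training_term) pairs
    to a score; in the study this score would come from Wikipedia, here it is
    simply given. Terms with no candidate at or above `threshold` are kept as-is
    (an assumption; the abstract does not say what happens to them).
    """
    result = []
    for term in doc_terms:
        if term in training_vocab:
            result.append(term)
            continue
        best_term, best_score = None, 0.0
        # Sorted iteration makes tie-breaking deterministic.
        for candidate in sorted(training_vocab):
            score = similarity.get((term, candidate), 0.0)
            if score > best_score:
                best_term, best_score = candidate, score
        if best_term is not None and best_score >= threshold:
            result.append(best_term)
        else:
            result.append(term)
    return result


training_vocab = {"car", "engine", "road"}
similarity = {
    ("automobile", "car"): 0.8,
    ("automobile", "engine"): 0.3,
    ("bicycle", "road"): 0.1,
}
replaced = replace_untrained_terms(
    ["automobile", "engine", "bicycle"], training_vocab, similarity, threshold=0.5
)
```

In this toy input, "automobile" is mapped to "car", "engine" is already a training term, and "bicycle" stays unchanged because its best score falls below the threshold.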

32

A Study on the Use of Web-Based E-Books by University Students

장혜란 (Sangmyung University) 2006, Vol.23, No.4, pp.233-256 https://doi.org/10.3743/KOSIM.2006.23.4.233

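Abstract (translated from Korean)

To help understand university students' use of e-books and to assess the current situation, students at University A were sampled for a questionnaire survey and interviews. Analysis of the 466 responses shows that students' awareness of e-books and e-book services is low, that about 30% have used e-books, and that the university library website is the dominant access route. Of the users, 73% had read three or fewer e-books; the subject areas used are varied but skewed toward literature and genre fiction, and the purposes are divided between academic reading and personal reading. Awareness and use of additional functions are weak. User satisfaction is also low, with more than 50% expressing a neutral view. Among students with no experience of use, the main reasons for non-use were inconvenience and lack of relevant knowledge. However, about 88% of non-users expressed an intention to use e-books in the future. The interviews show that active users recognize the usefulness of e-books, are comfortable with on-screen reading, and use practical books. Even so, their awareness and use of additional functions and their satisfaction are also low. Based on the analysis, the study recommends promotion to stimulate use, diversification of production, education, and service evaluation.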

Abstract

To understand the use of ebooks among undergraduate students, a questionnaire was devised and data were collected from 466 respondents. Awareness of ebooks and ebook services appears to be low, and only about 30% of the students have used ebooks. Students access ebooks primarily through the library homepage. Of the users, 73% have read three or fewer ebooks. The subject areas read are fairly diverse, although literary works and genre fiction were most popular, and the purpose is split between academic and private reading. Most users lack knowledge of additional functions. Overall satisfaction is low. Discomfort and ebook illiteracy are the major reasons for nonuse; however, about 88% of the nonusers show willingness to use ebooks in the future. According to the interviews, active users are familiar with screen reading and perceive the advantages of ebooks. Nonetheless, their satisfaction level is still low. Based on the results, recommendations for awareness raising, education, production development, and service evaluation are made to promote ebook use.

33

A Study on Implementing the FRBR Model for MARC Data Using Topic Maps

이현실 (Wonkwang University); 한성국 (Wonkwang University) 2005, Vol.22, No.3, pp.289-306 https://doi.org/10.3743/KOSIM.2005.22.3.289

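Abstract (translated from Korean)

Although the FRBR model provides an ER modeling approach centered on bibliographic elements and their relationships, it is only a structural framework, so a tool that can implement the FRBR model efficiently is needed. This study presents a method for implementing the FRBR model using Topic Maps. We designed a Topic Maps-based implementation of the FRBR model to demonstrate its validity and built it using Topic Maps. The results show that, because the entity-relation structure of FRBR and the topic-association structure of Topic Maps are conceptually equivalent, Topic Maps is well suited to developing the FRBR model. Because the FRBR structure corresponds directly to the Topic Maps paradigm, it is desirable to implement the FRBR model with Topic Maps.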

Abstract

As FRBR defines only a structural framework based on ER modeling for bibliographic data elements, an effective tool is required to implement the FRBR model. In this study, we present an implementation of the FRBR model based on Topic Maps. To show the effectiveness of Topic Maps as the implementation language of FRBR, we implement the FRBR model using Topic Maps. The topic-association structure of Topic Maps conceptually harmonizes with the entity-relation structure of FRBR, which means that Topic Maps is suitable for the implementation of the FRBR model.

34

A Study on Analyzing Author Images Using Readers' Book Recommendation Information

최상희 (Daegu Catholic University) 2017, Vol.34, No.4, pp.153-171 https://doi.org/10.3743/KOSIM.2017.34.4.153

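Abstract (translated from Korean)

Readers who read for leisure often prefer particular authors, and when they broaden their reading they tend to move toward authors or genres related to the authors they prefer. This study selected Edgar Allan Poe as the focal author and analyzed his image as an author on the basis of the authors and works that readers recommend to other readers in connection with Poe. The frequencies of authors and works co-occurring with Poe were analyzed, and the relationships among recommended authors and among works were analyzed with network techniques. The analysis identified the groups of authors associated with Poe's genre image, the relationships among those authors, and related books. The approach presented here for deriving a particular author's image and related author and work information can serve as a tool when library reading programs, cultural programs, or book curation are built around a particular author.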

Abstract

Many readers tend to read books by a specific author and to expand their reading areas starting from that author. This study chose Edgar Allan Poe and analyzed the image of the author using the authors and books that readers recommend together with him. The frequencies of co-occurring authors and books were investigated, and the relations among authors and among books were analyzed with network analysis methods. As a result, the genre images of Poe, related authors, and related books were identified. This study also suggests methods for identifying the image of an author, related author groups, and related books for libraries' reading programs and book curation.

35

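A Study of Library Academic Database Usage Patterns Based on Social Network Analysis: The Case of Academic Database Use at K University Library

최일영 (Kyung Hee University); 이용성 (Kyung Hee University); 김재경 (Kyung Hee University) 2010, Vol.27, No.1, pp.25-40 https://doi.org/10.3743/KOSIM.2010.27.1.025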

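Abstract (translated from Korean)

The purpose of this study is to analyze the use of academic databases at K University Library through social network analysis and to develop and provide services suited to users' needs. To this end, log data from the library's academic databases were used to construct and empirically analyze academic database networks by discipline, by user status, and by discipline and user status combined. The results show that the academic database networks of full-time faculty and of doctoral students exhibit strong cohesion centered on specialized academic databases, and that their density, degree centralization, and degree centrality are higher than those of the networks for other user statuses.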

Abstract

The purpose of this study is to analyze usage patterns among academic databases through social network analysis and to support academic database services that meet users' needs. For this purpose, we extracted log data from the proxy server of K university library to construct academic database networks, and analyzed usage patterns by research area and by social position. Our results indicate that, in the full-time professors' network and the doctoral students' network, academic databases specialized for a research area show more cohesion than general academic databases, and that the density, degree centrality, and degree centralization of these two networks are higher than those of the networks for other social positions.
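The three network measures named in this abstract have standard definitions for an undirected graph, sketched below in Python. The node names and edges are hypothetical; the abstract does not say how links between databases were derived from the log data, so the example only shows how the measures are computed once a network exists. Degree centralization follows Freeman's formula and assumes at least three nodes.

```python
from collections import defaultdict


def network_measures(nodes, edges):
    """Return (density, degree centrality per node, degree centralization).

    Assumes a simple undirected graph with at least three nodes and no
    duplicate edges or self-loops.
    """
    n = len(nodes)
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Density: share of the n(n-1)/2 possible links that are present.
    density = 2 * len(edges) / (n * (n - 1))
    # Degree centrality: degree normalized by the maximum possible degree.
    centrality = {node: degree[node] / (n - 1) for node in nodes}
    # Freeman degree centralization: 1.0 for a star, 0.0 when all degrees are equal.
    max_degree = max(degree[node] for node in nodes)
    centralization = sum(max_degree - degree[node] for node in nodes) / (
        (n - 1) * (n - 2)
    )
    return density, centrality, centralization


# Hypothetical star-shaped network: one hub database linked to three others.
nodes = ["DB_hub", "DB_a", "DB_b", "DB_c"]
edges = [("DB_hub", "DB_a"), ("DB_hub", "DB_b"), ("DB_hub", "DB_c")]
density, centrality, centralization = network_measures(nodes, edges)
```

A network dominated by one specialized database, as in this star example, scores high on centralization even though its density is moderate.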

36

A Study on the Performance of Document Retrieval Using Node Information

윤소영 (National Institute of Korean History) 2007, Vol.24, No.1, pp.103-120 https://doi.org/10.3743/KOSIM.2007.24.1.103

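Abstract (translated from Korean)

With advances in communication technology and information devices, the ways in which universities use information in their curricula are changing rapidly, and there is a growing number of points that must be understood when copyrighted information is to be used ethically and legally as educational material. This study analyzed copyright law and various guidelines concerning fair use of information for educational purposes in universities, derived the key concepts that university libraries should make educators and students aware of, and sought to propose the key areas that university libraries should include in fair use guidelines. It also examined whether the key concepts derived for each area are adequately provided to educators and students on the websites of Korean university libraries.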

Abstract

Due to radical changes in information technology, it has become indispensable for university educators and students to learn how to use copyrighted works ethically and legally without violating copyright law. As a result, academic libraries need to take responsibility for informing them of fair use criteria and for providing proper fair use guidelines. This study analysed copyright law and various fair use guidelines to identify the key areas that a fair use guideline for academic libraries should cover. It also investigated the web sites of 10 university libraries to determine whether the identified key areas are delivered to educators and students.

37

A Study on Improving Document Clustering Performance Using Distributional Similarity

이재윤 (Kyonggi University) 2007, Vol.24, No.4, pp.267-283 https://doi.org/10.3743/KOSIM.2007.24.4.267

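Abstract (translated from Korean)

This study explored whether distributional similarity, applied to document clustering, could replace the traditional cosine similarity formula. It devised a way to apply to document vectors three formulas derived from KL divergence, the representative distributional similarity measure: Jensen-Shannon divergence, symmetric skew divergence, and minimum skew divergence. To verify the clustering performance obtained with distributional similarity, two experiments were designed and run on three test collections. In the first document clustering experiment, minimum skew divergence clearly outperformed not only cosine similarity but also the other divergence formulas. In the second experiment, second-order distributional similarities were computed from the first-order similarity matrix using the Pearson correlation coefficient and then used for document clustering. The results showed that second-order distributional similarity gave better document clustering performance overall. If both processing time and classification performance are considered in document clustering, the minimum skew divergence formula proposed in this study is preferable; if only classification performance is considered, the second-order distributional similarity approach is preferable.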

Abstract

In this study, measures of distributional similarity such as KL divergence are applied to document clustering in place of the traditional cosine measure, which is the most prevalent vector similarity measure for document clustering. Three variations of KL divergence are investigated: Jensen-Shannon divergence, symmetric skew divergence, and minimum skew divergence. To verify the contribution of distributional similarities to document clustering, two experiments were designed and carried out on three test collections. In the first experiment, the clustering performance of the three divergence measures was compared to that of the cosine measure. The results showed that minimum skew divergence outperformed the other divergence measures as well as the cosine measure. In the second experiment, second-order distributional similarities were calculated with the Pearson correlation coefficient from the first-order similarity matrices. Second-order distributional similarities were found to improve the overall performance of document clustering. These results suggest that minimum skew divergence should be selected as the document vector similarity measure when both time and accuracy are considered, and that second-order similarity is a good choice when only clustering accuracy is considered.
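As background for the measures named above, the following Python sketch computes KL divergence, Jensen-Shannon divergence, and a one-directional skew divergence between two term-frequency vectors. It is an illustration under assumptions, not the paper's formulas: the skew form used here is KL(p || alpha*q + (1-alpha)*p) with an arbitrarily chosen alpha of 0.99, and the symmetric and minimum skew variants are not reproduced because the abstract does not define them.

```python
import math


def normalize(vec):
    """Turn a non-negative term-frequency vector into a probability distribution."""
    total = sum(vec)
    return [x / total for x in vec]


def kl_divergence(p, q):
    """KL(p || q) in bits; requires q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by 1 bit."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def skew_divergence(p, q, alpha=0.99):
    """KL of p against q smoothed with a little of p, so the result stays finite.

    The default alpha is an assumption made for this sketch.
    """
    mix = [alpha * qi + (1 - alpha) * pi for pi, qi in zip(p, q)]
    return kl_divergence(p, mix)


# Two hypothetical documents over a three-term vocabulary.
doc_a = normalize([4, 1, 0])
doc_b = normalize([1, 1, 3])
js_ab = js_divergence(doc_a, doc_b)
skew_ab = skew_divergence(doc_a, doc_b)
```

Unlike plain KL divergence, both derived measures remain finite when one document contains a term the other lacks, which is the usual case for sparse document vectors.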

38

A Study on Promoting the Use of Grey Literature in the Digital Age

남영준 (Jeonju University) 2002, Vol.19, No.4, pp.234-255 https://doi.org/10.3743/KOSIM.2002.19.4.234

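Abstract (translated from Korean)

This study analyzed the precise definition of grey literature and the degree of scholarly value it holds. To do so, the results of earlier studies were reinterpreted to derive a standard definition of grey literature. To measure the value of grey literature, user surveys from earlier studies were reinterpreted, and information managers were surveyed about establishing a centralized management agency to promote grey literature. The results show that, in Korea, users and information managers still lack a clear sense of the boundaries of grey literature and a clear perception of it. The extent to which grey literature is used also depended heavily on whether users and information managers were aware of it. A centralized grey literature management agency is needed to collect, manage, and distribute grey literature.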

Abstract

Despite its importance, grey literature is often overlooked as a resource in libraries because it has always had the reputation of being obscure and difficult to locate. Thus, an information system and network for the management of grey literature is essential to the improvement of Korean R&D in science and technology. The results of the research are as follows:
- Clear definitions and types of grey literature are stated.
- Methods for utilizing grey literature are suggested.
- The major functions or requirements of a National Information Center on Grey Literature are identified.

39

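A Case Study on Automatic Classification of Records Text Using Machine Learning

김해찬솔 (ArchiveLab); 안대진 (Myongji University Graduate School of Records and Information Science; CEO, ArchiveLab Co., Ltd.); 임진희 (Seoul Metropolitan Government); 이해영 (Myongji University) 2017, Vol.34, No.4, pp.321-344 https://doi.org/10.3743/KOSIM.2017.34.4.321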

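Abstract (translated from Korean)

Research on the automatic classification of records and documents began long ago. Recently, as artificial intelligence technology has advanced, it has developed into research that incorporates machine learning and deep learning. This study first reviewed how automatic document classification and the learning methods of artificial intelligence have developed. It then examined the need to apply artificial intelligence technology to records management, drawing on the characteristics of machine learning, supervised learning in particular, and on a range of cases. Finally, it used supervised learning to automatically classify approval documents of the Seoul Metropolitan Government into the government function classification scheme using ETRI's Exobrain. From this, considerations for each step of automatically classifying records into various classification schemes were derived.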

Abstract

Research on the automatic classification of records and documents has been conducted for a long time. Recently, with advances in artificial intelligence technology, this research has come to incorporate machine learning and deep learning. In this study, we first reviewed how the automatic classification of documents and the learning methods of artificial intelligence have developed. We also discussed the necessity of applying artificial intelligence technology to records management, drawing on various cases of machine learning, especially supervised methods. We then conducted a test that automatically classified the public records of the Seoul metropolitan government into BRM using ETRI's Exobrain, based on a supervised machine learning method. From this, we identified issues that records management agencies should consider at each step when automatically classifying records into various classification schemes.

40

An Experimental Study on Selecting Association Terms Using Text Mining Techniques

김수연 (Yonsei University); 정영미 (Yonsei University) 2006, Vol.23, No.3, pp.147-165 https://doi.org/10.3743/KOSIM.2006.23.3.147

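Abstract (translated from Korean)

To find the best technique for selecting terms associated with an initial query term from an entire document collection, this study ran association term selection experiments using association rule mining and term clustering. The association rule mining used the Apriori algorithm, and the term clustering used the GSS coefficient, Jaccard coefficient, cosine coefficient, Sokal & Sneath 5, and mutual information as association measures. Association term precision and association term overlap ratio were used as performance measures, and in the experiments the Apriori algorithm and the GSS coefficient performed best.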

Abstract

In this study, experiments on the selection of association terms were conducted in order to find the optimum method for selecting additional terms that are related to an initial query term. Association term sets were generated by using the support, confidence, and lift measures of the Apriori algorithm, and also by using similarity measures such as GSS, the Jaccard coefficient, the cosine coefficient, Sokal & Sneath 5, and mutual information. In the performance evaluation of term selection methods, the precision of association terms and the overlap ratio between association terms and relevant documents' indexing terms were used. The Apriori algorithm and GSS were found to achieve the highest level of performance.
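The rule measures and two of the similarity coefficients mentioned in this abstract can be computed directly from document-term co-occurrence counts, as in the Python sketch below. It evaluates a single term pair under the standard definitions and does not perform Apriori's candidate generation; the function names and the toy documents are assumptions made for illustration.

```python
import math


def pair_counts(documents, term_a, term_b):
    """Count documents containing term_a, term_b, and both."""
    count_a = sum(1 for doc in documents if term_a in doc)
    count_b = sum(1 for doc in documents if term_b in doc)
    count_both = sum(1 for doc in documents if term_a in doc and term_b in doc)
    return count_a, count_b, count_both


def rule_measures(documents, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent.

    Assumes both terms occur in at least one document.
    """
    n = len(documents)
    count_a, count_c, count_both = pair_counts(documents, antecedent, consequent)
    support = count_both / n
    confidence = count_both / count_a
    lift = confidence / (count_c / n)
    return support, confidence, lift


def jaccard_and_cosine(documents, term_a, term_b):
    """Jaccard and cosine coefficients on binary term occurrence."""
    count_a, count_b, count_both = pair_counts(documents, term_a, term_b)
    jaccard = count_both / (count_a + count_b - count_both)
    cosine = count_both / math.sqrt(count_a * count_b)
    return jaccard, cosine


# Hypothetical collection: each document is the set of its index terms.
documents = [
    {"library", "catalog"},
    {"library", "catalog", "marc"},
    {"library"},
    {"marc"},
]
support, confidence, lift = rule_measures(documents, "library", "catalog")
jaccard, cosine = jaccard_and_cosine(documents, "library", "catalog")
```

A lift above 1 indicates that the consequent term appears with the query term more often than its overall frequency would predict, which is what makes it a candidate association term.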
