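정보관리학회지 (Journal of the Korean Society for Information Management), 한국정보관리학회 (Korean Society for Information Management)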

A Study on the Design of a User-Centered Query Interface for Content-Based Music Information Retrieval Systems

이윤주(Yonsei University) ; 문성빈(Yonsei University) pp.5-19 https://doi.org/10.3743/KOSIM.2006.23.2.005

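Abstract (translated from Korean)

This study moves away from the conventional system-centered approach and examines the music information seeking behavior of different user groups in order to design an efficient and satisfying query interface for content-based music information retrieval (MIR) for each group. Participants were recruited into two groups of music majors (composition majors; vocal/instrumental majors) and two non-expert groups (amateur non-experts; pure non-experts), according to whether they had specialized knowledge of music. Recruitment used snowball sampling, which included the participant selection process, and theoretical sampling; in the end 14 people took part in the experiment, 7 from the music major groups and 7 from the non-expert groups. Results were derived by analyzing and integrating data obtained from search experiments, think-aloud protocols, participant observation, post-search questionnaires, and in-depth interviews. The expert group of composition majors preferred interfaces that allow a query to be entered as an exact note sequence (keyboard, text, or score input), whereas the expert group of non-composition majors and the non-expert groups preferred a humming query interface. Further research is needed to minimize input errors for each query method.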

Abstract

The purpose of this study is to observe and analyze the information searching behaviors of various user groups in different access modes in order to design a user-centered query interface for a content-based Music Information Retrieval System (MIRS). Two expert groups and two non-expert groups were recruited for this research. The data gathering techniques employed in this study were in-depth interviewing, participant observation, searching task experiments, think-aloud protocols, and post-search surveys. Expert users, especially those majoring in music theory, preferred to input exact notes one by one using such devices as a keyboard and a musical score. On the other hand, non-expert users preferred to input melodic contours by humming.

An Analysis of the State of Theoretical Frameworks in Library and Information Science Research Articles

김성진(Syracuse University, USA) ; 정동열(Ewha Womans University) pp.21-37 https://doi.org/10.3743/KOSIM.2006.23.2.021

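Abstract (translated from Korean)

On the premise that building a discipline's body of knowledge requires research on theory development and theory use, which are closely interrelated parts of a single process, this study analyzes the theoretical foundation of library and information science (LIS) by examining the quantitative and qualitative aspects of theory development and theory use in LIS research articles. In particular, by focusing on how theory development and theory use differ across subfields, it seeks to identify more specifically the subfields that have contributed to the theoretical foundation of LIS. To this end, a content analysis was conducted on 1,661 research articles published from 1984 to 2003. The analysis showed that, among 22 subfields, information seeking and use, information retrieval, library management, and scholarly communication contributed most to every aspect of theory development and theory use. In addition, when the proportion of theoretical studies was examined relative to the number of articles produced in each subfield, research on bibliometrics and on the profession was highly theoretical. Finally, an analysis of the theories used in each subfield showed that some subfields share similar theoretical foundations.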

Abstract

Based upon the assumption that both theory building and theory use are intertwined to construct a cohesive body of knowledge in the field, this study attempts to identify the state of the theoretical framework by examining the number and the quality of theoretical articles by subfield. A theoretical article is characterized as an incident in which the author contributes to the development or the use of theory in his/her own paper. Theoretical incidents were identified by a content analysis of 1,661 articles in four LIS journals from 1984 to 2003. The findings suggest that four subfields (information seeking/use, information retrieval, library management, and scholarly communication) made great contributions to both theory building and theory use. Also, two research areas, bibliometrics and professionals, are very likely to be theoretical. Further, the analysis of the names of theories used by subfields could give an insight into how the theoretical frameworks of each subfield are related.

An Analysis of the Information Architecture and Labeling Systems of University Websites

이승민(Indiana University) ; 남태우(Chung-Ang University) ; 김성희(Chung-Ang University) pp.39-59 https://doi.org/10.3743/KOSIM.2006.23.2.039

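Abstract (translated from Korean)

To develop an information architecture and category labels for designing university websites as efficient information access tools, this study analyzed 17 current websites of library and information science departments in the United States in terms of main menu structure, subcategories, and labeling. First, the analysis indicated that the main menu should consist of the 9 categories provided in common by all 17 websites surveyed. Second, the next level of subcategories should be divided into 35 categories, taking into account the meaning of the content of the 9 main categories. Finally, the terms used as category labels should be those most frequently used across the 17 websites.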

Abstract

In this study we proposed a new informational structure and category labels to fully support the function of school websites as access tools to their contents. The proposed model was divided into three main aspects. First, the main menu structure was the primary guideline for accessing information embedded in a website. Therefore, the proposed main menu structure consisted of 9 categories that are commonly provided by 17 existing school websites. Second, first-level categories consisted of a total of 35 categories under the 9 main menu categories. Each category was placed under a main menu category based on its relationship with the meaning of the upper-level category. Third, the proposed model adopted general and comprehensive terms as category labels. The terms used as category labels were based on the analysis of existing category labels, and the most frequently used terms were selected from the current school websites.

A Study on Patent and Trademark Retrieval Based on Query Log Data

이지연(Yonsei University) ; 백우진(Konkuk University) pp.61-79 https://doi.org/10.3743/KOSIM.2006.23.2.061

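Abstract (translated from Korean)

This study set out to propose ways of improving patent and trademark retrieval. To that end, it analyzed log data for 100,016 queries written by 17,559 users of the patent technology information service of 한국특허정보원 (Korea Institute of Patent Information) over 193 days. In addition to analyzing individual query logs, it analyzed 2,202 search sessions containing multiple queries and found further clues for improving retrieval. According to the analysis, patent and trademark searching resembles general web searching, most notably in the short length of queries. Patent and trademark searchers, however, used Boolean operators more than general web searchers did. The multi-query analysis made it possible to propose search features that could help users reformulate queries. The analysis of multi-query search sessions showed that users reformulate queries through paraphrasing, specialization, generalization, alternation, and interruption.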

Abstract

To develop recommendations for improving patent and trademark retrieval efficiency, 100,016 patent and trademark search requests by 17,559 unique users over a period of 193 days were analyzed. By analyzing 2,202 multi-query sessions, in which one user issued two or more queries consecutively, we discovered a number of clues for improving retrieval efficiency. The session analysis results also led to suggestions for new system features to help users reformulate queries. The patent and trademark retrieval users were found to be similar to typical web users in certain aspects, especially in issuing short queries. However, we also found that the patent and trademark retrieval users used Boolean operators more than typical web search users. By analyzing the multi-query sessions, we found that the users had five intentions in reformulating queries: paraphrasing, specialization, generalization, alternation, and interruption, which were also used by web search engine users.

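Key Issues and Case Analysis of Electronic Document Delivery Services for Knowledge and Information Sharing

유수연(Korea Institute of Science and Technology Information) ; 최희윤(Korea Institute of Science and Technology Information) pp.81-96 https://doi.org/10.3743/KOSIM.2006.23.2.081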

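Abstract (translated from Korean)

Changes in the document delivery environment, such as the spread of web-based scholarly communication and of direct communication between information providers and users, are having a considerable effect on document delivery institutions. In particular, electronic document delivery services, which provide full texts to users over the web, cannot avoid friction with copyright in their operation, because information in electronic form can be copied and distributed quickly and easily. This study reviews the major changes and trends in the document delivery environment and surveys electronic document delivery services abroad, in order to examine which aspects of e-DDS, a domestic web-based document delivery service, raise issues under domestic copyright law and what remains to be resolved.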

Abstract

Changes in the document delivery service environment, such as the spread of web-based research information communication and direct communication between users and information providers, have considerable effects on document delivery service institutes. Swift advances in information technology have allowed users to receive information on their desktops via the web. Web-based document delivery makes reproduction and distribution on a massive scale possible, so the rights of copyright holders need to be protected. This study identifies the current trends and issues of the document delivery service environment and reviews electronic document delivery services of foreign countries. It also introduces the domestic electronic document delivery service, e-DDS, and evaluates the copyright issues for the service.

A Study on the Rutgers Information Retrieval Evaluation Project

이혁진(Texas Woman’s University) pp.97-111 https://doi.org/10.3743/KOSIM.2006.23.2.097

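Abstract (translated from Korean)

The main purpose of this paper is to find out at what level of difference in precision information users perceive a significant difference. Several interesting results emerged. Relevance judgments were found to be unrelated to the time users took to make them. The relationship between users' background knowledge of the topic and their relevance judgments was pronounced. In addition, users had more difficulty making relevance judgments when the number of relevant documents was small. Finally, the study confirmed the importance of the top N documents in a result list for relevance judgment.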

Abstract

The purpose of this study is to investigate what level of difference in precision would be significantly perceived by a human user of an information retrieval system. Little research has been conducted on this issue in the information retrieval field. Despite the non-significant results, there were several interesting findings in recognizing different levels of precision rates. The correctness of the relevance task had little to do with the time taken for the task. In addition, the strong relationship between the subjects' topic familiarity and rate of correct judgments is one of the most interesting results in this study. It turned out that the subjects had more difficulty when they had to judge between two lists containing more non-relevant documents than when they judged between lists containing more relevant documents. Finally, the strong influence of the top N documents in a list on the relevance judgment task was confirmed.

A Study on Fine-Grained Access to Electronic Documents and Its Interfaces: Focusing on Electronic Journals

오정선(University of North Carolina at Chapel Hill) pp.113-127 https://doi.org/10.3743/KOSIM.2006.23.2.113

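Abstract (translated from Korean)

As accessing and using the logical sub-components of a document, rather than the physical document as a unit, has become technically feasible in the digital environment, effective interfaces are needed to support users' use of electronic documents. With the development of a user interface that supports access to sub-document units in mind, this paper surveys related research from three perspectives (users' information use behavior, electronic document models, and new interaction mechanisms) and, based on that literature review, derives basic requirements for developing a new interface.

Keywords: electronic documents, information units, user behavior, digital library interfaces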

Abstract

This paper addresses issues related to the design of an interface supporting fine grained interaction with documents, focusing on a particular type of

A Study on Developing an E-Book Purchasing Model for School Libraries

김성혁(Sookmyung Women's University) ; 김진숙(Korea Education & Research Information Service) pp.129-145 https://doi.org/10.3743/KOSIM.2006.23.2.129

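Abstract (translated from Korean)

To propose an appropriate e-book purchasing model for elementary and secondary school libraries, this paper examined the characteristics of e-books, the factors that affect e-book pricing, case studies of pricing models and purchasing methods, the environment, and the role of government. On that basis it proposed an appropriate model for purchasing e-books and, so that the proposed model could be applied in practice, gathered opinions through a public hearing and from school library officers at offices of education nationwide and from experts.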

Abstract

This paper aims to develop models appropriate for the procurement of electronic books at school libraries. To develop the models, the following were reviewed: the characteristics of electronic books, the factors that affect electronic book prices, case studies of price decision models and procurement methods, the environment, and the role of government. The models were proposed based on the above analysis and review. In addition, the proposed models reflect a range of opinions so that school libraries can apply them.

Social Searching Based on Social Navigation

안재욱(University of Pittsburgh) ; Peter Brusilovsky(University of Pittsburgh) ; Rosta Farzan(University of Pittsburgh) pp.147-165 https://doi.org/10.3743/KOSIM.2006.23.2.147

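Abstract (translated from Korean)

With the explosive growth of web-based educational resources, more effective ways of accessing relevant resources are needed. One such approach proposed in the field of information retrieval is social searching based on social navigation, a technique that seeks to improve search results using information provided by fellow users. This study built a web-based social searching system grounded in personalization and social navigation, and conducted a user study to verify that it can provide users with relevant and essential information.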

Abstract

The explosive growth of Web-based educational resources requires a new approach for accessing relevant information effectively. Social searching in the context of social navigation is one of several answers to this problem, in the domain of information retrieval. It provides users with not merely a traditional ranked list, but also with visual hints which can guide users to information provided by their colleagues. A personalized and context-dependent social searching system has been implemented on a platform called KnowledgeSea II, an open-corpus Web-based educational support system with multiple access methods. Validity tests were run on a variety of aspects and results have shown that this is an effective way to help users access relevant, essential information.

A Study on Extracting Metadata Related to Emotional Information in Email

백우진(Konkuk University) pp.167-183 https://doi.org/10.3743/KOSIM.2006.23.2.167

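Abstract (translated from Korean)

This study applied a natural language processing (NLP) approach to extracting emotional-information metadata from email. Personalization information was extracted from emails exchanged between investment analysts and their clients. Personalization means creating, growing, and sustaining relationships online by providing content to users in a personally meaningful way. For e-commerce and online business, this study implemented a system that uses NLP techniques to automatically extract metadata from text such as emails, discussion board postings, and chat logs, so that personally meaningful information can be selected from large volumes of information and used in personalization services. The value of the study lies in its attempt to extract such emotion-related metadata automatically in settings, such as online business, where communication matters and where it is important to grasp the intent of exchanged messages and the other party's feelings.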

Abstract

This paper describes a metadata extraction technique based on natural language processing (NLP) which extracts personalized information from email communications between financial analysts and their clients. Personalization means connecting users with content in a personally meaningful way to create, grow, and retain online relationships. Personalization often results in the creation of user profiles that store individuals' preferences regarding goods or services offered by various e-commerce merchants. We developed an automatic metadata extraction system designed to process textual data such as emails, discussion group postings, or chat group transcriptions. The focus of this paper is the recognition of emotional content, such as mood and urgency, embedded in business communications, as metadata.

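A Study on Trends in the Development of Library and Information Science Education and Research in Korea

오경묵(Sookmyung Women's University) ; 장윤금(Sookmyung Women's University) pp.185-206 https://doi.org/10.3743/KOSIM.2006.23.2.185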

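Abstract (translated from Korean)

This paper surveys and summarizes activities and achievements in library and information science (LIS) education and research in Korea and explores future directions for the development of the field. The main issues identified in the LIS field are as follows: 1) LIS departments in Korea are steadily restructuring their curricula; 2) they are strengthening IT education; 3) double-major programs are being introduced at the undergraduate level to produce (subject) specialist librarians; 4) the most active research areas are library management, information organization, and information science; and 5) research areas have recently been diversifying, expanding into information retrieval, records management, and publishing.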

Abstract

This study examines the history and issues of the LIS field in Korea in order to identify problems in current librarian education and research areas and to provide a new direction for development in this field. The research identified the following issues in LIS departments: 1) the LIS departments are restructuring their curricula; 2) the departments are strengthening IT education; 3) the foundation for producing professional librarians with area expertise is being established through double major programs; 4) the most popular research areas are library management, organization of information, and information sciences; and 5) the research areas have diversified to include information search, record management, and publishing.

Informal Information Seeking Behavior of Online Students

김성언(Rutgers University) pp.207-227 https://doi.org/10.3743/KOSIM.2006.23.2.207

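Abstract (translated from Korean)

This study seeks to understand the informal information seeking behavior of students during online learning and the online learning environments that support their informal information needs. The participants were 29 online students in a continuing education program at Rutgers University in the United States; data collected by questionnaire were analyzed using content analysis and descriptive statistics. The study focuses on why online students need informal information to solve learning problems and how they resolve them through communication among members of the online learning environment. The conclusion proposes considerations, based on the findings, for supporting online students' informal information seeking behavior.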

Abstract

This study aims to examine online students' informal information seeking behavior during their learning process and the online learning environments that support their informal information needs. The participants of the study were 29 online students in the Professional Development Studies of Rutgers University. Data were collected by questionnaire and analyzed with content analysis and descriptive statistics. This study focuses on when and why online students need human interaction to solve their learning problems and how they communicate with others to meet their informal information needs. Moreover, how online students think about their personal communication opportunities and the functions of their online learning system to support their learning problems is also examined. Finally, online students suggested ways to effectively support the personal communication needed during the learning process in online learning systems. Based on the findings of this study, a few considerations are suggested in the conclusion.

A Study on Query Reconstruction Using Document Structure in Question-Answer Document Retrieval

최상희(Daegu Catholic University) ; 서은경(Hansung University) pp.229-243 https://doi.org/10.3743/KOSIM.2006.23.2.229

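Abstract (translated from Korean)

A question-answer (QA) document is a structured document consisting of a question entered by a user, a description of the question, and answers offered by other users who know the answer; it is an information source that has recently come to be searched as commonly as web documents. This study sought to improve the retrieval effectiveness of QA documents by reconstructing queries based on the structural characteristics of QA documents. The document structures whose performance was compared in the query reconstruction experiment were the question and the answer content. In the question-based approach, a clustering technique was applied using similar questions already entered in the QA retrieval system. In the answer-based approach, passage retrieval was used to select suitable sentences from the answers given to the most similar existing question. The experiment showed that reconstructing queries from answer information provides more diverse search results while maintaining precision.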

Abstract

This study aims to suggest an effective way to enhance question-answer (QA) document retrieval performance by reconstructing queries based on the structural features of QA documents. A QA document is a structured document that consists of three components: a question from a questioner, a short description of the question, and answers chosen by the questioner. The study proposes methods to reconstruct a new query using two major structural parts, question and answer, and examines which component of a QA document could contribute to improving query performance. The major finding of this study is that using the answer document set is the most effective way to reconstruct a new query. That is, queries reconstructed from terms that appear in the answer document set provide the most relevant search results while reducing the redundancy of retrieved documents.

A Study of Factors Affecting the Selection of Scholarly Journals in Medicine

김기영(Rutgers University) pp.245-263 https://doi.org/10.3743/KOSIM.2006.23.2.245

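Abstract (translated from Korean)

Under the pressure that rising purchase costs place on journal acquisition budgets, the factors affecting journal selection have been actively studied over the past several decades, yet no satisfactory theoretical framework for journal selection has been proposed. Accordingly, this study seeks to identify the factors that affect the selection of medical journals in medical libraries, and thereby to lay the groundwork for such a theoretical framework. Using correlation analysis and logistic regression analysis, it presents statistical models, built from several combinations of variables, that explain the variance in journal selection and, further, predict it. It also discusses the practical application of these models and directions for future research.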

Abstract

Since the beginning of discussions on serial collection management, as budgets have waxed and waned over the ensuing decades, a number of key variables affecting selection/deselection have emerged but without the framework of a coherent and accepted theoretical model. This study is an effort to identify variables which affect the serial collection decision with special attention to selection/deselection in the context of an academic health science library. Based on results from correlation analyses and logistic regression analyses, the serial collection decision can be explained and predicted using various combinations of a reduced set of objective variables. Applications of the results to libraries are discussed, and further research is proposed.

Correctness of Pre-Assigned Categories in Training Document Sets and Text Categorization Performance

심경(Systems R&D Center, Iris.Net) ; 정영미(Yonsei University) pp.265-285 https://doi.org/10.3743/KOSIM.2006.23.2.265

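Abstract (translated from Korean)

Text categorization assumes that the subject categories assigned to the training documents are correct to a certain level. This assumption, however, is made without knowledge of real-world document collections. This study examined the level of correctness of pre-assigned subject categories in a real-world collection and attempted to identify the relationship between the correctness of the categories pre-assigned to training documents and text categorization performance. In particular, it sought to determine how far categorization performance can be improved by raising the quality of the subject categories assigned to training documents through manual re-indexing. To this end, 1,150 abstract records in science and technology were re-indexed by a group of experts; after 15 duplicate documents were removed, the records were divided into a training set of 907 documents and a test set of 227 documents. Categorization performance before and after re-indexing, for the initial collection and for the Recat-1 and Recat-2 collections, was compared using a kNN classifier. The average correctness of category assignment in the initial collection was 16%, and its categorization performance was 17% in F1. By contrast, the Recat-1 collection, in which the correctness of subject categories had been improved, reached an F1 of 61%, about 3.6 times that of the initial collection.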

Abstract

In text categorization a certain level of correctness of labels assigned to training documents is assumed without solid knowledge on that of real-world collections. Our research attempts to explore the quality of pre-assigned subject categories in a real-world collection, and to identify the relationship between the quality of category assignment in training set and text categorization performance. Particularly, we are interested in to what extent the performance can be improved by enhancing the quality (i.e., correctness) of category assignment in training documents. A collection of 1,150 abstracts in computer science is re-classified by an expert group, and divided into 907 training documents and 227 test documents (15 duplicates are removed). The performances of before and after re-classification groups, called Initial set and Recat-1/Recat-2 sets respectively, are compared using a kNN classifier. The average correctness of subject categories in the Initial set is 16%, and the categorization performance with the Initial set shows 17% in F1 value. On the other hand, the Recat-1 set scores F1 value of 61%, which is 3.6 times higher than that of the Initial set.

Types of User Uncertainty in the Selection of Web Search Terms: An Examination of the Information Seeking Context of Natural Science Researchers

김양우(Hansung University) pp.287-309 https://doi.org/10.3743/KOSIM.2006.23.2.287

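Abstract (translated from Korean)

Although many studies have pointed out the importance of uncertainty in the information seeking process, few have examined users' uncertainty during searches with an actual information retrieval system. This study investigated how users who were actually seeking information perceived uncertainty while selecting web search terms, and identified various types of uncertainty in the information seeking process. The principal origins of uncertainty found on the basis of these types offer implications for improving information retrieval systems and services.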

Abstract

While numerous studies have suggested the significance of uncertainty during the process of information-seeking, less research has investigated user uncertainty in the actual search process using a real system. This study investigated user perceptions of uncertainty in the process of the selection of Web search terms in the real information-seeking process. The subjects at the doctoral or post-doctoral level were limited to the discipline of science in order to understand user perceptions in this field. The findings revealed various dimensions, types, and incidents of uncertainty. The typology of uncertainty facilitated an understanding of the subjects' information-seeking context by identifying various aspects of the context that constituted the subjects’ uncertainty. The identification of two principal origins of uncertainty based on the different types of uncertainty generated implications to improve information systems and services.

Vol. 23, No. 2
