A Study on Information Resource Evaluation for Text Categorization

정은경

doi:10.3743/KOSIM.2007.24.4.305

P-ISSN 1013-0799
E-ISSN 2586-2073


Journal of the Korean Society for Information Management, (P)1013-0799; (E)2586-2073

2007, v.24 no.4, pp.305-321

https://doi.org/10.3743/KOSIM.2007.24.4.305

정은경 (2007). A Study on Information Resource Evaluation for Text Categorization. Journal of the Korean Society for Information Management, 24(4), 305-321. https://doi.org/10.3743/KOSIM.2007.24.4.305


Abstract

The purpose of this study is to examine whether the information resources that human indexers consult during the indexing process are effective for text categorization. More specifically, information resources drawn from bibliographic information as well as from the full text were examined using a typical data set of scientific journal articles. The experimental results indicate that information resources such as citations, source titles, and titles did not differ significantly from full text, whereas keywords differed significantly from full text. These findings suggest that the information resources referenced by human indexers are good candidates for text categorization aimed at automatic subject term assignment.

Keywords: text categorization, automatic indexing, information resources, subject indexing process
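
The kind of comparison the abstract describes, training a categorizer on one information resource at a time and comparing the outcomes, can be sketched roughly as follows. This is a minimal illustration only, not the study's actual procedure: the classifier (a small multinomial naive Bayes), the field names `title` and `full_text`, the labels, and the toy records are all assumptions invented for the example. It also omits the significance testing that the study reports.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization, no stemming.
    return text.lower().split()


def train_nb(docs, labels):
    """Fit a multinomial naive Bayes model on raw text documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:  # tokens unseen in training are ignored
                count = word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy_by_field(train, test, fields):
    """Train and evaluate one model per information resource (field)."""
    results = {}
    for field in fields:
        model = train_nb([r[field] for r in train], [r["label"] for r in train])
        hits = sum(predict_nb(model, r[field]) == r["label"] for r in test)
        results[field] = hits / len(test)
    return results


# Hypothetical toy records, invented purely for illustration.
TRAIN = [
    {"title": "neural network training",
     "full_text": "we train a neural network with gradient descent on images",
     "label": "ml"},
    {"title": "support vector classification",
     "full_text": "a support vector machine classifier learns a margin from labeled data",
     "label": "ml"},
    {"title": "library cataloging practice",
     "full_text": "catalogers assign subject headings to books in the library",
     "label": "lis"},
    {"title": "subject indexing process",
     "full_text": "indexers examine documents and select indexing terms for the library catalog",
     "label": "lis"},
]
TEST = [
    {"title": "neural classification",
     "full_text": "a neural classifier learns from labeled images",
     "label": "ml"},
    {"title": "library indexing",
     "full_text": "indexers assign subject terms to documents",
     "label": "lis"},
]

RESULTS = accuracy_by_field(TRAIN, TEST, ["title", "full_text"])
print(RESULTS)
```

On realistic data, the per-field scores would be compared with a statistical test across many categories or folds; the toy records here are far too few for that and serve only to show the shape of the comparison.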

