A Rule-based Approach to Identifying Citation Text from Korean Academic Literature

강인수

doi:10.3743/KOSIM.2012.29.4.043

P-ISSN 1013-0799
E-ISSN 2586-2073
KCI

Journal of the Korean Society for Information Management, (P)1013-0799; (E)2586-2073

2012, v.29 no.4, pp.43-60

https://doi.org/10.3743/KOSIM.2012.29.4.043

강인수. (2012). A Rule-based Approach to Identifying Citation Text from Korean Academic Literature. Journal of the Korean Society for Information Management, 29(4), 43-60. https://doi.org/10.3743/KOSIM.2012.29.4.043

Abstract

Identifying citing sentences in the full text of articles is a prerequisite for a variety of future academic information services, such as citation-based automatic summarization, automatic generation of review articles, sentiment analysis of citing statements, and information retrieval based on citation contexts. However, finding citing sentences is not easy because of implicit citing sentences, which carry no explicit citation marker. While several methods have been proposed to address this problem for English, comparable automatic methods for Korean academic literature are difficult to find. This article presents a rule-based approach to identifying Korean citing sentences. Experiments show that the proposed method finds 30% of the implicit citing sentences in our test data at nearly 70% precision.

keywords: citing sentences, citing sentence identification, implicit citing sentences, rules for identifying citing sentences, cue phrases for citing sentences

