  • P-ISSN1013-0799
  • E-ISSN2586-2073
  • KCI

An Experimental Study on an Effective Word Sense Disambiguation Model Based on Automatic Sense Tagging Using Dictionary Information

Journal of the Korean Society for Information Management, (P)1013-0799; (E)2586-2073
2007, v.24 no.1, pp.321-342
https://doi.org/10.3743/KOSIM.2007.24.1.321


Abstract

This study presents an effective word sense disambiguation model that requires no manual sense tagging: training data are tagged automatically with the correct sense using a machine-readable dictionary, and a classifier built from that training data then classifies the senses of the target words. Automatic tagging was implemented with two methods, one based on dictionary information and one based on collocation co-occurrence. The dictionary information-based method, applying multiple feature selection, achieved a tagging accuracy of 70.06%, and the collocation co-occurrence-based method 56.33%. The sense classifier trained with dictionary information-based tagging achieved a classification accuracy of 68.11%, and the one trained with collocation co-occurrence-based tagging 62.09%. The combined tagging method, which applies a data fusion technique, achieved a higher tagging performance of 76.09%, resulting in a classification accuracy of 76.16%.
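The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simplified gloss-overlap scorer as a stand-in for dictionary information-based tagging, hypothetical collocation counts as a stand-in for the co-occurrence method, and a weighted sum of normalized scores as one possible form of data fusion. All function names, the toy glosses, the counts, and the weights are illustrative assumptions.

```python
from collections import Counter

def overlap_scores(context, sense_glosses):
    """Score each sense by how many context tokens appear in its dictionary gloss.

    Assumption: a simplified gloss-overlap heuristic, not the paper's feature set.
    """
    ctx = Counter(context)
    return {sense: sum(ctx[w] for w in set(gloss))
            for sense, gloss in sense_glosses.items()}

def normalize(scores):
    """Scale scores to sum to 1 so different methods are comparable."""
    total = sum(scores.values())
    if total == 0:
        return {sense: 0.0 for sense in scores}
    return {sense: value / total for sense, value in scores.items()}

def fuse(score_dicts, weights):
    """Weighted sum of normalized scores (one simple data fusion rule)."""
    fused = {sense: 0.0 for sense in score_dicts[0]}
    for scores, weight in zip(score_dicts, weights):
        norm = normalize(scores)
        for sense in fused:
            fused[sense] += weight * norm.get(sense, 0.0)
    return fused

def tag(fused):
    """Return the best-scoring sense, or None when no method found evidence."""
    best = max(fused, key=fused.get)
    return best if fused[best] > 0 else None

# Toy data (hypothetical): glosses for two senses of "bank".
GLOSSES = {
    "finance": ["money", "deposit", "loan", "institution"],
    "river": ["river", "edge", "water", "land"],
}
CONTEXT = ["deposit", "money", "at", "the", "bank"]
# Hypothetical collocation co-occurrence counts for the same context.
COLLOCATION_COUNTS = {"finance": 3, "river": 1}

dictionary_scores = overlap_scores(CONTEXT, GLOSSES)
fused_scores = fuse([dictionary_scores, COLLOCATION_COUNTS], [0.5, 0.5])
print(tag(fused_scores))
```

In a pipeline like the one described, sentences tagged this way would then serve as training data for a separate sense classifier; the equal weights used here are arbitrary and would need tuning on held-out data.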

keywords
word sense disambiguation, automatic tagging, sense classification, dictionary information-based tagging, collocation co-occurrence-based tagging
