A Study on Statistical Feature Selection with Supervised Learning for Word Sense Disambiguation

Lee, Yong-Gu; 이용구

doi:10.14699/kbiblia.2011.22.2.005

한국비블리아학회지



한국비블리아학회지, (P)1229-2435; (E)2799-4767

2011, v.22 no.2, pp.5-25


Lee, Y. (2011). A Study on Statistical Feature Selection with Supervised Learning for Word Sense Disambiguation. 한국비블리아학회지, 22(2), 5-25. https://doi.org/10.14699/kbiblia.2011.22.2.005


Abstract

This study aims to identify the most effective statistical feature selection method and context window size for word sense disambiguation using supervised methods. Features were selected by four different methods: information gain, document frequency, chi-square, and relevancy. The weight comparison showed that identifying the most appropriate features could improve word sense disambiguation performance. Of the four methods, information gain yielded the highest performance. The SVM classifier was not affected by feature selection and performed better with a larger feature set and a larger context size. The naive Bayes classifier performed best with 10 percent of the feature set, and the kNN classifier with less than 10 percent of the feature set. When feature selection methods are applied to word sense disambiguation, combining a small feature set with a larger context window, or a large feature set with a small context window, yields the best performance improvements.

Keywords: word sense disambiguation, statistical feature selection, context size, naive Bayes classifier, kNN classifier
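
To make the abstract's terms concrete, the following is a minimal Python sketch of how context-window features might be extracted and then ranked by information gain or document frequency. It is an illustration only, not the study's implementation: the function names (`context_window`, `information_gain`, `document_frequency`, `select_top`), the toy sentences for the ambiguous word "bank", and the sense labels are all assumptions made here. Chi-square and relevancy scoring are not shown.

```python
import math
from collections import Counter


def context_window(tokens, target_index, size):
    """Return up to `size` tokens on each side of the target word."""
    left = tokens[max(0, target_index - size):target_index]
    right = tokens[target_index + 1:target_index + 1 + size]
    return left + right


def entropy(counts):
    """Shannon entropy (base 2) of a collection of class counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(instances, feature):
    """Entropy reduction over senses from knowing whether `feature` occurs.

    `instances` is a list of (set_of_context_words, sense_label) pairs.
    """
    n = len(instances)
    labels = Counter(sense for _, sense in instances)
    present = Counter(s for words, s in instances if feature in words)
    absent = Counter(s for words, s in instances if feature not in words)
    gain = entropy(labels.values())
    for subset in (present, absent):
        size = sum(subset.values())
        if size:  # skip an empty partition
            gain -= (size / n) * entropy(subset.values())
    return gain


def document_frequency(instances, feature):
    """Number of training instances whose context contains `feature`."""
    return sum(1 for words, _ in instances if feature in words)


def select_top(instances, score, fraction):
    """Keep the top `fraction` of the vocabulary ranked by `score`."""
    vocab = sorted({w for words, _ in instances for w in words})
    ranked = sorted(vocab, key=lambda w: (-score(instances, w), w))
    keep = max(1, round(len(vocab) * fraction))
    return ranked[:keep]


# Hypothetical toy training data for the ambiguous word "bank".
SENTENCES = [
    ("he sat on the bank of the river", "river"),
    ("she deposited money at the bank today", "finance"),
    ("the river bank was muddy", "river"),
    ("the bank approved the money", "finance"),
]


def build_instances(window_size):
    """Turn each labeled sentence into (context word set, sense)."""
    instances = []
    for text, sense in SENTENCES:
        tokens = text.split()
        idx = tokens.index("bank")
        instances.append((set(context_window(tokens, idx, window_size)), sense))
    return instances


if __name__ == "__main__":
    instances = build_instances(window_size=3)
    print(select_top(instances, information_gain, fraction=0.2))
    print(select_top(instances, document_frequency, fraction=0.2))
```

In this toy setting, a word such as "the" has a high document frequency but an information gain of zero because it occurs with both senses, whereas "river" separates the senses completely. Changing `window_size` changes which words enter the vocabulary at all, which is one way to see why feature set size and context window size interact.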
