  • P-ISSN1013-0799
  • E-ISSN2586-2073
  • KCI

Selection of Cluster Hierarchy Depth and Initial Centroids in Hierarchical Clustering using K-Means Algorithm

Journal of the Korean Society for Information Management, (P)1013-0799; (E)2586-2073
2004, v.21 no.4, pp.173-185
https://doi.org/10.3743/KOSIM.2004.21.4.173



Abstract

Fast, high-quality document clustering algorithms play an important role in data exploration by organizing large amounts of information into a small number of meaningful clusters. Many papers have shown that hierarchical clustering performs well but is limited by its quadratic time complexity. In contrast, K-Means, even with a large number of variables, has a time complexity that is linear in the number of documents, but it is thought to produce inferior clusters. This paper presents hierarchical clustering with the K-Means algorithm in the Condor system. Compared with the regular method, in which the initial centroids are established in advance, the performance of our method is considerably improved.
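The idea of building a cluster hierarchy from repeated K-Means runs, with a bounded depth and explicitly chosen initial centroids, can be illustrated with a minimal sketch. The abstract does not state how the paper selects the depth or the initial centroids, so everything below is an assumption: the names `kmeans`, `farthest_pair_seeds`, and `build_hierarchy` are hypothetical, the two-way split at each level is an illustrative choice, and the farthest-pair seeding rule stands in for whatever rule the authors use.

```python
import math


def kmeans(points, centroids, iters=20):
    """Plain Lloyd iterations starting from caller-supplied centroids."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid (Euclidean distance).
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return clusters


def farthest_pair_seeds(points):
    """Assumed seeding rule (not from the paper): take the point farthest
    from the overall mean, then the point farthest from that one."""
    mean = tuple(sum(c) / len(points) for c in zip(*points))
    a = max(points, key=lambda p: math.dist(p, mean))
    b = max(points, key=lambda p: math.dist(p, a))
    return [a, b]


def build_hierarchy(points, max_depth, min_size=2):
    """Recursively split points in two with K-Means until max_depth is
    reached or a node is too small to split further."""
    node = {"points": points, "children": []}
    if max_depth == 0 or len(points) < 2 * min_size:
        return node
    left, right = kmeans(points, farthest_pair_seeds(points))
    if not left or not right:
        return node  # degenerate split, e.g. all points identical
    node["children"] = [build_hierarchy(part, max_depth - 1, min_size)
                        for part in (left, right)]
    return node


docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
tree = build_hierarchy(docs, max_depth=1)
```

In this sketch `max_depth` caps the hierarchy at `2 ** max_depth` leaf clusters, and each K-Means pass costs time linear in the number of points at that node, which is the property the abstract contrasts with the quadratic cost of conventional hierarchical clustering.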

keywords
document clustering, K-Means algorithm, cluster hierarchy depth, cluster initial value, hierarchical clustering, cluster centroid

