Fast, high-quality document clustering algorithms play an important role in data exploration by organizing large amounts of information into a small number of meaningful clusters. Many papers have shown that hierarchical clustering performs well but is limited by its quadratic time complexity. In contrast, K-means has a time complexity that is linear in the number of documents, even with a large number of variables, but is thought to produce inferior clusters. In this paper, we apply the K-means algorithm in the Condor system with the initial centroids established in advance; compared with the regular method, the performance of our method improves considerably.
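As a rough illustration of K-means started from centroids fixed in advance, the following minimal Python sketch may help. It is not the Condor implementation: the function names `kmeans` and `squared_distance`, the toy document vectors, and the choice of squared Euclidean distance are all assumptions made for this example, and how the paper actually selects its initial centroids is not described here.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    docs: List[Vector],
    initial_centroids: List[Vector],
    max_iter: int = 100,
) -> Tuple[List[int], List[List[float]]]:
    """K-means seeded with caller-supplied centroids (hypothetical helper).

    Each assignment pass compares every document with every centroid, so
    the cost per iteration grows linearly with the number of documents.
    """
    centroids = [list(c) for c in initial_centroids]
    labels = [-1] * len(docs)
    for _ in range(max_iter):
        # Assignment step: attach each document to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(d, centroids[k]))
            for d in docs
        ]
        if new_labels == labels:
            break  # no document changed cluster, so the result is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [d for d, lab in zip(docs, labels) if lab == k]
            if members:  # leave an empty cluster's centroid where it is
                centroids[k] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Toy term-weight vectors standing in for documents (assumed data).
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels, centroids = kmeans(docs, initial_centroids=[[1.0, 0.0], [0.0, 1.0]])
print(labels)
```

Because the starting centroids are passed in rather than drawn at random, repeated runs on the same data give the same clusters, which makes a comparison against randomly seeded K-means easier to reproduce.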