A Study on the Construction of an Automatic Summarization System: Focusing on Straight News on the Web

Journal of the Korean Society for Information Management, (P)1013-0799; (E)2586-2073
2006, v.23 no.4, pp.41-67
https://doi.org/10.3743/KOSIM.2006.23.4.041
이태영 (Chonbuk National University)

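Abstract (translated from Korean)

To build an automatic summarization system for straight news on the Web, a text-structure frame and a set of rules were written by applying discourse-structure and knowledge-based techniques. The frame includes slots for the roles of paragraphs, sentences, and clauses; the properties of paragraphs and sentences; discrimination rules that distinguish roles; rules for extracting key sentences; and rules for composing the summary. It also provides an 'if-needed' facet, which supplies information such as context definitions and proper nouns, and an 'if-changed' facet, which reports changed slot values. When extracting and representing the actual values of slots and facets, the rhetorical roles of phrases, the top-level categories of words, and plot units were consulted. The reconstruction step, which unifies, separates, and synthesizes summary sentences while preserving the coherence of the flow of meaning, used a similarity formula, syntactic information, rules derived from discourse-structure and knowledge-based methods, and context definitions; it also generated new sentences such as critiques.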
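The frame, slot, and facet arrangement described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class names `Slot` and `Frame`, the slot names `sentence_role` and `proper_nouns`, and the placeholder lookup are all hypothetical, chosen only to show how an if-needed facet can supply a missing value and how an if-changed facet can report an update.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Slot:
    """One frame slot holding a value plus two procedural facets."""
    value: Any = None
    # if-needed: called to compute a value when an empty slot is read.
    if_needed: Optional[Callable[[], Any]] = None
    # if-changed: called with (old, new) whenever the value is replaced.
    if_changed: Optional[Callable[[Any, Any], None]] = None


@dataclass
class Frame:
    name: str
    slots: dict = field(default_factory=dict)

    def get(self, slot_name: str) -> Any:
        slot = self.slots[slot_name]
        if slot.value is None and slot.if_needed is not None:
            slot.value = slot.if_needed()  # fill the slot on demand
        return slot.value

    def set(self, slot_name: str, new_value: Any) -> None:
        slot = self.slots[slot_name]
        old = slot.value
        slot.value = new_value
        if slot.if_changed is not None and old != new_value:
            slot.if_changed(old, new_value)  # report the change


changes = []  # collects (old, new) pairs reported by if-changed

news = Frame("news_article", {
    # Hypothetical slot: the role assigned to a sentence.
    "sentence_role": Slot(
        value="lead",
        if_changed=lambda old, new: changes.append((old, new)),
    ),
    # Hypothetical slot: proper nouns, looked up only when needed.
    "proper_nouns": Slot(if_needed=lambda: ["Seoul"]),
})

news.set("sentence_role", "background")
print(news.get("proper_nouns"))
print(changes)
```

Keeping the facets as plain callables on each slot means a rule can be attached to one slot without touching the others, which is one reasonable way to model the per-slot guidance the abstract mentions.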

keywords
automatic summarization, concept plot, rhetorical role, semantic category, straight news, web

Abstract

A text-structure frame and a set of rules based on discourse structure and knowledge-based methods were applied to build an automatic Ext/Sums (extracts and summaries) system for straight news on the Web. The frame contains slots and facets that represent the roles of paragraphs, sentences, and clauses in news articles, together with the rules that determine the slot type. Candidate summary sentences were rearranged through unification, separation, and synthesis while maintaining coherence of meaning. This rearrangement used rules derived from similarity measurement, syntactic information, discourse structure, and knowledge-based methods, along with context plots defined by the syntactic/semantic signatures of nouns and verbs and the categories of verb suffixes. Inserting critique sentences into the summary was also attempted.
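As a rough illustration of how a similarity measure might flag candidate sentences for unification, the sketch below groups sentences by word overlap. The abstract does not state which similarity formula the system uses, so Jaccard overlap is a stand-in, and the function names and the 0.5 threshold are assumptions made for this example only.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two sentences, from 0.0 to 1.0."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def group_similar(sentences, threshold=0.5):
    """Greedily place each sentence in the first group whose head it resembles."""
    groups = []
    for sentence in sentences:
        for group in groups:
            if jaccard(group[0], sentence) >= threshold:
                group.append(sentence)  # candidate for unification
                break
        else:
            groups.append([sentence])  # no similar group: start a new one
    return groups


candidates = [
    "the city council approved the budget",
    "the council approved the city budget",
    "rain is expected on friday",
]
groups = group_similar(candidates)
for group in groups:
    print(group)
```

Sentences that land in the same group are the ones a later step could merge or synthesize; the syntactic and discourse rules the abstract mentions would then decide how.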


