바로가기메뉴

본문 바로가기 주메뉴 바로가기

A Comparative Research on End-to-End Clinical Entity and Relation Extraction using Deep Neural Networks: Pipeline vs. Joint Models

Journal of the Korean Society for Library and Information Science / Journal of the Korean Society for Library and Information Science, (P)1225-598X; (E)2982-6292
2023, v.57 no.1, pp.93-114
https://doi.org/10.4275/KSLIS.2023.57.1.093

Abstract

Information extraction can facilitate the intensive analysis of documents by providing semantic triples which consist of named entities and their relations recognized in the texts. However, most of the research so far has been carried out separately for named entity recognition and relation extraction as individual studies, and as a result, the effective performance evaluation of the entire information extraction systems was not performed properly. This paper introduces two models of end-to-end information extraction that can extract various entity names in clinical records and their relationships in the form of semantic triples, namely pipeline and joint models and compares their performances in depth. The pipeline model consists of an entity recognition sub-system based on bidirectional GRU-CRFs and a relation extraction module using multiple encoding scheme, whereas the joint model was implemented with a single bidirectional GRU-CRFs equipped with multi-head labeling method. In the experiments using i2b2/VA 2010, the performance of the pipeline model was 5.5% (F-measure) higher. In addition, through a comparative experiment with existing state-of-the-art systems using large-scale neural language models and manually constructed features, the objective performance level of the end-to-end models implemented in this paper could be identified properly.

keywords
개체명 인식, 관계추출, 종단형 정보추출, 심층 신경망, 텍스트마이닝, Named-entity recognition, Relation extraction, End-to-end information extraction, Deep neural network, Text mining

Journal of the Korean Society for Library and Information Science