Journal of Korean Society of Archives and Records Management, Korean Society of Archives and Records Management

P-ISSN 1598-1487
E-ISSN 2671-7247


Search term: reusability (재사용성); search results: 4

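Extracting Metadata from Record Aggregations through Named Entity Recognition in Natural Language Processing

Song Chiho (송치호; Director, Research Institute for Korean Archives and Records) 2024, Vol.24, No.2, pp.65-88 https://doi.org/10.14404/JKSARM.2024.24.2.065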

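Abstract (translated from the Korean)

This is an exploratory study of how named entity recognition (NER), a task in natural language processing (NLP), which is itself a subfield of artificial intelligence, can be used to extract the metadata values and descriptive information embedded in records. The study examined handwritten records of the Guro Industrial Complex produced in the 1960s and 1970s (about 1,200 pages, some 80,000 words). Alongside preprocessing, which included digitization, named entities in the record text were recognized using a publicly available language API built on Google's BERT language model. As a result, 173 personal names and 314 names of organizations and institutions were extracted from the complex's historical records, and these are expected to be usable as direct search terms for the content of the records. The study also identified the problems that arise when theoretical NLP methods are applied to actual records consisting of semi-structured and unstructured text, and presented solutions and implications to consider.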
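The step from recognized entities to search terms described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the study's implementation: the input format (text and type pairs), the labels `PERSON` and `ORGANIZATION`, the function name `collect_search_terms`, and the placeholder sample names are all hypothetical, and the NER call itself is left out.

```python
from collections import defaultdict

# Hypothetical NER output: (surface text, entity type) pairs.
# The type labels and the placeholder names are assumptions for illustration.
SAMPLE_ENTITIES = [
    ("Person A", "PERSON"),
    ("Organization X", "ORGANIZATION"),
    ("Person  A", "PERSON"),          # repeated mention with stray whitespace
    ("Organization Y", "ORGANIZATION"),
    ("Place Z", "LOCATION"),          # a type that is not kept below
]


def collect_search_terms(entities, keep_types=("PERSON", "ORGANIZATION")):
    """Group unique entity names by type, preserving first-seen order."""
    seen = defaultdict(dict)  # inner dict acts as an ordered set
    for text, label in entities:
        name = " ".join(text.split())  # collapse runs of whitespace
        if label in keep_types and name:
            seen[label].setdefault(name, None)
    return {label: list(names) for label, names in seen.items()}


terms = collect_search_terms(SAMPLE_ENTITIES)
for label, names in terms.items():
    print(label, len(names), names)
```

Counting the unique names per type in this way is one plausible route to totals like those reported in the abstract, although the study may have deduplicated and normalized names differently.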

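Improving the National Archives of Korea's Service on Changes in Records-Creating Agencies Using RiC-O (Records in Contexts - Ontology)

Kim Hyeonchae (김현채; M.A., Graduate School of Records, Archives & Information Science, Myongji University) ; Kang Seonghee (강성희; Professor, Graduate School of Records, Archives & Information Science, Myongji University) ; Lee Haeyoung (이해영; Professor, Graduate School of Records, Archives & Information Science, Myongji University) 2024, Vol.24, No.1, pp.47-72 https://doi.org/10.14404/JKSARM.2024.24.1.047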

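Abstract (translated from the Korean)

This study analyzed the National Archives of Korea's service that provides information on changes in records-creating agencies, identified problems such as the difficulty of grasping the structure of relationships among agencies, and explored whether applying RiC-O could improve the service. As a reference case, the study analyzed the French PIAAF project, which is based on RiC-O. The analysis confirmed that RiC-O can contribute considerably to improving the service: it integrates information about records and their creators, expresses the relationships between data entities clearly, and extends the interoperability of authority records through links with newer technologies such as linked data. Based on these findings, and in order to address the service's problems and improve the user experience, the study proposed an authority record service based on RiC-O and designed and implemented a prototype.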

Abstract

This study delves into the National Archives of Korea's service that provides information on changes in records-creating agencies, identifying the problems in an organizational relationship structure and exploring potential enhancements using the RiC-O. Drawing insights from the French PIAAF project, we applied RiC-O to integrate information on records and records creators, elucidating relationships between data entities. Our analysis demonstrated that leveraging RiC-O, coupled with technologies like linked data, amplifies the interoperability of authority records, substantially enhancing the service providing information on changes in records-creating agencies. Based on these findings, we propose an authority record service based on RiC-O, presenting a prototype designed to improve the National Archives of Korea's change information service and enhance user experience.

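Improving BRM-Based Records Classification through an Analysis of the Types of Legislation-Based Classification Systems

Park Jiyoung (박지영; Professor, Faculty of Humanities, Hansung University) 2024, Vol.24, No.2, pp.139-163 https://doi.org/10.14404/JKSARM.2024.24.2.139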

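Abstract (translated from the Korean)

The purpose of this study is to collect and analyze, on the basis of legislation, the classification systems used in the public sector, and to use the results to improve the classification system for public records. To this end, about 80 classification systems were extracted from 375 legislative provisions retrieved from the Korean Law Information Center and analyzed. The forms of the classification systems were first divided into lists, tables, and hierarchical classifications, and three management types were then combined with two functions to yield six types of classification system use. Among the models for these types, classification systems used for the core work of public institutions often had the same body as developer and user, and even when a system was adopted from another institution, the using institution could modify parts of it as needed. Where another institution's classification system was used unchanged, it was in most cases used for reference work rather than core work. In records management, however, classification items developed and managed by other institutions were being applied without modification to core work such as records classification and disposition. The study therefore judged that the records classification system needs structural improvement: it should be possible either to develop a separate records classification system that supports core work or to revise and supplement the existing one. It also proposed that the disposition criteria and guidelines of the records management field be applied to the relevant provisions of other laws.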

Abstract

This study aims to analyze classification systems used in the public sector, collected based on legislation, and to improve the classification system for public records. From the Korean Law Information Center, 375 legislative clauses were searched, revealing about 80 classification systems. These systems were initially divided into lists, tables, and hierarchical classifications. Six types of classification system uses were proposed after combining three management types and two system functions. Among these models, classification systems used for core operations in public agencies often had the same entity as both developer and user. While systems adopted from other institutions were often modified as needed, they were predominantly used for reference tasks rather than core operations. However, in records management, crucial tasks such as record classification and disposal commonly use unmodified classification system items developed and managed by other agencies. Consequently, this study proposes that structural improvements are necessary for the record classification system. It suggests developing dedicated classification systems to support core functions or modifying existing systems and also applying records management disposal standards and guidelines to other relevant legislative provisions.

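A Case Study on Extracting Records Management Metadata Using ChatGPT

Kim Minji (김민지; M.A., Data Records major, Graduate School of Records, Archives & Information Science, Myongji University) ; Kang Seonghee (강성희; Professor, Data Records major, Graduate School of Records, Archives & Information Science, Myongji University) ; Lee Haeyoung (이해영; Professor, Records Management major, Graduate School of Records, Archives & Information Science, Myongji University) 2024, Vol.24, No.2, pp.89-112 https://doi.org/10.14404/JKSARM.2024.24.2.089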

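Abstract (translated from the Korean)

In records management, metadata is one of the essential components of a record and plays a very important role in ensuring that records are properly managed and understood. When metadata elements cannot be assigned automatically, records professionals must enter the values by hand. To ease this burden, the study set out to propose a way of extracting records management metadata elements using ChatGPT, a new technology. A Python program and the LangChain library were used to supply PDF documents and extract record metadata through questions, and the ChatGPT online service was used to extract metadata elements from several attached PDF documents. With LangChain and ChatGPT-3.5 turbo, the extraction method was safe in terms of security but somewhat limited in obtaining accurate metadata elements; with the ChatGPT-4 online service, security-sensitive documents could not be attached, but the results were comparatively accurate. The study thus gauged the feasibility of using ChatGPT for metadata extraction in records management, and it expects that safer and more accurate extraction will become possible as ChatGPT-related technologies develop. Drawing on these strengths is expected to help increase efficiency and productivity in managing records and metadata in archives.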

Abstract

Metadata is a crucial component of record management, playing a vital role in properly managing and understanding the record. In cases where automatic metadata assignment is not feasible, manual input by records professionals becomes necessary. This study aims to alleviate the challenges associated with manual entry by proposing a method that harnesses ChatGPT technology for extracting records management metadata elements. To employ ChatGPT technology, a Python program utilizing the LangChain library was developed. This program was designed to analyze PDF documents and extract metadata from records through questions, both with a locally installed instance of ChatGPT and the ChatGPT online service. Multiple PDF documents were subjected to this process to test the effectiveness of metadata extraction. The results revealed that while using LangChain with ChatGPT-3.5 turbo provided a secure environment, it exhibited some limitations in accurately retrieving metadata elements. Conversely, the ChatGPT-4 online service yielded relatively accurate results despite being unable to handle sensitive documents for security reasons. This exploration underscores the potential of utilizing ChatGPT technology to extract metadata in records management. With advancements in ChatGPT-related technologies, safer and more accurate results are expected to be achieved. Leveraging these advantages can significantly enhance the efficiency and productivity of tasks associated with managing records and metadata in archives.
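The question-based extraction described in the abstract above can be illustrated with a small, library-free sketch. It is an assumption-laden outline and not the authors' LangChain program: the field names in `FIELDS`, the prompt wording, the functions `build_prompt` and `extract_metadata`, and the stand-in `fake_model` are all hypothetical, and PDF text extraction and the actual model call are left out.

```python
import json

# Hypothetical metadata elements; the abstract does not list the ones used.
FIELDS = ["title", "creator", "date_created"]


def build_prompt(document_text, fields=FIELDS):
    """Ask for the requested elements as a single JSON object."""
    return (
        "Extract the following records management metadata elements from "
        "the document and answer with one JSON object whose keys are exactly "
        f"{', '.join(fields)}. Use null for any value that is not stated.\n\n"
        f"Document:\n{document_text}"
    )


def extract_metadata(document_text, ask_model, fields=FIELDS):
    """Send the prompt to `ask_model` (any callable str -> str) and parse it."""
    reply = ask_model(build_prompt(document_text, fields))
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        data = None
    if not isinstance(data, dict):
        # Unparseable or unexpected reply: leave every element empty
        return {field: None for field in fields}
    # Keep only the requested elements; missing ones become None
    return {field: data.get(field) for field in fields}


def fake_model(prompt):
    """Stand-in for a real language model call, for illustration only."""
    return json.dumps({"title": "Sample memo", "creator": None, "extra": "x"})


print(extract_metadata("Sample memo text", fake_model))
```

Passing the model as a plain callable keeps the sketch independent of any particular service, so the same parsing logic could sit behind either a locally run pipeline or an online one; values returned this way would still need review by a records professional.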
