  • P-ISSN 1013-0799
  • E-ISSN 2586-2073
  • KCI

A Proposal of Evaluation of Large Language Models Built Based on Research Data

Journal of the Korean Society for Information Management, (P)1013-0799; (E)2586-2073
2023, v.40 no.3, pp.77-98
https://doi.org/10.3743/KOSIM.2023.40.3.077
Na-eun Han (KISTI)
Sujeong Seo (KISTI)
Jung-ho Um (KISTI)

Abstract

Large language models (LLMs) have become a major trend in natural language processing. These models are built on research data, yet information such as the types, limitations, and risks of that data remains unknown. This study presents how to analyze and evaluate, from the perspective of research data, LLMs that were built with such data: LLaMA and LLaMA-based models such as Stanford's Alpaca and Vicuna from the Large Model Systems Organization, as well as OpenAI's ChatGPT. The quality evaluation focuses on the validity, functionality, and reliability aspects of Data Quality Management (DQM). Furthermore, we adopted the Holistic Evaluation of Language Models (HELM) to understand its evaluation criteria and then discussed its limitations. The study presents quality evaluation criteria for LLMs that use research data, along with future development directions.

Keywords
Large Language Model (LLM), Quality Evaluation for LLM, Research Data Quality Management (DQM), evaluation criteria for LLM
Submission Date
2023-08-16
Revised Date
2023-09-04
Accepted Date
2023-09-18
