This study analyzes the associations among the subject areas of big data research papers. The subject groups used as units of analysis were extracted by applying co-citation networks; association rules were then mined with the Apriori algorithm in R and visualized with the arulesViz package. In total, 22 subject areas were extracted and divided into three clusters. The association patterns of the subjects were classified as ‘professional type’, ‘general type’, or ‘expanded type’ according to the complexity of the association. The professional type included library and information science and journalism. The general type included politics & diplomacy, trade, and tourism. The expanded type included other humanities, general social sciences, and general tourism. These association networks show that citing one subject field tends to go together with citing other relevant subject areas, and libraries should consider academic information services that make use of these associations.
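The measures behind association rules of this kind can be illustrated with a small sketch. It is not the Apriori algorithm or the R packages used in the study: it enumerates pairwise rules by brute force and computes support, confidence, and lift. The toy transactions (sets of subject areas cited together), the thresholds, and the function name `pair_rules` are assumptions made for illustration.

```python
from itertools import permutations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return rules (antecedent, consequent, support, confidence, lift)
    for single-item antecedents and consequents, by brute force."""
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def support(itemset):
        # Share of transactions that contain every item in itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    rules = []
    for a, b in permutations(items, 2):
        s_ab = support({a, b})
        if s_ab < min_support:
            continue
        confidence = s_ab / support({a})
        if confidence < min_confidence:
            continue
        lift = confidence / support({b})
        rules.append((a, b, s_ab, confidence, lift))
    return rules


# Hypothetical data: each set lists the subject areas cited by one paper.
papers = [
    {"library science", "journalism"},
    {"library science", "journalism", "tourism"},
    {"library science", "trade"},
    {"trade", "tourism"},
]

for a, b, s, c, l in pair_rules(papers):
    print(f"{a} -> {b}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A lift above 1 indicates that the two subject areas are cited together more often than would be expected if they were independent.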
This study draws on the current momentum to diversify open government data research through multidimensional scaling and model development. Given the paucity of such research in the field, it formulates a quality assessment model applicable to library open data. The model was developed using the Delphi method and then tested for validity and reliability through a survey administered to library open data users. The fourth Delphi round yielded an average of 4.00 for all measured elements and a minimum validity of .75, indicating that the model is appropriate for quality assessments of library open data. The convergence and stability results from the expert panel fell below .50, confirming that no further survey rounds were needed to establish the validity of the Delphi results. The model's reliability was likewise .60 or above in all three dimensions. The model completed with the input of the Delphi panel was then put through a verification process in which library open data users, such as domestic and international librarians, developers, and open data activists, reviewed it for validity and reliability. In this user verification, the model scored low on validity because some measured factors and elements pertaining to the three dimensions failed to load. Reliability, on the other hand, was .60 or above for all dimensions and measured elements.
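The abstract does not say which formulas produced the validity and stability figures. Delphi studies often use Lawshe's content validity ratio (CVR) for validity and the coefficient of variation for stability, and the sketch below assumes those two measures. The ratings, the 5-point scale, the cut-off of 4 for counting a rating as "valid", and the function names are hypothetical.

```python
import statistics


def content_validity_ratio(ratings, valid_from=4):
    """Lawshe's CVR: (n_valid - N/2) / (N/2), ranging from -1 to 1."""
    n = len(ratings)
    n_valid = sum(1 for r in ratings if r >= valid_from)
    half = n / 2
    return (n_valid - half) / half


def coefficient_of_variation(ratings):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(ratings) / statistics.mean(ratings)


# Hypothetical ratings of one element by an eight-member panel.
panel = [5, 4, 4, 5, 4, 3, 4, 4]
print(content_validity_ratio(panel))    # 0.75
print(coefficient_of_variation(panel))  # well below 0.5 for this panel
```

Under this assumed reading, a low coefficient of variation means the panel's ratings have settled, which is the usual argument for stopping after a given round.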
Fiction is a collection that most students read and borrow in school libraries. However, the KDC has several limitations when students look for the fiction books they need. To address this, we surveyed fiction classification schemes used in libraries, bookstores, and publishers, as well as the fiction use behaviors of middle school students. Based on the survey results, we proposed a better way of classifying fiction books according to user needs. In addition to the KDC number, color bands were attached according to genre so that users could easily find the desired books. These suggestions and other information will enhance the accessibility and discoverability of fiction books for middle school students and may serve as reference material for fiction classification in libraries, bookstores, and publishers in the future.
This study presents a model of digital archiving based on E-ARK. It analyzed the international standards and technical specifications designed for digital archiving, focusing on the common specifications needed for digital archiving, including core processes, information packages, and metadata structure. Based on this analysis and review, the study developed a digital archiving model intended to achieve interoperability of information packages throughout the archiving process.
This study examined differences in job stress, depression, and state anxiety levels according to the sociological characteristics of records managers, and tested whether state anxiety significantly mediates the effect of job stress on depression. We distributed a questionnaire covering 9 factors, including sociological characteristics, to records managers, asking about their job stress, depression, and state anxiety levels, and collected 98 completed questionnaires. We analyzed the mediating effect in the survey data using regression analysis. The results showed a full mediating effect of state anxiety between job stress and depression, which suggests that the state anxiety levels of records managers must be managed in order to lower their depression levels.
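A regression-based mediation test of this kind is often run as a sequence of ordinary least squares regressions in the style of Baron and Kenny; the abstract does not name the exact procedure, so that choice is an assumption here. The sketch below estimates the total effect c, the paths a and b, and the direct effect c' using only the standard library. The data are invented toy numbers, not the study's survey responses, and significance testing is omitted.

```python
def _cross(u, v):
    # Sum of cross-products of deviations from the means.
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def simple_slope(x, y):
    """Slope of the OLS regression of y on x."""
    return _cross(x, y) / _cross(x, x)


def two_predictor_slopes(x, m, y):
    """Slopes (b_x, b_m) of the OLS regression of y on x and m."""
    sxx, smm, sxm = _cross(x, x), _cross(m, m), _cross(x, m)
    sxy, smy = _cross(x, y), _cross(m, y)
    det = sxx * smm - sxm ** 2
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    return (sxy * smm - smy * sxm) / det, (smy * sxx - sxy * sxm) / det


def mediation_paths(stress, anxiety, depression):
    c = simple_slope(stress, depression)   # total effect
    a = simple_slope(stress, anxiety)      # predictor -> mediator
    c_prime, b = two_predictor_slopes(stress, anxiety, depression)
    return {"c": c, "a": a, "b": b, "c_prime": c_prime, "indirect": a * b}


# Toy data built so that depression depends on stress only through anxiety.
stress = [1, 2, 3, 4, 5, 6]
anxiety = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
depression = [3 * v for v in anxiety]
print(mediation_paths(stress, anxiety, depression))
```

Full mediation corresponds to a total effect c that is clearly non-zero while the direct effect c' shrinks to roughly zero once the mediator is included, as in this constructed example.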
This research aims to identify various aspects of user studies and research in practice, and to propose collaboration methods through empirical analysis of the data. To determine the applicability of user studies to other subject areas, we measured the degree of keyword overlap between user studies and User Experience (UX), one of the research-in-practice disciplines. Quantitative information science methods, including simple frequency analysis, were applied to more than ten thousand published papers to generate network maps and rankings as well as a comparative analysis over time. The analysis showed slightly less overlap between user studies and UX in domestically published articles than in international ones. It also revealed a relationship between actual occurrences of collaboration and keyword overlap. The temporal analysis showed that keyword overlap between the two disciplines is increasing, which suggests that active convergence can be expected in the future.
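The abstract does not specify how keyword overlap was quantified. One simple possibility, assumed here purely for illustration, is the Jaccard coefficient between the keyword sets of the two fields: shared keywords divided by all distinct keywords. The keyword lists and the function name are hypothetical.

```python
def jaccard_overlap(keywords_a, keywords_b):
    """Jaccard coefficient of two keyword lists, ignoring case and
    surrounding whitespace. Returns 0.0 when both lists are empty."""
    a = {k.strip().lower() for k in keywords_a}
    b = {k.strip().lower() for k in keywords_b}
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical author keywords from the two fields.
user_studies = ["usability", "Information seeking", "user needs"]
ux = ["Usability", "interaction design", "user needs"]
print(jaccard_overlap(user_studies, ux))  # 0.5
```

Computing the same coefficient per publication year would give the kind of time series that a temporal comparison needs.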
As data management and processing techniques have developed rapidly in the era of big data, many companies and researchers have become interested in long tail data, which were ignored in the past. This study proposes methods for generating and controlling a network of technical terms based on text mining techniques in order to enhance the utilization of data in the long tail of the distribution. In particular, an edit distance technique from text mining provided an efficient way to automatically create an interlinked network of technical terms in the scholarly field. We also used a linked open data (LOD) system to gather experimental data, and proposed methods for using LOD data together with an algorithm for recognizing term patterns. Finally, a performance evaluation of the network of technical terms showed that the proposed methods were useful for increasing the rate of data utilization.
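Edit distance linking can be sketched as follows. The abstract does not say which edit distance variant or threshold was used, so this example assumes plain Levenshtein distance (insertions, deletions, and substitutions, each costing 1) and links two terms when their distance is at most a chosen maximum. The terms, the threshold, and the function names are hypothetical.

```python
from itertools import combinations


def levenshtein(s, t):
    """Levenshtein distance via dynamic programming, keeping two rows."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def link_terms(terms, max_distance=1):
    """Return edges (term_a, term_b, distance) for near-identical terms."""
    edges = []
    for a, b in combinations(terms, 2):
        d = levenshtein(a, b)
        if d <= max_distance:
            edges.append((a, b, d))
    return edges


print(levenshtein("kitten", "sitting"))                   # 3
print(link_terms(["metadata", "meta-data", "ontology"]))  # [('metadata', 'meta-data', 1)]
```

Comparing every pair of terms is quadratic in the vocabulary size, so a real term network would likely need blocking or indexing before this step.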
As users’ information use environment shifts to the Web, archives are providing more services on the Web than before. This study analyzes users’ recent inflow routes and the 100 highest-ranked search terms of each month over ten and a half years on the Web site of the National Archives of Korea, and suggests suitable information services. The analysis showed that inflow routes could be divided into access from portal sites, by country, from related institutions, and via mobile platforms. Over the ten and a half years, the most frequently searched term was ‘Land Survey Register’, which was also searched with steady interest throughout the period. Other government documents and official gazettes were also of great interest to users. By identifying the most frequently and most steadily searched terms, we were able to group the search terms broadly into land, the Japanese colonial period, the Korean War and relations between North Korea and South Korea, and records management and use. Based on these results, we suggested strengthening the connection of the National Archives Web site with portal sites and mobile platforms, and upgrading the search services of the National Archives. This study confirmed that analyzing Web logs and user search terms can yield meaningful results that enhance user services in archives.
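Identifying steadily searched terms from monthly rankings can be sketched as below. The example assumes the input is one ranked term list per month and treats a term as "steady" when it appears in every month's list; the study's own criterion is not given in the abstract. The sample data and the function name are hypothetical, and raw search counts are not modelled.

```python
from collections import Counter


def summarize_monthly_terms(monthly_top_terms):
    """monthly_top_terms maps a month label to that month's top terms.
    Returns (months_seen, steady): how many months each term was ranked
    in, and the sorted terms that were ranked in every month."""
    months_seen = Counter()
    for terms in monthly_top_terms.values():
        months_seen.update(set(terms))  # count each term once per month
    n_months = len(monthly_top_terms)
    steady = sorted(t for t, k in months_seen.items() if k == n_months)
    return months_seen, steady


# Hypothetical rankings for three months.
monthly = {
    "month 1": ["land survey register", "official gazette"],
    "month 2": ["land survey register", "korean war"],
    "month 3": ["official gazette", "land survey register"],
}
months_seen, steady = summarize_monthly_terms(monthly)
print(steady)                           # ['land survey register']
print(months_seen["official gazette"])  # 2
```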
This study analyzed Korean university libraries’ holdings of Western language books published in 2003 and 2013, using the KERIS union catalog, in order to investigate changes in the libraries’ collection development of Western language books. To this end, three new collection indexes were proposed: the holding h-index, CUI (Collection Uniqueness Index), and CCHR (Common Collection Holding Ratio). They were used together with basic indexes such as the number of titles, the number of books, and the number of books per title. The analysis reveals that, compared with books published in 2003, the number of titles published in 2013 decreased by 16.1%, and the number of books dropped more sharply, by 42.2%. For the 2013 publications, CCHR also decreased while CUI increased. By subject, among the DDC main classes, 0XX (Generalities) showed the greatest rate of decrease in both titles and books because of a radical reduction in computer-related books. In terms of individual libraries’ holdings, the number of Western language books held by the top libraries increased for books published in 2013.
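The abstract names a holding h-index without defining it. By analogy with the citation h-index, one plausible reading, which is an assumption and may differ from the paper's definition, is the largest h such that h titles are each held by at least h libraries. The sketch below computes that value from a list of per-title holding counts; the counts and the function name are hypothetical.

```python
def holding_h_index(holding_counts):
    """Largest h such that h titles are each held by at least h libraries
    (assumed definition, modelled on the citation h-index)."""
    ranked = sorted(holding_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical number of holding libraries for each of five titles.
print(holding_h_index([10, 8, 5, 4, 3]))  # 4
```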
In this paper, we propose a new methodology for extracting and formalizing subjective topics at a specific time using a set of keywords extracted automatically from online news articles. We first extracted a set of keywords using TF-IDF, which was selected through a series of comparative experiments on various statistical weighting schemes that measure the importance of individual words in a large set of texts. To calculate the semantic relations between the extracted keywords effectively, we constructed a set of word embedding vectors from about 1,000,000 separately collected news articles. The extracted keywords were represented as numerical vectors and clustered with the K-means algorithm. A qualitative in-depth analysis of the resulting keyword clusters showed that most clusters were appropriate topics with sufficient semantic concentration for labels to be assigned easily.
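The keyword extraction step can be illustrated with a minimal TF-IDF sketch. TF-IDF has several variants and the abstract does not say which was used; this example assumes term frequency normalized by document length and an unsmoothed inverse document frequency log(N / df). The embedding and K-means steps are not covered. The tokenized toy documents and the function names are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs is a list of token lists. Returns one {word: score} dict per
    document, with score = (count / doc length) * log(N / df)."""
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency counts each doc once
    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({w: (c / total) * math.log(n_docs / df[w])
                       for w, c in counts.items()})
    return scores


def top_keywords(doc_scores, k=2):
    """The k highest-scoring words, with ties broken alphabetically."""
    return sorted(doc_scores, key=lambda w: (-doc_scores[w], w))[:k]


docs = [
    ["election", "vote", "news"],
    ["storm", "rain", "news"],
    ["election", "poll", "news"],
]
scores = tf_idf(docs)
print(top_keywords(scores[1]))  # ['rain', 'storm']
print(scores[0]["news"])        # 0.0, because "news" occurs in every document
```

Words that occur in every document receive a score of zero under this variant, which is why generic terms drop out of the keyword set.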