9 papers in this issue.
In this article, I introduce a method called ‘digital curation’, designed for humanities students who belong to the digital-native generation. Digital curation involves systematically organizing digital resources to create informative, narrative-rich digital content. The process relies on ‘semantic data’, information structured so that its own meaning and its relationships with other data are explicitly defined. Semantic data typically describes entities such as people, places, or things and delineates their interactions and connections. The primary outcome of digital curation is the creation of ‘digital archives’ built on semantic data. These archives serve as repositories for humanistic resources and effectively reproduce the subjects studied in the humanities as structured data. The potential benefits of such semantic data-based archives include (1) enhancing interdisciplinary integration, which facilitates communication and collaboration across fields and enriches our understanding of human knowledge and culture; (2) supporting educational uses, allowing educators and students to explore the interconnectedness of different cultural elements; and (3) preparing for future integration with AI, since the data format is well suited to AI processing, enabling a collaborative future between the humanities and artificial intelligence.
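To make the notion of ‘semantic data’ more concrete, the minimal Python sketch below represents entities (a person, a place, a work) and their relations as subject-predicate-object statements. The entities, relation names, and indexing approach are illustrative assumptions, not the data model of any particular archive.

```python
# A minimal sketch of 'semantic data' as subject-predicate-object statements.
# The entities and relations below are chosen for illustration only; an actual
# digital archive would draw them from curated humanities sources.
from collections import defaultdict

statements = [
    ("Yi Sang",  "isA",       "Person"),
    ("Seoul",    "isA",       "Place"),
    ("Ogamdo",   "isA",       "Work"),
    ("Yi Sang",  "creatorOf", "Ogamdo"),
    ("Yi Sang",  "livedIn",   "Seoul"),
]

# Index the statements so a reader (or a program) can follow the links
# between entities, which is what makes the archive narrative-rich.
outgoing = defaultdict(list)
for subj, pred, obj in statements:
    outgoing[subj].append((pred, obj))

# Trace every relation that starts from one entity.
for pred, obj in outgoing["Yi Sang"]:
    print(f"Yi Sang --{pred}--> {obj}")
```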
This paper explores the significance of Linked Data and the Semantic Web in the evolution of the web, examining the implications of Linked Data technology and its implementation. Linked Data is viewed as a key technical approach for realizing the Semantic Web, transforming the web into an extensive database accessible to all. The technology fosters a knowledge network that facilitates interdisciplinary research and generates novel insights. Its effectiveness, however, depends heavily on the quality and consistency of the underlying data, and data processing inevitably involves subjective decisions. Going forward, work on Linked Data must pursue the standardization and openness of data alongside technological advances, while balancing these goals with privacy concerns. The adoption of the technology also demands rigorous consideration of its ethical and social implications. The integration of Linked Data represents more than a technological shift; it promises substantial contributions to both academic research and industrial development. Nevertheless, it is important to acknowledge that the information society has not entirely supplanted traditional ways of accessing information, underscoring the need for critical engagement with, and careful deployment of, new technologies.
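As a concrete illustration of the Linked Data idea discussed above, the sketch below publishes a few RDF triples whose terms are URIs, so that independent datasets can point at the same entities. The choice of the rdflib library, the example.org URIs, and the FOAF vocabulary are assumptions made for illustration, not part of the paper.

```python
# A minimal Linked Data sketch using the rdflib library (an assumption; any
# RDF toolkit would do). URIs under example.org are placeholders, not real
# dereferenceable resources.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF, RDF

EX = Namespace("http://example.org/resource/")

g = Graph()
g.bind("foaf", FOAF)
g.bind("ex", EX)

# Each statement is a (subject, predicate, object) triple whose terms are
# URIs, which is what lets separate datasets link to the same entities.
g.add((EX.AdaLovelace, RDF.type, FOAF.Person))
g.add((EX.AdaLovelace, FOAF.name, Literal("Ada Lovelace")))
g.add((EX.AdaLovelace, FOAF.knows, EX.CharlesBabbage))

# Serialize to Turtle, a compact text format commonly used for Linked Data
# (rdflib 6+ returns a string here).
print(g.serialize(format="turtle"))
```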
This paper discusses how the scholarly genre of Indian literature known as śāstra can be encoded according to the TEI guidelines. While projects such as SARIT have proposed text-encoding standards for Indian literature, they focus mainly on encoding philological information. To better serve the needs of interpretively oriented researchers, interpretive information must also be defined as data. This paper therefore proposes to treat reference information (mentions, quotations) and concept-word usage information as data and to develop a schema that is TEI-conformant and SARIT-compatible. The resulting dataset will place the study of Indian philosophy on a more solid historical footing.
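One possible shape such data might take is sketched below with the Python standard library. The TEI elements used (quote, term) are standard, but the attribute values, pointers, and quoted text are invented placeholders, not the SARIT-compatible schema the paper actually proposes.

```python
# An illustrative sketch of treating references and concept words as data in
# TEI-style markup. <quote> and <term> exist in TEI, but the attribute choices
# and pointer targets here are assumptions for illustration only.
import xml.etree.ElementTree as ET

p = ET.Element("p")
p.text = "As the author states, "

# A quotation whose source is recorded as a pointer, making the reference
# itself queryable data rather than free-floating prose.
quote = ET.SubElement(p, "quote", {"source": "#source_text_1.1"})  # hypothetical pointer
quote.text = "a quoted passage from the source text"
quote.tail = ", which turns on the concept of "

# A concept word tagged against a reference list of technical terms.
term = ET.SubElement(p, "term", {"ref": "#pramana"})  # hypothetical concept ID
term.text = "pramāṇa"
term.tail = "."

print(ET.tostring(p, encoding="unicode"))
```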
This paper traces the history and development of the collaborative #DHmakes initiative to bring crafts into the mainstream of Anglophone digital humanities, starting in 2022 but building upon earlier calls to bridge the maker / craft divide. It argues for the importance of this work within digital humanities as a way of reconnecting technology with its textile roots, while also recognizing the creative and intellectual potential found in feminine-coded craft praxis. The history of different phases within digital humanities is often poorly documented. By recording the development of this recent movement – already a challenge following a mass exodus from Twitter, where many of these conversations initially took place – this paper captures the origins of an important moment in the field, as “digital humanities” was coming to understand itself more capaciously than the “digital” alone would imply.
One of the challenges faced by models for Korean morphological analysis is ambiguity: combinations of morphemes with entirely different base forms can share the same surface form in Korean, so a model must consider context to analyze them correctly. The morphological analyzer Kiwi addresses this issue by combining a statistical language model that captures local context with a Skip-Bigram model that captures global context. The proposed method achieves an average accuracy of 86.7% in resolving ambiguities, outperforming existing open-source morphological analyzers, particularly deep learning-based ones, which typically achieve between 50% and 70%. In addition, thanks to its optimized, lightweight model, Kiwi runs faster than other analyzers, making it well suited to analyzing large volumes of text. Released as open source, Kiwi is widely used in fields such as text mining, natural language processing, and the humanities. Although this study improves both the accuracy and the efficiency of morphological analysis, it shows limitations in handling the out-of-vocabulary problem and in analyzing Korean dialects, and further improvements are needed in these areas.
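The schematic sketch below (not Kiwi's actual implementation) illustrates how a local language-model score and a global Skip-Bigram score might be combined to resolve an ambiguous surface form. The candidate analyses, probabilities, and weighting are invented for illustration.

```python
# Schematic combination of a local (adjacent-pair) score and a global
# Skip-Bigram score for disambiguation. All numbers are toy values.
import math

def local_score(morphs, bigram_logprob):
    """Sum adjacent-pair log-probabilities: the 'local context' component."""
    return sum(bigram_logprob.get(pair, math.log(1e-6))
               for pair in zip(morphs, morphs[1:]))

def skip_bigram_score(morphs, context_words, skip_logprob):
    """Sum log-probabilities of (context word, morpheme) pairs regardless of
    distance: the 'global context' component."""
    return sum(skip_logprob.get((c, m), math.log(1e-6))
               for c in context_words for m in morphs)

def disambiguate(candidates, context_words, bigram_logprob, skip_logprob, alpha=0.5):
    """Pick the candidate morpheme sequence with the best combined score."""
    return max(candidates,
               key=lambda morphs: local_score(morphs, bigram_logprob)
               + alpha * skip_bigram_score(morphs, context_words, skip_logprob))

# Toy example: the surface form "난다" can be analyzed as 날(fly)+ㄴ다 or
# 나(sprout/occur)+ㄴ다; the nearby word "새" (bird) should tip the balance
# toward the "fly" reading even if the local model slightly prefers the other.
candidates = [("날", "ㄴ다"), ("나", "ㄴ다")]
bigram = {("날", "ㄴ다"): math.log(0.4), ("나", "ㄴ다"): math.log(0.5)}
skip = {("새", "날"): math.log(0.6), ("새", "나"): math.log(0.1)}
print(disambiguate(candidates, ["새"], bigram, skip))  # -> ('날', 'ㄴ다')
```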
The “Shakespearean Character Network” dataset leverages XML editions of Shakespeare’s plays from the Folger Shakespeare Library to analyze character interactions and dynamics within the plays. These XML files, which contain detailed textual data such as dialogue and stage directions, are processed with the Python script in the repository. The script generates matrices recording character presence on stage and their verbal exchanges, stored in directories such as output_onstage and output_exchange. Visualizations such as heatmaps and network graphs offer visual and quantifiable insight into character co-presence and communication patterns. Centrality measures and clustering indices computed for these interaction networks further enrich the analysis by quantifying the degree of character clustering and the intensity of their interactions. The dataset aims to provide a comprehensive view of the structural relationships in Shakespeare’s plays and is intended for researchers exploring the dynamics of Shakespearean characters through a combination of computational methods and literary analysis.
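As an illustration of the kind of analysis the dataset supports, the sketch below turns a small, invented exchange matrix into a weighted graph and computes centrality and clustering with networkx. The character names and counts are placeholders; the real matrices are those stored in the repository’s output_exchange and output_onstage directories.

```python
# Build a weighted interaction graph from a toy exchange matrix and compute
# simple network measures. Numbers are invented, not taken from the dataset.
import networkx as nx

characters = ["Hamlet", "Ophelia", "Claudius", "Gertrude"]
# exchange[i][j]: number of verbal exchanges between character i and j
exchange = [
    [0, 12, 9, 7],
    [12, 0, 2, 0],
    [9, 2, 0, 11],
    [7, 0, 11, 0],
]

G = nx.Graph()
for i, a in enumerate(characters):
    for j, b in enumerate(characters):
        if i < j and exchange[i][j] > 0:
            G.add_edge(a, b, weight=exchange[i][j])

# Degree centrality: how broadly a character is connected.
print(nx.degree_centrality(G))
# Clustering coefficient: how tightly a character's interlocutors also
# talk to one another.
print(nx.clustering(G))
```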
This paper describes the acquisition of the RAWDATA of Korean Modern and Contemporary Magazine Materials from the National Institute of Korean History (NIKH) and explores its data schema. To acquire the RAWDATA, a request for public data provision was submitted through the Public Data Portal, and a further request for data provision was submitted through Docu24; the RAWDATA of Korean Modern and Contemporary Magazine Materials was received as of March 27, 2024. The RAWDATA essentially follows the NIKH standard XML schema (history.dtd): <Level1> covers magazine information, <Level2> covers volume information, and <Level3> covers individual article information. The body of each article is divided into <paragraph> units. Contextual elements include index (object name), emph (emphasis), pTitle (title), name (author name), illustration (figure), and tableGroup (table), but these are available only for records that include body text, as not all magazines currently provide body-text information. The RAWDATA can be used for the analysis of modern literary language and of modern literary social networks. It is also expected to serve as foundation data for morpheme-analysis tools for modern literature and for translation into modern Korean, among other uses. We hope that the RAWDATA of Korean Modern and Contemporary Magazine Materials will become even richer through the collective intelligence of literary scholars.
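A hedged sketch of how the three-level structure described above might be traversed with the Python standard library follows. Only the element names mentioned in the abstract (Level1, Level2, Level3, paragraph, emph) are taken from the source; the sample content is invented, and the actual history.dtd may define additional attributes and nesting.

```python
# Walk the Level1 > Level2 > Level3 > paragraph hierarchy of the RAWDATA
# schema. The sample document below is invented for illustration.
import xml.etree.ElementTree as ET

sample = """
<Level1><!-- magazine-level record -->
  <Level2><!-- volume-level record -->
    <Level3><!-- individual article -->
      <paragraph>First paragraph of the article body.</paragraph>
      <paragraph>Second paragraph, possibly with <emph>emphasis</emph>.</paragraph>
    </Level3>
  </Level2>
</Level1>
"""

root = ET.fromstring(sample)
for level3 in root.iter("Level3"):
    # itertext() flattens nested contextual elements such as <emph>.
    paragraphs = ["".join(p.itertext()).strip() for p in level3.iter("paragraph")]
    print(f"article with {len(paragraphs)} paragraph(s)")
    for text in paragraphs:
        print(" ", text)
```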