This paper is a first step toward developing a system that helps researchers analyze the collections of Korean literary works in the Korean Classics Database (Korean Classics DB) by means of data visualization, a relatively new technology.
The Korean Classics DB provides a variety of classical materials through an API. By crawling the data the API provides and storing it in a database, the data can be processed and visualized in an intuitive way. Because a computer can analyze large volumes of data quickly, it becomes possible not only to visualize a single collection of literary works but also to compare two collections and, further, to analyze multiple collections at the same time.
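As a minimal sketch of the store-then-compare step described above, the following Python example keeps crawled records in SQLite and ranks the characters whose relative frequency differs most between two collections. The table layout, the function names, and the sample records are assumptions made for illustration; they are not the schema or the method used in this study, and the API request itself is omitted.

```python
import sqlite3
from collections import Counter


def store_records(conn, records):
    """Store (collection, body) pairs; the schema is a hypothetical one."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents (collection TEXT, body TEXT)"
    )
    conn.executemany("INSERT INTO documents VALUES (?, ?)", records)
    conn.commit()


def char_frequencies(conn, collection):
    """Return the relative frequency of each non-space character."""
    counts = Counter()
    rows = conn.execute(
        "SELECT body FROM documents WHERE collection = ?", (collection,)
    )
    for (body,) in rows:
        counts.update(ch for ch in body if not ch.isspace())
    total = sum(counts.values())
    return {ch: c / total for ch, c in counts.items()} if total else {}


def compare_collections(conn, a, b, top=5):
    """Rank characters by the absolute difference in relative frequency."""
    fa, fb = char_frequencies(conn, a), char_frequencies(conn, b)
    diffs = {ch: fa.get(ch, 0.0) - fb.get(ch, 0.0) for ch in set(fa) | set(fb)}
    return sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Placeholder records standing in for text fetched from the API.
    store_records(conn, [("A", "山山水"), ("B", "水水山")])
    for ch, diff in compare_collections(conn, "A", "B"):
        print(ch, round(diff, 3))
```

Relative rather than absolute frequencies are used so that collections of very different lengths can be compared on the same scale.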
One of the most representative examples of data visualization is the word cloud, which renders words in different sizes or colors according to how frequently they are used. For Hangul text, word clouds are based on the King Sejong dictionary, which is used as the standard Korean dictionary. However, the data provided by the Korean Classics DB API is written in Chinese characters, which are structurally different from Hangul or English. Although the system is still at a rudimentary stage, this study attempted to develop a collection analysis system that visualizes the data by applying analysis algorithms suited to the collections in the Korean Classics DB and then interprets the results in the context of the humanities.
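One way to obtain word-cloud input without a word dictionary is to count character n-grams instead of dictionary words. The sketch below illustrates that idea only; it is not the algorithm used in this study. The function names, the restriction to the basic CJK Unified Ideographs range (U+4E00 to U+9FFF), and the linear font-size scaling are all assumptions chosen for the example.

```python
import re
from collections import Counter

# Runs of characters in the basic CJK Unified Ideographs block.
# Punctuation and other scripts act as boundaries between runs.
HAN_RUN = re.compile(r"[\u4e00-\u9fff]+")


def han_ngram_counts(text, n=1):
    """Count character n-grams inside each unbroken run of Han characters."""
    counts = Counter()
    for run in HAN_RUN.findall(text):
        for i in range(len(run) - n + 1):
            counts[run[i:i + n]] += 1
    return counts


def scale_sizes(counts, min_size=10, max_size=60):
    """Map counts linearly onto font sizes for a word-cloud renderer."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    if hi == lo:
        return {token: max_size for token in counts}
    step = (max_size - min_size) / (hi - lo)
    return {token: min_size + (c - lo) * step for token, c in counts.items()}


if __name__ == "__main__":
    sample = "天地玄黃，宇宙洪荒。天地"
    bigrams = han_ngram_counts(sample, n=2)
    print(bigrams.most_common(3))
    print(scale_sizes(bigrams))
```

Because n-grams are counted only within a run, a sequence that spans a punctuation mark is never treated as one unit.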
The collections of literary works in the Korean Classics DB are provided through an API in a fixed format. If the data were processed into more richly structured XML before being provided, literature analysis systems of this kind would be expected to perform better. Moreover, if more experts join this research, all the collections in the Korean Classics DB could eventually be analyzed.
In fact, this research did not produce satisfactory visualization results under its limited conditions. Nevertheless, we consider it meaningful, since the visualization process exposed numerous problems and allowed us to set a new direction for visualization research. We hope to build on the possibilities and overcome the limitations identified here, which we expect will be of considerable help to future research on the Korean classics.