In this study, two experiments were conducted to classify articles related to Korean Studies among Korean newspaper articles published from 1920 to 1939. In the first experiment, classification models trained on 1920s, 1930s, and combined 1920s-1930s data were evaluated; the findings showed that a pre-processing method that converts Chinese characters to Hangul, used together with character n-gram features, led to relatively high performance. In the second experiment, the top-performing model from each period was used to classify data from various sub-periods. The results confirm that the model trained on both 1920s and 1930s data is the best-performing general-purpose model for most sub-periods.
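As a rough illustration of the pre-processing and feature extraction described above, the following Python sketch converts Chinese characters to Hangul and then counts character n-grams. It is not the study's implementation: the three-entry mapping `HANJA_TO_HANGUL`, the helper names `hanja_to_hangul` and `char_ngrams`, the sample string, and the choice of n = 2 are all assumptions made for the example. A real system would need a full conversion dictionary and would have to handle characters whose reading depends on their position in a word.

```python
from collections import Counter

# Hypothetical toy mapping used only for this example; a real pipeline
# would need a complete Chinese-character-to-Hangul dictionary.
HANJA_TO_HANGUL = {"朝": "조", "鮮": "선", "學": "학"}


def hanja_to_hangul(text: str) -> str:
    """Replace each mapped Chinese character with its Hangul reading.

    Characters without an entry in the mapping are left unchanged.
    """
    return "".join(HANJA_TO_HANGUL.get(ch, ch) for ch in text)


def char_ngrams(text: str, n: int = 2) -> Counter:
    """Count overlapping character n-grams (spaces included) in the text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Assumed sample input mixing Chinese characters and Hangul.
converted = hanja_to_hangul("朝鮮學 연구")
features = char_ngrams(converted, n=2)
```

The resulting counts could then serve as input features for any standard text classifier; which classifier and which value of n work best is an empirical question that this sketch does not address.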