Low-resource YouTube comment encoding for Luganda sentiment classification performance
Abdul Male Ssentumbwe, YuChul Jung (정유철), Hyunah Lee (이현아), Byeong Man Kim (김병만). Digital Contents Society (한국디지털콘텐츠학회), 2020. Journal of Digital Contents Society (한국디지털콘텐츠학회논문지), Vol. 21, No. 5
The recent boom in social network usage has generated multilingual opinion data for low-resource languages. Although Luganda is one of the major languages of Uganda, it remains a low-resource language, and Luganda corpora for sentiment analysis, especially from YouTube, are not readily available. In this paper, we propose assumptions to guide the collection of Luganda comments from Luganda YouTube videos for sentiment analysis. We evaluate the suitability of our cleaned dataset of 158 YouTube comments for sentiment analysis using selected machine learning and deep learning classification algorithms. Given the low-resource setting, the dataset performs best with Gaussian Naive Bayes among the machine learning algorithms (55%) and with a Multilayer Perceptron sequential model among the deep learning algorithms (68.8%), when 10% of the dataset is held out as the test set and Luganda comment segmentation is applied.
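To make the 10% test split concrete, the following is a minimal sketch of a hold-out split over 158 items. It is not the authors' code: the function name `holdout_split`, the fixed seed, the integer placeholder IDs standing in for comments, and the choice to round the test-set size up are all assumptions made for illustration.

```python
import math
import random


def holdout_split(items, test_fraction=0.10, seed=0):
    """Shuffle items and hold out a fraction of them as the test set.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    # Assumption: the test-set size is rounded up to a whole comment.
    n_test = math.ceil(len(items) * test_fraction)
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder IDs stand in for the 158 cleaned comments.
comment_ids = list(range(158))
train_ids, test_ids = holdout_split(comment_ids)
print(len(train_ids), len(test_ids))  # 142 16
```

Under these assumptions the test set holds only 16 comments, so a single comment shifts accuracy by 6.25 percentage points, which is worth keeping in mind when comparing classifiers on a dataset this small.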