This study aimed to build a system that automates the evaluation of students' English essays using Recurrence over BERT (RoBERT), a state-of-the-art deep learning model. Evaluating English essays is inherently time-consuming and may reflect teacher bias, and English teachers are often required to evaluate many essays in a short period of time. Automated essay scoring (AES) can address these problems because it evaluates essays quickly and without teacher bias. In this paper, the RoBERT model was trained and evaluated on Essay Set #8 of the Automated Student Assessment Prize (ASAP) dataset. Five-fold cross-validation was used to allow a fair comparison with previously proposed AES models. The RoBERT model showed higher agreement with the human raters' resolved scores than the previous models on 5 of the 6 trait scores. Its advantage is that it can use the pre-trained BERT model while handling long inputs, overcoming the input size limit of BERT. The results indicate that the RoBERT model works well for trait-specific evaluation of long essays. Thus, the RoBERT model can serve as an auxiliary tool for automating the evaluation of students' essays and reducing the excessive workload of English teachers.
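As an illustration of how a long essay can be brought under the BERT input size limit, the following minimal sketch splits a token sequence into overlapping segments that could each be encoded separately before a recurrent layer combines them. The function name `split_into_segments` and the default segment size (200 tokens) and overlap (50 tokens) are assumptions chosen for illustration, not settings reported in this study.

```python
from typing import List


def split_into_segments(
    tokens: List[str], size: int = 200, overlap: int = 50
) -> List[List[str]]:
    """Split a token list into overlapping fixed-size segments.

    `size` and `overlap` are illustrative assumptions, not values
    taken from the paper. Consecutive segments share `overlap` tokens.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    if not tokens:
        return []

    stride = size - overlap  # how far the window advances each step
    segments: List[List[str]] = []
    start = 0
    while True:
        segments.append(tokens[start:start + size])
        # Stop once the current window reaches the end of the essay.
        if start + size >= len(tokens):
            break
        start += stride
    return segments


# Hypothetical 500-token essay, represented by token indices as strings.
essay_tokens = [str(i) for i in range(500)]
segments = split_into_segments(essay_tokens)
```

Each segment stays well below the 512-token limit of the standard BERT model, and the overlap preserves some context across segment boundaries. In a RoBERT-style pipeline, the per-segment representations would then be passed in order to a recurrent layer to produce one score per trait.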