1 Schneider, S., "wav2vec: Unsupervised pre-training for speech recognition"
2 Baevski, A., "wav2vec 2.0: A framework for self-supervised learning of speech representations"
3 Baevski, A., "vq-wav2vec: Self-supervised learning of discrete speech representations"
4 Karlgren, J., "Usefulness of sentiment analysis" 2012
5 Wu, M., "Transformer Based End-to-End Mispronunciation Detection and Diagnosis" 2021 : 3954-3958, 2021
6 Paleari, M., "Towards multimodal emotion recognition: A new approach" 174-181, 2010
7 Jiao, X., "TinyBERT: Distilling BERT for Natural Language Understanding" 2020
8 Seehapoch, T., "Speech emotion recognition using support vector machines" 2013
9 Nwe, T. L., "Speech emotion recognition using hidden Markov models" 41 (41): 603-623, 2003
10 Han, K., "Speech emotion recognition using deep neural network and extreme learning machine" 2014
11 Peng, Z., "Shrinking Bigfoot: Reducing wav2vec 2.0 footprint"
12 Liu, Y., "RoBERTa: A robustly optimized BERT pretraining approach"
13 Kolesnikov, A., "Revisiting self-supervised visual representation learning" 2019
14 Kim, K. -H., "Predicting the success of bank telemarketing using deep convolutional neural network" 2015
15 Tsai, Y. -H. H., "Multimodal transformer for unaligned multimodal language sequences" 2019
16 Yoon, S., "Multimodal speech emotion recognition using audio and text" 2018
17 Majumder, N., "Multimodal sentiment analysis using hierarchical fusion with context modeling" 161 : 124-133, 2018
18 Gu, Y., "Multimodal affective analysis using hierarchical attention strategy with word-level alignment" Proceedings of the Conference, Association for Computational Linguistics, 2225-2235, 2018
19 Siriwardhana, S., "Multimodal Emotion Recognition With Transformer-Based Self Supervised Feature Fusion" 8 : 176274-176285, 2020
20 Khan, A. U., "MMFT-BERT: Multimodal fusion transformer with BERT encodings for visual question answering"
21 Tsai, Y. H. H., "Learning factorized multimodal representations"
22 Xu, H., "Learning Alignment for Multimodal Emotion Recognition from Speech"
23 Bang, J. -U., "KsponSpeech: Korean spontaneous speech corpus for automatic speech recognition" 10 (10): 6936-, 2020
24 Siriwardhana, S., "Jointly Fine-Tuning 'BERT-like' Self Supervised Models to Improve Multimodal Speech Emotion Recognition"
25 Park, J., "HanBERT: Pretrained BERT Model for Korean"
26 Selvaraju, R. R., "Grad-CAM: Visual explanations from deep networks via gradient-based localization" 2017
27 Fayek, H. M., "Evaluating deep learning architectures for Speech Emotion Recognition" 92 : 60-68, 2017
28 Bojanowski, P., "Enriching word vectors with subword information" 5 : 135-146, 2017
29 Kwon, O. -W., "Emotion recognition by speech signals" 2003
30 Pepino, L., "Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings"
31 Sanh, V., "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter"
32 Rawat, S., "Digital life assistant using automated speech recognition" 2014
33 Devlin, J., "BERT: Pre-training of deep bidirectional transformers for language understanding"
34 Pantic, M., "Affective multimodal human-computer interaction" 669-676, 2005
35 Kingma, D. P., "Adam: A Method for Stochastic Optimization" 2015
36 Chen, T., "A simple framework for contrastive learning of visual representations" 119 : 1597-1607, 2020
37 McDuff, D., "A multimodal emotion sensing platform for building emotion-aware applications"