https://www.riss.kr/link?id=A108377370
2022
Korean
KCI-listed
Academic journal
97-118 (22 pages)
Abstract
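Oversampling techniques based on generative adversarial networks (GANs) are known to perform well on class imbalance in unstructured data such as images, and a growing number of studies have therefore begun applying them to class imbalance in structured (tabular) data as well. These studies, however, have been criticized for converting the data into an unstructured form, which keeps them from accurately reflecting the characteristics of structured data. To address this, this study restructures a cycle generative adversarial network (cycle GAN) to fit the structure of structured data and combines it with SMOTE (synthetic minority oversampling technique) to propose a hybrid oversampling technique. In particular, unlike prior work, the generative adversarial network is built with a one-dimensional convolutional neural network (1D-CNN) in an attempt to overcome the limitations of earlier studies.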
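The SMOTE half of the hybrid can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how SMOTE and the cycle GAN are combined, so the code below shows only the standard SMOTE interpolation step, using the Python standard library. The function name `smote`, its parameters, and the toy rows are assumptions introduced for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        # The new row lies on the line segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two numeric features (made-up values).
minority_rows = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_rows = smote(minority_rows, n_new=6, k=3)
```

Because every synthetic row is a convex combination of two real minority rows, the output keeps the original column layout, which is the property the abstract emphasizes for structured data.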
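To evaluate the proposed technique, this study oversampled imbalanced structured datasets and compared the results with existing techniques such as SMOTE and ADASYN (adaptive synthetic sampling). The comparison showed that the proposed model performed better as the number of dimensions increased and as the imbalance became more severe.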
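The abstract does not state which evaluation metrics were used, so the following is only a hedged sketch of how the degree of imbalance and a minority-focused score might be computed when comparing oversamplers. The helper names `imbalance_ratio` and `minority_f1` and the toy labels are assumptions, not taken from the paper.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def minority_f1(y_true, y_pred, positive=1):
    """F1 score for the class treated as positive (here, the minority class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels: 18 majority rows (0) and 2 minority rows (1).
y_true = [0] * 18 + [1] * 2
always_majority = [0] * 20          # predicts the majority class everywhere
one_hit = [0] * 17 + [1, 1, 0]      # one true positive, one false positive, one miss

print(imbalance_ratio(y_true))                 # 9.0
print(minority_f1(y_true, always_majority))    # 0.0
print(minority_f1(y_true, one_hit))            # 0.5
```

A classifier that always predicts the majority class reaches 90% accuracy on these toy labels yet scores 0.0 on minority F1, which is why a minority-focused metric is a common choice when judging oversampling methods.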
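Unlike previous studies, this study is significant in that it improves classification performance through oversampling that reflects the characteristics of the minority class while preserving the structure of structured data.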