RISS Search - Thesis Details

Korean Abstract

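One of the problems that can arise during a data mining analysis is the imbalanced data problem. When a supervised learning technique is applied and one class is abnormally large, the model assigns most patterns to the majority class in order to keep the overall misclassification small. Minority-class cases are then treated as majority-class cases, and a proper analysis cannot be carried out.
In this thesis, we generate data from the models of Friedman (1984) and Jang Yeoung-jae (2008), apply the Over Sampling, Under Sampling, Over+Under Sampling, and SMOTE sampling techniques, and examine the results. In addition, when various sampling techniques were applied to the Page Blocks Classification Data, the models using Over Sampling and Under Sampling gave the best results.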
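
As an illustration only, random Over Sampling and Under Sampling can be sketched in a few lines of Python. This is not the thesis's implementation: the function name `resample_to`, the toy 95/5 class split, and the choice of a midpoint target for the combined Over+Under scheme are all assumptions made for this sketch, since the abstract does not define how the combined scheme sets its class sizes.

```python
import random
from collections import Counter, defaultdict


def resample_to(X, y, target, seed=0):
    """Resample every class to exactly `target` rows (hypothetical helper).

    Classes with at least `target` rows are under-sampled without replacement;
    smaller classes keep all their rows and gain random duplicates.
    """
    rng = random.Random(seed)
    rows_by_class = defaultdict(list)
    for features, label in zip(X, y):
        rows_by_class[label].append(features)

    X_out, y_out = [], []
    for label, rows in rows_by_class.items():
        if len(rows) >= target:
            chosen = rng.sample(rows, target)  # under sampling
        else:
            extra = [rng.choice(rows) for _ in range(target - len(rows))]
            chosen = rows + extra  # over sampling by duplication
        X_out.extend(chosen)
        y_out.extend([label] * target)
    return X_out, y_out


# Toy imbalanced data (assumed): 95 majority rows (0) and 5 minority rows (1).
X = [[float(i)] for i in range(100)]
y = [0] * 95 + [1] * 5
sizes = Counter(y)

under = resample_to(X, y, min(sizes.values()))  # shrink majority to 5
over = resample_to(X, y, max(sizes.values()))   # grow minority to 95
both = resample_to(X, y, sum(sizes.values()) // len(sizes))  # meet at 50

print(Counter(under[1]), Counter(over[1]), Counter(both[1]))
```

Under Sampling discards majority-class information, while Over Sampling repeats minority rows exactly, which is why the two are often compared side by side, as the thesis does.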

Multilingual Abstract

In the data mining step, we face the imbalanced data problem. When a supervised learning technique is applied to an imbalanced data set, the model predicts that all observations belong to the majority group in order to reduce the misclassification rate. In this case, an appropriate analysis is not possible.
In this paper, we generate simulated data sets (Friedman, Jang Yeoung-jae). To classify accurately, we use the following sampling methods: Over Sampling, Under Sampling, Over+Under Sampling, and SMOTE. We also analyze the Page Blocks Classification Data using these methods. As a result, we find that Over Sampling and Under Sampling give excellent results.
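
Unlike plain Over Sampling, SMOTE creates new minority points by interpolating between a minority sample and one of its nearest minority neighbours. The following is a simplified sketch of that idea, not the thesis's code: the function name `smote`, its parameters, and the toy points are assumptions, and base samples are drawn at random rather than producing a fixed number of synthetic points per minority sample.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate `n_synthetic` points by SMOTE-style interpolation (sketch)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than points

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of `base` (Euclidean), excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # new point lies on the segment between `base` and `neighbour`
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class (assumed): four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_synthetic=6, k=2)
print(new_points)
```

Because every synthetic point lies between two existing minority points, the new samples stay inside the region the minority class already occupies instead of being exact copies.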

Table of Contents

1. Introduction
2. Research Background
3. Sampling Methods
3.1 Under Sampling
3.2 Over Sampling
3.3 Under + Over Sampling
3.4 SMOTE
4. Comparison of Model Fitting Results via Simulation
4.1 Friedman Model
4.2 Jang Yeoung-jae's Model
5. Analysis of the Page Blocks Classification Data
5.1 Data Description
5.2 SVM Fitting Results
6. Conclusion
References

Analysis of Imbalanced Data Using Sampling Methods (Sampling 방법을 이용한 불균형 데이터 분석)
