RISS Search - Theses and Dissertations

1
Efficient Improvement Methods for BPNN and Their Application to a Concept-Based Text Categorization System

Li Chenghua, Graduate School of Chonbuk National University, 2009, domestic doctoral dissertation

RANK : 248655
Due to the increased availability of documents in digital form and the ensuing need to organize them, automatic text categorization has gained a prominent status in the information systems and data mining fields. Many machine learning algorithms have been applied to text categorization tasks. Traditional text categorization systems are mostly based on the bag-of-words (vector) representation. However, this representation produces a high-dimensional feature space and ignores the relationships among terms and among documents, which reduces categorization efficiency and accuracy. In this dissertation, we use concept-based text categorization, which combines a statistical method and a thesaurus-based method to improve categorization performance. We employ K-Nearest Neighbor (KNN) and the Back Propagation Neural Network (BPNN) as text classifiers. KNN is a simple and well-known approach to text categorization. BPNN has been widely used in classification and pattern recognition. However, the standard BPNN has some generally acknowledged limitations, such as slow training and a tendency to become trapped in local minima. This dissertation proposes two effective refinement strategies for BPNN and applies them to concept-based text categorization systems. These methods can speed up neural network training as well as alleviate the problem of being trapped in a local minimum. We conduct experiments on the standard Reuters-21578 and 20 Newsgroups data sets. Experimental results show that the proposed methods achieve high categorization effectiveness as measured by precision, recall and F-measure.
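For readers unfamiliar with the three evaluation measures named in the abstract, the following is a minimal sketch of how precision, recall and F-measure (here the balanced F1) are commonly computed for a single category. It is not taken from the dissertation: the function name `precision_recall_f1` and the toy label lists are hypothetical and chosen only for illustration.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Compute precision, recall and F1, treating one category as positive."""
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: documents of the category that were assigned to it.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: other documents wrongly assigned to the category.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: documents of the category that were missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up toy labels (not results from the dissertation).
true_labels = ["earn", "earn", "acq", "acq", "earn"]
predicted_labels = ["earn", "acq", "acq", "earn", "earn"]
p, r, f = precision_recall_f1(true_labels, predicted_labels, "earn")
print(p, r, f)
```

In a multi-category setting such as Reuters-21578, per-category values like these are typically averaged, but the abstract does not say which averaging scheme was used.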
2
Text Categorization Based on Artificial Neural Networks (ANN)

Li Chenghua, Graduate School of Chonbuk National University, 2006, domestic master's thesis

RANK : 248639
Abstract. Li Chenghua, Department of Information and Communication, Chonbuk National University. Text categorization is an important application of machine learning in the field of document information retrieval. This thesis describes two kinds of neural networks for text categorization: multi-output perceptron learning (MOPL) and the back propagation neural network (BPNN). BPNN has been widely used in classification and pattern recognition. However, it has some generally acknowledged defects, which usually stem from some morbidity neurons. In this thesis, I propose a novel adaptive learning approach for text categorization using an improved back propagation neural network. This algorithm can overcome some shortcomings of the traditional back propagation neural network, such as slow training and a tendency to fall into local minima. I test the three methods on the standard Reuters-21578 collection and compare their training time and performance. The results show that the proposed algorithm achieves high categorization effectiveness as measured by precision, recall and F-measure.
