Multilingual Abstract

Distributed computing efficiently stores and processes large data sets on a cluster of multiple machines, so its performance depends heavily on the state of the servers that make up the distributed system. In this paper, we propose a self-diagnosis system that collects log data in a distributed system, detects anomalies, and visualizes the results in real time. First, we divide the self-diagnosis process into five stages: collecting, delivering, analyzing, storing, and visualizing. Next, we design a real-time self-diagnosis system that meets three goals: real-time processing, scalability, and high availability. The proposed system is built on Apache Flume, Apache Kafka, and Apache Storm, which are representative real-time distributed technologies. In addition, to minimize the delay of log data processing during self-diagnosis, we use a simple but effective anomaly detection technique based on a moving average and the 3-sigma rule. With these results, one can build a distributed real-time self-diagnosis solution that diagnoses server status in real time in a complex distributed system.

Korean Abstract (Abstract)

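Distributed computing is a technology for efficiently storing and processing data in a distributed system composed of many servers. Its performance therefore depends heavily on the state of the servers that make up the system. This paper proposes a self-diagnosis system that collects the system-resource log data generated in real time in a distributed system, detects anomalies, and visualizes the results. First, we divide the self-diagnosis process into five stages: collection, delivery, analysis, storage, and visualization. Next, we design a real-time self-diagnosis system so that the process meets the goals of real-time processing, scalability, and high availability. The system is implemented on Apache Flume, Apache Kafka, and Apache Storm, which are representative real-time distributed technologies, and can thereby satisfy all three goals. In addition, to minimize the delay in processing log data during self-diagnosis, we use a simple but effective anomaly detection technique based on a moving average and the 3-sigma rule. With the results of this paper, a distributed real-time self-diagnosis system can be built that diagnoses server status in real time within a distributed system.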
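
The abstract names a moving average and a 3-sigma rule as the anomaly detection technique but gives no further detail. The following is a minimal Python sketch of that general idea, not the authors' implementation: the class name `ThreeSigmaDetector`, the window size, the `min_samples` warm-up, and the sample CPU readings are all assumptions made for illustration.

```python
from collections import deque
from math import sqrt


class ThreeSigmaDetector:
    """Flag a value that lies more than k standard deviations away from
    the moving average of the most recent `window` values.

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """

    def __init__(self, window=30, k=3.0, min_samples=5):
        self.values = deque(maxlen=window)  # sliding window of past values
        self.k = k
        self.min_samples = min_samples      # warm-up before any flagging

    def update(self, x):
        """Return True if x is anomalous, then add x to the window."""
        is_anomaly = False
        n = len(self.values)
        if n >= self.min_samples:
            mean = sum(self.values) / n
            variance = sum((v - mean) ** 2 for v in self.values) / n
            is_anomaly = abs(x - mean) > self.k * sqrt(variance)
        # The new value enters the window even when it is flagged,
        # which keeps the per-record work constant and simple.
        self.values.append(x)
        return is_anomaly


# Hypothetical CPU-usage readings (percent); only the 95 is flagged.
detector = ThreeSigmaDetector(window=10)
cpu = [50, 51, 49, 50, 52, 48, 50, 51, 49, 95, 50]
flags = [detector.update(v) for v in cpu]
```

Each call does a fixed amount of work bounded by the window size, which is consistent with the abstract's stated aim of keeping log-processing delay low. In a streaming setup, one such detector per server and metric would be a natural fit, though the abstract does not say how the authors partition the computation.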

References

1 P. Hunt, "Zookeeper: Wait-free Coordination for Internet-scale Systems" 1-6, 2010

2 K. Shvachko, "The Hadoop Distributed File System" 1-10, 2010

3 A. Toshniwal, "Storm@Twitter" 147-156, 2014

4 P. Goetz, "Storm Blueprints: Patterns for Distributed Real-time Computation" Packt Publishing 2014

5 M. Zaharia, "Spark: Cluster Computing with Working Sets" 10-, 2010

6 "MariaDB"

7 J. Dean, "MapReduce: Simplified Data Processing on Large Clusters" 51 (1): 107-113, 2008

8 J. Kreps, "Kafka: a Distributed Messaging System for Log Processing" 2011

9 "EsperTech Esper"

10 "D3.js"

11 J. Manyika, "Big Data: The Next Frontier for Innovation, Competition, and Productivity" McKinsey Global Institute 2011

12 "Apache Storm"

13 "Apache Kafka Cluster"

14 "Apache Flume"

15 S. Son, "Anomaly Detection for Big Log Data Using a Hadoop Ecosystem" 377-380, 2017

16 Y. Jeong, "An Integrated Self-Diagnosis System for an Autonomous Vehicle Based on an IoT Gateway and Deep Learning" 2018

17 W. Hu, "A Knowledge-Based Real-Time Diagnostic System for PLC Controlled Manufacturing Systems" 1999

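Date	Type	Details
2026	Evaluation scheduled	Eligible to apply for re-accreditation evaluation (re-accreditation)
2020-01-01	Evaluation	KCI-listed journal status maintained (re-accreditation)
2017-01-01	Evaluation	KCI-listed journal status maintained (continuing evaluation)
2013-01-01	Evaluation	KCI-listed journal status maintained (listing maintained)
2010-01-01	Evaluation	Selected as KCI-listed journal (candidate, 2nd round)
2009-01-01	Evaluation	Candidate 1st round PASS (candidate, 1st round)
2007-01-01	Evaluation	Selected as KCI candidate journal (new evaluation)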

Base year	WOS-KCI integrated IF (2-year)	KCIF (2-year)	KCIF (3-year)	KCIF (4-year)	KCIF (5-year)	Centrality index (3-year)	Immediacy index
2016	0.02	0.02	0.01	0.02	0.02	0.183	0.03

대용량 로그 데이터 처리를 위한 분산 실시간 자가 진단 시스템 = A Distributed Real-time Self-Diagnosis System for Processing Large Amounts of Log Data
