RISS Academic Research Information Service

      • Stream Data Mining: Platforms, Algorithms, Performance Evaluators and Research Trends

Bakshi Rohit Prasad, Sonali Agarwal, Security Engineering Research Support Center, 2016, International Journal of Database Theory and Application Vol.9 No.9

Streaming data are a potentially infinite sequence of incoming data elements that arrive at very high speed and may evolve over time. This poses several challenges for mining large-scale, high-speed data streams in real time, and the field has therefore attracted considerable research attention in recent years. This paper discusses the various challenges associated with mining such data streams. Several available stream data mining algorithms for classification and clustering are presented along with their key features and significance. The significant performance evaluation measures relevant to streaming data classification and clustering are also explained, and their comparative significance is discussed. The paper surveys the streaming data computation platforms that have been developed, discussing each chronologically along with its major capabilities, and identifies open research directions in high-speed, large-scale data stream mining from the points of view of algorithms, evolving data, and performance evaluation. Finally, the Massive Online Analysis (MOA) framework is used as a case study to show the results of key streaming data classification and clustering algorithms on a sample benchmark dataset; their performance is critically compared and analyzed using evaluation parameters specific to streaming data mining.
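The test-then-train (prequential) evaluation that MOA applies to streaming classifiers can be illustrated with a minimal sketch. The majority-class learner below is only a stand-in for real streaming classifiers such as Hoeffding trees, and the toy stream is illustrative:

```python
from collections import Counter

class MajorityClassLearner:
    """Trivial incremental learner: predicts the most frequent label seen so far."""
    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Before any training data arrives, fall back to a default label.
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn(self, x, y):
        self.counts[y] += 1

def prequential_accuracy(stream, learner):
    """Test-then-train: each element is first used for testing, then for training."""
    correct = total = 0
    for x, y in stream:
        if learner.predict(x) == y:
            correct += 1
        total += 1
        learner.learn(x, y)   # each element is examined exactly once
    return correct / total if total else 0.0

stream = [(i, "a") for i in range(8)] + [(i, "b") for i in range(8, 10)]
acc = prequential_accuracy(stream, MajorityClassLearner())   # -> 0.7
```

Prequential accuracy naturally penalizes the learner when the stream drifts (here, when the label changes from "a" to "b"), which is why it is a common measure for evolving streams.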

      • A Hybrid Clustering Algorithm for Outlier Detection in Data Streams

S. Vijayarani, P. Jothi, Security Engineering Research Support Center, 2016, International Journal of Grid and Distributed Computing Vol.9 No.11

In recent years, data streams have gradually become one of the most important research areas in computer science. A data stream is a fast, unbounded, continuous arrival of ordered or unordered data. Data streams are divided into two types: online and offline. Online data streams are mainly used in real-world applications such as Facebook, Twitter, network traffic monitoring, intrusion detection, and credit card processing; offline data streams are mainly used for manipulating information based on web log streams. Because the data size is extremely large and potentially infinite, it is not possible to store all the data, which leads to mining challenges under hardware and software limitations. Data mining techniques such as clustering, load shedding, classification, and frequent pattern mining are applied to data streams to extract useful knowledge, but existing algorithms are not well suited to performing the mining process on data streams, so new techniques and algorithms are needed. The main objective of this research work is to cluster data streams and detect the outliers in them. A new hybrid approach is proposed that combines a hierarchical clustering algorithm with a partitioning clustering algorithm: for hierarchical clustering, the CURE algorithm is used and enhanced (E-CURE), and for partitioning clustering, the CLARANS algorithm is used and enhanced (E-CLARANS). The two algorithms, E-CURE and E-CLARANS, are combined into a hybrid that performs clustering and finds outliers in data streams. The performance of this hybrid clustering algorithm is compared with existing hybrid combinations, namely BIRCH with CLARANS and CURE with CLARANS, using clustering accuracy and outlier detection accuracy as performance factors. The experimental results show that the proposed E-CURE with E-CLARANS hybrid is more accurate than the existing hybrid clustering algorithms.

      • An effective handling of secure data stream in IoT

Jang, Jaejin, Jung, Im Y., Park, Jong Hyuk, Elsevier, 2018, Applied Soft Computing Vol.68

Internet-of-Things (IoT) applications are a primary domain for data streams, which travel through a heterogeneous network consisting of the Internet and low-speed IoT links on their way to a data collector. When a large data stream is transmitted, however, overhead results from the difference in speed and maximum transfer unit between the IoT network and the existing Internet, and this overhead grows with data size. The problem is critical for IoT devices that are sensitive to power consumption and for data streams that must be handled in real time. To solve it, we compressed the data stream using a low-density parity check (LDPC) code. Since LDPC-based compression can be applied even when the data stream is encrypted, it can be used in applications requiring privacy or confidentiality. This study therefore proposes a method to improve the usability of encrypted data streams in the IoT environment. We implemented IoT devices that generate data streams using Raspberry Pi, a desktop computer, and collectors that gather the streams. Experiments with temperature sensor data show that the communication time for data stream transmission decreased by 56.1–75.5% and the power consumption of IoT devices for data transmission decreased by 54.8–75.3%, while performing compression on the IoT device increased maximum memory usage by 0.3% and CPU usage by 10.1%. These results suggest that both the transmission time to collectors and the power consumption of IoT devices can be reduced while securing the data streams generated by IoT devices.

        Highlights
          • Compression using an LDPC code is proposed for encrypted data streams.
          • The compression is applied to the packet payload rather than the header.
          • The overhead in transmission time and power consumption is reasonable.

      • KCI-indexed

        Mining Weighted Sequential Patterns over Data Streams Using a Gap-Based Weighting Technique

        Joong Hyuk Chang, Korea Intelligent Information Systems Society, 2010, Journal of Intelligence and Information Systems Vol.16 No.3

Sequential pattern mining aims to discover interesting sequential patterns in a sequence database, and it is one of the essential data mining tasks widely used in application fields such as Web access pattern analysis, customer purchase pattern analysis, and DNA sequence analysis. In general sequential pattern mining, only the generation order of the data elements in a sequence is considered, so simple sequential patterns are easy to find, but there is a limit to finding the more interesting patterns used in real-world applications. One essential research topic that compensates for this limit is weighted sequential pattern mining, in which not only the generation order of data elements but also their weights are considered, yielding more interesting sequential patterns. Recently, data in many application fields has increasingly taken the form of continuous data streams rather than finite stored data sets, and the database research community has focused its attention on processing over data streams. A data stream is a massive, unbounded sequence of data elements continuously generated at a rapid rate. In data stream processing, each data element should be examined at most once, memory usage must remain bounded even though new elements are generated continuously, and newly generated elements should be processed as fast as possible so that up-to-date analysis results are instantly available upon request. To satisfy these requirements, data stream processing sacrifices some correctness in its analysis results by allowing bounded error. Considering this change in the form of generated data, many studies have sought various kinds of knowledge embedded in data streams, mainly focusing on efficient mining of frequent itemsets and sequential patterns, which have proven useful in conventional data mining over finite data sets, along with algorithms that reflect changes in the stream over time in their mining results. However, these works target intuitively found patterns such as frequent patterns and simple sequential patterns, taking no interest in novel interesting patterns that better express the characteristics of the target data streams. Defining such novel patterns and developing mining methods that find them is therefore a valuable research topic in the field of mining data streams. This paper proposes a gap-based weighting approach for sequential patterns and a mining method for weighted sequential patterns over sequence data streams based on that approach. A gap-based weight of a sequential pattern can be computed from the gaps between the data elements in the pattern without any predefined weight information. That is, both the gaps between data elements and their generation orders are used to obtain the weight of the pattern, which helps to find more interesting and useful sequential patterns. Since most computer application fields now generate data as data streams rather than finite data sets, the proposed method focuses mainly on sequence data streams.
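The abstract does not give the exact weighting formula, so the sketch below illustrates the general idea under one plausible assumption: smaller gaps between consecutive elements of a pattern occurrence yield a larger weight. The formula is illustrative, not the paper's:

```python
def gap_based_weight(timestamps):
    """Illustrative gap-based weight for one sequential pattern occurrence:
    the weight decays as the gaps between consecutive elements grow, so
    tightly clustered occurrences count more than widely spread ones."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not gaps:
        return 1.0   # a single-element pattern has no gaps
    # One simple choice: average of 1/(1 + gap) over consecutive element pairs.
    return sum(1.0 / (1 + g) for g in gaps) / len(gaps)

tight = gap_based_weight([1, 2, 3])     # consecutive gaps of 1
loose = gap_based_weight([1, 10, 30])   # large gaps, much smaller weight
```

The key property the paper relies on is that no predefined per-item weight table is needed: the weight is derived entirely from the occurrence itself.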

      • KCI-indexed

        ISVS: An Interactive Streaming Data Visualization System

        선동한, 최경륜, 황수찬, 백중환, Korean Institute of Next Generation Computing, 2018, Journal of the Korean Institute of Next Generation Computing Vol.14 No.2

Recently, there have been many studies on the real-time processing and visualization of large stream data. In existing studies, however, changes in the input stream structure, in the data to be visualized, and in the processing methods cannot be reflected in the visualization in real time. In this paper, we propose an Interactive Streaming Data Visualization System, ISVS, which provides visualization capabilities for streaming data together with collection and processing capabilities for the visualized data. ISVS lets users interactively express various visualization-related requirements, such as modifying input stream structures, filtering the streams to be visualized, and creating or modifying visualization functions, and reflects them in real time. In addition, this paper presents an implementation of ISVS on a distributed streaming system and shows that it can provide various kinds of real-time visualization processing by applying it to a video streaming service based on stereo sound technology.

      • KCI-indexed

        An Esper-Based Real-Time Filtering System

        박세빈, 길명선, 최형진, 문양세, Korean Institute of Information Scientists and Engineers, 2017, Database Research Vol.33 No.2

In this paper, we address the problem of filtering data streams. In general, stream data are generated continuously, and their volume is very large. To process such large-scale data streams in real time, a filtering algorithm should be used to remove the data unnecessary for analysis. However, since existing filtering algorithms are each optimized for a single data type, it is difficult to filter different data types precisely with only one algorithm. To solve this problem, we propose an Esper-based real-time filtering system that applies various filtering algorithms within Esper, a representative data stream management system (DSMS). The proposed system is designed and implemented on a client-server model: the client transmits the user's filtering input to the server and displays the filtering results received from the server in real time, while the server filters the data stream in real time according to the user input and returns the results to the client. Experimental results on a real data stream show that the proposed system filters the stream accurately and quickly. Because it supports multiple filtering algorithms, various types of streams can be filtered easily and their results compared. The proposed system is thus well suited to extracting data accurately and efficiently by selecting a filtering algorithm appropriate to the type of the input data stream.
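Esper itself is driven by EPL statements in Java; as a language-neutral illustration, the sketch below mimics the server-side idea of registering a filtering algorithm per stream type and applying the matching one to incoming values. All names and thresholds are illustrative:

```python
def threshold_filter(stream, low, high):
    """Pass through only readings inside [low, high]; a stand-in for one
    of the pluggable filtering algorithms applied inside the DSMS."""
    for value in stream:
        if low <= value <= high:
            yield value

class FilterServer:
    """Registers one filter per stream type, mimicking the idea of
    selecting a filtering algorithm suited to each input data type."""
    def __init__(self):
        self.filters = {}

    def register(self, stream_type, filter_fn):
        self.filters[stream_type] = filter_fn

    def process(self, stream_type, stream):
        # Apply the filter registered for this stream type.
        return list(self.filters[stream_type](stream))

server = FilterServer()
server.register("temperature", lambda s: threshold_filter(s, -30, 50))
clean = server.process("temperature", [21.5, 999.0, -40.2, 18.3])  # drops outliers
```

Swapping the registered function (e.g. for a moving-average or Kalman filter) changes the algorithm without touching the processing loop, which is the comparison the paper's experiments rely on.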

      • KCI-indexed

        Plan Manager: A Dynamic Data Stream Refinement Platform Based on Apache Storm

        김영국, 손시운, 문양세, 최형진, Korean Institute of Information Scientists and Engineers, 2019, Database Research Vol.35 No.1

In this paper, we address sampling and filtering problems for big data streams. Since data streams are generated constantly and continuously, it is not practical to collect and process an entire stream; hence there is strong demand for sampling methods that extract appropriate samples and filtering methods that extract the necessary data. There is also a critical limitation in processing big data streams on a single server. Apache Storm is a real-time distributed parallel processing framework for handling large data streams efficiently, but Storm requires modifying the source code, redistributing it, and restarting the process whenever the input data structures or processing algorithms change. This paper describes the problems of implementing sampling and filtering in a Storm-based distributed environment, defines the requirements to solve them, and designs a novel plan model consisting of input, processing, and output modules for the data stream. Plan Manager is a dynamic platform that manages Storm, Kafka, and databases in an integrated framework and can dynamically apply sampling and filtering techniques to rapid data streams through plans. In addition, Plan Manager can visually create, execute, and monitor plans through a Web client. We present the design, implementation, and experimental results of the proposed Plan Manager to demonstrate its usefulness and effectiveness.
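The abstract names sampling as a core operation without fixing an algorithm. Reservoir sampling (Algorithm R) is a standard single-pass choice for streams of unknown length and is sketched here purely as an illustration, not as the algorithm Plan Manager necessarily ships:

```python
import random

def reservoir_sample(stream, k, seed=42):
    """Algorithm R: keep a uniform random sample of k elements from a
    stream of unknown length, examining each element exactly once."""
    rng = random.Random(seed)   # fixed seed only for reproducibility
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # The (i+1)-th element replaces a slot with probability k/(i+1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

sample = reservoir_sample(range(10_000), 5)
```

Because the sample is maintained incrementally, the operator fits naturally into a Storm bolt: it needs O(k) memory regardless of how long the stream runs.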

      • Challenges and Issues in DATA Stream: A Review

        Security Engineering Research Support Center (IJHIT), 2015, International Journal of Hybrid Information Technology Vol.8 No.3

A data stream is a continuous, time-varying, massive, and infinitely ordered sequence of data elements. Because streaming data change rapidly over time, it is impossible to acquire all the elements in a data stream; therefore, each data element should be examined at most once. Memory usage for mining a data stream must also be limited, because new data elements are continuously generated. Moreover, newly arrived data should be immediately available whenever requested. These constraints make the task challenging, yet meeting them is necessary for fraud detection, knowledge extraction, business improvement, and other applications where data arrive as streams. This paper highlights the important issues and research challenges of data streams through a comprehensive review.

      • KCI-indexed

        A Data Stream Split Processing Model for Out-of-Order Data

        선동한, 장준혁, 황수찬, 이정구, 고은비, 정국식, Korean Institute of Next Generation Computing, 2023, Journal of the Korean Institute of Next Generation Computing Vol.19 No.1

It is crucial to process a data stream in event-time order when processing a great amount of data in real time. A data stream, however, may arrive out of its original order for various reasons, and such out-of-order data can increase processing latency and reduce the correctness of results. We propose one such approach, a DSSP (Data Stream Split Processing) model, that reduces processing latency while increasing the accuracy of results over out-of-order data. To prevent response-time delays caused by late data, the DSSP model splits the input data stream into a normal stream that arrives on time and a late stream that does not. The model provides the result of the normal stream first, then applies the separately cached late-data results to preserve the correctness of the processing. We analyze the efficiency and performance of the DSSP model through an experimental implementation.
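The on-time/late split that DSSP performs can be sketched with a simple watermark rule: an event counts as late when its event time lags the maximum event time seen so far by more than an allowed-lateness bound. The bound and the sample events below are illustrative, not the paper's:

```python
def split_stream(events, allowed_lateness):
    """Split (event_time, value) pairs into on-time and late streams.
    An event is late when its event time lags the watermark, i.e. the
    maximum event time seen so far, by more than the lateness bound."""
    on_time, late = [], []
    watermark = float("-inf")
    for event_time, value in events:
        watermark = max(watermark, event_time)
        if watermark - event_time <= allowed_lateness:
            on_time.append((event_time, value))
        else:
            late.append((event_time, value))   # cached for later correction
    return on_time, late

events = [(1, "a"), (2, "b"), (5, "c"), (2, "d"), (6, "e")]
normal, late = split_stream(events, allowed_lateness=1)
# normal keeps the punctual events; (2, "d") arrives after the watermark
# has advanced to 5 and is routed to the late stream.
```

In the DSSP design the `normal` partition would be answered immediately, while results from the `late` partition are merged in afterwards to restore correctness.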

      • KCI-indexed

        A Data Stream Storage Technique for Hybrid Queries

        Jae-Jyn Shin, Byeong-Seob You, Sang-Hun Eo, Dong-Wook Lee, Hae-Young Bae, Korea Multimedia Society, 2007, Journal of Korea Multimedia Society Vol.10 No.11

This paper proposes fast storage techniques for hybrid queries over data streams. DSMSs (Data Stream Management Systems) have been actively researched for processing data streams with bursty input. To process hybrid queries that retrieve both currently incoming and past data streams, the streams must be stored on disk. However, due to the fast input rate of data streams and the limits of memory and disk space, most research has addressed queries over currently incoming streams rather than stored ones. The techniques proposed in this paper use a circular buffer to maximize memory utilization and enable non-blocking insertion of data streams, and data on disk are compressed to maximize the amount of data the disk can hold. Experiments show that the proposed technique can store rapidly arriving data streams quickly.
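The circular-buffer ingestion described above can be sketched as follows; the overwrite-oldest policy is an assumption, since the abstract does not state how the buffer behaves when full:

```python
class CircularBuffer:
    """Fixed-capacity ring buffer: writers never block; once full, the
    oldest element is overwritten so ingestion keeps up with the stream."""
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.head = 0      # next write position
        self.size = 0

    def append(self, item):
        self.buf[self.head] = item
        self.head = (self.head + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def snapshot(self):
        """Return elements oldest-first, e.g. for flushing a batch to disk."""
        start = (self.head - self.size) % self.capacity
        return [self.buf[(start + i) % self.capacity] for i in range(self.size)]

rb = CircularBuffer(3)
for x in [1, 2, 3, 4, 5]:
    rb.append(x)
# After five appends into capacity 3, only the newest three survive.
```

In the paper's pipeline a reader would periodically take such a snapshot, compress it, and flush it to disk while the writer keeps appending without blocking.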
