RISS Academic Research Information Service

      • KCI-indexed

        An Improved Deep Q-Network Algorithm Using Self-Imitation Learning

        Yung-Min Sunwoo, Won-Chang Lee, Institute of Korean Electrical and Electronics Engineers, 2021, Journal of IKEEE Vol.25 No.4

        Self-Imitation Learning is a simple off-policy actor-critic algorithm that helps an agent find an optimal policy by exploiting past good experiences. When combined with reinforcement learning algorithms that have an actor-critic architecture, it has shown substantial performance improvements in various game environments. However, its applications have been limited to reinforcement learning algorithms with an actor-critic architecture. In this paper, we propose a method for applying Self-Imitation Learning to Deep Q-Network, a value-based deep reinforcement learning algorithm, and train it in various game environments. By comparing the proposed algorithm with ordinary Deep Q-Network training results, we show that Self-Imitation Learning can be applied to Deep Q-Network and improves its performance.
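        As a rough illustration of the idea, the self-imitation update keeps only transitions whose observed return exceeded the agent's current value estimate. The sketch below, assuming a PyTorch Q-network, adapts the clipped-advantage loss from the original Self-Imitation Learning formulation to Q-values; it is not the paper's exact implementation.

```python
import torch

def sil_dqn_loss(q_net, states, actions, returns):
    """Self-imitation loss for a value-based agent (illustrative sketch).

    Only transitions whose Monte-Carlo return R exceeds the current
    estimate Q(s, a) contribute, so the agent imitates past decisions
    that turned out better than expected.
    """
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    advantage = torch.clamp(returns - q_values, min=0.0)  # (R - Q)+
    return (advantage ** 2).mean()
```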

      • KCI-indexed

        Solving the 8-Puzzle Problem Using Deep Q-Learning

        Junha Hwang, Korean Institute of Intelligent Systems, 2021, Journal of Korean Institute of Intelligent Systems Vol.31 No.1

        In general, optimization problems have been solved by various search algorithms. This research started from the question "Can we find optimal solutions through learning?" and suggests a direction for solving optimization problems using machine learning. To this end, this paper proposes three ways to solve the 8-puzzle problem, a representative optimization problem, using deep reinforcement learning. The first is learning optimal actions for all states; the second is learning a heuristic function for A* search; and the third is learning an optimal path from a specific state to the goal state. All three types of learning are performed with deep Q-learning, a representative deep reinforcement learning algorithm. Experimental results show that learning optimal actions for all states is not easy. However, both learning a heuristic function for A* search and learning an optimal path from a specific state to the goal state proved very effective for solving the 8-puzzle problem.
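        The second approach, plugging a learned estimate of remaining moves into A* as the heuristic, can be sketched as below. The `h` callable standing in for the trained network is a hypothetical interface; states are assumed to be hashable 8-puzzle board encodings (e.g. tuples of tile positions).

```python
import heapq
import itertools

def a_star(start, goal, neighbors, h):
    """A* search where h(state) may be a learned heuristic, e.g. a
    network's estimate of the number of moves remaining to the goal."""
    counter = itertools.count()          # tie-breaker for the heap
    frontier = [(h(start), next(counter), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        for nxt in neighbors(state):
            ng = g + 1                   # unit move cost in the 8-puzzle
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(frontier,
                               (ng + h(nxt), next(counter), ng, nxt, path + [nxt]))
    return None                          # goal unreachable
```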

      • KCI-indexed

        A Deep Reinforcement Learning-Based Lane-Change Methodology for Autonomous Vehicles

        박다윤, 배상훈, TRINH TUAN HUNG, 박부기, 정보경, The Korea Institute of Intelligent Transport Systems, 2023, The Journal of the Korea Institute of Intelligent Transport Systems Vol.22 No.1

        Several efforts in Korea are currently underway with the goal of commercializing autonomous vehicles. Hence, various studies are emerging on autonomous vehicles that drive safely and quickly according to operating guidelines. This study examines the path search of an autonomous vehicle from a microscopic viewpoint and seeks to demonstrate the efficiency gained by learning the lane changes of an autonomous vehicle through Deep Q-Learning. SUMO was used for this purpose. The scenario starts in a random lane at the origin and ends with a right turn at the destination after changing to the third lane. The results were analyzed by comparing simulation-based lane changes with and without Deep Q-Learning. With Deep Q-Learning applied, the average traffic speed improved by about 40% compared to the case without it, the average waiting time decreased by about 2 seconds, and the average queue length decreased by about 2.3 vehicles.
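        A minimal interaction loop between a Q-learning agent and SUMO via the TraCI API might look like the sketch below. The configuration file name, the ego-vehicle id, and the stub agent are placeholders; the paper's actual state, action, and reward design is not reproduced here.

```python
import random
import traci  # SUMO's TraCI Python API

class StubAgent:
    """Placeholder for a trained DQN agent (hypothetical)."""
    def select_action(self, state, n_lanes=3):
        return random.randrange(n_lanes)  # would be argmax over Q-values

agent = StubAgent()
traci.start(["sumo", "-c", "scenario.sumocfg"])  # headless SUMO run
for step in range(1000):
    traci.simulationStep()
    if "ego" in traci.vehicle.getIDList():
        state = (traci.vehicle.getSpeed("ego"),
                 traci.vehicle.getLaneIndex("ego"))
        target_lane = agent.select_action(state)
        if target_lane != state[1]:
            traci.vehicle.changeLane("ego", target_lane, 5.0)  # hold 5 s
        # reward would combine speed gain, waiting time, queue length
traci.close()
```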

      • Applying CEE (CrossEntropyError) to improve performance of Q-Learning algorithm

        Hyun-Gu Kang, 서동성, Byeong-seok Lee, Kang Min Soo, Korea Artificial Intelligence Association, 2017, Artificial Intelligence Research (KJAI) Vol.5 No.1

        Recently, the Q-Learning algorithm, a kind of reinforcement learning, has mainly been used to implement artificial intelligence systems in combination with deep learning, and much research is ongoing to improve its performance. The purpose of this study is therefore to improve the Q-Learning algorithm by applying cross-entropy error to its loss function. Since the mean squared error used in Q-Learning makes it difficult to measure the exact error rate, the cross-entropy error, known to be highly accurate, is applied to the loss function instead. Experimental results show a success rate of about 12% with the mean squared error used in existing reinforcement learning and about 36% with the cross-entropy error used in deep learning.
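        Since the abstract does not give the exact formulation, the sketch below shows one plausible way to swap the mean squared error for a cross-entropy error in the Q-learning loss: both predicted and target Q-vectors are normalized with a softmax before computing the cross-entropy. The normalization choice is an assumption, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def q_learning_losses(q_pred, q_target):
    """Compare MSE and a cross-entropy error (CEE) over Q-value vectors.

    q_pred, q_target: tensors of shape (batch, n_actions).
    Softmax normalization turns Q-vectors into distributions so that
    cross-entropy is well defined; this mapping is assumed, not quoted.
    """
    mse = F.mse_loss(q_pred, q_target)
    log_p = F.log_softmax(q_pred, dim=-1)
    target_p = F.softmax(q_target, dim=-1)
    cee = -(target_p * log_p).sum(dim=-1).mean()
    return mse, cee
```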

      • KCI-indexed

        Enhanced Machine Learning Algorithms: Deep Learning, Reinforcement Learning, and Q-Learning

        박지수, 박종혁, Korea Information Processing Society, 2020, Journal of Information Processing Systems Vol.16 No.5

        In recent years, machine learning algorithms are continuously being used and expanded in various fields, such as facial recognition, signal processing, personal authentication, and stock prediction. In particular, various algorithms, such as deep learning, reinforcement learning, and Q-learning, are continuously being improved. Among these algorithms, the expansion of deep learning is rapidly changing. Nevertheless, machine learning algorithms have not yet been applied in several fields, such as personal authentication technology. This technology is an essential tool in the digital information era, walking recognition technology as promising biometrics, and technology for solving state-space problems. Therefore, algorithm technologies of deep learning, reinforcement learning, and Q-learning, which are typical machine learning algorithms in various fields, such as agricultural technology, personal authentication, wireless network, game, biometric recognition, and image recognition, are being improved and expanded in this paper.

      • Improving financial trading decisions using deep Q-learning: Predicting the number of shares, action strategies, and transfer learning

        Jeong, Gyeeun; Kim, Ha Young, Elsevier, 2019, Expert Systems with Applications Vol.117

        We study trading systems using reinforcement learning with three newly proposed methods to maximize total profits and reflect real financial market situations while overcoming the limitations of financial data. First, we propose a trading system that can predict the number of shares to trade. Specifically, we design an automated system that predicts the number of shares by adding a deep neural network (DNN) regressor to a deep Q-network, thereby combining reinforcement learning and a DNN. Second, we study various action strategies that use Q-values to analyze which action strategies are beneficial for profits in a confused market. Finally, we propose transfer learning approaches to prevent overfitting from insufficient financial data. We use four different stock indices (the S&P500, KOSPI, HSI, and EuroStoxx50) to experimentally verify our proposed methods and then conduct extensive research. The proposed automated trading system, which enables us to predict the number of shares with the DNN regressor, increases total profits by four times in S&P500, five times in KOSPI, 12 times in HSI, and six times in EuroStoxx50 compared with the fixed-number trading system. When the market situation is confused, delaying the decision to buy or sell increases total profits by 18% in S&P500, 24% in KOSPI, and 49% in EuroStoxx50. Further, transfer learning increases total profits by twofold in S&P500, 3 times in KOSPI, twofold in HSI, and 2.5 times in EuroStoxx50. The trading system with all three proposed methods increases total profits by 13 times in S&P500, 24 times in KOSPI, 30 times in HSI, and 18 times in EuroStoxx50, outperforming the market and the reinforcement learning model.

        Highlights:
          • A financial trading system is proposed to improve traders' profits.
          • The system uses the number of shares, action strategies, and transfer learning.
          • The number of shares is determined by using a DNN regressor.
          • When confusion exists, postponing a financial decision is the best policy.
          • Transfer learning can address problems of insufficient financial data.
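        The share-prediction idea, attaching a DNN regressor to a deep Q-network so one forward pass yields both the trading action and the number of shares, could be structured as in this sketch. Layer sizes and the three-action space (buy, hold, sell) are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TradingDQN(nn.Module):
    """Deep Q-network with an added regression head for share count."""
    def __init__(self, state_dim=32, n_actions=3):  # buy / hold / sell
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU())
        self.q_head = nn.Linear(64, n_actions)   # action values
        self.shares_head = nn.Linear(64, 1)      # DNN regressor

    def forward(self, state):
        h = self.body(state)
        return self.q_head(h), self.shares_head(h)

# The Q-values pick the action; the regression output sizes the trade.
q_values, shares = TradingDQN()(torch.randn(1, 32))
```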

      • KCI-indexed

        A Reinforcement Learning-Based Scheduler Using Dynamic Precedence in Deterministic Networks

        류지혜, 박규동, 권주혁, 정진우, Korean Institute of Communications and Information Sciences, 2023, Journal of KICS Vol.48 No.4

        Smart industry, metaverse, digital-twin, and military applications require deterministic data delivery in large-scale networks. This paper proposes reinforcement learning-based scheduling that dynamically assigns different precedences to flows, in addition to each flow's class or priority, and determines the scheduling algorithm according to the flow's precedence. In the proposed algorithm with two precedence queues, the reinforcement learning agent takes two actions: it assigns the precedence of flows according to a specified criterion and selects a scheduling algorithm. Depending on the purpose of the network, any factor of high importance could serve as the criterion for determining precedence; in this study, the deadline required by a flow is designated as the major factor. By utilizing DDQN (Double Deep Q-Network), a deep learning-based reinforcement learning model, the precedence and the scheduling algorithm are determined by observing the state of the network and selecting an action at each fixed-length decision period. In a network simulator developed for the study, the DDQN agent showed better performance than various heuristic algorithms.
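        For reference, the DDQN update that this scheduler (and the timeslot scheduler below) relies on decouples action selection from action evaluation: the online network picks the next action and the target network scores it, which reduces the overestimation bias of plain DQN. A minimal sketch:

```python
import torch

@torch.no_grad()
def ddqn_target(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Double DQN target: select with the online net, evaluate with the
    target net, instead of taking the max over the target net alone."""
    next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
    next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
    return rewards + gamma * (1.0 - dones) * next_q
```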

      • KCI-indexed

        Applying Deep Reinforcement Learning to Improve Throughput and Reduce Collision Rate in IEEE 802.11 Networks

        Chih-heng Ke, Lia Astuti, Korean Society for Internet Information, 2022, KSII Transactions on Internet and Information Systems Vol.16 No.1

        The effectiveness of Wi-Fi networks is greatly influenced by the optimization of contention window (CW) parameters. Unfortunately, the conventional approach employed by IEEE 802.11 wireless networks is not scalable enough to sustain consistent performance as the number of stations increases. Yet, it is still the default when accessing channels for single users of 802.11 transmissions. Recently, there has been a spike in attempts to enhance network performance using a machine learning (ML) technique known as reinforcement learning (RL). Its advantage lies in interacting with the surrounding environment and making decisions based on its own experience. Deep RL (DRL) uses deep neural networks (DNN) to deal with more complex environments (such as continuous state spaces or action spaces) and to obtain optimal rewards. As a result, we present a new CW control mechanism, termed the contention window threshold (CW_Threshold). It uses the DRL principle to define the threshold value and learn optimal settings under various network scenarios. We demonstrate our proposed method, a smart exponential-threshold-linear backoff algorithm with a deep Q-learning network (SETL-DQN). The simulation results show that our proposed SETL-DQN algorithm can effectively improve throughput and reduce collision rates.
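        The exponential-threshold-linear growth rule named in the abstract can be sketched as follows: the contention window doubles up to a threshold chosen by the deep Q-learning agent, then grows linearly. The linear step size and the cap below are illustrative assumptions.

```python
def next_cw(cw, cw_threshold, cw_max=1024, linear_step=64):
    """Exponential-threshold-linear contention-window update (sketch).

    Below the threshold: classic binary exponential backoff (doubling).
    Above it: linear growth, capped at cw_max. In SETL-DQN the threshold
    itself is the value learned by the DQN agent.
    """
    if cw < cw_threshold:
        return min(cw * 2, cw_threshold)
    return min(cw + linear_step, cw_max)
```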

      • KCI-indexed

        Reinforcement Learning-Based Timeslot Scheduling Using DDQN

        류지혜, 권주혁, 정진우, Korean Institute of Communications and Information Sciences, 2022, Journal of KICS Vol.47 No.7

        Adopting reinforcement learning for network scheduling is attracting more attention than ever because of its flexibility in adapting to dynamic changes in network traffic specifications and requirements. In this study, a priority-based timeslot scheduling algorithm is designed using Double Deep Q-Network (DDQN), a reinforcement learning algorithm. To evaluate the behavior of the DDQN agent, a reward function is defined based on the difference between the estimated delay and the deadline of packets transmitted in a timeslot, and on the priority of the packets. Simulations showed that the designed scheduling algorithm performs better than existing algorithms such as the strict priority (SP) or weighted round robin (WRR) schedulers, in the sense that more packets arrive within their deadlines. With the proposed DDQN-based scheduler, autonomous network scheduling is expected to become feasible in upcoming network frameworks.
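        The reward described above (based on the gap between a packet's estimated delay and its deadline, weighted by priority) might take a form like the sketch below; the exact functional shape and the weights are assumptions, since the abstract only names the ingredients.

```python
def timeslot_reward(est_delay, deadline, priority, bonus_w=1.0, penalty_w=0.5):
    """Reward for one transmitted packet (illustrative form only).

    Positive slack (arrival within the deadline) earns a priority-weighted
    bonus; negative slack incurs a priority-weighted penalty.
    """
    slack = deadline - est_delay
    weight = bonus_w if slack >= 0 else penalty_w
    return weight * priority * slack
```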

      • KCI-indexed

        Visual Analysis of Deep Q-network

        Dewen Seng, Jiaming Zhang, Xiaoying Shi, Korean Society for Internet Information, 2021, KSII Transactions on Internet and Information Systems Vol.15 No.3

        In recent years, deep reinforcement learning (DRL) models have enjoyed great interest owing to their success in a variety of challenging tasks. Deep Q-Network (DQN) is a widely used deep reinforcement learning model, which trains an intelligent agent that executes optimal actions while interacting with an environment. This model is well known for its ability to surpass skilled human players across many Atari 2600 games. Although DQN has achieved excellent performance in practice, a clear understanding of why the model works is still lacking. In this paper, we present a visual analytics system for understanding Deep Q-Network in a non-blind manner. Based on the stored data generated during the training and testing process, four coordinated views are designed to expose the internal execution mechanism of DQN from different perspectives. We report the system performance and demonstrate its effectiveness through two case studies. Using our system, users can learn the relationship between states and Q-values, the function of convolutional layers, the strategies learned by DQN, and the rationality of decisions made by the agent.
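        A visual analytics system like this one is driven by data captured during training and testing. A minimal sketch of such instrumentation, with an assumed JSON record format, could log each state, its Q-values, and the chosen action for the coordinated views to consume later:

```python
import json

def log_step(trace, state_id, q_values, action):
    """Append one decision record; the schema here is an assumption."""
    trace.append({
        "state": state_id,
        "q_values": [float(q) for q in q_values],
        "action": int(action),
    })

def save_trace(trace, path="dqn_trace.json"):
    """Persist the collected records for offline visual analysis."""
    with open(path, "w") as f:
        json.dump(trace, f)
```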
