RISS Academic Research Information Service

      • INTERVAL TYPE-2 APPROACH TO KERNEL POSSIBILISTIC C-MEANS CLUSTERING

        Raza, Muhammad Amjad, Hanyang University, 2012, master's thesis (domestic)

        RANK : 247359

        Pattern recognition and machine learning are used to solve many complex problems in data mining, decision making, and the grouping of natural data. Among pattern recognition algorithms, clustering is very popular and is widely used for classifying image data as well as other real-world data. Fuzzy clustering has replaced hard clustering because it gives more robust results and a flexible classification boundary. Kernel fuzzy clustering is used extensively for data sets that are not linearly separable, have complex distributions, or are ring-shaped, and it is also useful for data sets containing randomly distributed noise patterns. A kernel function is commonly defined as the dot product of two values obtained through a linear or nonlinear transfer function, and it is positive semidefinite. The kernel trick adds a degree of freedom by implicitly mapping input patterns into a higher-dimensional space known as the kernel space. Kernel clustering gives much better results than conventional fuzzy clustering algorithms such as fuzzy c-means, possibilistic c-means, and possibilistic fuzzy c-means, not only for spherical data sets but also for non-spherical ones. In kernel possibilistic c-means, however, cluster coincidence occurs and results in a poorly trained network. In this dissertation, we propose an interval type-2 kernel possibilistic c-means algorithm to overcome this cluster coincidence problem, since type-2 fuzzy sets handle uncertainty better than type-1 fuzzy sets. The choice of kernel function is mostly data dependent; we selected a Gaussian kernel with variable sigma for the experiments. For the same value of sigma, our proposed method performs better, and the presented results show its validity.
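The kernel-space ideas in this abstract can be illustrated with a minimal Python sketch. It assumes the common Gaussian kernel form K(x, v) = exp(-||x - v||^2 / (2 sigma^2)) and the standard possibilistic c-means typicality. The function names, the scale parameter `eta`, and the use of two fuzzifiers `m1` and `m2` to form an interval are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np


def gaussian_kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / (2 * sigma^2))."""
    diff = np.asarray(x, dtype=float) - np.asarray(v, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))


def kernel_distance_sq(x, v, sigma):
    """Squared distance between x and v in kernel space.

    ||phi(x) - phi(v)||^2 = K(x, x) - 2 K(x, v) + K(v, v),
    which reduces to 2 * (1 - K(x, v)) because K(x, x) = 1
    for the Gaussian kernel.
    """
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))


def typicality(d_sq, eta, m):
    """Possibilistic c-means typicality: 1 / (1 + (d^2 / eta)^(1 / (m - 1)))."""
    return 1.0 / (1.0 + (d_sq / eta) ** (1.0 / (m - 1.0)))


def interval_typicality(d_sq, eta, m1, m2):
    """ASSUMPTION: build an interval from two fuzzifiers m1 and m2.

    Returns (lower, upper) as the min and max of the two typicalities.
    The thesis may construct its interval differently.
    """
    t1 = typicality(d_sq, eta, m1)
    t2 = typicality(d_sq, eta, m2)
    return min(t1, t2), max(t1, t2)


if __name__ == "__main__":
    point = [1.0, 2.0]
    prototype = [1.5, 1.0]
    d_sq = kernel_distance_sq(point, prototype, sigma=1.0)
    print(interval_typicality(d_sq, eta=1.0, m1=1.5, m2=3.0))
```

The interval collapses to a single value when `m1 == m2`, which recovers the ordinary (type-1) kernel possibilistic typicality.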

      • Evaluation of Indirect Liquid Cooling for Electric Vehicle Battery Thermal Management System

        Raza, Waseem, Graduate School of Jeju National University, 2021, master's thesis (domestic)

        RANK : 247359

        Green-powered electric vehicles are reliable substitutes that lessen greenhouse gas emissions and reliance on fossil fuels. Electric vehicles are now used extensively worldwide, and their use is expected to grow. The battery pack is the key component of an electric vehicle. Its performance and life are highly temperature-sensitive, which makes battery thermal management a central problem, so the pack must be kept within its optimal temperature range for smooth performance and safety. This study evaluates a liquid coolant system for the thermal management of 45 kWh and 64 kWh battery packs. A chiller and a radiator are employed to cool the operating fluid, a 50:50 mixture of water and ethylene glycol. A thermal-fluid simulation was executed using a standard commercial tool, GT Suite, and a battery pack temperature experiment on a vehicle was conducted to validate the simulation results. The results reveal that the performance of the proposed systems typically improves as the flow rate and the chiller power increase. In the chiller cycle mode, the cooling rate is highest at an initial battery temperature of 40 °C and an ambient temperature of 25 °C, with a 3 kW chiller capacity and a flow rate of 20 liters per minute (LPM). In the radiator cycle, the highest cooling rate in both systems occurs at an ambient temperature of 25 °C and an initial battery temperature of 40 °C, with an airflow rate of 3600 cubic meters per hour (CMH) and a coolant flow rate of 30 LPM. In addition, the thermal management system of the 45 kWh battery pack was found to cool better than that of the 64 kWh battery pack. In the future, the proposed BTMS could be a realistic solution and help develop the thermal systems of electric vehicles.

      • Software defined mobility management in 5G networks

        Raza, Syed Muhammad, Sungkyunkwan University, 2018, doctoral dissertation (domestic)

        RANK : 247359

        Explosive growth in data, increasing operating costs, and narrow revenue gains have forced mobile network operators and vendors to redesign their architectures with new concepts for next-generation 5G mobile networks. Network softwarization, led by Software Defined Networking (SDN) and Network Function Virtualization (NFV), is a key enabling technology for 5G mobile networks. Centralized control and management of a network through the decoupling of control and data planes in SDN, complemented by the softwarization of network functions through NFV, gives operators much-desired agility and scalability and enables them to embrace concepts like Mobile Edge Cloud and network slicing. With the plethora of new technologies and concepts considered for 5G mobile networks, mobility management solutions in current centralized and hierarchical cellular networks must evolve to satisfy the requirements and use cases posed by 5G. To this end, this thesis briefly illustrates different mobility solutions in IP networks and then looks to evolve Proxy Mobile IPv6 (PMIPv6) for future LTE and early-stage 5G mobile networks. In evolving PMIPv6, this thesis first separates the control plane of PMIPv6 from the data plane by proposing an SDN-based PMIPv6 (SDN-PMIPv6) architecture. In SDN-PMIPv6, control of the access gateways is centralized at the controller, whereas the data path remains direct between the access gateway and a local mobility anchor via an IP tunnel. Reactive and proactive handover schemes in SDN-PMIPv6 are presented, with their analytical modeling, testbed implementation, and comprehensive performance evaluation. Using the SDN-PMIPv6 architecture as a platform, an On-demand Inter-domain SDN-PMIPv6 (OIS-PMIPv6) architecture is proposed to enable multi-domain mobility support, with scalability and performance metrics as key design parameters.
        To highlight the effectiveness of OIS-PMIPv6, a thorough performance evaluation based on an analytical model and testbed experiments is included in this thesis. To deliver the desired agility in 5G mobile networks, this thesis proposes Anchor as a Service (AaaS), in which virtual network functions for access gateways and anchors are deployed dynamically according to service requirements in an SDN/NFV environment. Although this work is still ongoing, the architecture design and use cases are covered, along with brief implementation details. Another significant aspect of AaaS is that it shows how the proposed SDN-PMIPv6 and OIS-PMIPv6 become relevant in the 5G mobile network architecture.

      • THE EXPERIENCE OF SAARC AS A REGIONAL BLOCK AND ITS TRADE EFFECTIVENESS

        Raza, Hassan, Korea University Graduate School of International Studies, 2017, master's thesis (domestic)

        RANK : 247359

        The formation of the South Asian Association for Regional Cooperation (SAARC) in 1985 was a significant step towards regionalism in South Asia. It was created to achieve better living standards for the people of the region through greater development. However, the performance of SAARC in the most crucial area, economic cooperation, has been far from encouraging. Intraregional trade is just five percent of the total trade flow despite the presence of the South Asian Free Trade Area (SAFTA). The presence of massive tariff and non-tariff barriers reflects a lack of political will on the part of member countries. Many structural and functional problems afflict the performance of SAARC. The most important factor is the presence of conflicts and mistrust, which often override economic interests and derail initiatives aimed at enhancing SAARC's performance. Against this backdrop, this study analyzes the performance of SAARC by calculating two trade indicators, the Trade Intensity Index (TII) and Revealed Comparative Advantage (RCA), and tries to trace the reasons behind SAARC's ineffectiveness.
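The two indicators named in this abstract can be sketched in Python using their commonly cited textbook definitions: the Trade Intensity Index compares a partner's share in a country's exports with that partner's share in world exports, and the Balassa-style Revealed Comparative Advantage does the same for a product. The abstract does not state which variant the thesis uses, so the formulas, function names, and the numbers in the usage example are illustrative assumptions only.

```python
def trade_intensity_index(exports_i_to_j, total_exports_i,
                          world_exports_to_j, total_world_exports):
    """TII_ij = (x_ij / X_i) / (x_wj / X_w).

    A value above 1 suggests country i trades with partner j more
    intensively than the world does on average.
    """
    partner_share_in_i = exports_i_to_j / total_exports_i
    partner_share_in_world = world_exports_to_j / total_world_exports
    return partner_share_in_i / partner_share_in_world


def revealed_comparative_advantage(exports_i_of_k, total_exports_i,
                                   world_exports_of_k, total_world_exports):
    """Balassa RCA_ik = (x_ik / X_i) / (x_wk / X_w).

    A value above 1 suggests country i has a revealed comparative
    advantage in product k.
    """
    product_share_in_i = exports_i_of_k / total_exports_i
    product_share_in_world = world_exports_of_k / total_world_exports
    return product_share_in_i / product_share_in_world


if __name__ == "__main__":
    # Hypothetical figures chosen for easy arithmetic, not SAARC data.
    print(trade_intensity_index(25.0, 100.0, 125.0, 1000.0))
    print(revealed_comparative_advantage(50.0, 100.0, 250.0, 1000.0))
```

Both functions share the same ratio-of-shares structure; they differ only in whether the numerator tracks a trading partner or a product.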

      • Development of Novel Compounds for Melanin Inhibition Using Biochemical, Chemoinformatics and Pharmacological Approaches

        RAZA, HUSSAIN, Kongju National University Graduate School, 2020, doctoral dissertation (domestic)

        RANK : 247359

        In this thesis, different N-(substituted-phenyl)-4-{4-[(E)-3-phenyl-2-propenyl]-1-piperazinyl}butanamides were synthesized according to the protocol described in Scheme 1. The synthesis was initiated by reacting various substituted anilines (1a-e) with 4-chlorobutanoyl chloride (2) in an aqueous basic medium to give various electrophiles, 4-chloro-N-(substituted-phenyl)butanamides (3a-e). These electrophiles were then coupled with 1-[(E)-3-phenyl-2-propenyl]piperazine (4) in a polar aprotic medium to obtain the targeted N-(substituted-phenyl)-4-{4-[(E)-3-phenyl-2-propenyl]-1-piperazinyl}butanamides (5a-e). The structures of all derivatives were identified and characterized by proton nuclear magnetic resonance (1H-NMR), carbon nuclear magnetic resonance (13C-NMR), and infrared (IR) spectral data along with CHN analysis. The in vitro inhibitory potential of these butanamides was evaluated against mushroom tyrosinase, and all compounds were found to be biologically active. Among them, 5b exhibited the highest inhibitory potential, with an IC50 value of 0.013 ± 0.001 µM. Compound 5b was also assayed in vivo and was found to reduce pigmentation in zebrafish significantly. The in silico studies agreed with these results. Moreover, the molecules were profiled for cytotoxicity through hemolytic activity, and all compounds except 5e showed minimal toxicity. Compound 5a also exhibited comparable results. Hence, some of these compounds might be worthy candidates for the formulation and development of depigmentation drugs with minimal side effects. (Chapter 1). Some new N-(substituted-phenyl)-4-(4-phenyl-1-piperazinyl)butanamides (4a-c) were synthesized through a facile two-step strategy. The structures of these compounds were corroborated by their IR, EI-MS, 1H-NMR, and 13C-NMR spectra along with CHN analysis data. The in vitro mushroom tyrosinase inhibition results revealed that all compounds were strong inhibitors of this enzyme; among them, 4b was identified as the most active compound, with an IC50 value of 0.168 ± 0.057 µM relative to the standard (16.841 ± 1.146 µM). Kinetic analysis of this molecule (Ki = 0.22 µM) revealed that it does not inhibit the tyrosinase enzyme competitively. In an in vivo protocol on zebrafish embryos, it also significantly reduced (P < 0.05) the amount of pigment to about 75.373%. The cytotoxicity of these butanamides was also profiled, and it was inferred that these molecules possess very mild cytotoxicity. It was therefore concluded that these compounds might be used as less cytotoxic therapeutic agents for the improvement of skin-related ailments. (Chapter 2). The study in Chapter 3 concerns melanin, which is vital to the skin's defense against UV rays. The emerging drug Acemetacin (ACE) was examined for melanin inhibition using three different methods (in vitro, in vivo, and computational). ACE showed remarkable potency against tyrosinase (IC50 = 0.353 ± 0.003 µM) compared with the standard, kojic acid (IC50 = 16.841 ± 1.146 µM), and analysis of the enzyme inhibition mechanism showed that ACE exhibits competitive inhibition. In the in vivo study, young zebrafish were exposed to 5, 10, 15, and 20 µM of ACE and to positive control doses. After 72 h of treatment, ACE significantly (P < 0.05) reduced the level of pigmentation (62.89%) at a concentration of 20 µM, relative to kojic acid (39.64%). The binding profile of ACE was confirmed by molecular docking, and the stability of the docked complexes was verified by MD simulation. Based on these results, it was concluded that ACE possesses good therapeutic potential against melanogenesis by targeting tyrosinase. (Chapter 3).

      • Optimization based Policy Gradient for MARL

        Raza ur Rehman Hafiz Muhammad, Yeungnam University Graduate School, 2023, doctoral dissertation (domestic)

        RANK : 247359

        This work concerns multi-agent deep reinforcement learning (MADRL). In MADRL, many intelligent agents interact and work together in a setting where they try to learn from their mistakes and develop better decision-making skills. Recently, MADRL has shown very promising results in cooperative multi-agent systems (MAS) and proved its importance in this field, particularly in complex tasks such as self-driving vehicles, two-state gaming (StarCraft), logistics distribution in a factory, productivity optimization, and cooperative multi-robot exploration. Many different techniques have been introduced for solving these problems. To demonstrate the viability of the field, value decomposition networks (VDN) enabled centralized value-function learning to be coupled with decentralized execution. That approach represents a central state-action value function as a combination of individual agent terms. VDN, however, can only represent a small class of centralized action-value functions and does not use additional state information during training. Modern methods like QMIX employ the centralized training with decentralized execution (CTDE) paradigm. In this method, a mixer network factorizes the joint state-action value function for all agents as a monotonic function. To guarantee the individual-global-max (IGM) condition for each agent, the mixer network is employed to calculate the joint state-action value of all agents. A hyper-network, which takes the present state of each agent as input and predicts strictly positive weights for the mixer network, is used to achieve the monotonic condition. The outputs of the mixer network therefore also depend on the current state via this hyper-network. The mixing network is trained with the same DQN algorithm used in the optimization process. The joint action-value function class of QMIX is also restricted. To address this limitation, QTRAN introduced a novel factorization method that expresses the complete value function class with the help of IGM consistency. Although it requires more processing effort to implement, this method ensures a more general factorization than QMIX. Mahajan et al.'s analysis of QMIX's exploration capabilities in particular contexts showed limitations, and to improve the performance of all agents they presented a paradigm in which a latent space exists. Obtaining effective scalability for MARL remained a difficulty, which QPLEX addresses. Although QPLEX performs well, sophisticated networks are still needed to produce these outcomes. Additionally, because it employs a greedy policy to choose each individual agent's action, it needs many training episodes when the number of agents is large. Researchers have also developed two novel Deep Quality-Value (DQV)-based MARL algorithms known as QVMix and QVMix-Max, which use centralized training and decentralized execution. The results of these algorithms show that QVMix outperformed the others because it is less prone to overestimation bias of the Q function. However, QVMix also needs a lot of processing power and training time because it, too, employs a greedy method for choosing each agent's actions. In this thesis, to overcome these restrictions, we propose a novel hybrid policy that is based on optimization and inspired by nature. For the action selection of each individual agent in this policy, we employ GWO in conjunction with a greedy policy. Although they require knowledge of the environment, optimization algorithms such as GWO (often used for finding prey) and the Ant Colony Optimizer (typically used for determining the shortest path) outperform the greedy policy. In GWO, agents are taught centrally, with the leader agent assisting the other agents. As a result, because the proposed approach uses bio-inspired optimization, it takes fewer computing resources and fewer episodes than legacy methodologies. There are no communication restrictions, and agents cooperate to attain the goal. Additionally, in a known environment, optimization strategies converge more quickly than greedy policies. In an unknown environment, however, the optimization algorithm fails, whereas the greedy policy performs noticeably better. We therefore attain the best outcomes in both cases by combining these approaches. We compared the proposed approach with the state-of-the-art QMIX and QVMix algorithms using the StarCraft 2 Learning Environment. The experimental results show that our algorithm performs better than QMIX and QVMix in every case and needs fewer training sessions.
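The value factorization ideas described in this abstract (VDN's additive combination and QMIX's monotonic mixer driven by a hyper-network) can be sketched in a few lines of NumPy. This is a simplified illustration with untrained random parameters, not the thesis's method and not the reference QMIX implementation: the class name, layer sizes, and single-layer linear hyper-networks are assumptions, and the absolute value yields non-negative rather than strictly positive weights.

```python
import numpy as np


def vdn_mix(agent_qs):
    """VDN-style mixing: the joint value is the sum of per-agent values."""
    return float(np.sum(agent_qs))


def elu(x):
    """ELU activation, which is monotonically increasing."""
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)


class MonotonicMixer:
    """QMIX-style mixer sketch (hypothetical, untrained parameters)."""

    def __init__(self, n_agents, state_dim, embed_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.n_agents = n_agents
        self.embed_dim = embed_dim
        # Hyper-network parameters: linear maps from the global state
        # to the weights and biases of the two mixing layers.
        self.hyper_w1 = rng.normal(size=(state_dim, n_agents * embed_dim))
        self.hyper_b1 = rng.normal(size=(state_dim, embed_dim))
        self.hyper_w2 = rng.normal(size=(state_dim, embed_dim))
        self.hyper_b2 = rng.normal(size=(state_dim, 1))

    def mix(self, agent_qs, state):
        agent_qs = np.asarray(agent_qs, dtype=float)
        state = np.asarray(state, dtype=float)
        # abs() keeps mixing weights non-negative, so the joint value
        # never decreases when any single agent's value increases.
        w1 = np.abs(state @ self.hyper_w1).reshape(self.n_agents, self.embed_dim)
        b1 = state @ self.hyper_b1
        hidden = elu(agent_qs @ w1 + b1)
        w2 = np.abs(state @ self.hyper_w2)
        b2 = (state @ self.hyper_b2)[0]
        return float(hidden @ w2 + b2)


if __name__ == "__main__":
    mixer = MonotonicMixer(n_agents=3, state_dim=4)
    state = np.array([0.2, -0.1, 0.5, 1.0])
    print(mixer.mix([1.0, 2.0, 3.0], state))
    print(mixer.mix([1.5, 2.0, 3.0], state))  # never lower than the line above
```

Because the joint value is monotonic in each agent's value, each agent can pick its own greedy action and the combination still maximizes the joint value, which is the individual-global-max property the abstract refers to.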
