
      KCI-indexed  SCIE  SCOPUS

      Target-Sensitive Control of Markov and Semi-Markov Processes

      https://www.riss.kr/link?id=A104901885

      Additional Information

      Multilingual Abstract

      We develop the theory for Markov and semi-Markov control using dynamic programming and reinforcement learning in which a form of semi-variance, which measures the variability of rewards below a pre-specified target, is penalized. The objective is to optimize a function of the rewards and risk, where the risk is penalized. Penalizing variance, which is popular in the literature, has some drawbacks that can be avoided with semi-variance.
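
      As a rough illustration of the criterion described above (the notation below is illustrative and is not taken from the paper), the downside semi-variance of a reward R with respect to a pre-specified target \tau can be written as

          \mathrm{SV}_{\tau}(R) \;=\; \mathbb{E}\!\left[\bigl(\max(\tau - R,\,0)\bigr)^{2}\right],

      and a target-sensitive controller of the kind sketched in the abstract would then seek a policy that maximizes a penalized objective such as

          \mathbb{E}[R] \;-\; \lambda\,\mathrm{SV}_{\tau}(R), \qquad \lambda > 0.

      Because only rewards that fall below the target contribute to \mathrm{SV}_{\tau}(R), variability above the target is not penalized; this is the kind of drawback of an ordinary variance penalty that the abstract alludes to.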

      References

      1 J. Filar, "Variance-penalized Markov decision processes" 14 (14): 147-161, 1989

      2 R.-R. Chen, "Value iteration and optimization of multiclass queueing networks" 32 : 65-97, 1999

      3 K. Boda, "Time consistent dynamic risk measures" 63 : 169-186, 2005

      4 M. Sobel, "The variance of discounted Markov decision processes" 19 : 794-802, 1982

      5 C. G. Turvey, "The semi-variance-minimizing hedge ratios" 28 (28): 100-115, 2003

      6 F. Brauer, "The Qualitative Theory of Ordinary Differential Equations: An Introduction" Dover Publications 1989

      7 V. S. Borkar, "The ODE method for convergence of stochastic approximation and reinforcement learning" 38 (38): 447-469, 2000

      8 M. Bouakiz, "Target-level criterion in Markov decision processes" 86 : 1-15, 1995

      9 V. S. Borkar, "Stochastic approximation with two time scales" 29 : 291-294, 1997

      10 R. Cavazos-Cadena, "Solution to risk-sensitive average cost optimality equation in a class of MDPs with finite state space" 57 : 253-285, 2003

      11 R. Porter, "Semivariance and stochastic dominance" 64 : 200-204, 1974

      12 X.-R. Cao, "Semi-Markov decision problems and performance sensitivity analysis" 48 (48): 758-768, 2003

      13 W. Y. Kwon, "SSPQL: Stochastic Shortest Path-based Q-learning" Institute of Control, Robotics and Systems 9 (9): 328-338, 2011

      14 W. Fleming, "Risk-sensitive control of finite state machines on an infinite horizon" 35 : 1790-1810, 1997

      15 D. Hernandez-Hernandez, "Risk-sensitive control of Markov processes in countable state space" 29 : 147-155, 1996

      16 V. Borkar, "Risk-sensitive optimal control for Markov decision processes with monotone cost" 27 : 192-209, 2002

      17 A. E. B. Lim, "Risk-sensitive control with HARA utility" 46 (46): 563-578, 2001

      18 T. Bielecki, "Risk-sensitive control of finite state Markov chains in discrete time" 50 : 167-188, 1999

      19 G. Di Masi, "Risk-sensitive control of discrete-time Markov processes with infinite horizon" 38 (38): 61-78, 1999

      20 R. Howard, "Risk-sensitive MDPs" 18 (18): 356-369, 1972

      21 S. J. Bradtke, "Reinforcement learning methods for continuous-time MDPs" in Advances in Neural Information Processing Systems 7, MIT Press 1995

      22 A. Gosavi, "Reinforcement learning for long-run average cost" 155 : 654-674, 2004

      23 J. Filar, "Percentile performance criteria for limiting average Markov decision processes" 40 : 2-10, 1995

      24 E. Seneta, "Non-Negative Matrices and Markov Chains" Springer-Verlag 1981

      25 D. P. Bertsekas, "Neuro-Dynamic Programming" Athena 1996

      26 C. Wu, "Minimizing risk models in Markov decision processes with policies depending on target values" 231 : 47-67, 1999

      27 D. White, "Minimizing a threshold probability in discounted Markov decision processes" 173 : 634-646, 1993

      28 J. Estrada, "Mean-semivariance behavior: Downside risk and capital asset pricing" 16 : 169-185, 2007

      29 M. L. Puterman, "Markov Decision Processes" Wiley Interscience 1994

      30 J. Abounadi, "Learning algorithms for Markov decision processes with average cost" 40 : 681-698, 2001

      31 J. Baxter, "Infinite-horizon policy-gradient estimation" 15 : 319-350, 2001

      32 G. Hübner, "Improved procedures for eliminating sub-optimal actions in Markov programming by the use of contraction properties" 257-263, 1978

      33 X.-R. Cao, "From perturbation analysis to Markov decision processes and reinforcement learning" 13 : 9-39, 2003

      34 D. P. Bertsekas, "Dynamic Programming and Optimal Control, 2nd edition" Athena 2000

      35 Qi Jiang, "Dynamic File Grouping for Load Balancing in Streaming Media Clustered Server Systems" Institute of Control, Robotics and Systems 7 (7): 630-637, 2009

      36 K. Chung, "Discounted MDPs: distribution functions and exponential utility maximization" 25 : 49-62, 1987

      37 R. Cavazos-Cadena, "Controlled Markov chains with risk-sensitive criteria" 43 : 121-139, 1999

      38 E. Altman, "Constrained Markov Decision Processes" CRC Press 1998

      39 V. S. Borkar, "Asynchronous stochastic approximation" 36 (36): 840-851, 1998

      40 S. Ross, "Applied Probability Models with Optimization Applications" Dover 1992

      41 A. Gosavi, "A risk-sensitive approach to total productive maintenance" 42 : 1321-1330, 2006

      42 S. Singh, "A policy-gradient method for semi-Markov decision processes with application to call admission control" 178 (178): 808-818, 2007

      43 V. S. Borkar, "A new analog parallel scheme for fixed point computation, part I: Theory" 44 : 351-355, 1997

      44 A. Gosavi, "A budget-sensitive approach to scheduling maintenance in a total productive maintenance (TPM)" 23 (23): 46-56, 2011

      45 H. C. Tijms, "A First Course in Stochastic Models, 2nd edition" Wiley 2003

      Analytics

      View
      Detail record views: 0

      Usage
      Full-text downloads: 0
      Loan requests: 0
      Copy requests: 0
      EDDS (document delivery) requests: 0

      Journal History

      Date | Event | Details | Index status
      2023 | Evaluation scheduled | Subject to overseas-DB journal evaluation (review of overseas-indexed journals) |
      2020-01-01 | Evaluation | Indexed-journal status maintained (review of overseas-indexed journals) | KCI-indexed
      2010-01-01 | Evaluation | Indexed-journal status maintained | KCI-indexed
      2009-12-29 | Society name change | Korean name: 제어ㆍ로봇ㆍ시스템학회 -> 제어·로봇·시스템학회 | KCI-indexed
      2008-01-01 | Evaluation | Indexed-journal status maintained | KCI-indexed
      2007-10-29 | Society name change | Korean name: 제어ㆍ자동화ㆍ시스템공학회 -> 제어ㆍ로봇ㆍ시스템학회; English name: The Institute Of Control, Automation, And Systems Engineers, Korea -> Institute of Control, Robotics and Systems | KCI-indexed
      2005-01-01 | Evaluation | Selected as an indexed journal (second-round candidate review) | KCI-indexed
      2004-01-01 | Evaluation | Passed first-round candidate review | KCI-indexing candidate
      2002-07-01 | Evaluation | Selected as a candidate journal (new evaluation) | KCI-indexing candidate

      Journal Citation Information

      Base year: 2016
      WOS-KCI combined IF (2-year): 1.35
      KCIF (2-year): 0.6
      KCIF (3-year): 1.07
      KCIF (4-year): 0.88
      KCIF (5-year): 0.73
      Centrality index (3-year): 0.388
      Immediacy index: 0.04
