Table of Contents
?) approximated as a Markov process with a finite number of states.- 5.3 A useful policy iteration algorithm, for discounted (? 2.- 5.4 The infinite horizon without discounting for partially observable Markov processes.- 5.4.1 Model formulations.- 5.4.2 Cost of a stationary policy.- 5.4.3 Policy improvement phase.- 5.4.4 Policy iteration algorithm.- 5.5 Partially observable semi-Markov decision processes.- 5.5.1 Model formulation.- 5.5.2 State dynamics.- 5.5.3 The observation space.- 5.5.4 Overall system dynamics.- 5.5.5 Decision alternatives in clinical disorders.- 5.6 Risk-sensitive partially observable Markov decision processes.- 5.6.1 Model formulation and practical examples.- 5.6.1.1 Maintenance policies for a nuclear reactor pressure vessel.- 5.6.1.2 Medical diagnosis and treatment as applied to physiological systems.- 5.6.2 The stationary Markov decision process with probabilistic observations of states.- 5.6.3 A branch and bound algorithm.- 5.6.4 A Fibonacci search method for a branch and bound algorithm for a partially observable Markov decision process.- 5.6.5 A numerical example.- 6 Policy Constraints in Markov Decision Processes.- 6.1 Methods of investigating policy constraints in Markov decision processes.- 6.2 Markov decision processes with policy constraints.- 6.2.1 A Lagrange multiplier formulation.- 6.2.2 Development and convergence of the algorithm.- 6.2.3 The case of transient states and periodic processes.- 6.3 Risk-sensitive Markov decision process with policy constraints.- 6.3.1 A Lagrange multiplier formulation.- 6.3.2 Development and convergence of the algorithm.- 7 Applications.- 7.1 The emergency repair control for electrical power systems.- 7.1.1 Reliability and system effectiveness.- 7.1.2 Reward structure.- 7.1.3 The Markovian decision process for emergency repair.- 7.1.4 Linear programming formulation for repair optimization.- 7.1.5 The investment problem.- 7.2 Stochastic models for evaluation of inspection and repair schedules [2].- 
7.2.1 Inspection actions.- 7.2.1.1 Complete inspection.- 7.2.1.2 Control limit inspection.- 7.2.1.3 Inspection.- 7.2.2 Markov chain models.- 7.2.3 Cost structures and operating requirements.- 7.2.3.1 Inspection costs.- 7.2.3.2 Repair costs.- 7.2.3.3 Operating costs and requirements.- 7.2.3.4 Inspection and repair policies.- 7.2.3.5 Closed loop policies.- 7.2.3.6 Updating state probabilities after an inspection.- 7.2.3.7 Obtaining next-time state probabilities using transition matrix.- 7.2.3.8 Open loop policies.- 7.3 A Markovian decision model for clinical diagnosis and treatment applied to the respiratory system.- 7.3.1 Concept of state in the respiratory system.- 7.3.2 The clinical observation space.- 7.3.3 Computing probabilities in cause-effect models and overall system dynamics.- 7.3.4 Decision alternatives in respiratory disorders.- 7.3.4.1 Branch and bound algorithm.- 7.3.4.2 Steps in the branch and bound algorithm.- 7.3.5 A numerical example for the respiratory system.- 7.3.6 Conclusions.