Constrained Continuous-Time Markov Decision Processes on the Finite Horizon

Guo XP; Huang YH; Zhang Y

Applied Mathematics and Optimization, Vol.75, No.2, 317-341, 2017

DOI: 10.1007/s00245-016-9352-6

This paper studies constrained (nonhomogeneous) continuous-time Markov decision processes on a finite horizon. The performance criterion to be optimized is the expected total reward over the horizon, while N constraints are imposed on similar expected total costs. By introducing an appropriate notion of occupation measure for this optimal control problem, we establish the following under suitable conditions: (a) the class of Markov policies is sufficient; (b) every extreme point of the space of performance vectors is generated by a deterministic Markov policy; and (c) there exists an optimal Markov policy that is a mixture of no more than N + 1 deterministic Markov policies.
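
For orientation, the constrained problem described above can be sketched in symbols. The notation here is an illustrative assumption and is not taken from the paper: \xi_t and a_t denote the state and action at time t, T the horizon, r the reward rate, c_n the cost rates, and d_n the constraint bounds. The paper's criteria may include further terms (for example terminal rewards or costs) that this sketch omits.

```latex
% Illustrative notation only (assumed, not the paper's own).
\begin{align*}
  \text{maximize}\quad
    & V_0(\pi) = \mathbb{E}^{\pi}\!\left[\int_0^T r(t,\xi_t,a_t)\,dt\right] \\
  \text{subject to}\quad
    & V_n(\pi) = \mathbb{E}^{\pi}\!\left[\int_0^T c_n(t,\xi_t,a_t)\,dt\right] \le d_n,
      \qquad n = 1,\dots,N.
\end{align*}
% One consequence of result (c), read at the level of the
% performance vector V = (V_0, V_1, \dots, V_N):
\[
  V(\pi^*) = \sum_{k=1}^{N+1} p_k\, V(\varphi_k),
  \qquad p_k \ge 0, \quad \sum_{k=1}^{N+1} p_k = 1,
\]
% where each \varphi_k is a deterministic Markov policy.
```

Read this way, result (b) says the extreme points of the set of achievable vectors V(\pi) come from deterministic Markov policies, and result (c) says an optimal vector needs at most N + 1 of them.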

Keywords: Continuous-time Markov decision process; Constrained-optimality; Finite horizon; Mixture of N+1 deterministic Markov policies; Occupation measure