IEEE Transactions on Automatic Control, Vol.46, No.1, 96-100, 2001
A probabilistic analysis of bias optimality in unichain Markov decision processes
This paper focuses on bias optimality in unichain Markov decision processes with finite state and action spaces. Using relative value functions, we present new methods for evaluating optimal bias. This leads to a probabilistic analysis that transforms the original reward problem into a minimum average cost problem. The result is an explanation of how and why bias implicitly discounts future rewards.
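
To make the link between relative value functions and bias concrete, the sketch below evaluates a single fixed policy on a small unichain example. It is not the paper's method for optimal bias. It uses only the standard evaluation equations g + v(s) = r(s) + sum_j P(s,j) v(j) with the normalization v(0) = 0, then shifts v so that it has zero mean under the stationary distribution, which gives the bias. The function names `solve` and `gain_and_bias` and the two-state chain are assumptions made for illustration.

```python
from fractions import Fraction


def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(i for i in range(col, n) if M[i][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        scale = M[col][col]
        M[col] = [entry / scale for entry in M[col]]
        for i in range(n):
            if i != col and M[i][col] != 0:
                factor = M[i][col]
                M[i] = [a - factor * c for a, c in zip(M[i], M[col])]
    return [M[i][n] for i in range(n)]


def gain_and_bias(P, r):
    """Return (gain, relative values, bias) for a unichain transition matrix P."""
    n = len(P)
    # Unknowns are (g, v[1], ..., v[n-1]); v[0] is pinned to 0.
    A = [
        [Fraction(1)] + [(1 if s == j else 0) - P[s][j] for j in range(1, n)]
        for s in range(n)
    ]
    x = solve(A, r)
    g, v = x[0], [Fraction(0)] + x[1:]
    # Stationary distribution: pi P = pi, with one balance equation
    # replaced by the normalization sum(pi) = 1.
    B = [[P[j][i] - (1 if i == j else 0) for j in range(n)] for i in range(n - 1)]
    B.append([Fraction(1)] * n)
    pi = solve(B, [Fraction(0)] * (n - 1) + [Fraction(1)])
    # The bias is the relative value function shifted so that pi . h = 0.
    shift = sum(p * val for p, val in zip(pi, v))
    h = [val - shift for val in v]
    return g, v, h


# Hypothetical two-state chain: reward 1 in state 0, reward 0 in state 1.
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]
r = [Fraction(1), Fraction(0)]

g, v, h = gain_and_bias(P, r)
print(g)  # 1/3
print(v)  # [Fraction(0, 1), Fraction(-4, 3)]
print(h)  # [Fraction(8, 9), Fraction(-4, 9)]
```

Both states earn the same long-run average reward of 1/3, but starting in state 0 is worth 8/9 more in total transient reward than the stationary average, and starting in state 1 is worth 4/9 less. That transient difference is what the bias measures. Exact rational arithmetic is used here only so the small example can be checked by hand.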