IEEE Transactions on Automatic Control, Vol.44, No.8, 1583-1587, 1999
Steering policies for controlled Markov chains under a recurrence condition
The authors consider the class of steering policies for controlled Markov chains under a recurrence condition. A steering policy is defined as one that adaptively alternates between two stationary policies in order to steer the sample average cost toward a desired value. Convergence of the sample average costs is derived via direct sample path arguments, and the performance of the steering policy is discussed. Steering policies are motivated by, and particularly useful in, the discussion of constrained Markov chains with a single constraint.
Keywords: OPTIMAL PRIORITY ASSIGNMENT; CONSTRAINT
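
The steering idea described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the authors' construction: the two-state chain, its transition probabilities and costs, the names `P_TO_0`, `COST`, `RECURRENT_STATE` and `run_steering`, and the choice to switch policies only at returns to a distinguished state are all hypothetical, chosen to show one way of alternating between two stationary policies so that the sample average cost approaches a target.

```python
import random

# Hypothetical two-state, two-action chain (illustrative numbers, not from the paper).
# P_TO_0[state][action]: probability that the next state is 0.
P_TO_0 = [[0.6, 0.3], [0.5, 0.4]]
# COST[state][action]: one-step cost incurred in `state` under `action`.
COST = [[0.0, 2.0], [1.0, 3.0]]
# Assumed switching rule: the policy may change only on visits to this state.
RECURRENT_STATE = 0


def run_steering(target, steps=200_000, seed=0):
    """Return the sample average cost after `steps` transitions."""
    rng = random.Random(seed)
    state = RECURRENT_STATE
    action = 0  # stationary policy in force: always play this action
    total_cost = 0.0
    for n in range(steps):
        if state == RECURRENT_STATE:
            # Compare the running average with the target and pick the
            # cheaper policy (action 0) if we are above it, otherwise the
            # costlier policy (action 1).
            average = total_cost / n if n > 0 else 0.0
            action = 0 if average > target else 1
        total_cost += COST[state][action]
        state = 0 if rng.random() < P_TO_0[state][action] else 1
    return total_cost / steps


if __name__ == "__main__":
    print(run_steering(target=1.5))
```

For these made-up numbers, always playing action 0 has long-run average cost 4/9 and always playing action 1 has long-run average cost 29/11, so a target such as 1.5 lies between the two and the alternation can, in this sketch, bring the sample average close to it.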