Chemical Engineering and Materials Research Information Center (CHERIC)
SIAM Journal on Control and Optimization, Vol.57, No.2, 880-899, 2019
LONG RUN CONTROL WITH DEGENERATE OBSERVATION
We consider the average reward per unit time problem for controlled, partially observed, discrete-time Markov processes in the case where the only observation of the state is a deterministic function of the current state of the process. Under suitable ergodic assumptions, we first solve the problem for an at most countable observation space. We then generalize the result to an uncountable observation space, for which explicit forms of the filtering process and the corresponding controlled transition operator are not available. We also study the case of noisy but still degenerate observation.