IEEE Transactions on Automatic Control, Vol.46, No.5, 687-697, 2001
Optimal and asymptotically optimal decision rules for sequential screening and resource allocation
We consider the problem of maximizing the expected sum of n variables X-k chosen sequentially in an i.i.d. sequence of length N. It is equivalent to the following resource allocation problem: n machines have to be allocated to N jobs of value X-k (k = 1,..., N) arriving sequentially, the ith machine has a (known) probability p(i) to perform the job successfully and the total expected reward must be maximized. The optimal solution of this stochastic dynamic-programming problem is derived when the distribution of the X(k)s is known: the optimal threshold for accepting X-k, Or allocating the ith machine to job k, is given by a backward recurrence equation. This optimal solution is compared to the simpler (but suboptimal) open-loop feedback-optimal solution for which the threshold is constant, and their asymptotic behaviors are investigated. The asymptotic behavior of the optimal threshold is used to derive a simple open-loop solution, which is proved to be asymptotically optimal (N --> infinity with n fixed) for a large class of distributions for X-k.
Keywords:dynamic programming;resource allocation;sequential experimental design;sequential screening;stochastic assignment