First-order sensitivity of the optimal value in a Markov decision model with respect to deviations in the transition probability function

Abstract

Markov decision models (MDMs) used in practical applications are most often less complex than the underlying 'true' MDM. The reduction of model complexity is performed for several reasons. However, it is of interest to know which kinds of model reduction are reasonable with regard to the optimal value and which are not. In this article we propose a way to address this question. We introduce a kind of derivative of the optimal value as a function of the transition probabilities, which can be used to measure the first-order sensitivity of the optimal value with respect to changes in the transition probabilities. 'Differentiability' is obtained for a fairly broad class of MDMs, and the 'derivative' is specified explicitly. Our theoretical findings are illustrated by means of optimization problems in inventory control and mathematical finance.