Seminar by Prof. Selva Nadarajah, University of Illinois at Chicago

Thumbnail

Event details

Date 09.07.2018
Hour 15:0016:30
Speaker Prof. Selva Nadarajah, University of Illinois at Chicago
Location
Category Conferences - Seminars
"Revisiting approximate linear programming using a saddle point approach"


Abstract:
Approximate linear programs (ALPs) are well-known models for computing value function approximations (VFAs) for intractable Markov decision processes (MDPs) arising in applications. VFAs from ALPs have desirable theoretical properties, define an operating policy, and provide a lower bound on the optimal policy cost, which can be used to assess the suboptimality of heuristic policies. However, solving ALPs near-optimally remains challenging, for example, in applications where the MDP includes cost functions or transition dynamics that are nonlinear or when rich basis functions are required to obtain a good VFA. We address this tension between ALP theory and solvability by proposing a convex saddle-point reformulation of an ALP that includes as primal and dual variables, respectively, a vector of basis function weights and a constraint violation density function over the state-action space. To solve this reformulation, we develop a proximal stochastic mirror descent (PSMD) method. We establish that PSMD returns a near-optimal ALP solution and a lower bound on the optimal policy cost in a finite number of iterations with high probability. We compare PSMD with the commonly used constraint sampling approach to solve ALPs and benchmark heuristics on inventory control and energy storage applications, where using row generation is not a viable option. We find that PSMD provides reliable and high-quality lower bounds that are tighter than lower bounds based on a perfect information relaxation, while using constraint sampling to solve ALPs may not provide a valid lower bound. PSMD policies outperform benchmark heuristics and are comparable or better than the ones obtained using constraint sampling. Our ALP reformulation and solution approach broadens the applicability of approximate linear programming.