Students who successfully complete the course will be able to
- identify problem structures that satisfy the principle of optimality (i.e., identify the stages and states of a problem),
- decompose a problem into a sequence of manageable (smaller) subproblems,
- construct (recursive) backward and forward Dynamic Programming (DP) models,
- solve a given problem by finding optimal solutions of a sequence of subproblems,
- construct deterministic and stochastic DP models,
- identify network, allocation, gambling, stock-option and inventory models that can be solved using DP formulations,
- identify the optimal policy structures by investigating DP formulations analytically,
- investigate monotonicity of the optimal policy,
- identify the trade-off between short-term and long-term yields,
- use Bayes’ law to incorporate learning into DP models,
- use stochastic ordering of random variables to determine optimal threshold levels,
- formulate and solve Bandit problems for various real-life applications,
- develop Markov Decision Process (MDP) models under total, discounted and average payoff criteria,
- solve MDPs using Linear Programming, Policy Iteration and Value Iteration Algorithms,
- identify deterministic, randomized, Markovian, stationary and nonstationary policies.
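To make the first few outcomes concrete, here is a minimal sketch of a backward DP recursion for a resource-allocation problem: the stage is the activity index, the state is the remaining budget, and the reward table below is purely hypothetical illustration data, not part of the course material.

```python
# Backward-DP sketch: allocate `budget` units across activities, where
# rewards[i][x] is the (hypothetical) payoff of giving x units to activity i.
# Stage = activity index i, state = budget remaining b.

def backward_dp(rewards, budget):
    """Return the optimal total reward and one optimal allocation."""
    n = len(rewards)
    # V[i][b] = best reward obtainable from activities i..n-1 with b units left
    V = [[0] * (budget + 1) for _ in range(n + 1)]
    best = [[0] * (budget + 1) for _ in range(n)]
    for i in range(n - 1, -1, -1):           # backward over stages
        for b in range(budget + 1):          # over states
            V[i][b], best[i][b] = max(
                (rewards[i][x] + V[i + 1][b - x], x)
                for x in range(min(b, len(rewards[i]) - 1) + 1)
            )
    # Recover an optimal allocation by a forward pass over the stored decisions
    alloc, b = [], budget
    for i in range(n):
        x = best[i][b]
        alloc.append(x)
        b -= x
    return V[0][budget], alloc

rewards = [
    [0, 5, 8, 9],    # activity 0: payoff of allocating 0..3 units
    [0, 4, 7, 11],   # activity 1
    [0, 3, 6, 8],    # activity 2
]
value, alloc = backward_dp(rewards, 4)  # value 16 via allocation [1, 3, 0]
```

The same value table could be filled by a forward recursion over the budget already spent; the backward form is shown because it mirrors the principle of optimality most directly.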
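The MDP outcomes can likewise be illustrated with a short Value Iteration sketch under the discounted payoff criterion; the two-state, two-action transition and reward data below are hypothetical, and the greedy policy it returns is a deterministic, stationary, Markovian policy of the kind named in the last outcome.

```python
# Value-iteration sketch for a discounted MDP.
# P[a][s][t] = (hypothetical) probability of moving s -> t under action a;
# R[a][s]    = (hypothetical) one-step reward for action a in state s.

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    n = len(R[0])
    V = [0.0] * n
    while True:
        # Bellman optimality update: V(s) <- max_a [ R(a,s) + gamma * E V ]
        V_new = [
            max(R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n))
                for a in range(len(P)))
            for s in range(n)
        ]
        if max(abs(V_new[s] - V[s]) for s in range(n)) < tol:
            break
        V = V_new
    # Greedy (deterministic, stationary, Markovian) policy w.r.t. V
    policy = [
        max(range(len(P)),
            key=lambda a: R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n)))
        for s in range(n)
    ]
    return V, policy

P = [
    [[0.8, 0.2], [0.3, 0.7]],   # action 0
    [[0.5, 0.5], [0.9, 0.1]],   # action 1
]
R = [
    [1.0, 0.0],                 # rewards under action 0
    [0.0, 2.0],                 # rewards under action 1
]
V, policy = value_iteration(P, R)
```

Policy Iteration would instead alternate policy evaluation and greedy improvement, and the same optimal values can be obtained from a Linear Programming formulation; Value Iteration is shown here only because it is the shortest to sketch.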