CPS590.03 Optimization and Decision-making under Uncertainty

Instructor: Kamesh Munagala

    Tue @ Soc. Sci 327 and Thu @ Gray 220, 3:05-4:20 pm

Spring 2016


This course was offered under the title "Sequential Decision Theory: Algorithms, Policies, and Games" in Fall 2009.

Lecture Notes:
References are omitted from the notes; they are easy to deduce from the links in the course schedule below.

Course Outline:
In several modern applications such as internet auctions, robot navigation, wireless communication, social networks, and database query processing, inputs are known only with associated uncertainty. This could be because the inputs arrive over time, or because the optimization decisions need to be made with probabilistic models of inputs before they are exactly realized. In different application scenarios, the specific problems that arise are different; however, the models, algorithmic paradigms, and solution techniques needed to address them are often similar.

In this course, we will develop general algorithmic tools for decision making under uncertainty in such diverse contexts. The techniques we present have their roots in diverse disciplines - Statistics, Optimization, Decision theory, Microeconomics, and Theoretical Computer Science. Though the course is theoretical in nature, almost all problems considered have significant practical motivation which will also be highlighted.

Topics Covered: See the lecture schedule below.

Grading: Homeworks (40%), Scribing lecture notes (25%), Project (35%).

Project: Project suggestions are available via Sakai. The project can involve either a theoretical problem or an application to a systems or economics problem; in either case, I would prefer that the project develop at least some theoretical understanding. You can also pick a collection of papers to read and summarize. I would prefer groups of no more than two people. The output should be a report and a presentation.

Scribing: Each of you will scribe three lectures. The templates for scribing are on Sakai. Download them into one directory, then edit, rename, and compile lecture1.tex. Please read the first 6 pages of Mathematical Writing before you scribe.

Reading Material:
No textbook is required. I will mainly use recent papers and surveys, and lecture notes that you will help scribe.
Check out related courses offered at Stanford and UPenn.

Lecture Schedule: Here is a tentative schedule of topics. Note that there will be almost no coverage of queueing theory, reinforcement learning, or financial models, all of which are important in their own right. Furthermore, I will favor simple, intuitive algorithms and proofs over optimal but more complicated ones.

Lecture 1 (Jan. 14): Course organization and logistics; Applications of stochastic optimization; Warmup: Sensor placement to maximize entropy
  Readings: [KG], [Survey]

Stochastic Optimization and Approximation Algorithms

Lecture 2 (Jan. 19): Entropy and Kraft's inequality; Submodularity and the greedy algorithm
  Readings: [NWF]; [FMV], [BFNS], [BFNS2]: Non-monotone submodular functions; [BENW], [MZ]: Distributed algorithms

Lecture 3 (Jan. 21): More examples of submodularity: Influence in social networks, extreme values
  Readings: [KKT], [MR]: Show influence is submodular

Lecture 4 (Jan. 26): Stochastic set cover: Adaptive vs. non-adaptive algorithms

Lecture 5 (Jan. 28): Adaptive greedy algorithm and duality
  Readings: [Das]: Adaptive greedy algorithm for active learning; [AH], [CPRS]: Adaptive greedy decision tree construction; [KKM]: Evaluating monotone CNF/DNF formulae

Lecture 6 (Feb. 2): Prophet inequalities and linear programming: Optimal policies and LP relaxations
  Readings: [KW]: Matroid prophet inequalities

Lecture 7 (Feb. 4): LP rounding and duality
  Readings: [DGV]: Stochastic knapsack and adaptivity gaps

Lecture 8 (Feb. 9): Stochastic matchings and LP-based policies
  Readings: [MSU]: Stochastic scheduling with precedence constraints; Auction theory: [AGT, Ch. 9], [BCMX], [CHK], [CHMS], [BILW]

Lecture 9 (Feb. 11): Markov decision processes: Bellman's equation, value iteration, bandit problems

Lecture 10 (Feb. 16): Greedy revisited: The Gittins index theorem
  Readings: [FW]: Four proofs of the Gittins index theorem; [FKKK], [GM]: Crowdsourcing, auctions, and the Gittins index; [GM09b], [GKN], [GMS], [GKMR]: Bandits and approximation

Online Learning and Prediction Problems

Lecture 11 (Feb. 18): Online prediction and no-regret learning: The Randomized Weighted Majority (RWM) algorithm
  Readings: [AGT, Ch. 4]; [Sec], [K], [BIKK], [KP]: The secretary model

Lecture 12 (Feb. 23): No-regret learning via regularization: Hannan's Follow the Regularized Leader (FTRL) algorithm
  Readings: [Haz, Ch. 5]; [K-notes], [ABH], [FV]: Blackwell's approachability

Lecture 13 (Feb. 25): Gradient descent in convex spaces
  Readings: [Haz, Ch. 2]; [AGT, Ch. 4], [ACFS]: Adversarial multi-armed bandits; [FKM]: Convex multi-armed bandits; [ACF], [AG1], [AG2], [AG3], [GM]: Stochastic multi-armed bandits

Lecture 14 (Mar. 1): No-regret learning via gradient descent; Tight lower bounds for regret
  Readings: [Haz, Ch. 3]

Lecture 15 (Mar. 3): Applications: Stochastic gradient descent; Agnostic PAC learning
  Readings: [Haz, Ch. 9]; [CCP], [KSH]: Sample-average approximation (SAA)

Lecture 16 (Mar. 8): Zero-sum games via no-regret learning
  Readings: [Haz, Ch. 7]; [BC-B]: Survey of bandit algorithms; [BSS]: Truthful mechanisms for keyword auctions; [Times]: The human mind also trades off exploration and exploitation!

Lecture 17 (Mar. 10): Solving linear programs via no-regret learning
  Readings: [AHK]; [LN], [You], [AK], [BGM], [WMMR]: Parallel implementations; [AD]: Online LPs with stochastic inputs; [BGM]: Algorithm for optimal Bayesian auctions

Online Algorithms

Lecture 18 (Mar. 22): Online data compression: Dynamic Huffman coding, move-to-front, Lempel-Ziv

Lecture 19 (Mar. 24): Online algorithms with adversarial inputs: Ski rental, list update, potential functions
  Readings: [Albers]; [MMAW], [FL], [CGMP]: Switch scheduling; [IKM1], [IKM2]: General models for scheduling

Lecture 20 (Mar. 29): Deterministic and randomized algorithms for paging

Lecture 21 (Mar. 31): Yao's theorem and lower bounds; Online load balancing

Lecture 22 (Apr. 5): Online Steiner trees and the greedy algorithm; Online matchings and dual fitting
  Readings: [IW] (note the nice lower bound construction); [GGLS]: Stochastic Steiner trees

Lecture 23 (Apr. 7): Randomized matching and optimal dual fitting

Lecture 24 (Apr. 12): Primal-dual algorithms: Ski rental, budgeted allocations

Lecture 25 (Apr. 14): Online set cover

Lecture 26 (Apr. 19):
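As a taste of the material, the greedy algorithm for monotone submodular maximization from Lecture 2 can be sketched in a few lines. This is a minimal illustration on a max-coverage instance; the function name and the toy data are mine, not from the course materials.

```python
# Greedy (1 - 1/e)-approximation for maximizing a monotone submodular
# function under a cardinality constraint, illustrated on max coverage:
# repeatedly pick the set with the largest marginal coverage gain.

def greedy_max_coverage(sets, k):
    """Greedily choose up to k sets, maximizing elements covered."""
    covered = set()
    chosen = []
    for _ in range(k):
        # Index of the set with the largest marginal gain.
        best = max(range(len(sets)), key=lambda i: len(sets[i] - covered))
        if not (sets[best] - covered):
            break  # no set adds new elements; stop early
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

# Toy instance (made up for illustration).
sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 5}]
chosen, covered = greedy_max_coverage(sets, 2)
print(chosen, sorted(covered))  # chosen = [2, 0], covering all 7 elements
```

Because coverage is monotone and submodular, this greedy choice is guaranteed to cover at least a (1 - 1/e) fraction of what the best k sets could cover.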