Approximate Dynamic Programming: Solving the Curses of Dimensionality

Warren B. Powell, PhD, is Professor of Operations Research and Financial Engineering at Princeton University, where he is founder and Director of CASTLE Laboratory, a research unit that works with industrial partners to test new ideas found in operations research. The recipient of the 2004 INFORMS Fellow Award, Dr. Powell has authored over 100 refereed publications on stochastic optimization, approximate dynamic programming, and dynamic resource management.

Preface.

Acknowledgments.

1. The challenges of dynamic programming.

1.1 A dynamic programming example: a shortest path problem.

1.2 The three curses of dimensionality.

1.3 Some real applications.

1.4 Problem classes.

1.5 The many dialects of dynamic programming.

1.6 What is new in this book?

1.7 Bibliographic notes.

2. Some illustrative models.

2.1 Deterministic problems.

2.2 Stochastic problems.

2.3 Information acquisition problems.

2.4 A simple modeling framework for dynamic programs.

2.5 Bibliographic notes.

Problems.

3. Introduction to Markov decision processes.

3.1 The optimality equations.

3.2 Finite horizon problems.

3.3 Infinite horizon problems.

3.4 Value iteration.

3.5 Policy iteration.

3.6 Hybrid value-policy iteration.

3.7 The linear programming method for dynamic programs.

3.8 Monotone policies.

3.9 Why does it work?

3.10 Bibliographic notes.

Problems.

4. Introduction to approximate dynamic programming.

4.1 The three curses of dimensionality (revisited).

4.2 The basic idea.

4.3 Sampling random variables.

4.4 ADP using the post-decision state variable.

4.5 Low-dimensional representations of value functions.

4.6 So just what is approximate dynamic programming?

4.7 Experimental issues.

4.8 Dynamic programming with missing or incomplete models.

4.9 Relationship to reinforcement learning.

4.10 But does it work?

4.11 Bibliographic notes.

Problems.

5. Modeling dynamic programs.

5.1 Notational style.

5.2 Modeling time.

5.3 Modeling resources.

5.4 The states of our system.

5.5 Modeling decisions.

5.6 The exogenous information process.

5.7 The transition function.

5.8 The contribution function.

5.9 The objective function.

5.10 A measure-theoretic view of information.

5.11 Bibliographic notes.

Problems.

6. Stochastic approximation methods.

6.1 A stochastic gradient algorithm.

6.2 Some stepsize recipes.

6.3 Stochastic stepsizes.

6.4 Computing bias and variance.

6.5 Optimal stepsizes.

6.6 Some experimental comparisons of stepsize formulas.

6.7 Convergence.

6.8 Why does it work?

6.9 Bibliographic notes.

Problems.

7. Approximating value functions.

7.1 Approximation using aggregation.

7.2 Approximation methods using regression models.

7.3 Recursive methods for regression models.

7.4 Neural networks.

7.5 Batch processes.

7.6 Why does it work?

7.7 Bibliographic notes.

Problems.

8. ADP for finite horizon problems.

8.1 Strategies for finite horizon problems.

8.2 Q-learning.

8.3 Temporal difference learning.

8.4 Policy iteration.

8.5 Monte Carlo value and policy iteration.

8.6 The actor-critic paradigm.

8.7 Bias in value function estimation.

8.8 State sampling strategies.

8.9 Starting and stopping.

8.10 A taxonomy of approximate dynamic programming strategies.

8.11 Why does it work?

8.12 Bibliographic notes.

Problems.

9. Infinite horizon problems.

9.1 From finite to infinite horizon.

9.2 Algorithmic strategies.

9.3 Stepsizes for infinite horizon problems.

9.4 Error measures.

9.5 Direct ADP for online applications.

9.6 Finite horizon models for steady state applications.

9.7 Why does it work?

9.8 Bibliographic notes.

Problems.

10. Exploration vs. exploitation.

10.1 A learning exercise: the nomadic trucker.

10.2 Learning strategies.

10.3 A simple information acquisition problem.

10.4 Gittins indices and the information acquisition problem.

10.5 Variations.

10.6 The knowledge gradient algorithm.

10.7 Information acquisition in dynamic programming.

10.8 Bibliographic notes.

Problems.

11. Value function approximations for special functions.

11.1 Value functions versus gradients.

11.2 Linear approximations.

11.3 Piecewise linear approximations.

11.4 The SHAPE algorithm.

11.5 Regression methods.

11.6 Cutting planes.

11.7 Why does it work?

11.8 Bibliographic notes.

Problems.

12. Dynamic resource allocation.

12.1 An asset acquisition problem.

12.2 The blood management problem.

12.3 A portfolio optimization problem.

12.4 A general resource allocation problem.

12.5 A fleet management problem.

12.6 A driver management problem.

12.7 Bibliographic notes.

Problems.

13. Implementation challenges.

13.1 Will ADP work for your problem?

13.2 Designing an ADP algorithm for complex problems.

13.3 Debugging an ADP algorithm.

13.4 Convergence issues.

13.5 Modeling your problem.

13.6 Online vs. offline models.

13.7 If it works, patent it!

"Perhaps the most appealing aspect of Professor Powell’s book is the fact that it spans both theory and practice...Problems, deemed intractable a few years ago, are now easily solved by using the exhibited techniques in this book. I would strongly recommend the book to any practitioner facing complex, dynamic models involving constantly changing information streams." (IIE Transactions-Operations Engineering, 2008) 

"Focus[es] on the core … of dynamic programming with a simple and clear exposition of the material … while … elevating the standard of the theory."*(Computing Reviews, May 5, 2008)

 "Motivated by examples from modern-day operations research, Approximate Dynamic Programming is an accessible introduction to dynamic modeling and is also a valuable guide for the development of high-quality solutions to problems that exist in operations research and engineering.  (Mathematical Reviews, 2008)
