Dynamic Algorithm Configuration

Since configurations should be chosen at runtime depending on the current algorithm state, DAC can be viewed as a reinforcement learning (RL) problem: at each timestep, an agent selects the configuration to use based on the performance in the last step and the current state of the algorithm. This framing lets us apply powerful RL methods on one hand; on the other, RL brings its own challenges, such as instability, noise and sample inefficiency, that need to be addressed in applications such as DAC. Research on DAC therefore also includes research on reliable, interpretable, general and fast reinforcement learning.
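
To make this framing concrete, the sketch below shows the DAC interaction loop on a toy "algorithm" (gradient descent on f(x) = x^2) whose action is the step size for the next iteration. All names are illustrative; a real setup would use a benchmark environment such as DACBench and a learned policy in place of the random one.

    import random

    class ToyDACEnv:
        """Toy stand-in for a DAC environment (illustrative only): the
        'algorithm' is gradient descent on f(x) = x**2 and the action is
        the step size used for the next iteration."""

        def __init__(self, horizon=20):
            self.horizon = horizon

        def reset(self):
            self.x, self.t = 10.0, 0
            return (self.x, self.t)           # algorithm state shown to the agent

        def step(self, step_size):
            self.x -= step_size * 2 * self.x  # one gradient step on x**2
            self.t += 1
            reward = -self.x ** 2             # per-step performance signal
            return (self.x, self.t), reward, self.t >= self.horizon

    def random_policy(state, actions):
        """Placeholder for a learned RL policy."""
        return random.choice(actions)

    env, actions = ToyDACEnv(), [0.01, 0.1, 0.5]
    state, done, ret = env.reset(), False, 0.0
    while not done:
        state, reward, done = env.step(random_policy(state, actions))
        ret += reward
    print(f"episode return: {ret:.3f}")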

Large Search Space

Adapting even a single hyperparameter at each step of an algorithm run already yields a plethora of possible hyperparameter schedules. Combining dynamic adaptation with the optimization of multiple hyperparameters and algorithm components makes the problem much harder than the original static algorithm configuration problem and requires new optimization methods.
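
To get a sense of the scale: with |Θ| discrete values for a single hyperparameter and a horizon of T steps, there are already |Θ|^T possible open-loop schedules, before accounting for multiple hyperparameters or state-dependent policies. A quick illustration:

    # Open-loop schedules for one discrete hyperparameter:
    # choices ** horizon grows exponentially with the run length.
    for n_choices, horizon in [(2, 10), (5, 20), (10, 50)]:
        print(f"{n_choices} choices, {horizon} steps: "
              f"{n_choices ** horizon:.3e} schedules")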

Complex Reasoning

It is often not immediately clear which action at which timestep will improve an algorithm's run down the line. Any DAC method therefore needs to model complex dependencies between algorithm states, domain and instance information, performance and configurations. This requires solutions that extract knowledge from the given information to guide the learning process efficiently.

New State & Domain Descriptions

In contrast to traditional algorithm configuration, DAC includes a temporal component and therefore requires additional information for the learning process. The state and domain information should provide meaningful descriptions that help distinguish between different configuration options. Finding new ways to quantify this information content for DAC methods will help to explain their performance and behaviour as well as improve and accelerate learning.

AI Planning

In AI planning (e.g., solving complex scheduling or logistics problems), efficiency is key. An important hyperparameter is therefore the heuristic used to guide the search for a plan. Learning heuristic schedules for planning domains is a successful application of DAC, but there are further possibilities, such as learning new search strategies from algorithm components or extending heuristic schedules across domain boundaries.
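
As a rough operational picture of what a heuristic schedule means, the sketch below runs a greedy best-first search with one open list per heuristic and lets a policy choose which list to expand from at each step. This is a simplified toy version with made-up names, not the exact method from the publications listed below.

    import heapq
    import itertools

    def gbfs_with_heuristic_schedule(initial, goal_test, successors,
                                     heuristics, select_heuristic):
        """Greedy best-first search with one open list per heuristic; at
        each expansion a policy decides which heuristic's list to pop from."""
        tie = itertools.count()   # tie-breaker keeps heap entries comparable
        open_lists = [[(h(initial), next(tie), initial)] for h in heuristics]
        closed = set()
        for step in itertools.count():
            nonempty = [i for i, ol in enumerate(open_lists) if ol]
            if not nonempty:
                return None                         # search space exhausted
            i = select_heuristic(step, open_lists)  # dynamic configuration choice
            if i not in nonempty:
                i = nonempty[0]
            _, _, state = heapq.heappop(open_lists[i])
            if state in closed:
                continue
            closed.add(state)
            if goal_test(state):
                return state
            for succ in successors(state):
                if succ not in closed:
                    for h, ol in zip(heuristics, open_lists):
                        heapq.heappush(ol, (h(succ), next(tie), succ))

    # Toy task: reach 10 from 0 with moves +1 / +3, two heuristics, and a
    # round-robin "schedule" standing in for a learned selection policy.
    h_goal_dist = lambda s: abs(10 - s)
    h_parity = lambda s: s % 2
    alternate = lambda step, open_lists: step % 2
    print(gbfs_with_heuristic_schedule(0, lambda s: s == 10,
                                       lambda s: [s + 1, s + 3],
                                       [h_goal_dist, h_parity], alternate))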

Evolutionary Algorithms

Evolutionary algorithms are well-established methods for solving complex black-box optimization problems. Several theoretical results prove that dynamic step size adaptation for evolution strategies (ES) is superior to static values, and it has also been shown empirically to outperform simple hand-designed schedules. A number of different hyperparameters and algorithm components determine the behaviour of an ES, so there is a variety of possibilities to explore with DAC.
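
The sketch below marks the slot such a dynamic policy would fill: a (1+1)-ES whose step size for each generation is supplied by an external controller, here a hand-written decay schedule standing in for a learned DAC policy. All names are illustrative.

    import numpy as np

    def one_plus_one_es(f, x0, sigma_controller, n_steps=100, seed=0):
        """(1+1)-ES whose per-generation step size comes from an external
        controller: the slot a learned DAC policy would fill."""
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        fx = f(x)
        for t in range(n_steps):
            sigma = sigma_controller(t, fx)            # dynamic configuration
            y = x + sigma * rng.standard_normal(x.shape)
            fy = f(y)
            if fy <= fx:                               # keep the better point
                x, fx = y, fy
        return x, fx

    sphere = lambda x: float(np.dot(x, x))
    decay = lambda t, fx: 1.0 * 0.95 ** t              # hand-designed baseline schedule
    x_best, f_best = one_plus_one_es(sphere, np.ones(5), decay)
    print(f"f(x_best) = {f_best:.6f}")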

Deep Learning

Deep learning is one of the key technologies in AI these days. However, the training of deep neural networks is often guided by hand-designed heuristics, e.g., learning rate schedules. DAC opens up a way to learn these heuristics from meta-data describing the learning process, thus speeding up training and improving learning quality in a robust fashion.
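
A minimal PyTorch sketch of this idea on a toy regression task: after each training step, a controller reads recent losses (the meta-data) and adjusts the learning rate. The halving rule here is a hand-written stand-in for a learned DAC policy, not a method from the literature.

    import torch
    from torch import nn

    model = nn.Linear(10, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.MSELoss()
    data = [(torch.randn(32, 10), torch.randn(32, 1)) for _ in range(50)]

    losses = []
    for x, y in data:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())

        # Dynamic configuration step: the controller reads training
        # meta-data (here only the last two losses) and sets the next
        # learning rate; a learned DAC policy would replace this rule.
        if len(losses) >= 2 and losses[-1] > losses[-2]:
            for group in optimizer.param_groups:
                group["lr"] = max(1e-5, group["lr"] * 0.5)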

The young but very promising field of DAC enables us to address fundamental research questions as well as to apply new insights and methods to exciting applications. Our basic research on the development of new methods is funded by the DFG; to address applications, an industry partner provides additional funding.

  • Conference Contributions
    • André Biedenkapp, David Speck, Silvan Sievers, Frank Hutter, Marius Lindauer, Jendrik Seipp
      Learning Domain-Independent Policies for Open List Selection
      Proceedings of the 3rd ICAPS Workshop on Bridging the Gap Between AI Planning and Reinforcement Learning (PRL), pp. 1-9, 2022
    • David Speck, André Biedenkapp, Frank Hutter, Robert Mattmüller, Marius Lindauer
      Learning Heuristic Selection with Dynamic Algorithm Configuration
      Proceedings of the 31st International Conference on Automated Planning and Scheduling (ICAPS'21), August 2021
    • Theresa Eimer, André Biedenkapp, Maximilian Reimer, Steven Adriaensen, Frank Hutter, Marius Lindauer
      DACBench: A Benchmark Library for Dynamic Algorithm Configuration
      Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), August 2021
    • André Biedenkapp, Raghu Rajan, Frank Hutter, Marius Lindauer
      TempoRL: Learning When to Act
      Proceedings of the International Conference on Machine Learning (ICML), July 2021
    • Theresa Eimer, André Biedenkapp, Frank Hutter, Marius Lindauer
      Self-Paced Context Evaluation for Contextual Reinforcement Learning
      Proceedings of the International Conference on Machine Learning (ICML), July 2021
    • Gresa Shala, André Biedenkapp, Noor Awad, Steven Adriaensen, Marius Lindauer, Frank Hutter
      Learning Step-Size Adaptation in CMA-ES
      Proceedings of the Sixteenth International Conference on Parallel Problem Solving from Nature (PPSN'20), September 2020
    • Theresa Eimer, André Biedenkapp, Frank Hutter, Marius Lindauer
      Towards Self-Paced Context Evaluations for Contextual Reinforcement Learning
      Workshop on Inductive Biases, Invariances and Generalization in Reinforcement Learning (BIG@ICML'20), July 2020
    • André Biedenkapp, Raghu Rajan, Frank Hutter, Marius Lindauer
      Towards TempoRL: Learning When to Act
      Workshop on Inductive Biases, Invariances and Generalization in Reinforcement Learning (BIG@ICML'20), July 2020
    • David Speck, André Biedenkapp, Frank Hutter, Robert Mattmüller, Marius Lindauer
      Learning Heuristic Selection with Dynamic Algorithm Configuration
      Proceedings of the International Workshop on Bridging the Gap Between AI Planning and Reinforcement Learning at ICAPS, June 2020
    • Andre Biedenkapp, H. Furkan Bozkurt, Theresa Eimer, Frank Hutter, Marius Lindauer
      Algorithm Control: Foundation of a New Meta-Algorithmic Framework
      Proceedings of the European Conference on Artificial Intelligence (ECAI), 2020
  • Technical Report
    • Steven Adriaensen, André Biedenkapp, Gresa Shala, Noor Awad, Theresa Eimer, Marius Lindauer, Frank Hutter
      Automated Dynamic Algorithm Configuration
      arXiv, May 2022