Dynamic Algorithm Configuration

TNT members involved in this project:

As configurations should be chosen during runtime depending on the current algorithm state, it can be viewed as a reinforcement learning (RL) problem where at each timestep an agent selects the configuration to use based on the performance in the last step and the current state of the algorithm. This enables us to use powerful RL methods on one hand; on the other, RL also brings a set of challenges like instability, noise and sample inefficiency that need to be adressed in applications such as DAC. Therefore research on DAC also includes research on reliable, interpretable, general and fast reinforcement learning.

Challenges

Large Search Space

Adapting one hyperparameter each step of an algorithm run already provides a plethora of possible hyperparameter schedules. The combination of dynamic adadptions and the optimization of multiple hyperparameters and algorithm components makes the problem much harder than the original static algorithm configuration problem which requires new optimization methods.

Complex Reasoning

It is often not immediately clear which action at which timestep will cause an improvement in an algorithm's run down the line. Any DAC method therefore needs to model complex dependecies between algorithm states, domain & instance information, performance and configurations. This requires solutions to extract knowledge from the given information to guide the learning process efficiently.

New State & Domain Descriptions

As DAC includes a temporal component in contrast to traditional algorithm configuration, DAC also requires additional information for the learning process. The state information as well as the domain information should include meaningful descriptions that help distinguish between different configuration options. Finding new ways to quantify this information content for DAC methods will help to explain their performance and behaviour as well as improve and accelerate learning.

Exemplary Application Areas

AI Planning

In AI planning (e.g., solving complex scheduling or logistic problems), efficiency is key. Therefore an important hyperparameter is the heuristic that is used to solve the planning problem. Learning heuristic schedules for planning domains is an example of a successful application of DAC, but there are also other possibilities like learning new search strategies from algorithm components or expanding the heuristic schedules across domain boundaries.

Evolutionary Algorithms

To solve complex black-box optimization problems, evolutionary algorithms are well established methods. There are several theoretical results proving that dynamic step size adaption for evolutionary strategies is superior to static values. In fact, it has been empirically shown that it also performs better than simple hand-designed schedules. There are a number of different hyperparameters and algorithm components that determine the behaviour of an ES; however, so there is a variety of possiblities to explore with DAC.

Deep Learning

One of the key technologies in AI is deep learning these days. However, the training process of deep neural networks is often guided by hand-designed heuristics, e.g., learning rate schedules. DAC opens up way to learn these heuristics from meta-data describing the learning process and thus speeding and improving learning quality in a robust fashion.

Funding

The young, but very promising field of DAC enables us to address very basic research questions, but also to apply new insights and methods to exciting applications. Regarding the basic research for the development of new methods, we are funded by the DFG. To address applications, another industry partner is also providing funding.

Recent Publications

Show all publications

Steven Adriaensen, André Biedenkapp, Gresa Shala, Noor Awad, Theresa Eimer, Marius Lindauer, Frank Hutter
Automated Dynamic Algorithm Configuration
ArXiv, May 2022
(arXiv.org) BibTeX
André Biedenkapp, David Speck, Silvan Sievers, Frank Hutter, Marius Lindauer, Jendrik Seipp
Learning Domain-Independent Policies for Open List Selection
Proceedings of the 3rd ICAPS workshop on Bridging the Gap Between AI Planning and Reinforcement Learning (PRL), pp. 1-9, 2022
BibTeX
David Speck, André Biedenkapp, Frank Hutter, Robert Mattmüller, Marius Lindauer
Learning Heuristic Selection with Dynamic Algorithm Configuration
Proceedings of the 31st International Conference on Automated Planning and Scheduling {(ICAPS'21)}, August 2021
(pdfarXiv.org) BibTeX