Deep Learning for Automated Planning -- Multiple Projects
People
Supervisor
- Prof Sylvie Thiébaux (computer science)
Description
Planning is one of the main areas of AI. The most basic problem in planning consists of generating a course of action -- a plan, or a policy -- that enables an agent to achieve given goals in its environment, starting from a known initial state. Planners are general-purpose solvers that take as input a description of the problem to solve (the actions available to the agent and their effects on the environment), and return a plan (or policy) achieving the problem's goals. Planners typically use search or constrained optimisation to solve the problem.
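To make this concrete, here is a minimal sketch (not from the project itself) of a STRIPS-style forward-search planner: states are sets of facts, each action has preconditions, an add list, and a delete list, and the planner searches for an action sequence reaching the goal. The toy logistics domain and all names in it are hypothetical.

```python
from collections import deque

def plan(initial, goal, actions):
    """Breadth-first search from the initial state to any state
    satisfying the goal; returns the action sequence or None."""
    frontier = deque([(frozenset(initial), [])])
    visited = {frozenset(initial)}
    while frontier:
        state, path = frontier.popleft()
        if goal <= state:                      # every goal fact holds
            return path
        for name, pre, add, delete in actions:
            if pre <= state:                   # preconditions satisfied
                succ = (state - delete) | add  # apply the action's effects
                if succ not in visited:
                    visited.add(succ)
                    frontier.append((succ, path + [name]))
    return None

# Toy domain: move a package from A to B with a truck.
# Each action is (name, preconditions, add list, delete list).
actions = [
    ("drive-A-B", frozenset({"truck-at-A"}),
     frozenset({"truck-at-B"}), frozenset({"truck-at-A"})),
    ("load-A", frozenset({"truck-at-A", "pkg-at-A"}),
     frozenset({"pkg-in-truck"}), frozenset({"pkg-at-A"})),
    ("unload-B", frozenset({"truck-at-B", "pkg-in-truck"}),
     frozenset({"pkg-at-B"}), frozenset({"pkg-in-truck"})),
]
print(plan({"truck-at-A", "pkg-at-A"}, {"pkg-at-B"}, actions))
# prints ['load-A', 'drive-A-B', 'unload-B']
```

Real planners replace the blind breadth-first search above with heuristic search; the topics below study learning those heuristics (and policies) with neural networks.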
Deep learning has become the method of choice for perception tasks in computer vision and natural language processing. However, whether and how deep learning can learn to solve reasoning tasks such as planning is very much an open question. Deep learning has mainly been used in reinforcement learning, which enables an agent to learn how to act in its environment by interacting with it and receiving rewards for good behaviour. Unfortunately, this requires very large amounts of data and compute, is unsafe, does not easily capture and exploit existing knowledge of the problem, and is overall difficult to apply to real-world problems other than computer games. Instead, new approaches are needed that combine reasoning and learning.
We have started to tackle this challenge with two previous ANU honours students, now doing their PhDs at Berkeley and MIT. We have devised new neural network architectures which exploit the structure of planning representations to learn policies and planning heuristics (cost estimators that guide the search of a planner) from examples of plans. Exploiting the structure of the planning domain significantly reduces the amount of training needed compared to reinforcement learning approaches. Another interesting aspect of the learnt policies is that they generalise to all problems within the same planning domain, including problems of different sizes. That way, one can learn policies or heuristics from a few example plans for small problems and apply the result to larger problems from the same planning domain. The results on a range of planning benchmark problems are encouraging.
This has opened many avenues of research on improving the efficiency and power of these architectures, and on verifying and explaining their results, leading to the research topics below.
Goals
- Integrating learning with search. For instance, a search procedure guided by a neural network heuristic or policy could generate its own examples to improve the guiding policy/heuristic at the same time as solving the problem.
- Learning to solve more expressive planning problems with deep learning. For instance, problems with resources or partial observability.
- More effective deep learning architectures for planning. For instance, architectures that exploit the relational structure of planning problem representations.
- Verification of policies represented by neural networks. This could verify how safe, how robust, or how general a policy is.
- Handling constraints in deep learning for planning.
- Predict+Optimise for planning. Predict+Optimise is a paradigm for solving optimisation problems in which some of the input parameters must be predicted using machine learning.
- Interpreting and explaining neural network policies.
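The first topic, integrating learning with search, can be sketched as a loop in which a heuristic search solves problems and each solution is fed back as training data for the heuristic. The tabular "model" below is a hypothetical stand-in for a neural network, and the toy state space is purely illustrative.

```python
import heapq

def greedy_search(start, goal, successors, h):
    """Greedy best-first search ordered by the learned heuristic h;
    unseen states optimistically default to 0."""
    frontier = [(h.get(start, 0), start, [start])]
    visited = {start}
    while frontier:
        _, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        for succ in successors(state):
            if succ not in visited:
                visited.add(succ)
                heapq.heappush(frontier, (h.get(succ, 0), succ, path + [succ]))
    return None

def update_heuristic(h, path):
    # Label each state on the plan with its observed cost-to-go;
    # a neural model would instead take a gradient step on these targets.
    for cost_to_go, state in enumerate(reversed(path)):
        h[state] = cost_to_go

# Toy state space: integers 0..10 with moves +1/-1.
successors = lambda s: [t for t in (s - 1, s + 1) if 0 <= t <= 10]

h = {}                       # the "learned" heuristic, refined per problem
for goal in (3, 5, 8):
    solution = greedy_search(0, goal, successors, h)
    update_heuristic(h, solution)
print(h[0], h[8])            # prints: 8 0
```

Here each solved problem improves the guidance available for the next one; the research question is how to make this loop work with neural heuristics and policies on genuinely hard planning problems.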
Requirements
- Honours/MCOMP/MLCV students should have taken Artificial Intelligence (COMP3620/COMP6320) and one of the ML courses (COMP4670/COMP8600 or COMP4660/COMP8420) or equivalent, with excellent results.
- Having taken one of the Optimisation courses (COMP4691/COMP8691 or COMP4680/COMP8650) or the Advanced AI class (COMP4620/COMP8620) would be a plus.
- Excellent programming skills.
Background Literature
- [topics 1,2,7] Sam Toyer, Sylvie Thiébaux, Felipe W. Trevizan, Lexing Xie: ASNets: Deep Learning for Generalised Planning. J. Artif. Intell. Res. 68: 1-68 (2020). Short version AAAI 2018.
- [topics 1,2,3] William Shen, Felipe W. Trevizan, Sylvie Thiébaux: Learning Domain-Independent Planning Heuristics with Hypergraph Networks. ICAPS 2020: 574-584.
- [topics 1,2,3] William Shen, Felipe W. Trevizan, Sam Toyer, Sylvie Thiébaux, Lexing Xie: Guiding Search with Generalized Policies for Probabilistic Planning. SOCS 2019: 97-105.
- [topic 1] André Hottung, Kevin Tierney: Neural Large Neighborhood Search for the Capacitated Vehicle Routing Problem. ECAI 2020: 443-450.
- [topic 3] Sankalp Garg, Aniket Bajpai, Mausam: Symbolic Network: Generalized Neural Policies for Relational MDPs. ICML 2020: 3397-340.
- [topic 4] Changliu Liu, Tomer Arnon, Christopher Lazarus, Christopher Strong, Clark Barrett, Mykel J. Kochenderfer: Algorithms for Verifying Deep Neural Networks. arXiv, 2020.
- [topic 4] Timo P. Gros, Holger Hermanns, Jörg Hoffmann, Michaela Klauck, Marcel Steinmetz: Deep Statistical Model Checking. FORTE 2020: 96-114.
- [topic 4] Marcel Vinzent, Jörg Hoffmann: Neural Network Action Policy Verification via Predicate Abstraction. Workshop on Planning and Reinforcement Learning, PRL-21.
- [topic 5] Yatin Nandwani, Abhishek Pathak, Mausam, Parag Singla: A Primal-Dual Formulation for Deep Learning with Constraints. NeurIPS 2019: 12157-12168.
- [topic 5] Fabrizio Detassis, Michele Lombardi, Michela Milano: Teaching the old dog new tricks: supervised learning with constraints. NeHuAI@ECAI 2020: 44-51.
- [topic 6] Bryan Wilder, Bistra Dilkina, Milind Tambe: Melding the Data-Decisions Pipeline: Decision-Focused Learning for Combinatorial Optimization. AAAI 2019: 1658-1665.
- [topic 6] Jayanta Mandi, Emir Demirovic, Peter J. Stuckey, Tias Guns: Smart Predict-and-Optimize for Hard Combinatorial Optimization Problems. AAAI 2020: 1603-1610.
- [topic 6] Giuseppe De Giacomo, Luca Iocchi, Marco Favorito, Fabio Patrizi: Foundations for Restraining Bolts: Reinforcement Learning with LTLf/LDLf Restraining Specifications. ICAPS 2019: 128-136.
- [topic 7] Alexey Ignatiev, Nina Narodytska, João Marques-Silva: Abduction-Based Explanations for Machine Learning Models. AAAI 2019: 1511-1519.
- [topic 7] Alexey Ignatiev, Nina Narodytska, João Marques-Silva: On Relating Explanations and Adversarial Examples. NeurIPS 2019: 15857-15867.
- [topic 7] Rebecca Eifler, Michael Cashmore, Jörg Hoffmann, Daniele Magazzeni, Marcel Steinmetz: A New Approach to Plan-Space Explanation: Analyzing Plan-Property Dependencies in Oversubscription Planning. AAAI 2020: 9818-9826.
Gain
- You will pursue state-of-the-art research in Artificial Intelligence.
- If successful, the work will hopefully be published in a top venue.
- If successful, we can easily extend this to a PhD project.
Keywords
- Planning
- Markov Decision Processes
- Deep learning
- Graph Neural Networks
- Search
- Optimisation
- Verification
- Reinforcement Learning