Deep Learning for Automated Planning -- Multiple Projects




Planning is one of the main areas of AI. The most basic problem in planning consists of generating a course of action -- a plan, or a policy -- that enables an agent to achieve given goals in its environment, starting from a known initial state. Planners are general-purpose solvers: they take as input a description of the problem to solve (the actions available to the agent and their effects on the environment) and return a plan (or policy) achieving the problem's goal. Planners typically use search or constrained optimisation to solve the problem.
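To make this concrete, here is a minimal sketch of a classical planner, assuming a simplified STRIPS-style encoding where each action is a triple of precondition, add, and delete sets of ground facts (the domain and fact names below are illustrative, not from any benchmark):

```python
from collections import deque

def plan(initial, goal, actions):
    """Breadth-first search for a shortest sequence of actions reaching the goal.

    `actions` maps an action name to (preconditions, add effects, delete effects),
    each a set of ground facts represented as strings.
    """
    start = frozenset(initial)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:          # all goal facts hold in this state
            return steps
        for name, (pre, add, dele) in actions.items():
            if pre <= state:       # action is applicable
                nxt = frozenset((state - dele) | add)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None                    # no plan exists

# Toy logistics domain: move a package from A to B using a truck.
actions = {
    "load":   ({"pkg-at-A", "truck-at-A"}, {"pkg-in-truck"}, {"pkg-at-A"}),
    "drive":  ({"truck-at-A"}, {"truck-at-B"}, {"truck-at-A"}),
    "unload": ({"pkg-in-truck", "truck-at-B"}, {"pkg-at-B"}, {"pkg-in-truck"}),
}
print(plan({"pkg-at-A", "truck-at-A"}, {"pkg-at-B"}, actions))
# ['load', 'drive', 'unload']
```

Real planners replace the blind breadth-first search with heuristic search over far larger state spaces, but the input/output contract is the same: a problem description in, a plan out.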

Deep learning has become the method of choice for perception tasks in computer vision and natural language processing. However, whether and how deep learning can learn to solve reasoning tasks such as planning is very much an open question. So far, deep learning has mainly been used in reinforcement learning, which enables an agent to learn how to act in its environment by interacting with it and receiving rewards for good behaviour. Unfortunately, reinforcement learning requires very large amounts of data and compute, is unsafe, does not easily capture and exploit existing knowledge of the problem, and is overall difficult to apply to real-world problems other than computer games. Instead, new approaches are needed that combine reasoning and learning.

We have started to tackle this challenge with two previous ANU honours students, now doing their respective PhDs at Berkeley and MIT. We have devised new neural network architectures that exploit the structure of planning representations to learn policies and planning heuristics (cost estimators that guide a planner's search) from example plans. Exploiting the structure of the planning domain significantly reduces the amount of training needed compared to reinforcement learning approaches. Another interesting aspect of the learnt policies is that they generalise to all problems within the same planning domain, including problems of different size: one can learn policies or heuristics from a few example plans for small problems and apply the result to larger problems from the same domain. The results on a range of planning benchmarks are encouraging.
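The supervision behind such learnt heuristics can be sketched very simply: each example plan yields one training pair per state along the trajectory, labelling the state with its remaining cost to the goal. The helper below is an illustrative stand-in, not our actual training pipeline:

```python
def heuristic_training_data(plans):
    """Turn example plans into (state, cost-to-go) supervision pairs.

    Each plan is given as the list of states along its trajectory; the label
    for a state is the number of steps remaining until the goal state.
    """
    pairs = []
    for states in plans:
        for i, s in enumerate(states):
            pairs.append((s, len(states) - 1 - i))  # steps remaining to the goal
    return pairs

# One 3-step example plan over abstract states s0..s3 (s3 is the goal state):
example = [["s0", "s1", "s2", "s3"]]
print(heuristic_training_data(example))
# [('s0', 3), ('s1', 2), ('s2', 1), ('s3', 0)]
```

A neural network regressor trained on such pairs -- with states encoded via the relational structure of the domain rather than as flat vectors -- is what allows the learnt heuristic to transfer to larger problems of the same domain.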

This has opened many avenues of research on how to improve the efficiency and power of these architectures, and on verifying and explaining their results. These lead to the research topics below.


The student will work on one of the following research topics:
  1. Integrating learning with search. For instance, a search procedure guided by a neural network heuristic or policy could generate its own examples to improve the guiding policy/heuristic at the same time as solving the problem.
  2. Learning to solve more expressive planning problems with deep learning. For instance, problems with resources or partial observability.
  3. More effective deep learning architectures for planning. For instance, architectures that exploit the relational structure of planning problems representations.
  4. Verification of policies represented by neural networks. This could verify how safe, how robust, or how general a policy is.
  5. Handling constraints in deep learning for planning.
  6. Predict+Optimise for planning. Predict+Optimise is a paradigm to solve optimisation problems for which some of the input parameters must be predicted using ML.
  7. Interpreting and explaining neural network policies.
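Topic 1 above can be illustrated with a small bootstrap loop: a greedy best-first search is guided by a cost-to-go estimate, and every solved problem feeds exact cost-to-go values back into that estimate. The sketch below uses a lookup table in place of a neural network and a toy domain with made-up fact names; it is an assumption-laden illustration of the idea, not the project's actual method:

```python
import heapq

# Toy logistics domain: move a package from A to B using a truck.
ACTIONS = {
    "load":     ({"pkg-at-A", "truck-at-A"}, {"pkg-in-truck"}, {"pkg-at-A"}),
    "drive-AB": ({"truck-at-A"}, {"truck-at-B"}, {"truck-at-A"}),
    "drive-BA": ({"truck-at-B"}, {"truck-at-A"}, {"truck-at-B"}),
    "unload":   ({"pkg-in-truck", "truck-at-B"}, {"pkg-at-B"}, {"pkg-in-truck"}),
}

learned = {}  # state -> cost-to-go estimate, refined as problems get solved

def h(state, goal):
    """Learned estimate if available; otherwise fall back to goal counting."""
    return learned.get(state, len(goal - state))

def solve(initial, goal):
    """Greedy best-first search guided by h; solutions improve h in turn."""
    start = frozenset(initial)
    frontier = [(h(start, goal), 0, start, [start])]
    seen, tie = {start}, 1
    while frontier:
        _, _, state, traj = heapq.heappop(frontier)
        if goal <= state:
            # Bootstrap step: record exact cost-to-go along the found solution.
            for i, s in enumerate(traj):
                cost = len(traj) - 1 - i
                learned[s] = min(learned.get(s, float("inf")), cost)
            return traj
        for pre, add, dele in ACTIONS.values():
            if pre <= state:
                nxt = frozenset((state - dele) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(frontier, (h(nxt, goal), tie, nxt, traj + [nxt]))
                    tie += 1
    return None

traj = solve({"pkg-at-A", "truck-at-A"}, {"pkg-at-B"})
print(len(traj) - 1)  # plan length: 3
```

In the research project, the lookup table would be replaced by a generalising neural network, so that experience gained on small solved problems transfers to unseen, larger ones.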


Requirements

  • Honours/MCOMP/MLCV students should have taken Artificial Intelligence (COMP3620/COMP6320) and one of the ML courses (COMP4670/COMP8600 or COMP4660/COMP8420), or equivalent, with excellent results.
  • Having taken one of the optimisation courses (COMP4691/COMP8691 or COMP4680/COMP8650) or the Advanced AI course (COMP4620/COMP8620) would be a plus.
  • Excellent programming skills.

Background Literature

Depending on the topic chosen, additional references will be provided.


Gain

  • You will pursue state-of-the-art research in Artificial Intelligence.
  • If successful, the work will hopefully be published in a top venue.
  • If successful, the project can easily be extended into a PhD project.


Keywords
  • Planning
  • Markov Decision Processes
  • Deep learning
  • Graph Neural Networks
  • Search
  • Optimisation
  • Verification
  • Reinforcement Learning

Updated:  10 August 2021/Responsible Officer:  Dean, CECS/Page Contact:  CECS Marketing