GOOSE: Learning Heuristics for Planning and Search

Dillon Z. Chen


Abstract

Automated planning is a long-standing field in artificial intelligence concerned with finding a sequence of actions that progresses an initial state in a given world model to a desired goal state. A dominant approach for solving these problems is heuristic search, where a search algorithm explores the vast space of possible states guided by a heuristic function, h(s), which estimates the cost of reaching the goal from any given state s. The performance of these planners depends critically on the quality of this heuristic.
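To make this setup concrete, the greedy best-first search algorithm used later in the experiments can be sketched as follows. This is a minimal illustration, not GOOSE's implementation; the `is_goal`, `successors`, and `h` interfaces are hypothetical placeholders for a real planner's components.

```python
import heapq
import itertools

def greedy_best_first_search(initial_state, is_goal, successors, h):
    """Expand states in order of increasing heuristic value h(s).

    `successors(s)` yields (action, next_state) pairs and `h(s)` is any
    estimate of the cost-to-goal; returns a plan (list of actions) or None.
    """
    counter = itertools.count()  # tie-breaker so states are never compared
    frontier = [(h(initial_state), next(counter), initial_state, [])]
    visited = {initial_state}
    while frontier:
        _, _, state, plan = heapq.heappop(frontier)
        if is_goal(state):
            return plan
        for action, nxt in successors(state):
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(frontier, (h(nxt), next(counter), nxt, plan + [action]))
    return None  # search space exhausted without reaching the goal
```

A better-informed h(s) steers the loop toward the goal with fewer expansions, which is exactly what GOOSE aims to learn.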

GOOSE (Graphs Optimised fOr Search Evaluation) is a planning system that learns state-of-the-art heuristics from small, easy-to-solve training problems and generalises them to new, unseen problems with different initial states, different goals, and greater numbers of objects. GOOSE is the culmination of papers published at AAAI, ICAPS, NeurIPS, and ECAI (see References for more details). It leverages 3 key insights and techniques for learning heuristics with powerful generalisation.

  1. We leverage graph learning to handle inputs of arbitrary size, as is the case for planning problems.
  2. We employ classical, statistical machine learning, which we show theoretically and empirically to be vastly superior to deep learning in terms of sample complexity, training speed, and planning performance.
  3. We learn heuristics by ranking states rather than predicting optimal heuristic values.


Problem

We are given a set of training planning problems T from the same domain, each exhibiting a small number of objects N. Our objective is to learn, via supervision, a heuristic function h(s) that can be used in search to solve new problems that
  1. do not exist in T, and
  2. exhibit a number of objects greater than N, possibly by several orders of magnitude.

Method

GOOSE represents planning problems as graphs, embeds these arbitrarily sized graphs into fixed-size Euclidean vectors, and finally uses the vectors to learn a linear heuristic function.

Figure 1 — The GOOSE pipeline.

(1) Graph Representation

GOOSE represents planning problems, which consist of an initial state, a goal state, a set of objects, and a set of actions, as a graph in which nodes encode objects and facts, and edges encode the relations between objects and facts. Node colours and edge labels encode predicate, goal, and relation information. Notably, actions are not encoded in the graph, as their semantics are learned from training. During search, each state can be viewed as a new planning problem with the same goal and actions but a different initial state. Formally, this graph representation is referred to as the Instance Learning Graph (ILG), defined in Definition 3.1 of this paper.
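The construction above can be sketched for a toy Blocksworld state as follows. The node and edge encodings here are a simplified, hypothetical stand-in for the exact ILG of Definition 3.1: facts and objects become nodes, edges link a fact to each of its argument objects labelled by argument position, and a fact's colour records its predicate together with whether it is a non-goal state fact, an achieved goal, or an unachieved goal.

```python
def build_ilg(objects, state, goal):
    """Build a simplified ILG-style graph.

    A fact is a tuple like ('on', 'a', 'b'). Returns (nodes, edges) where
    `nodes` maps each node to a colour and `edges` holds
    (fact_node, object_node, argument_position) triples.
    """
    nodes = {obj: 'object' for obj in objects}
    edges = []
    for fact in state | goal:
        predicate = fact[0]
        if fact in state and fact in goal:
            colour = f'{predicate}:achieved-goal'
        elif fact in goal:
            colour = f'{predicate}:unachieved-goal'
        else:
            colour = f'{predicate}:state'
        nodes[fact] = colour
        # Connect the fact to each of its arguments, keeping their order.
        for pos, obj in enumerate(fact[1:]):
            edges.append((fact, obj, pos))
    return nodes, edges
```

Because only predicates, goals, and relations are encoded, any state encountered during search maps to a graph of the same vocabulary, regardless of how many objects the problem has.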

(2) Feature Embedding

The resulting graph of each planning problem and state is then embedded into a Euclidean vector via the Weisfeiler-Leman (WL) algorithm. The WL algorithm was originally devised as an incomplete graph isomorphism test, but has since been used to generate graph kernels and to benchmark graph neural network (GNN) expressivity. The WL algorithm iteratively refines node colours by aggregating the colours of neighbouring nodes; after a fixed number of iterations, the histogram of node colours is used as the graph embedding.

(3) Optimisation

Finally, GOOSE learns a linear heuristic function h(s) = w · f(s), where f(s) is the WL embedding of the ILG of state s and w is a weight vector learned during training. Originally, GOOSE learned w by predicting optimal heuristic values with Gaussian Process Regression and Support Vector Regression. Later versions leverage the insight that heuristic functions in search can be viewed as ranking functions, and learn heuristics framed as state rankings with Linear Programs and Support Vector Machines.
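The ranking idea can be illustrated with a deliberately simplified sketch: for each training pair where a child state lies one step closer to the goal than its parent, we ask the linear heuristic to score the parent at least a margin above the child. Here a perceptron-style update stands in, hypothetically, for the LP and SVM formulations GOOSE actually uses.

```python
def train_ranker(pairs, dim, epochs=50, margin=1.0, lr=0.1):
    """Learn w for h(s) = w . f(s) from ranking constraints.

    `pairs` holds (f_parent, f_child) feature vectors where the child is
    one step closer to the goal, so we want
    w . f_parent >= w . f_child + margin for every pair.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for f_parent, f_child in pairs:
            score = sum(wi * (p - c)
                        for wi, p, c in zip(w, f_parent, f_child))
            if score < margin:  # constraint violated: nudge w towards it
                for i in range(dim):
                    w[i] += lr * (f_parent[i] - f_child[i])
    return w
```

The appeal of the ranking view is that search only needs states ordered correctly, not their exact costs-to-go, which is a strictly easier learning target than regressing optimal heuristic values.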


Results

In our experiments, GOOSE heuristics are trained on planning problems with up to a few dozen objects and tested on unseen problems with up to several hundred objects. The learned heuristics (WL) are compared against the domain-independent hFF heuristic and heuristics learned with GNNs, all used in greedy best-first search.

Figure 2 — Sizes of benchmark training and testing problems on various planning domains.

WL heuristics outperform deep learning heuristics for training

Notably, WL heuristics are trained to optimality in a matter of seconds on CPUs, while GNNs may take many more seconds or minutes to converge to a suboptimal solution on GPUs. In some cases, WL heuristics are over 900 times faster to train than GNN heuristics. Furthermore, WL heuristics yield much smaller models whose features are interpretable (cf. page 8 of this paper), in contrast to deep learning models.

Figure 3 — Time to train after data preprocessing in log scale (left;↓) and number of learnable parameters (right;↓) of GNN and WL models on various planning domains.

WL heuristics outperform deep learning heuristics for planning

WL heuristics also outperform GNN heuristics when used for planning and search, even when the GNNs have access to GPUs. The state spaces of planning problems generally grow exponentially with the number of objects, yet WL heuristics solve almost 20% more problems than GNN heuristics within the same time limit. Furthermore, WL heuristics achieve highly competitive performance against hFF, a state-of-the-art domain-independent heuristic.

Figure 4 — Cumulative number (↑) of problems solved over time.


References

Below are relevant papers on GOOSE published at major AI and ML conferences in chronological order.



© 2025 Dillon Z. Chen. All rights reserved.