Introduction by Example

This short example shows how to use WLPlan to generate features for planning problems and states, which can then be used to train a regression model. The corresponding notebook is available here.

A longer example of using WLPlan for training, inference and search in Python is available in this test file. This notebook only contains the training part.

The GOOSE planner provides an optimised usage of WLPlan that implements training in Python, and inference and search in C++.

Setup

We begin by installing and importing some Python packages:

  • pymdzcf: a mimir fork (imported as pymimir) for generating state successors

  • wlplan: for generating feature embeddings from planning data

  • tqdm: for displaying progress bars

  • numpy: for representing feature embeddings efficiently for training

  • scikit-learn: for training regression models

[ ]:
%pip install pymdzcf==0.1.0 wlplan tqdm numpy scikit-learn

import os
import numpy as np
import pickle
import pymimir
import wlplan
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import DotProduct
from tqdm import tqdm
from wlplan.data import DomainDataset, ProblemDataset
from wlplan.feature_generator import init_feature_generator
from wlplan.planning import State, parse_domain, parse_problem

Parse Data

The most code-intensive part of a machine learning pipeline is usually data handling, and planning is no exception: most of the code in this example is spent on parsing data. Here, we parse training data in the form of (state, optimal_cost_to_go) pairs. We use mimir to generate state successors, but any other parser or method works as long as the data is eventually represented as a wlplan.data.DomainDataset. The code below labels each state with the number of remaining plan steps, which equals the optimal cost-to-go only if the training plans are optimal and actions have unit cost.

[ ]:
domain_pddl = "blocksworld/domain.pddl"

wlplan_domain = parse_domain(domain_pddl)
mimir_domain = pymimir.DomainParser(str(domain_pddl)).parse()

wlplan_data = []
y = []

# Loop over problems
for f in tqdm(sorted(os.listdir("blocksworld/training_plans"))):
    problem_pddl = "blocksworld/training/" + f.replace(".plan", ".pddl")
    plan_file = "blocksworld/training_plans/" + f

    # Parse problem with mimir
    mimir_problem = pymimir.ProblemParser(str(problem_pddl)).parse(mimir_domain)
    mimir_state = mimir_problem.create_state(mimir_problem.initial)

    name_to_schema = {s.name: s for s in mimir_domain.action_schemas}
    name_to_object = {o.name: o for o in mimir_problem.objects}

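    # Map predicate names to wlplan predicates and collect the (positive) goal atoms.
    # Note: positive_goals is not used further below; parse_problem builds the
    # wlplan problem directly from the PDDL files.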
    name_to_predicate = {p.name: p for p in wlplan_domain.predicates}
    positive_goals = []
    for literal in mimir_problem.goal:
        assert not literal.negated
        mimir_atom = literal.atom
        wlplan_atom = wlplan.planning.Atom(
            predicate=name_to_predicate[mimir_atom.predicate.name],
            objects=[o.name for o in mimir_atom.terms],
        )
        positive_goals.append(wlplan_atom)

    wlplan_problem = parse_problem(domain_pddl, problem_pddl)

    # Collect the actions of the plan, one per line, e.g. "(schema arg1 arg2)"
    actions = []
    with open(plan_file, "r") as plan_fh:
        for line in plan_fh:
            # Skip blank lines and comment lines starting with ";"
            if not line.strip() or line.startswith(";"):
                continue
            action_name = line.strip().replace("(", "").replace(")", "")
            toks = action_name.split()
            # First token is the schema name, the rest are object names
            schema = name_to_schema[toks[0]]
            args = [name_to_object[arg] for arg in toks[1:]]
            action = pymimir.Action.new(mimir_problem, schema, args)
            actions.append(action)

    # Collect plan trace states
    wlplan_states = []

    def mimir_to_wlplan_state(mimir_state: pymimir.State):
        atoms = []
        for atom in mimir_state.get_atoms():
            wlplan_atom = wlplan.planning.Atom(
                predicate=name_to_predicate[atom.predicate.name],
                objects=[o.name for o in atom.terms],
            )
            atoms.append(wlplan_atom)
        return State(atoms)

    h_opt = len(actions)
    wlplan_states.append(mimir_to_wlplan_state(mimir_state))
    y.append(h_opt)
    for action in actions:
        h_opt -= 1
        mimir_state = action.apply(mimir_state)
        wlplan_states.append(mimir_to_wlplan_state(mimir_state))
        y.append(h_opt)

    wlplan_data.append(ProblemDataset(problem=wlplan_problem, states=wlplan_states))

# This is what we need to feed into our feature generator below
dataset = DomainDataset(domain=wlplan_domain, data=wlplan_data)

# Save the dataset for future use
with open("wlplan-blocks.pkl", "wb") as f:
    pickle.dump((wlplan_domain, dataset, y), f)
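
The plan-parsing and labelling steps in the loop above can be sketched in isolation, without mimir or WLPlan. This is only an illustrative sketch: the helper name parse_plan_line and the two-step plan text are hypothetical and not part of the original notebook. It assumes, as the loop does, that each plan step has unit cost, so a plan with n steps yields n + 1 states labelled n, n - 1, ..., 0.

```python
def parse_plan_line(line):
    """Return (schema_name, object_names), or None for blank/comment lines.

    Hypothetical helper mirroring the string handling in the loop above.
    """
    line = line.strip()
    if not line or line.startswith(";"):
        return None
    toks = line.replace("(", "").replace(")", "").split()
    return toks[0], toks[1:]


# Hypothetical plan text in the usual "(schema arg1 arg2 ...)" format
plan_text = """(unstack b1 b2)
(putdown b1)
; cost = 2 (unit cost)
"""

steps = [s for s in map(parse_plan_line, plan_text.splitlines()) if s is not None]

# One label per state on the plan trace: the initial state gets len(steps),
# and each applied action reduces the remaining cost by one.
labels = list(range(len(steps), -1, -1))

print(steps)
print(labels)
```

The labels list has one more entry than the plan has steps because the initial state is labelled as well, which matches how y is filled in the loop above.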

Generating WL Features

The following code shows how to generate matrix embeddings of planning data with WLPlan in a few lines. Specifically, it converts planning problems and states into graphs and embeds the resulting graphs into feature vectors in one go, as illustrated in the following figure.

[Figure: planning problems and states are converted into graphs, which are then embedded into feature vectors]

[ ]:
feature_generator = init_feature_generator(
    feature_algorithm="wl",
    domain=wlplan_domain,
    graph_representation="ilg",
    iterations=4,
    pruning="none",
    multiset_hash=True,
)
feature_generator.collect(dataset)
X = np.array(feature_generator.embed(dataset)).astype(float)
y = np.array(y)
print(f"{X.shape=}")
print(f"{y.shape=}")
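
To give some intuition for what a WL feature generator computes, here is a minimal sketch of Weisfeiler-Leman colour refinement on toy graphs. It is not WLPlan's implementation: the function wl_colour_counts, the shared palette dictionary and the toy graphs are all hypothetical, and the sketch ignores details such as edge labels that a planning graph representation may use. The shared palette plays a role loosely analogous to collecting features over a dataset before embedding.

```python
from collections import Counter


def wl_colour_counts(adj, labels, iterations, palette):
    """Count the colours seen over `iterations` rounds of WL refinement.

    adj:     dict mapping each node to a list of its neighbours
    labels:  dict mapping each node to its initial label
    palette: dict shared across graphs, mapping colour signatures to integer ids
    """

    def colour_id(signature):
        # Assign the next free integer to a signature not seen before
        return palette.setdefault(signature, len(palette))

    colours = {v: colour_id(("init", labels[v])) for v in adj}
    counts = Counter(colours.values())
    for _ in range(iterations):
        # New colour = own colour plus the sorted multiset of neighbour colours
        colours = {
            v: colour_id((colours[v], tuple(sorted(colours[u] for u in adj[v]))))
            for v in adj
        }
        counts.update(colours.values())
    return counts


def to_vector(counts, palette):
    # Fixed-length feature vector: one count per known colour id
    return [counts.get(i, 0) for i in range(len(palette))]


palette = {}
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
relabelled_path = {"x": ["y"], "y": ["x", "z"], "z": ["y"]}
triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}

f_path = wl_colour_counts(path, {v: "obj" for v in path}, 1, palette)
f_same = wl_colour_counts(relabelled_path, {v: "obj" for v in relabelled_path}, 1, palette)
f_tri = wl_colour_counts(triangle, {v: "obj" for v in triangle}, 1, palette)

print(to_vector(f_path, palette))  # [3, 2, 1]
print(to_vector(f_tri, palette))   # [3, 0, 3]
```

The two path graphs differ only in node names and therefore receive identical feature vectors, while the triangle is distinguished after a single iteration.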

Training a Linear Regression Model

The following code shows how off-the-shelf ML libraries such as scikit-learn can now be used to train a regression model that predicts heuristic values. Here we fit a Gaussian process regressor with a linear (dot product) kernel. The reported loss is the mean squared error on the training data itself, and it should be very close to zero.

[ ]:
linear_kernel = DotProduct(sigma_0=0, sigma_0_bounds="fixed")
model = GaussianProcessRegressor(kernel=linear_kernel, alpha=1e-7, random_state=0)
model.fit(X, y)
y_pred = model.predict(X)
loss = np.mean((y - y_pred) ** 2)
print(f"{loss=}")
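
The heading calls this linear regression because the posterior mean of a Gaussian process with a dot product kernel and noise term alpha coincides with ridge regression with regularisation strength alpha. The following NumPy sketch checks this on toy data. The arrays X_toy and y_toy are made up for illustration, and the sketch uses the textbook formulas directly rather than scikit-learn's Cholesky-based implementation.

```python
import numpy as np

# Toy feature matrix and targets that are exactly linear in the features
X_toy = np.array([[2.0, 0.0], [1.0, 1.0], [0.0, 3.0], [1.0, 2.0]])
w_true = np.array([1.0, 2.0])
y_toy = X_toy @ w_true
alpha = 1e-7  # same noise/regularisation value as in the cell above

# Kernel form: GP posterior mean with linear kernel k(x, x') = x . x'
K = X_toy @ X_toy.T
pred_kernel = K @ np.linalg.solve(K + alpha * np.eye(len(X_toy)), y_toy)

# Weight-space form: ridge regression with the same alpha
w = np.linalg.solve(
    X_toy.T @ X_toy + alpha * np.eye(X_toy.shape[1]), X_toy.T @ y_toy
)
pred_ridge = X_toy @ w

# Training mean squared error, computed as in the cell above
mse = float(np.mean((y_toy - pred_kernel) ** 2))
print(f"{mse=}")
```

Both forms give the same training predictions up to numerical error. The training loss is close to zero here because the toy targets are an exact linear function of the features.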