{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Introduction by Example\n", "=======================" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This short example will demonstrate how you can use WLPlan for generating features for planning problems and states which you can then use to train a regression model. The corresponding notebook is available [here](https://github.com/DillonZChen/wlplan/blob/main/docs/tutorials/1_introduction.ipynb).\n", "\n", "A longer example of using WLPlan for training, inference and search in Python is available in this [test file](https://github.com/DillonZChen/wlplan/blob/main/tests/test_train_eval_blocks.py). This notebook only contains the training part.\n", "\n", "The [GOOSE](https://github.com/DillonZChen/goose) planner provides an optimised usage of WLPlan that implements training in Python, and inference and search in C++." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We begin by installing and importing some Python packages\n", "- `pymdzcf`: a [mimir](https://github.com/simon-stahlberg/mimir) fork for generating state successors\n", "- `wlplan`: for generating feature embeddings from planning data\n", "- `tqdm`: for displaying progress bars\n", "- `numpy`: for representing feature embeddings efficiently for training\n", "- `scikit-learn`: for training regression models" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install pymdzcf==0.1.0 wlplan tqdm numpy scikit-learn\n", "\n", "import os\n", "import numpy as np\n", "import pickle\n", "import pymimir\n", "import wlplan\n", "from sklearn.gaussian_process import GaussianProcessRegressor\n", "from sklearn.gaussian_process.kernels import DotProduct\n", "from tqdm import tqdm\n", "from wlplan.data import DomainDataset, ProblemDataset\n", "from wlplan.feature_generator import init_feature_generator\n", "from wlplan.planning import State, parse_domain, parse_problem" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Parse Data\n", "The most code intensive part of a machine learning pipeline is usually the handling of data. This is no exception for planning, as you will see that most of the code in this example is spent on parsing data. Here, we parse training data in the form of `(state, optimal_cost_to_go)` pairs using a parser of your choice. We choose to use the [mimir](https://github.com/simon-stahlberg/mimir) for generating state successors but any other method can do as long as the data is eventually represented in a `wlplan.data.DomainDataset` dataset." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "domain_pddl = \"blocksworld/domain.pddl\"\n", "\n", "wlplan_domain = parse_domain(domain_pddl)\n", "mimir_domain = pymimir.DomainParser(str(domain_pddl)).parse()\n", "\n", "wlplan_data = []\n", "y = []\n", "\n", "# Loop over problems\n", "for f in tqdm(sorted(os.listdir(\"blocksworld/training_plans\"))):\n", " problem_pddl = \"blocksworld/training/\" + f.replace(\".plan\", \".pddl\")\n", " plan_file = \"blocksworld/training_plans/\" + f\n", "\n", " # Parse problem with mimir\n", " mimir_problem = pymimir.ProblemParser(str(problem_pddl)).parse(mimir_domain)\n", " mimir_state = mimir_problem.create_state(mimir_problem.initial)\n", "\n", " name_to_schema = {s.name: s for s in mimir_domain.action_schemas}\n", " name_to_object = {o.name: o for o in mimir_problem.objects}\n", "\n", " # Construct wlplan problem\n", " name_to_predicate = {p.name: p for p in wlplan_domain.predicates}\n", " positive_goals = []\n", " for literal in mimir_problem.goal:\n", " assert not literal.negated\n", " mimir_atom = literal.atom\n", " wlplan_atom = wlplan.planning.Atom(\n", " predicate=name_to_predicate[mimir_atom.predicate.name],\n", " objects=[o.name for o in mimir_atom.terms],\n", " )\n", " positive_goals.append(wlplan_atom)\n", "\n", " wlplan_problem = parse_problem(domain_pddl, problem_pddl)\n", " \n", " # Collect actions\n", " actions = []\n", " with open(plan_file, \"r\") as f:\n", " lines = f.readlines()\n", " for line in lines:\n", " if line.startswith(\";\"):\n", " continue\n", " action_name = line.strip()\n", " action_name = action_name.replace(\"(\", \"\")\n", " action_name = action_name.replace(\")\", \"\")\n", " toks = action_name.split(\" \")\n", " schema = toks[0]\n", " schema = name_to_schema[schema]\n", " args = toks[1:]\n", " args = [name_to_object[arg] for arg in args]\n", " action = pymimir.Action.new(mimir_problem, schema, args)\n", " actions.append(action)\n", "\n", " # Collect plan trace states\n", " wlplan_states = []\n", "\n", " def mimir_to_wlplan_state(mimir_state: pymimir.State):\n", " atoms = []\n", " for atom in mimir_state.get_atoms():\n", " wlplan_atom = wlplan.planning.Atom(\n", " predicate=name_to_predicate[atom.predicate.name],\n", " objects=[o.name for o in atom.terms],\n", " )\n", " atoms.append(wlplan_atom)\n", " return State(atoms)\n", " \n", " h_opt = len(actions)\n", " wlplan_states.append(mimir_to_wlplan_state(mimir_state))\n", " y.append(h_opt)\n", " for action in actions:\n", " h_opt -= 1\n", " mimir_state = action.apply(mimir_state)\n", " wlplan_states.append(mimir_to_wlplan_state(mimir_state))\n", " y.append(h_opt)\n", "\n", " wlplan_data.append(ProblemDataset(problem=wlplan_problem, states=wlplan_states))\n", "\n", "# This is what we need to feed into our feature generator below\n", "dataset = DomainDataset(domain=wlplan_domain, data=wlplan_data)\n", "\n", "# Save the dataset for future use\n", "with open(\"wlplan-blocks.pkl\", \"wb\") as f:\n", " pickle.dump((wlplan_domain, dataset, y), f)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generating WL Features\n", "The following code demonstrates in a matter of lines how to generate matrix embeddings of planning data using WLPlan. 
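{ "cell_type": "markdown", "metadata": {}, "source": [ "Since parsing can take a while on larger problem collections, the cell above pickles the parsed data. On later runs you can skip re-parsing and reload it directly; a minimal sketch, assuming `wlplan-blocks.pkl` was written by the cell above:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional: reload the previously parsed dataset instead of re-parsing.\n", "with open(\"wlplan-blocks.pkl\", \"rb\") as f:\n", "    wlplan_domain, dataset, y = pickle.load(f)" ] },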
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Generating WL Features\n", "The following code demonstrates how to generate matrix embeddings of planning data with WLPlan in just a few lines. Specifically, it runs the whole pipeline in one go: planning problems and states are converted into graphs, and the resulting graphs are embedded into feature vectors, as illustrated in the following figure.\n", "\n", "![](../../docs/source/_static/images/wlplan.png)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "feature_generator = init_feature_generator(\n", "    feature_algorithm=\"wl\",\n", "    domain=wlplan_domain,\n", "    graph_representation=\"ilg\",\n", "    iterations=4,\n", "    pruning=\"none\",\n", "    multiset_hash=True,\n", ")\n", "\n", "# Collect features from the dataset, then embed it into a feature matrix\n", "feature_generator.collect(dataset)\n", "X = np.array(feature_generator.embed(dataset)).astype(float)\n", "y = np.array(y)\n", "print(f\"{X.shape=}\")\n", "print(f\"{y.shape=}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training a Linear Regression Model\n", "\n", "The following code demonstrates how we can now use out-of-the-box ML libraries such as [scikit-learn](https://scikit-learn.org) to train regression models that predict heuristic functions. Here we fit a Gaussian process with a fixed dot-product kernel, which amounts to linear regression. The resulting mean squared error on the training data should be very small, close to zero." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A dot-product kernel with fixed hyperparameters yields a linear model\n", "linear_kernel = DotProduct(sigma_0=0, sigma_0_bounds=\"fixed\")\n", "model = GaussianProcessRegressor(kernel=linear_kernel, alpha=1e-7, random_state=0)\n", "model.fit(X, y)\n", "\n", "# Measure the mean squared error on the training data\n", "y_pred = model.predict(X)\n", "loss = np.mean((y - y_pred) ** 2)\n", "print(f\"{loss=}\")" ] },
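{ "cell_type": "markdown", "metadata": {}, "source": [ "To see the learned model acting as a heuristic function, we can compare its predictions against the true optimal costs-to-go for a few training states. This is a minimal sketch reusing the `X` and `y` arrays from above; for proper inference on unseen states during search, see the [test file](https://github.com/DillonZChen/wlplan/blob/main/tests/test_train_eval_blocks.py) and [GOOSE](https://github.com/DillonZChen/goose) linked at the start of this notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compare predicted vs. true cost to go on a handful of training states\n", "for x_row, h_true in zip(X[:5], y[:5]):\n", "    h_pred = model.predict([x_row])[0]\n", "    print(f\"true h = {h_true}, predicted h = {h_pred:.2f}\")" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" } }, "nbformat": 4, "nbformat_minor": 2 }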