Add notebook and data from July 2024 webinar (#81)

Gurobi · Jul 22, 2024 · 6c67c61 · 6c67c61
1 parent 722ea80
commit 6c67c61
Show file tree

Hide file tree

Showing 3 changed files with 384 additions and 0 deletions.
diff --git a/webinar/2407-data-first/mcfdata.v1.xlsx b/webinar/2407-data-first/mcfdata.v1.xlsx
diff --git a/webinar/2407-data-first/mcfdata.v2.xlsx b/webinar/2407-data-first/mcfdata.v2.xlsx
diff --git a/webinar/2407-data-first/netflow.ipynb b/webinar/2407-data-first/netflow.ipynb
@@ -0,0 +1,384 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Multicommodity Flow Example Using `gurobipy-pandas`\n",
+    "\n",
+    "#### Author:  Irv Lustig, Optimization Principal, Princeton Consultants\n",
+    "\n",
+    "Solve a multi-commodity flow problem.  There are multiple products, which can be \n",
+    "produced in multiple locations, and have to be shipped over a network to other locations.\n",
+    "Each location may have supply and/or demand for any product.  The network may have\n",
+    "transhipment locations where freight is interchanged. For each arc in the network, there is \n",
+    "a limited capacity of the total products that can be carried.  Each arc also has a product-specific\n",
+    "cost for shipping one unit of the product on that arc.\n",
+    "\n",
+    "This example is based on `netflow.py` that is supplied by Gurobi.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Import necessary libraries\n",
+    "\n",
+    "- `IPython.display` is used to improve the display of `pandas` `Series` by converting them to `DataFrame` for output\n",
+    "- `PyQt5.QtWidgets` allows prompting for a data file\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from IPython.display import display\n",
+    "import pandas as pd\n",
+    "import gurobipy as grb\n",
+    "import gurobipy_pandas as gppd"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%gui qt\n",
+    "\n",
+    "from PyQt5.QtWidgets import QFileDialog\n",
+    "\n",
+    "def gui_fname(dir=None):\n",
+    "    \"\"\"Select a file via a dialog and return the file name.\"\"\"\n",
+    "    if dir is None: dir ='./'\n",
+    "    fname = QFileDialog.getOpenFileName(None, \"Select data file...\", \n",
+    "                dir, filter=\"All files (*);; SM Files (*.sm)\")\n",
+    "    return fname[0]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Get the file from a prompt"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "filename = gui_fname()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Read Data using `pandas`\n",
+    "\n",
+    "Read in the data from an Excel file. Converts the data into a dictionary of `pandas` `Series`, with the assumption that the last column is the data column."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "raw_data = pd.read_excel(filename, sheet_name=None)\n",
+    "data = {\n",
+    "    k: df.set_index(df.columns[:-1].to_list())[df.columns[-1]]\n",
+    "    for k, df in raw_data.items()\n",
+    "}\n",
+    "for k, v in data.items():\n",
+    "    print(k)\n",
+    "    display(v.to_frame())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Data Model\n",
+    "\n",
+    "## Sets\n",
+    "\n",
+    "| Notation | Meaning |  Table Locations |\n",
+    "| ---- | --------------------------- |  ----------- | \n",
+    "| $\\mathcal N$ | Set of network nodes | `cost`: Columns `From`, `To` <br>  `capacity`: Columns `From`, `To` <br> `supply`: Column `Node` <br> `demand`: Column `Node` |\n",
+    "| $\\mathcal P$ | Set of products (commodities) | `cost`: Column `Product` <br> `supply`: Column `Product` <br> `demand`: Column `Product` |\n",
+    "| $\\mathcal A$ | Set of arcs $(n_f,n_t)$, $n_f,n_t\\in\\mathcal A | `cost`: Columns `From`, `To` |\n",
+    "| $\\mathcal P_a$ | Set of products $p\\in\\mathcal P$ that can be carried on arc $a\\in\\mathcal A$ | `cost`: Columns `Product`, `From`, `To` |\n",
+    "| $\\mathcal A_p$ | Set of arcs $a\\in\\mathcal A$ that can carry product $p\\in\\mathcal P$ | `cost`: Columns `Product`, `From`, `To` |\n",
+    "\n",
+    "\n",
+    "\n",
+    "## Numerical Input Values\n",
+    "\n",
+    "The input data is converted to pandas `Series`, so the name of each `Series` is also the name of the value.\n",
+    "\n",
+    "| Notation | Meaning |  Table Name/Value Column | Index Columns \n",
+    "| ---- | --------------------------- |  ------ | ---------- |\n",
+    "| $\\kappa_a$ | Capacity of arc $a\\in\\mathcal A$ | `capacity` |  `From`, `To` |\n",
+    "| $\\pi_{ap}$ | Cost of carrying product $p$ on arc $a\\in\\mathcal A$, $p\\in\\mathcal P_a$,  | `cost` | `Product`, `From`, `To` |\n",
+    "| $\\sigma_{pn}$ | Supply of product $p\\in\\mathcal P$ at node $n\\in\\mathcal N$. Defaults to 0 | `supply` | `Node` |\n",
+    "| $\\delta_{pn}$ | Demand of product $p\\in\\mathcal P$ at node $n\\in\\mathcal N$. Defaults to 0 | `demand` | `Node` |"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Compute Sets\n",
+    "\n",
+    "- The set $\\mathcal P$ of products can appear in any of the tables `supply`, `demand` and `cost` .\n",
+    "- The set $\\mathcal N$ of nodes can appear in  any of the tables `capacity`, `supply`, `demand` and `cost`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "commodities = set(\n",
+    "    pd.concat(\n",
+    "        [\n",
+    "            data[dfname].index.to_frame()[\"Product\"]\n",
+    "            for dfname in [\"supply\", \"demand\", \"cost\"]\n",
+    "        ]\n",
+    "    ).unique()\n",
+    ")\n",
+    "commodities"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "nodes = set(\n",
+    "    pd.concat(\n",
+    "        [\n",
+    "            data[dfname].index.to_frame()[fromto].rename(\"Node\")\n",
+    "            for dfname in [\"capacity\", \"cost\"]\n",
+    "            for fromto in [\"From\", \"To\"]\n",
+    "        ]\n",
+    "        + [data[dfname].index.to_frame()[\"Node\"] for dfname in [\"supply\", \"demand\"]]\n",
+    "    ).unique()\n",
+    ")\n",
+    "\n",
+    "nodes"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Compute the Net Flow for each node\n",
+    "\n",
+    "The net flow $\\mu_{pn}$ for each product $p\\in\\mathcal P$ and node $n\\in\\mathcal N$ is the sum of the supply less the demand.  For transshipment nodes, this value is 0.  This is called `inflow` in the code."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "inflow = pd.concat(\n",
+    "    [\n",
+    "        data[\"supply\"].rename(\"net\"),\n",
+    "        data[\"demand\"].rename(\"net\") * -1,\n",
+    "        pd.Series(\n",
+    "            0,\n",
+    "            index=pd.MultiIndex.from_product(\n",
+    "                [commodities, nodes], names=[\"Product\", \"Node\"]\n",
+    "            ),\n",
+    "            name=\"net\",\n",
+    "        ),\n",
+    "    ]\n",
+    ").groupby([\"Product\", \"Node\"]).sum()\n",
+    "inflow"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Create the Gurobi Model"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "m = grb.Model(\"netflow\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Model\n",
+    "\n",
+    "### Decision Variables\n",
+    "\n",
+    "The model will have one set of decision variables:\n",
+    "- $X_{pa}$ for $p\\in\\mathcal P$, $a\\in\\mathcal A_p$ represents the amount shipped of product $p$ on arc $a$.  We will call this variable `flow` in the code.\n",
+    "\n",
+    "The cost of shipment is $\\pi_{ap}$.  \n",
+    "\n",
+    "This defines the objective function:\n",
+    "$$\n",
+    "\\text{minimize}\\quad\\sum_{a\\in\\mathcal A}\\sum_{p\\in\\mathcal P_a} \\pi_{ap}X_{pa}\n",
+    "$$"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "flow = gppd.add_vars(m, data[\"cost\"], obj=data[\"cost\"], name=\"flow\")\n",
+    "m.update()\n",
+    "flow"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Constraints\n",
+    "\n",
+    "#### Flow on each arc is capacitated\n",
+    "\n",
+    "$$\n",
+    "\\sum_{p\\in\\mathcal P_a} X_{pa} \\le \\kappa_a\\qquad\\forall a\\in\\mathcal A\n",
+    "$$"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "capct = pd.concat(\n",
+    "    [flow.groupby([\"From\", \"To\"]).agg(grb.quicksum), data[\"capacity\"]], axis=1\n",
+    ").gppd.add_constrs(m, \"flow <= capacity\", name=\"cap\")\n",
+    "m.update()\n",
+    "capct"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Conservation of Flow\n",
+    "\n",
+    "For each node and each product, the flow out of the node, less the flow into the node is equal to the net flow.\n",
+    "\n",
+    "$$\n",
+    "\\sum_{(n, n_t)\\in A_p} X_{p(n,n_t)} - \\sum_{(n_f, n)} X_{p(n_f,n)} = \\mu_{pn}\\qquad\\forall p\\in\\mathcal P, n\\in\\mathcal N\n",
+    "$$"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "flowct = pd.concat(\n",
+    "    [\n",
+    "        flow.rename_axis(index={\"From\": \"Node\"})\n",
+    "        .groupby([\"Product\", \"Node\"])\n",
+    "        .agg(grb.quicksum)\n",
+    "        .rename(\"flowout\"),\n",
+    "        flow.rename_axis(index={\"To\": \"Node\"})\n",
+    "        .groupby([\"Product\", \"Node\"])\n",
+    "        .agg(grb.quicksum)\n",
+    "        .rename(\"flowin\"),\n",
+    "        inflow,\n",
+    "    ],\n",
+    "    axis=1,\n",
+    ").fillna(0).gppd.add_constrs(m, \"flowout - flowin == net\", name=\"node\")\n",
+    "m.update()\n",
+    "flowct"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Optimize!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "m.optimize()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Get the Solution\n",
+    "\n",
+    "Only print out arcs with flow, using pandas"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "soln = flow.gppd.X\n",
+    "soln.to_frame().query(\"flow > 0\").sort_index()"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "gurobi1100py311",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}