diff --git a/webinar/2407-data-first/mcfdata.v1.xlsx b/webinar/2407-data-first/mcfdata.v1.xlsx new file mode 100644 index 0000000..ad8593d Binary files /dev/null and b/webinar/2407-data-first/mcfdata.v1.xlsx differ diff --git a/webinar/2407-data-first/mcfdata.v2.xlsx b/webinar/2407-data-first/mcfdata.v2.xlsx new file mode 100644 index 0000000..8c188e3 Binary files /dev/null and b/webinar/2407-data-first/mcfdata.v2.xlsx differ diff --git a/webinar/2407-data-first/netflow.ipynb b/webinar/2407-data-first/netflow.ipynb new file mode 100644 index 0000000..dd1301f --- /dev/null +++ b/webinar/2407-data-first/netflow.ipynb @@ -0,0 +1,384 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Multicommodity Flow Example Using `gurobipy-pandas`\n", + "\n", + "#### Author: Irv Lustig, Optimization Principal, Princeton Consultants\n", + "\n", + "Solve a multi-commodity flow problem. There are multiple products, which can be \n", + "produced in multiple locations, and have to be shipped over a network to other locations.\n", + "Each location may have supply and/or demand for any product. The network may have\n", + "transhipment locations where freight is interchanged. For each arc in the network, there is \n", + "a limited capacity of the total products that can be carried. Each arc also has a product-specific\n", + "cost for shipping one unit of the product on that arc.\n", + "\n", + "This example is based on `netflow.py` that is supplied by Gurobi.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Import necessary libraries\n", + "\n", + "- `IPython.display` is used to improve the display of `pandas` `Series` by converting them to `DataFrame` for output\n", + "- `PyQt5.QtWidgets` allows prompting for a data file\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "from IPython.display import display\n", + "import pandas as pd\n", + "import gurobipy as grb\n", + "import gurobipy_pandas as gppd" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%gui qt\n", + "\n", + "from PyQt5.QtWidgets import QFileDialog\n", + "\n", + "def gui_fname(dir=None):\n", + " \"\"\"Select a file via a dialog and return the file name.\"\"\"\n", + " if dir is None: dir ='./'\n", + " fname = QFileDialog.getOpenFileName(None, \"Select data file...\", \n", + " dir, filter=\"All files (*);; SM Files (*.sm)\")\n", + " return fname[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Get the file from a prompt" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "filename = gui_fname()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Read Data using `pandas`\n", + "\n", + "Read in the data from an Excel file. Converts the data into a dictionary of `pandas` `Series`, with the assumption that the last column is the data column." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "raw_data = pd.read_excel(filename, sheet_name=None)\n", + "data = {\n", + " k: df.set_index(df.columns[:-1].to_list())[df.columns[-1]]\n", + " for k, df in raw_data.items()\n", + "}\n", + "for k, v in data.items():\n", + " print(k)\n", + " display(v.to_frame())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Data Model\n", + "\n", + "## Sets\n", + "\n", + "| Notation | Meaning | Table Locations |\n", + "| ---- | --------------------------- | ----------- | \n", + "| $\\mathcal N$ | Set of network nodes | `cost`: Columns `From`, `To`
`capacity`: Columns `From`, `To`
`supply`: Column `Node`
`demand`: Column `Node` |\n", + "| $\\mathcal P$ | Set of products (commodities) | `cost`: Column `Product`
`supply`: Column `Product`
`demand`: Column `Product` |\n", + "| $\\mathcal A$ | Set of arcs $(n_f,n_t)$, $n_f,n_t\\in\\mathcal A | `cost`: Columns `From`, `To` |\n", + "| $\\mathcal P_a$ | Set of products $p\\in\\mathcal P$ that can be carried on arc $a\\in\\mathcal A$ | `cost`: Columns `Product`, `From`, `To` |\n", + "| $\\mathcal A_p$ | Set of arcs $a\\in\\mathcal A$ that can carry product $p\\in\\mathcal P$ | `cost`: Columns `Product`, `From`, `To` |\n", + "\n", + "\n", + "\n", + "## Numerical Input Values\n", + "\n", + "The input data is converted to pandas `Series`, so the name of each `Series` is also the name of the value.\n", + "\n", + "| Notation | Meaning | Table Name/Value Column | Index Columns \n", + "| ---- | --------------------------- | ------ | ---------- |\n", + "| $\\kappa_a$ | Capacity of arc $a\\in\\mathcal A$ | `capacity` | `From`, `To` |\n", + "| $\\pi_{ap}$ | Cost of carrying product $p$ on arc $a\\in\\mathcal A$, $p\\in\\mathcal P_a$, | `cost` | `Product`, `From`, `To` |\n", + "| $\\sigma_{pn}$ | Supply of product $p\\in\\mathcal P$ at node $n\\in\\mathcal N$. Defaults to 0 | `supply` | `Node` |\n", + "| $\\delta_{pn}$ | Demand of product $p\\in\\mathcal P$ at node $n\\in\\mathcal N$. Defaults to 0 | `demand` | `Node` |" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Compute Sets\n", + "\n", + "- The set $\\mathcal P$ of products can appear in any of the tables `supply`, `demand` and `cost` .\n", + "- The set $\\mathcal N$ of nodes can appear in any of the tables `capacity`, `supply`, `demand` and `cost`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "commodities = set(\n", + " pd.concat(\n", + " [\n", + " data[dfname].index.to_frame()[\"Product\"]\n", + " for dfname in [\"supply\", \"demand\", \"cost\"]\n", + " ]\n", + " ).unique()\n", + ")\n", + "commodities" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nodes = set(\n", + " pd.concat(\n", + " [\n", + " data[dfname].index.to_frame()[fromto].rename(\"Node\")\n", + " for dfname in [\"capacity\", \"cost\"]\n", + " for fromto in [\"From\", \"To\"]\n", + " ]\n", + " + [data[dfname].index.to_frame()[\"Node\"] for dfname in [\"supply\", \"demand\"]]\n", + " ).unique()\n", + ")\n", + "\n", + "nodes" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Compute the Net Flow for each node\n", + "\n", + "The net flow $\\mu_{pn}$ for each product $p\\in\\mathcal P$ and node $n\\in\\mathcal N$ is the sum of the supply less the demand. For transshipment nodes, this value is 0. This is called `inflow` in the code." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "inflow = pd.concat(\n", + " [\n", + " data[\"supply\"].rename(\"net\"),\n", + " data[\"demand\"].rename(\"net\") * -1,\n", + " pd.Series(\n", + " 0,\n", + " index=pd.MultiIndex.from_product(\n", + " [commodities, nodes], names=[\"Product\", \"Node\"]\n", + " ),\n", + " name=\"net\",\n", + " ),\n", + " ]\n", + ").groupby([\"Product\", \"Node\"]).sum()\n", + "inflow" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Create the Gurobi Model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "m = grb.Model(\"netflow\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Model\n", + "\n", + "### Decision Variables\n", + "\n", + "The model will have one set of decision variables:\n", + "- $X_{pa}$ for $p\\in\\mathcal P$, $a\\in\\mathcal A_p$ represents the amount shipped of product $p$ on arc $a$. We will call this variable `flow` in the code.\n", + "\n", + "The cost of shipment is $\\pi_{ap}$. \n", + "\n", + "This defines the objective function:\n", + "$$\n", + "\\text{minimize}\\quad\\sum_{a\\in\\mathcal A}\\sum_{p\\in\\mathcal P_a} \\pi_{ap}X_{pa}\n", + "$$" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "flow = gppd.add_vars(m, data[\"cost\"], obj=data[\"cost\"], name=\"flow\")\n", + "m.update()\n", + "flow" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Constraints\n", + "\n", + "#### Flow on each arc is capacitated\n", + "\n", + "$$\n", + "\\sum_{p\\in\\mathcal P_a} X_{pa} \\le \\kappa_a\\qquad\\forall a\\in\\mathcal A\n", + "$$" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "capct = pd.concat(\n", + " [flow.groupby([\"From\", \"To\"]).agg(grb.quicksum), data[\"capacity\"]], axis=1\n", + ").gppd.add_constrs(m, \"flow <= capacity\", name=\"cap\")\n", + "m.update()\n", + "capct" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Conservation of Flow\n", + "\n", + "For each node and each product, the flow out of the node, less the flow into the node is equal to the net flow.\n", + "\n", + "$$\n", + "\\sum_{(n, n_t)\\in A_p} X_{p(n,n_t)} - \\sum_{(n_f, n)} X_{p(n_f,n)} = \\mu_{pn}\\qquad\\forall p\\in\\mathcal P, n\\in\\mathcal N\n", + "$$" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "flowct = pd.concat(\n", + " [\n", + " flow.rename_axis(index={\"From\": \"Node\"})\n", + " .groupby([\"Product\", \"Node\"])\n", + " .agg(grb.quicksum)\n", + " .rename(\"flowout\"),\n", + " flow.rename_axis(index={\"To\": \"Node\"})\n", + " .groupby([\"Product\", \"Node\"])\n", + " .agg(grb.quicksum)\n", + " .rename(\"flowin\"),\n", + " inflow,\n", + " ],\n", + " axis=1,\n", + ").fillna(0).gppd.add_constrs(m, \"flowout - flowin == net\", name=\"node\")\n", + "m.update()\n", + "flowct" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Optimize!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "m.optimize()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Get the Solution\n", + "\n", + "Only print out arcs with flow, using pandas" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "soln = flow.gppd.X\n", + "soln.to_frame().query(\"flow > 0\").sort_index()" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "gurobi1100py311", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.3" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}