Skip to content

ThanasisMattas/shortestpaths

Repository files navigation

ShortestPaths

Conda_badge Build_Status codecov thesis_badge


Bidirectional replacement-paths and k-shortest paths search with dynamic programming


requirements
python3
click>=7.1.2
networkx>=2.5
numpy>=1.19.2
matplotlib>=3.3.2

Contents

  1. Overview
  2. Install
  3. Usage
  4. Examples
  5. Test
  6. Graph Model
  7. Applying Dynamic Programming
  8. State retrieval
  9. Profiling
  10. Bidirectional search optimization
  11. Conclusion
  12. License

Overview

ShortestPaths accelerates the bidirectional replacement-paths and k-shortest paths search, using dynamic programming. The algorithm proposed memoizes the states of the search of the parent path and retrieves them upon searching the consequent paths. An optimization of 1-46% is achieved and validated experimentally in a parametric analysis of tree parameters, the order, the density and the topology of the graph. The replacement paths problem is solved on both edge-exclusive and node-exlusive variations, as well as both online and offline versions. Regarding the k-shortest paths problem, k online replacement-paths searches are executed, following Yen's algorithm with Lawler's modification, while utilizing the developed bidirectional search with dynamic programming. Dijkstra's algorithm is used for the shortest path search and a modified Erdős-Rényi random graph model is introduced, controlling the density and the topology of the graph. More specifically, the small world property is captured by the topology of the graph, resulting in more realistic representations.

The four supported methods for the k-shortest paths search are:

  1. Yen + Dijkstra
  2. Lawler + Dijkstra
  3. Lawler + Bid. Dijkstra
  4. Lawler + Bid. Dijkstra + DP

A PriorityQueue class is implemented as a wrapper around heapq, using the <priority, entry_counter, entry> triple, as suggested here.

Thesis supervisor: Prof. Kostas Siozios

Install

$ conda install -c mattasa shortestpaths
$ pip install shortestpaths

Usage

$ ksp [OPTIONS] COMMAND [OPTIONS]
Options:
  -n INTEGER                      number of nodes (used when path is None)
                                  [default: 100]
  -k INTEGER                      number of shortest paths to be generated
                                  [default: 1]
  --weighted / --no-weighted      [default: True]
  --directed
  --weights-on [edges|nodes|edges-and-nodes]
                                  [default: edges-and-nodes]
  --max-edge-weight INTEGER       [default: 1000]
  --max-node-weight INTEGER       [default: 50]
  -y, --yen
  -l, --lawler
  -b, --bidirectional             use bidirectional shortest path search
  -p, --parallel                  use multiprocessing
  -d, --dynamic                   use dynamic programming
  -s, --seed INTEGER              fixes the random graph
  --layout-seed INTEGER           fixes the random initialization of the
                                  spring_layout.  [default: 1]
  --show-graph                    plots up to 8 paths
  --save-graph                    format: png
  -v, --verbose                   prints the generated paths

replacement-paths Options:
  -f, --failing [edges|nodes]  Setting what to fail, path edges or path nodes,
                               in order to produce the replacement paths.
                               [default: nodes]

  --online                     When online, the path up until the failure is
                               kept as it is (the algorithm is getting
                               informed upon meeting the failed node or edge),
                               whereas when not online, a new search starts
                               from the source, ignoring the parent-path (the
                               algorithm is a priori informed about the
                               failure).

Load your graph

A NetworkX formatted graph can be loaded, using the following options:

  --path TEXT                     The NetworkX-file path to read the graph
                                  from. If not provided, a random graph of n
                                  nodes will be generated. Supported formats:
                                  [.adjlist, .edgelist, .gexf, .gml, .gpickle]
                                  Note that .adjlist does not include weights.
  -s, --source TEXT               If a graph is not provided, the source
                                  defaults to node 1.
  -t, --target TEXT               If a graph is not provided, the target
                                  defaults to node n.
  --nodetype TEXT                 convert nodes to this type  [default: int]
  --comments TEXT                 marker for comment lines  [default: #]
  --delimiter TEXT                Separator for node labels. The default is
                                  whitespace.  [default:  ]
  --encoding TEXT                 [default: utf-8]

Example format: .edgelist

import networkx as nx
G = nx.Graph()
G.add_weighted_edges_from([[1, 2, 5], [1, 3, 6], [1, 4, 3], [2, 3, 1], [2, 4, 6]])
nx.write_weighted_edgelist(G, "testgraph.edgelist")

testgraph.edgelist content:

1 2 5
1 3 6
1 4 3
2 3 1
2 4 6

Examples

Terminal

$ ksp -v
$ ksp --show-graph -k 5 -n 100
$ ksp -v -d -k 20 -n 1000
$ ksp --seed 1 --show-graph -n 200 replacement-paths --failing edges
$ ksp --seed 1 --show-graph -n 200 replacement-paths --failing edges --online

$ ksp -v -d -s <source> -t <target> --path <path-to-graph> --directed -k 50
$ ksp -v -d -s <source> -t <target> --path <path-to-graph> replacement-paths

Python

import shortestpaths as sp

k_paths = sp.k_shortest_paths(G, s, t, k)
print("k_paths:")
sp.print_paths(k_paths)
sp.plot_paths(k_paths, G)

print()

mode = {"failing": "edges", "online": True}
r_paths = sp.replacement_paths(G, s, t, **mode)
print("r_paths:")
sp.print_paths(r_paths)
sp.plot_paths(r_paths, G, mode)

Test

$ pytest --cov=shortestpaths shortestpaths

Graph Model

Goals

  1. Control graph density
  2. Control graph topology

Nodal distance

Utilizing the incremental naming of the nodes, distance between two nodes is represented by the difference of the node-IDs. For example, nodes 1 and 5 have distance 4. Note that distance here has nothing to do with the corresponding edge weight and does not affect the algorithm execution, rather it is only used upon graph creation.

The frequency of pairs of nodes with distance x, in a simple, undirected, complete graph (α), is given by the line:

Whereas, for the directed graph (β) the line is:




Small world property

The model constitutes a variation of the Gilbert version of the Erdős-Rényi model, where edge-probability is not uniform. More specifically, edges that connect distant nodes are penalized, avoiding unrealistic paths that go to the target with very few hops. This way, the small world property is captured by the topology of the graph, meaning that nodes tend to form small communities.

Edge weights

The edge weigths are randomly selected from the range [0, MAX_EDGE_WEIGHT], biased with respect to the distance of the adjacent nodes. Namely, edges that connect distant nodes tend to get penalized.

Probability distribution

In order to regulate the cutoff point of the edge-distances distribution, the sigmoid equation is used, like a low-pass filter. To form the final probability distribution equation, the sigmoid equation is subtracted from one, for the smaller distances to have the greater probability. Fillaly, the result is multiplied with an initial probability p0, controling further the graph density.

Expected nodal-distance distribution

Expected graph density

Model Summary

The proposed graph model uses 3 parameters:

  • c : sigmoid center. Regulates the graph density and sets the cutoff point of the nodal-distance distribution.
  • λ : sigmoid gradient. Regulates the steepness of the cutoff area.
  • p0 : initial probability. Regulates the graph density. It is essentially the application of the Gilbert model over the graph formed by the other two parameters.

prob_distribution_1
prob_distribution_2

a. Nodal-distance probability distribution
b. Nodal-distance distribution at the complete graph with n = 100
c. Real nodal-distance distribution after applying the probability distribution of a. on the complete graph of b.
d. Nodal-distance probability distribution with p0 = 0.7
e. Expected nodal-distance distribution after applying d. to b.
f. Instantiation of e. A controlled randomness around the wanted topology is evident.

Usage

import shortestpaths as sp

# adj_list format: [{(neighbor, hop_weight),},]
# G: nx.Graph or nx.DiGraph
adj_list, G = sp.random_graph(n,
                              weighted=True,
                              directed=True,
                              weights_on="edges",
                              max_edge_weight=100,
                              random_seed=None,
                              center_portion=0.2,
                              gradient=0.5,
                              p_0=0.7)

# inverted graph for reverse search
adj_list_reverse = sp.adj_list_reversed(adj_list)

Applying Dynamic Programming

Regarding the offline replacement-paths, the algorithm conducts 2 searches of the base path. The first is a simple path search. The second is the memoization process, where, having knowledge of the path and, thus, knowing which nodes/edges will fail, the algorithm memoizes only the states that correspond to each path-node. More specifically, each direction of the bidirectional search memoizes the described states, up until the meeting edge of the search. For replacement paths that correspond to a failed edge/node that the forward search of the base path visited, the forward search retrieves its state just before the failed item and the reverse search retrieves the last recorded state, which is the state before the meeting edge. Likewise, the opposite goes for items failing after the meeting edge.

At the online counterpart, the state of forward search cannot be memoized, because the starting node is changing with each replacement-path. Therefore, dynamic programming is used only at the reverse sub-search. Also, this time there is no need for saving the states. As for the second of the 2 searches, a unidirectional search starts from the target node, going backwards, and anytime it is about to visit a path-node, the corresponding bidirectional replacement-path search begins, using the current state as the reverse state.

Finally, the k-shortest paths search consists in executing k online replacement-paths searches, following Yen's method with Lawler's modification, where, obviously, the aforementioned first search is not executed, because the parent path is already known.

State retrieval

offline


online


Profiling

CPU time vs n vs density

k: 10  c: 0.15  pmax: 0.28


CPU time vs n vs k

c: 0.15  p0: 0.3  pmax: 0.28



Bidirectional search optimization

The unidirectional search sphere expands in the cost space, until it finds the target-node. However, at the bidirectional search two smaller spheres expand from both start and end nodes, until their search horizons meet each other, resulting to an up to 4x smaller search volume.

Unidirectional search volume


Bidirectional search volume


Conclusion

  • DP induces an optimization of the order of 1-46% over the bidirectional k-shortest paths search with Yen's method and Lawler's modification, at the scenarios tested.
  • Graph density and graph topology play a significant role over the performance of algorithms and can effectively complement graph order for a more comprehensive study.

License

Repository: GNU General Public License v3.0
Thesis: CC BY-NC-SA 4.0


(C) 2020, Athanasios Mattas
thanasismatt@gmail.com