GMatch4py a graph matching library for Python

GMatch4py is a library dedicated to graph matching. Graph structure are stored in NetworkX graph objects. GMatch4py algorithms were implemented with Cython to enhance performance.

Requirements

Python 3
Numpy and Cython installed (if not : (sudo) pip(3) install numpy cython)

Installation

To install GMatch4py, run the following commands:

git clone https://github.com/Jacobe2169/GMatch4py.git
cd GMatch4py
(sudo) pip(3) install .

Get Started

Graph input format

In GMatch4py, algorithms manipulate networkx.Graph, a complete graph model that comes with a large spectrum of parser to load your graph from various inputs : *.graphml,*.gexf,.. (check here to see all the format accepted)

Use GMatch4py

If you want to use algorithms like graph edit distances, here is an example:

# Gmatch4py use networkx graph 
import networkx as nx 
# import the GED using the munkres algorithm
import gmatch4py as gm

In this example, we use generated graphs using networkx helpers:

g1=nx.complete_bipartite_graph(5,4) 
g2=nx.complete_bipartite_graph(6,4)

All graph matching algorithms in Gmatch4py work this way:

Each algorithm is associated with an object, each object having its specific parameters. In this case, the parameters are the edit costs (delete a vertex, add a vertex, ...)
Each object is associated with a compare() function with two parameters. First parameter is a list of the graphs you want to compare, i.e. measure the distance/similarity (depends on the algorithm). Then, you can specify a sample of graphs to be compared to all the other graphs. To this end, the second parameter should be a list containing the indices of these graphs (based on the first parameter list). If you rather compute the distance/similarity between all graphs, just use the None value.

ged=gm.GraphEditDistance(1,1,1,1) # all edit costs are equal to 1
result=ged.compare([g1,g2],None) 
print(result)

The output is a similarity/distance matrix :

array([[0., 14.],
       [10., 0.]])

This output result is "raw", if you wish to have normalized results in terms of distance (or similarity) you can use :

ged.similarity(result)
# or 
ged.distance(result)

Exploit nodes and edges attributes

In this latest version, we add the possibility to exploit graph attributes ! To do so, the base.Base is extended with the set_attr_graph_used(node_attr,edge_attr) method.

import networkx as nx 
import gmatch4py as gm
ged = gm.GraphEditDistance(1,1,1,1)
ged.set_attr_graph_used("theme","color") # Edge colors and node themes attributes will be used.

List of algorithms

Graph Embedding
- Graph2Vec [1]
Node Embedding
- DeepWalk [7]
- Node2vec [8]
Graph kernels
- Random Walk Kernel (debug needed) [3]
  - Geometrical
  - K-Step
- Shortest Path Kernel [3]
- Weisfeiler-Lehman Kernel [4]
  - Subtree Kernel
Graph Edit Distance [5]
- Approximated Graph Edit Distance
- Hausdorff Graph Edit Distance
- Bipartite Graph Edit Distance
- Greedy Edit Distance
Vertex Ranking [2]
Vertex Edge Overlap [2]
Bag of Nodes (a bag of words model using nodes as vocabulary)
Bag of Cliques (a bag of words model using cliques as vocabulary)
MCS [6]

Publications associated

[1] Narayanan, Annamalai and Chandramohan, Mahinthan and Venkatesan, Rajasekar and Chen, Lihui and Liu, Yang. Graph2vec: Learning distributed representations of graphs. MLG 2017, 13th International Workshop on Mining and Learning with Graphs (MLGWorkshop 2017).
[2] Papadimitriou, P., Dasdan, A., & Garcia-Molina, H. (2010). Web graph similarity for anomaly detection. Journal of Internet Services and Applications, 1(1), 19-30.
[3] Vishwanathan, S. V. N., Schraudolph, N. N., Kondor, R., & Borgwardt, K. M. (2010). Graph kernels. Journal of Machine Learning Research, 11(Apr), 1201-1242.
[4] Shervashidze, N., Schweitzer, P., Leeuwen, E. J. V., Mehlhorn, K., & Borgwardt, K. M. (2011). Weisfeiler-lehman graph kernels. Journal of Machine Learning Research, 12(Sep), 2539-2561.
[5] Fischer, A., Riesen, K., & Bunke, H. (2017). Improved quadratic time approximation of graph edit distance by combining Hausdorff matching and greedy assignment. Pattern Recognition Letters, 87, 55-62.
[6] A graph distance metric based on the maximal common subgraph, H. Bunke and K. Shearer, Pattern Recognition Letters, 1998
[7] Perozzi, B., Al-Rfou, R., & Skiena, S. (2014, August). Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 701-710). ACM.
[8] node2vec: Scalable Feature Learning for Networks. Aditya Grover and Jure Leskovec. Knowledge Discovery and Data Mining, 2016.

Author(s)

Jacques Fize, jacques[dot]fize[at]cirad[dot]fr

Some algorithms from other projects were integrated to Gmatch4py. Be assured that each code is associated with a reference to the original.

CHANGELOG

18.06.2022

Debug the skipgram import
Gmatch4py should work with new gensim version

7.05.2019

Debug (problems with float edge weight)
Add the AbstractEditDistance.edit_path(G,H) method that return the edit path, the cost matrix and the selected cost index in the cost matrix
Add a tqdm progress bar for the gmatch4py.helpers.reader.import_dir() function

12.03.2019

Add Node2vec

05.03.2019

Add Graph Embedding algorithms
Remove depreciated methods and classes
Add logo
Update documentation

25.02.2019

Add New Graph Class. Features : Cython Extensions, precomputed values (degrees, neighbor info), hash representation of edges and nodes for a faster comparison
Some algorithms are parallelized such as graph edit distances or Jaccard

TODO List

Debug algorithms --> Random Walk Kernel, Deltacon
Optimize algorithms --> Vertex Ranking

Name		Name	Last commit message	Last commit date
Latest commit History 97 Commits
gmatch4py		gmatch4py
test		test
.gitignore		.gitignore
.travis.yml		.travis.yml
LICENSE		LICENSE
README.md		README.md
logo2.png		logo2.png
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GMatch4py a graph matching library for Python

Requirements

Installation

Get Started

Graph input format

Use GMatch4py

Exploit nodes and edges attributes

List of algorithms

Publications associated

Author(s)

CHANGELOG

18.06.2022

7.05.2019

12.03.2019

05.03.2019

25.02.2019

TODO List

About

Releases

Packages

Contributors 2

Languages

License

jacquesfize/GMatch4py

Folders and files

Latest commit

History

Repository files navigation

GMatch4py a graph matching library for Python

Requirements

Installation

Get Started

Graph input format

Use GMatch4py

Exploit nodes and edges attributes

List of algorithms

Publications associated

Author(s)

CHANGELOG

18.06.2022

7.05.2019

12.03.2019

05.03.2019

25.02.2019

TODO List

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages