Graph classification using kernel methods

This repository contains the work of Ludovic De Matteïs and Matias Etcheverry for the course Machine learning with kernel methods, taught by Michael Arbel, Alessandro Rudi, Jean-Philippe Vert and Julien Mairal at the MVA.

Objective

The goal of this repository is to implement kernel-based machine learning algorithms for a classification task on graph data.

Installation

To set up the repository, run the following commands:

# clone the repository
git clone git@github.com:MatiasEtcheve/KM-graph-classification.git
cd KM-graph-classification

# install the dependencies
pip install -r requirements.txt

Inference

The command-line interface trains the models and computes predictions on the dataset.

To use our best model, run:

python start.py

This will train 2 kernels:

a Weisfeiler-Lehman kernel on the edges of the graphs
a Weisfeiler-Lehman kernel on the nodes of the graphs

A linear combination is then applied to the logits of the 2 models.
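The linear combination of logits can be sketched as follows. This is a minimal illustration, not the code in solver.py or start.py: the function name `combine_logits`, the example logit values, and the decision threshold at 0 are assumptions made for the example. Only the coefficients 1.59 and 1.35 come from the default `combination` option.

```python
import numpy as np


def combine_logits(logits_per_kernel, coefficients):
    """Return the weighted sum of per-model logits (hypothetical helper).

    logits_per_kernel: list of 1-D arrays, one per trained kernel model.
    coefficients: one weight per kernel model.
    """
    # Shape (n_models, n_samples)
    stacked = np.stack([np.asarray(l, dtype=float) for l in logits_per_kernel])
    weights = np.asarray(coefficients, dtype=float)
    if weights.shape[0] != stacked.shape[0]:
        raise ValueError("need exactly one coefficient per kernel")
    # Weighted sum over the model axis, shape (n_samples,)
    return weights @ stacked


# Made-up logits for three graphs, one array per kernel model
logits_edges = np.array([0.8, -1.2, 0.1])
logits_nodes = np.array([-0.5, -0.3, 0.4])

combined = combine_logits([logits_edges, logits_nodes], [1.59, 1.35])
# Assumed decision rule: positive combined logit means class 1
predictions = (combined > 0).astype(int)
```

Combining at the logit level lets each model contribute in proportion to its confidence, rather than only through its hard label.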

However, you can also tune the learning, with the CLI options:

Option name	Type	Description	Default value
`kernels`	list	Kernels to train on the data. If several kernels are provided, a model is trained on each kernel individually, then the logits are linearly combined. Must be one or more of "EH", "VH", "SP", "GL", "WL-Edges", "WL-Nodes".	`"[WL-Edges,WL-Nodes]"` (keep the quotes)
`combination`	list	List of coefficients for the combination of kernels.	`"[1.59,1.35]"` (keep the quotes)
`max-alpha`	float	Maximum value of the alpha coefficients in the SVM. Note: some alphas can be higher than C when `class_weight=balanced`.	`100`
`sigma`	float >= 0	Sigma in the RBF wrapper. If 0, a linear wrapper is applied instead.	`1`
`src`	folder	Path to the folder containing the .pkl datasets.	`data/`
`train-val-split`	float or int	Train/validation split, given as a ratio or as a number of elements, e.g. 0.7 or 4200. Useful when training takes a long time.	`0.7`
`do-predict`	flag	Whether to run the prediction on the test set.	`True`
`predict-filename`	filename	Path to the prediction file (used if the `do-predict` flag is present).	`test_pred.csv`
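To illustrate the `sigma` option, here is one common way to wrap a base kernel in an RBF kernel, using only the base Gram matrix. It is a sketch under assumptions: the function name `rbf_wrap` is hypothetical, the scaling `exp(-d^2 / (2 * sigma^2))` is the textbook convention, and the actual wrapper in kernels.py may differ. It also assumes a square Gram matrix (the same graphs on both axes).

```python
import numpy as np


def rbf_wrap(K, sigma):
    """Wrap a square base Gram matrix K in an RBF kernel (hypothetical helper).

    With sigma == 0 the base (linear) kernel is returned unchanged,
    mirroring the behaviour described for the `sigma` option.
    """
    K = np.asarray(K, dtype=float)
    if sigma == 0:
        return K
    diag = np.diag(K)
    # Squared distance in feature space: k(x,x) + k(y,y) - 2 k(x,y)
    sq_dist = diag[:, None] + diag[None, :] - 2.0 * K
    # Clip tiny negative values caused by floating-point error
    sq_dist = np.maximum(sq_dist, 0.0)
    return np.exp(-sq_dist / (2.0 * sigma**2))


# Made-up 2x2 base Gram matrix
K = np.array([[2.0, 1.0],
              [1.0, 3.0]])
K_rbf = rbf_wrap(K, sigma=1.0)
K_lin = rbf_wrap(K, sigma=0)
```

The wrapped matrix has ones on its diagonal, since every graph is at distance zero from itself.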

Examples of working command lines:

Default command: python start.py --kernels "[WL-Edges,WL-Nodes]" --combination "[1.59,1.35]" --src data/ --train-val-split 0.7 --do-predict --predict-filename data/predictions.csv
Another command: python start.py --kernels "[EH,VH]" --combination "[1,1]" --max-alpha 1 --sigma 0 --src data/ --train-val-split 0.4 --do-predict --predict-filename data/predictions.csv
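The `train-val-split` option accepts either a ratio or a number of elements. The sketch below shows one way such a value could be turned into a training-set size. The helper `resolve_train_size`, its rules for telling the two cases apart, and the dataset size of 1000 are assumptions for illustration, not the logic used in start.py.

```python
def resolve_train_size(split, n_samples):
    """Turn a ratio or an element count into a training-set size (hypothetical helper)."""
    if 0 < split < 1:
        # Values strictly between 0 and 1 are read as a ratio
        return round(split * n_samples)
    if float(split).is_integer() and 1 <= split <= n_samples:
        # Whole numbers are read as a number of training elements
        return int(split)
    raise ValueError(f"invalid train/val split: {split!r}")


# Assumed dataset size, for illustration only
n_samples = 1000
size_from_ratio = resolve_train_size(0.7, n_samples)
size_from_count = resolve_train_size(400, n_samples)
```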
