This project is an implementation of K-Means clustering that uses a random-walk-based distance measure.

SamanKhamesian/Kmeans-Based-on-ECT-Distances

Kmeans-Based-on-ECT-Distances

Abstract

This work proposes a simple way to improve a clustering algorithm. The idea is to exploit a new distance metric called the "Euclidean Commute Time" (ECT) distance, based on a random walk model on a graph derived from the data. Using this distance measure instead of the usual Euclidean distance in a k-means algorithm makes it possible to retrieve well-separated clusters of arbitrary shape, without any working hypothesis about their data distribution. Experimental results show that using this distance measure significantly improves the quality of the clustering on the tested data sets. This project is an implementation of this technique.
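To make the distance measure concrete, here is a minimal sketch of how ECT distances can be computed from a graph's adjacency matrix. It is not this repository's code: the function name `ect_distance_matrix` is hypothetical, and it assumes a connected, undirected graph with non-negative edge weights. It uses the standard identity that the average commute time between nodes i and j equals V_G (l+_ii + l+_jj - 2 l+_ij), where L+ is the pseudoinverse of the graph Laplacian and V_G is the sum of all node degrees; the ECT distance is the square root of that quantity. How the graph is built from the data (for example, from nearest neighbors) is left out here.

```python
import numpy as np


def ect_distance_matrix(adjacency):
    """Pairwise ECT distances for a connected, undirected weighted graph.

    Hypothetical helper for illustration; `adjacency` is a symmetric
    (n, n) array of non-negative edge weights.
    """
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    laplacian = np.diag(degrees) - A       # L = D - A
    l_plus = np.linalg.pinv(laplacian)     # Moore-Penrose pseudoinverse of L
    volume = degrees.sum()                 # V_G: sum of all node degrees
    diag = np.diag(l_plus)
    # Average commute time: V_G * (l+_ii + l+_jj - 2 * l+_ij)
    commute = volume * (diag[:, None] + diag[None, :] - 2.0 * l_plus)
    commute = np.maximum(commute, 0.0)     # clip tiny negative rounding errors
    return np.sqrt(commute)                # ECT distance = sqrt(commute time)


# Toy example: a path graph 0 - 1 - 2 with unit edge weights.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
D = ect_distance_matrix(A)
print(np.round(D, 3))
```

On this toy graph, the two end nodes are farther apart than either is from the middle node, as expected. A clustering step would then use a matrix like `D` in place of Euclidean distances.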

To use this work in your research or projects, you need:

  • Python 3.7.0
  • Python packages:
    • pykov
    • pandas
    • networkx
    • matplotlib
    • numpy
    • scikit_learn
    • seaborn
    • cmake
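Once the packages are installed, a quick check like the one below can confirm that they are importable. This is only a sketch: the helper name `missing_modules` is hypothetical, and the list uses import names, which is why `scikit_learn` appears as `sklearn`. `cmake` is left out on the assumption that it is needed as a build tool rather than imported.

```python
import importlib.util

# Import names of the required packages (scikit_learn is imported as "sklearn").
REQUIRED_MODULES = [
    "pykov", "pandas", "networkx", "matplotlib", "numpy", "sklearn", "seaborn",
]


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```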

To install Python:

First, check whether Python 3 is already installed:

python3 --version

If Python 3 is not installed on your computer, you can install it with the commands below:

sudo apt-get update
sudo apt-get install python3

To install the packages with pip:

sudo pip3 install pandas networkx matplotlib numpy scikit_learn seaborn cmake
sudo pip3 install git+https://github.com/riccardoscalco/Pykov@master

If pip is not installed, run the commands below in your terminal:

sudo apt-get update
sudo apt install python3-pip

You should also make sure pip is up to date:

pip3 install --upgrade pip
