Skip to content

jithinjithu10/K-means-clustering-Algorithm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 

Repository files navigation

K-Means Clustering Algorithm 🌟

Overview 📊

K-Means is a popular unsupervised machine learning algorithm used for clustering data into groups based on their features. It partitions the data into ( k ) distinct clusters, where each data point belongs to the cluster with the nearest mean.

Table of Contents 🗂️

Installation 🔧

To run the K-Means clustering algorithm, make sure you have the following libraries installed:

pip install numpy pandas matplotlib scikit-learn

Usage 📝

You can use the K-Means clustering algorithm on any dataset. Here's a simple example:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Sample data
data = np.array([[1, 2], [1, 4], [1, 0], [4, 2], [4, 4], [4, 0]])

# Create K-Means model
kmeans = KMeans(n_clusters=2)
kmeans.fit(data)

# Predict cluster labels
labels = kmeans.predict(data)

# Plot the clusters
plt.scatter(data[:, 0], data[:, 1], c=labels, s=100, cmap='viridis')
plt.title('K-Means Clustering')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.show()

How It Works 🔍

  1. Initialization: Randomly select ( k ) initial centroids.
  2. Assignment Step: Assign each data point to the nearest centroid.
  3. Update Step: Calculate the mean of the assigned points and update the centroids.
  4. Repeat: Repeat the assignment and update steps until convergence (when the centroids no longer change significantly).

Conclusion ✅

K-Means clustering is a powerful tool for data analysis and can be applied to various problems, from customer segmentation to image compression. Experiment with different datasets to see how K-Means performs in various scenarios!

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages