
Kolmogorov-Arnold-Networks

References

KAN research paper: "KAN: Kolmogorov-Arnold Networks" (Liu et al., 2024, arXiv:2404.19756)

Kolmogorov-Arnold Networks (KANs) are proposed as alternatives to MLPs, with learnable activation functions on edges instead of fixed activation functions on nodes. The KAN paper reports that KANs can outperform MLPs in both accuracy and interpretability. Because every learned function is univariate and can be visualized, KANs are easy to inspect and can serve as valuable collaborators in discovering mathematical and physical laws, suggesting promising improvements over MLP-based deep learning models.

(Figure: a Kolmogorov-Arnold Network being trained over time.)

Dataset

The dataset used is the Heart Disease dataset, which collects and combines several sources in one place to help advance research on CAD-related machine learning and data mining algorithms, and ultimately to support clinical diagnosis and early treatment.

Additionally, I've added implementations on the Iris dataset and on MNIST for image classification, as sketched below.

(Note: the MNIST images are resized to 8x8 for ease of computation.)
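
A minimal sketch of that 8x8 resizing step, assuming torchvision is available; the root path and transform details here are illustrative, not this repository's exact loader:

```python
# Shrink MNIST digits from 28x28 to 8x8, as the note above describes.
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((8, 8)),   # downsample each 28x28 digit to 8x8
    transforms.ToTensor(),       # -> float tensor with values in [0, 1]
])

mnist_train = datasets.MNIST(root="./data", train=True, download=True,
                             transform=transform)
x, y = mnist_train[0]
print(x.shape)  # torch.Size([1, 8, 8])
```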

How is KAN different from MLP?

While MLPs have fixed activation functions on their nodes, KANs have learnable activation functions on their edges. This means that instead of fixed scalar weights connecting nodes, KANs place learnable univariate functions, typically parameterized as splines, on the connections.

As a result, KANs have no linear weight matrices at all: every weight is replaced by a learnable univariate function, and the nodes simply sum their incoming signals without applying any nonlinearity. This difference can lead to significant improvements in accuracy and makes the network more interpretable.
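
The Kolmogorov-Arnold representation theorem behind the name states that any continuous multivariate function on a bounded domain can be written using only univariate functions and addition:

$$f(x_1, \dots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)$$

Schematically, with notation loosely following the KAN paper, the two layer types compare as:

$$\text{MLP layer: } \mathbf{x}_{l+1} = \sigma\left(W_l \mathbf{x}_l + \mathbf{b}_l\right) \qquad \text{KAN layer: } x_{l+1,j} = \sum_{i} \phi_{l,j,i}\left(x_{l,i}\right)$$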

How does it work?

Kolmogorov-Arnold Networks (KANs) work by learning both the structure of a problem and the functions within it.

  • Structure Learning (External Degrees of Freedom): KANs, like MLPs, understand how different input features relate to each other and contribute to the output. They do this through layers of nodes connected by edges.
  • Univariate Function Optimization (Internal Degrees of Freedom): Each edge in a KAN holds a learnable activation function, typically a spline. Splines are flexible, piecewise functions that can closely match complex univariate functions. During training, KANs adjust these spline activation functions to best fit the target function.
  • Combining Strengths of Splines and MLPs: Splines excel at accurately fitting low-dimensional functions and adapt locally, but they struggle with high-dimensional problems due to the curse of dimensionality. MLPs, on the other hand, handle high dimensions better but are poor at optimizing univariate functions. KANs aim to combine the two: an MLP-like compositional structure on the outside, with splines fitting each univariate function on the inside (see the sketch after this list).
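
The following is a minimal, self-contained sketch of one KAN layer. It is not this repository's code: the class and parameter names are illustrative, and each edge's learnable univariate function is approximated with a sum of fixed Gaussian basis functions, a common simplification of the paper's B-splines.

```python
import torch
import torch.nn as nn

class KANLayer(nn.Module):
    def __init__(self, in_dim, out_dim, num_basis=8, x_min=-2.0, x_max=2.0):
        super().__init__()
        # Fixed grid of basis-function centers, shared by every edge.
        self.register_buffer("centers", torch.linspace(x_min, x_max, num_basis))
        self.width = (x_max - x_min) / (num_basis - 1)
        # One learnable coefficient per (input, output, basis) triple:
        # these coefficients *are* the edge activation functions.
        self.coef = nn.Parameter(torch.randn(in_dim, out_dim, num_basis) * 0.1)

    def forward(self, x):                       # x: (batch, in_dim)
        # Evaluate every Gaussian basis function at every input value.
        basis = torch.exp(-((x[..., None] - self.centers) / self.width) ** 2)
        # phi[b, i, j] = sum_k coef[i, j, k] * basis[b, i, k]: the edge functions.
        phi = torch.einsum("bik,ijk->bij", basis, self.coef)
        # Nodes just sum the incoming edge outputs -- no extra nonlinearity.
        return phi.sum(dim=1)                   # (batch, out_dim)

# Two stacked layers form a small KAN: 4 inputs -> 5 hidden -> 1 output.
model = nn.Sequential(KANLayer(4, 5), KANLayer(5, 1))
print(model(torch.randn(16, 4)).shape)  # torch.Size([16, 1])
```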

Training KANs

(Figures: the KAN at initialization; the same KAN after one round of training; the pruned KAN.)
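
The initialize, train, and prune workflow pictured above can be reproduced with the pykan package released alongside the KAN paper. The sketch below follows pykan's early API (model.train was later renamed model.fit), and the symbolic target f is a toy stand-in for the datasets above:

```python
import torch
from kan import KAN, create_dataset

# Toy regression target; real features (e.g. heart disease) would replace this.
f = lambda x: torch.exp(torch.sin(torch.pi * x[:, [0]]) + x[:, [1]] ** 2)
dataset = create_dataset(f, n_var=2)

model = KAN(width=[2, 5, 1], grid=5, k=3, seed=0)  # 2 inputs, 5 hidden, 1 output
model.train(dataset, opt="LBFGS", steps=20)        # dense KAN, as after training
model = model.prune()                              # drop low-importance edges
model.plot()                                       # visualize the pruned network
```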

Training on sample dataset

(Figure: KAN training metrics on the sample dataset.)
