# AI Algorithms

This repo is a work in progress containing first-principles implementations of groundbreaking AI algorithms, built with a range of deep learning frameworks. Each implementation is accompanied by its supporting research paper(s). The goal is to provide comprehensive educational resources for understanding and implementing foundational AI algorithms from scratch.

## Implementations

- `mnist_self_compressing_nns` - PyTorch implementation of "Self-Compressing Neural Networks". The paper demonstrates dynamic neural network compression during training: the weight and activation tensors shrink, along with the number of bits needed to represent the weights (quantizer sketch below).
- `mnist_ijepa` - Simplified image-based implementation of JEPA (Joint-Embedding Predictive Architecture), pioneered by Prof. Yann LeCun as an alternative to auto-regressive LLM architectures. I-JEPA predicts the representations of image segments (targets) from the representations of other segments within the same image (context) (loss sketch below).
- `nns_are_decision_trees` - Simplified implementation of "Neural Networks are Decision Trees", which shows that any neural network with any activation function can be represented as a decision tree. Since decision trees are inherently interpretable, the equivalence helps us understand how the network makes decisions (path-extraction sketch below).
- `mnist_the_forward_forward_algorithm` - Implements the Forward-Forward algorithm proposed by AI godfather Geoffrey Hinton. The algorithm replaces the forward and backward passes of backpropagation with two forward passes on different data with opposite objectives: the positive pass runs on real data and adjusts weights to increase the "goodness" of every hidden layer, while the negative pass runs on "negative data" and decreases it (layer sketch below).
- `sigmoid_attention` - Implements the newly introduced sigmoid self-attention by Apple, which replaces the softmax in the attention block with an elementwise sigmoid (sketch below).
- `DIFF_Transformer` - Lightweight implementation of the newly introduced "Differential Transformer", which proposes a differential attention mechanism: attention scores are computed as the difference between two separate softmax attention maps, cancelling out noise in the attention blocks. Paper by Microsoft (sketch below).
- `triton_nanoGPT.ipynb` - Implements custom Triton kernels for training Karpathy's nanoGPT (more improvements needed); a standalone kernel example is sketched below.
- `generating_texts_with_rnns.ipynb` - Implements "Generating Text with Recurrent Neural Networks": trains a character-level multiplicative recurrent neural network (~250k params) for 1000 epochs on 2Pac's "Hit 'Em Up" lol, sampling was fun :) (cell sketch below).
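
## Minimal sketches

The snippets below are illustrative sketches of the core ideas, not the code in the corresponding folders; every name and hyperparameter they introduce is an assumption.

For `mnist_self_compressing_nns`, one way to picture the paper's idea: a linear layer whose weights pass through a differentiable quantizer with a learnable exponent `e` and learnable bit depth `b`, so that a size term added to the task loss rewards the optimizer for pushing the bit count down. The straight-through rounding and the exact form of the penalty here are assumptions, not the paper's exact equations.

```python
import torch
import torch.nn as nn

class QuantizedLinear(nn.Module):
    """Linear layer with a learnable-bit-depth weight quantizer (sketch)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.e = nn.Parameter(torch.tensor(-8.0))   # learnable exponent (scale)
        self.b = nn.Parameter(torch.tensor(16.0))   # learnable bit depth

    def quantize(self, w):
        b = torch.relu(self.b)                                  # bits stay non-negative
        lo, hi = -(2 ** (b - 1)), 2 ** (b - 1) - 1              # representable integer range
        scaled = torch.clamp(w * 2 ** -self.e, lo, hi)
        rounded = scaled + (scaled.round() - scaled).detach()   # straight-through estimator
        return 2 ** self.e * rounded

    def forward(self, x):
        return x @ self.quantize(self.weight).t()

    def size_penalty(self):
        # added to the task loss so training is rewarded for shrinking b
        return torch.relu(self.b) * self.weight.numel()
```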
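For `mnist_ijepa`, the general shape of an I-JEPA training step, assuming hypothetical modules `context_enc`, `target_enc` (an EMA copy of the context encoder), and `predictor`; the point is that the loss lives in representation space, not pixel space.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(target_enc, context_enc, m=0.996):
    """Target encoder tracks the context encoder as an exponential moving average."""
    for pt, pc in zip(target_enc.parameters(), context_enc.parameters()):
        pt.mul_(m).add_(pc, alpha=1 - m)

def ijepa_step(context_enc, target_enc, predictor, patches, ctx_idx, tgt_idx):
    """patches: (batch, num_patches, dim); ctx_idx / tgt_idx: patch index lists."""
    with torch.no_grad():                        # targets come from the frozen EMA encoder
        targets = target_enc(patches)[:, tgt_idx]
    context = context_enc(patches[:, ctx_idx])   # encode only the visible context block
    preds = predictor(context, tgt_idx)          # hypothetical signature: predict target reps
    return F.smooth_l1_loss(preds, targets)      # regress representations, not pixels
```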
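For `nns_are_decision_trees`, the observation the paper builds on: in a ReLU network, the on/off pattern of the hidden units partitions input space, and each pattern is one root-to-leaf path of an equivalent decision tree, inside which the network is purely affine. A sketch that records that path for a single input (function and argument names are mine):

```python
import torch

def activation_path(weights, biases, x):
    """weights/biases: per-layer tensors of a ReLU MLP; x: one input vector.
    Returns the branch decisions (the tree path) and the network output."""
    path, a = [], x
    for W, b in zip(weights[:-1], biases[:-1]):
        pre = a @ W.t() + b
        path.append((pre > 0).int().tolist())  # each unit's on/off branch at this depth
        a = torch.relu(pre)                    # within this region the map stays affine
    return path, a @ weights[-1].t() + biases[-1]
```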
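For `mnist_the_forward_forward_algorithm`, a single layer trained the Forward-Forward way: "goodness" is the sum of squared activations, pushed above a threshold on positive data and below it on negative data, with no gradient crossing layer boundaries. The threshold and learning rate are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FFLayer(nn.Module):
    """One layer with its own local Forward-Forward objective (sketch)."""
    def __init__(self, d_in, d_out, threshold=2.0, lr=0.03):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        self.threshold = threshold
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        # length-normalize so goodness from the previous layer carries no signal
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return torch.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)   # goodness on real data
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)   # goodness on negative data
        # logistic loss: push g_pos above the threshold, g_neg below it
        loss = F.softplus(torch.cat([self.threshold - g_pos,
                                     g_neg - self.threshold])).mean()
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        # detach so the next layer trains on activations, not gradients
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()
```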
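For `sigmoid_attention`, the one-line change relative to standard attention: the softmax over keys becomes an elementwise sigmoid, with a -log(n) bias (n = sequence length) to keep the output scale sensible early in training, as in Apple's paper.

```python
import math
import torch

def sigmoid_attention(q, k, v):
    """q, k, v: (batch, heads, seq, head_dim). Softmax-free attention sketch."""
    n, d = q.shape[-2], q.shape[-1]
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)
    attn = torch.sigmoid(scores - math.log(n))  # -log(n) bias replaces softmax normalization
    return attn @ v
```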
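For `DIFF_Transformer`, the differential attention idea in isolation: two softmax maps from split query/key projections, subtracted so that common-mode noise cancels. In the paper, lambda is a learnable, reparameterized scalar; here it is a plain float for brevity.

```python
import math
import torch
import torch.nn.functional as F

def diff_attention(q1, q2, k1, k2, v, lam=0.8):
    """q1/q2, k1/k2: split query/key heads, each (batch, heads, seq, head_dim)."""
    d = q1.shape[-1]
    a1 = F.softmax(q1 @ k1.transpose(-2, -1) / math.sqrt(d), dim=-1)
    a2 = F.softmax(q2 @ k2.transpose(-2, -1) / math.sqrt(d), dim=-1)
    return (a1 - lam * a2) @ v   # difference of two attention maps cancels shared noise
```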
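For `triton_nanoGPT.ipynb`, not the notebook's kernels but the smallest useful Triton pattern (a fused row-wise softmax): one program instance per row, a masked block load, compute in registers, one store. Assumes a contiguous 2-D input whose row fits in a single block.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def softmax_kernel(x_ptr, y_ptr, n_cols, BLOCK: tl.constexpr):
    row = tl.program_id(0)                       # one program per row
    offs = tl.arange(0, BLOCK)
    mask = offs < n_cols
    x = tl.load(x_ptr + row * n_cols + offs, mask=mask, other=-float("inf"))
    x = x - tl.max(x, axis=0)                    # numerically stable softmax
    num = tl.exp(x)
    tl.store(y_ptr + row * n_cols + offs, num / tl.sum(num, axis=0), mask=mask)

def softmax(x):
    y = torch.empty_like(x)
    BLOCK = triton.next_power_of_2(x.shape[1])   # whole row handled by one block
    softmax_kernel[(x.shape[0],)](x, y, x.shape[1], BLOCK=BLOCK)
    return y
```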
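For `generating_texts_with_rnns.ipynb`, the cell at the heart of the paper: a multiplicative RNN, where the current character gates the hidden-to-hidden transition through a factored intermediate state, so each character effectively selects its own transition matrix. Dimensions and layer names are assumptions.

```python
import torch
import torch.nn as nn

class MultiplicativeRNNCell(nn.Module):
    """Multiplicative RNN cell sketch: the input modulates the recurrence."""
    def __init__(self, n_in, n_hid, n_fac):
        super().__init__()
        self.Wmx = nn.Linear(n_in, n_fac, bias=False)
        self.Wmh = nn.Linear(n_hid, n_fac, bias=False)
        self.Whm = nn.Linear(n_fac, n_hid, bias=False)
        self.Whx = nn.Linear(n_in, n_hid)

    def forward(self, x, h):
        m = self.Wmx(x) * self.Wmh(h)            # factor state gated by the input
        return torch.tanh(self.Whm(m) + self.Whx(x))
```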