The Emergence of LLMs

Author: Hetul Patel | Published on: 6 Dec, 2023

timeline
        title Brief History of NLP
        section 1966
          MIT’s ELIZA (First Chatbot) : 👌🏼 Groundbreaking human-computer interaction : 👎🏼 Limited contextual understanding
        section 1986
          Recurrent Neural Networks (RNNs) : 👌🏼 Memory for sequences : 👎🏼 Vanishing gradient for long sentences
        section 1997
          Long Short-Term Memory (LSTM) : 👌🏼 Selective ability to memorize or forget, retained long-term dependencies : 👎🏼 Complexity due to 3 different gates
        section 2014
          Gated Recurrent Units (GRUs) : 👌🏼 Simplified gating, efficient using reset and update gates : 👎🏼 Limited contextual understanding for long sequences
          Attention Mechanism : 👌🏼 Dynamic sequence processing, better context retention, offered a fresh perspective : 👎🏼 Increased computational complexity
        section 2017
          Transformer Architecture : 👌🏼 Parallel sequence processing through multi-head attention : 👎🏼 High computational demand due to model size and complexity

timeline
        title Building Upon The Transformers
        section 2018
          OpenAI’s GPT-1, Google's BERT Model : 👌🏼 BERT - bidirectional, encoder only <br>  GPT - unidirectional, decoder only : 👎🏼 Requires task-specific fine-tuning
        section 2019
          OpenAI's GPT-2, Google’s T5 : 👌🏼 Multi-task solving, massive amount of compressed knowledge, e.g. GPT-2 (40 GB of text), T5 (7TB data) : 👎🏼 Model size, training complexity
        section 2020
          OpenAI's GPT-3 : 👌🏼 Unprecedented versatility, few-shot learning : 👎🏼 Enormous computational requirements, ethical concerns
        section 2022
          OpenAI's InstructGPT : 👌🏼 Learns from human feedback during training to follow human instructions better : 👎🏼 Tailored for instruction-oriented tasks, not suited to natural, dynamic conversation
          ChatGPT : 👌🏼 Sibling of InstructGPT, optimized for conversations : 👎🏼 Works only with textual data, prone to hallucination, limited knowledge of the world up to 2022
        section 2023
          GPT-4 : 👌🏼 Handles both text and images, human-level performance on various benchmarks, allows integration of external tools such as web browsing and a code interpreter : 👎🏼 Lacks other modalities
