PyTorch implementation of normalization-free LLMs that investigates their entropic behavior to identify desirable activation functions
pythia
leaky-relu
relu
privacy-preserving-machine-learning
pytorch-implementation
gelu
gpt-2
model-optimization
transformers-models
normalization-free-training
llm-inference
llm-evaluation
llm-architecture
private-inference
entropy-collapse
attention-we
Updated Nov 2, 2024 - Python