This repository collects all relevant resources about interpretability in LLMs
Updated Jul 12, 2024
MICCAI 2022 (Oral): Interpretable Graph Neural Networks for Connectome-Based Brain Disorder Analysis
Official code of "Discover and Cure: Concept-aware Mitigation of Spurious Correlation" (ICML 2023)
Official code for the CVPR 2022 (oral) paper "OrphicX: A Causality-Inspired Latent Variable Model for Interpreting Graph Neural Networks."
[KDD'22] Source codes of "Graph Rationalization with Environment-based Augmentations"
[ICCV 2023] Learning Support and Trivial Prototypes for Interpretable Image Classification
Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"
Explainable AI: From Simple Rules to Complex Generative Models
Build a neural net from scratch, without Keras or PyTorch, using only NumPy for the computation and pandas for data loading.
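The idea behind that repository can be sketched in a few lines: a minimal NumPy-only network with a hand-written forward and backward pass. This is an illustrative example, not the repo's actual code; the architecture (2-4-1, sigmoid activations, mean squared error on a toy XOR task) and all hyperparameters are assumptions.

```python
import numpy as np

# Hypothetical minimal from-scratch net: 2 inputs, 4 hidden units, 1 output.
rng = np.random.default_rng(0)

# Toy data: the XOR problem.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mse(pred):
    return float(((pred - y) ** 2).mean())

loss_before = mse(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2))

lr = 1.0
for _ in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: MSE gradients derived by hand via the chain rule.
    d_out = (out - y) * out * (1 - out)      # dL/d(pre-activation of output)
    d_h = (d_out @ W2.T) * h * (1 - h)       # dL/d(pre-activation of hidden)
    W2 -= lr * h.T @ d_out / len(X); b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * X.T @ d_h / len(X);   b1 -= lr * d_h.mean(axis=0)

loss_after = mse(out)
```

Gradient descent should drive `loss_after` well below `loss_before`; a real repo would typically wrap the layers in classes and load data with pandas instead of hard-coding it.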
Visualization methods to interpret CNNs and Vision Transformers, trained in a supervised or self-supervised way. The methods are based on CAM or on the attention mechanism of Transformers. The results are evaluated qualitatively and quantitatively.
Explainable Speaker Recognition
Interpretability: Methods for Identification and Retrieval of Concepts in CNN Networks
Implementation of the gradient-based t-SNE attribution method described in our GLBIO oral presentation: "Towards Computing Attributions for Dimensionality Reduction Techniques"
Semi-supervised Concept Bottleneck Models (SSCBM)
Work on combining a logit model with an information granulation method for better interpretability