Awesome-information-theoretic-representation-learning

A curated paper list for information-theoretic representation learning.

All papers are selected by hand and organized by topic and year. Please send a pull request if you would like to add a paper.

Theories and Analysis

Maximum Entropy

  • Information theory and statistical mechanics. Edwin T. Jaynes. Physical Review, 1957 [paper]
  • Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. John E. Shore, Rodney W. Johnson. IEEE Trans. Inf. Theory, 1980 [paper]
  • On the rationale of maximum-entropy methods. Edwin T. Jaynes. Proceedings of the IEEE, 1982 [paper]
  • PAC-Bayes analysis of maximum entropy classification. John Shawe-Taylor, David R. Hardoon. AISTATS, 2009 [paper]
  • The role of entropy and reconstruction in multi-view self-supervised learning. Borja Rodríguez Gálvez, Arno Blaas, Pau Rodriguez, Adam Golinski, Xavier Suau, Jason Ramapuram, Dan Busbridge, Luca Zappella. ICML, 2023 [paper]
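
For orientation, the principle underlying the entries above (Jaynes, 1957) selects, among all distributions consistent with known expectation constraints, the one with maximum Shannon entropy. A compact restatement, with f_k and μ_k denoting the constraint features and their observed averages:

```latex
\max_{p}\; H(p) = -\sum_{x} p(x)\log p(x)
\quad \text{s.t.} \quad \mathbb{E}_{p}[f_k(X)] = \mu_k \;\; (k = 1, \dots, K), \qquad \sum_{x} p(x) = 1.
```

The maximizer is the exponential family p(x) ∝ exp(∑_k λ_k f_k(x)), with the multipliers λ_k set by the constraints.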

Information Maximization

  • Self-Organization in a Perceptual Network. Ralph Linsker. Computer, 1988 [paper]
  • On Mutual Information Maximization for Representation Learning. Michael Tschannen, Josip Djolonga, Paul K. Rubenstein, Sylvain Gelly, Mario Lucic. ICLR, 2020 [paper]
  • A Mutual Information Maximization Perspective of Language Representation Learning. Lingpeng Kong, Cyprien de Masson d'Autume, Lei Yu, Wang Ling, Zihang Dai, Dani Yogatama. ICLR, 2020 [paper]
  • Which Mutual-Information Representation Learning Objectives are Sufficient for Control? Kate Rakelly, Abhishek Gupta, Carlos Florensa, Sergey Levine. NeurIPS, 2021 [paper]
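
As a common reference point for these papers, Linsker's InfoMax principle chooses encoder parameters θ to maximize the mutual information between the input and its representation:

```latex
\max_{\theta}\; I(X; Z_{\theta}), \qquad Z_{\theta} = f_{\theta}(X).
```

Because I(X; Z) is generally intractable for deep encoders, much of the discussion above concerns tractable lower bounds on it and what maximizing those bounds actually achieves.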

Information Bottleneck

  • The information bottleneck method. Naftali Tishby, Fernando C. Pereira, William Bialek. The 37th Annual Allerton Conference on Communication, Control, and Computing, 1999 [paper]
  • Information bottleneck for Gaussian variables. Gal Chechik, Amir Globerson, Naftali Tishby, Yair Weiss. NeurIPS, 2003 [paper]
  • Deep learning and the information bottleneck principle. Naftali Tishby and Noga Zaslavsky. ITW, 2015 [paper]
  • Opening the black box of deep neural networks via information. Ravid Shwartz-Ziv, Naftali Tishby. CoRR, 2017 [paper]
  • The role of the information bottleneck in representation learning. Matías Vera, Pablo Piantanida, Leonardo Rey Vega. ISIT, 2018 [paper]
  • On the information bottleneck theory of deep learning. Andrew M. Saxe, Yamini Bansal, Joel Dapello, Madhu Advani, Artemy Kolchinsky, Brendan D. Tracey, David D. Cox. ICLR, 2018 [paper]
  • Learning representations for neural network-based classification using the information bottleneck principle. Rana Ali Amjad, Bernhard C. Geiger. IEEE Trans. Pattern Anal. Mach. Intell., 2020 [paper]
  • Learnability for the information bottleneck. Tailin Wu, Ian S. Fischer, Isaac L. Chuang, Max Tegmark. UAI, 2020 [paper]
  • Perturbation theory for the information bottleneck. Vudtiwat Ngampruetikorn, David J. Schwab. NeurIPS, 2021 [paper]
  • How does information bottleneck help deep learning? Kenji Kawaguchi, Zhun Deng, Xu Ji, and Jiaoyang Huang. ICML, 2023 [paper]
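
The shared objective behind this subsection is the IB Lagrangian of Tishby, Pereira, and Bialek (1999), which compresses the input X into a representation Z while preserving information about a target Y:

```latex
\min_{p(z \mid x)}\; I(X; Z) \;-\; \beta\, I(Z; Y),
```

where β ≥ 0 trades compression (small I(X; Z)) against prediction (large I(Z; Y)).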

Others

  • Information-theoretic analysis of generalization capability of learning algorithms. Aolin Xu and Maxim Raginsky. NeurIPS, 2017 [paper]
  • Emergence of invariance and disentanglement in deep representations. Alessandro Achille, Stefano Soatto. ITA, 2018 [paper]
  • Understanding the Limitations of Variational Mutual Information Estimators. Jiaming Song, Stefano Ermon. ICLR, 2020 [paper]
  • Reasoning about generalization via conditional mutual information. Thomas Steinke and Lydia Zakynthinou. COLT, 2020 [paper]
  • A Unifying Mutual Information View of Metric Learning: Cross-Entropy vs. Pairwise Losses. Malik Boudiaf, Jérôme Rony, Imtiaz Masud Ziko, Eric Granger, Marco Pedersoli, Pablo Piantanida, Ismail Ben Ayed. ECCV, 2020 [paper]

Learning Principle and Optimization

Entropy-based Representation Learning

  • Deterministic annealing for clustering, compression, classification, regression, and related optimization problems. Kenneth Rose. Proceedings of the IEEE, 1998 [paper]
  • Unsupervised Learning of Finite Mixture Models. Mário A. T. Figueiredo, Anil K. Jain. IEEE Trans. Pattern Anal. Mach. Intell., 2002 [paper]
  • Semi-supervised learning by entropy minimization. Yves Grandvalet, Yoshua Bengio. NeurIPS, 2004 [paper]
  • Nonparametric Supervised Learning by Linear Interpolation with Maximum Entropy. Maya R. Gupta, Robert M. Gray, Richard A. Olshen. IEEE Trans. Pattern Anal. Mach. Intell., 2006 [paper]
  • Similarity-based Classification: Concepts and Algorithms. Yihua Chen, Eric K. Garcia, Maya R. Gupta, Ali Rahimi, Luca Cazzanti. J. Mach. Learn. Res., 2009 [paper]
  • Maximum Entropy Discrimination Markov Networks. Jun Zhu, Eric P. Xing. J. Mach. Learn. Res., 2009 [paper]
  • Regularizing neural networks by penalizing confident output distributions. Gabriel Pereyra, George Tucker, Jan Chorowski, Lukasz Kaiser, and Geoffrey E. Hinton. ICLR Workshop, 2017 [paper]
  • Compressing images by encoding their latent representations with relative entropy coding. Gergely Flamich, Marton Havasi, José Miguel Hernández-Lobato. NeurIPS, 2020 [paper]
  • Self-supervised learning via maximum entropy coding. Xin Liu, Zhongdao Wang, Yali Li, Shengjin Wang. NeurIPS, 2022 [paper]
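
To make the recurring objectives concrete, here is a minimal sketch, assuming PyTorch, of two entropy regularizers from the list above: entropy minimization on unlabeled data (Grandvalet & Bengio, 2004) and the confidence penalty that subtracts entropy (Pereyra et al., 2017). Function names and the coefficients lam/beta are illustrative, not taken from any paper's code.

```python
import torch
import torch.nn.functional as F

def prediction_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Mean Shannon entropy (in nats) of the predictive distribution."""
    log_p = F.log_softmax(logits, dim=-1)
    return -(log_p.exp() * log_p).sum(dim=-1).mean()

def semi_supervised_loss(logits_l, targets, logits_u, lam=0.1):
    # Cross-entropy on labeled data plus entropy *minimization* on unlabeled
    # data, which pushes decision boundaries into low-density regions.
    return F.cross_entropy(logits_l, targets) + lam * prediction_entropy(logits_u)

def confidence_penalized_loss(logits, targets, beta=0.1):
    # Opposite sign: *subtracting* entropy penalizes over-confident predictions.
    return F.cross_entropy(logits, targets) - beta * prediction_entropy(logits)
```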

Infomax-based Representation Learning

  • An information-maximization approach to blind separation and blind deconvolution. Anthony J. Bell, Terrence J. Sejnowski. Neural Comput., 1995 [paper]
  • Alignment by maximization of mutual information. Paul A. Viola, William M. Wells III. ICCV, 1995 [paper]
  • Feature extraction by non-parametric mutual information maximization. Kari Torkkola. J. Mach. Learn. Res., 2003 [paper]
  • An information-theoretic framework for fast and robust unsupervised learning via neural population infomax. Wentao Huang, Kechen Zhang. ICLR, 2017 [paper]
  • Representation learning with contrastive predictive coding. Aäron van den Oord, Yazhe Li, Oriol Vinyals. CoRR, 2018 [paper]
  • Learning deep representations by mutual information estimation and maximization. R. Devon Hjelm, Alex Fedorov, Samuel Lavoie-Marchildon, Karan Grewal, Philip Bachman, Adam Trischler, Yoshua Bengio. ICLR, 2019 [paper]
  • Learning representations by maximizing mutual information across views. Philip Bachman, R. Devon Hjelm, William Buchwalter. NeurIPS, 2019 [paper]
  • On mutual information in contrastive learning for visual representations. Mike Wu, Chengxu Zhuang, Milan Mosse, Daniel Yamins, Noah D. Goodman. CoRR, 2019 [paper]
  • Learning adversarially robust representations via worst-case mutual information maximization. Sicheng Zhu, Xiao Zhang, David Evans. ICML, 2020 [paper]
  • Learning disentangled representations via mutual information estimation. Eduardo Hugo Sanchez, Mathieu Serrurier, Mathias Ortner. ECCV, 2020 [paper]
  • Rethinking Minimal Sufficient Representation in Contrastive Learning. Haoqing Wang, Xun Guo, Zhi-Hong Deng, Yan Lu. CVPR, 2022 [paper]
  • Representation Learning with Conditional Information Flow Maximization. Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu. ACL, 2024 [paper]
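
For reference, a minimal sketch, assuming PyTorch, of the InfoNCE objective from contrastive predictive coding (van den Oord et al., 2018), whose minimization maximizes a lower bound on the mutual information between two views of the same samples. The temperature value and tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """z1, z2: (batch, dim) embeddings of two views of the same batch."""
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature                   # (batch, batch) similarities
    labels = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)               # minimizing tightens the MI bound
```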

IB-based Representation Learning

  • The deterministic information bottleneck. DJ Strouse, David J. Schwab. UAI, 2016 [paper]
  • Deep Variational Information Bottleneck. Alexander A. Alemi, Ian Fischer, Joshua V. Dillon, Kevin Murphy. ICLR, 2017 [paper]
  • Information dropout: Learning optimal representations through noisy computation. Alessandro Achille, Stefano Soatto. IEEE Trans. Pattern Anal. Mach. Intell., 2018 [paper]
  • Mutual information neural estimation. Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeswar, Sherjil Ozair, Yoshua Bengio, R. Devon Hjelm, and Aaron C. Courville. ICML, 2018 [paper]
  • Nonlinear Information Bottleneck. Artemy Kolchinsky, Brendan D. Tracey, David H. Wolpert. Entropy, 2019 [paper]
  • Variational discriminator bottleneck: Improving imitation learning, inverse rl, and gans by constraining information flow. Xue Bin Peng, Angjoo Kanazawa, Sam Toyer, Pieter Abbeel, and Sergey Levine. ICLR, 2019 [paper]
  • The conditional entropy bottleneck. Ian S. Fischer. Entropy, 2020 [paper]
  • The HSIC bottleneck: Deep learning without back-propagation. Kurt Wan-Duo Ma, J. P. Lewis, W. Bastiaan Kleijn. AAAI, 2020 [paper]
  • Learning Robust Representations via Multi-View Information Bottleneck. Marco Federici, Anjan Dutta, Patrick Forré, Nate Kushman, Zeynep Akata. ICLR, 2020 [paper]
  • Learning Optimal Representations with the Decodable Information Bottleneck. Yann Dubois, Douwe Kiela, David J. Schwab, Ramakrishna Vedantam. NeurIPS, 2020 [paper]
  • The Dual Information Bottleneck. Zoe Piran, Ravid Shwartz-Ziv, Naftali Tishby. CoRR, 2020 [paper]
  • Revisiting Hilbert-Schmidt Information Bottleneck for Adversarial Robustness. Zifeng Wang, Tong Jian, Aria Masoomi, Stratis Ioannidis, Jennifer G. Dy. NeurIPS, 2021 [paper]
  • Multi-View Information-Bottleneck Representation Learning. Zhibin Wan, Changqing Zhang, Pengfei Zhu, Qinghua Hu. AAAI, 2021 [paper]
  • PAC-Bayes Information Bottleneck. Zifeng Wang, Shao-Lun Huang, Ercan Engin Kuruoglu, Jimeng Sun, Xi Chen, Yefeng Zheng. ICLR, 2022 [paper]
  • Maximum Entropy Information Bottleneck for Uncertainty-aware Stochastic Embedding. Sungtae An, Nataraj Jammalamadaka, Eunji Chong. CVPR Workshop, 2023 [paper]
  • Structured Probabilistic Coding. Dou Hu, Lingwei Wei, Yaxin Liu, Wei Zhou, and Songlin Hu. AAAI, 2024 [paper]
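
A minimal sketch, assuming PyTorch, of the Deep Variational Information Bottleneck loss (Alemi et al., 2017): a Gaussian encoder q(z|x) is regularized by an analytic KL to a standard normal prior (a variational upper bound on I(X; Z)), while a decoder trained on sampled z supplies the I(Z; Y) term. Helper names and the default beta are illustrative.

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, logvar):
    # Differentiable sample z ~ q(z|x) = N(mu, diag(exp(logvar))).
    return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

def vib_loss(mu, logvar, logits, targets, beta=1e-3):
    """logits: decoder output on z = reparameterize(mu, logvar)."""
    # Analytic KL( q(z|x) || N(0, I) ): the variational bound on I(X; Z).
    kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=-1).mean()
    # Decoder cross-entropy: the variational bound on I(Z; Y).
    return F.cross_entropy(logits, targets) + beta * kl
```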

MI Estimation

  • Estimation of the information by an adaptive partitioning of the observation space. Georges A. Darbellay, Igor Vajda. IEEE Trans. Inf. Theory, 1999 [paper]
  • Estimation of entropy and mutual information. Liam Paninski. Neural Comput., 2003 [paper]
  • Estimating mutual information. Alexander Kraskov, Harald Stögbauer, Peter Grassberger. Physical Review E, 2004 [paper]
  • Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization. XuanLong Nguyen, Martin J. Wainwright, Michael I. Jordan. NeurIPS, 2007 [paper]
  • Density functional estimators with k-nearest neighbor bandwidths. Weihao Gao, Sewoong Oh, Pramod Viswanath. ISIT, 2017 [paper]
  • f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization. Sebastian Nowozin, Botond Cseke, Ryota Tomioka. NeurIPS, 2016 [paper]
  • Estimating mutual information for discrete-continuous mixtures. Weihao Gao, Sreeram Kannan, Sewoong Oh, Pramod Viswanath. NeurIPS, 2017 [paper]
  • Mutual information neural estimation. Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeswar, Sherjil Ozair, Yoshua Bengio, R. Devon Hjelm, and Aaron C. Courville. ICML, 2018 [paper]
  • Representation learning with contrastive predictive coding. Aäron van den Oord, Yazhe Li, Oriol Vinyals. CoRR, 2018 [paper]
  • On variational bounds of mutual information. Ben Poole, Sherjil Ozair, Aäron van den Oord, Alexander A. Alemi, George Tucker. ICML, 2019 [paper]
  • Club: A contrastive log-ratio upper bound of mutual information. Pengyu Cheng, Weituo Hao, Shuyang Dai, Jiachang Liu, Zhe Gan, Lawrence Carin. ICML, 2020 [paper]
  • Conditional mutual information estimation for mixed, discrete and continuous data. Octavio César Mesner, Cosma Rohilla Shalizi. IEEE Trans. Inf. Theory, 2021 [paper]
  • Beyond normal: On the evaluation of mutual information estimators. Pawel Czyz, Frederic Grabowski, Julia E. Vogt, Niko Beerenwinkel, Alexander Marx. NeurIPS, 2023 [paper]
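
As one concrete instance of neural MI estimation from this list, a minimal sketch, assuming PyTorch, of the MINE estimator (Belghazi et al., 2018): a critic T(x, z) is trained to tighten the Donsker-Varadhan lower bound I(X; Z) ≥ E_{p(x,z)}[T] − log E_{p(x)p(z)}[e^T]. The critic architecture and in-batch shuffling are illustrative choices.

```python
import math
import torch
import torch.nn as nn

class Critic(nn.Module):
    def __init__(self, x_dim: int, z_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=-1)).squeeze(-1)

def mine_lower_bound(critic, x, z):
    """x, z: (batch, dim) joint samples; marginals come from shuffling z."""
    joint = critic(x, z).mean()                          # E_{p(x,z)}[T]
    z_shuf = z[torch.randperm(z.size(0))]                # approximate p(x)p(z)
    log_mean_exp = torch.logsumexp(critic(x, z_shuf), dim=0) - math.log(z.size(0))
    return joint - log_mean_exp                          # maximize w.r.t. the critic
```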

Applications

Entropy-based Methods

  • Maximum-Entropy Fine Grained Classification. Abhimanyu Dubey, Otkrist Gupta, Ramesh Raskar, Nikhil Naik. NeurIPS, 2018
  • Maximum Entropy-Regularized Multi-Goal Reinforcement Learning. Rui Zhao, Xudong Sun, Volker Tresp. ICML, 2019
  • Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach. Proteek Chandan Roy, Vishnu Naresh Boddeti. CVPR, 2019
  • Semi-Supervised Domain Adaptation via Minimax Entropy. Kuniaki Saito, Donghyun Kim, Stan Sclaroff, Trevor Darrell, Kate Saenko. ICCV, 2019
  • Improving Neural Response Diversity with Frequency-Aware Cross-Entropy Loss. Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten de Rijke. WWW, 2019
  • Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing. Clara Meister, Elizabeth Salesky, Ryan Cotterell. ACL, 2020
  • Distributional Policy Evaluation: a Maximum Entropy approach to Representation Learning. Riccardo Zamboni, Alberto Maria Metelli, Marcello Restelli. NeurIPS, 2023
  • MaxEnt Loss: Constrained Maximum Entropy for Calibration under Out-of-Distribution Shift. Dexter Neo, Stefan Winkler, Tsuhan Chen. AAAI, 2024

Infomax-based Methods

  • Infogan: Interpretable representation learning by information maximizing generative adversarial nets. Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. NeurIPS, 2016
  • Deep graph infomax. Petar Velickovic, William Fedus, William L. Hamilton, Pietro Liò, Yoshua Bengio, R. Devon Hjelm. ICLR, 2019
  • Jointly Learning Semantic Parser and Natural Language Generator via Dual Information Maximization. Hai Ye, Wenjie Li, Lu Wang. ACL, 2019
  • InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization. Fan-Yun Sun, Jordan Hoffmann, Vikas Verma, Jian Tang. ICLR, 2020
  • An Unsupervised Sentence Embedding Method by Mutual Information Maximization. Yan Zhang, Ruidan He, Zuozhu Liu, Kwan Hui Lim, Lidong Bing. EMNLP, 2020
  • Graph Representation Learning via Graphical Mutual Information Maximization. Zhen Peng, Wenbing Huang, Minnan Luo, Qinghua Zheng, Yu Rong, Tingyang Xu, Junzhou Huang. WWW, 2020
  • Info3D: Representation Learning on 3D Objects Using Mutual Information Maximization and Contrastive Learning. Aditya Sanghi. ECCV, 2020
  • A Mutual Information Maximization Approach for the Spurious Solution Problem in Weakly Supervised Question Answering. Zhihong Shao, Lifeng Shang, Qun Liu, Minlie Huang. ACL/IJCNLP, 2021
  • Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis. Wei Han, Hui Chen, Soujanya Poria. EMNLP, 2021
  • Clustering by Maximizing Mutual Information Across Views. Kien Do, Truyen Tran, Svetha Venkatesh. ICCV, 2021
  • Online Continual Learning through Mutual Information Maximization. Yiduo Guo, Bing Liu, Dongyan Zhao. ICML, 2022
  • InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models. Yingheng Wang, Yair Schiff, Aaron Gokaslan, Weishen Pan, Fei Wang, Christopher De Sa, Volodymyr Kuleshov. ICML, 2023
  • DualCL: Principled Supervised Contrastive Learning as Mutual Information Maximization for Text Classification. Junfan Chen, Richong Zhang, Yaowei Zheng, Qianben Chen, Chunming Hu, Yongyi Mao. WWW, 2024
  • Learning to Maximize Mutual Information for Chain-of-Thought Distillation. Xin Chen, Hanxian Huang, Yanjun Gao, Yi Wang, Jishen Zhao, Ke Ding. Findings of ACL, 2024

IB-based Methods

  • Compressing Neural Networks using the Variational Information Bottleneck. Bin Dai, Chen Zhu, Baining Guo, David P. Wipf. ICML, 2018
  • InfoBot: Transfer and Exploration via the Information Bottleneck. Anirudh Goyal, Riashat Islam, Daniel Strouse, Zafarali Ahmed, Hugo Larochelle, Matthew M. Botvinick, Yoshua Bengio, Sergey Levine. ICLR, 2019
  • Specializing Word Embeddings (for Parsing) by Information Bottleneck. Xiang Lisa Li, Jason Eisner. EMNLP/IJCNLP, 2019
  • BottleSum: Unsupervised and Self-supervised Sentence Summarization using the Information Bottleneck Principle. Peter West, Ari Holtzman, Jan Buys, Yejin Choi. EMNLP/IJCNLP, 2019
  • Restricting the Flow: Information Bottlenecks for Attribution. Karl Schulz, Leon Sixt, Federico Tombari, Tim Landgraf. ICLR, 2020
  • Graph Information Bottleneck. Tailin Wu, Hongyu Ren, Pan Li, Jure Leskovec. NeurIPS, 2020
  • Multi-Task Variational Information Bottleneck. Weizhu Qian, Bowei Chen, Franck Gechter. CoRR, 2020
  • DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation. Alexandre Ramé, Matthieu Cord. ICLR, 2021
  • Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization. Kartik Ahuja, Ethan Caballero, Dinghuai Zhang, Jean-Christophe Gagnon-Audet, Yoshua Bengio, Ioannis Mitliagkas, Irina Rish. NeurIPS, 2021
  • Variational information bottleneck for effective low-resource fine-tuning. Rabeeh Karimi Mahabadi, Yonatan Belinkov, and James Henderson. ICLR, 2021
  • Infobert: Improving robustness of language models from an information theoretic perspective. Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu. ICLR, 2021
  • Learning unbiased representations via mutual information backpropagation. Ruggero Ragonesi, Riccardo Volpi, Jacopo Cavazza, and Vittorio Murino. CVPR Workshop, 2021
  • IB-GAN: Disentangled Representation Learning with Information Bottleneck Generative Adversarial Networks. Insu Jeon, Wonkwang Lee, Myeongjang Pyeon, Gunhee Kim. AAAI, 2021
  • Invariant Information Bottleneck for Domain Generalization. Bo Li, Yifei Shen, Yezhen Wang, Wenzhen Zhu, Colorado Reed, Dongsheng Li, Kurt Keutzer, Han Zhao. AAAI, 2022
  • Self-Supervised Information Bottleneck for Deep Multi-View Subspace Clustering. Shiye Wang, Changsheng Li, Yanming Li, Ye Yuan, Guoren Wang. IEEE Trans. Image Process., 2023
