This is an extensive and continuously updated compilation of self-supervised GFM literature, categorized by the knowledge-based taxonomy proposed in our paper 📄A Survey on Self-Supervised Graph Foundation Models: Knowledge-Based Perspective. Every pretext of each paper is listed and briefly explained. Below you can find all pretexts and their corresponding papers with detailed metadata, including additional pretexts and literature not listed in our paper.
A kind reminder: to search for a certain paper, type the title or the abbreviation of the proposed method (recommended) into the browser search bar (Ctrl + F). Some papers fall under multiple sections.
- [5 Dec 2024]: Updated NeurIPS'24 (pt.2), COLM'24, EMNLP'24, LoG'24, and WSDM'25 (pt.1) papers.
- [4 Oct 2024]: Updated CIKM'24 and NeurIPS'24 (pt.1) papers.
- [2 Sept 2024]: Updated IJCAI'24, SIGIR'24, and KDD'24 papers.
- [1 Aug 2024]: We have a huge update thanks to Dr. Yixin Su joining us! Please check the new version of our survey here!🔥
- [1 Aug 2024]: Updated ICDE'24 and MM'24 papers.
- [24 Mar 2024]: Our survey has been uploaded to arXiv!
- Microscopic knowledge
- Mesoscopic knowledge
- Macroscopic knowledge
Note: 🕸️ graph-related; 🤖 LLM-related; 📚 survey; 📊 benchmark; 🔬 empirical study
Node features
Feature prediction
- Feature prediction: to predict the original node features by decoding low-dimensional representations
- Feature denoising: to add (generally continuous, e.g. isotropic Gaussian) noise to the original features and try to reconstruct them
- Masked feature prediction: a special, discrete case of feature denoising, which predicts the original features of masked nodes by representations of unmasked ones. It is "autoregressive" if the predicted nodes are generated one-by-one
- Replaced node prediction: to replace some nodes with different ones and learn to find and reconstruct the replaced nodes
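As a rough illustration of masked feature prediction, the sketch below masks the features of some nodes and reconstructs them from unmasked neighbors. It is a toy under stated assumptions, not the method of any listed paper: the helper names (`mask_features`, `mean_neighbor_decoder`, `masked_mse`) are hypothetical, and a parameter-free neighbor mean stands in for the learned GNN encoder and decoder.

```python
import random

def mask_features(features, mask_rate, rng):
    """Zero out the features of a random subset of nodes.

    Returns a corrupted copy of the features and the sorted masked node ids.
    """
    n = len(features)
    num_masked = max(1, int(round(mask_rate * n)))
    masked = sorted(rng.sample(range(n), num_masked))
    corrupted = [list(row) for row in features]
    for i in masked:
        corrupted[i] = [0.0] * len(features[i])
    return corrupted, masked

def mean_neighbor_decoder(corrupted, adjacency, masked):
    """Toy stand-in for encoder + decoder (assumption): predict each masked
    node's features as the mean of its unmasked neighbors' features."""
    masked_set = set(masked)
    preds = {}
    for i in masked:
        visible = [j for j in adjacency[i] if j not in masked_set]
        dim = len(corrupted[i])
        if visible:
            preds[i] = [sum(corrupted[j][d] for j in visible) / len(visible)
                        for d in range(dim)]
        else:
            preds[i] = [0.0] * dim
    return preds

def masked_mse(features, preds):
    """Mean squared reconstruction error, computed on masked nodes only."""
    total, count = 0.0, 0
    for i, pred in preds.items():
        for d, p in enumerate(pred):
            total += (features[i][d] - p) ** 2
            count += 1
    return total / count

# Demo on a 3-node path graph 0 - 1 - 2 with 1-dimensional features.
features = [[1.0], [2.0], [3.0]]
adjacency = {0: [1], 1: [0, 2], 2: [1]}
corrupted, masked = mask_features(features, mask_rate=0.34, rng=random.Random(0))
predictions = mean_neighbor_decoder(corrupted, adjacency, masked)
print("masked nodes:", masked, "MSE:", masked_mse(features, predictions))
```

In a real pipeline the loss would be backpropagated through a trainable encoder; replacing the zero mask with additive Gaussian noise would turn the same skeleton into feature denoising.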
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
MGAE: Marginalized Graph Autoencoder for Graph Clustering | CIKM'17 | Feature prediction | Graph partitioning | link |
Symmetric Graph Convolutional Autoencoder for Unsupervised Graph Representation Learning (GALA) | ICCV'19 | Feature prediction | Node clustering; link prediction; image clustering | link |
Strategies for Pre-training Graph Neural Networks (AttrMask) | ICLR'20 | Masked feature prediction | Graph classification; biological function prediction | link |
Graph Representation Learning via Graphical Mutual Information Maximization (GMI) | WWW'20 | Feature prediction (JS) | Node classification; link prediction | link |
When Does Self-Supervision Help Graph Convolutional Networks? (GraphComp) | ICML'20 | Masked feature prediction | Node classification | link |
GPT-GNN: Generative Pre-Training of Graph Neural Networks | KDD'20 | Masked feature prediction (autoregressive) | Node classification; (heterogeneous) link prediction; edge regression (recommendation score) | link |
Graph Attention Auto-Encoders (GATE) | ICTAI'20 | Feature prediction | Node classification | link |
Graph-Bert: Only Attention is Needed for Learning Graph Representations | arXiv:2001 | Feature prediction | Node classification; node clustering | link |
Self-supervised Learning on Graphs: Deep Insights and New Direction (AttributeMask) | arXiv:2006 | Masked feature prediction | Node classification | link |
SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks | NeurIPS'21 | Masked feature prediction | Node classification; image classification | link |
Motif-based Graph Self-Supervised Learning for Molecular Property Prediction (MGSSL) | NeurIPS'21 | Masked feature prediction | Graph classification | link |
Multi-Scale Variational Graph AutoEncoder for Link Prediction (MSVGAE) | WSDM'22 | Feature prediction | Link prediction | -- |
Self-Supervised Representation Learning via Latent Graph Prediction (LaGraph) | ICML'22 | Masked feature prediction | Node classification; graph classification | link |
GraphMAE: Self-Supervised Masked Graph Autoencoders | KDD'22 | Masked feature prediction | Node classification; graph classification | link |
Interpretable Node Representation with Attribute Decoding (NORAD) | TMLR'22 | Feature prediction | Node classification; node clustering; link prediction | -- |
Graph Masked Autoencoders with Transformers (GMAE) | arXiv:2202 | Masked feature prediction | Node classification; graph classification | link |
Wiener Graph Deconvolutional Network Improves Graph Self-Supervised Learning (WGDN) | AAAI'23 | Feature prediction | Node classification; graph classification | link |
Heterogeneous Graph Masked Autoencoders (HGMAE) | AAAI'23 | Feature prediction; masked feature prediction | (Heterogeneous) node classification; node clustering | link |
Mole-BERT: Rethinking Pre-training Graph Neural Networks for Molecules | ICLR'23 | Masked feature prediction | Graph classification; graph regression | link |
GraphMAE2: A Decoding-Enhanced Masked Self-Supervised Graph Learner | WWW'23 | Masked feature prediction | Node classification | link |
SeeGera: Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking | WWW'23 | Masked feature prediction | Node classification; link prediction; attribute prediction | link |
Patton: Language Model Pretraining on Text-Rich Networks | ACL'23 | Masked feature prediction<sup>1</sup> | Node classification; link prediction; etc. | link |
Directional Diffusion Models for Graph Representation Learning (DDM) | NeurIPS'23 | Feature denoising | Node classification; graph classification | link |
DiP-GNN: Discriminative Pre-Training of Graph Neural Networks | NeurIPS Workshop (GLFrontiers)'23 | Masked feature prediction | Node classification; link prediction | -- |
Towards Effective and Robust Graph Contrastive Learning With Graph Autoencoding (AEGCL) | TKDE'23 | Feature prediction | Node classification; node clustering; link prediction | link |
RARE: Robust Masked Graph Autoencoder | TKDE'23 | Masked feature prediction | Node classification; graph classification; image classification | link |
Homophily-Enhanced Self-Supervision for Graph Structure Learning: Insights and Directions (HES-GSL) | TNNLS'23 | Feature denoising | Node classification | link |
Incomplete Graph Learning via Attribute-Structure Decoupled Variational Auto-Encoder (ASD-VAE) | WSDM'24 | Feature prediction | Node classification; node attribute completion | link |
Deep Contrastive Graph Learning with Clustering-Oriented Guidance (DCGL) | AAAI'24 | Feature prediction | Node clustering | link |
Rethinking Graph Masked Autoencoders through Alignment and Uniformity (AUG-MAE) | AAAI'24 | Masked feature prediction | Node classification; graph classification | link |
Empowering Dual-Level Graph Self-Supervised Pretraining with Motif Discovery (DGPM) | AAAI'24 | Masked feature prediction | Graph classification | link |
Generative and Contrastive Paradigms Are Complementary for Graph Self-Supervised Learning (GCMAE1) | ICDE'24 | Masked feature prediction | Node classification; node clustering; graph classification; link prediction | link |
Masked Graph Modeling with Multi-View Contrast (GCMAE2) | ICDE'24 | Masked feature prediction | Node classification; graph classification; link prediction | link |
DiscoGNN: A Sample-Efficient Framework for Self-Supervised Graph Representation Learning | ICDE'24 | Replaced node prediction | Graph classification; similarity search | link (unavailable) |
IdmGAE: Importance-Inspired Dynamic Masking for Graph Autoencoders | SIGIR'24 (short) | Masked feature prediction | Node classification | -- |
Where to Mask: Structure-Guided Masking for Graph Masked Autoencoders (StructMAE) | IJCAI'24 | Masked feature prediction | Graph classification | link |
Reserving-Masking-Reconstruction Model for Self-Supervised Heterogeneous Graph Representation (RMR) | KDD'24 | Masked feature prediction | (Heterogeneous) node classification | link |
A Pure Transformer Pretraining Framework on Text-attributed Graphs (GSPT) | LoG'24 | Masked feature prediction<sup>1</sup> | Node classification; link prediction | link (unavailable) |
HC-GAE: The Hierarchical Cluster-based Graph Auto-Encoder for Graph Representation Learning | NeurIPS'24 | Feature prediction | Node classification; graph classification | -- |
Redundancy Is Not What You Need: An Embedding Fusion Graph Auto-Encoder for Self-Supervised Graph Representation Learning (EFGAE) | TNNLS'24 | Feature prediction | Node classification | -- |
Exploring Task Unification in Graph Representation Learning via Generative Approach (GA2E) | arXiv:2403 | Masked feature prediction | Node classification; graph classification; link prediction | -- |
SimMLP: Training MLPs on Graphs without Supervision | WSDM'25 | Feature prediction | Node classification; graph classification; link prediction | link |
<sup>1</sup> For language models: to randomly mask a set of node tokens and try to reconstruct them
Discrimination (contrastive)
- Instance discrimination: to minimize/maximize the distance between pairs of positive/negative representation samples. Jensen-Shannon (JS), InfoNCE (incl. NT-Xent), Triplet margin, and Bootstrapping are all estimators of mutual information (MI) between nodes. Other contrastive losses:
  - MSE stands for the mean squared error ($\ell_2$ loss)
  - SP stands for the population spectral contrastive loss
  - BPR stands for Bayesian Personalized Ranking loss, mostly used in recommendation
  - Other stands for any loss that does not belong to the categories above
- Dimension discrimination: to minimize/maximize the mutual information (MI) between pairs of positive/negative representation dimensions. Could be either intra-sample or inter-sample
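To make instance discrimination concrete, here is a minimal sketch of an InfoNCE-style loss over two views of the same nodes. It assumes that row `i` of each view is the positive pair and all other rows of the second view are negatives, that similarity is cosine, and that the temperature is 0.5. The function names are hypothetical and individual papers differ in how they pick negatives and symmetrize the loss.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss: (view_a[i], view_b[i]) is the positive pair and
    (view_a[i], view_b[j]) with j != i are the negatives (assumption)."""
    n = len(view_a)
    loss = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over positive and negative logits.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denominator - logits[i]
    return loss / n

# Demo: representations of two nodes under two augmented views.
aligned_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[1.0, 0.0], [0.0, 1.0]]
swapped_b = [[0.0, 1.0], [1.0, 0.0]]
print("aligned views:", info_nce(aligned_a, aligned_b))
print("swapped views:", info_nce(aligned_a, swapped_b))
```

The loss is low when each node is most similar to its own counterpart in the other view and high when it is more similar to a negative, which is the behavior the InfoNCE entries in the table below rely on.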
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Deep Graph Contrastive Representation Learning (GRACE) | ICML Workshop (GRL+)'20 | Instance discrimination (InfoNCE) | Node classification | link |
GraphTER: Unsupervised Learning of Graph Transformation Equivariant Representations via Auto-Encoding Node-wise Transformations | CVPR'20 | Instance discrimination (MSE) | Graph (point cloud) classification; node classification (point cloud segmentation) | link |
Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning (CG3) | AAAI'21 | Instance discrimination (InfoNCE) | Node classification | link |
Graph Contrastive Learning with Adaptive Augmentation (GCA) | WWW'21 | Instance discrimination (InfoNCE) | Node classification | link |
SelfGNN: Self-supervised Graph Neural Networks without Explicit Negative Sampling | WWW Workshop (SSL)'21 | Instance discrimination (Bootstrapping) | Node classification | link |
Self-supervised Graph Learning for Recommendation (SGL) | SIGIR'21 | Instance discrimination (InfoNCE, BPR) | Recommendation | link |
Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning (MERIT) | IJCAI'21 | Instance discrimination (InfoNCE) | Node classification | link |
Pre-training on Large-Scale Heterogeneous Graph (PT-HGNN) | KDD'21 | Instance discrimination (InfoNCE) | (Heterogeneous) node classification; link prediction | link |
Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning (HeCo); Hierarchical Contrastive Learning Enhanced Heterogeneous Graph Neural Network (HeCo++) | KDD'21; TKDE'23 | Instance discrimination (InfoNCE) | (Heterogeneous) node classification; node clustering | link |
InfoGCL: Information-Aware Graph Contrastive Learning | NeurIPS'21 | Instance discrimination (Bootstrapping) | Node classification; graph classification | -- |
From Canonical Correlation Analysis to Self-supervised Graph Neural Networks (CCA-SSG) | NeurIPS'21 | Instance discrimination (MSE); dimension discrimination | Node classification | link |
Self-Supervised GNN that Jointly Learns to Augment (GraphSurgeon) | NeurIPS Workshop (SSL)'21 | Instance discrimination (MSE); dimension discrimination | Node classification | link |
Simple Unsupervised Graph Representation Learning (SUGRL) | AAAI'22 | Instance discrimination (Triplet margin) | Node classification | link |
Large-Scale Representation Learning on Graphs via Bootstrapping (BGRL) | ICLR'22 | Instance discrimination (Bootstrapping) | Node classification | link |
VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning | ICLR'22 | Instance discrimination (MSE); dimension discrimination | Node classification | link |
Adversarial Graph Contrastive Learning with Information Regularization (ARIEL) | WWW'22 | Instance discrimination (InfoNCE) | Node classification; graph classification | link |
Are Graph Augmentations Necessary? Simple Graph Contrastive Learning for Recommendation (SimGCL); XSimGCL: Towards Extremely Simple Graph Contrastive Learning for Recommendation | SIGIR'22; TKDE'23 | Instance discrimination (InfoNCE, BPR) | Recommendation | link |
Self-Supervised Representation Learning via Latent Graph Prediction (LaGraph) | ICML'22 | Instance discrimination (MSE) | Node classification; graph classification | link |
ProGCL: Rethinking Hard Negative Mining in Graph Contrastive Learning | ICML'22 | Instance discrimination (InfoNCE) | Node classification | link |
COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning | KDD'22 | Instance discrimination (InfoNCE) | Node classification | link |
Relational Self-Supervised Learning on Graphs (RGRL) | CIKM'22 | Instance discrimination (Bootstrapping) | Node classification; link prediction | link |
Revisiting Graph Contrastive Learning from the Perspective of Graph Spectrum (SpCo) | NeurIPS'22 | Instance discrimination (InfoNCE) | Node classification | link |
Contrastive Graph Structure Learning via Information Bottleneck for Recommendation (CGI) | NeurIPS'22 | Instance discrimination (InfoNCE) | Recommendation | link |
Uncovering the Structural Fairness in Graph Contrastive Learning (GRADE) | NeurIPS'22 | Instance discrimination (InfoNCE) | Node classification | link |
Co-Modality Graph Contrastive Learning for Imbalanced Node Classification (CM-GCL) | NeurIPS'22 | Instance discrimination (InfoNCE) | Node classification (imbalanced) | link |
Graph Barlow Twins: A Self-supervised Representation Learning Framework for Graphs (G-BT) | KBS'22 | Dimension discrimination | Node classification | link |
Towards Graph Self-Supervised Learning with Contrastive Adjusted Zooming (G-Zoom) | TNNLS'22 | Instance discrimination (InfoNCE) | Node classification | -- |
GRLC: Graph Representation Learning With Constraints | TNNLS'22 | Instance discrimination (Triplet margin) | Node classification; node clustering; link prediction | link |
Neural Eigenfunctions Are Structured Representation Learners (NeuralEF) | arXiv:2210 | Dimension discrimination | Node classification; computer vision (object detection, instance segmentation, etc) | link |
MA-GCL: Model Augmentation Tricks for Graph Contrastive Learning | AAAI'23 | Instance discrimination (InfoNCE) | Node classification | link |
ImGCL: Revisiting Graph Contrastive Learning on Imbalanced Node Classification | AAAI'23 | Instance discrimination (InfoNCE) | Node classification (imbalanced) | -- |
Spectral Feature Augmentation for Graph Contrastive Learning and Beyond (SFA) | AAAI'23 | Instance discrimination (Other) | Node classification; node clustering; graph classification; image classification | link |
Beyond Smoothing: Unsupervised Graph Representation Learning with Edge Heterophily Discriminating (GREET) | AAAI'23 | Instance discrimination (Triplet margin) | Node classification | link |
Link Prediction with Non-Contrastive Learning (T-BGRL) | ICLR'23 | Instance discrimination (Bootstrapping) | Link prediction | link |
LightGCL: Simple Yet Effective Graph Contrastive Learning for Recommendation | ICLR'23 | Instance discrimination (InfoNCE) | Recommendation | link |
GraphMAE2: A Decoding-Enhanced Masked Self-Supervised Graph Learner | WWW'23 | Instance discrimination (MSE) | Node classification | link |
Graph Self-supervised Learning with Augmentation-aware Contrastive Learning (ABGML) | WWW'23 | Instance discrimination (Bootstrapping) | Node classification; node clustering; similarity search | link |
Randomized Schur Complement Views for Graph Contrastive Learning (rLap) | ICML'23 | Instance discrimination (InfoNCE, Bootstrapping) | Node classification | link |
Graph Contrastive Learning with Generative Adversarial Network (GACN) | KDD'23 | Instance discrimination (InfoNCE, BPR) | Node classification; link prediction | -- |
GiGaMAE: Generalizable Graph Masked Autoencoder via Collaborative Latent Space Reconstruction | CIKM'23 | Instance discrimination (InfoNCE) | Node classification; node clustering; link prediction | link |
Exploring Universal Principles for Graph Contrastive Learning: A Statistical Perspective | MM'23 | Dimension discrimination | Node classification | -- |
Provable Training for Graph Contrastive Learning (POT) | NeurIPS'23 | Instance discrimination (InfoNCE) | Node classification | link |
Graph Contrastive Learning with Stable and Scalable Spectral Encoding (Sp2GCL) | NeurIPS'23 | Instance discrimination (InfoNCE) | Node classification; graph classification; graph regression | link |
RARE: Robust Masked Graph Autoencoder | TKDE'23 | Instance discrimination (MSE) | Node classification; graph classification; image classification | link |
Multi-Scale Self-Supervised Graph Contrastive Learning With Injective Node Augmentation (MS-CIA) | TKDE'23 | Instance discrimination (InfoNCE) | Node classification | -- |
Boosting Graph Contrastive Learning via Adaptive Sampling (AdaS) | TNNLS'23 | Instance discrimination (InfoNCE) | Node classification | -- |
Affinity Uncertainty-Based Hard Negative Mining in Graph Contrastive Learning (AUGCL) | TNNLS'23 | Instance discrimination (InfoNCE) | Node classification | link |
Unsupervised Structure-Adaptive Graph Contrastive Learning | TNNLS'23 | Instance discrimination (InfoNCE) | Node classification; node clustering; graph classification | -- |
Hierarchically Contrastive Hard Sample Mining for Graph Self-Supervised Pretraining (HCHSM) | TNNLS'23 | Instance discrimination (JS) | Node classification; node clustering | link |
Dual Contrastive Learning Network for Graph Clustering (DCLN) | TNNLS'23 | Dimension discrimination | Node classification; node clustering | link |
Graph Contrastive Learning With Adaptive Proximity-Based Graph Augmentation (PA-GCL) | TNNLS'23 | Dimension discrimination | Node classification; link prediction | link |
Augmentation-Free Graph Contrastive Learning of Invariant-Discriminative Representations (iGCL) | TNNLS'23 | Instance discrimination (MSE); dimension discrimination | Node classification | link |
Single-Pass Contrastive Learning Can Work for Both Homophilic and Heterophilic Graph (SP-GCL) | TMLR'23 | Instance discrimination (SP) | Node classification | link |
Calibrating and Improving Graph Contrastive Learning (Contrast-Reg) | TMLR'23 | Instance discrimination (InfoNCE) | Node classification; node clustering; link prediction | link |
Oversmoothing: A Nightmare for Graph Contrastive Learning? (BlockGCL) | arXiv:2306 | Dimension discrimination | Node classification | link |
Rethinking and Simplifying Bootstrapped Graph Latents (SGCL2) | WSDM'24 | Instance discrimination (Bootstrapping) | Node classification | link |
Towards Alignment-Uniformity Aware Representation in Graph Contrastive Learning (AUAR) | WSDM'24 | Instance discrimination (InfoNCE) | Node classification; node clustering | -- |
ReGCL: Rethinking Message Passing in Graph Contrastive Learning | AAAI'24 | Instance discrimination (InfoNCE) | Node classification | link |
A New Mechanism for Eliminating Implicit Conflict in Graph Contrastive Learning (PiGCL) | AAAI'24 | Instance discrimination (InfoNCE) | Node classification; node clustering | link |
ASWT-SGNN: Adaptive Spectral Wavelet Transform-Based Self-Supervised Graph Neural Network | AAAI'24 | Instance discrimination (InfoNCE) | Node classification; graph classification | -- |
Graph Contrastive Invariant Learning from the Causal Perspective (GCIL) | AAAI'24 | Dimension discrimination | Node classification | link |
A Graph is Worth 1-bit Spikes: When Graph Contrastive Learning Meets Spiking Neural Networks (SpikeGCL) | ICLR'24 | Instance discrimination (Triplet margin) | Node classification | link |
Self-supervised Heterogeneous Graph Learning: a Homophily and Heterogeneity View (HERO) | ICLR'24 | Instance discrimination (MSE) | (Heterogeneous) node classification; similarity search | link |
Generative and Contrastive Paradigms Are Complementary for Graph Self-Supervised Learning (GCMAE1) | ICDE'24 | Instance discrimination (InfoNCE) | Node classification; node clustering; graph classification; link prediction | link |
GradGCL: Gradient Graph Contrastive Learning | ICDE'24 | Instance discrimination (InfoNCE) | Node classification; graph classification | link |
Incorporating Dynamic Temperature Estimation into Contrastive Learning on Graphs (GLATE) | ICDE'24 | Instance discrimination (InfoNCE) | Node classification; node clustering; graph classification; link prediction | link |
Graph Augmentation for Recommendation (GraphAug) | ICDE'24 | Instance discrimination (InfoNCE, BPR) | Recommendation | link |
Graph Contrastive Learning with Cohesive Subgraph Awareness (CTAug) | WWW'24 | Instance discrimination (InfoNCE) | Node classification | link |
Towards Expansive and Adaptive Hard Negative Mining: Graph Contrastive Learning via Subspace Preserving (GRAPE) | WWW'24 | Instance discrimination (InfoNCE) | Node classification; node clustering | link |
MARIO: Model Agnostic Recipe for Improving OOD Generalization of Graph Contrastive Learning | WWW'24 | Instance discrimination (InfoNCE) | Node classification; graph classification | link |
Graph Contrastive Learning via Interventional View Generation (GCL-IVG) | WWW'24 | Instance discrimination (InfoNCE) | Node classification; node clustering | -- |
Graph Contrastive Learning with Kernel Dependence Maximization for Social Recommendation (CL-KDM) | WWW'24 | Instance discrimination (InfoNCE, BPR) | Recommendation | -- |
High-Frequency-aware Hierarchical Contrastive Selective Coding for Representation Learning on Text-attributed Graphs (HASH-CODE) | WWW'24 | Instance discrimination (SP) | Node classification; link prediction | -- |
S3GCL: Spectral, Swift, Spatial Graph Contrastive Learning | ICML'24 | Instance discrimination (InfoNCE) | Node classification | link |
Geometric View of Soft Decorrelation in Self-Supervised Learning (LogDet) | KDD'24 | Dimension discrimination | Node classification | -- |
Towards Robust Recommendation via Decision Boundary-aware Graph Contrastive Learning (RGCL2) | KDD'24 | Instance discrimination (InfoNCE, BPR) | Recommendation | link |
Gaussian Mutual Information Maximization for Efficient Graph Self-Supervised Learning: Bridging Contrastive-based to Decorrelation-based (GMIM) | MM'24 | Dimension discrimination | Node classification | -- |
Exploitation of a Latent Mechanism in Graph Contrastive Learning: Representation Scattering (SGRL) | NeurIPS'24 | Instance discrimination (Bootstrapping) | Node classification; node clustering | link |
Leveraging Contrastive Learning for Enhanced Node Representations in Tokenized Graph Transformers (GCFormer) | NeurIPS'24 | Instance discrimination (InfoNCE) | Node classification | -- |
Unified Graph Augmentations for Generalized Contrastive Learning on Graphs (GOUDA) | NeurIPS'24 | Instance discrimination (InfoNCE); Dimension discrimination | Node classification; node clustering; graph classification | link |
Redundancy Is Not What You Need: An Embedding Fusion Graph Auto-Encoder for Self-Supervised Graph Representation Learning (EFGAE) | TNNLS'24 | Dimension discrimination | Node classification | -- |
Multilevel Contrastive Graph Masked Autoencoders for Unsupervised Graph-Structure Learning (MCGMAE) | TNNLS'24 | Instance discrimination (InfoNCE) | Node classification | -- |
SimMLP: Training MLPs on Graphs without Supervision | WSDM'25 | Instance discrimination (MSE) | Node classification; graph classification; link prediction | link |
UniGLM: Training One Unified Language Model for Text-Attributed Graphs | WSDM'25 | Instance discrimination (InfoNCE)<sup>1</sup> | Node classification; link prediction | link |
<sup>1</sup> For language models.
Node properties
- Property prediction: a regression task to predict the property of a node (e.g. degree)
- Centrality ranking: to estimate whether the centrality score of a node is greater/lower than that of another node
- Node order matching: to match the output node order with the input order
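The property-based pretexts above derive their labels from the graph itself. The sketch below shows one way this could look for centrality ranking: node degree serves as the centrality score (an assumption; other centralities would work the same way), pairwise "greater/lower" labels are built from it, and a pairwise logistic loss is evaluated on predicted scores. All function names are hypothetical.

```python
import math

def degrees(adjacency):
    """Degree of every node; usable directly as a regression target
    for property prediction."""
    return {v: len(neighbors) for v, neighbors in adjacency.items()}

def ranking_pairs(centrality):
    """Build (u, v, label) triples: label is 1.0 if u is more central
    than v and 0.0 if less central. Ties are skipped."""
    nodes = sorted(centrality)
    pairs = []
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if centrality[u] != centrality[v]:
                label = 1.0 if centrality[u] > centrality[v] else 0.0
                pairs.append((u, v, label))
    return pairs

def pairwise_ranking_loss(scores, pairs):
    """Binary cross-entropy on sigmoid(score_u - score_v)."""
    total = 0.0
    for u, v, label in pairs:
        p = 1.0 / (1.0 + math.exp(-(scores[u] - scores[v])))
        total -= label * math.log(p) + (1.0 - label) * math.log(1.0 - p)
    return total / len(pairs)

# Demo: a star graph where node 0 is the hub.
adjacency = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
pairs = ranking_pairs(degrees(adjacency))
consistent = {0: 2.0, 1: 0.0, 2: 0.0, 3: 0.0}   # hub scored highest
reversed_ = {0: -2.0, 1: 0.0, 2: 0.0, 3: 0.0}   # hub scored lowest
print("consistent scores:", pairwise_ranking_loss(consistent, pairs))
print("reversed scores:", pairwise_ranking_loss(reversed_, pairs))
```

In practice the scores would come from a small head on top of the node representations, so minimizing this loss pushes the encoder to capture structural roles.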
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Unsupervised Pre-training of Graph Convolutional Networks (ScoreRank) | ICLR Workshop (RLGM)'19 | Centrality ranking | Node classification | -- |
Self-supervised Learning on Graphs: Deep Insights and New Direction (NodeProperty) | arXiv:2006 | Property prediction (degree, clustering coefficient, etc.) | Node classification | link |
Permutation-Invariant Variational Autoencoder for Graph-Level Representation Learning (PIGAE) | NeurIPS'21 | Node order matching | Graph classification | link |
Graph Auto-Encoder Via Neighborhood Wasserstein Reconstruction (NWR-GAE) | ICLR'22 | Property prediction (degree) | Node classification; structural role identification | link |
What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders (MaskGAE) | KDD'23 | Property prediction (degree) | Node classification; link prediction | link |
Links
- Link prediction: a generally binary classification task that predicts if two nodes are connected by a link. For heterogeneous graphs, link prediction is based on meta-paths. For hypergraphs, link prediction searches for the missing node given other nodes in a hyperedge
- Link denoising: to add (generally continuous) noise to the original edge set and try to reconstruct it
- Masked link prediction: to predict the masked links by node representations propagated on the unmasked graph. It is "autoregressive" if the predicted links are generated one-by-one
- (Masked) edge feature prediction: to predict the original (masked) edge features by node representations
- Replaced edge feature prediction: to replace some edge properties with different ones and learn to find and reconstruct the replaced edges, similar to replaced node prediction
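A minimal sketch of (masked) link prediction as binary classification follows. It assumes an undirected graph, an inner-product decoder `sigmoid(z_u · z_v)` in the style of graph autoencoders, and all non-edges as negatives; the helper names are hypothetical, and real implementations usually sample negatives rather than enumerating them.

```python
import math
import random

def split_masked_edges(edges, mask_rate, rng):
    """Randomly hide a fraction of edges. Returns (visible, masked);
    the encoder would only see the visible ones."""
    shuffled = list(edges)
    rng.shuffle(shuffled)
    k = max(1, int(round(mask_rate * len(shuffled))))
    return shuffled[k:], shuffled[:k]

def non_edges(num_nodes, edges):
    """All unordered node pairs that are not connected (negative examples)."""
    existing = {frozenset(e) for e in edges}
    return [(u, v) for u in range(num_nodes) for v in range(u + 1, num_nodes)
            if frozenset((u, v)) not in existing]

def edge_probability(z, u, v):
    """Inner-product decoder: sigmoid of the dot product of two embeddings."""
    score = sum(a * b for a, b in zip(z[u], z[v]))
    return 1.0 / (1.0 + math.exp(-score))

def link_bce(z, positives, negatives, eps=1e-12):
    """Binary cross-entropy: positives should score 1, negatives 0."""
    loss = 0.0
    for u, v in positives:
        loss -= math.log(edge_probability(z, u, v) + eps)
    for u, v in negatives:
        loss -= math.log(1.0 - edge_probability(z, u, v) + eps)
    return loss / (len(positives) + len(negatives))

# Demo: 4-node path graph; embeddings here are fixed by hand, not learned.
edges = [(0, 1), (1, 2), (2, 3)]
visible, masked = split_masked_edges(edges, mask_rate=0.34, rng=random.Random(0))
z = [[2.0, 0.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
print("masked edges:", masked)
print("loss on masked edges:", link_bce(z, masked, non_edges(4, edges)))
```

With `mask_rate` set so that nothing is hidden from the encoder, the same loss corresponds to plain link prediction (graph reconstruction).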
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Variational Graph Auto-Encoders (GAE, VGAE) | NIPS Workshop (BDL)'16 | Link prediction | Link prediction | link |
Adversarially Regularized Graph Autoencoder for Graph Embedding (ARGA, ARVGA) | IJCAI'18 | Link prediction | Link prediction; node clustering | link |
Unsupervised Pre-training of Graph Convolutional Networks (DenoisingRecon) | ICLR Workshop (RLGM)'19 | Masked link prediction | Node classification | -- |
Graphite: Iterative Generative Modeling of Graphs | ICML'19 | Link prediction | Node classification; link prediction | link |
Semi-Implicit Graph Variational Auto-Encoders (SIG-VAE) | NeurIPS'19 | Link prediction | Node classification; link prediction; node clustering; graph generation | link |
Strategies for Pre-training Graph Neural Networks (AttrMask) | ICLR'20 | Masked edge feature prediction | Graph classification; biological function prediction | link |
GPT-GNN: Generative Pre-Training of Graph Neural Networks | KDD'20 | Masked link prediction (autoregressive) | Node classification; link prediction; edge classification | link |
Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs (SELAR) | NeurIPS'20 | Link prediction | (Heterogeneous) node classification; link prediction | link |
Self-supervised Learning on Graphs: Deep Insights and New Direction (EdgeMask) | arXiv:2006 | Masked link prediction | Node classification | link |
Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning (CG3) | AAAI'21 | Link prediction | Node classification | link |
How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision (SuperGAT) | ICLR'21 | Link prediction | Node classification; link prediction | link |
Permutation-Invariant Variational Autoencoder for Graph-Level Representation Learning (PIGAE) | NeurIPS'21 | Link prediction; edge feature prediction | Graph classification | link |
Motif-based Graph Self-Supervised Learning for Molecular Property Prediction (MGSSL) | NeurIPS'21 | Masked edge feature prediction | Graph classification | link |
Self-Supervised Graph Representation Learning via Topology Transformations (TopoTER) | TKDE'21 | Masked link prediction | Node classification; graph classification; link prediction | link |
Directed Graph Auto-Encoders (DiGAE) | AAAI'22 | Link prediction | (Directed) link prediction | link |
GPPT: Graph Pre-training and Prompt Tuning to Generalize Graph Neural Networks | KDD'22 | Masked link prediction | Node classification | link |
Link Prediction with Contextualized Self-Supervision (CSSL2) | TKDE'22 | Link prediction | Link prediction | link |
Interpretable Node Representation with Attribute Decoding (NORAD) | TMLR'22 | Link prediction | Node classification; node clustering; link prediction | -- |
S2GAE: Self-Supervised Graph Autoencoders are Generalizable Learners with Graph Masking | WSDM'23 | Masked link prediction | Node classification; graph classification; link prediction | link |
Dual Low-Rank Graph Autoencoder for Semantic and Topological Networks (DLR-GAE) | AAAI'23 | Link prediction | Node classification | link |
Heterogeneous Graph Masked Autoencoders (HGMAE) | AAAI'23 | Masked link prediction | (Heterogeneous) node classification; node clustering | link |
Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding (DMGAE, DMVGAE) | ICASSP'23 | Link prediction | Node clustering; link prediction | -- |
Multi-head Variational Graph Autoencoder Constrained by Sum-product Networks (SPN-MVGAE) | WWW'23 | Link prediction | Node classification; link prediction | link (unavailable) |
SeeGera: Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking | WWW'23 | Masked link prediction | Node classification; link prediction; attribute prediction | link |
Graph-Aware Language Model Pre-Training on a Large Graph Corpus Can Help Multiple Graph Applications (GALM) | KDD'23 | Link prediction | (Heterogeneous) node classification; link prediction; edge classification | -- |
DiP-GNN: Discriminative Pre-Training of Graph Neural Networks | NeurIPS Workshop (GLFrontiers)'23 | Masked link prediction | Node classification; link prediction | -- |
Maximizing Mutual Information Across Feature and Topology Views for Representing Graphs (MVMI-FT) | TKDE'23 | Link prediction | Node classification; node clustering | link |
Towards Effective and Robust Graph Contrastive Learning With Graph Autoencoding (AEGCL) | TKDE'23 | Link prediction | Node classification; node clustering; link prediction | link |
ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt | arXiv:2310 | Link prediction | Node classification; link prediction | link |
Incomplete Graph Learning via Attribute-Structure Decoupled Variational Auto-Encoder (ASD-VAE) | WSDM'24 | Edge feature prediction | Node classification; node attribute completion | link |
Generative and Contrastive Paradigms Are Complementary for Graph Self-Supervised Learning (GCMAE1) | ICDE'24 | Link prediction | Node classification; node clustering; graph classification; link prediction | link |
DiscoGNN: A Sample-Efficient Framework for Self-Supervised Graph Representation Learning | ICDE'24 | Replaced edge feature prediction | Graph classification; similarity search | link (unavailable) |
Decoupled Variational Graph Autoencoder for Link Prediction (D-VGAE) | WWW'24 | Link prediction | Node classification; node clustering; link prediction | link |
Masked Graph Autoencoder with Non-discrete Bandwidths (Bandana) | WWW'24 | Link denoising | Node classification; link prediction | link |
HC-GAE: The Hierarchical Cluster-based Graph Auto-Encoder for Graph Representation Learning | NeurIPS'24 | Link prediction | Node classification; graph classification | -- |
Redundancy Is Not What You Need: An Embedding Fusion Graph Auto-Encoder for Self-Supervised Graph Representation Learning (EFGAE) | TNNLS'24 | Link prediction | Node classification | -- |
Context
- Context discrimination: to distinguish between contextual nodes and non-contextual nodes. LE stands for Laplacian Eigenmaps objective
- Contextual subgraph discrimination: to distinguish between representations aggregated from different contextual subgraphs (possibly from different receptive fields). CE stands for cross-entropy
- Context feature prediction: a variant of node feature prediction that reconstructs the features of k-hop neighbors instead of the node's own features
- Contextual property prediction: to predict the properties of contextual subgraphs (e.g. node / edge types contained, total node / edge counts, structural coefficient)
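
As a rough illustration of context discrimination with an InfoNCE objective, the sketch below treats each node's contextual nodes as positives and all other nodes as negatives. It is a minimal standard-library sketch, not the formulation of any listed paper: the function name `info_nce_context_loss`, the cosine scoring, and the temperature `tau=0.5` are assumptions made for illustration.

```python
import math


def info_nce_context_loss(emb, context, tau=0.5):
    """Average InfoNCE loss with contextual nodes as positives.

    emb:     list of non-zero embedding vectors, one per node.
    context: dict mapping a node index to the set of its contextual nodes.
    tau:     temperature (hypothetical default chosen for this sketch).
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    n = len(emb)
    total, count = 0.0, 0
    for i in range(n):
        # Exponentiated similarity of anchor i to every other node.
        scores = {j: math.exp(cosine(emb[i], emb[j]) / tau)
                  for j in range(n) if j != i}
        denom = sum(scores.values())
        # Each contextual node is one positive; the rest act as negatives.
        for j in context.get(i, ()):
            total -= math.log(scores[j] / denom)
            count += 1
    return total / count


# Toy usage: nodes 0/1 and 2/3 have similar embeddings.
emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = {0: {1}, 1: {0}, 2: {3}, 3: {2}}
misaligned = {0: {2}, 1: {3}, 2: {0}, 3: {1}}
print(info_nce_context_loss(emb, aligned) < info_nce_context_loss(emb, misaligned))
```

The loss is lower when the declared context matches the geometry of the embeddings, which is the signal an encoder would be trained on.
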
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Inductive Representation Learning on Large Graphs (GraphSAGE) | NIPS'17 | Context discrimination (JS) | Node classification | link |
Strategies for Pre-training Graph Neural Networks (ContextPred) | ICLR'20 | Contextual subgraph discrimination (CE) | Graph classification; biological function prediction | link |
GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding | ICLR'20 | Contextual subgraph discrimination (CE) | Node classification; link prediction | link |
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training | KDD'20 | Contextual subgraph discrimination (InfoNCE) | Node classification; graph classification; similarity search | link |
Graph Attention Auto-Encoders (GATE) | ICTAI'20 | Context discrimination (JS) | Node classification | link |
Sub-Graph Contrast for Scalable Self-Supervised Graph Representation Learning (Subg-Con) | ICDM'20 | Context discrimination (Triplet margin) | Node classification | link |
Self-Supervised Graph Transformer on Large-Scale Molecular Data (GROVER) | NeurIPS'20 | Contextual property prediction | Graph classification; graph regression | link |
Pre-training on Large-Scale Heterogeneous Graph (PT-HGNN) | KDD'21 | Context discrimination (InfoNCE) | (Heterogeneous) node classification; link prediction | link |
Transfer Learning of Graph Neural Networks with Ego-graph Information Maximization (EGI) | NeurIPS'21 | Context discrimination (JS) | Role identification; relation prediction | link |
Contrastive Laplacian Eigenmaps (COLES) | NeurIPS'21 | Context discrimination (LE) | Node classification; node clustering | link |
Graph-MLP: Node Classification without Message Passing in Graph | arXiv:2106 | Context discrimination (InfoNCE) | Node classification | link |
Augmentation-Free Self-Supervised Learning on Graphs (AFGRL) | AAAI'22 | Context discrimination (Bootstrapping) | Node classification; node clustering; similarity search | link |
Simple Unsupervised Graph Representation Learning (SUGRL) | AAAI'22 | Context discrimination (Triplet margin) | Node classification | link |
SAIL: Self-Augmented Graph Contrastive Learning | AAAI'22 | Neighbor feature prediction (BPR) | Node classification; node clustering; link prediction | -- |
Robust Self-Supervised Structural Graph Neural Network for Social Network Prediction | WWW'22 | Contextual subgraph discrimination (InfoNCE) | Node classification; graph classification; similarity search | -- |
Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization (N2N) | CVPR'22 | Context discrimination (InfoNCE) | Node classification | link |
RoSA: A Robust Self-Aligned Framework for Node-Node Graph Contrastive Learning | IJCAI'22 | Contextual subgraph discrimination (InfoNCE) | Node classification | link |
Graph Auto-Encoder Via Neighborhood Wasserstein Reconstruction (NWR-GAE) | ICLR'22 | Context feature prediction | Node classification; structural role identification | link |
Towards Self-supervised Learning on Graphs with Heterophily (HGRL) | CIKM'22 | Context discrimination (InfoNCE) | Node classification; node clustering | link |
Unifying Graph Contrastive Learning with Flexible Contextual Scopes (UGCL) | ICDM'22 | Context discrimination (InfoNCE) | Node classification | link |
Generalized Laplacian Eigenmaps (GLEN) | NeurIPS'22 | Context discrimination (LE) | Node classification; node clustering | link |
Decoupled Self-supervised Learning for Graphs (DSSL) | NeurIPS'22 | Context discrimination (Other) | Node classification | link |
Towards Graph Self-Supervised Learning with Contrastive Adjusted Zooming (G-Zoom) | TNNLS'22 | Context discrimination (JS) | Node classification | -- |
Link Prediction with Contextualized Self-Supervision (CSSL2) | TKDE'22 | Context discrimination (CE) | Link prediction | link |
Graph Soft-Contrastive Learning via Neighborhood Ranking (GSCL) | arXiv:2209 | Context discrimination (InfoNCE) | Node classification; node clustering | -- |
Localized Graph Contrastive Learning (Local-GCL) | arXiv:2212 | Context discrimination (InfoNCE) | Node classification | link |
Deep Graph Structural Infomax (DGSI) | AAAI'23 | Context discrimination (JS) | Node classification | link |
Neighbor Contrastive Learning on Learnable Graph Augmentation (NCLA) | AAAI'23 | Context discrimination (InfoNCE) | Node classification | link |
Eliciting Structural and Semantic Global Knowledge in Unsupervised Graph Contrastive Learning (S3-CL) | AAAI'23 | Contextual subgraph discrimination (InfoNCE) | Node classification; node clustering | link |
GraphPrompt: Unifying Pre-Training and Downstream Tasks for Graph Neural Networks; Generalized Graph Prompt: Toward a Unification of Pre-Training and Downstream Tasks on Graphs (GraphPrompt+) | WWW'23; TKDE'24 | Contextual subgraph discrimination (InfoNCE) | Node classification; graph classification | link |
Contrastive Learning Meets Homophily: Two Birds with One Stone (NeCo) | ICML'23 | Context discrimination (InfoNCE) | Node classification | -- |
Contrastive Cross-scale Graph Knowledge Synergy (CGKS) | KDD'23 | Context discrimination (LE); contextual subgraph discrimination (InfoNCE) | Node classification; graph classification | -- |
Pretraining Language Models with Text-Attributed Heterogeneous Graphs (THLM) | EMNLP Findings'23 | Context discrimination<sup>1</sup> | (Heterogeneous) node classification; link prediction | link |
Simple and Asymmetric Graph Contrastive Learning without Augmentations (GraphACL) | NeurIPS'23 | Context discrimination (InfoNCE) | Node classification | link |
Better with Less: A Data-Active Perspective on Pre-Training Graph Neural Networks (APT) | NeurIPS'23 | Context discrimination (InfoNCE) | Node classification; graph classification | link |
Dual Contrastive Learning Network for Graph Clustering (DCLN) | TNNLS'23 | Context discrimination (InfoNCE) | Node classification; node clustering | link |
Hierarchical Topology Isomorphism Expertise Embedded Graph Contrastive Learning (HTML) | AAAI'24 | Contextual property prediction (structural coefficient) | Graph classification | link |
HGPROMPT: Bridging Homogeneous and Heterogeneous Graphs for Few-shot Prompt Learning | AAAI'24 | Contextual subgraph discrimination (InfoNCE) | (Heterogeneous) node classification; graph classification | link |
Graph Contrastive Learning Reimagined: Exploring Universality (ROSEN) | WWW'24 | Context discrimination (InfoNCE) | Node classification; node clustering | -- |
High-Frequency-aware Hierarchical Contrastive Selective Coding for Representation Learning on Text-attributed Graphs (HASH-CODE) | WWW'24 | Context discrimination (SP); contextual subgraph discrimination (SP) | Node classification; link prediction | -- |
HeterGCL: Graph Contrastive Learning Framework on Heterophilic Graph | IJCAI'24 | Context discrimination (InfoNCE) | Node classification; node clustering | link |
S3GCL: Spectral, Swift, Spatial Graph Contrastive Learning | ICML'24 | Context discrimination (InfoNCE) | Node classification | link (unavailable) |
Efficient Contrastive Learning for Fast and Accurate Inference on Graphs (GraphECL) | ICML'24 | Context discrimination (InfoNCE) | Node classification | link (unavailable) |
Self-Pro: A Self-Prompt and Tuning Framework for Graph Neural Networks | ECML-PKDD'24 | Context discrimination (InfoNCE) | Node classification; link prediction | link |
Smoothed Graph Contrastive Learning via Seamless Proximity Integration (SGCL4) | LoG'24 | Context discrimination (cosine similarity) | Node classification; graph classification | link |
FUG: Feature-Universal Graph Contrastive Pre-training for Graphs with Diverse Node Features | NeurIPS'24 | Context discrimination (MSE) | Node classification | link |
TAGA: Text-Attributed Graph Self-Supervised Learning by Synergizing Graph and Text Mutual Transformations | arXiv:2405 | Contextual subgraph discrimination (cosine similarity) | Node classification | -- |
<sup>1</sup> For language models. A binary classification task to predict whether a node is contextual.
Long-range similarities
- Similarity prediction: to predict a similarity matrix between nodes. The pairwise similarity can be defined by shortest path distance, PageRank similarity, Katz index, Jaccard coefficient, $\ell_2$ distance & cosine similarity between output representations / input-output, etc.
- Similarity-based discrimination: instance discrimination that is node similarity-aware
- Similarity graph alignment: to construct an additional similarity graph based on pairwise similarities of node features or graph topology, and minimize the distance of representation distributions between them (the original and similarity graph, or two different similarity graphs)
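
To make similarity prediction concrete, the sketch below builds shortest-path-distance pseudo-labels for every node pair with breadth-first search. It only covers label construction; a model would then be trained to predict these labels from pairs of node representations. The function name `shortest_path_labels` and the clipping threshold `max_dist` are illustrative assumptions, not taken from any listed paper.

```python
from collections import deque


def shortest_path_labels(adj, max_dist=3):
    """Return {(u, v): d} hop-distance pseudo-labels for all pairs u < v.

    adj:      dict mapping each node to an iterable of its neighbors
              (undirected, unweighted graph).
    max_dist: distances are clipped to this value (an assumption of this
              sketch); unreachable pairs also receive max_dist.
    """
    labels = {}
    nodes = sorted(adj)
    for src in nodes:
        # Breadth-first search gives hop distances from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for dst in nodes:
            if dst > src:
                labels[(src, dst)] = min(dist.get(dst, max_dist), max_dist)
    return labels


# Toy usage: a path graph 0-1-2-3-4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
labels = shortest_path_labels(path)
print(labels[(0, 2)], labels[(0, 4)])  # prints "2 3" (distance 4 is clipped to 3)
```

Clipping turns the regression target into a small set of classes, which is one possible design choice; predicting raw distances is equally valid.
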
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Adaptive Graph Encoder for Attributed Graph Embedding (AGE) | KDD'20 | Similarity prediction (cosine similarity) | Node clustering; link prediction | link |
AM-GCN: Adaptive Multi-channel Graph Convolutional Networks | KDD'20 | Similarity graph alignment | Node classification | link |
Graph-Bert: Only Attention is Needed for Learning Graph Representations | arXiv:2001 | Similarity prediction (PageRank, etc.) | Node classification; node clustering | link |
Self-supervised Learning on Graphs: Deep Insights and New Direction (PairwiseDistance, PairwiseAttrSim) | arXiv:2006 | Similarity prediction (shortest path distance; cosine similarity) | Node classification | link |
SAIL: Self-Augmented Graph Contrastive Learning | AAAI'22 | Similarity prediction (cosine similarity) | Node classification; node clustering; link prediction | -- |
Self-Supervised Graph Representation Learning via Global Context Prediction; A New Self-supervised Task on Graphs: Geodesic Distance Prediction (S2GRL) | Information Sciences'22 | Similarity prediction (shortest path distance) | Node classification; node clustering; link prediction | -- |
Dual Low-Rank Graph Autoencoder for Semantic and Topological Networks (DLR-GAE) | AAAI'23 | Similarity graph alignment | Node classification | link |
Attribute and Structure Preserving Graph Contrastive Learning (ASP) | AAAI'23 | Similarity graph alignment | Node classification | link |
Beyond Smoothing: Unsupervised Graph Representation Learning with Edge Heterophily Discriminating (GREET) | AAAI'23 | Similarity-based discrimination (cosine similarity) | Node classification | link |
Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding (DMGAE, DMVGAE) | ICASSP'23 | Similarity prediction | Node clustering; link prediction | -- |
Self-Supervised Teaching and Learning of Representations on Graphs (GraphTL) | WWW'23 | Similarity-based discrimination (cosine similarity) | Node classification | -- |
Graph Self-supervised Learning via Proximity Divergence Minimization (PDM) | UAI'23 | Similarity prediction (heat kernel, personalized PageRank, SimRank) | Node classification | link |
Maximizing Mutual Information Across Feature and Topology Views for Representing Graphs (MVMI-FT) | TKDE'23 | Similarity graph alignment | Node classification; node clustering | link |
Towards Effective and Robust Graph Contrastive Learning With Graph Autoencoding (AEGCL) | TKDE'23 | Similarity graph alignment | Node classification; node clustering; link prediction | link |
ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt | arXiv:2310 | Similarity prediction (cosine similarity) | Node classification; link prediction | link |
Deep Contrastive Graph Learning with Clustering-Oriented Guidance (DCGL) | AAAI'24 | Similarity graph alignment | Node clustering | link |
E2GCL: Efficient and Expressive Contrastive Learning on Graph Neural Networks | ICDE'24 | Similarity-based discrimination | Node classification; graph classification; link prediction | -- |
Improving Graph Contrastive Learning via Adaptive Positive Sampling (HEATS) | CVPR'24 | Similarity-based discrimination (block diagonal affinity) | Node classification; image classification | -- |
ConGraT: Self-Supervised Contrastive Pretraining for Joint Graph and Text Embeddings | ACL Workshop (TextGraphs)'24 | Similarity-based discrimination (common neighbors, SimRank) | Node classification; link prediction | link |
Enhancing Graph Contrastive Learning with Node Similarity (SimEnhancedGCL) | KDD'24 | Similarity-based discrimination (cosine similarity, personalized PageRank) | Node classification | link |
Select Your Own Counterparts: Self-Supervised Graph Contrastive Learning With Positive Sampling (GPS) | TNNLS'24 | Similarity-based discrimination (cosine similarity, personalized PageRank, etc) | Node classification | -- |
Motifs
- Motif prediction: to assign each node (or supernode in the fragment graph) a motif pseudo-label given by unsupervised motif discovery algorithms (e.g. RDKit) and learn to predict them. It is "autoregressive" if the predicted supernodes are generated one-by-one
- Motif-based masked feature prediction: similar to masked feature prediction, but the features are masked in motifs
- Motif-based discrimination: to perform contrast between the original graph view and the fragment graph view
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Self-Supervised Graph Transformer on Large-Scale Molecular Data (GROVER) | NeurIPS'20 | Motif prediction | Graph classification; graph regression | link |
Motif-based Graph Self-Supervised Learning for Molecular Property Prediction (MGSSL) | NeurIPS'21 | Motif prediction (autoregressive) | Graph classification | link |
Fragment-based Pretraining and Finetuning on Molecular Graphs (GraphFP) | NeurIPS'23 | Motif prediction; motif-based discrimination (InfoNCE) | Graph classification; graph regression | link |
Motif-aware Riemannian Graph Neural Network with Generative-Contrastive Learning (MotifRGC) | AAAI'24 | Motif-based discrimination (InfoNCE) | Node classification; link prediction | link |
Empowering Dual-Level Graph Self-Supervised Pretraining with Motif Discovery (DGPM) | AAAI'24 | Motif prediction | Graph classification | link |
Graph Contrastive Learning with Cohesive Subgraph Awareness (CTAug) | WWW'24 | Motif-based discrimination (InfoNCE) | Graph classification | link |
Motif-aware Attribute Masking for Molecular Graph Pre-training (MoAMa) | LoG'24 | Motif-based masked feature prediction | Graph classification | link |
Motif-Driven Contrastive Learning of Graph Representations (MICRO-Graph) | TKDE'24 | Motif-based discrimination (InfoNCE) | Graph classification | link |
Fine-grained Semantics Enhanced Contrastive Learning for Graphs (FSGCL) | TKDE'24 | Motif-based discrimination (Bootstrapping) | Node classification | -- |
Clusters
- Synthetic graph discrimination: binary classification between two synthetic graphs with different synthesizers (Erdős-Rényi generator / SBM generator)
- Node clustering: to assign each node a cluster centroid (prototype) and either i) minimize the distance between nodes and their corresponding centroids in the latent space; or ii) minimize the distance between the learned centroids and the ground-truth centroids given by unsupervised feature clustering algorithms (e.g. K-means, DeepCluster)
- Graph partitioning: to assign each node a cluster centroid (prototype) and either i) optimize the quality of the learned partitions as evaluated by some metric, e.g. maximizing modularity or minimizing the normalized edge weights of a graph cut (spectral clustering); or ii) predict the cluster membership of each node given by unsupervised graph partitioning algorithms (structure-based, e.g. METIS, Louvain)
- Cluster/partition-based instance discrimination: instance discrimination that is aware of graph clustering/partitioning memberships
- Cluster/partition-conditioned link prediction: to maximize the log-likelihood of existing links, conditioned on the graph cluster/partition distributions
- Partition-conditioned masked link prediction: similar to masked link prediction, but the links are masked in clusters
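
The partition-quality metric mentioned under graph partitioning can be illustrated with modularity. The sketch below computes Newman modularity for a hard partition of an undirected, unweighted graph; it is a plain evaluation function rather than a trainable objective, and the name `modularity` and the edge-list input format are assumptions of this example.

```python
def modularity(edges, membership):
    """Newman modularity of a hard partition.

    edges:      list of (u, v) pairs, each undirected edge listed once,
                no self-loops.
    membership: dict mapping each node to its cluster id.
    """
    m = len(edges)
    intra = {}   # cluster id -> number of edges with both ends inside it
    degree = {}  # cluster id -> sum of degrees of its nodes
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    # Q = sum over clusters of (intra-cluster edge fraction)
    #     minus (expected fraction under degree-preserving rewiring).
    return sum(intra.get(c, 0) / m - (degree[c] / (2 * m)) ** 2
               for c in degree)


# Toy usage: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "a" for node in range(6)}
print(modularity(edges, two_clusters) > modularity(edges, one_cluster))
```

Splitting the graph along the bridge scores higher than lumping everything into one cluster, which is the behavior a modularity-maximizing pretext rewards.
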
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
SGR: Self-Supervised Spectral Graph Representation Learning | KDD Workshop (DLD)'18 | Synthetic graph discrimination | Graph classification | -- |
Unsupervised Pre-training of Graph Convolutional Networks (ClusterDetect) | ICLR Workshop (RLGM)'19 | Graph partitioning | Node classification | -- |
Multi-Stage Self-Supervised Learning for Graph Convolutional Networks on Graphs with Few Labeled Nodes (M3S) | AAAI'20 | Node clustering | Node classification | link |
Collaborative Graph Convolutional Networks: Unsupervised Learning Meets Semi-Supervised Learning (CGCN) | AAAI'20 | Partition-conditioned link prediction | Node classification; node clustering | link (deleted) |
When Does Self-Supervision Help Graph Convolutional Networks? (NodeCluster, GraphPar) | ICML'20 | Node clustering; graph partitioning | Node classification | link |
CommDGI: Community Detection Oriented Deep Graph Infomax | CIKM'20 | Cluster-based discrimination (JS); graph partitioning | Node clustering | link |
Dirichlet Graph Variational Autoencoder (DGVAE) | NeurIPS'20 | Partition-conditioned link prediction | Graph generation; node clustering | link |
Self-supervised Learning on Graphs: Deep Insights and New Direction (Distance2Clusters) | arXiv:2006 | Graph partitioning | Node classification | link |
Mask-GVAE: Blind Denoising Graphs via Partition | WWW'21 | Graph partitioning; partition-conditioned masked link prediction | Node clustering; graph denoising | link |
Self-supervised Graph-level Representation Learning with Local and Global Structure (GraphLoG) | ICML'21 | Node clustering | Graph classification; biological function prediction | link |
Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction (GIANT) | ICLR'22 | Node clustering<sup>1</sup> | Node classification | link |
Graph Communal Contrastive Learning (gCooL) | WWW'22 | Partition-based discrimination (InfoNCE) | Node classification; node clustering | link |
Self-supervised Heterogeneous Graph Pre-training Based on Structural Clustering (SHGP) | NeurIPS'22 | Graph partitioning | (Heterogeneous) node classification; node clustering | link |
Eliciting Structural and Semantic Global Knowledge in Unsupervised Graph Contrastive Learning (S3-CL) | AAAI'23 | Cluster-based discrimination (InfoNCE) | Node classification; node clustering | link |
CSGCL: Community-Strength-Enhanced Graph Contrastive Learning | IJCAI'23 | Partition-based discrimination (InfoNCE) | Node classification; node clustering; link prediction | link |
HomoGCL: Rethinking Homophily in Graph Contrastive Learning | KDD'23 | Node clustering; cluster-based discrimination (InfoNCE) | Node classification; node clustering | link |
CARL-G: Clustering-Accelerated Representation Learning on Graphs | KDD'23 | Node clustering | Node classification; node clustering; similarity search | link |
Towards Alignment-Uniformity Aware Representation in Graph Contrastive Learning (AUAR) | WSDM'24 | Node clustering | Node classification; node clustering | -- |
Deep Contrastive Graph Learning with Clustering-Oriented Guidance (DCGL) | AAAI'24 | Cluster-based discrimination (InfoNCE) | Node clustering | link |
StructComp: Substituting propagation with Structural Compression in Training Graph Contrastive Learning | ICLR'24 | Partition-based discrimination (JS, InfoNCE, etc.) | Node classification | link |
MARIO: Model Agnostic Recipe for Improving OOD Generalization of Graph Contrastive Learning | WWW'24 | Cluster-based discrimination | Node classification; graph classification | link |
Graph Contrastive Learning with Kernel Dependence Maximization for Social Recommendation (CL-KDM) | WWW'24 | Partition-based discrimination (BPR) | Recommendation | -- |
HeterGCL: Graph Contrastive Learning Framework on Heterophilic Graph | IJCAI'24 | Cluster-based discrimination (MSE) | Node classification; node clustering | link |
Community-Invariant Graph Contrastive Learning (CI-GCL) | ICML'24 | Partition-based discrimination (InfoNCE) | Graph classification; graph regression | link |
From Coarse to Fine: Enable Comprehensive Graph Self-supervised Learning with Multi-granular Semantic Ensemble (MGSE) | ICML'24 | Node clustering | Graph classification | link |
Revisiting Self-Supervised Heterogeneous Graph Learning from Spectral Clustering Perspective (SCHOOL) | NeurIPS'24 | Partition-based discrimination (MSE) | Node classification; node clustering | link |
Motif-Driven Contrastive Learning of Graph Representations (MICRO-Graph) | TKDE'24 | Graph partitioning | Graph classification | link |
<sup>1</sup> Called "neighborhood prediction" in the original paper. A pretext task for language models to match textual attributes with cluster labels based on attributed cSBM.
Global structure
- Global-local instance discrimination: instance discrimination between the representation of each node and a global representation vector, usually aggregated from the whole graph by a readout function
- Group discrimination: a simplified global-local instance discrimination that performs binary classification of whether a node belongs to the original or the perturbed graph
- Global instance discrimination: to discriminate between global representations of different graph views (generally for small-scale graphs)
- Global dimension discrimination: dimension discrimination of different graph representations
- Graph similarity prediction: to predict various kinds of similarity functions between pairs of graphs, e.g. graph kernels (graphlet kernel, random walk kernel, graph edit distance kernel, etc)
- Half-graph matching: to divide each graph into two halves and predict if two halves are from the same original graph
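
A minimal sketch of global-local instance discrimination follows, assuming a mean readout and a plain dot-product discriminator trained with binary cross-entropy. Both choices are simplifications made for this example (a learned bilinear discriminator is a common alternative), and `global_local_loss` is a hypothetical name, not an API from any listed codebase.

```python
import math


def global_local_loss(pos_emb, neg_emb):
    """Binary cross-entropy loss for global-local discrimination.

    pos_emb: node embeddings computed on the original graph.
    neg_emb: node embeddings computed on a corrupted graph (negatives).
    """
    def softplus(x):
        # Numerically stable log(1 + exp(x)).
        return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

    dim = len(pos_emb[0])
    # Readout: the mean over nodes serves as the global summary vector.
    summary = [sum(h[d] for h in pos_emb) / len(pos_emb) for d in range(dim)]

    def logit(h):
        # Dot-product discriminator (an assumption of this sketch).
        return sum(x * s for x, s in zip(h, summary))

    loss = 0.0
    for h in pos_emb:
        loss += softplus(-logit(h))   # -log sigmoid(logit): label 1
    for h in neg_emb:
        loss += softplus(logit(h))    # -log(1 - sigmoid(logit)): label 0
    return loss / (len(pos_emb) + len(neg_emb))


# Toy usage: negatives that point away from the summary are easy to reject.
pos = [[1.0, 1.0], [2.0, 1.0]]
easy_neg = [[-1.0, -1.0], [-2.0, -1.0]]
print(global_local_loss(pos, easy_neg) < global_local_loss(pos, pos))
```

Group discrimination, as described above, can be read as the special case where the summary vector is dropped and each node is classified directly as original or perturbed.
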
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Pre-training Graph Neural Networks with Kernels (KernelPred) | arXiv:1811 | Graph similarity prediction | Graph classification | -- |
Deep Graph InfoMax (DGI) | ICLR'19 | Global-local instance discrimination (JS) | Node classification | link |
InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization | ICLR'20 | Global-local instance discrimination (JS) | Graph classification | link |
Graph Contrastive Learning with Augmentations (GraphCL) | NeurIPS'20 | Global instance discrimination (InfoNCE) | Graph classification | link |
Contrastive Multi-View Representation Learning on Graphs (MVGRL) | ICML'20 | Global-local instance discrimination (JS) | Node classification; graph classification | link |
Contrastive Self-supervised Learning for Graph Classification (CSSL1) | AAAI'21 | Global instance discrimination (InfoNCE) | Graph classification | -- |
SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism | WWW'21 | Global-local instance discrimination (JS) | Graph classification | link |
Pairwise Half-graph Discrimination: A Simple Graph-level Self-supervised Strategy for Pre-training Graph Neural Networks (PHD); An Effective Self-Supervised Framework for Learning Expressive Molecular Global Representations to Drug Discovery (MPG) | IJCAI'21; Briefings in Bioinformatics'21 | Half-graph matching | Graph classification | link |
Graph Contrastive Learning Automated (JOAO) | ICML'21 | Global instance discrimination (InfoNCE) | Graph classification | link |
Adversarial Graph Augmentation to Improve Graph Contrastive Learning (AD-GCL) | NeurIPS'21 | Global instance discrimination (InfoNCE) | Graph classification | link |
InfoGCL: Information-Aware Graph Contrastive Learning | NeurIPS'21 | Global instance discrimination (Bootstrapping); global-local instance discrimination (Bootstrapping) | Node classification; graph classification | -- |
Graph Adversarial Self-Supervised Learning (GASSL) | NeurIPS'21 | Global instance discrimination (Bootstrapping) | Graph classification | link (unavailable) |
Disentangled Contrastive Learning on Graphs (DGCL) | NeurIPS'21 | Global instance discrimination (Other) | Graph classification | link |
Bringing Your Own View: Graph Contrastive Learning without Prefabricated Data Augmentations (GraphCL-LP) | WSDM'22 | Global instance discrimination (InfoNCE) | Graph classification | link |
Self-Supervised Graph Neural Networks via Diverse and Interactive Message Passing (DIMP) | AAAI'22 | Global-local instance discrimination (JS) | Node classification; node clustering; graph classification | link |
AutoGCL: Automated Graph Contrastive Learning via Learnable View Generators | AAAI'22 | Global instance discrimination (InfoNCE) | Graph classification | link |
Group Contrastive Self-Supervised Learning on Graphs (GroupCL; GroupIG) | TPAMI'22 | Global instance discrimination (JS; contrastive log-ratio upper bound (CLUB)) | Graph classification | -- |
Towards Graph Self-Supervised Learning with Contrastive Adjusted Zooming (G-Zoom) | TNNLS'22 | Global-local instance discrimination (JS) | Node classification | -- |
SimGRACE: A Simple Framework for Graph Contrastive Learning without Data Augmentation | WWW'22 | Global instance discrimination (InfoNCE, Bootstrapping) | Graph classification | link |
Let Invariant Rationale Discovery Inspire Graph Contrastive Learning (RGCL1) | ICML'22 | Global instance discrimination (InfoNCE) | Graph classification | link |
M-Mix: Generating Hard Negatives via Multi-sample Mixing for Contrastive Learning | KDD'22 | Global instance discrimination (InfoNCE) | Node classification; node clustering; graph classification; graph edit distance prediction | link |
AdaGCL: Adaptive Subgraph Contrastive Learning to Generalize Large-scale Graph Training | CIKM'22 | Global-local instance discrimination (JS) | Node classification | link |
Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination (GGD) | NeurIPS'22 | Group discrimination | Node classification | link |
Graph Self-supervised Learning with Accurate Discrepancy Learning (D-SLA) | NeurIPS'22 | Group discrimination; graph similarity prediction | Graph classification; link prediction | link |
Deep Graph Structural Infomax (DGSI) | AAAI'23 | Global-local instance discrimination (JS) | Node classification | link |
Spectral Augmentation for Self-Supervised Learning on Graphs (SPAN) | ICLR'23 | Global-local instance discrimination (InfoNCE) | Node classification; graph classification; graph regression | link |
Mole-BERT: Rethinking Pre-training Graph Neural Networks for Molecules | ICLR'23 | Global instance discrimination (InfoNCE; Triplet margin) | Graph classification; graph regression | link |
Spectral Augmentations for Graph Contrastive Learning (SGCL1) | AISTATS'23 | Global instance discrimination (InfoNCE) | Node classification; graph classification; similarity search | -- |
Generating Counterfactual Hard Negative Samples for Graph Contrastive Learning (CGC) | WWW'23 | Global instance discrimination (InfoNCE) | Graph classification | link |
Multi-Scale Subgraph Contrastive Learning (MSSGCL) | IJCAI'23 | Global-local instance discrimination (InfoNCE); global instance discrimination (InfoNCE) | Graph classification | link |
Boosting Graph Contrastive Learning via Graph Contrastive Saliency (GCS) | ICML'23 | Global instance discrimination (InfoNCE) | Graph classification | link |
SEGA: Structural Entropy Guided Anchor View for Graph Contrastive Learning | ICML'23 | Global instance discrimination (InfoNCE) | Graph classification | link |
Randomized Schur Complement Views for Graph Contrastive Learning (rLap) | ICML'23 | Global-local instance discrimination (InfoNCE); global instance discrimination (InfoNCE) | Graph classification | link |
Graph Self-Contrast Representation Learning (GraphSC) | ICDM'23 | Global instance discrimination (Triplet margin); global dimension discrimination | Graph classification | -- |
Graph Contrastive Learning with Stable and Scalable Spectral Encoding (Sp2GCL) | NeurIPS'23 | Global instance discrimination (InfoNCE) | Node classification; graph classification; graph regression | link |
Maximizing Mutual Information Across Feature and Topology Views for Representing Graphs (MVMI-FT) | TKDE'23 | Global-local instance discrimination (JS) | Node classification; node clustering | link |
Multi-Scale Self-Supervised Graph Contrastive Learning With Injective Node Augmentation (MS-CIA) | TKDE'23 | Global-local instance discrimination (JS) | Node classification | -- |
Hierarchically Contrastive Hard Sample Mining for Graph Self-Supervised Pretraining (HCHSM) | TNNLS'23 | Global-local instance discrimination (JS) | Node classification; node clustering | link |
Dual Contrastive Learning Network for Graph Clustering (DCLN) | TNNLS'23 | Global-local instance discrimination (JS) | Node classification; node clustering | link |
HeGCL: Advance Self-Supervised Learning in Heterogeneous Graph-Level Representation | TNNLS'23 | Global-local instance discrimination (JS) | (Heterogeneous) node classification; graph classification | link |
Affinity Uncertainty-Based Hard Negative Mining in Graph Contrastive Learning (AUGCL) | TNNLS'23 | Global instance discrimination (InfoNCE) | Graph classification | link |
Hierarchical Topology Isomorphism Expertise Embedded Graph Contrastive Learning (HTML) | AAAI'24 | Global instance discrimination (InfoNCE); graph similarity prediction (Jaccard coef-based isomorphic similarity) | Graph classification | link |
TopoGCL: Topological Graph Contrastive Learning | AAAI'24 | Global instance discrimination (InfoNCE) | Graph classification | link |
DiscoGNN: A Sample-Efficient Framework for Self-Supervised Graph Representation Learning | ICDE'24 | Global instance discrimination (InfoNCE) | Graph classification; similarity search | link (unavailable) |
Masked Graph Modeling with Multi-View Contrast (GCMAE2) | ICDE'24 | Global instance discrimination (InfoNCE) | Node classification; graph classification; link prediction | link |
SGCL: Semantic-aware Graph Contrastive Learning with Lipschitz Graph Augmentation (SGCL3) | ICDE'24 | Global instance discrimination (InfoNCE) | Graph classification | -- |
Graph Contrastive Learning with Reinforcement Augmentation (GA2C) | IJCAI'24 | Global instance discrimination (InfoNCE) | Graph classification | -- |
Disentangled Graph Self-supervised Learning for Out-of-Distribution Generalization (OOD-GCL) | ICML'24 | Global instance discrimination (InfoNCE) | Graph classification | -- |
Uncovering Capabilities of Model Pruning in Graph Contrastive Learning (LAMP1) | MM'24 | Global instance discrimination (InfoNCE) | Graph classification | -- |
A Sample-driven Selection Framework: Towards Graph Contrastive Networks with Reinforcement Learning (GraphSaSe) | MM'24 | Global instance discrimination (InfoNCE) | Graph classification | link (private) |
Graph Contrastive Learning with Personalized Augmentation (GPA) | TKDE'24 | Global instance discrimination (InfoNCE) | Graph classification | link |
Graph Contrastive Learning with Min-Max Mutual Information (GCLMI) | Information Sciences'24 | Global instance discrimination (InfoNCE) | Graph classification | link |
Manifolds
- Cross-manifold discrimination: to perform instance discrimination between different manifolds (e.g. Euclidean vs. hyperbolic)
- Ricci curvature prediction: to predict the aggregated Ricci curvature of each node's neighborhood
- Curvature-based node clustering: to assign each node a cluster centroid and maximize/minimize the curvature-based density within/across clusters
- Hyperbolic angle prediction: to pool representations into 2-dimensional angle vectors on a unit hyperbola, which then serve as pseudo-labels for regression
Paper | Venue | Pretext | Downstream | Code |
---|---|---|---|---|
Enhancing Hyperbolic Graph Embeddings via Contrastive Learning (HGCL) | NeurIPS Workshop (SSL)'21 | Cross-manifold discrimination (InfoNCE) | Node classification | -- |
A Self-supervised Mixed-curvature Graph Neural Network (SelfMGNN) | AAAI'22 | Cross-manifold discrimination (InfoNCE) | Node classification | -- |
Dual Space Graph Contrastive Learning (DSGC) | WWW'22 | Cross-manifold discrimination (InfoNCE) | Graph classification | link |
CONGREGATE: Contrastive Graph Clustering in Curvature Spaces | IJCAI'23 | Ricci curvature prediction; cross-manifold discrimination (InfoNCE); curvature-based node clustering | Node clustering | link |
Graph-level Representation Learning with Joint-Embedding Predictive Architectures (GraphJEPA) | arXiv:2309 | Hyperbolic angle prediction | Graph classification; graph regression | link |
Motif-aware Riemannian Graph Neural Network with Generative-Contrastive Learning (MotifRGC) | AAAI'24 | Cross-manifold discrimination (InfoNCE) | Node classification; link prediction | link |
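To make cross-manifold discrimination more concrete, here is a minimal sketch of an InfoNCE loss between a Euclidean view and a hyperbolic view of the same nodes. It is an illustration only, not the implementation of any paper listed above. The function names (`log_map_origin`, `cross_manifold_info_nce`), the Poincaré-ball model, the use of the origin log map to make the two views comparable, and the temperature `tau` are all assumptions made for this example.

```python
import math

def log_map_origin(x, c=1.0):
    """Map a Poincare-ball point (curvature -c) to the tangent space at the origin.

    Assumption: comparing views in this tangent space is one simple option;
    the listed methods may compare manifolds differently.
    """
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0.0:
        return list(x)
    sqrt_c = math.sqrt(c)
    # Clamp to keep atanh finite for points numerically on the ball boundary.
    scale = math.atanh(min(sqrt_c * norm, 1.0 - 1e-7)) / (sqrt_c * norm)
    return [scale * v for v in x]

def cosine(a, b):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb + 1e-12)

def cross_manifold_info_nce(euclid, hyper, tau=0.5):
    """InfoNCE where node i's Euclidean view is the anchor, its own hyperbolic
    view is the positive, and other nodes' hyperbolic views are negatives."""
    hyper_tan = [log_map_origin(h) for h in hyper]
    loss = 0.0
    for i, e in enumerate(euclid):
        logits = [cosine(e, h) / tau for h in hyper_tan]
        m = max(logits)  # log-sum-exp trick for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denom - logits[i]
    return loss / len(euclid)
```

With toy inputs such as `euclid = [[1.0, 0.0], [0.0, 1.0]]` and `hyper = [[0.5, 0.0], [0.0, 0.5]]`, the loss is lower when the two views are aligned per node than when the hyperbolic views are swapped, which is the behavior instance discrimination relies on.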
Task generalization strategies
- Multi-task learning: to combine a set of different pre-training tasks with bespoke algorithms / architectures
- Ensemble learning: to jointly pre-train a set of alternative models with a bespoke unified framework (e.g. Mixture-of-Experts)
Paper | Venue | Strategy | Downstream | Code |
---|---|---|---|---|
Adaptive Transfer Learning on Graph Neural Networks (AUX-TS) | KDD'21 | Multi-task learning | Node classification; link prediction | link |
Automated Self-Supervised Learning for Graphs (AutoSSL) | ICLR'22 | Multi-task learning | Node classification; node clustering | link |
Automated Graph Self-supervised Learning via Multi-teacher Knowledge Distillation (AGSSL) | arXiv:2210 | Multi-task learning | Node classification | -- |
Multi-task Self-supervised Graph Neural Networks Enable Stronger Task Generalization (ParetoGNN) | ICLR'23 | Multi-task learning | Node classification; node clustering; graph partitioning; link prediction | link |
ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt | arXiv:2310 | Multi-task learning | Node classification; link prediction | link |
Decoupling Weighing and Selecting for Integrating Multiple Graph Pre-training Tasks (WAS) | ICLR'24 | Multi-task learning | Node classification; graph classification | link |
MultiGPrompt for Multi-Task Pre-Training and Prompting on Graphs | WWW'24 | Multi-task learning | Node classification; graph classification | link |
Exploring Correlations of Self-Supervised Tasks for Graphs (GraphTCM) | ICML'24 | Multi-task learning | Node classification; link prediction | link |
UniGM: Unifying Multiple Pre-trained Graph Models via Adaptive Knowledge Aggregation | MM'24 | Multi-task learning | Graph classification | link (private) |
GFT: Graph Foundation Model with Transferable Tree Vocabulary | NeurIPS'24 | Multi-task learning | Node classification; graph classification; link prediction | link |
GraphAlign: Pretraining One Graph Neural Network on Multiple Graphs via Feature Alignment | arXiv:2406 | Ensemble learning | Node classification; link prediction | link |
AnyGraph: Graph Foundation Model in the Wild | arXiv:2408 | Ensemble learning | Node classification; graph classification; link prediction | link |
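The two strategies in this section can be sketched in a few lines. The example below is a hedged illustration rather than the algorithm of any listed paper: `multi_task_loss` combines several pretext losses with softmax-normalized task weights, and `mixture_of_experts` routes an input to the top-scoring experts of a gate. All names, the softmax weighting, and the top-k routing rule are assumptions chosen for simplicity; the listed methods use their own, more elaborate weighting and routing schemes.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multi_task_loss(task_losses, task_logits):
    """Multi-task learning sketch: weighted sum of pretext losses.

    task_logits are (hypothetical) learnable scores, one per pretext task.
    """
    weights = softmax(task_logits)
    return sum(w * l for w, l in zip(weights, task_losses))

def mixture_of_experts(x, experts, gate_scores, top_k=1):
    """Ensemble learning sketch: combine the outputs of the top_k experts,
    weighted by their renormalized gate probabilities."""
    weights = softmax(gate_scores)
    top = sorted(range(len(experts)), key=lambda i: weights[i], reverse=True)[:top_k]
    norm = sum(weights[i] for i in top)
    out = None
    for i in top:
        y = experts[i](x)
        if out is None:
            out = [0.0] * len(y)
        out = [o + (weights[i] / norm) * v for o, v in zip(out, y)]
    return out
```

For instance, `multi_task_loss([1.0, 3.0], [0.0, 0.0])` averages the two losses because equal logits give equal weights, and calling `mixture_of_experts` with `top_k=1` returns the output of the single highest-scoring expert.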
- Fine-tuning: to jointly train downstream branches together with the original pre-trained model. Parameter-efficient fine-tuning (PEFT) updates only part of the pre-trained model, e.g. adapter layers
- Prompting: to jointly encode downstream data and extra learnable task-specific components to instruct the behavior of pre-trained models for downstream generalization
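As a rough illustration of prompting, the sketch below keeps a pre-trained linear encoder `W` frozen and updates only a learnable prompt vector `p` that is added to the input features. This is a toy example under stated assumptions, not the method of any specific paper: the linear encoder, the additive feature prompt, the squared-error loss, and the names `encode` and `prompt_tuning_step` are all hypothetical. Full fine-tuning would additionally update `W`, and PEFT would update only a small subset of parameters.

```python
def encode(W, x):
    """Frozen pre-trained encoder (assumed linear for this sketch): W @ x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def prompt_tuning_step(W, p, x, target, lr=0.1):
    """One gradient step on the prompt p only; W is never modified.

    Returns the updated prompt and the loss measured before the update.
    """
    prompted = [a + b for a, b in zip(x, p)]   # prompt is added to the features
    out = encode(W, prompted)
    err = [o - t for o, t in zip(out, target)]
    loss = sum(e * e for e in err)             # squared-error loss
    # d(loss)/dp = 2 * W^T err, since out = W (x + p)
    grad = [2 * sum(W[r][c] * err[r] for r in range(len(W))) for c in range(len(p))]
    new_p = [pv - lr * g for pv, g in zip(p, grad)]
    return new_p, loss
```

Calling `prompt_tuning_step` repeatedly with the returned prompt lowers the loss on the downstream target while the encoder weights stay untouched, which is the core idea behind prompt-based adaptation.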
❤️ Contributions to this source list via issues and pull requests are always welcome! Feel free to start a discussion with me, or let me know if any papers or hyperlinks are missing or if a paper is miscategorized.