This repository explores node classification on the Cora citation graph using Node Embeddings (Node2Vec), Graph Attention Networks (GAT), and Recurrent Graph Neural Networks (RGNN). The project includes data preprocessing, graph analysis, visualization, and advanced graph neural network techniques.
The dataset used is the Cora citation graph:
- Nodes: Represent papers.
- Edges: Represent citation links between papers.
- Features: Bag-of-words representation of each paper.
- Labels: Categories to which each paper belongs.
Graph Analysis:
- Compute graph centrality measures: Degree, Closeness, Betweenness, and Eigenvector centralities.
- Visualize the graph using a spring layout.
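As a rough illustration of what two of these centrality measures compute, here is a minimal pure-Python sketch on a hypothetical five-node toy graph (not the Cora data). The helper names are made up for this example. The repository itself presumably relies on `networkx`, which provides `degree_centrality`, `closeness_centrality`, `betweenness_centrality`, `eigenvector_centrality`, and `spring_layout`.

```python
from collections import deque

# Toy undirected citation graph (hypothetical; not the Cora data).
# A path 0 - 1 - 2 - 3 plus a branch 1 - 4.
EDGES = [(0, 1), (1, 2), (2, 3), (1, 4)]


def build_adjacency(edges):
    """Return {node: set(neighbors)} for an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Degree divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def bfs_distances(adj, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness_centrality(adj):
    """(n - 1) / sum of shortest-path distances; assumes a connected graph."""
    n = len(adj)
    result = {}
    for node in adj:
        total = sum(bfs_distances(adj, node).values())
        result[node] = (n - 1) / total if total > 0 else 0.0
    return result


adj = build_adjacency(EDGES)
degree = degree_centrality(adj)
closeness = closeness_centrality(adj)
most_central = max(closeness, key=closeness.get)
print(degree, closeness, most_central)
```

In this toy graph node 1 touches three of the four other nodes, so it scores highest on both measures.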
Node Embedding Generation:
- Generate node embeddings using Node2Vec for structural representation.
- Parameters optimized for the Cora dataset.
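The random-walk step behind Node2Vec can be sketched as follows. This is a simplified, dependency-free illustration rather than the code used in this repository: the toy graph, the function name `node2vec_walk`, and the values of `p` and `q` are assumptions chosen for the example. The return parameter `p` controls how likely a walk is to step back to the previous node, and the in-out parameter `q` controls how readily it moves away from the previous node's neighborhood.

```python
import random

# Toy undirected graph as {node: set(neighbors)} (hypothetical data).
ADJ = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}


def node2vec_walk(adj, start, walk_length, p=1.0, q=1.0, rng=None):
    """Second-order biased random walk in the style of Node2Vec."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < walk_length:
        cur = walk[-1]
        neighbors = sorted(adj[cur])
        if not neighbors:
            break  # dead end: isolated node
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # return to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move further away
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


rng = random.Random(0)
walks = [node2vec_walk(ADJ, node, 6, p=0.5, q=2.0, rng=rng) for node in ADJ]
for w in walks:
    print(w)
```

In the full method, walks like these are treated as sentences and passed to a skip-gram model to learn one embedding vector per node.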
Node Classification with GNNs:
- Graph Attention Networks (GAT):
  - Uses attention mechanisms for selective aggregation of neighbor features.
  - Includes multi-head attention and residual connections.
- Recurrent Graph Neural Networks (RGNN):
  - Implements multi-step recurrent updates with GRU cells.
  - Focuses on iterative message passing and aggregation.
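To make the attention-based aggregation concrete, the sketch below computes GAT-style attention coefficients on a hypothetical three-node graph, then concatenates two heads and adds a residual connection. It is written in plain Python for readability and is not the repository's model, which presumably uses `torch-geometric`. All weights, feature values, and function names here are illustrative assumptions.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gat_head(features, adj, W, a):
    """One attention head; returns (outputs, attention) keyed by node."""
    z = {i: matvec(W, h) for i, h in features.items()}
    outputs, attention = {}, {}
    for i in features:
        nbrs = sorted(adj[i] | {i})  # neighbors plus a self-loop
        # Score each neighbor: LeakyReLU(a . [z_i || z_j]).
        scores = [
            leaky_relu(sum(c * x for c, x in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        alpha = softmax(scores)  # normalize over the neighborhood
        attention[i] = dict(zip(nbrs, alpha))
        outputs[i] = [
            sum(al * z[j][d] for al, j in zip(alpha, nbrs))
            for d in range(len(z[i]))
        ]
    return outputs, attention


def gat_layer(features, adj, heads):
    """Concatenate all heads, then add the input as a residual."""
    head_outputs = [gat_head(features, adj, W, a)[0] for W, a in heads]
    result = {}
    for i, h in features.items():
        concat = [x for out in head_outputs for x in out[i]]
        # Residual: valid here because the concatenated size equals len(h).
        result[i] = [c + x for c, x in zip(concat, h)]
    return result


# Hypothetical toy inputs: 2-d features, two heads with 1-d outputs each.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
adj = {0: {1, 2}, 1: {0}, 2: {0}}
heads = [([[1.0, 0.0]], [0.5, -0.5]), ([[0.0, 1.0]], [1.0, 1.0])]

_, att = gat_head(features, adj, *heads[0])
out = gat_layer(features, adj, heads)
print(att)
print(out)
```

Each node's attention weights sum to 1 over its neighborhood, which is what lets the layer weight some neighbors more heavily than others instead of averaging them uniformly.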
- Node2Vec:
  - Generates node embeddings by simulating random walks and learning structural representations.
- GAT:
  - Multi-head attention for neighbor aggregation.
  - Residual connections to preserve initial node information.
- RGNN:
  - Single-layer recurrent GNN with GRU-based updates.
  - Multi-step message passing for iterative aggregation.
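The recurrent model's multi-step message passing can be illustrated with a deliberately small sketch. It assumes a scalar hidden state per node, sum aggregation over neighbors, and one GRU cell shared across all steps. These choices, along with the parameter names and values, are assumptions made for the example and may differ from the repository's implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_cell(m, h, params):
    """Scalar GRU update: message m is the input, h the previous state.

    Uses the convention h_new = (1 - z) * h + z * candidate.
    """
    z = sigmoid(params["wz"] * m + params["uz"] * h + params["bz"])  # update gate
    r = sigmoid(params["wr"] * m + params["ur"] * h + params["br"])  # reset gate
    cand = math.tanh(params["wh"] * m + params["uh"] * (r * h) + params["bh"])
    return (1.0 - z) * h + z * cand


def recurrent_gnn(adj, h0, params, steps):
    """Run `steps` rounds of sum aggregation followed by a shared GRU update."""
    h = dict(h0)
    for _ in range(steps):
        messages = {i: sum(h[j] for j in adj[i]) for i in adj}
        h = {i: gru_cell(messages[i], h[i], params) for i in adj}
    return h


# Hypothetical toy setup: a path graph 0 - 1 - 2 - 3, signal only on node 0.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
h0 = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}
params = {"wz": 1.0, "uz": 1.0, "bz": 0.0,
          "wr": 1.0, "ur": 1.0, "br": 0.0,
          "wh": 1.0, "uh": 1.0, "bh": 0.0}

after_one = recurrent_gnn(adj, h0, params, steps=1)
after_three = recurrent_gnn(adj, h0, params, steps=3)
print(after_one)
print(after_three)
```

After one step only node 1 has received information from node 0. Node 3 is three hops away, so it needs three steps, which shows why the number of recurrent steps bounds how far information can propagate.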
- Test Accuracy:
  - GAT: Achieved high performance with attention-based aggregation.
  - RGNN: Used recurrent updates for deeper message propagation.
  - Node Embeddings: Provided a robust feature set for classification.
- Python 3.8+
- Libraries:
  - torch
  - torch-geometric
  - node2vec
  - networkx
  - matplotlib
This project is licensed under the MIT License. See the LICENSE file for details.