Commit
typos fixes
nicolasdugue committed Mar 4, 2024
1 parent f19a1a4 commit e6064c1
Showing 2 changed files with 6 additions and 7 deletions.
4 changes: 2 additions & 2 deletions examples/structral_node_embedding/sinr_example.py
@@ -1,6 +1,6 @@
"""SINr illustrative example.
- Nodes in both cliques (barbell graph) will get the same embedding vectors, except for the ones connected to the path.
- Nodes in the paths are in distinct communities with sufficient gamma, and get thus distinct vectors.
+ Nodes in both cliques (barbell graph) will get the same embedding vectors, except for those connected to the path.
+ Nodes in the path are in distinct communities with a high-enough gamma, and will thus get distinct vectors.
"""

import networkx as nx
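
The behaviour described in this docstring can be checked by hand with a small sketch. The block below does not call karateclub or Louvain: the helper names `barbell_edges` and `edge_distribution` are hypothetical, the communities are assigned manually (one per clique, one per path node, as a high-enough gamma is expected to produce), and each vector is assumed to be the share of a node's edges that fall into each community.

```python
from itertools import combinations


def barbell_edges(clique_size, path_length):
    """Build a barbell graph by hand: two cliques joined by a path.

    Returns the edge list and a hand-assigned partition (an assumption,
    not the output of Louvain): each clique is one community and every
    path node is its own community.
    """
    left = list(range(clique_size))
    path = list(range(clique_size, clique_size + path_length))
    right = list(range(clique_size + path_length, 2 * clique_size + path_length))
    edges = list(combinations(left, 2)) + list(combinations(right, 2))
    # Chain: last node of the left clique, the path nodes, first node of the right clique.
    chain = [left[-1]] + path + [right[0]]
    edges += list(zip(chain, chain[1:]))
    communities = [set(left)] + [{p} for p in path] + [set(right)]
    return edges, communities


def edge_distribution(edges, communities):
    """For each node, the fraction of its edges that lead into each community."""
    community_of = {n: i for i, members in enumerate(communities) for n in members}
    counts = {n: [0] * len(communities) for n in community_of}
    for u, v in edges:
        counts[u][community_of[v]] += 1
        counts[v][community_of[u]] += 1
    # Guard against isolated nodes (none exist in a barbell graph).
    return {n: [c / (sum(row) or 1) for c in row] for n, row in counts.items()}


edges, communities = barbell_edges(clique_size=4, path_length=2)
embedding = edge_distribution(edges, communities)
for node in sorted(embedding):
    print(node, embedding[node])
```

Under these assumptions, the clique nodes that do not touch the path share one vector within their clique, the two clique nodes attached to the path differ from their clique mates, and the two path nodes get distinct vectors.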
9 changes: 4 additions & 5 deletions karateclub/node_embedding/structural/sinr.py
@@ -9,13 +9,12 @@
class SINr(Estimator):
r"""An implementation of `"SINr" <https://inria.hal.science/hal-03197434/>`_
from the IDA '21 best paper "SINr: Fast Computing of Sparse Interpretable Node Representations is not a Sin!".
- The procedure computes community detection using Louvain algorithm, and calculates the distribution of edges of each node across communities.
- The algorithm is one of the fastest, because it relies mostly on Louvain community detection. It thus runs in
- quasi-linear time. Regarding space complexity, it requires to be able to store the adjacency matrix and the community membership matrix, it is also quasi-linear.
+ The procedure performs community detection using the Louvain algorithm, and computes the distribution of edges of each node across all communities.
+ The algorithm is one of the fastest, because it mostly relies on Louvain community detection. It thus runs in quasi-linear time. Regarding space complexity, the adjacency matrix and the community membership matrix need to be stored, it is also quasi-linear.
Args:
gamma (int): modularity multi-resolution parameter. Default is 1.
- The dimension parameter does not exist for SINr, gamma should be use instead: the number of dimensions of the embedding space is based on the number of communities uncovered. The higher gamma is, the more communities are detected, the higher the number of dimensions of the latent space uncovered. For small graphs, setting gamma to 1 is usually a good fit. For bigger graphs, it is recommended to increase gamma (5 or 10 for instance). For word co-occurrence graphs, to deal with word embedding, gamma is isually set to 50 to get a lot of small communities.
+ The dimension parameter does not exist for SINr, gamma should be used instead: the number of dimensions of the embedding space is based on the number of communities uncovered. The higher gamma is, the more communities are detected, the higher the number of dimensions of the latent space are uncovered. For small graphs, setting gamma to 1 is usually sufficient. For bigger graphs, it is recommended to increase gamma (5 or 10 for example). For word co-occurrence graphs, to deal with word embedding, gamma is usually set to 50 in order to get many small communities.
seed (int): Random seed value. Default is 42.
"""

@@ -50,7 +49,7 @@ def fit(self, graph: nx.classes.graph.Graph):
self._embedding = norm_adjacency.dot(membership_matrix)

def _get_matrix_membership(self, list_of_communities:List[Set[int]]):
- r"""Getting the membership matrix describing for each node (rows), to which community (columns) it belongs.
+ r"""Getting the membership matrix describing for each node (rows), in which community (column) it belongs.
Return types:
* **Membership matrix** *(scipy sparse matrix csr)* - Size nodes, communities
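
The product `norm_adjacency.dot(membership_matrix)` shown in this hunk can be illustrated with a dense, standard-library sketch. This is not the library's code: the helpers `membership_matrix`, `row_normalize`, and `matmul` are hypothetical stand-ins for the sparse operations, the communities are given by hand rather than found by Louvain, and the adjacency is assumed to be normalized row by row so that each row sums to 1.

```python
from typing import List, Set


def membership_matrix(communities: List[Set[int]], n_nodes: int) -> List[List[int]]:
    """Dense nodes x communities matrix: entry (i, j) is 1 if node i is in community j."""
    matrix = [[0] * len(communities) for _ in range(n_nodes)]
    for column, members in enumerate(communities):
        for node in members:
            matrix[node][column] = 1
    return matrix


def row_normalize(adjacency: List[List[int]]) -> List[List[float]]:
    """Divide each row by its sum (assumed normalization; all-zero rows stay zero)."""
    normalized = []
    for row in adjacency:
        total = sum(row)
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


def matmul(a, b):
    """Plain dense matrix product, standing in for the sparse `.dot` call."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Path graph 0 - 1 - 2 - 3 with hand-picked communities {0, 1} and {2, 3}.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
communities = [{0, 1}, {2, 3}]
membership = membership_matrix(communities, n_nodes=4)
embedding = matmul(row_normalize(adjacency), membership)
print(embedding)
```

Each row of `embedding` then has one entry per community, which is why the number of dimensions follows the number of communities rather than a separate dimension parameter.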
