Euler Model EN

Siran Yang edited this page Jun 4, 2019 · 1 revision

This section gives a brief introduction to the graph representation learning models implemented in tf_euler. Users can use these models directly in TensorFlow.

Basic type

All graph representation learning models inherit from the base class Model, which in turn inherits from Layer. Layer is the basic unit encapsulated in tf_euler. Layer and its subclasses have three main methods:

class Layer(object):

  def __init__(self, name=None, **kwargs):
    """Initialize the meta information, and other layers contained in the object."""
    pass
  
  def build(self, input_shape):
    """Initialize the model weights that the object itself will use; called the first time __call__() is invoked."""
    pass

  def call(self, inputs):
    """Apply this object to the inputs; called whenever __call__() is invoked."""
    pass
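The lazy build contract described in the docstrings above can be illustrated with a small pure-Python stand-in. This is only a sketch, not the tf_euler source: the `built` flag, the `Scale` subclass, and passing `len(inputs)` as the input shape are assumptions made for illustration.

```python
class Layer(object):
    """Toy stand-in for the Layer contract (not the tf_euler implementation)."""

    def __init__(self, name=None, **kwargs):
        self.name = name
        self.built = False  # assumed flag: records whether build() has run

    def build(self, input_shape):
        pass

    def call(self, inputs):
        raise NotImplementedError

    def __call__(self, inputs):
        # build() runs once, on the first invocation; call() runs every time.
        if not self.built:
            self.build(len(inputs))  # simplification: length stands in for a shape
            self.built = True
        return self.call(inputs)


class Scale(Layer):
    """Hypothetical subclass: multiplies inputs by a weight created in build()."""

    def build(self, input_shape):
        self.weight = 2.0

    def call(self, inputs):
        return [self.weight * x for x in inputs]


layer = Scale(name='scale')
first = layer([1.0, 2.0])  # triggers build(), then call()
second = layer([3.0])      # build() is skipped, only call() runs
```

Deferring weight creation to build() means a layer can be constructed before the shape of its input is known.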

The call method of every Model subclass takes a set of source nodes (a 1-D tf.Tensor) as input and returns a ModelOutput, a namedtuple with four fields, defined as:

ModelOutput = collections.namedtuple(
    'ModelOutput', [
        'embedding', # 2-D tf.Tensor, embedding of input nodes
        'loss', # scalar tf.Tensor, loss of the mini-batch
        'metric_name', # str, name of the metric used by model
        'metric' # scalar tf.Tensor, value of the (streaming) metric used by model
    ])
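Because ModelOutput is a namedtuple, its fields can be read by name or unpacked positionally. The sketch below uses plain Python values in place of the tensors a real model would return; the numbers and the metric name are placeholders, not values produced by tf_euler.

```python
import collections

ModelOutput = collections.namedtuple(
    'ModelOutput', ['embedding', 'loss', 'metric_name', 'metric'])

# Placeholder values; a real model returns tf.Tensor objects for
# embedding, loss, and metric.
output = ModelOutput(
    embedding=[[0.1, 0.2], [0.3, 0.4]],
    loss=0.69,
    metric_name='mrr',
    metric=0.5)

# Access by field name ...
summary = '{}: {:.2f}, loss: {:.2f}'.format(
    output.metric_name, output.metric, output.loss)

# ... or unpack positionally, in declaration order.
embedding, loss, metric_name, metric = output
```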

All the unsupervised graph embedding learning models inherit from the class UnsupervisedModel, which itself inherits Model:

class UnsupervisedModel(Model):
  def __init__(self,
               node_type, # int, node type, used for negative sampling
               edge_type, # 1-D tf.Tensor, edge types used to sample neighbors (positive sampling)
               max_id, # int, maximum id of nodes in graph
               num_negs=5, # int, number of negative samples per source node
               **kwargs):
    pass

The call method of UnsupervisedModel uses tf_euler.sample_neighbor to sample positive examples for the source nodes and tf_euler.sample_node to sample negative examples, then uses a model-specific encode/decode method to calculate the loss.
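The flow of positive sampling, negative sampling, and loss calculation can be sketched in pure Python on a toy graph. Everything here is an assumption made for illustration: the helper functions are hypothetical stand-ins that do not call tf_euler, the decoder is a plain dot product, and the loss is the negative-sampling logistic loss, which is one common choice rather than a confirmed description of what each tf_euler model computes.

```python
import math
import random


def sample_neighbor(adj, node, rng):
    """Hypothetical positive sampler: pick one neighbor of `node`."""
    return rng.choice(adj[node])


def sample_negatives(max_id, num_negs, rng):
    """Hypothetical negative sampler: uniform ids in [0, max_id].
    Negatives may collide with the source or positive node in this toy version."""
    return [rng.randint(0, max_id) for _ in range(num_negs)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))


def unsupervised_loss(emb, src, pos, negs):
    """Pull the positive pair together, push negative pairs apart."""
    loss = -log_sigmoid(dot(emb[src], emb[pos]))
    for neg in negs:
        loss -= log_sigmoid(-dot(emb[src], emb[neg]))
    return loss


rng = random.Random(0)
adj = {0: [1, 2], 1: [0], 2: [0], 3: [0]}  # toy adjacency list
max_id = 3
emb = {n: [rng.uniform(-0.1, 0.1) for _ in range(4)] for n in range(max_id + 1)}

src = 0
pos = sample_neighbor(adj, src, rng)
negs = sample_negatives(max_id, 5, rng)  # num_negs = 5, the documented default
loss = unsupervised_loss(emb, src, pos, negs)
```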

All the supervised graph representation learning models inherit from SupervisedModel, which itself inherits Model:

class SupervisedModel(Model):
  def __init__(self,
               label_idx, # int, feature id of label in dense features
               label_dim, # int, dimension of label
               num_class=None, # int, number of classes, mandatory when label is scalar
               sigmoid_loss=False, # bool, defaults to False,
                                   # whether to use sigmoid to calculate the loss:
                                   # True for sigmoid, False for softmax
               **kwargs):
    pass

The call method of SupervisedModel uses tf_euler.get_dense_feature to fetch node labels from the graph for node classification, then uses a model-specific encode/decode method to calculate the loss.
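The difference between the two loss options controlled by sigmoid_loss can be shown with a small pure-Python sketch. The function names are hypothetical and the code does not use tf_euler or TensorFlow; it only illustrates that softmax treats the classes as mutually exclusive, while sigmoid scores each class independently, which suits multi-label targets.

```python
import math


def softmax_loss(logits, label):
    """Cross-entropy for one example whose label is a single class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def sigmoid_loss(logits, labels):
    """Mean binary cross-entropy with an independent 0/1 label per class."""
    total = 0.0
    for x, y in zip(logits, labels):
        # Stable form of -y*log(s(x)) - (1-y)*log(1-s(x)).
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


logits = [0.0, 0.0, 0.0]
single = softmax_loss(logits, label=1)          # one class out of three
multi = sigmoid_loss(logits, labels=[1, 0, 1])  # several classes may be active
```

With all-zero logits the softmax loss equals log(3) and the sigmoid loss equals log(2), the values expected from an uninformed classifier in each setting.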

Concrete model

tf_euler provides six commonly used graph embedding learning algorithms and two embedding learning algorithms for heterogeneous graphs, which are included in the following classes:

tf_euler.models.LINE( # unsupervised
    node_type, edge_type, max_id, # the same as base class
    dim, # int, dimension of embedding
    order # 1 / 2, LINE order, defaults to 1
)

tf_euler.models.Node2Vec( # unsupervised
    node_type, edge_type, max_id, # the same as base class
    dim, # int, dimension of embedding
    walk_len, # int, length of random walk, defaults to 3
    walk_p, # int, return parameter, defaults to 1
    walk_q, # int, in-out parameter, defaults to 1
    left_win_size, # int, left slide window size, defaults to 1
    right_win_size # int, right slide window size, defaults to 1
)
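To make the roles of left_win_size and right_win_size concrete, the sketch below turns one random walk into (center, context) training pairs using a sliding window. It is a pure-Python illustration with a hypothetical helper name; it assumes the walk includes the source node followed by walk_len steps, and it is not a description of how tf_euler generates pairs internally.

```python
def gen_pairs(path, left_win_size, right_win_size):
    """Pair each node with the nodes inside its left/right window on the walk."""
    pairs = []
    for i, center in enumerate(path):
        lo = max(0, i - left_win_size)
        hi = min(len(path), i + right_win_size + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, path[j]))
    return pairs


walk = [7, 3, 9, 4]  # assumed layout: source node 7 plus walk_len = 3 steps
pairs = gen_pairs(walk, left_win_size=1, right_win_size=1)
# pairs == [(7, 3), (3, 7), (3, 9), (9, 3), (9, 4), (4, 9)]
```

Larger windows produce more pairs per walk and relate nodes that are farther apart on the walk.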

tf_euler.models.GraphSage( # unsupervised
    node_type, edge_type, max_id, # the same as base class
    metapath, # list of 1-D int64 tf.Tensor, edge types used for sampling in each hop
    fanouts, # list of int, number of neighbor samples for each hop
    dim, # int, dimension of embedding
    aggregator, # str, name of aggregator, defaults to mean
    concat, # bool, whether to use concat as combiner, defaults to False
    feature_idx, # int, feature id of dense feature, defaults to -1
    feature_dim, # int, dimension of dense feature, defaults to 0
    use_feature, # deprecated
    use_id # bool, whether to use the embedding of node id as part of H^0
)
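The fanouts parameter determines how many neighbors are sampled at each hop, so the number of sampled nodes grows multiplicatively with depth. The following pure-Python sketch shows that growth on a toy graph. The helper is hypothetical, it ignores metapath by assuming a single edge type, and it samples with replacement; none of this is taken from the tf_euler implementation.

```python
import random


def sample_fanout(adj, nodes, fanouts, rng):
    """Return one list of node ids per hop, starting with the source nodes."""
    hops = [list(nodes)]
    for count in fanouts:
        frontier = []
        for node in hops[-1]:
            # Sample with replacement so every node yields exactly `count` ids.
            frontier.extend(rng.choice(adj[node]) for _ in range(count))
        hops.append(frontier)
    return hops


adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # toy adjacency list
hops = sample_fanout(adj, nodes=[0, 1], fanouts=[3, 2], rng=random.Random(0))
sizes = [len(h) for h in hops]
# sizes == [2, 6, 12]: 2 sources, 2*3 first-hop samples, 2*3*2 second-hop samples
```

Since the cost of a mini-batch scales with the product of the fanouts, small per-hop values are the usual choice for deeper models.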

tf_euler.models.SupervisedGraphSage( # supervised
    label_idx, label_dim, # the same as base class
    metapath, # list of 1-D int64 tf.Tensor, edge types used for sampling in each hop
    fanouts, # list of int, number of neighbor samples for each hop
    dim, # int, dimension of embedding
    aggregator, # str, name of aggregator, mean / meanpool / maxpool, defaults to mean
    concat, # bool, whether to use concat as combiner, defaults to False
    feature_idx, # int, feature id of dense feature, defaults to -1
    feature_dim, # int, dimension of dense feature, defaults to 0
)

tf_euler.models.ScalableSage( # supervised
    label_idx, label_dim, # the same as base class
    edge_type, # 1-D int64 tf.Tensor, edge types used for sampling
    fanout, # int, number of neighbor samples
    num_layers, # int, number of gnn layers
    dim, # int, dimension of embedding
    aggregator, # str, name of aggregator, defaults to mean
    concat, # bool, whether to use concat as combiner, defaults to False
    feature_idx, # int, feature id of dense feature, defaults to -1
    feature_dim, # int, dimension of dense feature, defaults to 0
)
# After applying ScalableSage to the input, the below method needs to be called:
scalable_sage.make_session_run_hook()
# and add the returned tf.train.SessionRunHook to tf.train.MonitoredTrainingSession.

tf_euler.models.GAT( # supervised
    label_idx, label_dim, # the same as base class
    feature_idx, # int, feature id of dense feature, defaults to -1
    feature_dim, # int, dimension of dense feature, defaults to 0
    max_id, # int, maximum id of nodes in graph
    hidden_dim, # int, dimension of hidden layer
    nb_num, # int, number of samples, defaults to 5
    edge_type # int, edge type used for sampling, defaults to 0
)