Graph Convolutional Network

Author: 魏鹏飞 | Published 2020-03-11 11:27

    This is a gentle introduction to using DGL to implement Graph Convolutional Networks (Kipf & Welling, Semi-Supervised Classification with Graph Convolutional Networks). We build upon the earlier tutorial on DGLGraph and demonstrate how DGL combines graphs with deep neural networks to learn structural representations.

    Model Overview

    GCN from the perspective of message passing

    We describe a layer of the graph convolutional neural network from a message passing perspective; the math can be found in the original tutorial (linked at the end of this post). It boils down to the following two steps, for each node u:

    1. Aggregate neighbors' representations h_v to produce an intermediate representation \hat{h}_u.
    2. Transform the aggregated representation \hat{h}_u with a linear projection followed by a non-linearity: h_u=f(W_u\hat{h}_u).

    We will implement step 1 with DGL message passing, and step 2 with the apply_nodes method, whose node UDF will be a PyTorch nn.Module.
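
    Before moving to DGL, the two steps can be sketched with plain dense tensors (an illustration only, assuming a toy dense adjacency matrix A; real graphs use sparse message passing):

    import torch as th
    import torch.nn.functional as F

    N, D_in, D_out = 4, 5, 3                # toy sizes (hypothetical)
    A = th.randint(0, 2, (N, N)).float()    # toy adjacency matrix
    H = th.randn(N, D_in)                   # node representations h_v
    W = th.randn(D_in, D_out)               # layer weight

    H_hat = A @ H                           # step 1: aggregate neighbors' representations
    H_new = F.relu(H_hat @ W)               # step 2: linear projection + non-linearity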

    GCN implementation with DGL

    We first define the message and reduce functions as usual. Since the aggregation on a node u only involves summing over the neighbors' representations h_v, we can simply use builtin functions:

    import dgl
    import dgl.function as fn
    import torch as th
    import torch.nn as nn
    import torch.nn.functional as F
    from dgl import DGLGraph
    
    gcn_msg = fn.copy_src(src='h', out='m')
    gcn_reduce = fn.sum(msg='m', out='h')
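
    For reference, these builtins behave like the following hand-written UDFs (an illustrative sketch; the builtin versions let DGL batch and fuse the computation internally):

    def gcn_msg_udf(edges):
        # copy the source node's feature onto the edge as message 'm'
        return {'m': edges.src['h']}

    def gcn_reduce_udf(nodes):
        # sum all incoming messages in the mailbox into the new 'h'
        return {'h': th.sum(nodes.mailbox['m'], dim=1)}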
    

    We then define the node UDF for apply_nodes, which is a fully-connected layer:

    class NodeApplyModule(nn.Module):
        def __init__(self, in_feats, out_feats, activation):
            super(NodeApplyModule, self).__init__()
            self.linear = nn.Linear(in_feats, out_feats)
            self.activation = activation
    
        def forward(self, node):
            h = self.linear(node.data['h'])
            if self.activation is not None:
                h = self.activation(h)
            return {'h' : h}
    

    We then proceed to define the GCN module. A GCN layer essentially performs message passing on all the nodes and then applies the NodeApplyModule. Note that we omit the dropout in the paper for simplicity.

    class GCN(nn.Module):
        def __init__(self, in_feats, out_feats, activation):
            super(GCN, self).__init__()
            self.apply_mod = NodeApplyModule(in_feats, out_feats, activation)
    
        def forward(self, g, feature):
            g.ndata['h'] = feature
            g.update_all(gcn_msg, gcn_reduce)
            g.apply_nodes(func=self.apply_mod)
            return g.ndata.pop('h')
    

    The forward function is essentially the same as that of any other commonly seen NN model in PyTorch. We can initialize GCN like any nn.Module. For example, let's define a simple neural network consisting of two GCN layers. Suppose we are training the classifier for the cora dataset (the input feature size is 1433 and the number of classes is 7). The last GCN layer computes node embeddings, so the last layer in general doesn't apply an activation.

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.gcn1 = GCN(1433, 16, F.relu)
            self.gcn2 = GCN(16, 7, None)
    
        def forward(self, g, features):
            x = self.gcn1(g, features)
            x = self.gcn2(g, x)
            return x

    net = Net()
    print(net)
    
    # Results:
    Net(
      (gcn1): GCN(
        (apply_mod): NodeApplyModule(
          (linear): Linear(in_features=1433, out_features=16, bias=True)
        )
      )
      (gcn2): GCN(
        (apply_mod): NodeApplyModule(
          (linear): Linear(in_features=16, out_features=7, bias=True)
        )
      )
    )
    

    We load the cora dataset using DGL’s built-in data module.

    from dgl.data import citation_graph as citegrh
    import networkx as nx

    def load_cora_data():
        data = citegrh.load_cora()
        features = th.FloatTensor(data.features)
        labels = th.LongTensor(data.labels)
        train_mask = th.BoolTensor(data.train_mask)
        test_mask = th.BoolTensor(data.test_mask)
        g = data.graph
        # add a self loop to every node: drop any existing self loops
        # first, then connect each node to itself
        g.remove_edges_from(nx.selfloop_edges(g))
        g = DGLGraph(g)
        g.add_edges(g.nodes(), g.nodes())
        return g, features, labels, train_mask, test_mask
    

    Once the model is trained, we can use the following method to evaluate its performance on the test dataset:

    def evaluate(model, g, features, labels, mask):
        model.eval()
        with th.no_grad():
            logits = model(g, features)
            logits = logits[mask]
            labels = labels[mask]
            _, indices = th.max(logits, dim=1)
            correct = th.sum(indices == labels)
            return correct.item() * 1.0 / len(labels)
    

    We then train the network as follows:

    import time
    import numpy as np
    g, features, labels, train_mask, test_mask = load_cora_data()
    optimizer = th.optim.Adam(net.parameters(), lr=1e-3)
    dur = []
    for epoch in range(50):
        if epoch >=3:
            t0 = time.time()
    
        net.train()
        logits = net(g, features)
        logp = F.log_softmax(logits, 1)
        loss = F.nll_loss(logp[train_mask], labels[train_mask])
    
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    
        if epoch >=3:
            dur.append(time.time() - t0)
    
        acc = evaluate(net, g, features, labels, test_mask)
        print("Epoch {:05d} | Loss {:.4f} | Test Acc {:.4f} | Time(s) {:.4f}".format(
                epoch, loss.item(), acc, np.mean(dur)))
    
    
    # Results:
    Epoch 00000 | Loss 1.9180 | Test Acc 0.3090 | Time(s) nan
    Epoch 00001 | Loss 1.8872 | Test Acc 0.3100 | Time(s) nan
    Epoch 00002 | Loss 1.8609 | Test Acc 0.3200 | Time(s) nan
    Epoch 00003 | Loss 1.8365 | Test Acc 0.3310 | Time(s) 0.1041
    Epoch 00004 | Loss 1.8127 | Test Acc 0.3420 | Time(s) 0.1042
    Epoch 00005 | Loss 1.7886 | Test Acc 0.3520 | Time(s) 0.1038
    Epoch 00006 | Loss 1.7641 | Test Acc 0.3880 | Time(s) 0.1066
    Epoch 00007 | Loss 1.7396 | Test Acc 0.4090 | Time(s) 0.1093
    Epoch 00008 | Loss 1.7151 | Test Acc 0.4110 | Time(s) 0.1094
    Epoch 00009 | Loss 1.6901 | Test Acc 0.4210 | Time(s) 0.1088
    Epoch 00010 | Loss 1.6649 | Test Acc 0.4300 | Time(s) 0.1081
    Epoch 00011 | Loss 1.6396 | Test Acc 0.4370 | Time(s) 0.1089
    ......
    ......
    ......
    Epoch 00044 | Loss 0.9971 | Test Acc 0.6620 | Time(s) 0.1101
    Epoch 00045 | Loss 0.9845 | Test Acc 0.6690 | Time(s) 0.1100
    Epoch 00046 | Loss 0.9721 | Test Acc 0.6710 | Time(s) 0.1099
    Epoch 00047 | Loss 0.9599 | Test Acc 0.6710 | Time(s) 0.1099
    Epoch 00048 | Loss 0.9480 | Test Acc 0.6740 | Time(s) 0.1097
    Epoch 00049 | Loss 0.9364 | Test Acc 0.6780 | Time(s) 0.1114
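
    As a side note, the F.log_softmax + F.nll_loss pair in the training loop above is equivalent to a single cross-entropy call:

    loss = F.cross_entropy(logits[train_mask], labels[train_mask])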
    

    GCN in one formula

    Mathematically, the GCN model follows this formula:

    H^{(l+1)}=\sigma(\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}H^{(l)}W^{(l)})

    Here, H^{(l)} denotes the representation at the l^{th} layer of the network, \sigma is the non-linearity, and W^{(l)} is the weight matrix for this layer. D and A, as commonly seen, represent the degree matrix and the adjacency matrix, respectively. The tilde (~) denotes the renormalization trick, in which we add a self-connection to each node of the graph and build the corresponding degree and adjacency matrices. The shape of the input H^{(0)} is N×D, where N is the number of nodes and D is the number of input features. We can chain up multiple layers like this to produce a node-level representation output with shape N×F, where F is the dimension of the output node feature vector.

    The equation can be efficiently implemented using sparse matrix multiplication kernels (such as Kipf's pygcn code). The DGL implementation above in fact already uses this trick, thanks to the builtin functions. To understand what is under the hood, please read our tutorial on PageRank.
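
    To make the formula concrete, here is a minimal dense-matrix sketch of one propagation step (an illustration only, with a toy random adjacency matrix; it is not how DGL computes this internally):

    import torch as th

    N = 4
    A = th.randint(0, 2, (N, N)).float()                 # toy adjacency matrix (hypothetical)
    A_tilde = A + th.eye(N)                              # renormalization trick: add self-connections
    D_inv_sqrt = th.diag(A_tilde.sum(dim=1).pow(-0.5))   # \tilde{D}^{-1/2}
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt            # \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}

    H = th.randn(N, 1433)                                # H^{(0)}: input node features
    W = th.randn(1433, 16)                               # W^{(0)}: layer weight
    H1 = th.relu(A_hat @ H @ W)                          # H^{(1)} = \sigma(A_hat H^{(0)} W^{(0)})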

    Original tutorial:
    https://docs.dgl.ai/tutorials/models/1_gnn/1_gcn.html
