GraphSage
GraphSage is an inductive embedding training method proposed in the paper Inductive Representation Learning on Large Graphs by William L. Hamilton et al.
The GCN covered in the previous post is transductive learning: it directly trains an embedding for every node of one fixed graph. But in many graph scenarios, nodes are added in real time, so this paper proposes inductive learning. Instead of training each node's embedding on a static graph, it trains a mapping (the aggregator) from a neighborhood to an embedding, so that the embedding of a newly added node can be obtained from its neighbor relations alone.
1. Embedding generation (i.e., forward propagation) algorithm

2. Learning the parameters of GraphSAGE
For unsupervised learning, the training loss is pair-wise: the embeddings of two connected nodes should be close, while the embeddings of two unconnected nodes should be far apart. (The inner product is used as the similarity measure, which equals cosine similarity for normalized embeddings.)
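The pair-wise unsupervised objective can be sketched as follows (a minimal numpy sketch; the toy embeddings and the negative-sample weight Q are assumptions for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def unsup_loss(z_u, z_v, z_negs, Q=1.0):
    """GraphSAGE-style unsupervised loss: pull a co-occurring pair (u, v)
    together and push u away from sampled negative nodes."""
    pos = -np.log(sigmoid(z_u @ z_v))
    neg = -Q * np.mean([np.log(sigmoid(-z_u @ z_n)) for z_n in z_negs])
    return pos + neg

rng = np.random.default_rng(0)
z_u = rng.normal(size=8)
close = z_u + 0.01 * rng.normal(size=8)   # a neighbor: nearly identical embedding
far = -z_u                                 # a negative: opposite direction
# a close positive pair yields a smaller loss than a far one
print(unsup_loss(z_u, close, [far]) < unsup_loss(z_u, far, [close]))  # True
```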

3. Aggregator Architectures
- Mean aggregator
- LSTM aggregator: since the aggregator should be invariant to the order of its input list, while an LSTM is inherently order-sensitive, the neighbor list must be shuffled before every forward pass.
- Pooling aggregator
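The mean aggregator can be sketched like this (numpy; the weight matrix W is a hypothetical trainable parameter, and the self-vector is concatenated with the neighborhood mean before the linear transform and nonlinearity):

```python
import numpy as np

def mean_aggregate(h_v, h_neighbors, W):
    """One mean-aggregator step:
    h_v' = ReLU(W · concat(h_v, mean_{u in N(v)} h_u))."""
    h_mean = np.mean(h_neighbors, axis=0)        # aggregate the sampled neighborhood
    h_cat = np.concatenate([h_v, h_mean])        # concat self and neighborhood
    return np.maximum(0.0, W @ h_cat)            # linear transform + ReLU

rng = np.random.default_rng(1)
d = 4
W = rng.normal(size=(d, 2 * d))                  # hypothetical trainable weights
h_v = rng.normal(size=d)
h_neighbors = rng.normal(size=(3, d))            # 3 sampled neighbors
print(mean_aggregate(h_v, h_neighbors, W).shape)  # (4,)
```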
4. Questions
The training set contains feature information for each node, and the node embeddings are trained from that feature information. What if there is no feature information? Do we use the node index (ID) instead?
Papers applying GraphSage to recommender systems
Graph Convolutional Neural Networks for Web-Scale Recommender Systems(KDD2018)(PinSage)
The overall fusion framework of this paper is still GraphSage: sample a number of neighbors and aggregate them. The difference is that neighbors are not sampled uniformly at random but by importance pooling. The main innovations of the paper:
- Importance pooling: when sampling neighbors for aggregation, neighbors are ranked by weight and the most important ones are aggregated. A neighbor's weight is the L1-normalized visit count of random walks, and the same weights are reused as a weighted mean during aggregation. (See the convolve algorithm figure in the paper.)
- Producer-consumer minibatch construction to allocate GPU and CPU resources sensibly: the CPU first builds the subgraphs, which are then handed to the GPU for the embedding computation. The benefit is that no CPU-GPU communication is needed during training, which improves training efficiency.
- The algorithm is still trained with a pair-wise supervised objective, a max-margin ranking loss:
J(z_q, z_i) = E_{n_k ∼ P_n(q)} max{0, z_q · z_{n_k} − z_q · z_i + Δ}
where n_k is a negative item and i is a positive item. The training objective is to pull the query and the positive item closer while pushing the query and the negative item apart. For more effective training, the paper also proposes a form of curriculum learning: part of the negative samples are hard negative samples, items that are not relevant to the query yet still somewhat similar to it (the paper samples them from the items ranked 2000-5000 by the query's ranking score). Using these samples from the start makes training hard, so training begins with random negative samples; once the parameters have roughly converged, the hard negative samples are used to obtain a better model. Going from easy to hard is what makes it curriculum learning.
- Node Embeddings via MapReduce: this technique speeds up computing the embeddings of all nodes at inference time by avoiding repeated computation, but I did not work out exactly how the repeated computation is eliminated...
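Two of the steps above can be sketched in Python: estimating importance-pooling weights as L1-normalized random-walk visit counts, and the max-margin pair-wise loss (the toy graph, walk parameters, and margin value are illustrative assumptions, not values from the paper):

```python
import numpy as np

def importance_weights(adj, start, n_walks=2000, walk_len=3, seed=0):
    """Neighbor importance = L1-normalized visit counts of short
    random walks starting from `start` (importance pooling)."""
    rng = np.random.default_rng(seed)
    counts = {}
    for _ in range(n_walks):
        node = start
        for _ in range(walk_len):
            node = int(rng.choice(adj[node]))
            counts[node] = counts.get(node, 0) + 1
    total = sum(counts.values())            # L1 normalization
    return {n: c / total for n, c in counts.items()}

def max_margin_loss(z_q, z_pos, z_neg, margin=0.5):
    """Hinge loss: the positive item must outscore the negative
    item by at least `margin`."""
    return max(0.0, z_q @ z_neg - z_q @ z_pos + margin)

# toy graph: node 0 links to node 1 twice as often as to node 2
adj = {0: [1, 1, 2], 1: [0, 2], 2: [0, 1]}
w = importance_weights(adj, start=0)
print(w[1] > w[2])  # True: node 1 is visited more often

z_q = np.array([1.0, 0.0])
z_pos = np.array([0.9, 0.1])   # aligned with the query
z_neg = np.array([-1.0, 0.0])  # opposite to the query
print(max_margin_loss(z_q, z_pos, z_neg))  # 0.0: margin already satisfied
```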
Neural Graph Collaborative Filtering(SIGIR2019)(NGCF)
The overall framework of this paper is quite classic:
- an embedding layer that offers an initialization of user embeddings and item embeddings. First define an embedding matrix E ∈ R^{(N+M)×d}: E = [e_{u_1}, …, e_{u_N}, e_{i_1}, …, e_{i_M}], where N is the number of users and M is the number of items.
- multiple embedding propagation layers that refine the embeddings by injecting high-order connectivity relations. (See the NGCF model architecture figure in the paper.)
- Message construction: for a connected user-item pair (u, i), define a message from i to u, m_{u←i}. Its concrete form in the paper is:
m_{u←i} = (1 / √(|N_u||N_i|)) · (W_1 e_i + W_2 (e_i ⊙ e_u))
- Message aggregation:
e_u^{(1)} = LeakyReLU( m_{u←u} + Σ_{i∈N_u} m_{u←i} )
where the self-message m_{u←u} = W_1 e_u uses the same W_1 as in the message construction above.
- Multi-order propagation
- Propagation rule in matrix form: writing the propagation in matrix form greatly speeds up the computation. (This is essentially an extension of GCN.)
- the prediction layer that aggregates the refined embeddings from different propagation layers and outputs the affinity score of a user-item pair. The outputs of every propagation layer are concatenated into the final embedding:
e_u* = e_u^{(0)} ‖ … ‖ e_u^{(L)},  e_i* = e_i^{(0)} ‖ … ‖ e_i^{(L)}
and the affinity score is the dot product ŷ(u, i) = e_u*ᵀ e_i*.
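One NGCF propagation step for a single user can be sketched as follows (numpy; W1, W2, the toy graph, and the LeakyReLU slope of 0.2 are illustrative assumptions):

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def ngcf_user_update(e_u, neighbor_items, W1, W2, n_u):
    """e_u^(1) = LeakyReLU(m_{u<-u} + sum_i m_{u<-i}) with
    m_{u<-i} = (W1 e_i + W2 (e_i * e_u)) / sqrt(|N_u||N_i|)."""
    msg = W1 @ e_u                                   # self message m_{u<-u}
    for e_i, n_i in neighbor_items:                  # (embedding, degree) pairs
        norm = 1.0 / np.sqrt(n_u * n_i)              # graph Laplacian norm
        msg = msg + norm * (W1 @ e_i + W2 @ (e_i * e_u))
    return leaky_relu(msg)

rng = np.random.default_rng(2)
d = 4
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))
e_u = rng.normal(size=d)
items = [(rng.normal(size=d), 2), (rng.normal(size=d), 3)]  # two items, degrees 2 and 3
print(ngcf_user_update(e_u, items, W1, W2, n_u=len(items)).shape)  # (4,)
```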
LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation(2020)(LightGCN)
This paper is an improvement over NGCF above. It finds that the feature transformation (W_1, W_2) and the nonlinear activation in NGCF not only make training harder but also lower accuracy. The main reason: GCN is originally proposed for node classification on attributed graphs, where each node has rich attributes as input features; whereas in the user-item interaction graph for CF, each node (user or item) is only described by a one-hot ID, which has no concrete semantics besides being an identifier. In such a case, given the ID embedding as the input, performing multiple layers of nonlinear feature transformation, which is the key to the success of modern neural networks, will bring no benefits, but negatively increases the difficulty of model training.
The forward propagation layer of LightGCN after this simplification:
e_u^{(k+1)} = Σ_{i∈N_u} (1/√(|N_u||N_i|)) · e_i^{(k)},  e_i^{(k+1)} = Σ_{u∈N_i} (1/√(|N_i||N_u|)) · e_u^{(k)}
Layer embedding aggregator: unlike NGCF, which concatenates the layers, LightGCN uses a weighted sum:
e_u = Σ_{k=0}^{K} α_k e_u^{(k)},  with α_k set uniformly to 1/(K+1)
Matrix form (with Ã = D^{−1/2} A D^{−1/2} the symmetrically normalized adjacency):
E^{(k+1)} = Ã E^{(k)},  so the final embedding is  E = Σ_{k=0}^{K} α_k Ã^k E^{(0)}
Note: no self-connection is added to the forward propagation because the weighted sum over layer embeddings already subsumes it. A proof sketch (ignoring the rescaling of the degree matrix, as in the paper): propagating on Ã + I gives
(Ã + I)^K E^{(0)} = Σ_{k=0}^{K} (K choose k) Ã^k E^{(0)},
i.e., a weighted sum of the embeddings produced at each LGC layer.
So inserting self-connection into A and propagating embeddings on it is essentially equivalent to a weighted sum of the embeddings propagated at each LGC layer.
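The matrix-form propagation above, with uniform α_k = 1/(K+1), can be sketched as (numpy; the toy interaction matrix and initial embeddings are assumptions):

```python
import numpy as np

def lightgcn_embeddings(R, E0, K=2):
    """Propagate E^(k+1) = D^{-1/2} A D^{-1/2} E^(k) on the user-item
    bipartite graph and return the mean of all layer embeddings."""
    n_u, n_i = R.shape
    A = np.block([[np.zeros((n_u, n_u)), R],
                  [R.T, np.zeros((n_i, n_i))]])       # bipartite adjacency
    d = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    nz = d > 0
    d_inv_sqrt[nz] = d[nz] ** -0.5
    A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    layers, E = [E0], E0
    for _ in range(K):
        E = A_hat @ E                                  # no W, no nonlinearity
        layers.append(E)
    return np.mean(layers, axis=0)                     # alpha_k = 1/(K+1)

R = np.array([[1.0, 1.0, 0.0],                         # 2 users x 3 items
              [0.0, 1.0, 1.0]])
E0 = np.eye(5)[:, :4]                                  # toy initial embeddings
print(lightgcn_embeddings(R, E0).shape)  # (5, 4)
```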
- Second-Order Embedding Smoothness.
Because the algorithm is this simple, the propagation of the embeddings is easy to read off. The coefficient with which a second-order neighboring user v influences user u is:
c_{v→u} = (1/√(|N_u||N_v|)) · Σ_{i∈N_u∩N_v} 1/|N_i|
This coefficient is rather interpretable: the influence of a second-order neighbor v on u is determined by 1) the number of co-interacted items, the more the larger; 2) the popularity of the co-interacted items, the less popular (i.e., more indicative of user personalized preference) the larger; and 3) the activity of v, the less active the larger. Such interpretability well caters for the assumption of CF in measuring user similarity and evidences the reasonability of LightGCN. Due to the symmetric formulation of LightGCN, we can get similar analysis on the item side.
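The second-order coefficient can be computed directly on a toy example (plain Python; the degrees and item names are assumptions chosen to show point 2 above):

```python
def second_order_coef(N_u, N_v, item_degree):
    """Second-order coefficient of user v on user u:
    c = 1/sqrt(|N_u||N_v|) * sum over co-interacted items i of 1/|N_i|."""
    shared = N_u & N_v                          # co-interacted items
    s = sum(1.0 / item_degree[i] for i in shared)
    return s / (len(N_u) * len(N_v)) ** 0.5

# u and v share one popular item (degree 10) vs. one niche item (degree 2)
deg = {"pop": 10, "niche": 2, "other": 3}
c_pop = second_order_coef({"pop", "other"}, {"pop"}, deg)
c_niche = second_order_coef({"niche", "other"}, {"niche"}, deg)
print(c_niche > c_pop)  # True: a less popular co-item gives larger influence
```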
Beyond Clicks: Modeling Multi-Relational Item Graph for Session-Based Target Behavior Prediction(WWW2020)(MGNN-SPred)
This paper aims to solve two problems:
- Existing methods for session-based behavior prediction focus on only utilizing the same type of user behavior for prediction, but ignore the potential of taking other behavior data as auxiliary information.
- item-to-item relations are modeled separately and locally in one behavior sequence, and they lack a principled way to globally encode these relations more effectively.
So MGNN-SPred jointly considers target behavior and auxiliary behavior sequences and explores global item-to-item relations for accurate prediction.
Overall framework of the algorithm: (see the architecture figure in the paper)
Graph construction algorithm: (see the construction procedure in the paper)
Item Representation Learning:
- for each node, take its one-hot representation;
- convert each of them into a low-dimensional dense vector by a learnable embedding matrix.
- GNN forward propagation layer:
- four neighbor sets: “target-forward”, “target-backward”, “auxiliary-forward”, and “auxiliary-backward”;
- obtain the representation of each neighbor group;
- combine these four representations of different neighbor groups by sum-pooling;
- update the representation of the center node v.
After performing the iterations, the node representation of the last step is taken as the representation of the corresponding item.
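The neighbor-group aggregation above can be sketched as follows (numpy; mean-pooling within each group and the tanh update rule are simplified assumptions for illustration, not the paper's exact formulas):

```python
import numpy as np

def aggregate_groups(h_v, groups, W):
    """Combine the four neighbor groups (target/auxiliary x forward/backward):
    pool within each group, sum-pool across groups, then update node v."""
    summed = np.zeros(h_v.shape[0])
    for g in groups:                        # each g: array of neighbor vectors
        if len(g) > 0:
            summed += np.mean(g, axis=0)    # group representation (mean-pool)
    return np.tanh(W @ np.concatenate([h_v, summed]))  # simplified update

rng = np.random.default_rng(3)
d = 4
W = rng.normal(size=(d, 2 * d))             # hypothetical trainable weights
h_v = rng.normal(size=d)
groups = [rng.normal(size=(2, d)), rng.normal(size=(1, d)),
          np.empty((0, d)), np.empty((0, d))]  # two groups happen to be empty
print(aggregate_groups(h_v, groups, W).shape)  # (4,)
```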
Sequence Representation Learning:

It was found that simple mean-pooling already achieves performance comparable to the attention mechanism while retaining low complexity.
It is self-evident that the contribution of the auxiliary behavior sequence to next-item prediction differs between situations, so a gated mechanism is designed to calculate the relative importance weight:

The user preference representation is then the gated combination of the two sequence representations:

Model Prediction and Training:
A bi-linear matching scheme is employed by:

This yields a score for each item. To learn the parameters of the model, a softmax function is applied to normalize the scores over all items to get the probability distribution:

Backpropagation is adopted to optimize the model by minimizing the cross-entropy loss of the predicted probability distribution:

where the target vector denotes the one-hot representation of the ground truth.
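The bilinear matching, softmax normalization, and cross-entropy steps can be sketched together (numpy; the bilinear matrix B and the toy embeddings are hypothetical parameters):

```python
import numpy as np

def predict_and_loss(p_u, item_embs, B, target):
    """Score each item with a bilinear form s_i = p_u^T B e_i, softmax over
    all items, then cross-entropy against the ground-truth item index."""
    scores = item_embs @ (B @ p_u)          # bilinear matching scores
    scores -= scores.max()                  # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return probs, -np.log(probs[target])    # cross-entropy loss

rng = np.random.default_rng(4)
d, n_items = 4, 6
B = rng.normal(size=(d, d))                 # hypothetical bilinear parameter
p_u = rng.normal(size=d)                    # user preference representation
item_embs = rng.normal(size=(n_items, d))
probs, loss = predict_and_loss(p_u, item_embs, B, target=2)
print(round(float(probs.sum()), 6), loss > 0)  # 1.0 True
```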
Memory Augmented Graph Neural Networks for Sequential Recommendation(AAAI 2020)(MA-GNN)
This paper addresses sequential recommendation. Its main contributions:
- model the short-term user interests by GNN;
- capture the long-term user interests by memory units;
- interest fusion by a gating layer;
- item co-occurrence modeling.
Overall diagram of the algorithm: (see the model structure figure in the paper)
Model the short-term user interests by GNN:
First, sub-sequences are extracted from each sequence with a sliding-window strategy; then, within every sub-sequence, edges are added between items as illustrated in the paper's figure.
Short-term Interest Aggregation: a 2-layer GCN.

Long-term Interest Modeling
External memory units are used to store the user's long-term interests as they change over time; the items a user interacts with over time form a sequence.
A multi-dimensional attention model is first used to generate the query embedding:

where a sinusoidal positional encoding function maps the item positions into position embeddings.
Operations are then applied to the memory units.
Interest Fusion

The superscript C denotes the fusion of long- and short-term interests.
Item Co-occurrence Modeling
This term measures how closely the items interacted with in the short term relate to the target item:

Prediction and Training

