Loss Functions for Two-Tower Models in Recommender Systems

Author: _a_monkey_b_ | Published: 2021-04-18 10:36


1. Distance-based loss functions: Hinge Loss

When only one negative sample is drawn

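Triple input: loss(user, item+, item-). The label is already implicit in the triple, because by convention the second position always holds the positive sample. Here user·item is the inner-product score, so the loss reaches zero once the positive item outscores the negative item by at least the margin.
$L_{hinge} = \max(0,\ margin - user \cdot item_+ + user \cdot item_-)$
Pair input with a label y of 0 or 1: when y = 0, the larger the distance between user and item, the smaller the loss; when y = 1, the larger the distance, the larger the loss. Writing d(user, item) for that distance:
$L_{hinge} = y \cdot d(user, item) + (1-y) \cdot \max(0,\ margin - d(user, item))$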
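
To make the two input forms concrete, here is a minimal pure-Python sketch. The function names, the default margin of 1.0, and the choice of Euclidean distance for the pair form are assumptions made for illustration; the original does not fix any of them.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def triplet_hinge(user, item_pos, item_neg, margin=1.0):
    """Triple input: zero once the positive score beats the negative by `margin`."""
    return max(0.0, margin - dot(user, item_pos) + dot(user, item_neg))


def pairwise_hinge(user, item, y, margin=1.0):
    """Pair input: y=1 penalises distance, y=0 penalises closeness within `margin`."""
    d = euclidean(user, item)
    return y * d + (1 - y) * max(0.0, margin - d)


user = [1.0, 0.0]
print(triplet_hinge(user, [1.0, 0.0], [0.0, 1.0]))  # negative is far behind the positive
print(triplet_hinge(user, [1.0, 0.0], [0.5, 0.0]))  # negative is inside the margin
print(pairwise_hinge([0.0, 0.0], [3.0, 4.0], y=1))  # positive pair, far apart
print(pairwise_hinge([0.0, 0.0], [3.0, 4.0], y=0))  # negative pair, far apart
```

The triple form compares two scores directly, while the pair form needs the explicit label to decide which direction to push the distance.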

When multiple negative samples are drawn

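There are two options. The first is to pair the same (user, item+) with several item- when generating the input samples, so each training example is still a triple. The second is to change the input itself to neg + 2 elements (the user, the positive item, and neg negative items) and update the loss computation to match.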
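
The two options can be sketched as follows. `expand_triples` and `multi_negative_hinge` are hypothetical helper names, and averaging the per-negative hinge terms is an assumption (summing them or keeping only the hardest negative are equally plausible choices).

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def expand_triples(user, item_pos, item_negs):
    """Option 1: emit one (user, item+, item-) triple per sampled negative."""
    return [(user, item_pos, neg) for neg in item_negs]


def multi_negative_hinge(user, item_pos, item_negs, margin=1.0):
    """Option 2: one example carries all negatives; average the hinge terms."""
    pos_score = dot(user, item_pos)
    losses = [max(0.0, margin - pos_score + dot(user, neg)) for neg in item_negs]
    return sum(losses) / len(losses)


user, item_pos = [1.0, 0.0], [1.0, 0.0]
item_negs = [[0.0, 1.0], [0.5, 0.0], [2.0, 0.0]]

print(len(expand_triples(user, item_pos, item_negs)))   # one triple per negative
print(multi_negative_hinge(user, item_pos, item_negs))  # mean of the hinge terms
```

Option 1 leaves the model and the loss untouched at the cost of repeating the user and positive item; option 2 encodes them once per example but requires the loss to handle a variable number of negatives.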

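Because the loss is distance-based, the branches generally share network structure and parameters so that their outputs live in the same embedding space. Backpropagation then needs no special handling: gradients are obtained by ordinary differentiation, and because the W of each shared layer is used by every branch, it receives gradient contributions from all of them. If some lower layers are not shared, each branch is effectively a small private network whose output feeds into the shared structure, and each small network is updated by the gradient flowing back along its own path.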
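
A toy calculation shows what "updating by differentiation" means for a shared weight. This sketch assumes a one-parameter encoder f(x) = w * x and a squared-distance loss, both invented here for illustration; the point is only that the gradient of the shared w is the sum of the contributions from the two branches.

```python
def encode(w, x):
    """Toy shared encoder: a single weight applied to a scalar input."""
    return w * x


def loss(w, left, right):
    """Squared distance between the two encodings."""
    return (encode(w, left) - encode(w, right)) ** 2


def shared_weight_grad(w, left, right):
    """dL/dw as the sum of the gradient paths through each branch."""
    diff = encode(w, left) - encode(w, right)
    grad_via_left = 2 * diff * left        # path through the left branch
    grad_via_right = 2 * diff * (-right)   # path through the right branch
    return grad_via_left + grad_via_right


def numeric_grad(w, left, right, eps=1e-6):
    """Central finite difference, used only to check the analytic gradient."""
    return (loss(w + eps, left, right) - loss(w - eps, left, right)) / (2 * eps)


w, left, right = 0.5, 3.0, 1.0
print(shared_weight_grad(w, left, right))
print(numeric_grad(w, left, right))
```

Automatic differentiation frameworks perform this summation themselves when the same layer instance is called on several inputs, which is why no custom backward logic is needed.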

In the code below, both branches call the same basemodel instance, so they share network structure and parameters.

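from tensorflow.keras import backend as K
from tensorflow.keras.layers import Conv1D, Flatten, Input, Lambda, MaxPooling1D
from tensorflow.keras.models import Model


def build_model_1d(input_shape, filters):
    """
        Model architecture: a Siamese network whose two branches share one base model
    """
    # Define the input tensors for the two branches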
    left_inputs = Input(input_shape, name="left_inputs")
    right_inputs = Input(input_shape, name="right_inputs")

    # Convolutional Neural Network
    inputs = Input(input_shape, name="input")
    x = Conv1D(filters[0], 7, activation="elu", padding="same", name="conv1")(inputs)
    x = MaxPooling1D(pool_size=2, name="mp1")(x)
    x = Conv1D(filters[1], 5, activation="elu", padding="same", name="conv2")(x)
    x = MaxPooling1D(pool_size=2, name="mp2")(x)
    x = Conv1D(filters[2], 3, activation="elu", padding="same", name="conv3")(x)
    x = MaxPooling1D(pool_size=2, name="mp3")(x)
    x = Conv1D(filters[3], 3, activation="elu", padding="same", name="conv4")(x)
    x = MaxPooling1D(pool_size=2, name="mp5")(x)
    x = Flatten(name="flat")(x)

    # Wrap the convolutional stack as one model that produces the encoding (feature vector)
    basemodel = Model(inputs, x, name="basemodel")

    # using same instance of "basemodel" to share weights between left/right networks
    encoded_l = basemodel(left_inputs)
    encoded_r = basemodel(right_inputs)

    # Add a customized layer to compute the Euclidean distance between the encodings
    # (k_euclidean_dist is a helper whose definition is not shown in this post)
    distance_layer = Lambda(k_euclidean_dist, name="distance")([encoded_l, encoded_r])

    siamese_net = Model(inputs=[left_inputs, right_inputs], outputs=distance_layer)

    # return the model
    return siamese_net, basemodel


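def k_contrastive_loss(y_true, dist):
    """Contrastive loss from Hadsell-et-al.'06
    http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf

    y_true is 1 for similar pairs and 0 for dissimilar pairs;
    dist is the distance output by the Siamese network.
    """
    margin = P.margin  # P is a parameter object defined elsewhere (not shown in this post)
    # Similar pairs pay dist^2; dissimilar pairs pay only while dist < margin
    return K.mean(y_true * K.square(dist) + (1 - y_true) * K.square(K.maximum(margin - dist, 0)))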
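
To see how the contrastive loss behaves on concrete numbers, the following pure-Python sketch mirrors the same formula without Keras. The function name `contrastive_loss`, the margin of 1.0, and the sample labels and distances are made up for illustration.

```python
def contrastive_loss(labels, dists, margin=1.0):
    """Mean of y * d^2 + (1 - y) * max(margin - d, 0)^2 over all pairs."""
    terms = [
        y * d ** 2 + (1 - y) * max(margin - d, 0.0) ** 2
        for y, d in zip(labels, dists)
    ]
    return sum(terms) / len(terms)


labels = [1, 0, 0]         # 1 = similar pair, 0 = dissimilar pair
dists = [0.5, 0.5, 2.0]    # distances produced by the network

# Pair 1 (similar, d=0.5) contributes 0.25.
# Pair 2 (dissimilar, d=0.5 < margin) contributes (1.0 - 0.5)^2 = 0.25.
# Pair 3 (dissimilar, d=2.0 >= margin) contributes 0.
print(contrastive_loss(labels, dists))
```

Dissimilar pairs that are already farther apart than the margin contribute nothing, so training effort concentrates on the pairs that still violate it.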