The following are my study notes on collaborative-filtering recommender systems.
- Formula
[Image: image.png]
- Logic diagram
[Image: image.png]

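- Understanding the principle
- Starting from users' ratings of items, the model factorizes out two things: features describing which item types each user is interested in, and each item's score on those item types. For example, suppose movies are split into an action type and a romance type, and a given movie scores 9 on the action type and 1 on the romance type. In the same way, a given user scores 1 for action movies and 9 for romance movies. I think of these as item-to-item-type features and user-to-item-type features.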
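To make the example above concrete, here is a minimal sketch of how such features could be combined into a predicted score, assuming the score is simply the dot product of the user's and the item's type vectors. The action movie and the user take their numbers from the example; the second movie (`romance_movie`) and all names are hypothetical, added only so there is something to rank. Real models such as LightFM usually also learn bias terms, which are left out here.

```python
# Hypothetical latent factors, ordered as [action, romance].
# "action_movie" and "romance_fan" use the numbers from the notes;
# "romance_movie" is an assumed second item, added for comparison.
item_factors = {
    "action_movie": [9.0, 1.0],
    "romance_movie": [1.0, 9.0],
}
user_factors = {
    "romance_fan": [1.0, 9.0],
}


def predict_score(user_vec, item_vec):
    """Predicted affinity: dot product of the two factor vectors."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


def rank_items(user_vec, items):
    """Return item names sorted from highest to lowest predicted score."""
    return sorted(
        items,
        key=lambda name: predict_score(user_vec, items[name]),
        reverse=True,
    )


user_vec = user_factors["romance_fan"]
for name, item_vec in item_factors.items():
    print(name, predict_score(user_vec, item_vec))
print(rank_items(user_vec, item_factors))
```

With these numbers the user who prefers romance gets a much higher score for the romance movie (1*1 + 9*9 = 82) than for the action movie (1*9 + 9*1 = 18), so the romance movie is ranked first.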
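- Using LightFM
- Using LightFM is fairly simple here: you give it the users' movie rating data, and LightFM automatically computes each user's score for the different items.
- The following code is pasted from the official LightFM site.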
from lightfm import LightFM
from lightfm.datasets import fetch_movielens
from lightfm.evaluation import precision_at_k
import numpy as np

# Load the MovieLens 100k dataset. Only five
# star ratings are treated as positive.
data = fetch_movielens(data_home='./data', min_rating=5.0)
print(data['train'])

# Instantiate and train the model
model = LightFM(loss='warp')
model.fit(data['train'], epochs=30, num_threads=2)

# Evaluate the trained model
train_precision = precision_at_k(model, data['train'], k=5).mean()
test_precision = precision_at_k(model, data['test'], k=5).mean()
print("Train precision: %.2f" % train_precision)
print("Test precision: %.2f" % test_precision)


def sample_recommendation(model, data, user_ids):
    n_users, n_items = data['train'].shape
    # Convert to CSR once so that individual user rows can be indexed
    train = data['train'].tocsr()

    for user_id in user_ids:
        # Titles of the items this user rated positively in the training set
        known_positives = data['item_labels'][train[user_id].indices]

        # Debug output: the full matrix, this user's row, and its item indices
        print(train)
        print(train[user_id])
        print(train[user_id].indices)

        # Score every item for this user, then sort from highest to lowest
        scores = model.predict(user_id, np.arange(n_items))
        top_items = data['item_labels'][np.argsort(-scores)]

        print("User %s" % user_id)
        print(" Known positives:")
        for x in known_positives[:3]:
            print(" %s" % x)

        print(" Recommended:")
        for x in top_items[:3]:
            print(" %s" % x)


sample_recommendation(model, data, [3, 25, 450])
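The script above reports `precision_at_k(...).mean()` without saying what the number means. As a rough illustration, the sketch below computes precision at k in plain Python: for each user, the fraction of the k highest-scored items that are known positives, averaged over users. This is my own simplified version, not LightFM's implementation; the function names and the toy scores are hypothetical.

```python
def precision_at_k_single_user(scores, positives, k):
    """Fraction of the k highest-scored item indices found in `positives`."""
    # Item indices sorted from highest to lowest score
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = ranked[:k]
    hits = sum(1 for i in top_k if i in positives)
    return hits / k


def mean_precision_at_k(all_scores, all_positives, k):
    """Average the per-user precision at k over all users."""
    per_user = [
        precision_at_k_single_user(scores, positives, k)
        for scores, positives in zip(all_scores, all_positives)
    ]
    return sum(per_user) / len(per_user)


# Toy data (assumed): two users, five items each.
all_scores = [
    [0.9, 0.1, 0.8, 0.4, 0.7],  # user 0: top 3 items are 0, 2, 4
    [0.2, 0.9, 0.1, 0.3, 0.5],  # user 1: top 3 items are 1, 4, 3
]
all_positives = [
    {0, 4},  # user 0 liked items 0 and 4, so 2 of the top 3 are hits
    {2},     # user 1 liked item 2, which is not in the top 3
]

print(mean_precision_at_k(all_scores, all_positives, k=3))
```

A test precision well below the train precision, as this kind of comparison can show, suggests the model ranks already-seen positives better than held-out ones.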
- GitHub location of the corresponding source code and the required data
https://github.com/wengmingdong/tf2-stu/tree/master/recommender