https://arxiv.org/pdf/1703.04247v1.pdf
2 Approach
1 Feature
Training data: (x, y) pairs, where x collects the fields of a record and y ∈ {0, 1} indicates whether the user clicked
each categorical field is represented as a one-hot vector
each continuous field is represented by its raw value, or by a one-hot vector after discretization
the resulting x is high-dimensional and extremely sparse
ŷ = sigmoid(y_FM + y_DNN)
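A minimal sketch of this encoding, with hypothetical fields (gender, city, age) chosen purely for illustration:

```python
import numpy as np

# Hypothetical fields: gender (categorical, 2 values),
# city (categorical, 3 values), age (continuous, kept raw).
def encode(gender_idx, city_idx, age):
    gender = np.zeros(2)
    gender[gender_idx] = 1.0
    city = np.zeros(3)
    city[city_idx] = 1.0
    # concatenating the per-field blocks gives x
    return np.concatenate([gender, city, [age]])

x = encode(gender_idx=1, city_idx=0, age=0.3)
# x = [0., 1., 1., 0., 0., 0.3]; with realistic vocabulary
# sizes (millions of ids) x is almost entirely zeros
```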
2 DeepFM
- FM Component
- Deep Component
because the data is too high-dimensional and sparse, an embedding layer is needed to compress it; the lengths of different input field vectors can differ, but their embeddings all have the same size k
- the latent feature vectors of the FM serve as network weights that compress the input field vectors into the embeddings; the FM and deep components share them and are trained jointly, eliminating the need for pre-training (minimal sketch below)
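A minimal PyTorch sketch of this shared-embedding design (not the paper's implementation; it assumes every field is categorical, and `field_sizes`, `k`, and the single 400-unit hidden layer are illustrative choices):

```python
import torch
import torch.nn as nn

class DeepFMSketch(nn.Module):
    """Sketch of DeepFM: FM and deep parts share one embedding table."""
    def __init__(self, field_sizes, k=10):
        super().__init__()
        # offsets turn per-field category indices into global feature ids
        offsets = [0]
        for s in field_sizes[:-1]:
            offsets.append(offsets[-1] + s)
        self.register_buffer("offsets", torch.tensor(offsets))
        n = sum(field_sizes)
        self.linear = nn.Embedding(n, 1)   # first-order weights w
        self.embed = nn.Embedding(n, k)    # latent vectors V, shared
        self.deep = nn.Sequential(         # deep component on the embeddings
            nn.Linear(len(field_sizes) * k, 400), nn.ReLU(),
            nn.Linear(400, 1))

    def forward(self, idx):                # idx: (batch, n_fields)
        idx = idx + self.offsets
        e = self.embed(idx)                # (batch, n_fields, k)
        # second-order FM term via the (sum^2 - sum-of-squares) identity
        s = e.sum(dim=1)
        pair = 0.5 * (s * s - (e * e).sum(dim=1)).sum(dim=1)
        y_fm = self.linear(idx).sum(dim=(1, 2)) + pair
        y_dnn = self.deep(e.flatten(1)).squeeze(1)
        return torch.sigmoid(y_fm + y_dnn)
```

Because `self.embed` feeds both y_FM and y_DNN, the latent vectors are learned end-to-end; nothing is initialized from a separately trained FM.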
3 Differences from other NNs
FNN is an FM-initialized feedforward neural network
limitations
- the embedding parameters might be over-affected by the FM initialization
- efficiency is reduced by the overhead introduced by the pre-training stage
- it only captures high-order feature interactions
DeepFM
Needs no pre-training and learns both high- and low-order feature interactions
PNN
adds a product layer between the embedding layer and the first hidden layer; like FNN, it ignores low-order feature interactions
Wide & Deep
requires expert feature engineering on the input to the "wide" part
Parameter settings
- dropout: 0.5
- network structure: 400-400-400
- optimizer: Adam
- activation function: tanh for IPNN, ReLU for the other deep models
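For concreteness, a hedged sketch of a deep component wired with these settings (`input_dim` is a placeholder for the concatenated embedding size):

```python
import torch.nn as nn

def make_deep(input_dim):
    # three 400-unit hidden layers, ReLU, dropout 0.5, as listed above
    layers, width = [], input_dim
    for _ in range(3):
        layers += [nn.Linear(width, 400), nn.ReLU(), nn.Dropout(0.5)]
        width = 400
    layers.append(nn.Linear(400, 1))
    return nn.Sequential(*layers)

# optimizer choice from the settings above:
# opt = torch.optim.Adam(model.parameters())
```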