Notes: Wide & Deep Learning for Recommender Systems

Ever since I came across a figure a few days ago, I have wanted to read the paper behind it.
Over the last couple of days I finally had time to read it: Wide & Deep Learning for Recommender Systems.

First, the introduction, which mainly explains what Wide and Deep mean:
Wide: an LR (logistic regression) model over high-dimensional features plus feature crosses. From the paper: "Generalized linear models with nonlinear feature transformations are widely used for large-scale regression and classification problems with sparse inputs. Memorization of feature interactions through a wide set of cross-product feature transformations are effective and interpretable, while generalization requires more feature engineering effort."
Deep: a deep neural network. From the paper: "With less feature engineering, deep neural networks can generalize better to unseen feature combinations through low-dimensional dense embeddings learned for the sparse features. However, deep neural networks with embeddings can over-generalize and recommend less relevant items when the user-item interactions are sparse and high-rank."
The paper then describes how to combine Wide and Deep.
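
To make the combination concrete for myself, here is a minimal sketch of how I understand the prediction: the wide logit (raw binary features plus cross-product features) and the deep logit (embeddings fed through a hidden layer) are added together and passed through a single sigmoid. All function names, weights, and the single hidden layer are my own assumptions for illustration, not the paper's implementation.

```python
import math

def cross_product(x, members):
    """Return 1 only when every binary feature indexed by `members` is 1."""
    return int(all(x[i] == 1 for i in members))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def wide_logit(x, crosses, w_wide):
    # Wide input: the raw binary features followed by their cross-product features.
    features = list(x) + [cross_product(x, c) for c in crosses]
    return sum(w * f for w, f in zip(w_wide, features))

def deep_logit(category_ids, embedding_tables, hidden_w, hidden_b, w_deep):
    # Look up one dense embedding per categorical field and concatenate them.
    a = [v for table, idx in zip(embedding_tables, category_ids) for v in table[idx]]
    # One ReLU hidden layer only (an assumption to keep the sketch short).
    a = [max(0.0, sum(w * t for w, t in zip(row, a)) + b)
         for row, b in zip(hidden_w, hidden_b)]
    return sum(w * t for w, t in zip(w_deep, a))

def predict(x, crosses, w_wide, category_ids, embedding_tables,
            hidden_w, hidden_b, w_deep, bias):
    # The two logits are summed before one sigmoid, so a single loss sees both parts.
    return sigmoid(wide_logit(x, crosses, w_wide)
                   + deep_logit(category_ids, embedding_tables,
                                hidden_w, hidden_b, w_deep)
                   + bias)

# Toy, made-up numbers: three binary features and two crosses on the wide side,
# one categorical field with a 2-dimensional embedding on the deep side.
x = [1, 1, 0]
crosses = [(0, 1), (0, 2)]
p = predict(x, crosses, w_wide=[0.1, 0.0, 0.0, 0.5, 0.3],
            category_ids=[1], embedding_tables=[[[0.0, 0.0], [1.0, -1.0]]],
            hidden_w=[[1.0, 0.0], [0.0, 1.0]], hidden_b=[0.0, 0.0],
            w_deep=[0.2, 0.2], bias=0.0)
```

In this toy example the cross (0, 1) fires because both features are 1, while (0, 2) does not. That is the "memorization" side; the embedding lookup is what lets the deep side generalize to combinations it has not seen.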

Main content:
Two interesting concepts, Memorization and Generalization:
Memorization can be loosely defined as learning the frequent co-occurrence of items or features and exploiting the correlation available in the historical data.
Generalization, on the other hand, is based on transitivity of correlation and explores new feature combinations that have never or rarely occurred in the past.

A review of the LR and deep learning approaches.

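A description of their practice, with some details:
The objective is app acquisitions.
Joint training versus ensemble: in an ensemble the models are trained disjointly, whereas joint training optimizes the whole model together.
For training, the LR part uses FTRL with L1 regularization, and the deep part uses AdaGrad, if I read it correctly.
The training data has 500 billion examples. How do they even arrive at that number? Impressive.
Continuous values are first normalized to [0,1] using the cumulative distribution function (CDF) and then bucketed into discrete bins. This is a nice trick.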
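
The CDF trick is easy to try out. Below is a minimal sketch as I understand it: estimate quantile boundaries from a sample, find which quantile i (1-based) a value falls into, and use (i - 1) / (n_q - 1) as the normalized value. The helper names and the simple way of picking boundaries from a sorted sample are my own assumptions, and n_q must be at least 2.

```python
import bisect

def quantile_boundaries(sample, n_q):
    """Return the n_q - 1 interior boundaries that split `sample` into n_q quantiles."""
    ordered = sorted(sample)
    return [ordered[(k * len(ordered)) // n_q] for k in range(1, n_q)]

def normalize(value, boundaries):
    """Map a value to (i - 1) / (n_q - 1), where i is its 1-based quantile index."""
    n_q = len(boundaries) + 1
    i = bisect.bisect_right(boundaries, value) + 1
    return (i - 1) / (n_q - 1)

# Made-up, heavily skewed sample: the outlier 1000 does not stretch the scale.
sample = [1, 2, 3, 4, 5, 6, 7, 1000]
boundaries = quantile_boundaries(sample, n_q=4)
normalized = [normalize(v, boundaries) for v in sample]
```

With plain min-max scaling, the outlier would squash every other value close to 0. With quantiles, the values spread evenly over [0,1], which is probably why the trick works well for skewed features.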

The paper is short and quite interesting; it is worth reading closely.