A Collection of Papers on Addressing Bias from Top Conferences, 2020-2021

Posted by 白水baishui

1. Bias Analysis

(1) Bias-Variance Decomposition for Ranking. WSDM 2021;

(2) Transfer Learning in Collaborative Recommendation for Bias Reduction. RecSys 2021;
code: https://csse.szu.edu.cn/staff/panwk/publications/TJR/ .

2. Data Bias

2.1. Selection Bias

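Explicit user feedback is usually sparse, and only items the user has clicked can receive explicit feedback. Take ratings as an example: when a user clicks an item, the user is probably already fairly interested in it, so the rating tends to be high; if the item falls short of expectations, the rating may instead be low. In both cases the feedback suffers from selection bias. The distribution of the observed ratings differs considerably from the rating distribution over all items. This amounts to non-random sampling, so the observed data cannot estimate the population well.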
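As a rough illustration of the sampling argument above, the following sketch simulates ratings whose chance of being observed grows with the rating value, then compares the naive mean of the observed ratings with a self-normalized inverse propensity scoring (IPS) estimate. The propensity table, the function name `simulate_selection_bias`, and the uniform rating distribution are all assumptions made for illustration; they are not taken from any of the papers listed in this section, and in practice propensities are unknown and have to be estimated.

```python
import random


def simulate_selection_bias(n_items=20000, seed=0):
    """Compare the true mean rating with naive and IPS-weighted estimates."""
    rng = random.Random(seed)

    # Hypothetical observation propensities (assumption): the higher the
    # rating, the more likely the user is to leave explicit feedback.
    propensity = {1: 0.05, 2: 0.1, 3: 0.2, 4: 0.4, 5: 0.6}

    # True ratings over the whole population, drawn uniformly from 1..5.
    true_ratings = [rng.randint(1, 5) for _ in range(n_items)]

    # Only a non-random subset of the ratings is actually observed.
    observed = [r for r in true_ratings if rng.random() < propensity[r]]

    true_mean = sum(true_ratings) / n_items
    naive_mean = sum(observed) / len(observed)

    # Self-normalized IPS: weight each observed rating by 1 / propensity.
    weights = [1.0 / propensity[r] for r in observed]
    ips_mean = sum(w * r for w, r in zip(weights, observed)) / sum(weights)

    return true_mean, naive_mean, ips_mean


if __name__ == "__main__":
    true_mean, naive_mean, ips_mean = simulate_selection_bias()
    print(f"true={true_mean:.3f} naive={naive_mean:.3f} ips={ips_mean:.3f}")
```

Under these assumed propensities the naive mean lands well above the true mean, while the reweighted estimate moves back toward it. The correction is only as good as the propensities, which is one reason several of the papers below rely on a small set of unbiased ratings.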

(1) Measuring and Mitigating Item Under-Recommendation Bias in Personalized Ranking Systems. SIGIR 2020;

(2) E-commerce Recommendation with Weighted Expected Utility. CIKM 2020;
code: https://github.com/zhichaoxu-shufe/E-commerce-Rec-with-WEU.

(3) Combating Selection Biases in Recommender Systems with a Few Unbiased Ratings. WSDM 2021;

(4) Non-Clicks Mean Irrelevant? Propensity Ratio Scoring As a Correction. WSDM 2021;

(5) Mitigating Confounding Bias in Recommendation via Information Bottleneck. RecSys 2021;
code: https://github.com/dgliu/RecSys21_DIB .

(6) Pessimistic Reward Models for Off-Policy Learning in Recommendation. RecSys 2021.

2.2. Conformity Bias

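People are social animals, and this holds in recommender systems too. Taking ratings as the example again, a user's rating is easily influenced by the crowd or by friends: a movie that everyone praises may not seem that good to you, but you go along with the majority to avoid standing out. As a result, the sample data fails to reflect users' true preferences.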

(1) Debiasing Item-to-Item Recommendations With Small Annotated Datasets. RecSys 2020;
code: https://github.com/microsoft/debiasing-item2item .

2.3. Exposure Bias

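Compared with the full item index, the results a recommender system returns are extremely limited, and user feedback is produced only on these exposed items. For the far larger set of unexposed items there is no feedback data, so the model cannot learn their distribution well. The model is trained on the space of exposed samples but makes predictions over the full sample space, and this mismatch is exposure bias. Exposure bias is unfriendly to new items, because the model tends to give old items higher scores.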

(1) Unbiased Learning for the Causal Effect of Recommendation. RecSys 2020;
code: https://www.dunnhumby.com/source-files/.

(2) Debiased Explainable Pairwise Ranking from Implicit Feedback. RecSys 2021;
code: https://github.com/KhalilDMK/EBPR .

(3) Top-K Contextual Bandits with Equity of Exposure. RecSys 2021;
code: https://github.com/deezer/carousel_bandits .

2.4. Position Bias

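The position of a recommended item also affects its click-through rate. Because users trust the platform's recommendations and because of their browsing habits, videos placed near the top are more likely to be clicked, but that does not mean the user actually likes them.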
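The effect described above can be sketched with a simple position-based click model, in which a click happens only if the user examines the slot and the item is relevant. The example below is a minimal simulation under that assumption, followed by an inverse propensity weighting correction. The examination probabilities in `EXAMINE_PROB`, the two relevance values, and all function names are hypothetical; real systems have to estimate examination probabilities, and this is not the method of any specific paper listed here.

```python
import random

# Hypothetical probability that a user examines each display position
# (assumption: treated as known here; in practice it must be estimated).
EXAMINE_PROB = {1: 1.0, 2: 0.7, 3: 0.5, 4: 0.4, 5: 0.3}


def simulate_clicks(relevance, position, n_impressions, rng):
    """Count clicks when a click requires examination AND relevance."""
    clicks = 0
    for _ in range(n_impressions):
        examined = rng.random() < EXAMINE_PROB[position]
        if examined and rng.random() < relevance:
            clicks += 1
    return clicks


def ipw_relevance(clicks, n_impressions, position):
    """Divide the raw click rate by the examination probability of the slot."""
    return clicks / (n_impressions * EXAMINE_PROB[position])


def run_position_bias_demo(seed=42, n=50000):
    rng = random.Random(seed)
    # Item A: less relevant (0.3) but always shown in the top slot.
    clicks_a = simulate_clicks(0.3, 1, n, rng)
    # Item B: more relevant (0.6) but always shown in slot 5.
    clicks_b = simulate_clicks(0.6, 5, n, rng)
    return {
        "raw_ctr_a": clicks_a / n,
        "raw_ctr_b": clicks_b / n,
        "ipw_a": ipw_relevance(clicks_a, n, 1),
        "ipw_b": ipw_relevance(clicks_b, n, 5),
    }


if __name__ == "__main__":
    for name, value in run_position_bias_demo().items():
        print(f"{name}: {value:.3f}")
```

In this toy setup the raw click-through rate ranks item A above item B purely because of its slot, while the reweighted estimates recover the ordering of the assumed relevance values.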

(1) Attribute-based Propensity for Unbiased Learning in Recommender Systems: Algorithm and Case Studies. KDD 2020;

(2) Unbiased Ad Click Prediction for Position-aware Advertising Systems. RecSys 2020;

(3) Unbiased Learning to Rank in Feeds Recommendation. WSDM 2021;
code: https://github.com/flamewei123/Unbaised-LTR-in-Feeds-Recommendation-WSDM21 .

(4) Cross-Positional Attention for Debiasing Clicks. WWW 2021.

3. Model Bias

3.1. Inductive Bias

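Inductive bias comes from the model: when defining a model, assumptions are introduced by hand to simplify the problem, and these can cause generalization error at prediction time.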

(1) A General Knowledge Distillation Framework for Counterfactual Recommendation via Uniform Data. SIGIR 2020;
code: https://github.com/dgliu/SIGIR20_KDCRec .

(2) Counterfactual Evaluation of Slate Recommendations with Sequential Reward Interactions. KDD 2020;
code: https://github.com/spotify-research/RIPS_KDD2020 .

4. Bias and Unfairness in Recommendation Results

4.1. Popularity Bias

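That is, the long-tail effect. In the content a recommender system distributes, exposure is typically concentrated heavily on head items, so the sample distribution is uneven. Popular items have more samples, which in turn leads the model to give popular items high scores, and they end up being distributed far more often than other items. A common remedy is to down-sample popular items to alleviate the long-tail effect.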
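The down-sampling remedy mentioned above can be sketched as follows. This is one simple variant, assuming the training log is a list of `(user, item)` pairs and that each item is capped at a fixed number of interactions. The function names `downsample_popular` and `item_counts`, the `max_per_item` cap, and the toy log are hypothetical and chosen only for illustration; other schemes, such as keeping each interaction with a frequency-dependent probability, are equally plausible.

```python
import random
from collections import Counter, defaultdict


def downsample_popular(interactions, max_per_item, seed=0):
    """Keep at most `max_per_item` randomly chosen interactions per item."""
    rng = random.Random(seed)

    # Group the (user, item) pairs by item.
    by_item = defaultdict(list)
    for user, item in interactions:
        by_item[item].append((user, item))

    kept = []
    for rows in by_item.values():
        # Only items above the cap are sampled; tail items are kept whole.
        if len(rows) > max_per_item:
            rows = rng.sample(rows, max_per_item)
        kept.extend(rows)
    return kept


def item_counts(interactions):
    """Number of interactions per item."""
    return Counter(item for _, item in interactions)


if __name__ == "__main__":
    # Toy log: one head item with 1000 interactions, one tail item with 10.
    toy_log = [(f"u{i}", "hit") for i in range(1000)]
    toy_log += [(f"u{i}", "niche") for i in range(10)]
    balanced = downsample_popular(toy_log, max_per_item=50)
    print(item_counts(toy_log))
    print(item_counts(balanced))
```

Capping discards data about head items, so the cap trades popularity skew against information loss and would normally be tuned on a validation set.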

(1) Keeping Dataset Biases out of the Simulation: A Debiased Simulator for Reinforcement Learning based Recommender Systems. RecSys 2020;
code: https://github.com/BetsyHJ/SOFA .

(2) Popularity-Opportunity Bias in Collaborative Filtering. WSDM 2021;

(3) Diverse User Preference Elicitation with Multi-Armed Bandits. WSDM 2021;

(4) ProtoCF: Prototypical Collaborative Filtering for Few-shot Item Recommendation. RecSys 2021;
code: https://github.com/aravindsankar28/ProtoCF .

(5) Analyzing Item Popularity Bias of Music Recommender Systems: Are Different Genders Equally Affected? RecSys 2021;

(6) The Idiosyncratic Effects of Adversarial Training on Bias in Personalized Recommendation Learning. RecSys 2021;
code: https://github.com/sisinflab/The-Idiosyncratic-Effects-of-Adversarial-Training .

(7) Biases in Recommendation System. RecSys 2021.

4.2. Unfairness (User Bias)

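Ethical issues in artificial intelligence have attracted considerable attention in recent years, for example models that discriminate by race, gender, or age, which at root is caused by unevenly distributed samples. Unfairness in recommender systems originates in the data: when the data is not diverse or sufficient enough, the model cannot learn adequately and produces biased predictions.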

(1) Debiasing Career Recommendations with Neural Fair Collaborative Filtering. WWW 2021;
code: https://github.com/rashid-islam/nfcf .

(2) User Bias in Beyond-Accuracy Measurement of Recommendation Algorithms. RecSys 2021;

(3) Measuring and Mitigating Bias and Harm in Personalized Advertising. RecSys 2021;

(4) I Want to Break Free! Recommending Friends from Outside the Echo Chamber. RecSys 2021;

(5) Leave No User Behind: Towards Improving the Utility of Recommender Systems for Non-mainstream Users. WSDM 2021;
code: https://github.com/roger-zhe-li/wsdm21-mainstream .

5. Paper Download

All of the papers mentioned above can be downloaded from 2020-2021顶会关于推荐系统中的解决偏差(bias)问题的文献汇总.zip.
