Countering Mainstream Bias via End-to-End Adaptive Local Learning: Hyper-parameter Study

Written by mediabias | Published 2024/08/21
Tech Story Tags: mainstream-bias | collaborative-filtering | adaptive-local-learning | discrepancy-modeling | unsynchronized-learning | rawlsian-max-min-fairness | mixture-of-experts | loss-driven-models


Table of Links

Abstract and 1 Introduction

2 Preliminaries

3 End-to-End Adaptive Local Learning

3.1 Loss-Driven Mixture-of-Experts

3.2 Synchronized Learning via Adaptive Weight

4 Debiasing Experiments and 4.1 Experimental Setup

4.2 Debiasing Performance

4.3 Ablation Study

4.4 Effect of the Adaptive Weight Module and 4.5 Hyper-parameter Study

5 Related Work

6 Conclusion, Acknowledgements, and References

4.4 Effect of the Adaptive Weight Module

Lastly, we turn our attention to the effect of the adaptive weight module, studying how it synchronizes the learning paces of different users. We run TALL on the ML1M dataset and present the average weights for the five subgroups with the gap window (#gap = 40) in Figure 3. We observe that the adaptive weight module dynamically assigns weights to different types of users to synchronize their learning paces. Initially, mainstream users receive higher weights because they are easier to learn and have a higher performance upper bound than niche users. Then, once mainstream users reach their peak, the model shifts its attention to niche users, who are harder to learn, gradually increasing the weights for 'low', 'med-low', and 'medium' users until the end of training. In contrast, 'med-high' and 'high' users, which are approaching convergence, need a slower learning pace to avoid overfitting, so their weights decrease. Figure 3 illustrates the effectiveness and dynamic nature of the proposed adaptive weight module in synchronizing the learning procedures of different types of users.
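The dynamic described above can be sketched in code. The following is a minimal illustration, not the authors' implementation (the actual adaptive weight formula is defined in Section 3.2): it assumes per-user losses are tracked during training, compares each user's current loss against a reference from `gap` steps earlier, and up-weights users who have improved the least, with `alpha` acting as a temperature.

```python
import numpy as np

def adaptive_weights(loss_history, step, gap=40, alpha=1.0):
    """Illustrative adaptive weighting (hypothetical sketch).

    Users whose loss has improved the least over the last `gap` steps
    (i.e., harder-to-learn users) receive larger weights, so their
    learning pace is boosted while nearly converged users slow down.

    loss_history: array of shape (steps, n_users) with per-user losses.
    step: current training step (index into loss_history).
    gap: the gap window; the reference loss is taken `gap` steps back.
    alpha: temperature controlling how sharply weights concentrate.
    """
    current = loss_history[step]
    # Reference loss from `gap` steps earlier, clamped to the start.
    reference = loss_history[max(step - gap, 0)]
    # Relative improvement: large improvement -> user is learning fast.
    progress = (reference - current) / np.maximum(reference, 1e-8)
    # Slow-improving users get higher scores; softmax-normalize.
    scores = -alpha * progress
    weights = np.exp(scores - scores.max())
    return weights / weights.sum()
```

For example, with two users whose losses start at 1.0 and drop to 0.2 and 0.8 respectively, the second (slower) user receives the larger weight, mirroring how TALL gradually raises the weights of niche users while mainstream users approach convergence.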

4.5 Hyper-parameter Study

Additionally, we conducted a comprehensive hyper-parameter study investigating the impact of three hyper-parameters in TALL: (1) the gap window in the adaptive weight module; (2) α in the adaptive weight module; and (3) the number of experts. The complete results are available at https://github.com/JP-25/end-To-end-Adaptive-Local-Leanring-TALL-/blob/main/Hyperparameter Study.pdf.

Authors:

(1) Jinhao Pan [0009-0006-1574-6376], Texas A&M University, College Station, TX, USA;

(2) Ziwei Zhu [0000-0002-3990-4774], George Mason University, Fairfax, VA, USA;

(3) Jianling Wang [0000-0001-9916-0976], Texas A&M University, College Station, TX, USA;

(4) Allen Lin [0000-0003-0980-4323], Texas A&M University, College Station, TX, USA;

(5) James Caverlee [0000-0001-8350-8528], Texas A&M University, College Station, TX, USA.


This paper is available on arxiv under CC BY 4.0 DEED license.

