
Deep Mutual Learning Optimizes Multi-Task Recommender Systems with Cross Task Feature Mining

by KMeans, January 23rd, 2025

Too Long; Didn't Read

This paper presents Deep Mutual Learning (DML), a framework for multi-task recommender systems. It improves ranking by introducing Cross Task Feature Mining (CTFM) to share task-specific inputs while ensuring task awareness. The design avoids negative task interactions, enhances predictions, and encourages effective knowledge transfer via scaled dot-product attention and task embeddings.

Abstract and 1 Introduction

  2. Methodology
  3. Experiments
  4. Conclusion and References

2. METHODOLOGY

In this section, we first introduce the problem of multi-objective ranking for recommender systems, then describe the overall design of DML, and finally elaborate on the introduced components.

2.1 Multi-Objective Ranking for Recsys

2.2 Overall Design of DML

Please refer to Figure 1 for the overall network architecture of DML. With existing MTL algorithms [25], Equation (1) can be further decomposed as below. For simplicity, we omit the subscript 𝑛 in this section.
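As a rough, non-authoritative illustration of the shared-bottom-plus-task-towers structure that such an MTL decomposition typically describes, here is a minimal PyTorch-style sketch; the class name, hidden sizes, per-task projections, and the two-task setup are assumptions and are not the paper's exact architecture.

```python
# Illustrative sketch only: a generic shared-bottom MTL ranker in which a
# shared network G produces a task-specific input for each task tower.
# Class names, dimensions, and the two-task setup are assumptions.
import torch
import torch.nn as nn


class SharedBottomMTL(nn.Module):
    def __init__(self, input_dim: int, hidden_dim: int = 128, num_tasks: int = 2):
        super().__init__()
        # Lower-level shared network G.
        self.shared_bottom = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        # Per-task projections producing each tower's task-specific input.
        self.task_inputs = nn.ModuleList(
            nn.Linear(hidden_dim, hidden_dim) for _ in range(num_tasks)
        )
        # Upper-level task towers, one prediction head per objective
        # (e.g. click and conversion).
        self.towers = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden_dim, hidden_dim // 2), nn.ReLU(),
                          nn.Linear(hidden_dim // 2, 1))
            for _ in range(num_tasks)
        )

    def forward(self, x: torch.Tensor) -> list[torch.Tensor]:
        shared = self.shared_bottom(x)                        # output of G
        tower_inputs = [proj(shared) for proj in self.task_inputs]
        return [tower(inp) for tower, inp in zip(self.towers, tower_inputs)]


# Usage: predictions for a batch of 4 samples with 32 raw features.
model = SharedBottomMTL(input_dim=32)
task_logits = model(torch.randn(4, 32))
```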



In this research, rather than 𝐺, we focus on enhancing the upper-level networks for improved prediction performance. First, the shared component of CTFM is introduced, which leverages an attention mechanism to extract relevant information from the inputs of the other task towers (the results of Equation (3)) as a complement to the target task. Note that this attention is carefully designed to address the missing-task-awareness issue, where excessive encouragement of knowledge sharing hinders the extraction of task-specific knowledge. With our design, the gradients computed from the target task's loss do not affect the inputs of the other task towers.
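As a concrete but hedged sketch of this idea, the module below applies scaled dot-product attention from the target task's tower input, augmented with a learnable task embedding for task awareness, over the detached tower inputs of the other tasks, so that no gradient from the target task's loss reaches them. All names and shapes are assumptions rather than the paper's exact CTFM; such a module could sit between the per-task tower inputs and the task towers of a model like the one sketched above.

```python
# Minimal sketch of cross-task attention with stop-gradient on the other
# tasks' inputs and a task embedding for task awareness. Names/shapes are
# illustrative assumptions, not the paper's exact CTFM.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossTaskAttention(nn.Module):
    def __init__(self, dim: int, num_tasks: int):
        super().__init__()
        # Task embeddings keep the query aware of which task it serves.
        self.task_emb = nn.Embedding(num_tasks, dim)
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)

    def forward(self, task_id: int, task_inputs: list[torch.Tensor]) -> torch.Tensor:
        """task_inputs[i]: (batch, dim) tower input of task i."""
        target = task_inputs[task_id]
        # Detach the other tasks' inputs: gradients from the target task's
        # loss must not flow into them (the stop-gradient behaviour above).
        others = torch.stack(
            [task_inputs[i].detach() for i in range(len(task_inputs)) if i != task_id],
            dim=1,
        )                                                   # (batch, T-1, dim)
        task_vec = self.task_emb(torch.tensor(task_id, device=target.device))
        q = self.q_proj(target + task_vec).unsqueeze(1)     # (batch, 1, dim)
        k, v = self.k_proj(others), self.v_proj(others)
        # Scaled dot-product attention over the other tasks' inputs.
        attn = F.softmax(q @ k.transpose(1, 2) / math.sqrt(k.size(-1)), dim=-1)
        mined = (attn @ v).squeeze(1)                       # (batch, dim)
        # Complement the target task's own input with the mined features.
        return target + mined


# Usage: complement task 0's input with features mined from tasks 1 and 2.
ctfm = CrossTaskAttention(dim=64, num_tasks=3)
inputs = [torch.randn(4, 64) for _ in range(3)]
enriched = ctfm(task_id=0, task_inputs=inputs)
```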


2.3 Cross Task Feature Mining



Table 1: The overall performance. Bold face denotes the winner in each column; the "*" symbol denotes that introducing DML achieves a significant gain (p < 0.05, one-tailed t-test) over the corresponding baseline.
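For reference, the significance check described in the caption (a one-tailed t-test at p < 0.05) can be carried out with a standard paired t-test, as in the sketch below; the metric values and the choice of AUC are placeholders, not the paper's results.

```python
# Illustrative only: a one-tailed paired t-test at p < 0.05, the kind of
# significance check referenced in the Table 1 caption. The values below
# are placeholders, not results from the paper.
from scipy import stats

baseline_auc = [0.781, 0.779, 0.783, 0.780, 0.782]   # e.g. AUC over repeated runs
dml_auc      = [0.786, 0.785, 0.788, 0.784, 0.787]

# H1: DML > baseline (one-tailed), paired over runs/splits.
t_stat, p_value = stats.ttest_rel(dml_auc, baseline_auc, alternative="greater")
print(f"t = {t_stat:.3f}, one-tailed p = {p_value:.4f}, significant: {p_value < 0.05}")
```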


Authors:

(1) Yi Ren, Tencent, Beijing, China ([email protected]);

(2) Ying Du, Tencent, Beijing, China ([email protected]);

(3) Bin Wang, Tencent, Beijing, China ([email protected]);

(4) Shenzheng Zhang, Tencent, Beijing, China ([email protected]).


This paper is available on arxiv under CC BY 4.0 DEED license.