Author:
(1) Mingda Chen.
3.1 Improving Language Representation Learning via Sentence Ordering Prediction
3.2 Improving In-Context Few-Shot Learning via Self-Supervised Training
4.2 Learning Discourse-Aware Sentence Representations from Document Structures
5 Disentangling Latent Representations for Interpretability and Controllability
5.1 Disentangling Semantics and Syntax in Sentence Representations
5.2 Controllable Paraphrase Generation with a Syntactic Exemplar
This chapter describes our contributions to improving self-supervised training objectives for language model pretraining. Prior work has found that the next sentence prediction loss used during pretraining is ineffective at improving downstream task performance (Yang et al., 2019; Liu et al., 2019). In Section 3.1, we propose replacing it with a sentence ordering prediction loss and show that the resulting model achieves state-of-the-art performance.
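To illustrate the contrast between the two objectives, the sketch below shows one way to build sentence-ordering-prediction examples from a document: positives are two consecutive segments in their original order, and negatives are the same two segments with the order swapped. This is a minimal illustration written for this summary, not the implementation from Lan et al. (2020); the function name and data format are assumptions.

```python
import random

def make_sop_examples(document_sentences):
    """Build sentence-ordering-prediction pairs from consecutive sentences.

    Positive examples keep the original order (label 1); negative examples
    swap the same two sentences (label 0), so the model must rely on
    discourse coherence rather than topic cues to tell them apart.
    """
    examples = []
    for i in range(len(document_sentences) - 1):
        first, second = document_sentences[i], document_sentences[i + 1]
        if random.random() < 0.5:
            examples.append({"segment_a": first, "segment_b": second, "label": 1})
        else:
            examples.append({"segment_a": second, "segment_b": first, "label": 0})
    return examples

doc = [
    "The model is pretrained on unlabeled text.",
    "It is then finetuned on downstream tasks.",
    "Evaluation covers classification and question answering.",
]
for ex in make_sop_examples(doc):
    print(ex["label"], "|", ex["segment_a"], "->", ex["segment_b"])
```

Because both segments come from the same document, the negative pairs cannot be distinguished by topic alone, which is what makes the ordering signal harder than next sentence prediction.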
Recent work has shown that pretrained language models are capable of in-context few-shot learning (Brown et al., 2020) and that this capability can be improved by finetuning the models on human-annotated datasets (Mishra et al., 2021; Ye et al., 2021; Wei et al., 2022). Section 3.2 shows that pretraining the models on self-supervised tasks can also improve downstream task performance.
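To make the in-context setup concrete, here is a minimal sketch, under assumed formatting conventions (the field names, separators, and helper function are illustrative and not taken from Chen et al. (2022b)), of packing several self-supervised input-output pairs into one sequence so that the earlier pairs act as in-context demonstrations for the final one.

```python
def build_in_context_instance(pairs, separator="\n\n"):
    """Concatenate (input, output) pairs into a single training sequence.

    Earlier pairs serve as in-context demonstrations; the model is trained
    to generate the output of the final pair given everything before it.
    """
    demonstrations = separator.join(f"Input: {x}\nOutput: {y}" for x, y in pairs[:-1])
    query_input, query_output = pairs[-1]
    context = f"{demonstrations}{separator}Input: {query_input}\nOutput:"
    return context, f" {query_output}"

# Example with a next-sentence-generation style self-supervised task:
# the "input" is a sentence and the "output" is the sentence that follows it,
# so demonstrations can be mined from raw text without human annotation.
pairs = [
    ("The storm knocked out power overnight.", "Crews restored service by morning."),
    ("She planted tomatoes in early spring.", "By July the vines were heavy with fruit."),
    ("The library extended its weekend hours.", "Students filled the reading rooms."),
]
context, target = build_in_context_instance(pairs)
print(context)
print("TARGET:", target)
```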
The material in this chapter is adapted from Lan et al. (2020) and Chen et al. (2022b).
This paper is available on arXiv under a CC 4.0 license.