Leveraging Natural Supervision for Language Representation Learning: Background Summary

Written by textmodels | Published 2024/06/01
Tech Story Tags: llm-natural-supervision | llm-self-supervision | llm-language-pretraining | llm-word-prediction | ai-language-modeling | ai-vector-representations | ai-neural-models | ai-sentence-representations

TLDR: In this study, researchers describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision.

Author:

(1) Mingda Chen.

Table of Links

2.4 Summary

In this chapter, we describe the background materials needed for the remainder of this thesis. In Chapter 3, we present our contributions to improving self-supervised training objectives for language model pretraining. The new training objectives help enhance the quality of general language representations and model performance on few-shot learning. Chapter 4 presents our contributions to exploiting naturally-occurring data structures on Wikipedia for entity and sentence representations and textual entailment. Chapter 5 presents our contributions to leveraging freely-available parallel corpora for disentangling semantic and syntactic representations. Then we apply the technique to controlling the syntax of generated sentences using a sentential exemplar. Chapter 6 presents our contributed datasets for data-to-text generation, abstractive summarization, and story generation. They are tailored from naturally-occurring textual resources and have unique challenges in their respective task settings.

This paper is available on arXiv under a CC 4.0 license.
