Few-shot In-Context Preference Learning Using Large Language Models: Environment Details

by Language Models, December 3rd, 2024

Too Long; Didn't Read

This section presents environment details for 9 tasks in IsaacGym, including observation and action dimensions, task descriptions, and evaluation metrics. Learn how these elements contribute to preference-based reinforcement learning experiments.
  1. Abstract and Introduction
  2. Related Work
  3. Problem Definition
  4. Method
  5. Experiments
  6. Conclusion and References


A. Appendix

A.1 Full Prompts and A.2 ICPL Details

A.3 Baseline Details

A.4 Environment Details

A.5 Proxy Human Preference

A.6 Human-in-the-Loop Preference

A.4 ENVIRONMENT DETAILS

Table 4 lists the observation and action dimensions, together with the task description and task metrics, for each of the 9 IsaacGym tasks.


Table 4: Details of IsaacGym Tasks.
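As a minimal sketch of how the per-task fields named above (observation dimension, action dimension, task description, task metric) could be recorded in code, the following Python snippet defines a small record type. The class name `TaskSpec`, the helper `format_row`, and the example entry are all hypothetical and introduced only for illustration; the placeholder values are not taken from Table 4 or from IsaacGym.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    """One row of a task table: dimensions, description, and metric."""
    name: str
    obs_dim: int        # length of the observation vector
    action_dim: int     # length of the action vector
    description: str    # natural-language task description
    metric: str         # how task performance is scored


def format_row(spec: TaskSpec) -> str:
    """Render a spec as a single pipe-separated table row."""
    return " | ".join(
        [spec.name, str(spec.obs_dim), str(spec.action_dim),
         spec.description, spec.metric]
    )


# Hypothetical placeholder entry; the values are NOT taken from Table 4.
example = TaskSpec(
    name="ExampleTask",
    obs_dim=4,
    action_dim=1,
    description="Placeholder description of the task goal.",
    metric="Placeholder metric",
)
print(format_row(example))
```

Keeping the description and metric next to the dimensions in one immutable record mirrors the column layout the text describes, so each task can be looked up or printed as a single unit.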


Authors:

(1) Chao Yu, Tsinghua University;

(2) Hong Lu, Tsinghua University;

(3) Jiaxuan Gao, Tsinghua University;

(4) Qixin Tan, Tsinghua University;

(5) Xinting Yang, Tsinghua University;

(6) Yu Wang, with equal advising from Tsinghua University;

(7) Yi Wu, with equal advising from Tsinghua University and the Shanghai Qi Zhi Institute;

(8) Eugene Vinitsky, with equal advising from New York University.


This paper is available on arXiv under a CC 4.0 license.