Detailing the Primary Methodology Implemented in Our Models: Octopus v2

by Language Models (dot tech), April 3rd, 2025

Too Long; Didn't Read

In this section, we detail the primary methodology implemented in our models, followed by the dataset collection process essential for fine-tuning these models.

Abstract and 1. Introduction

2 Related works

3 Methodology and 3.1 Causal language model as a classification model

3.2 Functional token

3.3 Dataset collection

3.4 Model development and training

4 Experiments and 4.1 Android function calls

4.2 Extension to Vehicle, Yelp, and DoorDash function sets

4.3 Full and partial training datasets and 4.4 Full training and LoRA training

4.5 Parallel and nested function call and 4.6 Weighted loss function for special tokens

5 Discussion and future works and References


Appendix

A.1 Android function examples

A.2 Vehicle function examples

3 Methodology

In this section, we detail the primary methodology implemented in our models, followed by the dataset collection process essential for fine-tuning these models. We illustrate this through examples drawn from the Android API. Subsequently, we delve into the specifics of our model training approach.

3.1 Causal language model as a classification model

To successfully invoke a function, the model must accurately select the appropriate function from all available options and generate the correct function parameters. This entails a two-stage process: a function selection stage and a parameter generation stage. The latter requires understanding the function’s description and its arguments, drawing on information from the user’s query to produce parameters for an executable call. A direct strategy might combine a classification model with a causal language model: we can treat the N available functions as a selection pool, turning the selection challenge into a softmax classification problem.
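The selection stage above can be sketched as a softmax over N candidate-function logits. This is a minimal illustration, not the paper's implementation: the function names and logit values below are invented for the example, standing in for the scores a fine-tuned model would assign to its functional tokens.

```python
import math

def softmax(logits):
    # Numerically stable softmax over the N candidate-function logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the query "take a photo with the front camera".
function_names = ["take_photo", "send_text", "set_alarm"]
logits = [4.1, 0.3, -1.2]

probs = softmax(logits)
selected = function_names[max(range(len(probs)), key=probs.__getitem__)]
print(selected)  # the function with the highest probability: take_photo
```

Selecting the argmax of this distribution reduces function selection to a single classification decision rather than free-form text generation.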



To choose a correct functional token, the language model must grasp the meaning associated with that token. We decided to incorporate the function descriptions into the training dataset, enabling the model to learn the importance of these specialized tokens. We designed a prompt template that accommodates three different response styles, facilitating parallel and nested function calls. Detailed examples of the dataset are provided in the Appendix.
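The idea of pairing each functional token with its description in the training data can be sketched as follows. The template wording and the token names (`<nexa_0>`, `<nexa_end>`) are illustrative assumptions here, not the paper's exact format; the detailed dataset examples are in the Appendix.

```python
# Sketch of one fine-tuning example: the target response emits a functional
# token with arguments, and the description teaches the model what that
# token means. Token names and template text are assumed, not verbatim.
def build_example(query, token, arguments, description):
    prompt = f"Below is a query from a user.\nQuery: {query}\nResponse: "
    target = (
        f"{token}({arguments})<nexa_end>\n"
        f"Function description: {description}"
    )
    return prompt + target

example = build_example(
    query="Take a selfie",
    token="<nexa_0>",
    arguments="camera='front'",
    description="take_photo(camera): captures an image with the given camera.",
)
print(example)
```

Because the description appears only in the training target, the model learns the token's meaning during fine-tuning and no longer needs the description at inference time.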


This methodology offers an additional critical benefit. Once the model is fine-tuned to understand the significance of functional tokens, it can perform inference using the added special token as the early-stopping criterion. This removes the need to analyze tokens from function descriptions, eliminating both the retrieval of relevant functions and the processing of their descriptions, and thus considerably reduces the number of tokens needed to accurately identify a function name. The difference between the conventional retrieval-based method and our proposed model is shown in Figure (2).
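The early-stopping behavior can be sketched as a toy decoding loop: generation halts the moment the special end token appears, so no description tokens are ever consumed at inference. The token stream and the end-token name `<nexa_end>` are stand-ins for a real model's sampler output.

```python
END_TOKEN = "<nexa_end>"  # assumed name for the special stopping token

def decode(sampler):
    # Consume tokens from a sampler until the special end token appears,
    # then stop immediately -- no retrieval or description parsing needed.
    out = []
    for token in sampler:
        if token == END_TOKEN:
            break
        out.append(token)
    return out

stream = iter(["<nexa_0>", "(", "camera='front'", ")", END_TOKEN, "junk"])
print(decode(stream))  # → ['<nexa_0>', '(', "camera='front'", ')']
```

In a real deployment the same effect is typically achieved by registering the special token as a stop criterion in the generation loop, so the call completes in only a handful of tokens.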


Figure 2: The comparison of the retrieval-based function calling process and the function calling process of the Octopus model.


This paper is available on arxiv under CC BY-NC-SA 4.0 DEED license.

Authors:

(1) Wei Chen, Stanford University, equal contribution and corresponding author ({weichen6}@stanford.edu);

(2) Zhiyuan Li, Stanford University, corresponding author ({zhiyuan8}@stanford.edu).