Model delegation
This guide explains how to use another Writer model as a tool in chat sessions using the tool calling capability of the Chat completion endpoint.
With model delegation, you can use another Writer model as a tool in a chat application with a separate Palmyra model. Writer has a predefined LLM tool that allows you to delegate specific tasks to other models during a chat session.
For example, in a chat application using Palmyra-X-004, you can delegate image analysis tasks to the `palmyra-vision` model.
This guide will help you understand how to perform model delegation using the Writer API.
You need an API key to use LLM tool delegation. Follow these steps to generate an API key.
Overview
You can use the LLM tool to delegate specific tasks to another model when using the chat endpoint. Using tool calling, you can specify the Writer model you want to use for a given task. When the primary chat model calls the LLM tool based on the user’s input, it signals it in the chat API response.
The LLM tool works especially well with domain-specific Writer models, such as `palmyra-fin`, `palmyra-med`, or `palmyra-creative`.
To use the LLM tool, add it to the `tools` array in your chat completions endpoint request.
The LLM tool is available with Palmyra-X-004 and later models, and supports both streaming and non-streaming responses.
Tool object
The LLM tool object is defined as follows:
| Parameter | Type | Description |
|---|---|---|
| `type` | string | The type of tool, which is `llm` for the LLM tool |
| `function` | object | An object containing the tool's description and model |
| `function.description` | string | (Optional) A description of what the model will be used for |
| `function.model` | string | The ID of the Writer model to use for this tool |
Usage example
Here’s an example of how to use the LLM tool in your application.
Create a tools array containing an LLM tool
To use the LLM tool, create a `tools` array that specifies the Writer model you want to use:
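A minimal sketch of such an array, following the tool object schema above. The description text and the choice of `palmyra-med` are illustrative; any Writer model ID can be delegated to:

```python
# A tools array containing one LLM tool. The `type` is always "llm",
# and `function.model` names the Writer model to delegate to.
tools = [
    {
        "type": "llm",
        "function": {
            "description": "A tool for answering medical questions",
            "model": "palmyra-med",
        },
    }
]
```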
Send the request using chat completions
Then, add the `tools` array to the chat method or endpoint call, along with your array of messages.
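As one possible sketch, here is the request built with only the Python standard library rather than a Writer SDK. The endpoint URL, bearer-token header, and the `WRITER_API_KEY` environment variable are assumptions for illustration; check the API reference for the exact values in your account:

```python
import json
import os
import urllib.request

# Request payload: the model, your messages, and the tools array from
# the previous step. Model IDs and message content are illustrative.
payload = {
    "model": "palmyra-x-004",
    "messages": [
        {"role": "user", "content": "What are common symptoms of dehydration?"}
    ],
    "tools": [
        {
            "type": "llm",
            "function": {
                "description": "A tool for answering medical questions",
                "model": "palmyra-med",
            },
        }
    ],
}

def send_chat_request(payload: dict) -> dict:
    """Send the chat completions request.

    Assumes the endpoint URL and bearer-token auth shown here, with an
    API key in the WRITER_API_KEY environment variable.
    """
    req = urllib.request.Request(
        "https://api.writer.com/v1/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['WRITER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```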
Process the response
Finally, process the response from the chat endpoint. The generated text is in the first item of the `choices` array in the response object.
The code samples below demonstrate how to process both streaming and non-streaming responses. Learn more about processing tool calling responses in the tool calling guide.
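A sketch of both cases, using hand-written stand-in dictionaries in place of real API replies. The `choices[0].message.content` and `choices[0].delta.content` shapes follow the chat completions format described above; the sample text is fabricated:

```python
# Non-streaming: the generated text is the first item of the `choices`
# array. This dict stands in for a parsed API response.
response = {
    "choices": [
        {"message": {"content": "Common symptoms include thirst and fatigue."}}
    ]
}
text = response["choices"][0]["message"]["content"]

# Streaming: each chunk carries a partial delta; concatenate the pieces
# as they arrive. These dicts stand in for parsed stream chunks.
chunks = [
    {"choices": [{"delta": {"content": "Common symptoms "}}]},
    {"choices": [{"delta": {"content": "include thirst."}}]},
]
streamed_text = "".join(
    chunk["choices"][0]["delta"].get("content") or "" for chunk in chunks
)
```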
The response contains the generated content from the specialized model you specified in the tool configuration, as well as an `llm_data` object that provides information about the delegated model execution:

- `prompt`: The prompt that was sent to the delegated model
- `model`: The model that was used to generate the response
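For example, the two `llm_data` fields can be read directly from the parsed response. This sketch assumes `llm_data` sits at the top level of the response object, and the dict below is a fabricated stand-in for a real reply:

```python
# A stand-in for a parsed response that includes the llm_data object.
response = {
    "choices": [{"message": {"content": "Delegated answer."}}],
    "llm_data": {
        "prompt": "What are common symptoms of dehydration?",
        "model": "palmyra-med",
    },
}

llm_data = response["llm_data"]
delegated_model = llm_data["model"]    # which model handled the task
delegated_prompt = llm_data["prompt"]  # the prompt forwarded to it
```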
By following this guide, you can use specialized, fine-tuned Writer models for specific tasks within your chat applications.