POST /v1/chat
curl --location --request POST 'https://api.writer.com/v1/chat' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <token>' \
--data '{
    "model": "palmyra-x-002-32k",
    "temperature": 0.7,
    "messages": [
        {
            "role": "user",
            "content": "You are an expert at writing product descriptions for an E-Commerce Retailer"
        },
        {
            "role": "assistant",
            "content": "Okay, great I can help write these descriptions. Do you have a specific product in mind?"
        },
        {
            "role": "user",
            "content": "Please write a one sentence product description for a cozy, stylish sweater suitable for both casual and formal occasions"
        }
    ],
    "stream": true
}'

{
  "id": "57e4f58f-f7b1-41d8-be17-a6279c073aad",
  "choices": [
    {
      "finish_reason": "length",
      "message": {
        "content": "The earnings report shows...",
        "role": "assistant"
      }
    }
  ],
  "created": 1715361795,
  "model": "palmyra-x-002-32k"
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
model
string
default: palmyra-x-002-32k
required

Specifies the model used to generate responses. The chat endpoint always uses palmyra-x-002-32k.

messages
object[]
required

An array of message objects that form the conversation history or context for the model to respond to. The array must contain at least one message.
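
Each message object pairs a role with its text content, as in the request example above; the roles shown there are user and assistant. A single-message array sketch (the prompt text is illustrative):

"messages": [
    {
        "role": "user",
        "content": "Write a tagline for a coffee shop."
    }
]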

max_tokens
integer
default: 16

Defines the maximum number of tokens the model can generate in the response. A token is typically a short word or word fragment rather than a whole word or character. The default of 16 is quite short, so raise it to allow longer responses as needed.
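
For example, to allow a longer reply than the 16-token default, raise the limit in the request body (256 is an arbitrary illustrative value):

{
    "model": "palmyra-x-002-32k",
    "messages": [{"role": "user", "content": "Describe the sweater in detail."}],
    "max_tokens": 256
}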

temperature
number
default: 1

Controls the randomness or creativity of the model's responses. A higher temperature results in more varied and less predictable text, while a lower temperature produces more deterministic and conservative outputs.

top_p
number

Sets the threshold for "nucleus sampling," a technique that focuses generation on the most likely tokens. At each step, the model samples only from the smallest set of tokens whose cumulative probability reaches this threshold, controlling the trade-off between creativity and coherence.

n
integer

Specifies the number of completions (responses) to generate from the model in a single request. This parameter allows multiple responses to be generated, offering a variety of potential replies from which to choose.
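
As a sketch, a request asking for two completions (the value 2 is illustrative) should return a choices array with one entry per completion:

{
    "model": "palmyra-x-002-32k",
    "messages": [{"role": "user", "content": "Suggest a name for the sweater."}],
    "n": 2
}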

stop
string | string[]

A token or sequence of tokens that, when generated, will cause the model to stop producing further content. This can be a single token or an array of tokens, acting as a signal to end the output.
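
For instance, either form below can be supplied in the request body (the sequences themselves are illustrative):

"stop": "\n\n"

"stop": ["END", "\n\n"]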

stream
boolean

Indicates whether the response should be streamed incrementally as it is generated or only returned once fully complete. Streaming can be useful for providing real-time feedback in interactive applications.
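
A minimal streaming sketch: curl's --no-buffer flag prints data as it arrives rather than waiting for the full body (the exact format of the streamed chunks is not documented here):

curl --location --request POST 'https://api.writer.com/v1/chat' \
--no-buffer \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <token>' \
--data '{
    "model": "palmyra-x-002-32k",
    "messages": [{"role": "user", "content": "Say hello."}],
    "stream": true
}'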

Response

200 - application/json
id
string
required

A globally unique identifier (UUID) for the response generated by the API. This ID can be used to reference the specific operation or transaction within the system for tracking or debugging purposes.

choices
object[]
required

An array of completion objects generated by the model. Each entry contains the generated message and a finish_reason indicating why generation stopped (for example, "length" in the sample response above, meaning the max_tokens limit was reached).

created
integer
required

The Unix timestamp (in seconds) when the response was created. This timestamp can be used to verify the timing of the response relative to other events or operations.
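
For example, the created value from the sample response converts to a readable UTC date with GNU date (on macOS, use date -u -r 1715361795 instead):

date -u -d @1715361795
# Fri May 10 17:23:15 UTC 2024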

model
string
required

Identifies the specific model used to generate the response.