This guide explains the Text generation endpoint, which can be thought of as asking the Palmyra LLM a single question.

This guide is intended to help you interact with the Text Generation API effectively. Below, you'll find an overview of how to make requests to the endpoint and handle the responses.

Your API key can be generated using these steps.

Endpoint overview

This endpoint is designed to generate text based on the input parameters provided. You can specify various aspects of the completion request, such as the model, prompt, maximum number of tokens, temperature, and streaming behavior.

The response will include the generated text along with the model used. The structure differs based on the value of the stream parameter.

Usage example

Here’s how you can use the endpoint in a practical scenario:


Create a request

Use your preferred HTTP client to send a POST request to the endpoint with the JSON payload below.

curl --location '' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <token>' \
--data '{
  "model": "palmyra-x-v2",
  "prompt": "Tell me a story",
  "max_tokens": 2048,
  "temperature": 0.7,
  "stream": false

Handle the response

For non-streaming responses, parse the JSON to access the generated text. For streaming responses, handle each chunk as it arrives and concatenate or process it as needed.

  "choices": [
      "text": "Generated story here..."
  "model": "palmyra-x-v2"

Error handling

Be sure to handle potential errors, such as timeouts or model errors, gracefully.
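One common pattern is to wrap the request in retry logic for transient failures. The sketch below is illustrative: `send_request` is a hypothetical stand-in for your HTTP client call, and the exception type and retry counts are assumptions you should tune for your client library.

```python
import time

def call_with_retries(send_request, max_attempts=3, base_delay=1.0):
    """Retry transient failures (e.g. timeouts) with a pause between tries.

    `send_request` is any zero-argument callable that performs the HTTP
    request; map your client's timeout exception onto TimeoutError.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send_request()
        except TimeoutError:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            time.sleep(base_delay * attempt)  # wait a bit longer each retry
```

Non-transient errors, such as an invalid model name or a malformed payload, should surface immediately rather than being retried.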


Rate limiting

Be mindful of any rate limits that apply to your plan, and back off before retrying when requests are rate limited, to avoid service disruptions.
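A simple approach is exponential backoff on rate-limited responses. This sketch assumes the service signals rate limiting with HTTP 429, a common convention worth confirming in the API reference; `send_request` is a hypothetical callable returning a `(status_code, body)` pair.

```python
import time

def send_with_backoff(send_request, max_attempts=5, base_delay=1.0):
    """Exponentially back off whenever the service reports rate limiting."""
    delay = base_delay
    for _ in range(max_attempts):
        status, body = send_request()
        if status != 429:  # assumed rate-limit status code
            return status, body
        time.sleep(delay)
        delay *= 2  # double the wait after each rate-limited attempt
    raise RuntimeError("rate limited: exceeded max_attempts")
```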

By following this guide, you can integrate the Text Generation endpoint into your applications and use its text generation capabilities for purposes such as content creation and user interaction.