The Writer API supports streaming responses. Streams use server-sent events (SSE) so that you can display content as the API generates it in real time.

Streaming improves the user experience by reducing the perceived latency between the time the user submits a request and the time the first part of the response appears.

Overview

To stream a response from the API, set the stream parameter to true in the request body.

The endpoints that support streaming include text generation (completions) and chat completions, both shown in the examples below.

Sample request and response

The code below shows a streaming text generation request using curl, along with the response, which arrives as a stream of server-sent events. A sketch of how to consume the stream without an SDK follows the sample response.

curl --location 'https://api.writer.com/v1/completions' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $WRITER_API_KEY" \
--data '{
  "model": "palmyra-x-004",
  "prompt": "Respond to a customer chat request about a delayed shipment with a message that apologizes for the delay, offers a tracking number, and provides a new estimated delivery date.",
  "stream": true
}'
Response:
data: {"value":""}

data: {"value":"Hello"}

data: {"value":" ["}

data: {"value":"Customer"}

data: {"value":"'s"}

data: {"value":" Name"}
...
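
If you aren't using a Writer SDK, you can consume the event stream with any HTTP client that supports streaming responses. The sketch below is one way to do it in Python, assuming the third-party httpx library (not required by the API): it sends the same request as the curl example above and parses each data: line as it arrives.

# A minimal sketch of reading the SSE stream without an SDK.
# Assumes the third-party `httpx` library; any streaming HTTP client works.
import json
import os

import httpx

payload = {
    "model": "palmyra-x-004",
    "prompt": "Respond to a customer chat request about a delayed shipment with a message that apologizes for the delay, offers a tracking number, and provides a new estimated delivery date.",
    "stream": True,
}
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['WRITER_API_KEY']}",
}

with httpx.stream(
    "POST",
    "https://api.writer.com/v1/completions",
    json=payload,
    headers=headers,
    timeout=None,
) as response:
    for line in response.iter_lines():
        # Each event arrives as a line of the form: data: {"value":"..."}
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        try:
            chunk = json.loads(data)
        except json.JSONDecodeError:
            # Skip any non-JSON control lines the stream might include.
            continue
        print(chunk.get("value", ""), end="", flush=True)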

Streaming with SDKs

When you stream a response using a Writer SDK, the SDK creates an iterator that yields chunks of the response. You can iterate over the stream to receive the response.

The examples below use the Python SDK; the JavaScript SDK follows the same iterator pattern for each endpoint that supports streaming.

from writerai import Writer

# Initialize the client. If you don't pass the `api_key` parameter,
# the client looks for the `WRITER_API_KEY` environment variable.
client = Writer()

# Request a streaming text generation response.
text_generation = client.completions.create(
  model="palmyra-x-004",
  prompt="Respond to a customer chat request about a delayed shipment with a message that apologizes for the delay, offers a tracking number, and provides a new estimated delivery date.",
  stream=True
)

# Print each chunk of text as the API generates it.
for chunk in text_generation:
    print(chunk.value, end="", flush=True)
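
If your application is asynchronous, the same pattern works with the async client. The sketch below assumes the SDK exposes AsyncWriter as the async counterpart of Writer and that streaming responses support async iteration; check the SDK reference for your version.

import asyncio

from writerai import AsyncWriter

async def main():
    # Like `Writer`, the async client reads the `WRITER_API_KEY`
    # environment variable when `api_key` isn't passed.
    client = AsyncWriter()

    text_generation = await client.completions.create(
        model="palmyra-x-004",
        prompt="Respond to a customer chat request about a delayed shipment with a message that apologizes for the delay, offers a tracking number, and provides a new estimated delivery date.",
        stream=True
    )

    # Print each chunk of text as the API generates it.
    async for chunk in text_generation:
        print(chunk.value, end="", flush=True)

asyncio.run(main())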

Streaming helpers for chat completions

The Python and Node SDKs include streaming helpers for chat completions. These helpers provide more granular details about the streaming events and accumulate the response.

To use the streaming helpers, call client.chat.stream with the same parameters as a non-streaming chat completion request, but omit the stream parameter.

from writerai import Writer

# Initialize the client. If you don't pass the `api_key` parameter,
# the client looks for the `WRITER_API_KEY` environment variable.
client = Writer()

# Open a managed stream of chat completion events.
with client.chat.stream(
  model="palmyra-x-004",
  messages=[{"role": "user", "content": "Respond to a customer chat request about a delayed shipment with a message that apologizes for the delay, offers a tracking number, and provides a new estimated delivery date."}]
) as stream:
    for event in stream:
        # Print each piece of new content as it arrives.
        if event.type == "content.delta":
            print(event.value, end="", flush=True)

# Print the final, accumulated response.
completion = stream.get_final_completion()
print(completion.choices[0].message.content)

For more information about the streaming helpers for chat completions, see the Python and Node SDKs.