The Writer API supports streaming responses. Streams use server-sent events (SSE), so you can display content in real time as the API generates it.
Streaming improves the user experience by reducing the perceived latency between the time the user submits a request and the time a response starts to appear.
Overview
To stream a response from the API, set the stream parameter to true in the request body.
The following endpoints support streaming:
- Text generation (completions)
- Chat completions (chat)
- Application-generated content (applications.generate_content)
- Knowledge Graph question answering (graphs.question)
Sample request and response
The code below shows a streaming text generation request and its response using curl. The response is a stream of server-sent events.
curl --location 'https://api.writer.com/v1/completions' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $WRITER_API_KEY" \
--data '{
    "model": "palmyra-x5",
    "prompt": "Respond to a customer chat request about a delayed shipment with a message that apologizes for the delay, offers a tracking number, and provides a new estimated delivery date.",
    "stream": true
}'
data: {"value":""}
data: {"value":"Hello"}
data: {"value":" ["}
data: {"value":"Customer"}
data: {"value":"'s"}
data: {"value":" Name"}
...
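Each event arrives as a `data:` line whose payload is a JSON object. As a minimal illustration of the format (the `parse_sse_values` helper below is hypothetical, for illustration only; the Writer SDKs parse the event stream for you):

```python
import json

def parse_sse_values(raw: str) -> list[str]:
    """Extract the `value` field from each `data:` line of an SSE stream.

    Hypothetical helper for illustration only; the Writer SDKs
    parse the event stream for you.
    """
    values = []
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("data:"):
            payload = json.loads(line[len("data:"):].strip())
            values.append(payload.get("value", ""))
    return values

events = 'data: {"value":""}\ndata: {"value":"Hello"}\ndata: {"value":" there"}'
print("".join(parse_sse_values(events)))  # Hello there
```

Concatenating the `value` fields in order reconstructs the generated text.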
If you’re using the chat completions endpoint instead of text generation, the streamed response format is slightly different. The content for each chunk appears in choices[0].delta.content. See the Generate chat completions guide for the full streaming response object.
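To assemble the full message from a chat completions stream, concatenate the non-empty deltas. The sketch below uses mock chunk objects in place of the chunks the SDK yields; the `mock_chunk` helper is illustrative, not part of the SDK:

```python
from types import SimpleNamespace

def mock_chunk(text):
    # Illustrative stand-in for a streamed chat completion chunk;
    # real chunks come from the SDK's stream iterator.
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

chunks = [mock_chunk(None), mock_chunk("Hello"), mock_chunk(" there"), mock_chunk("!")]

full_text = "".join(
    chunk.choices[0].delta.content
    for chunk in chunks
    if chunk.choices[0].delta.content  # skip chunks with no content
)
print(full_text)  # Hello there!
```

The same pattern applies when iterating over a real stream: read `choices[0].delta.content` from each chunk and skip empty values.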
Streaming with SDKs
When you stream a response using a Writer SDK, the SDK creates an iterator that yields chunks of the response. You can iterate over the stream to receive the response.
The examples below use the Python SDK for each endpoint that supports streaming; the JavaScript SDK follows the same pattern.
from writerai import Writer

# Initialize the client. If you don't pass the `api_key` parameter,
# the client looks for the `WRITER_API_KEY` environment variable.
client = Writer()

text_generation = client.completions.create(
    model="palmyra-x5",
    prompt="Respond to a customer chat request about a delayed shipment with a message that apologizes for the delay, offers a tracking number, and provides a new estimated delivery date.",
    stream=True,
)

for chunk in text_generation:
    if chunk.value:  # Only print non-empty chunks
        print(chunk.value, end="", flush=True)
from writerai import Writer

# Initialize the client. If you don't pass the `api_key` parameter,
# the client looks for the `WRITER_API_KEY` environment variable.
client = Writer()

stream = client.chat.chat(
    model="palmyra-x5",
    messages=[{"role": "user", "content": "Respond to a customer chat request about a delayed shipment with a message that apologizes for the delay, offers a tracking number, and provides a new estimated delivery date."}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
from writerai import Writer

# Initialize the client. If you don't pass the `api_key` parameter,
# the client looks for the `WRITER_API_KEY` environment variable.
client = Writer()

stream = client.applications.generate_content(
    application_id="<application-id>",
    inputs=[
        {
            "id": "query",
            "value": ["Provide a list of three hotels in San Francisco near Union Square within the price range of $100 to $200 per night"]
        }
    ],
    stream=True,
)

for chunk in stream:
    if chunk.delta.content:
        print(chunk.delta.content, end="", flush=True)
    elif chunk.delta.stages:
        print(chunk.delta.stages[0].content)
from writerai import Writer

# Initialize the client. If you don't pass the `api_key` parameter,
# the client looks for the `WRITER_API_KEY` environment variable.
client = Writer()

stream = client.graphs.question(
    graph_ids=["<graph-id>"],
    question="What is the generic name for the drug Bavencio?",
    stream=True,
)

for chunk in stream:
    if chunk.answer:  # Only print non-empty answers
        print(chunk.answer, end="", flush=True)
Streaming helpers for chat completions
The Python and Node SDKs include streaming helpers for chat completions. These helpers provide more granular details about the streaming events and accumulate the response.
To use the streaming helpers, call client.chat.stream and pass the same parameters as a non-streaming chat completion request, but omit the stream parameter.
from writerai import Writer

# Initialize the client. If you don't pass the `api_key` parameter,
# the client looks for the `WRITER_API_KEY` environment variable.
client = Writer()

with client.chat.stream(
    model="palmyra-x5",
    messages=[{"role": "user", "content": "Respond to a customer chat request about a delayed shipment with a message that apologizes for the delay, offers a tracking number, and provides a new estimated delivery date."}],
) as stream:
    for event in stream:
        if event.type == "content.delta":
            print(event.delta, end="", flush=True)

    # Print the final accumulated response.
    completion = stream.get_final_completion()
    print(completion.choices[0].message.content)
For more information about the streaming helpers for chat completions, see the Python and Node SDKs.