With the vision endpoint, you can analyze one or more images with a prompt. Palmyra Vision lets you ask questions about an image, generate captions, compare images, and more.

The /vision endpoint is available in versions 2.1 and later of the Writer SDKs.

You need an API key to access the Writer API. Get an API key by following the steps in the API quickstart.

We recommend setting the API key as an environment variable in a .env file with the name WRITER_API_KEY.
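As a minimal sketch, you can read the key from the environment in Python using only the standard library. (If you keep the key in a .env file, a loader such as python-dotenv is assumed to populate the environment first; any loader works.)

```python
import os

def get_writer_api_key(env=os.environ):
    """Return the Writer API key from the environment, or raise if unset."""
    key = env.get("WRITER_API_KEY")
    if not key:
        raise RuntimeError("Set the WRITER_API_KEY environment variable")
    return key
```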

Vision endpoint

Endpoint: POST /v1/vision

curl -X POST \
  'https://api.writer.com/v1/vision' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $WRITER_API_KEY" \
  --data-raw '{
    "model": "palmyra-vision",
    "prompt": "What's the difference between the image {{image_1}} and the image {{image_2}}?",
    "variables": [
      {"name": "image_1", "file_id": "f1234"},
      {"name": "image_2", "file_id": "f5678"}
    ]
  }'

Request body

The request body is a JSON object with the following fields:

| Parameter | Type | Description |
| --- | --- | --- |
| `model` | string | The model to use for the analysis. Must be `palmyra-vision`. |
| `prompt` | string | The prompt to use for the analysis. The prompt must include the names of the images you're analyzing, referencing them as `{{name}}`. For example: `What's the difference between the image {{image_1}} and the image {{image_2}}?` |
| `variables` | array | An array of image variables, each with a `name` and `file_id`. |
| `variables[].name` | string | The name of the image. You must use the same name in the prompt, referencing it as `{{name}}`. |
| `variables[].file_id` | string | The File ID of the uploaded image. You must upload the image to Writer before passing it to the vision endpoint. Learn how to upload images below. |
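Because the API rejects requests that list a variable the prompt never references, it can help to validate the body client-side. The following sketch builds the request body as a plain Python dict; the function name and the pre-flight check are illustrative, not part of the API.

```python
def build_vision_payload(prompt, variables):
    """Build the JSON body for POST /v1/vision.

    `variables` is a list of (name, file_id) pairs. The endpoint returns an
    error for variables that aren't referenced in the prompt, so we check
    for each {{name}} placeholder before sending.
    """
    for name, _ in variables:
        if "{{" + name + "}}" not in prompt:
            raise ValueError(f"prompt does not reference {{{{{name}}}}}")
    return {
        "model": "palmyra-vision",
        "prompt": prompt,
        "variables": [{"name": n, "file_id": f} for n, f in variables],
    }
```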

Response format

The response is a JSON object with a data field that contains the analysis results as a string.

{
    "data": "The analysis results"
}

Example: Extract text from an image

This example shows how to extract text from an image using the vision endpoint.

Upload an image

Before you can analyze an image, you need to upload it to Writer.

The following code samples demonstrate how to upload an image and print the File ID. You need the File ID to pass to the Vision endpoint.

curl -X POST 'https://api.writer.com/v1/files' \
  -H 'Content-Type: image/jpeg' \
  -H 'Content-Disposition: attachment; filename=handwriting.jpg' \
  -H "Authorization: Bearer $WRITER_API_KEY" \
  --data-binary "@path/to/file/handwriting.jpg"

Learn more about uploading and managing files.

Generate a caption with the Vision endpoint

Once you have the File IDs for any images you want to analyze, you can pass them to the Vision endpoint along with a prompt.

The prompt must include the names of the images you’re analyzing, referencing them as {{name}}, where name is the name you provided in the variables array. For example: Extract the text from the image {{name}}. If you include files in the variables array that you don’t include in the prompt, the API returns an error.

The following code sample shows the API call to extract text from an image.

curl -X POST \
  'https://api.writer.com/v1/vision' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $WRITER_API_KEY" \
  --data '{
    "model": "palmyra-vision",
    "prompt": "Extract the text from the image {{handwriting}}.",
    "variables": [{"name": "handwriting", "file_id": "f1234"}]
  }'

The response is a JSON object with a data field that contains the analysis results as a string.

{
    "data": "..."
}
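The curl call above can be sketched end to end in Python with the standard library. This is an illustrative REST call, not the SDK's own method; the function names are assumptions.

```python
import json
import urllib.request

def parse_vision_response(body):
    """Pull the analysis string out of the response JSON's data field."""
    return body["data"]

def extract_text(file_id, api_key):
    """Call POST /v1/vision to extract text from an uploaded image."""
    payload = {
        "model": "palmyra-vision",
        "prompt": "Extract the text from the image {{handwriting}}.",
        "variables": [{"name": "handwriting", "file_id": file_id}],
    }
    req = urllib.request.Request(
        "https://api.writer.com/v1/vision",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return parse_vision_response(json.load(resp))
```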
