Analyze images
With the vision endpoint, you can analyze single or multiple images with a prompt. Palmyra Vision allows you to ask questions about an image, generate captions, compare images, and more.
The /vision endpoint will be available in versions 2.1+ of the Writer SDKs.
You need an API key to access the Writer API. Get an API key by following the steps in the API quickstart.
We recommend setting the API key as an environment variable in a .env file with the name WRITER_API_KEY.
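As a minimal sketch of that setup, the helper below reads WRITER_API_KEY from the process environment and turns it into request headers. It assumes the variables in your .env file have already been loaded into the environment by your shell or tooling, and it assumes the API accepts the key as a Bearer token, which this page does not state. The name auth_headers is hypothetical.

```python
import os


def auth_headers(env=os.environ):
    """Build request headers from the WRITER_API_KEY environment variable."""
    api_key = env.get("WRITER_API_KEY")
    if not api_key:
        raise RuntimeError("WRITER_API_KEY is not set")
    # Assumption: the key is sent as a Bearer token in the Authorization header.
    return {"Authorization": f"Bearer {api_key}"}
```

Failing early when the variable is missing tends to give a clearer error than an authentication failure returned by the API.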
Vision endpoint
Endpoint: POST /v1/vision
Request body
The request body is a JSON object with the following fields:
Parameter | Type | Description |
---|---|---|
model | string | The model to use for the analysis. Must be palmyra-vision. |
prompt | string | The prompt to use for the analysis. The prompt must include the names of the images you’re analyzing, referencing them as {{name}}. For example: What's the difference between the image {{image_1}} and the image {{image_2}}? |
variables | array | An array of image variables, each with a name and a file_id. |
variables[].name | string | The name of the image. You must use the same name in the prompt, referencing it as {{name}}. |
variables[].file_id | string | The File ID of the uploaded image. You must upload the image to Writer before passing it to the Vision endpoint. Learn how to upload images below. |
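To make the shape of the request body concrete, here is a small sketch that assembles it from the fields in the table. The helper name build_vision_body and the placeholder File IDs are hypothetical; only the field names and the palmyra-vision model value come from the table.

```python
import json


def build_vision_body(prompt, images):
    """Return the JSON-serializable body for POST /v1/vision.

    `images` maps each variable name to the File ID of an uploaded image.
    """
    variables = [
        {"name": name, "file_id": file_id} for name, file_id in images.items()
    ]
    return {"model": "palmyra-vision", "prompt": prompt, "variables": variables}


body = build_vision_body(
    "What's the difference between the image {{image_1}} and the image {{image_2}}?",
    # Placeholder File IDs; real ones come from uploading the images first.
    {"image_1": "<file-id-1>", "image_2": "<file-id-2>"},
)
print(json.dumps(body, indent=2))
```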
Response format
The response is a JSON object with a data field that contains the analysis results as a string.
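Reading the result could look like the sketch below, which parses a response body and returns the data string. The sample payload is invented for illustration, and the helper name extract_analysis is hypothetical; any other fields a real response may carry are ignored.

```python
import json


def extract_analysis(response_text):
    """Return the analysis string from a Vision response body."""
    payload = json.loads(response_text)
    data = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(data, str):
        raise ValueError("expected a string in the 'data' field")
    return data


# Invented sample payload, not real API output.
sample = '{"data": "The image shows a handwritten grocery list."}'
print(extract_analysis(sample))
```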
Example: Extract text from an image
This example shows how to extract text from an image using the vision endpoint.
Upload an image
Before you can analyze an image, you need to upload it to Writer.
The following code samples demonstrate how to upload an image and print the File ID. You need the File ID to pass to the Vision endpoint.
Learn more about uploading and managing files.
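One way the upload step might look with only the Python standard library is sketched below. Several details are assumptions that this page does not state: the upload URL (https://api.writer.com/v1/files), the Content-Type and Content-Disposition headers, Bearer authentication, and an id field in the response that holds the File ID. Check them against the file upload reference before relying on this. The function names are hypothetical.

```python
import json
import mimetypes
import os
import urllib.request

# Assumption: upload URL is not given on this page.
FILES_URL = "https://api.writer.com/v1/files"


def build_upload_request(path, content, api_key):
    """Build (but do not send) a POST request that uploads raw image bytes."""
    filename = os.path.basename(path)
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    return urllib.request.Request(
        FILES_URL,
        data=content,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: Bearer auth
            "Content-Type": content_type,
            "Content-Disposition": f'attachment; filename="{filename}"',
        },
    )


def upload_image(path):
    """Upload the image at `path` and return its File ID."""
    with open(path, "rb") as f:
        content = f.read()
    request = build_upload_request(path, content, os.environ["WRITER_API_KEY"])
    with urllib.request.urlopen(request) as response:
        # Assumption: the File ID is returned in an "id" field.
        return json.load(response)["id"]
```

Calling print(upload_image("receipt.png")) would then print the File ID to pass to the Vision endpoint.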
Extract text with the Vision endpoint
Once you have the File IDs for any images you want to analyze, you can pass them to the Vision endpoint along with a prompt.
The prompt must include the names of the images you’re analyzing, referencing them as {{name}}, where name is the name you provided in the variables array. For example: Extract the text from the image {{name}}.
If you include files in the variables array that you don’t include in the prompt, the API returns an error.
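Because a mismatch between the prompt and the variables array leads to an error, it can help to check the two locally before sending the request. The sketch below is one possible check; it assumes image names consist of letters, digits, and underscores, and the helper name check_prompt is hypothetical.

```python
import re

# Assumption: names are made of word characters (letters, digits, underscores).
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def check_prompt(prompt, variables):
    """Compare {{name}} placeholders in the prompt with the variables array.

    Returns (unused, undefined): names declared in `variables` but missing
    from the prompt, and names used in the prompt but never declared.
    """
    in_prompt = set(PLACEHOLDER.findall(prompt))
    declared = {variable["name"] for variable in variables}
    return sorted(declared - in_prompt), sorted(in_prompt - declared)


unused, undefined = check_prompt(
    "Extract the text from the image {{receipt}}.",
    [
        {"name": "receipt", "file_id": "<file-id-1>"},
        {"name": "logo", "file_id": "<file-id-2>"},  # never used in the prompt
    ],
)
print(unused, undefined)
```

Here logo is reported as unused, which is the case the API rejects. The second list covers the opposite mistake, a placeholder with no matching variable.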
The following code sample shows the API call to extract text from an image.
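A minimal sketch of that call, using only the Python standard library rather than the Writer SDK, might look like this. The base URL https://api.writer.com and Bearer authentication are assumptions not stated on this page; the image name receipt and both function names are hypothetical.

```python
import json
import os
import urllib.request

# Assumption: base URL is not given on this page; the path is POST /v1/vision.
VISION_URL = "https://api.writer.com/v1/vision"


def build_vision_request(file_id, api_key):
    """Build (but do not send) the request that extracts text from one image."""
    body = {
        "model": "palmyra-vision",
        "prompt": "Extract the text from the image {{receipt}}.",
        "variables": [{"name": "receipt", "file_id": file_id}],
    }
    return urllib.request.Request(
        VISION_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: Bearer auth
            "Content-Type": "application/json",
        },
    )


def extract_text(file_id):
    """Send the request and return the analysis string from the data field."""
    request = build_vision_request(file_id, os.environ["WRITER_API_KEY"])
    with urllib.request.urlopen(request) as response:
        return json.load(response)["data"]
```

Pass the File ID returned by the upload step to extract_text to get the extracted text as a string.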
The response is a JSON object with a data field that contains the analysis results as a string.