Writer Palmyra LLM is designed to support over 30 languages, including Arabic, French, Spanish, Hindi, Simplified Chinese, Traditional Chinese, and more. This page provides an overview of our capabilities, performance benchmarks, and prompting examples on how to leverage these features. When it comes to multi-language capabilities, there are two primary categories to consider: generation and translation. Generation typically refers to the ability to understand/create content, answer questions, and converse, all within the same language. Translation typically refers to the ability to transform text to and from English, where either the input or output language is English. On this page, we display two of the many benchmarks we use to evaluate multi-language performance in our Palmyra LLMs. Writer Palmyra has the highest performance of any production LLM in the Holistic Evaluation of Language Models (HELM), an LLM evaluation framework developed by Stanford CRFM to serve as a living benchmark for the community, continuously updated with new scenarios, metrics, and models. While there are limited benchmarks available for evaluating text generation and translation in different languages, we have achieved some of the highest scores in both MMLU and BLEU for other languages. One benchmark that Writer uses to evaluate text generation performance is MMLU (Massive Multitask Language Understanding). The MLMM evaluation covers 57 tasks including elementary mathematics, U.S. history, computer science, law, and more. To attain high accuracy on this test, models must possess extensive world knowledge and problem solving ability. One benchmark that Writer uses to evaluate text translation performance is BLEU (Bilingual Evaluation Understudy). It’s worth noting that any BLEU score above 60 indicates a higher quality translation than a human translation. While Palmyra’s core competency lies in the text generation realm, translation use cases are possible. However, it’s important to exercise caution in languages where benchmarks are not yet established (we are actively working on establishing these benchmarks). We believe in transparency and advise potential users to be aware of this caveat. Therefore, any outputs or usage of Writer LLM should always be accompanied by the guidance of a human expert. We are continuously evaluating and refining our capabilities, and we are committed to learning with our customers.

Language	MMLU/MLMM	BLEU (source \ English)
Arabic	68.9	61.2
Bengali	63.3	54.4
Bulgarian	76.3	64.2
Chinese simplified	71.7	63.8
Chinese traditional	73.7	57.0
Croatian	64.9	66.4
Czech	-	52.5
Danish	77.7	70.5
Dutch	73.6	73.9
English	70.2	-
Finnish	-	68.9
French	69.1	63.1
German	70.4	71.3
Greek	-	60.4
Hebrew	-	67.8
Hindi	77.9	68.4
Hungarian	67.7	65.3
Indonesian	67.8	63.5
Italian	72.5	70.9
Japanese	73.5	66.8
Korean	-	56.8
Lithuanian	-	59.3
Polish	-	60.6
Portuguese	-	66.2
Romanian	70.9	67.6
Russian	75.1	65.2
Spanish	72.5	79.3
Swahili	-	62.8
Swedish	-	63.2
Thai	-	54.7
Turkish	64.1	57.5
Ukrainian	75.2	68.0
Vietnamese	72.5	60.3

Dialect support

Writer Palmyra LLM also supports outputting in specific language dialects. The best results come from using a prompt with the following characteristics:

The prompt itself is in the desired language and dialect
The prompt clearly describes the type of dialect (e.g. “It’s essential that you use the Spanish spoken in Spain.”)
The prompt provides specific examples of the dialect, both vocabulary and grammatical differences

The following example, although not in the desired language for simplicity’s sake, is an example of an optimal prompt that asks for a translation in Spanish spoken in Spain.

Hello, good afternoon! I need you to help me translate the following text. It’s essential that you use the Spanish spoken in Spain. For example, you should use words like “coche” and/or “patata” instead of “carro” and/or “pap.” Additionally, you need to pay attention to grammatical differences, such as the use of “voy a por” (Spain) instead of “voy por” (Latin America), or the structure of sentences like “hoy he comido una manzana” instead of “hoy comí una manzana.” I prefer that you use “vosotros” (speak) instead of “ustedes” (speak), unless it’s necessary to write very formally. Here is the text to be translated:
[text you want translated]

Basic prompt examples

Translation

Read the content of this source. Provide me with a translation of all its contents in French: https://writer.com/blog/ai-guardrails/

Text generation

Please write a blog post about the importance of productivity for small businesses in Arabic.

Native multi-language support

人工知能の歴史と大規模言語モデルの開発について、短い段落を書いてください。読者はビジネステクノロジーニュースに興味がありますが、技術的なバックグラウンドはありません。技術的な概念を8年生の読解レベルで簡潔に説明してください。

Getting started

Core concepts

Models and pricing

Chat completions

No-code agents

Knowledge Graphs

Tool calling

Additional capabilities

Integrations

Supervise

Security and compliance

Resources

Language support

Dialect support

Basic prompt examples

Translation

Text generation

Native multi-language support

Getting started

Core concepts

Models and pricing

Chat completions

No-code agents

Knowledge Graphs

Tool calling

Additional capabilities

Integrations

Supervise

Security and compliance

Resources

​Dialect support

​Basic prompt examples

​Translation

​Text generation

​Native multi-language support

Dialect support

Basic prompt examples

Translation

Text generation

Native multi-language support