Authorization: Bearer YOUR_API_KEY
engine_id
to operate. The
following engines are currently available:
mistral_7B
: Mistral 7B
is a 7 billion parameter language model with an 8K context length
outperforming Llama2 13B on many tests.
llama3_8B
: Llama3 8B
is an 8 billion parameter language model with an 8K context length
trained on 15T tokens. There are
specific use
restrictions associated with this model.
llama3.1_8B_instruct
: Llama3.1
8B Instruct is an 8 billion parameter chat model. The context
length is currently limited to 8K. There are
specific use
restrictions associated with this model.
mixtral_47B_instruct
: Mixtral
47B Instruct is a 47 billion parameter language model with a 32K context length. It was fine-tuned for chat.
llama3.3_70B_instruct
: Llama3.3
70B Instruct is a 70 billion parameter chat model. The
context length is currently limited to 8K. There are
specific use
restrictions associated with this model.
gptj_6B
: GPT-J
is a 6 billion parameter language model with a 2K context length
trained on the Pile (825
GB of text data) published
by EleutherAI.
madlad400_7B
: MADLAD400 7B is a 7 billion parameter language model specialized for translation. It supports multilingual translation between about 400 languages. See the translate endpoint.
stable_diffusion
: Stable Diffusion is a 1 billion parameter text to image model trained to generate 512x512 pixel images from English text (sd-v1-4.ckpt checkpoint). See the text_to_image endpoint. There are specific use restrictions associated with this model.
whisper_large_v3
: Whisper Large v3 is a 1.5 billion parameter model for speech to text transcription in 100 languages. See the transcript endpoint.
parler_tts_large
: Parler-TTS v1 is a 2.2 billion parameter model for Text to Speech in English. See the speech endpoint.
POST https://api.textsynth.com/v1/engines/{engine_id}/completions
where engine_id is the selected engine.
prompt
: string or array of string.
The input text(s) to complete.
max_tokens
: optional int (default = 100)
Maximum number of tokens to generate. A token represents about 4 characters for English texts. The total number of tokens (prompt + generated text) cannot exceed the model's maximum context length. See the model list to know their maximum context length.
If the prompt length is larger than the model's maximum context length, the beginning of the prompt is discarded.
stream
: optional boolean (default = false)
If true, the output is streamed so that it is possible to display the result before the complete output is generated. Several JSON answers are output. Each answer is followed by two line feed characters.
stop
: optional string or array of string (default = null)
Stop the generation when the string(s) are encountered. The generated text does not contain the string. The length of the array is at most 5.
n
: optional integer (range: 1 to 16, default = 1)
Generate n completions from a single prompt.
temperature
: optional number (default = 1)
Sampling temperature. A higher temperature means the model will select less common tokens, leading to a larger diversity but potentially less relevant output. It is usually better to tune top_p or top_k.
top_k
: optional integer (range: 1 to 1000, default = 40)
Select the next output token among the top_k most likely ones. A higher top_k gives more diversity but a potentially less relevant output.
top_p
: optional number (range: 0 to 1, default = 0.9)
Select the next output token among the most probable ones so that their cumulative probability is larger than top_p. A higher top_p gives more diversity but a potentially less relevant output. top_p and top_k are combined, meaning that at most top_k tokens are selected. A value of 1 disables this sampling.
seed
: optional integer (default = 0).
Random number seed. A non zero seed always yields the same completions. It is useful to get deterministic results and try different sets of parameters.
logit_bias
: optional object (default = {})
Modify the likelihood of the specified tokens in the completion. The specified object is a map between the token indexes and the corresponding logit bias. A negative bias reduces the likelihood of the corresponding token. The bias must be between -100 and 100. Note that the token indexes are specific to the selected model. You can use the tokenize API endpoint to retrieve the token indexes of a given model.
Example: if you want to ban the " unicorn" token for GPT-J, you can use:
logit_bias: { "44986": -100 }
presence_penalty
: optional number (range: -2 to 2, default = 0)
A positive value penalizes tokens which already appeared in the generated text. Hence it forces the model to have a more diverse output.
frequency_penalty
: optional number (range: -2 to 2, default = 0)
A positive value penalizes tokens which already appeared in the generated text proportionally to their frequency. Hence it forces the model to have a more diverse output.
repetition_penalty
: optional number (default = 1)
Divide by repetition_penalty the logits corresponding to tokens which already appeared in the generated text. A value of 1 effectively disables it. See this article for more details.
typical_p
: optional number (range: 0 to 1, default = 1)
Alternative to top_p sampling: instead of selecting the tokens starting from the most probable one, start from the ones whose log likelihood is the closest to the symbol entropy. As with top_p, at most top_k tokens are selected. A value of 1 disables this sampling. See this article for more details.
grammar
: optional string
Specify a grammar that the completion should match. More information about the grammar syntax is available in section 1.3.1.
schema
: optional object
Specify a JSON schema that the completion should match. Only a subset of the JSON schema specification is supported, as defined in section 1.3.2. grammar and schema cannot both be present.
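As a rough illustration of how top_k and top_p combine, here is a minimal Python sketch. It is not the server's implementation: the function name filter_candidates and the toy probabilities are assumptions made for this example, and the cut-off is taken as "cumulative probability reaches top_p".

```python
def filter_candidates(probs, top_k=40, top_p=0.9):
    """Return the token indexes kept by a combined top_k / top_p filter.

    Illustrative only: `probs` is a made-up list of next-token probabilities.
    """
    # Rank token indexes from the most to the least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    # At most top_k tokens are considered.
    for index in order[:top_k]:
        kept.append(index)
        cumulative += probs[index]
        # Stop once the kept tokens cover a probability mass of top_p.
        if cumulative >= top_p:
            break
    return kept


toy_probs = [0.125, 0.5, 0.125, 0.25]
print(filter_candidates(toy_probs, top_k=40, top_p=0.8))
print(filter_candidates(toy_probs, top_k=2, top_p=0.8))
```

With top_p = 1 the loop only stops at the top_k limit, which matches the statement that a value of 1 disables this sampling.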
text
: string or array of string
It is the completed text. If the n parameter is larger than 1 or if an array of string was provided as prompt, an array of strings is returned.
reached_end
: boolean
If true, indicate that it is the last answer. It is only useful in case of streaming output (stream = true in the request).
truncated_prompt
: bool (default = false)
If true, indicate that the prompt was truncated because it was too large compared to the model's maximum context length. Only the end of the prompt is used to generate the completion.
finish_reason
: string or array of string
Indicate the reason why the generation was finished. An array of string is returned if text is an array. Possible values: "stop" (end-of-sequence token reached), "length" (the maximum specified length was reached), "grammar" (no suitable token satisfies the specified grammar or stack overflow when evaluating the grammar).
input_tokens
: integer
Indicate the number of input tokens. It is useful to estimate the number of compute resources used by the request.
output_tokens
: integer
Indicate the total number of generated tokens. It is useful to estimate the number of compute resources used by the request.
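When stream is true, the body is a sequence of JSON answers, each followed by two line feed characters. A minimal sketch of splitting such a body in Python could look like this. The function name parse_stream and the sample body are made up for illustration, and the sketch assumes the whole body has already been read into a string.

```python
import json


def parse_stream(raw):
    """Split a streamed response body into a list of decoded JSON answers."""
    answers = []
    # Each JSON answer is followed by two line feed characters.
    for chunk in raw.split("\n\n"):
        if chunk.strip():
            answers.append(json.loads(chunk))
    return answers


# Made-up body for illustration only.
sample = (
    '{"text": " a woman", "reached_end": false}\n\n'
    '{"text": " who", "reached_end": true, "input_tokens": 7, "output_tokens": 3}\n\n'
)
answers = parse_stream(sample)
for answer in answers:
    print(answer["reached_end"], repr(answer["text"]))
```

The reached_end property of the last answer tells the client that no further answer will follow.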
curl https://api.textsynth.com/v1/engines/gptj_6B/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -d '{"prompt": "Once upon a time, there was", "max_tokens": 20 }'
Answer:
{ "text": " a woman who loved to get her hands on a good book. She loved to read and to tell", "reached_end": true, "input_tokens": 7, "output_tokens": 20 }
Python example: completion.py
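Independently of completion.py (not reproduced here), the same request can be sketched with the Python standard library. The helper names build_completion_request and complete are assumptions of this example; only the URL, headers and JSON properties come from the description above.

```python
import json
import urllib.request

API_URL = "https://api.textsynth.com/v1/engines/{engine_id}/completions"


def build_completion_request(api_key, engine_id, prompt, **params):
    """Build (without sending) a POST request for the completions endpoint."""
    body = {"prompt": prompt, **params}
    return urllib.request.Request(
        API_URL.format(engine_id=engine_id),
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def complete(api_key, engine_id, prompt, **params):
    """Send the request and return the decoded JSON answer (network call)."""
    req = build_completion_request(api_key, engine_id, prompt, **params)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Building the request does not contact the server.
req = build_completion_request(
    "YOUR_API_KEY", "gptj_6B", "Once upon a time, there was", max_tokens=20
)
print(req.full_url)
```

Calling complete("YOUR_API_KEY", "gptj_6B", "Once upon a time, there was", max_tokens=20) would then return a dictionary with the text, reached_end, input_tokens and output_tokens properties.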
A Backus-Naur Form (BNF) grammar can be used to constrain the generated output.
The grammar definition consists of production rules defining how non-terminals can be replaced by other non-terminals or terminals (characters). The special root non-terminal represents the whole output.
Here is an example of grammar matching the JSON syntax:
# BNF grammar to parse JSON objects
root ::= ws object
value ::= object | array | string | number | ("true" | "false" | "null")
object ::= "{" ws ( string ":" ws value ws ("," ws string ":" ws value ws )* )? "}"
array ::= "[" ws ( value ws ("," ws value ws )* )? "]"
string ::= "\"" (
    [^"\\] |
    "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) # escapes
  )* "\""
number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)?
# whitespace
ws ::= ([ \t\n] ws)?
A production rule has the syntax:
value ::= object | array | "null"
where value is the non-terminal name. A newline terminates the rule definition. Alternatives are indicated with | between sequences of terms. Newlines are interpreted as whitespace inside parentheses or after |.
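A term is either:
a non-terminal name;
a double-quoted string, in which characters can also be written as \xNN, \uNNNN or \UNNNNNNNN;
parentheses (...) to embed alternatives;
a character list ([...]) or excluded character list ([^...]) like in regular expressions.
A term can be followed by:
* to repeat the term 0 or more times;
+ to repeat the term 1 or more times;
? to repeat the term 0 or 1 time.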
Comments are introduced with the #
character.
Grammar restriction: left recursion is not supported, as in this rule:
expr ::= [0-9]+ | expr "+" expr
Fortunately it is always possible to transform left recursion into right recursion by adding more non-terminals:
expr ::= number | number "+" expr
number ::= [0-9]+
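To show how a grammar travels in a request, here is a small Python sketch that builds a completions body with a grammar restricting the output to "yes" or "no". The grammar, the prompt and the helper name constrained_payload are hypothetical; the check mirrors the rule that grammar and schema cannot both be present.

```python
import json

# Hypothetical grammar: the whole output ("root") must be "yes" or "no".
YES_NO_GRAMMAR = 'root ::= "yes" | "no"'


def constrained_payload(prompt, grammar=None, schema=None, **params):
    """Build a completions request body with an optional grammar or schema."""
    if grammar is not None and schema is not None:
        raise ValueError("grammar and schema cannot both be present")
    body = {"prompt": prompt, **params}
    if grammar is not None:
        body["grammar"] = grammar
    if schema is not None:
        body["schema"] = schema
    return body


payload = constrained_payload(
    "Is Paris the capital of France? Answer: ",
    grammar=YES_NO_GRAMMAR,
    max_tokens=5,
)
print(json.dumps(payload))
```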
A JSON schema can be used to constrain the generated output. It is recommended to also include it in your prompt so that the language model knows the JSON format which is expected in its reply.
Here is an example of supported JSON schema:
{
  "type": "object",
  "properties": {
    "id": { "type": "string" },
    "name": { "type": "string" },
    "age": { "type": "integer", "minimum": 16, "maximum": 150 },
    "phone_numbers": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "number": { "type": "string" },
          "type": { "type": "string", "enum": ["mobile", "home"] }
        },
        "required": ["number", "type"] /* at least one property must be required */
      },
      "minItems": 1 /* only 0 or 1 are supported, default = 0 */
    },
    "hobbies": {
      "type": "array",
      "items": { "type": "string" }
    }
  },
  "required": ["id", "name", "age"]
}
The following types are supported:
object
: The required parameter must be present with at least one property in it.
array
: The minimum number of elements may be constrained with the optional minItems parameter. Only the values 0 or 1 are supported.
string
: The optional enum parameter indicates the allowed values.
integer
: The optional minimum and maximum parameters may be present to restrict the range. The maximum range is -2147483648 to 2147483647.
number
: floating point numbers.
boolean
: true
or false
values.
null
: the null
value.
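Because only a subset of JSON schema is accepted, it can help to check a schema locally before sending it. The following Python sketch encodes the restrictions listed above. The function check_schema is hypothetical, it is not part of the API, and the server may apply further rules that are not covered here.

```python
INT_MIN, INT_MAX = -2147483648, 2147483647


def check_schema(schema, path="root"):
    """Return a list of problems found against the documented schema subset."""
    problems = []
    kind = schema.get("type")
    if kind == "object":
        # The required parameter must list at least one property.
        if not schema.get("required"):
            problems.append(path + ": 'required' must list at least one property")
        for name, sub in schema.get("properties", {}).items():
            problems += check_schema(sub, path + "." + name)
    elif kind == "array":
        # Only the values 0 or 1 are supported for minItems.
        if schema.get("minItems", 0) not in (0, 1):
            problems.append(path + ": 'minItems' must be 0 or 1")
        if "items" in schema:
            problems += check_schema(schema["items"], path + "[]")
    elif kind == "integer":
        low = schema.get("minimum", INT_MIN)
        high = schema.get("maximum", INT_MAX)
        if low < INT_MIN or high > INT_MAX:
            problems.append(path + ": range must fit in -2147483648..2147483647")
    elif kind not in ("string", "number", "boolean", "null"):
        problems.append(path + ": unsupported type " + repr(kind))
    return problems


bad = {
    "type": "object",
    "properties": {"tags": {"type": "array", "minItems": 2, "items": {"type": "string"}}},
}
for problem in check_schema(bad):
    print(problem)
```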
This endpoint provides completions for chat applications. The prompt is automatically formatted according to the model's preferred chat prompt template.
The API syntax is:
POST https://api.textsynth.com/v1/engines/{engine_id}/chat
where engine_id is the selected engine. The API is identical to the completions endpoint except that the prompt property is removed and replaced by:
messages
: array of strings.
The conversation history. At least one element must be present. If the number of elements is odd, the model generates the response of the assistant. Otherwise, it completes it.
system
: optional string.
Override the default model system prompt, which gives general advice to the model.
curl https://api.textsynth.com/v1/engines/falcon_40B-chat/chat \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -d '{"messages": ["What is the translation of hello in French ?"]}'
Answer:
{ "text": " \"Bonjour\" is the correct translation for \"hello\" in French. It is commonly used as a greeting in both formal and informal settings. \"Bonjour\" can be used when addressing a single person, a group of people, or even when answering the phone.", "reached_end": true, "input_tokens": 45, "output_tokens": 56 }
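The messages array and its odd/even rule can be sketched in Python as follows. The helper names chat_payload and next_action are assumptions of this example, and the second sample history is made up.

```python
def chat_payload(messages, system=None, **params):
    """Build a chat request body from the conversation history."""
    if not messages:
        raise ValueError("at least one message must be present")
    body = {"messages": list(messages), **params}
    if system is not None:
        body["system"] = system
    return body


def next_action(messages):
    """Describe what the model does for a given history length."""
    # Odd number of elements: the model generates the assistant's response.
    # Even number of elements: the model completes the last element.
    return "generate" if len(messages) % 2 == 1 else "complete"


history = ["What is the translation of hello in French ?"]
print(next_action(history), chat_payload(history, max_tokens=100))

# Made-up partial answer that the model would be asked to complete.
partial = history + ["The translation is"]
print(next_action(partial))
```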
POST https://api.textsynth.com/v1/engines/{engine_id}/translate
where engine_id is the selected engine.
text
: array of strings.
Each string is an independent text to translate. Batches of at most 64 texts can be provided.
source_lang
: string.
Two or three
character ISO
language code for the source language. The special
value "auto"
indicates to auto-detect the source
language. The language auto-detection does not support all
languages and is based on heuristics. Hence if you know the
source language you should explicitly indicate it.
The madlad400_7B
model supports the following languages:
Code | Language | Code | Language | Code | Language | Code | Language |
---|---|---|---|---|---|---|---|
ace | Achinese | ada | Adangme | adh | Adhola | ady | Adyghe |
af | Afrikaans | agr | Aguaruna | msm | Agusan Manobo | ahk | Akha |
sq | Albanian | alz | Alur | abt | Ambulas | am | Amharic |
grc | Ancient Greek | ar | Arabic | hy | Armenian | frp | Arpitan |
as | Assamese | av | Avar | kwi | Awa-Cuaiquer | awa | Awadhi |
quy | Ayacucho Quechua | ay | Aymara | az | Azerbaijani | ban | Balinese |
bm | Bambara | bci | Baoulé | bas | Basa (Cameroon) | ba | Bashkir |
eu | Basque | akb | Batak Angkola | btx | Batak Karo | bts | Batak Simalungun |
bbc | Batak Toba | be | Belarusian | bzj | Belize Kriol English | bn | Bengali |
bew | Betawi | bho | Bhojpuri | bim | Bimoba | bi | Bislama |
brx | Bodo (India) | bqc | Boko (Benin) | bus | Bokobaru | bs | Bosnian |
br | Breton | ape | Bukiyip | bg | Bulgarian | bum | Bulu |
my | Burmese | bua | Buryat | qvc | Cajamarca Quechua | jvn | Caribbean Javanese |
rmc | Carpathian Romani | ca | Catalan | qxr | Cañar H. Quichua | ceb | Cebuano |
bik | Central Bikol | maz | Central Mazahua | ch | Chamorro | cbk | Chavacano |
ce | Chechen | chr | Cherokee | hne | Chhattisgarhi | ny | Chichewa |
zh | Chinese (Simplified) | ctu | Chol | cce | Chopi | cac | Chuj |
chk | Chuukese | cv | Chuvash | kw | Cornish | co | Corsican |
crh | Crimean Tatar | hr | Croatian | cs | Czech | mps | Dadibi |
da | Danish | dwr | Dawro | dv | Dhivehi | din | Dinka |
tbz | Ditammari | dov | Dombe | nl | Dutch | dyu | Dyula |
dz | Dzongkha | bgp | E. Baluchi | gui | E. Bolivian Guaraní | bru | E. Bru |
nhe | E. Huasteca Nahuatl | djk | E. Maroon Creole | taj | E. Tamang | enq | Enga |
en | English | sja | Epena | myv | Erzya | eo | Esperanto |
et | Estonian | ee | Ewe | cfm | Falam Chin | fo | Faroese |
hif | Fiji Hindi | fj | Fijian | fil | Filipino | fi | Finnish |
fip | Fipa | fon | Fon | fr | French | ff | Fulah |
gag | Gagauz | gl | Galician | gbm | Garhwali | cab | Garifuna |
ka | Georgian | de | German | gom | Goan Konkani | gof | Gofa |
gor | Gorontalo | el | Greek | guh | Guahibo | gub | Guajajára |
gn | Guarani | amu | Guerrero Amuzgo | ngu | Guerrero Nahuatl | gu | Gujarati |
gvl | Gulay | ht | Haitian Creole | cnh | Hakha Chin | ha | Hausa |
haw | Hawaiian | he | Hebrew | hil | Hiligaynon | mrj | Hill Mari |
hi | Hindi | ho | Hiri Motu | hmn | Hmong | qub | Huallaga Huánuco Quechua |
hus | Huastec | hui | Huli | hu | Hungarian | iba | Iban |
ibb | Ibibio | is | Icelandic | ig | Igbo | ilo | Ilocano |
qvi | Imbabura H. Quichua | id | Indonesian | inb | Inga | iu | Inuktitut |
ga | Irish | iso | Isoko | it | Italian | ium | Iu Mien |
izz | Izii | jam | Jamaican Creole English | ja | Japanese | jv | Javanese |
kbd | Kabardian | kbp | Kabiyè | kac | Kachin | dtp | Kadazan Dusun |
kl | Kalaallisut | xal | Kalmyk | kn | Kannada | cak | Kaqchikel |
kaa | Kara-Kalpak | kaa_Latn | Kara-Kalpak (Latn) | krc | Karachay-Balkar | ks | Kashmiri |
kk | Kazakh | meo | Kedah Malay | kek | Kekchí | ify | Keley-I Kallahan |
kjh | Khakas | kha | Khasi | km | Khmer | kjg | Khmu |
kmb | Kimbundu | rw | Kinyarwanda | ktu | Kituba (DRC) | tlh | Klingon |
trp | Kok Borok | kv | Komi | koi | Komi-Permyak | kg | Kongo |
ko | Korean | kos | Kosraean | kri | Krio | ksd | Kuanua |
kj | Kuanyama | kum | Kumyk | mkn | Kupang Malay | ku | Kurdish (Kurmanji) |
ckb | Kurdish (Sorani) | ky | Kyrghyz | quc | K’iche’ | lhu | Lahu |
quf | Lambayeque Quechua | laj | Lango (Uganda) | lo | Lao | ltg | Latgalian |
la | Latin | lv | Latvian | ln | Lingala | lt | Lithuanian |
lu | Luba-Katanga | lg | Luganda | lb | Luxembourgish | ffm | Maasina Fulfulde |
mk | Macedonian | mad | Madurese | mag | Magahi | mai | Maithili |
mak | Makasar | mgh | Makhuwa-Meetto | mg | Malagasy | ms | Malay |
ml | Malayalam | mt | Maltese | mam | Mam | mqy | Manggarai |
gv | Manx | mi | Maori | arn | Mapudungun | mrw | Maranao |
mr | Marathi | mh | Marshallese | mas | Masai | msb | Masbatenyo |
mbt | Matigsalug Manobo | chm | Meadow Mari | mni | Meiteilon (Manipuri) | min | Minangkabau |
lus | Mizo | mdf | Moksha | mn | Mongolian | mfe | Morisien |
meu | Motu | tuc | Mutu | miq | Mískito | emp | N. Emberá |
lrc | N. Luri | qvz | N. Pastaza Quichua | se | N. Sami | nnb | Nande |
niq | Nandi | nv | Navajo | ne | Nepali | new | Newari |
nij | Ngaju | gym | Ngäbere | nia | Nias | nog | Nogai |
no | Norwegian | nut | Nung (Viet Nam) | nyu | Nyungwe | nzi | Nzima |
ann | Obolo | oc | Occitan | or | Odia (Oriya) | oj | Ojibwa |
ang | Old English | om | Oromo | os | Ossetian | pck | Paite Chin |
pau | Palauan | pag | Pangasinan | pa | Panjabi | pap | Papiamento |
ps | Pashto | fa | Persian | pis | Pijin | pon | Pohnpeian |
pl | Polish | jac | Popti’ | pt | Portuguese | qu | Quechua |
otq | Querétaro Otomi | raj | Rajasthani | rki | Rakhine | rwo | Rawa |
rom | Romani | ro | Romanian | rm | Romansh | rn | Rundi |
ru | Russian | rcf | Réunion Creole French | alt | S. Altai | quh | S. Bolivian Quechua |
qup | S. Pastaza Quechua | msi | Sabah Malay | hvn | Sabu | sm | Samoan |
cuk | San Blas Kuna | sxn | Sangir | sg | Sango | sa | Sanskrit |
skr | Saraiki | srm | Saramaccan | stq | Saterfriesisch | gd | Scottish Gaelic |
seh | Sena | nso | Sepedi | sr | Serbian | crs | Seselwa Creole French |
st | Sesotho | shn | Shan | shp | Shipibo-Conibo | sn | Shona |
jiv | Shuar | smt | Simte | sd | Sindhi | si | Sinhala |
sk | Slovak | sl | Slovenian | so | Somali | nr | South Ndebele |
es | Spanish | srn | Sranan Tongo | acf | St Lucian Creole French | su | Sundanese |
suz | Sunwar | spp | Supyire Senoufo | sus | Susu | sw | Swahili |
ss | Swati | sv | Swedish | gsw | Swiss German | syr | Syriac |
ksw | S’gaw Karen | tab | Tabassaran | tg | Tajik | tks | Takestani |
ber | Tamazight (Tfng) | ta | Tamil | tdx | Tandroy-Mahafaly Malagasy | tt | Tatar |
tsg | Tausug | te | Telugu | twu | Termanu | teo | Teso |
tll | Tetela | tet | Tetum | th | Thai | bo | Tibetan |
tca | Ticuna | ti | Tigrinya | tiv | Tiv | toj | Tojolabal |
to | Tonga (Tonga Islands) | sda | Toraja-Sa’dan | ts | Tsonga | tsc | Tswa |
tn | Tswana | tcy | Tulu | tr | Turkish | tk | Turkmen |
tvl | Tuvalu | tyv | Tuvinian | ak | Twi | tzh | Tzeltal |
tzo | Tzotzil | tzj | Tz’utujil | tyz | Tày | udm | Udmurt |
uk | Ukrainian | ppk | Uma | ubu | Umbu-Ungu | ur | Urdu |
ug | Uyghur | uz | Uzbek | ve | Venda | vec | Venetian |
vi | Vietnamese | knj | W. Kanjobal | wa | Walloon | war | Waray (Philippines) |
guc | Wayuu | cy | Welsh | fy | Western Frisian | wal | Wolaytta |
wo | Wolof | noa | Woun Meu | xh | Xhosa | sah | Yakut |
yap | Yapese | yi | Yiddish | yo | Yoruba | yua | Yucateco |
zne | Zande | zap | Zapotec | dje | Zarma | zza | Zaza |
zu | Zulu |
target_lang
: string.
Two or three character ISO language code for the target language.
num_beams
: integer (range: 1 to 5, default = 4).
Number of beams used to generate the translated text. The translation is usually better with a larger number of beams. Each beam requires generating a separate translated text, hence the number of generated tokens is multiplied by the number of beams.
split_sentences
: optional boolean (default = true).
The translation model only translates one sentence at a
time. Hence the input must be split into sentences. When
split_sentences = true (default), each input text is
automatically split into sentences using source language
specific heuristics.
If you are sure that each input text contains
only one sentence, it is better to disable the automatic
sentence splitting.
translations
: array of objects.
Each object has the following properties:
text
: string
Translated text
detected_source_lang
: string
ISO language code corresponding to the detected lang (identical to source_lang
if language auto-detection is not enabled)
input_tokens
: integer
Indicate the total number of input tokens. It is useful to estimate the number of compute resources used by the request.
output_tokens
: integer
Indicate the total number of generated tokens. It is useful to estimate the number of compute resources used by the request.
curl https://api.textsynth.com/v1/engines/m2m100_1_2B/translate \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -d '{"text": ["The quick brown fox jumps over the lazy dog."], "source_lang": "en", "target_lang": "fr" }'
Answer:
{ "translations": [{"detected_source_lang":"en","text":"Le renard brun rapide saute sur le chien paresseux."}], "input_tokens": 18, "output_tokens": 85 }
Python example: translate.py
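Since a request accepts at most 64 texts, longer lists have to be split into several requests. A minimal Python sketch of that batching is shown below. The helper name translation_payloads is hypothetical and translate.py itself is not reproduced here.

```python
MAX_BATCH = 64  # Batches of at most 64 texts can be provided.


def translation_payloads(texts, source_lang, target_lang, **params):
    """Split texts into request bodies of at most MAX_BATCH texts each."""
    payloads = []
    for start in range(0, len(texts), MAX_BATCH):
        payloads.append(
            {
                "text": list(texts[start:start + MAX_BATCH]),
                "source_lang": source_lang,
                "target_lang": target_lang,
                **params,
            }
        )
    return payloads


# 130 made-up sentences give three requests of 64, 64 and 2 texts.
sentences = ["Sentence number %d." % i for i in range(130)]
batches = translation_payloads(sentences, "en", "fr", split_sentences=False)
print([len(batch["text"]) for batch in batches])
```

The sample disables split_sentences because each made-up text holds a single sentence, which is the case where the description recommends turning the automatic splitting off.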
continuation
is generated after
a context
. It can be used to answer questions when
only a few answers (such as yes/no) are possible. It can also be
used to benchmark the models.
The API syntax to get the log probabilities is:
POST https://api.textsynth.com/v1/engines/{engine_id}/logprob
where engine_id is the selected engine.
context
: string or array of string.
If empty string, the context is set to the End-Of-Text token.
continuation
: string or array of string.
Must be a non empty string. If an array is provided, it must have the same number of elements as context
.
logprob
: double or array of double
Logarithm of the probability of generation of continuation preceded by context. It corresponds to the sum of the logarithms of the probabilities of the tokens of continuation. It is always <= 0. An array is returned if context was an array.
num_tokens
: integer or array of integer
Number of tokens in continuation
. An array is
returned if context
was an array.
is_greedy
: boolean or array of boolean
true if continuation would be generated by greedy sampling from context. An array is returned if context was an array.
input_tokens
: integer
Indicate the total number of input tokens. It is useful to estimate the number of compute resources used by the request.
curl https://api.textsynth.com/v1/engines/gptj_6B/logprob \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -d '{"context": "The quick brown fox jumps over the lazy", "continuation": " dog"}'
Answer:
{ "logprob": -0.0494835916522837, "is_greedy": true, "input_tokens": 9 }
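One way to use this endpoint for yes/no questions is to score each candidate answer after the same context and keep the most likely one. The Python sketch below shows the request body and the comparison. The helper names, the question and the second log probability are made up for illustration.

```python
import math


def yes_no_payload(context, answers):
    """Build a logprob request body scoring each answer after the same context."""
    # context and continuation must have the same number of elements.
    return {"context": [context] * len(answers), "continuation": list(answers)}


def best_answer(answers, logprobs):
    """Return (answer, probability) for the most likely continuation."""
    answer, logprob = max(zip(answers, logprobs), key=lambda pair: pair[1])
    # logprob is a natural-log probability, so exp() recovers the probability.
    return answer, math.exp(logprob)


answers = [" yes", " no"]
payload = yes_no_payload("Is the sky blue? Answer:", answers)
# Made-up values standing in for the "logprob" array of an answer.
choice, prob = best_answer(answers, [-0.0494835916522837, -3.2])
print(choice, round(prob, 4))
```

The sketch assumes the returned logarithm is the natural logarithm, which the description above does not state explicitly.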
POST https://api.textsynth.com/v1/engines/{engine_id}/tokenize
where engine_id is the selected engine.
text
: string.
Input text.
token_content_type
: optional string (default = "none").
If set to "base64", also output the content of each token encoded as a base64 string. Note: tokens do not necessarily contain full UTF-8 characters so it is not always possible to represent their content as a UTF-8 string.
tokens
: array of integers.
Token indexes corresponding to the input text.
token_content
: array of strings.
Base64 strings corresponding to the content of each token if token_content_type
was set to "base64".
curl https://api.textsynth.com/v1/engines/gptj_6B/tokenize \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -d '{"text": "The quick brown fox jumps over the lazy dog"}'
Answer:
{"tokens":[464,2068,7586,21831,18045,625,262,16931,3290]}
Note: the tokenize endpoint is free.
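When token_content_type is "base64", each entry of token_content has to be decoded to bytes, and only the concatenation of all tokens is guaranteed to be valid UTF-8. The Python sketch below illustrates this with a made-up split of "héllo" in the middle of the two-byte "é" character. It is not actual tokenizer output, and the helper name decode_tokens is an assumption.

```python
import base64


def decode_tokens(token_content):
    """Return (raw byte pieces, full text) from base64 encoded token contents."""
    pieces = [base64.b64decode(item) for item in token_content]
    # Individual pieces may hold partial UTF-8 characters, so decode the whole.
    return pieces, b"".join(pieces).decode("utf-8")


# Made-up token contents: "é" (0xC3 0xA9) is split across two tokens.
sample = [base64.b64encode(raw).decode("ascii") for raw in (b"h\xc3", b"\xa9llo")]
pieces, text = decode_tokens(sample)
print(pieces, text)
```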
POST https://api.textsynth.com/v1/engines/{engine_id}/text_to_image
where engine_id is the selected engine. Currently only stable_diffusion is supported.
prompt
: string.
The text prompt. Only the first 75 tokens are used.
image_count
: optional integer (default = 1).
Number of images to generate. At most 4 images can be generated with one request. The generation of an image takes about 2 seconds.
width
: optional integer (default = 512).height
: optional integer (default = 512).
Width and height in pixels of the generated images. The only accepted values are 384, 512, 640 and 768. The product width by height must be <= 393216 (hence a maximum size of 512x768 or 768x512). The model is trained with 512x512 images, so the best results are obtained with this size.
timesteps
: optional integer (default = 50).
Number of diffusion steps. Larger values usually give a better result but the image generation takes longer.
guidance_scale
: optional number (default = 7.5).
Guidance Scale. A larger value gives a larger importance to the text prompt with respect to a random image generation.
seed
: optional integer (default = 0).
Random number seed. A non zero seed always yields the same images. It is useful to get deterministic results and try different sets of parameters.
negative_prompt
: optional string (default = "").
Negative text prompt. It is useful to exclude specific items from the generated image. Only the first 75 tokens are used.
image
: optional string (default = none).
Optional base 64 encoded JPEG image serving as seed for the generated image. It must have the same width and height as the generated image.
strength
: optional number (default = 0.5, range 0 to 1).
When using an image as seed (see the image parameter), specifies the weighting between the noise and the image seed. The value 0 is equivalent to not using the image seed.
images
: array of objects.
Each object has the following property:
data
: string
Base64 encoded generated JPEG image.
curl https://api.textsynth.com/v1/engines/stable_diffusion/text_to_image \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -d '{"prompt": "an astronaut riding a horse" }'
Answer:
{ "images": [{"data":"..."}] }
Python example: sd.py
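Separately from sd.py (not reproduced here), the size constraints and the decoding of the returned images can be sketched in Python. The helper names valid_size and save_images and the output file names are assumptions of this example.

```python
import base64

ALLOWED_SIZES = (384, 512, 640, 768)
MAX_PIXELS = 393216  # width * height must not exceed this product


def valid_size(width, height):
    """Check width and height against the documented constraints."""
    return (
        width in ALLOWED_SIZES
        and height in ALLOWED_SIZES
        and width * height <= MAX_PIXELS
    )


def save_images(answer, prefix="image"):
    """Write each base64 encoded JPEG of an answer to a file and return the paths."""
    paths = []
    for index, image in enumerate(answer["images"]):
        path = "%s_%d.jpg" % (prefix, index)
        with open(path, "wb") as out:
            out.write(base64.b64decode(image["data"]))
        paths.append(path)
    return paths


print(valid_size(512, 768), valid_size(640, 640))
```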
This endpoint does speech to text transcription. The input consists of an audio file and optional parameters. The JSON output contains the text transcription with timestamps.
The API syntax is:
POST https://api.textsynth.com/v1/engines/{engine_id}/transcript
where engine_id is the selected engine. Currently only whisper_large_v3 is supported.
The content type of the posted data should be multipart/form-data. It should contain at least one file of name file with the audio file to transcribe. The supported file formats are: mp3, m4a, mp4, wav and opus. The maximum file size is 50 MBytes. The maximum supported duration is 2 hours.
Additional parameters may be provided either as form data or inside an additional file of name json containing JSON data.
The following additional parameters are supported:
language
: optional string (default = "auto").
The special value auto
indicates that the language is automatically detected on the first 30 seconds of audio. Otherwise it is an ISO language code. The following languages are available: af, am, ar, as, az, ba, be, bg, bn, bo, br, bs, ca, cs, cy, da, de, el, en, es, et, eu, fa, fi, fo, fr, gl, gu, ha, haw, he, hi, hr, ht, hu, hy, id, is, it, ja, jw, ka, kk, km, kn, ko, la, lb, ln, lo, lt, lv, mg, mi, mk, ml, mn, mr, ms, mt, my, ne, nl, nn, no, oc, pa, pl, ps, pt, ro, ru, sa, sd, si, sk, sl, sn, so, sq, sr, su, sv, sw, ta, te, tg, th, tk, tl, tr, tt, uk, ur, uz, vi, yi, yo, yue, zh.
text
: string.
Transcribed text.
segments
: array of objects.
Transcribed text segments with timestamps. Each segment has the following properties:
id
: integer.
Segment ID.
start
: float.
Start time in seconds.
end
: float.
End time in seconds.
text
: string.
Transcribed text for this segment.
language
: string.
ISO language code.
duration
: float.
Transcription duration in seconds
curl https://api.textsynth.com/v1/engines/whisper_large_v3/transcript \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -F language=en -F file=@input.mp3
where input.mp3 is the audio file to transcribe.
{ "text": "...", "segments": [...], ... }
Python example: transcript.py
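Once an answer has been received, the segments array can be turned into timestamped lines. The Python sketch below only post-processes a decoded answer. The helper names and the sample answer are made up, and transcript.py itself is not reproduced here.

```python
def format_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS.mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3600000)
    minutes, millis = divmod(millis, 60000)
    secs, millis = divmod(millis, 1000)
    return "%02d:%02d:%02d.%03d" % (hours, minutes, secs, millis)


def format_segments(answer):
    """Return one '[start -> end] text' line per transcription segment."""
    return [
        "[%s -> %s] %s"
        % (
            format_timestamp(segment["start"]),
            format_timestamp(segment["end"]),
            segment["text"].strip(),
        )
        for segment in answer["segments"]
    ]


# Made-up answer using the documented segment properties.
sample = {
    "text": " Hello world. How are you?",
    "segments": [
        {"id": 0, "start": 0.0, "end": 2.5, "text": " Hello world."},
        {"id": 1, "start": 2.5, "end": 3661.5, "text": " How are you?"},
    ],
}
lines = format_segments(sample)
print("\n".join(lines))
```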
This endpoint does text to speech output. The output is an MP3 stream containing the generated speech.
The API syntax is:
POST https://api.textsynth.com/v1/engines/{engine_id}/speech
where engine_id is the selected engine. Currently only parler_tts_large is supported. Only the English language is supported.
input
: string.
The input text. It must contain fewer than 4096 Unicode characters.
voice
: string.
Select the voice name. The following voices are available: Will, Eric, Laura, Alisa, Patrick, Rose, Jerry, Jordan, Lauren, Jenna, Karen, Rick, Bill, James, Yann, Emily, Anna, Jon, Brenda, Barbara.
seed
: optional integer (default = 0).
Random number seed. A non zero seed yields the same output for a given input text. It is useful to get deterministic results.
curl https://api.textsynth.com/v1/engines/parler_tts_large/speech \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -d '{"input": "Hello world.", "voice": "Will" }'
Answer: an MP3 stream containing the generated speech.
Python example: speech.py
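Apart from speech.py (not reproduced here), a request for this endpoint can be sketched with the Python standard library. The helper names are assumptions, and so is the Content-Type header, which the curl example above does not send. The input length and voice checks mirror the parameter descriptions.

```python
import json
import urllib.request

SPEECH_URL = "https://api.textsynth.com/v1/engines/parler_tts_large/speech"
VOICES = {
    "Will", "Eric", "Laura", "Alisa", "Patrick", "Rose", "Jerry", "Jordan",
    "Lauren", "Jenna", "Karen", "Rick", "Bill", "James", "Yann", "Emily",
    "Anna", "Jon", "Brenda", "Barbara",
}


def build_speech_request(api_key, text, voice, seed=0):
    """Build (without sending) a POST request for the speech endpoint."""
    if len(text) >= 4096:
        raise ValueError("input must contain fewer than 4096 characters")
    if voice not in VOICES:
        raise ValueError("unknown voice: " + voice)
    body = {"input": text, "voice": voice, "seed": seed}
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",  # assumption, see lead-in
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def save_speech(req, path):
    """Send the request and write the returned MP3 stream to a file."""
    with urllib.request.urlopen(req) as resp, open(path, "wb") as out:
        out.write(resp.read())


req = build_speech_request("YOUR_API_KEY", "Hello world.", "Will")
print(req.full_url)
```

Calling save_speech(req, "hello.mp3") would then perform the network request and store the audio.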
credits
: integer
Number of remaining credits multiplied by 1e9.
curl https://api.textsynth.com/v1/credits \
    -H "Authorization: Bearer YOUR_API_KEY"
Answer:
{"credits":123456789}
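Because the credits property is the number of remaining credits multiplied by 1e9, the value has to be scaled back before it is displayed. A minimal Python sketch, with a hypothetical helper name:

```python
def remaining_credits(answer):
    """Convert the raw 'credits' integer of an answer into credits."""
    # The API returns the number of remaining credits multiplied by 1e9.
    return answer["credits"] / 1e9


value = remaining_credits({"credits": 123456789})
print(value)
```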
In addition to pure text completion, you can tune your prompt (input text) so that the model solves a precise task such as:
For text to image, see the Stable Diffusion Prompt Book.
We present in this section the objective results of the various models on tasks from the Language Model Evaluation Harness. These results were computed using the TextSynth API so that they can be fully reproduced (patch: lm_evaluation_harness_textsynth.tar.gz).
Zero-shot performance:
Model | LAMBADA (acc) | Hellaswag (acc_norm) | Winogrande (acc) | PIQA (acc) | COQA (f1) | Average ↑ |
---|---|---|---|---|---|---|
llama3_8B | 75.2% | 78.2% | 73.5% | 78.8% | 80.4% | 77.2% |
mistral_7B | 74.9% | 80.1% | 73.9% | 80.7% | 80.3% | 78.0% |
Five-shot performance:
Model | MMLU (exact match) |
---|---|
llama3.3_70B_instruct | 81.9% |
mixtral_47B_instruct | 67.6% |
llama3.1_8B_instruct | 67.1% |
Note that these models have been trained with data which contains possible test set contamination. So not all these results might reflect the actual model performance.
parler_tts_large Text to Speech model.
llama3.3_70B_instruct and llama3.1_8B_instruct models were added. The llama3_8B_instruct model was removed and is redirected to llama3.1_8B_instruct. The llama2_70B model was removed and is redirected to llama3.3_70B_instruct.
completions and logprob endpoints. Automatic language detection is supported in the transcript endpoint. Transcription parameters can now be provided as form data without an additional JSON file.
llama3_8B and llama3_8B_instruct models were added. The mistral_7B_instruct model was removed and is redirected to llama3_8B_instruct.
transcript endpoint with the whisper_large_v3 model.
mixtral_47B_instruct and llama2_70B models were added. The m2m100_1_2B model was removed and is redirected to madlad400_7B. The flan_t5_xxl and falcon_7B models were removed and are redirected to the mistral_7B model. The falcon_40B model was removed and is redirected to llama2_70B. The falcon_40B-chat model was removed and is redirected to mixtral_47B_instruct.
madlad400_7B translation model.
token_content_type parameter to the tokenize endpoint. finish_reason property.
negative_prompt, image and strength parameters to the text_to_image endpoint. Added the seed parameter to the completions endpoint. Added the mistral_7B and mistral_7B_instruct models. The boris_6B and gptneox_20B models were removed because newer models give better overall performance.
falcon_7B, falcon_40B and llama2_7B models. The fairseq_gpt_13B and codegen_6B_mono models were removed. fairseq_gpt_13B is redirected to falcon_7B and codegen_6B_mono is redirected to llama2_7B.
flan_t5_xxl model.
codegen_6B_mono model.
text_to_image endpoint.
credits endpoint.
num_tokens property in the logprob endpoint. Fixed handling of escaped surrogate pairs in the JSON request body.
m2m100_1_2B model.
repetition_penalty and typical_p parameters.
n parameter.
stop parameter can now be used with streaming output.
logit_bias, presence_penalty, frequency_penalty parameters to the completion endpoint.
tokenize endpoint.