POST tts/create

Generate speech from text

July 4, 2025

Request Headers
Request Body
Responses
Model
Examples

This endpoint generates speech from text using text-to-speech technology.

Notes

When generating speech from very long text, some models produce audio that gets gradually quieter. To avoid this, we suggest splitting the prompt into chunks of about 1,000 characters and generating each chunk separately.
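The chunking advice above can be sketched as follows. This is a minimal illustration, not part of the API: the helper name `split_prompt` and the choice to break on sentence boundaries are assumptions, and the 1,000-character limit is taken from the note above.

```python
import re


def split_prompt(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries.

    Hypothetical helper for illustration; it is not provided by the API.
    """
    # Break after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at fixed positions.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The next sentence does not fit, so close the current chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as the prompt of its own request, and the resulting audio files concatenated in order.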

https://api.useapi.net/v1/heygen/tts/create

Request Headers

Authorization: Bearer {API token}
Content-Type: application/json
# Alternatively you can use multipart/form-data
# Content-Type: multipart/form-data

An API token is required; see Setup useapi.net for details.

Request Body

{
  "email": "[email protected]",
  "voice_id": "en-US-AriaNeural",
  "prompt": "Text to be converted to speech",
  "speed": 100,
  "pitch": 0,
  "volume": 100,
  "language_code": "en-US",
  "emotion": "happy"
}

email is optional when only one account is configured. If multiple accounts are configured, it is required.
voice_id is required and must be a valid voice_id from GET /tts/voices.
prompt is required: the text to convert to speech (maximum 5000 characters).
speed is optional, in the range 50 to 150. Default is 100 (normal speed).
pitch is optional, in the range -100 to 100. Default is 0 (normal pitch).
volume is optional, in the range 0 to 100. Default is 100 (full volume).
language_code is optional and must be one of the supported language codes from GET /tts/languages.
emotion is optional and must be one of the emotion names supported by the selected voice. Use a voice.settings.clone_emotions.name value from GET /tts/voices/?voice_id=voice_id.
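The documented limits can be checked on the client before a request is sent. The sketch below is an assumption-laden illustration: `validate_tts_body` is a hypothetical helper, its messages are not the server's, and it does not check voice_id, language_code, or emotion against the lookup endpoints.

```python
def validate_tts_body(body: dict) -> list[str]:
    """Return a list of problems found in a tts/create request body.

    Hypothetical client-side pre-check; the server performs its own validation.
    """
    errors = []
    if not body.get("voice_id"):
        errors.append("voice_id is required")
    prompt = body.get("prompt")
    if not prompt:
        errors.append("prompt is required")
    elif len(prompt) > 5000:
        errors.append("prompt exceeds 5000 characters")
    # Documented inclusive ranges for the optional numeric parameters.
    ranges = {"speed": (50, 150), "pitch": (-100, 100), "volume": (0, 100)}
    for name, (low, high) in ranges.items():
        value = body.get(name)
        if value is not None and not (low <= value <= high):
            errors.append(f"{name} must be between {low} and {high}")
    return errors
```

An empty list means the body passes these local checks; the request can still be rejected by the API, for example for an unsupported emotion.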

Responses

200 OK

{
  "audio_url": "https://heygen-media.s3.amazonaws.com/audio/abc123.mp3",
  "duration": 5.2,
  "is_pass": true,
  "job_id": null,
  "word_timestamps": [
    {
      "word": "Hello",
      "start": 0.0,
      "end": 0.5
    },
    {
      "word": "world",
      "start": 0.6,
      "end": 1.1
    }
  ]
}

400 Bad Request

{
  "error": "Invalid emotion (angry), supported values: happy, sad, neutral"
}

401 Unauthorized

{
  "error": "Unauthorized",
  "code": 401
}

On success (200 OK), the audio_url field contains the URL of the generated MP3 audio file.
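One way to handle the documented responses is to return audio_url on 200 and raise on anything else. This is a sketch under assumptions: `TtsError` and `extract_audio_url` are hypothetical names, and any status other than the documented 200, 400, and 401 is simply treated as an error.

```python
class TtsError(Exception):
    """Raised when tts/create does not return a usable result (hypothetical)."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def extract_audio_url(status: int, body: dict) -> str:
    """Return audio_url from a 200 response, otherwise raise TtsError."""
    if status == 200:
        url = body.get("audio_url")
        if not url:
            raise TtsError(status, "response has no audio_url")
        return url
    # 400 and 401 bodies carry an "error" string in the samples above.
    raise TtsError(status, str(body.get("error", "unknown error")))
```

With the `requests` example on this page, it could be called as `extract_audio_url(response.status_code, response.json())`.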

Model

{ // TypeScript, all fields are optional
  audio_url: string         // URL to the generated MP3 audio file
  duration: number          // Duration of the audio in seconds
  is_pass: boolean          // Whether the generation was successful
  job_id: string | null     // Job ID (usually null for synchronous requests)
  word_timestamps: {        // Timing information for each word
    word: string            // The spoken word
    start: number           // Start time in seconds
    end: number             // End time in seconds
  }[]
}
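For Python clients, the model above might be mirrored with dataclasses. The class and function names here are illustrative assumptions; every field defaults to None or an empty list because the model marks all fields as optional.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WordTimestamp:
    word: str      # The spoken word
    start: float   # Start time in seconds
    end: float     # End time in seconds


@dataclass
class TtsResult:
    audio_url: Optional[str] = None
    duration: Optional[float] = None
    is_pass: Optional[bool] = None
    job_id: Optional[str] = None
    word_timestamps: list[WordTimestamp] = field(default_factory=list)


def parse_tts_result(body: dict) -> TtsResult:
    """Build a TtsResult from the decoded JSON body of a 200 response."""
    words = [
        WordTimestamp(w["word"], float(w["start"]), float(w["end"]))
        for w in body.get("word_timestamps") or []
    ]
    return TtsResult(
        audio_url=body.get("audio_url"),
        duration=body.get("duration"),
        is_pass=body.get("is_pass"),
        job_id=body.get("job_id"),
        word_timestamps=words,
    )
```

Typed access such as `result.word_timestamps[0].start` then replaces nested dictionary lookups.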

Examples

curl -X POST "https://api.useapi.net/v1/heygen/tts/create" \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer …" \
   -d '{"email":"[email protected]","voice_id":"en-US-AriaNeural","prompt":"Hello, world!"}'

const token = "API token";
const email = "Previously configured account email";
const apiUrl = "https://api.useapi.net/v1/heygen/tts/create"; 
const response = await fetch(apiUrl, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${token}`,
  },
  body: JSON.stringify({
    email: email,
    voice_id: "en-US-AriaNeural",
    prompt: "Hello, world!"
  })
});
const result = await response.json();
console.log("response", {response, result});

import requests
token = "API token"
email = "Previously configured account email"
apiUrl = "https://api.useapi.net/v1/heygen/tts/create"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {token}"
}
data = {
    "email": email,
    "voice_id": "en-US-AriaNeural",
    "prompt": "Hello, world!"
}
response = requests.post(apiUrl, headers=headers, json=data)
print(response, response.json())
