Generate
Text-to-image, image-to-video, text-to-speech, and image-to-3D generation. Use Generate as the starting point for most workflows.
Every endpoint returns a job_id and has a matching GET …/status/{job_id} route; see Jobs. Every endpoint below also accepts output_format (webp, jpeg, or png), callback_url (a webhook), and server_id (Enterprise: pin to a dedicated pod).
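
Because the same optional fields recur on every endpoint, it can help to merge them in one place. The following is a minimal sketch, not part of the API: the helper name `with_shared_fields` is hypothetical, and the only validation it performs is the output_format list stated above.

```python
from typing import Optional

# Values listed for output_format in this section.
ALLOWED_OUTPUT_FORMATS = {"webp", "jpeg", "png"}


def with_shared_fields(
    payload: dict,
    output_format: Optional[str] = None,
    callback_url: Optional[str] = None,
    server_id: Optional[str] = None,
) -> dict:
    """Return a copy of `payload` with the shared optional fields merged in.

    Hypothetical client-side helper; fields left as None are omitted so the
    server-side defaults apply.
    """
    merged = dict(payload)
    if output_format is not None:
        if output_format not in ALLOWED_OUTPUT_FORMATS:
            raise ValueError(
                f"output_format must be one of {sorted(ALLOWED_OUTPUT_FORMATS)}"
            )
        merged["output_format"] = output_format
    if callback_url is not None:
        merged["callback_url"] = callback_url
    if server_id is not None:
        merged["server_id"] = server_id
    return merged
```

The original dictionary is left untouched, so one base payload can be reused with different webhook or pod settings.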
Text → Image
POST /generate/image/v1. Only prompt is required.
```shell
curl https://api.imagepipeline.io/generate/image/v1 \
  -H "X-API-Key: $IMAGEPIPELINE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a person in a red jacket on a rooftop at golden hour",
    "width": 1024,
    "height": 1024,
    "output_format": "webp"
  }'
```

| Field | Type | Default | Notes |
|---|---|---|---|
| prompt | string | - | Required. Text prompt describing the image. |
| width / height | integer | 1024 | Output dimensions in pixels (max 1024). |
| num_inference_steps | integer | model default | Number of denoising steps. More steps add detail but run slower. |
| guidance_scale | number | model default | How strictly to follow the prompt. Omit for the model default. |
| seed | integer | -1 | -1 randomizes; set a value for reproducibility. |
| enhance_prompt | boolean | false | Run the prompt through a lightweight AI enhancer first. |
| logo_url | string | - | Public URL of a logo (PNG/WebP with transparency) to include. |
| palette | string[] | - | Brand colours as hex codes, e.g. ["#FF5733", "#3498DB"]. |
| output_format | string | webp | webp, jpeg, or png. |
| profile_id | string | - | Apply an identity profile. |
| callback_url | string | - | Receive a webhook on completion. |
| server_id | string | - | Enterprise: pin to a dedicated pod. |
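
The same request can be prepared from Python. This is a sketch using only the standard library: `build_image_request` is a hypothetical helper, the URL and headers are taken from the curl example, and the six-digit hex check on palette is an assumption (the table only says "hex codes"). The request is built but not sent.

```python
import json
import re
import urllib.request
from typing import List, Optional

API_URL = "https://api.imagepipeline.io/generate/image/v1"  # from the curl example
MAX_SIDE = 1024  # "max 1024" per the field table
# Assumption: palette entries are six-digit hex codes such as "#FF5733".
HEX_COLOUR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def build_image_request(
    api_key: str,
    prompt: str,
    width: int = 1024,
    height: int = 1024,
    palette: Optional[List[str]] = None,
    seed: Optional[int] = None,
) -> urllib.request.Request:
    """Validate the documented limits client-side and build a POST request."""
    if not prompt:
        raise ValueError("prompt is required")
    for name, value in (("width", width), ("height", height)):
        if not 1 <= value <= MAX_SIDE:
            raise ValueError(f"{name} must be between 1 and {MAX_SIDE}")
    payload = {"prompt": prompt, "width": width, "height": height}
    if palette is not None:
        bad = [c for c in palette if not HEX_COLOUR.match(c)]
        if bad:
            raise ValueError(f"not hex colour codes: {bad}")
        payload["palette"] = list(palette)
    if seed is not None:
        payload["seed"] = seed
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit the job. The response is documented to carry a job_id, but where it sits in the response body is not specified here, so inspect the parsed JSON before relying on a particular shape.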
Image → Video
POST /generate/video/v1. Animates a still image (input_image is required).
```json
{
  "input_image": "https://.../still.webp",
  "prompt": "make this image come alive, cinematic motion",
  "duration_seconds": 2.0,
  "width": 896,
  "height": 512
}
```
| Field | Type | Default | Notes |
|---|---|---|---|
| input_image | string | - | Required. Public URL of the image to animate. |
| prompt | string | cinematic motion | Describe the animation style or motion. |
| width / height | integer | 896 / 512 | Max 1536, must be divisible by 32. |
| duration_seconds | number | 2.0 | Video duration in seconds (0.1 to 10.0). |
| seed | integer | 42 | Seed for reproducibility. |
| callback_url | string | - | Webhook URL on completion. |
| server_id | string | - | Enterprise: pin to a dedicated pod. |
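
The width and height constraint (divisible by 32, at most 1536) is easy to get wrong when dimensions come from a source image. The sketch below rounds a requested size down to a valid one before building the payload. Both helper names are hypothetical, and the lower bound of 32 is an assumption, since the table states no minimum.

```python
from typing import Optional

MAX_SIDE = 1536  # "Max 1536" per the field table
STEP = 32        # dimensions must be divisible by 32


def snap_dimension(value: int) -> int:
    """Clamp to [STEP, MAX_SIDE], then round down to a multiple of STEP."""
    clamped = max(STEP, min(value, MAX_SIDE))
    return clamped - clamped % STEP


def build_video_payload(
    input_image: str,
    width: int = 896,
    height: int = 512,
    duration_seconds: float = 2.0,
    prompt: Optional[str] = None,
) -> dict:
    """Build a request body for POST /generate/video/v1 (client-side checks only)."""
    if not input_image:
        raise ValueError("input_image is required")
    if not 0.1 <= duration_seconds <= 10.0:
        raise ValueError("duration_seconds must be between 0.1 and 10.0")
    payload = {
        "input_image": input_image,
        "width": snap_dimension(width),
        "height": snap_dimension(height),
        "duration_seconds": duration_seconds,
    }
    if prompt is not None:  # omit to use the server-side default
        payload["prompt"] = prompt
    return payload
```

For example, a requested 1000 x 2000 frame becomes 992 x 1536. Rounding each side independently can change the aspect ratio slightly; whether the service crops or stretches the input to fit is not stated in this section.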
Text → Speech
POST /generate/speech/v1. Synthesizes speech from text.
```json
{ "text": "Welcome to ImagePipeline.", "language_id": "en", "exaggeration": 0.5 }
```
| Field | Type | Default | Notes |
|---|---|---|---|
| text | string | - | Required. Text to convert to speech. |
| language_id | string | en | Language code, e.g. en, zh, ja, ko, he. |
| target_voice_path | string | - | Public URL of a reference voice audio file for cloning. |
| max_new_tokens | integer | 256 | Maximum tokens to generate. |
| exaggeration | number | 0.5 | Expressiveness (0.0 neutral to 1.0 maximum). |
| apply_watermark | boolean | true | Embed an inaudible audio watermark (recommended). |
| callback_url | string | - | Webhook URL on completion. |
| server_id | string | - | Enterprise: pin to a dedicated pod. |
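
As a sketch of how the speech fields combine, the hypothetical helper below builds a request body, checks the documented exaggeration range, and adds target_voice_path only when a reference voice is supplied. The checks are client-side conveniences; how the server responds to out-of-range values is not documented here.

```python
import json
from typing import Optional


def build_speech_payload(
    text: str,
    language_id: str = "en",
    exaggeration: float = 0.5,
    target_voice_path: Optional[str] = None,
) -> dict:
    """Build a request body for POST /generate/speech/v1."""
    if not text or not text.strip():
        raise ValueError("text is required")
    if not 0.0 <= exaggeration <= 1.0:
        raise ValueError("exaggeration must be between 0.0 and 1.0")
    payload = {
        "text": text,
        "language_id": language_id,
        "exaggeration": exaggeration,
    }
    if target_voice_path is not None:
        # Public URL of a reference voice audio file, used for cloning.
        payload["target_voice_path"] = target_voice_path
    return payload


# Serialized body matching the JSON example in this section.
body = json.dumps(build_speech_payload("Welcome to ImagePipeline."))
```

Fields left out of the payload (max_new_tokens, apply_watermark) fall back to the defaults in the table.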
Image → 3D
POST /generate/3d/v1. Turns an image into a 3D mesh (image_path is required).
```json
{ "image_path": "https://.../object.webp", "mode": "generate" }
```
| Field | Type | Default | Notes |
|---|---|---|---|
| image_path | string | - | Required. Public URL of the input image. |
| mode | string | generate | generate (mesh only) or paint (texture and mesh). |
| mesh_save_name | string | - | Optional filename for the output mesh, e.g. model.obj. |
| painted_save_name | string | - | Optional filename for the textured mesh. |
| auto_unload | boolean | true | Unload the pipeline from GPU after completion. |
| callback_url | string | - | Webhook URL on completion. |
| server_id | string | - | Enterprise: pin to a dedicated pod. |
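
A small sketch of a 3D request body follows. The helper name `build_3d_payload` is hypothetical; it validates mode against the two documented values and passes the optional filenames through unchanged. Whether painted_save_name has any effect in generate mode is not stated in this section, so the helper does not enforce a rule either way.

```python
from typing import Optional

MODES = {"generate", "paint"}  # the two values listed for mode


def build_3d_payload(
    image_path: str,
    mode: str = "generate",
    mesh_save_name: Optional[str] = None,
    painted_save_name: Optional[str] = None,
    auto_unload: bool = True,
) -> dict:
    """Build a request body for POST /generate/3d/v1."""
    if not image_path:
        raise ValueError("image_path is required")
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    payload = {"image_path": image_path, "mode": mode, "auto_unload": auto_unload}
    if mesh_save_name is not None:
        payload["mesh_save_name"] = mesh_save_name
    if painted_save_name is not None:
        payload["painted_save_name"] = painted_save_name
    return payload
```

A textured mesh request might then be built as `build_3d_payload(url, mode="paint", painted_save_name="model_painted.obj")`, where `url` is a public image URL and the filename is only an illustration.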