
Generate

Text-to-image, image-to-video, text-to-speech, and image-to-3D generation. Use Generate as the starting point for most workflows.

Every endpoint returns a job_id and has a matching GET …/status/{job_id} route (see Jobs). Three fields are accepted by every endpoint below: output_format (webp, jpeg, or png), callback_url (a webhook), and server_id (Enterprise: pin to a dedicated pod).
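As a minimal sketch of how the three shared fields might be attached to any request body, the helper below copies a payload and adds only the fields that were supplied. The function name with_shared_fields and the client-side check on output_format are illustrative assumptions, not part of the API.

```python
from typing import Optional

# Formats listed in the text above.
ALLOWED_OUTPUT_FORMATS = {"webp", "jpeg", "png"}


def with_shared_fields(
    payload: dict,
    output_format: Optional[str] = None,
    callback_url: Optional[str] = None,
    server_id: Optional[str] = None,
) -> dict:
    """Return a copy of payload with the shared optional fields added.

    Hypothetical helper: fields left as None are omitted entirely.
    """
    if output_format is not None and output_format not in ALLOWED_OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {sorted(ALLOWED_OUTPUT_FORMATS)}")
    merged = dict(payload)  # do not mutate the caller's dict
    shared = {
        "output_format": output_format,
        "callback_url": callback_url,
        "server_id": server_id,
    }
    for key, value in shared.items():
        if value is not None:
            merged[key] = value
    return merged
```

Omitting unset fields rather than sending nulls is a design choice of this sketch; whether the API accepts explicit nulls is not stated here.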

Text → Image

POST /generate/image/v1: only prompt is required.

curl https://api.imagepipeline.io/generate/image/v1 \
  -H "X-API-Key: $IMAGEPIPELINE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a person in a red jacket on a rooftop at golden hour",
    "width": 1024,
    "height": 1024,
    "output_format": "webp"
  }'

Example 1024×1024 result generated from a text prompt

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| prompt | string | none | Required. Text prompt describing the image. |
| width / height | integer | 1024 | Output dimensions in pixels (max 1024). |
| num_inference_steps | integer | model | Number of denoising steps. More = more detail, slower. |
| guidance_scale | number | model | How strictly to follow the prompt. Omit for the model default. |
| seed | integer | -1 | -1 randomizes; set a value for reproducibility. |
| enhance_prompt | boolean | false | Run the prompt through a lightweight AI enhancer first. |
| logo_url | string | none | Public URL of a logo (PNG/WebP with transparency) to include. |
| palette | string[] | none | Brand colours as hex codes, e.g. ["#FF5733", "#3498DB"]. |
| output_format | string | webp | webp, jpeg, or png. |
| profile_id | string | none | Apply an identity profile. |
| callback_url | string | none | Receive a webhook on completion. |
| server_id | string | none | Enterprise: pin to a dedicated pod. |
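The curl call above could be mirrored in Python roughly as follows. This is a sketch using only the standard library: the endpoint URL and X-API-Key header come from the example above, while the function names, the client-side limits check, and the assumption that palette entries are six-digit hex codes are illustrative.

```python
import json
import re
import urllib.request

IMAGE_ENDPOINT = "https://api.imagepipeline.io/generate/image/v1"
# Assumption: palette entries are 6-digit hex codes such as "#FF5733".
HEX_COLOUR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def build_image_payload(prompt, width=1024, height=1024, seed=-1, palette=None):
    """Build a text-to-image request body, checking the documented limits."""
    if not prompt:
        raise ValueError("prompt is required")
    for name, value in (("width", width), ("height", height)):
        if not 1 <= value <= 1024:
            raise ValueError(f"{name} must be between 1 and 1024")
    payload = {"prompt": prompt, "width": width, "height": height, "seed": seed}
    if palette is not None:
        bad = [c for c in palette if not HEX_COLOUR.match(c)]
        if bad:
            raise ValueError(f"invalid hex colours: {bad}")
        payload["palette"] = list(palette)
    return payload


def build_image_request(payload, api_key):
    """Wrap the payload in a POST request; nothing is sent until urlopen is called."""
    return urllib.request.Request(
        IMAGE_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen would submit the job. The response is described above only as containing a job_id, so any further parsing is left to the caller.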

Image → Video

POST /generate/video/v1: animate a still image (input_image required).

{
  "input_image": "https://.../still.webp",
  "prompt": "make this image come alive, cinematic motion",
  "duration_seconds": 2.0,
  "width": 896,
  "height": 512
}

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| input_image | string | none | Required. Public URL of the image to animate. |
| prompt | string | cinematic motion | Describe the animation style or motion. |
| width / height | integer | 896 / 512 | Max 1536; must be divisible by 32. |
| duration_seconds | number | 2.0 | Video duration in seconds (0.1 to 10.0). |
| seed | integer | 42 | Seed for reproducibility. |
| callback_url | string | none | Webhook URL on completion. |
| server_id | string | none | Enterprise: pin to a dedicated pod. |
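The size and duration constraints in the table can be checked before a request is sent. The sketch below is one way to do that; both function names are hypothetical, and the lower bound of 32 pixels used when snapping is an assumption, since the table states only a maximum.

```python
MAX_SIDE = 1536   # documented maximum for width and height
STEP = 32         # dimensions must be divisible by this


def validate_video_params(width=896, height=512, duration_seconds=2.0):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for name, value in (("width", width), ("height", height)):
        if value <= 0 or value > MAX_SIDE:
            errors.append(f"{name} must be between 1 and {MAX_SIDE}")
        elif value % STEP != 0:
            errors.append(f"{name} must be divisible by {STEP}")
    if not 0.1 <= duration_seconds <= 10.0:
        errors.append("duration_seconds must be between 0.1 and 10.0")
    return errors


def snap_to_step(value):
    """Round down to a multiple of 32, clamped to [32, 1536] (lower bound assumed)."""
    return max(STEP, min(MAX_SIDE, (value // STEP) * STEP))
```

For example, a 900-pixel width fails validation, and snap_to_step(900) gives 896, the nearest valid size below it.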

Text → Speech

POST /generate/speech/v1: synthesize speech from text.

{ "text": "Welcome to ImagePipeline.", "language_id": "en", "exaggeration": 0.5 }

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| text | string | none | Required. Text to convert to speech. |
| language_id | string | en | Language code, e.g. en, zh, ja, ko, he. |
| target_voice_path | string | none | Public URL of a reference voice audio file for cloning. |
| max_new_tokens | integer | 256 | Maximum tokens to generate. |
| exaggeration | number | 0.5 | Expressiveness (0.0 neutral to 1.0 maximum). |
| apply_watermark | boolean | true | Embed an inaudible audio watermark (recommended). |
| callback_url | string | none | Webhook URL on completion. |
| server_id | string | none | Enterprise: pin to a dedicated pod. |
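A speech request body might be assembled as in the sketch below, which applies the defaults from the table and rejects an exaggeration value outside the documented 0.0 to 1.0 range. The function name build_speech_payload is hypothetical, and the check runs on the client only; how the server treats out-of-range values is not stated here.

```python
def build_speech_payload(
    text,
    language_id="en",
    exaggeration=0.5,
    target_voice_path=None,
    apply_watermark=True,
):
    """Build a text-to-speech request body using the documented defaults."""
    if not text or not text.strip():
        raise ValueError("text is required")
    if not 0.0 <= exaggeration <= 1.0:
        raise ValueError("exaggeration must be between 0.0 and 1.0")
    payload = {
        "text": text,
        "language_id": language_id,
        "exaggeration": exaggeration,
        "apply_watermark": apply_watermark,
    }
    # Only include the voice-cloning reference when one was supplied.
    if target_voice_path is not None:
        payload["target_voice_path"] = target_voice_path
    return payload
```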

Image → 3D

POST /generate/3d/v1: turn an image into a 3D mesh (image_path required).

{ "image_path": "https://.../object.webp", "mode": "generate" }

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| image_path | string | none | Required. Public URL of the input image. |
| mode | string | generate | generate (mesh only) or paint (texture and mesh). |
| mesh_save_name | string | none | Optional filename for the output mesh, e.g. model.obj. |
| painted_save_name | string | none | Optional filename for the textured mesh. |
| auto_unload | boolean | true | Unload the pipeline from GPU after completion. |
| callback_url | string | none | Webhook URL on completion. |
| server_id | string | none | Enterprise: pin to a dedicated pod. |
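The two documented modes can be enforced when building an image-to-3D request body, as in this sketch. The function name build_3d_payload is hypothetical, and the sketch does not assume any coupling between mode and painted_save_name, since the table does not describe one.

```python
VALID_MODES = ("generate", "paint")  # mesh only, or texture and mesh


def build_3d_payload(image_path, mode="generate", mesh_save_name=None, painted_save_name=None):
    """Build an image-to-3D request body, rejecting unknown modes."""
    if not image_path:
        raise ValueError("image_path is required")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    payload = {"image_path": image_path, "mode": mode}
    # Optional filenames are sent only when provided.
    if mesh_save_name is not None:
        payload["mesh_save_name"] = mesh_save_name
    if painted_save_name is not None:
        payload["painted_save_name"] = painted_save_name
    return payload
```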