Tutorial: Using Real Faces with Seedance 2.0
Seedance 2.0 is now available on both Runway API v1 and Dreamina API v1. The two integrations share the same underlying model but differ in one important detail for end users:
- Runway — no real-people restriction, lighter content moderation, exploreMode makes 15-second generations effectively free on the Unlimited plan
- Dreamina — real human faces are moderated; use illustrated, anime or stylized character references
This post walks through a single cinematic finger-snap time-freeze scene driven by one character reference image, submitted to both APIs from the same prompt.
Table of contents
- Tutorial: Using Real Faces with Seedance 2.0
- The reference image
- How to use real faces in Seedance on Dreamina
- Fully automated grid — free face detection via Kling
- The prompt
- Runway Seedance 2.0 • 15s 9:16
- Dreamina Seedance 2.0 • 15s 9:16
- When to use which
The reference image
A single 9:16 editorial still of the subject — long dark hair, pale complexion — is all Seedance 2.0 needs to carry the character identity through the entire 15-second clip.
Two versions of the same portrait were used — the unaltered original for Runway, and the same image with a sparse grid overlaid on the face for Dreamina:
| Runway input (original) | Dreamina input (face grid) |
|---|---|
| ![]() | ![]() |
On Runway the unaltered image is uploaded once via POST /assets; the returned assetId is passed as imageAssetId1. On Dreamina the grid-overlaid image is uploaded via POST /assets/account as type image; the returned imageRef is passed as omni_1_imageRef.
How to use real faces in Seedance on Dreamina
Dreamina’s Seedance 2.0 input filter scans the reference image with a face-detection classifier; a high-confidence real-human-face hit is rejected with a moderation error before generation even starts. On Runway there is no such restriction, so the unmodified portrait goes straight through.
Compare the two images above. The Dreamina version has a sparse mesh of thin crisscrossing lines overlaid only on the face region — roughly from the hairline down to the chin. The grid does not cover the hair, the dress, or any of the background. Everything else about the reference is identical to the Runway original.
Why it works:
- Face classifiers are brittle to structured occlusion. A sparse grid overlaid on key facial landmarks (eyes, nose, mouth contours) drops the classifier’s confidence below the moderation threshold — the image is no longer tagged as a “real face”.
- Seedance’s reference conditioning is not a face classifier. It is a dense-pixel visual encoder (similar to Kontext / IP-Adapter approaches) that still absorbs the underlying identity, hair color, skin tone, bone structure and wardrobe from the pixels between the grid lines.
- The generated video — conditioned on the image as a whole — reconstructs a coherent face in every frame because Seedance has enough un-occluded signal to infer the identity. The grid itself never appears in the output.
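The occlusion argument can be made concrete with a little arithmetic. Assuming 1-pixel-wide lines repeated on a 40-pixel pitch in both directions (the parameters used in the ffmpeg filter later in this post), the fraction of face pixels the grid touches is small, which is why the reference encoder still has plenty of clean signal:

```python
def grid_coverage(pitch: int, line_width: int = 1) -> float:
    """Fraction of pixels inside the face box that fall on a grid line,
    for lines of the given width repeated every `pitch` pixels in both
    the horizontal and vertical directions."""
    clear = (pitch - line_width) / pitch   # fraction left clear along one axis
    return 1.0 - clear * clear             # covered along either axis

# At a 40-pixel pitch, roughly 5% of face pixels are marked and ~95%
# pass through untouched to the reference encoder.
print(round(grid_coverage(40), 3))  # 0.049
```

This is a simplification (the ffmpeg filter only darkens luma to 92% rather than fully occluding), but it illustrates why a sparse grid can confuse a landmark-sensitive classifier while leaving the bulk of the identity signal intact.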
Fully automated grid — free face detection via Kling
Hard-coding face-box coordinates is fine for a one-off, but for a pipeline you want the face location auto-detected. Kling provides a free, unlimited face-detection endpoint at POST /images/recognize-faces that returns exact face bounding boxes — no credit cost, works on the free Kling plan.
Step 1 — detect the face box. Upload the portrait via POST /assets to get a public URL, then:
curl -sS -X POST 'https://api.useapi.net/v1/kling/images/recognize-faces' \
-H 'Authorization: Bearer user:12345-…' \
-H 'Content-Type: application/json' \
--data '{"imageReference": "https://.../portrait.jpg"}'
Response — the first face’s face_bound is a JSON string carrying {x, y, width, height} in image pixels:
{
"status": 0,
"faceItems": [{
"name": "face_0",
"arguments": [{
"name": "face_bound",
"value": "{\"x\":120,\"y\":80,\"width\":200,\"height\":240}"
}]
}],
"mediaInfo": { "width": 724, "height": 1268 }
}
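Note that face_bound is a JSON string nested inside the JSON response, so it needs a second decode. A minimal Python sketch of the extraction, using a response dict mirroring the example above:

```python
import json

response = {
    "status": 0,
    "faceItems": [{
        "name": "face_0",
        "arguments": [{
            "name": "face_bound",
            "value": "{\"x\":120,\"y\":80,\"width\":200,\"height\":240}"
        }]
    }],
    "mediaInfo": {"width": 724, "height": 1268}
}

def face_bound(resp: dict) -> dict:
    """Return the first face's {x, y, width, height} box in image pixels."""
    args = resp["faceItems"][0]["arguments"]
    raw = next(a["value"] for a in args if a["name"] == "face_bound")
    return json.loads(raw)  # second decode: the value is itself a JSON string

print(face_bound(response))  # {'x': 120, 'y': 80, 'width': 200, 'height': 240}
```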
Step 2 — apply the grid only on those bounds with ffmpeg:
# Pull the numbers out of the Kling response (requires jq)
eval "$(echo "$KLING_RESPONSE" | jq -r '
.faceItems[0].arguments[] | select(.name=="face_bound").value | fromjson |
"FX=\(.x); FY=\(.y); FW=\(.width); FH=\(.height)"')"
ffmpeg -y -i portrait.jpg -vf "\
geq=lum='p(X,Y)*if(between(X,${FX},${FX}+${FW})*between(Y,${FY},${FY}+${FH})*\
(lt(mod(X-${FX},40),1)+lt(mod(Y-${FY},40),1)),0.92,1)':\
cb='p_cb(X,Y)':cr='p_cr(X,Y)'" portrait_grid.jpg
The grid is clipped exactly to the detected face rectangle and lined up on a 40-pixel pitch originating from the face-box top-left — tune the 40 for a coarser or finer mesh. The subtler the grid, the safer it is against future classifier updates, but too subtle and it stops defeating the current one.
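The geq expression packs the whole per-pixel decision into one line, which is hard to read. The same predicate written out as a plain function (a readability sketch mirroring the filter above, not a drop-in replacement for it):

```python
def on_grid(x, y, fx, fy, fw, fh, pitch=40):
    """True when pixel (x, y) lies on a grid line inside the face box,
    i.e. where the ffmpeg geq filter darkens luma to 92%."""
    inside = fx <= x <= fx + fw and fy <= y <= fy + fh
    on_line = (x - fx) % pitch < 1 or (y - fy) % pitch < 1
    return inside and on_line

# The face-box top-left corner sits on both a vertical and a horizontal
# line; a pixel mid-cell is left untouched; pixels outside the box are
# never marked.
print(on_grid(120, 80, 120, 80, 200, 240))   # True
print(on_grid(140, 100, 120, 80, 200, 240))  # False
```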
The whole pipeline — detect face → apply grid → upload to Dreamina → submit Seedance generation — can run in a single script with no manual tuning per image.
This technique is documented for legitimate use with stylized characters, your own likeness, or licensed model imagery. It is not a license to violate Dreamina’s terms of service around celebrity faces or non-consensual real-person content. Use accordingly.
The prompt
The only reference token differs between the two APIs: Runway uses @IMG_1, Dreamina uses @image1.
A cinematic continuous shot on a sunlit Midtown Manhattan avenue at golden hour. @IMG_1 walks confidently toward camera, holding an unlit cigarette between two fingers of her right hand. She wears a black knee-length peacoat, white t-shirt and blue straight-leg jeans. Tall glass skyscrapers, yellow cabs, a white truck, pedestrians streaming past on both sides, pigeons take flight behind her, autumn leaves drift in the air. Direct eye contact with camera.
She stops dead center in the frame, raises her right hand with the cigarette, and snaps her fingers with a deep thunderous boom sound — a shockwave sonic ripple explodes outward across the frame.
The world instantly freezes. Every pedestrian completely still, locked mid-step with one foot in the air. Pigeons hang motionless in the sky, wings spread wide. Falling leaves suspended in mid-air. Cars stopped in the street. Absolute stillness everywhere. She is the only thing moving. She walks slowly between the frozen statue-still pedestrians, looks up at the unmoving pigeons in wonder, reaches up and gently touches one suspended in the air, then continues toward a bearded man in a dark coat and scarf who stands frozen with his foot mid-step.
She stops directly in front of him. She flicks open a lighter, cups her hand around the flame, and lights her cigarette — the tip glows red as she inhales. She exhales a long slow plume of smoke into his still face, smirks, and says “smoke, cool.”
She raises her hand and snaps her fingers a second time with another deep thunderous boom sound. The world unfreezes: the bearded man jolts and looks around confused, the crowd, traffic and pigeons all resume motion. She walks casually past him like nothing happened, cigarette between her fingers, a slow plume of smoke trailing behind her.
Camera cranes back and rises to a high-angle wide shot of the avenue — commuters streaming forward, traffic flowing, skyscraper canyon stretching into the distance. Image fades to black.
Cinematic realism, golden-hour sun flare, shallow depth of field, film grain. Keep @IMG_1 identity consistent throughout.
Runway Seedance 2.0 • 15s 9:16
curl -X POST 'https://api.useapi.net/v1/runwayml/videos/create' \
-H 'Authorization: Bearer user:12345-…' \
-H 'Content-Type: application/json' \
--data '{
"model": "seedance-2",
"text_prompt": "… @IMG_1 …",
"aspect_ratio": "9:16",
"resolution": "720p",
"duration": 15,
"audio": true,
"exploreMode": true,
"imageAssetId1": "user:12345-runwayml:[email protected]:…"
}'
See POST /videos/create for the full parameter reference.
Typical timing in exploreMode: ~10 min total (queue ~7–9 min, render ~1.5–4 min). Zero credit cost on the Runway Unlimited plan.
Dreamina Seedance 2.0 • 15s 9:16
curl -X POST 'https://api.useapi.net/v1/dreamina/videos' \
-H 'Authorization: Bearer user:12345-…' \
-H 'Content-Type: application/json' \
--data '{
"model": "seedance-2.0",
"prompt": "… @image1 …",
"duration": 15,
"ratio": "9:16",
"account": "[email protected]",
"omni_1_imageRef": "assetRef:…"
}'
Notes:
- Omni Reference mode is auto-triggered by the presence of any omni_N_*Ref param. The @image1 token in the prompt matches the omni_1_imageRef asset.
- Pass ratio explicitly — 9:16 here to match Runway.
See POST /videos for the full parameter reference.
Typical timing: ~2 min (no queue). ~$0.044/sec = ~$0.66 for a 15-second clip.
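Dreamina's per-clip cost scales linearly with duration, so budgeting a batch is a one-liner (rate taken from the figure above):

```python
RATE_PER_SEC = 0.044  # Dreamina Seedance 2.0 rate quoted above, USD/sec

def clip_cost(seconds: int) -> float:
    """Approximate cost of one generation, rounded to cents."""
    return round(RATE_PER_SEC * seconds, 2)

print(clip_cost(15))  # 0.66
```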
When to use which
| | Runway Seedance 2.0 | Dreamina Seedance 2.0 |
|---|---|---|
| Real human faces | ✓ supported | ✗ moderated |
| Cost (15s) | free on Unlimited (explore) | ~$0.66 |
| Typical latency | ~10 min (explore queue) | ~2 min |
| Max refs | 11 images + 3 videos | 9 images + 3 videos + 3 audio |
| Audio reference | no | yes (omni_N_audioRef) |
| Native audio generation | yes (audio: true) | yes |
| Region | global | CA only |
For volume / real people / no-cost iteration — use Runway with exploreMode. For fast turnaround / audio-reference work / non-human characters — use Dreamina.

