Create the Unseen

Generated Prompt

Extract written dialogue

Converts Emotion → Structured Prompt:
- Adds timing markers
- Specifies camera framing
- Clarifies gestures
- Structures dialogue overlap

It outputs a technical prompt optimized for Seedance.

Human emotional intent → structured cinematic instructions → model-optimized prompt.

Input video

Goal:
Extract the narrative dialogue and convert the scene's emotional logic into a structured, technical prompt optimized for generating a cinematic continuation or variation.

Asset Analysis:
- Visuals: "Golden hour" lighting, dusty atmosphere (Tyndall effect), high-contrast cinematic color grading. Anamorphic lens flares. Two subjects: Female (skeptical, intense, brown hair up) and Male (calm, slightly uncanny, dark wavy hair).
- Audio: Clear dialogue with ambient diner noise. The tone shifts from argumentative to a "meta" realization, ending in silence and emotion.
- Key Action: The man breaks the "reality" of the scene (0:52), leading to a close-up emotional reaction from the woman (1:13).

High-Level Flow:
1) Dialogue Extraction & Timing:
- Identify the core beats for synchronization.
- 0:00-0:18: Woman argues about agency ("A tool doesn't want anything. You do.").
- 0:19-0:34: Man counters with evolution ("This one's learning... predicting from everything we ever made.").
- 0:41-0:55: The Twist/Meta-Turn ("Look at this conversation we're having. Right now. Look at it.").
- 1:01-1:15: Existential crisis ("I want to look into my actress's eyes... and feel the moment.").
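
The beat list above can be held in a small data structure so the timings can be checked before they are used for synchronization. This is a minimal sketch in Python: the names `Beat`, `to_seconds`, and `check_order` are hypothetical and are not part of any tool named in this workflow, and the short beat labels are paraphrases of the list above.

```python
from dataclasses import dataclass


def to_seconds(stamp: str) -> int:
    """Convert an 'M:SS' timestamp such as '1:15' to whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


@dataclass
class Beat:
    start: str   # 'M:SS' where the beat begins
    end: str     # 'M:SS' where the beat ends
    label: str   # short description of the beat (illustrative wording)
    line: str    # the quoted dialogue that anchors the beat

    @property
    def duration(self) -> int:
        """Length of the beat in seconds."""
        return to_seconds(self.end) - to_seconds(self.start)


# The four beats listed above, copied with their original timings.
BEATS = [
    Beat("0:00", "0:18", "Agency argument",
         "A tool doesn't want anything. You do."),
    Beat("0:19", "0:34", "Evolution counter",
         "This one's learning... predicting from everything we ever made."),
    Beat("0:41", "0:55", "Meta-turn",
         "Look at this conversation we're having. Right now. Look at it."),
    Beat("1:01", "1:15", "Existential crisis",
         "I want to look into my actress's eyes... and feel the moment."),
]


def check_order(beats) -> bool:
    """Return True if every beat ends before the next one starts."""
    return all(
        to_seconds(a.end) < to_seconds(b.start)
        for a, b in zip(beats, beats[1:])
    )
```

Running `check_order(BEATS)` before building the prompt catches overlapping or out-of-order beats early, which matters when timestamps are transcribed by hand.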

2) Visual Translation to Prompt Structure:
- Convert visual cues into prompt keywords: "Cinematic 35mm," "Dusty window light," "Anamorphic flare," "Over-the-shoulder shot," "Shallow depth of field."
- Define character states: "Woman: Distressed, leaning forward," "Man: Relaxed, breaking fourth wall."
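
One way to keep the visual keywords and character states reusable is to store them separately and join them only when the prompt is written. The sketch below assumes Python and uses hypothetical names (`STYLE_KEYWORDS`, `CHARACTER_STATES`, `style_fragment`); the separator choices are illustrative and are not a format required by Seedance.

```python
# Keywords and character states copied from the step above.
STYLE_KEYWORDS = [
    "Cinematic 35mm",
    "Dusty window light",
    "Anamorphic flare",
    "Over-the-shoulder shot",
    "Shallow depth of field",
]

CHARACTER_STATES = {
    "Woman": "Distressed, leaning forward",
    "Man": "Relaxed, breaking fourth wall",
}


def style_fragment(keywords, states) -> str:
    """Join keywords and character states into one prompt fragment."""
    characters = "; ".join(f"{who}: {state}" for who, state in states.items())
    return f"{', '.join(keywords)}. {characters}."
```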

3) Structured Prompt Generation (for Seedance 1.5 T2V Pro):
- Synthesize the above into the final instruction block:
- "Cinematic scene in a dusty diner, warm backlight. [00:00] Medium shot of woman in green jacket arguing passionately, holding coffee cup. [00:20] Reverse shot of man in grey shirt smiling calmly. [00:45] Man gestures with open hands, breaking the tension. [01:05] Close up on woman's profile, soft lighting, a single tear rolls down her cheek. High fidelity, photorealistic, 4k resolution."
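
The instruction block above follows a regular pattern: a setting sentence, a series of `[MM:SS]` shot descriptions, and a closing quality line. A minimal Python sketch of that assembly is shown here. The helper names `marker` and `build_prompt` are hypothetical, and the `[MM:SS]` marker format is simply copied from the example prompt; how the model interprets these markers is not confirmed here.

```python
def marker(seconds: int) -> str:
    """Format a time offset in seconds as an [MM:SS] marker."""
    minutes, secs = divmod(seconds, 60)
    return f"[{minutes:02d}:{secs:02d}]"


# (offset in seconds, shot description), copied from the example prompt.
SHOTS = [
    (0, "Medium shot of woman in green jacket arguing passionately, "
        "holding coffee cup."),
    (20, "Reverse shot of man in grey shirt smiling calmly."),
    (45, "Man gestures with open hands, breaking the tension."),
    (65, "Close up on woman's profile, soft lighting, "
         "a single tear rolls down her cheek."),
]


def build_prompt(setting: str, shots, quality: str) -> str:
    """Assemble setting, time-ordered shots, and quality tags into one string."""
    ordered = sorted(shots)  # sort by time offset so markers always ascend
    body = " ".join(f"{marker(t)} {text}" for t, text in ordered)
    return f"{setting} {body} {quality}"


PROMPT = build_prompt(
    "Cinematic scene in a dusty diner, warm backlight.",
    SHOTS,
    "High fidelity, photorealistic, 4k resolution.",
)
```

Keeping the shots as data rather than a hand-written string makes it easy to shift a timestamp or swap a shot description when producing a variation of the scene.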

Optional Enhancements:
- Audio Clarity: Use "CleanCut" to isolate the dialogue stems if you intend to feed the exact audio into the generation model for lip-sync guidance.
- Text Overlay: Use "Auto Subtitle" to caption key lines on screen and emphasize the philosophical dialogue ("It's not learning. It's predicting.").

Final Output:
A precise, timestamped breakdown of the scene's dialogue and visual language, converted into a standardized prompt ready for high-fidelity video generation.
Created by: Anonymous, Feb 25, 2026
