
I wrote the story, the shots, the dialogue, and the direction, then used mostly Text-to-Video on Sora to generate the visuals. The only input images I used were simple character shots on a white background, to keep the characters consistent. Everything else is just structured text prompts.
It definitely has rough edges, but the fact that it even looks like an anime episode still blows my mind.
This is Episode 1 of a small concept series I'm making to see how far a single person can push these tools creatively. Didn’t expect it to work this well.
I'll be creating Episode 2 for release on Monday, using better techniques: creating keyframes, using ElevenLabs for voices, and getting stronger cinematic control and a more consistent visual style.
Let me know your thoughts!
I tried making an anime episode using mostly Text-to-Video (Sora)… I honestly didn’t think this was possible
by u/No-Link-6413 in ChatGPT
