Update: Last week I created an anime pilot using Text-to-Video. For Ep 2, I switched to Image-to-Video + Keyframing to lock the consistency. Here is the quality jump.


Last week, I created an anime pilot using mostly Sora 2's Text-to-Video. It was messy.

To watch episode 1, go here: https://www.reddit.com/r/ChatGPT/comments/1pd5xva/i_tried_making_an_anime_episode_using_mostly/

TL;DR: In the near future, humans begin integrating with AI. People who integrate live in an upper city, where they attract success, wealth, and beauty. The story follows a guy from the under city who becomes fascinated with an integrated girl from the upper city.

For episode 2, I completely redid my workflow approach to fix:

  • Character consistency
  • World, visual, and spatial consistency (it now feels like we're always in the same world / room as the characters)
  • Consistent voices
  • Worldbuilding through dialogue (notice how the AI-integrated speak in an almost non-human fashion, while the MC uses lots of 'humanistic' filler phrases, like 'so I gotta know...')
  • Echoing themes through cinematic shots
  • Sound design

I am writing up the full technical/storytelling breakdown on X. You can catch it here: https://x.com/resonance_src

Feel free to ask me questions. Hope y'all enjoy!
