AI Filmmaking: How I Made my Movie using AI

The Making of Maximum Perception

Takeaways:
  • Michael Tippett, an experienced entrepreneur and two-time Emmy nominee, has spent over two decades working at the intersection of technology and media, exploring how emerging technologies like AI can transform creative industries.

  • This post will walk you through the creation of his experimental AI film, “Maximum Perception,” offering insights into the tools and step-by-step process he used, the challenges he faced, and the potential of AI in filmmaking.

About Me

As a two-time Emmy nominee and an entrepreneur with over two decades of experience, I’ve spent much of my career exploring the intersection of technology and media. My journey has taken me through the founding and managing of several successful tech companies in New York and Vancouver, always driven by a deep interest in how emerging technologies can transform creative industries.

In particular, my focus has increasingly turned towards AI and its growing impact on film production. Through my site, Tippett.org, I share my explorations and experiments with AI, documenting how these tools are reshaping the landscape of media production.

In this blog post, I’ll be taking you through the process of creating my experimental AI film, “Maximum Perception.” This project was born out of a desire to push the boundaries of what AI technology can do today, and to explore both its possibilities and limitations. As you watch the film, many of these aspects become evident, illustrating how AI can both enhance and challenge the creative process. Over time, I’ve created multiple versions of this story, each iteration reflecting improvements in both my skills and the technology available. From the original version created in February 2024 to the more polished episodes released in September 2024, “Maximum Perception” has evolved significantly.

The original version: February 2024
Trailer: July 2024
Episode 1: The Pitch – September 2024
Episode 2: The Catch – September 2024

In this post, I’ll delve into the specific tools and processes I used to bring this project to life, sharing insights into what worked well and what I’m still striving to improve. Whether you’re a seasoned professional or new to AI in film, I hope this guide offers valuable takeaways that can help you in your own creative journey. Additionally, I’m excited to invite others to join the project as we continue to explore this frontier. For those interested, I’ve also included links to AIDEO resources, which provide up-to-date information on the latest AI tools and features for video and film production.

I look forward to sharing my experiences and learning from others as we navigate the rapidly evolving world of AI-driven filmmaking together.

General Observations:

When it comes to creating AI films, there’s a common misconception that it’s as simple as pressing a button and watching the magic happen. In reality, the process is much more complex and requires a blend of various tools, particularly when it comes to post-production. Generative AI tools are just one part of the puzzle; they need to be used in concert with traditional editing software like Final Cut Pro, Adobe Premiere, or CapCut. Surprisingly, a significant amount of time is also spent in tools like Photoshop, fine-tuning the visuals to achieve the desired quality.

From my experience, the best AI films often come from those who already have a background in filmmaking. These creators understand the fundamentals of storytelling, pacing, and visual composition, which AI tools alone can’t fully replicate. While there are AI tools that can assist with story development, the ability to craft a compelling narrative remains a distinctly human skill. Sound design, too, continues to be crucial, and right now, AI tools in this area are not as advanced as their visual counterparts. The richness and texture that a skilled sound designer brings to a film are still hard to replicate with current AI technology.

One of the challenges I face in this space is the tendency for AI-generated content to feel commoditized or generic. With many filmmakers using the same avatars, voices, and music, it’s easy to fall into the trap of creating something that feels indistinguishable from other AI films.

I try to inject as much uniqueness and weirdness into my work as possible, but AI tools often smooth out the rough edges of interesting source material, especially when animating unconventional characters or attempting to create non-human lip syncs. This tendency to normalize outputs likely stems from the way generative AI models function—they are designed to produce the ‘most likely’ result, which often translates to something average or typical.

Keeping it weird

Step-by-Step Process for Creating “Maximum Perception”

1. Creating the Story:

  • Developing the Concept: The process of creating “Maximum Perception” began with developing a compelling story. AI tools like ChatGPT were instrumental in this stage, helping to brainstorm ideas, flesh out character arcs, and craft dialogue. My primary goal at this stage was to determine the length of the project and shape the overall story arc. It was also essential to decide on the final format early on—whether the film would be suited for the big screen or a platform like TikTok, as this significantly influenced how the story would unfold.
  • Story Outline and Character Development: Once I had a general outline—a classic three-act structure—I used tools like ChatGPT to further develop characters, create dialogue, and even generate names. For “Maximum Perception,” all character names were AI-generated, combining the most popular first and last names worldwide (e.g., Mo Zhao, Asha Chen). The final script emerged from this AI-assisted process, where I expanded upon the storyline suggested by ChatGPT and then conducted a thorough edit to ensure the narrative resonated with my vision. In the end, most of the final wording in the script was my own.
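If you prefer to script this brainstorming step rather than work in the chat interface, a minimal sketch using the OpenAI Python SDK might look like the following. The model name and prompts here are illustrative assumptions, not the exact ones behind “Maximum Perception”:

```python
# Minimal sketch: driving the brainstorming step through the OpenAI Python SDK.
# The model name and prompts are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def brainstorm(prompt: str) -> str:
    """Send a single brainstorming prompt and return the model's reply."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: any capable chat model works here
        messages=[
            {"role": "system", "content": "You are a story-development assistant for a short AI film."},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    outline = brainstorm(
        "Propose a three-act outline for a short sci-fi film about machine perception."
    )
    names = brainstorm(
        "Suggest ten character names that combine the world's most common "
        "first and last names (e.g., Mo Zhao, Asha Chen)."
    )
    print(outline)
    print(names)
```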

2. Creating the Outline:

  • Dialogue Organization and Audio Generation: After completing the script, I used ChatGPT to organize all the speaking parts by character. This step was crucial for managing the workflow as I moved into the next phase of production. Each character’s lines were then processed through ElevenLabs to generate audio outputs (a rough sketch of this step follows this list). While I generally used the standard voices provided by ElevenLabs, you can create more authentic outputs by training the AI on custom voices or working with actors to create original recordings. It’s vital that any actors involved fully understand the implications of voice capture, as their voice could be manipulated to say anything once recorded. If you go down this path, clear agreements about usage and future limitations are crucial.
  • Integrating Dialogue into the Film: With the dialogue generated, I had two primary options: (a) Import the audio files into video editing software like Final Cut Pro to create an audio foundation for the film, or (b) use the audio to generate a lip-synced video file. The first option involved a straightforward process of importing and editing the audio to match the script. The second option, generating lip-synced characters, required additional tools and attention to detail, particularly when matching audio to animated characters.
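To make the audio-generation step concrete, here is a rough sketch of batching lines of dialogue by character through the ElevenLabs text-to-speech API. The voice IDs, model ID, and placeholder dialogue are assumptions for illustration; check the current ElevenLabs documentation for exact parameters:

```python
# Rough sketch: generating one audio file per line of dialogue with the
# ElevenLabs text-to-speech API. Voice IDs, model_id, and the dialogue
# lines below are placeholders, not material from the actual script.
import os
import requests

API_KEY = os.environ["ELEVENLABS_API_KEY"]

# Hypothetical mapping from character to an ElevenLabs voice ID.
VOICES = {
    "Mo Zhao": "voice_id_for_mo",
    "Asha Chen": "voice_id_for_asha",
}

# Speaking parts organized by character, as exported from the script.
LINES = [
    ("Mo Zhao", 1, "Placeholder line of dialogue."),
    ("Asha Chen", 2, "Another placeholder line."),
]

def synthesize(character: str, index: int, text: str) -> None:
    """Request speech for one line and save it as an MP3 named by character and line number."""
    voice_id = VOICES[character]
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    resp = requests.post(
        url,
        headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
        json={"text": text, "model_id": "eleven_multilingual_v2"},
    )
    resp.raise_for_status()
    filename = f"{character.replace(' ', '_')}_{index:03d}.mp3"
    with open(filename, "wb") as f:
        f.write(resp.content)

for character, index, text in LINES:
    synthesize(character, index, text)
```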

3. Character Creation:

  • Generating Visuals: For “Maximum Perception,” character creation involved several steps, leveraging various GenAI tools. These tools offer different modes, such as text-to-video, image-to-video, and video-to-video. I began by generating still images of the characters using Midjourney or Flux. Midjourney, in particular, allowed me to “zoom out” and expand the visual scope of these characters, creating broader frames and more detailed environments based on the initial character designs.
  • Animating Characters: Once I had the character images, I used tools like Runway or Luma to animate them. AI video generation can be unpredictable, often requiring multiple takes to achieve the desired effect. However, this unpredictability can also be a creative asset, introducing unexpected elements that can enrich the story. I embraced this aspect, treating the AI more as a collaborator than a tool, allowing it to influence the narrative direction organically. This approach is somewhat akin to gardening—planting the seeds and waiting to see what grows. Sometimes the results are surprising, adding layers of creativity that weren’t part of the original plan.
  • Lip Syncing and Final Adjustments: The final step in character creation involved lip-syncing the video in Runway. Ideally, the AI-generated clips would include some basic movements, such as blinking or slight head turns, which would then be synchronized with the audio. Too much movement, however, could complicate the lip-syncing process, so I aimed to have a lip-synced clip for each piece of dialogue. This step was crucial in ensuring that the characters’ on-screen presence felt authentic and engaging.
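Since much of this stage happens inside the tools’ own interfaces, the sketch below only illustrates the shape of the workflow: generate several takes per shot, keep them organized for review, and lip-sync the chosen take against each line of dialogue. The generation and lip-sync calls are hypothetical placeholders, not real Runway or Luma endpoints:

```python
# Workflow sketch only: the generation calls below are hypothetical placeholders
# standing in for whichever image-to-video and lip-sync tools you use
# (e.g., Runway or Luma, via their apps or APIs).
from pathlib import Path

TAKES_PER_SHOT = 4  # AI video output varies, so request several takes per shot

def generate_video_take(still_image: Path, prompt: str, seed: int) -> bytes:
    """Placeholder for an image-to-video request; returns encoded video bytes."""
    raise NotImplementedError("Call your image-to-video tool of choice here.")

def lip_sync(video_clip: Path, audio_clip: Path) -> bytes:
    """Placeholder for a lip-sync request pairing one clip with one line of dialogue."""
    raise NotImplementedError("Call your lip-sync tool of choice here.")

def build_character_shots(character: str, still_image: Path, prompt: str, audio_dir: Path) -> None:
    out_dir = Path("takes") / character
    out_dir.mkdir(parents=True, exist_ok=True)

    # 1. Generate several takes so there is something usable to choose from.
    for seed in range(TAKES_PER_SHOT):
        clip = generate_video_take(still_image, prompt, seed)
        (out_dir / f"{character}_take{seed:02d}.mp4").write_bytes(clip)

    # 2. Lip-sync the chosen take (here: take 0) against each line of dialogue,
    #    reusing the per-character audio files generated earlier.
    chosen = out_dir / f"{character}_take00.mp4"
    for audio in sorted(audio_dir.glob(f"{character}_*.mp3")):
        synced = lip_sync(chosen, audio)
        (out_dir / f"{audio.stem}_synced.mp4").write_bytes(synced)
```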

4. Pulling It All Together:

  • Assembling Visuals and Dialogue: With all the dialogue and character animations ready, I began assembling the film in Final Cut Pro, aligning everything with the script. This stage is where the film’s foundation truly takes shape, as the dialogue drives the editing decisions that follow. I reviewed the clips to determine which were the most effective, allowing these choices to guide the visual and narrative flow of the film.
  • Building the World: Creating the world of “Maximum Perception” involved a similar workflow to character creation, using Midjourney, Runway, and audio tools to generate landscapes, buildings, and other environmental elements. This phase was about crafting the visual context in which the characters would interact, making the world as immersive and detailed as possible. One significant advantage of using GenAI in this process is the ability to generate assets on the fly—if I needed a specific scene, like a flyover of a canyon, I could create it in real-time without the need for traditional filming logistics.
  • Finalizing the Video: Once all the visual elements were in place, I moved on to sound design. I used ElevenLabs to generate additional dialogue and sound effects as needed. While I’ve experimented with AI tools like Suno and Udio for music generation, I often rely on Epidemic Sound for its extensive library of music and sound effects, which provide the professional quality I aim for in my films.

5. Outputs:

  • Final Touches: The last steps involved adding credits and determining the final format for the film, whether in 4K resolution or optimized for social media platforms. Once everything was in place, I generated the appropriate files and distributed the film across social media and other channels, sharing “Maximum Perception” with the world.
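As an illustration of that last formatting step, the sketch below uses ffmpeg (assuming it is installed) to render a 4K master and a vertical cut for social platforms from a finished export. The filenames and encoder settings are generic starting points, not the exact exports used for the film:

```python
# Illustrative only: rendering a 4K master and a vertical social-media cut
# from the finished edit with ffmpeg. Assumes ffmpeg is installed; filenames
# and encoder settings are generic starting points.
import subprocess

SOURCE = "maximum_perception_final.mov"  # hypothetical export from the NLE

# 4K master for big-screen playback.
subprocess.run([
    "ffmpeg", "-i", SOURCE,
    "-vf", "scale=3840:2160",
    "-c:v", "libx264", "-crf", "18", "-preset", "slow",
    "-c:a", "aac", "-b:a", "320k",
    "maximum_perception_4k.mp4",
], check=True)

# Vertical 1080x1920 cut for platforms like TikTok: crop to 9:16, then scale.
subprocess.run([
    "ffmpeg", "-i", SOURCE,
    "-vf", "crop=ih*9/16:ih,scale=1080:1920",
    "-c:v", "libx264", "-crf", "20", "-preset", "medium",
    "-c:a", "aac", "-b:a", "192k",
    "maximum_perception_vertical.mp4",
], check=True)
```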

Learn more / collaborate

I’m always looking to work with smart people. If you’d like to learn more about my work or are interested in collaborating, please get in touch.
