notes on
process

For the creation of synthetic video revolving around generative AI-based tools and traditional post-production 3D composition



THIS IS A BROAD LIST OF IMPORTANT STEPS IN MY PROCESS.

TO CALL THIS A GUIDE WOULD BE A JOKE, BUT IT'S POSSIBLY HELPFUL.


USE YOUR IMAGINATION AND YOUR SKILLS, OR OBTAIN SOME.

ASK YOUR FRIENDS TO WATCH AND POINT OUT WHATEVER LOOKS WRONG TO THEM. COMPLIMENTS DON'T IMPROVE YOUR WORK.

IN REALITY, THERE ARE PROBABLY 75+ SMALL STEPS ALONG THE WAY THAT, IF LISTED, WOULD EITHER BE TOO BORING TO READ OR SCARE PEOPLE AWAY.

DON'T TAKE THIS LIST TOO SERIOUSLY. ALL OF MY PROMPTS AND METHODS ARE OPEN SOURCE OR OPEN TO THE PUBLIC. MOST OF ALL, JUST KEEP DOING IT. REPETITION AND PRACTICE HAVE NO SUBSTITUTE.



FOR THE CREATION OF SYNTHETIC SHORT-FORM VIDEO, FOCUSED ON PHOTOREALISM AND NATURAL PHYSICS

1. Concept Refinement and Shot List Development (Claude by Anthropic)

  • I refine the initial concept and develop a detailed shot list using Claude by Anthropic. This establishes a solid foundation for the visual narrative right from the start.
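  • For repeatable projects, this step can be scripted. Below is a minimal sketch using Anthropic's Python SDK; the model name, concept string, and prompt wording are placeholder assumptions, not my exact setup:

    # Minimal sketch: asking Claude for a structured shot list.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    concept = "30-second photoreal car spot, night city, rain"  # placeholder concept

    message = client.messages.create(
        model="claude-3-5-sonnet-latest",  # assumption: use whatever current model you have
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"Concept: {concept}\n"
                       "Break this into a numbered shot list. For each shot, give "
                       "framing, camera movement, lens feel, lighting, and duration.",
        }],
    )

    print(message.content[0].text)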

2. Image Generation (Midjourney)

  • I leverage the knowledge gained from generating over 18,000 prompts on Midjourney to create the first frame for every shot. No reference photos are used, so hierarchical prompting and a thorough understanding of modifiers and SREFs are required. For this project, I created 377 images, ultimately selecting a dozen that best suited the vision. These become the first frame of every clip in the final delivery.
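  • To illustrate what I mean by hierarchical prompting, a hypothetical prompt might order subject, then environment, then lens and film modifiers, then parameters. The style reference below is a placeholder; substitute your own:

    photorealistic luxury sedan cornering on wet asphalt at dusk ::
    rain-soaked industrial district, sodium vapor glow ::
    35mm anamorphic, shallow depth of field, subtle motion blur
    --ar 16:9 --style raw --sref <your style reference> --v 6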

3. Brand Adherence (Adobe Photoshop)

  • I then refine the car details from the selected Midjourney images in Adobe Photoshop. It's crucial that these visuals align with BMW's brand guidelines and accurately represent the car's appearance. Often I'll have to batch-replace a specific car part in every shot, for example, since the datasets behind Midjourney and all TXT2IMG generators are constantly changing.

4. Resolution Enhancement (Magnific)

  • With the first frame of every shot created, I use Magnific to upscale these images to 6K resolution, ensuring each frame is optimized for high-quality output. NOTE: make sure you have a basic grasp of the Fractality setting before remotely rendering in Magnific; it can make or break your images.

5. GenAI Clip Creation (ComfyUI)

  • I generate GenAI video clips using various open-source models, handling local rendering for shots that require RenderNet access or detailed character consistency, and remote rendering for world modeling (see the next step for more info). Hosting generative AI video platforms locally is extremely challenging for many reasons, but mainly due to the need for significant computational resources and the complexity of managing dependencies and optimizing performance. This setup allows for the most control over IMG2VID shot generation, but it demands a deep understanding of system architecture and network configurations. I'm phasing it out as remote rendering services offer greater character consistency and prompt adherence, and I move more and more tasks to remote processing as they become available.
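  • If you do go local, ComfyUI exposes a small HTTP API once the server is running, which makes batch-queueing shots scriptable. A rough sketch, assuming the default port and a workflow exported with "Save (API Format)"; the node id and filenames are hypothetical:

    import json
    import requests  # third-party: pip install requests

    # Load a workflow graph exported from ComfyUI in API format.
    with open("img2vid_workflow.json") as f:
        workflow = json.load(f)

    # Hypothetical: point the load-image node at this shot's first frame.
    # The node id "12" is specific to your own graph; check your exported JSON.
    workflow["12"]["inputs"]["image"] = "shot_04_frame.png"

    # Queue the job on the local ComfyUI server (default port 8188).
    resp = requests.post("http://127.0.0.1:8188/prompt", json={"prompt": workflow})
    print(resp.json())  # returns a prompt_id you can poll via the /history endpoint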

6. Remote Processing (AWS/Nvidia Cloud Instance)

  • To ensure efficiency, I run a remote AWS instance with a dedicated Nvidia A100 GPU, connected via a 5 Gbps fiber connection. This setup provides extremely powerful graphics compute for the parts of my workflow that involve recreating lifelike physics, natural movement, and world creation. Using several million dollars' worth of computing power for a few dollars, from anywhere in the world that has the bandwidth, is game-changing.
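  • Spinning the instance up and down around render sessions is what keeps the cost at "a few dollars." A minimal boto3 sketch; the AMI, key pair, region, and instance type below are placeholders (pick an A100-class type you actually have quota for):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # assumption: your region

    # Launch a GPU instance for a render session; terminate it when done
    # so you only pay for the hours you actually use.
    response = ec2.run_instances(
        ImageId="ami-xxxxxxxx",        # placeholder: a Deep Learning AMI or similar
        InstanceType="p4d.24xlarge",   # assumption: an A100-class instance type
        MinCount=1,
        MaxCount=1,
        KeyName="render-key",          # placeholder key pair name
    )
    print(response["Instances"][0]["InstanceId"])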

7. Clip Cleanup

  • I conduct detailed post-production cleanup, addressing typical issues found in IMG2VID engines. This includes de-banding, flicker adjustment, and a host of small fixes that can mostly be handled with the native tools in Resolve Studio or Autodesk Maya. I also create a synthetic shutter and let it drift in a deliberately non-linear fashion. If this sounds daunting, don't worry: in 2024, if you can simply identify the problem, you'll find plenty of answers on YouTube.
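  • Most of this happens in Resolve's native tools, but two of the usual IMG2VID artifacts also have scriptable fallbacks: ffmpeg ships deflicker and deband filters. A rough sketch via Python's subprocess; the filter strengths are guesses, so tune per clip and keep an edit-friendly intermediate codec:

    import subprocess

    # De-flicker, then de-band a raw GenAI clip, writing a ProRes intermediate.
    subprocess.run([
        "ffmpeg", "-i", "shot_04_raw.mov",
        "-vf", "deflicker=mode=pm:size=5,deband=range=8",
        "-c:v", "prores_ks", "-profile:v", "3",  # ProRes 422 HQ intermediate
        "shot_04_clean.mov",
    ], check=True)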

8. Editing (General)

  • Editing and pacing the video is straightforward but critical. My experience in editing allows me to complete this process efficiently, typically in under 30 minutes. Know how to pace and edit very quickly. The NLE can be very simple; it doesn't really matter. Understand the theory behind why most cuts happen when they do and you'll be fine.

9. Color Correction (DaVinci Resolve)

  • I perform basic color timing and correction in DaVinci Resolve. Since this is for a compressed Instagram ad, I make sure the black points and white balance are perfectly matched across shots. If you need to fix mistakes made along the way that have altered the color, pick a reference frame you like and split-screen it against the shots in question. If you know how, pull up your metering tools, specifically a waveform monitor. Correct white balance by adjusting the Primary Highlights wheel, not the software's "white balance" slider.

10. Resolution and Bitrate Check (Jungle77)

  • Using Jungle77, a custom open-source module optimized for Apple Silicon, I scan each frame to verify proper resolution and bitrate attributes. I also use it to check for dropped frames; if any are found, I patch over them quickly with the Optical Flow process.
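  • If you don't have something like Jungle77 handy, ffprobe reports the same attributes. A minimal stand-in sketch; the filename is a placeholder:

    import json
    import subprocess

    # Pull resolution, bitrate, frame rate, and an exact decoded frame count;
    # dropped frames show up as a mismatch against duration * fps.
    out = subprocess.run([
        "ffprobe", "-v", "error", "-select_streams", "v:0",
        "-count_frames",
        "-show_entries", "stream=width,height,bit_rate,avg_frame_rate,nb_read_frames",
        "-of", "json", "final_cut.mov",
    ], capture_output=True, text=True, check=True)

    print(json.loads(out.stdout)["streams"][0])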

11. Music Licensing and Preparation

  • Depending on the project, music licensing varies. In this case, the client provided a licensed track, and I used Logic Pro 11’s stem splitter to isolate the instrumental elements by removing vocals.

12. Sound Effects (Elevenlabs)

  • I generate all sound effects from text prompts using Elevenlabs. The goal is to create a library of useful sound snippets for further processing in the next step.
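  • This can also be batched through their API. A sketch under the assumption that ElevenLabs' sound-generation endpoint still looks like it did when I wrote this; check their current API reference before leaning on it:

    import os
    import requests  # third-party: pip install requests

    # Assumption: the v1 sound-generation endpoint and xi-api-key header.
    resp = requests.post(
        "https://api.elevenlabs.io/v1/sound-generation",
        headers={"xi-api-key": os.environ["ELEVENLABS_API_KEY"]},
        json={"text": "tire squeal on wet asphalt, close perspective, short"},
    )
    resp.raise_for_status()

    # The response body is the rendered audio.
    with open("sfx_tire_squeal.mp3", "wb") as f:
        f.write(resp.content)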

13. Sound Design (Logic Pro 11)

  • I'm comfortable in Logic Pro 11, and it works great for video and surround if needed, but use whichever DAW you work with best. I give each effect its own track so I can use hard panning at cut points to emphasize transitions, and slow panning through automation to follow a moving object across the screen. Make sure to normalize gain across all tracks, and automate the mix volume to keep the music central. When voiceover is present, I use TrackSpacer to duck (not compress) the music and effects. A limiter on the output channel prevents peaking; treat it as the minimum processing you put there.

14. Filmstock Emulation (DaVinci Resolve)

  • Back in DaVinci Resolve, I add the final audio track to the timeline and use FilmConvert Nitrate to emulate a film color space and add a touch of grain.

15. 6K Master Render

  • I render the final master in ProRes 422 HQ at 6K resolution, ensuring the highest quality in the event that I need to return and make any changes.
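  • If you ever need to conform a render to this spec outside your NLE, ffmpeg's prores_ks encoder at profile 3 targets the same ProRes 422 HQ. A minimal sketch; the filenames are placeholders:

    import subprocess

    # prores_ks profile 3 = ProRes 422 HQ; PCM audio keeps the master lossless.
    subprocess.run([
        "ffmpeg", "-i", "graded_timeline.mov",
        "-c:v", "prores_ks", "-profile:v", "3",
        "-c:a", "pcm_s16le",
        "master_6k_prores422hq.mov",
    ], check=True)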

16. Web Compression

  • Finally, I compress the output using tools that adhere to the codec standards set by each social media platform. Use whatever you're comfortable with: Adobe Media Encoder works great, and HandBrake can do the job for free. An example encode is sketched below.
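  • For reference, here's the shape of a typical H.264 social-delivery encode in ffmpeg, again via subprocess. The scale target and CRF value are assumptions for an Instagram placement; let your eyes and the platform's current specs make the final call:

    import subprocess

    subprocess.run([
        "ffmpeg", "-i", "master_6k_prores422hq.mov",
        "-vf", "scale=1080:-2",              # assumption: 1080-wide delivery
        "-c:v", "libx264", "-crf", "18", "-preset", "slow",
        "-pix_fmt", "yuv420p",               # broadest player compatibility
        "-c:a", "aac", "-b:a", "192k",
        "-movflags", "+faststart",           # playback starts before full download
        "instagram_delivery.mp4",
    ], check=True)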

RON RADOM

FOR MOBILE DEVICES ONLY