About OpenAI’s Sora: An Interview with CTO Mira Murati

Three days ago, Mira Murati, the CTO of OpenAI, sat down for an interview. Mira oversees all of the company’s technology, including Sora, OpenAI’s text-to-video AI model, and she spoke about it at length. Many people have been greatly impressed by Sora’s AI-generated videos, while also expressing concerns about their impact.

How does it work?

Sora is OpenAI’s video generation model. From just a text prompt, it creates hyper-realistic, beautiful, highly detailed videos up to a minute long. Sora is essentially a diffusion model, a type of generative model that starts from random noise and progressively refines it into a coherent image. It analyzes numerous videos to learn to identify objects and actions. Given a text prompt, it constructs a scene by establishing a timeline and adding detail to each frame. What sets its output apart from other AI video is its remarkable smoothness and realism.
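To make the diffusion idea a little more concrete, here is a minimal, hypothetical sketch of the kind of reverse-denoising loop a diffusion model runs at generation time. The denoise_step function, the frame shapes, and the toy usage below are illustrative assumptions for this article, not Sora’s actual implementation, which OpenAI has not released.

```python
import numpy as np

def generate_video_frames(denoise_step, prompt_embedding, num_frames=60,
                          height=64, width=64, channels=3, num_steps=50):
    """Illustrative reverse-diffusion loop (not OpenAI's actual code).

    Starts with pure Gaussian noise for every frame and repeatedly asks a
    denoising model for a slightly cleaner estimate of the whole clip,
    conditioned on the text prompt, until only the "clean" video remains.
    """
    # Begin with random noise for every frame in the clip.
    frames = np.random.randn(num_frames, height, width, channels)

    # Walk the noise schedule backwards: t = num_steps - 1, ..., 0.
    for t in reversed(range(num_steps)):
        # Denoising all frames together is what keeps motion consistent
        # from one frame to the next.
        frames = denoise_step(frames, prompt_embedding, t)

    return frames


# Toy usage with a stand-in "model" that just shrinks the noise a little;
# a real system would call a trained neural network here.
def toy_denoise_step(frames, prompt_embedding, t):
    return frames * 0.9

clip = generate_video_frames(toy_denoise_step, prompt_embedding=None, num_steps=10)
print(clip.shape)  # (60, 64, 64, 3)
```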

A bull in a china shop, generated by Sora, OpenAI’s text-to-video AI model.

Mira elaborated, stating, “If you think about filmmaking, maintaining consistency between frames is crucial for realism and presence. Sora excels at this, ensuring a seamless flow from one frame to the next. If this continuity is disrupted, the sense of reality diminishes.” This proficiency is evident in the videos OpenAI generates from user prompts.

A video of two women generated by Sora, OpenAI’s AI model.

Why are hands so difficult to simulate?

During the interview, a video featuring two women was discussed, with one appearing to have an unusual number of fingers in one shot. Mira explained that simulating hand motion accurately is challenging due to its complexity.

Sora’s difficulty depicting hands.

In the clip, mouths move without sound. For now, OpenAI is not focusing on audio with Sora, but plans to address it in the future.

The interviewer asked about the videos Sora learned from, speculating about popular shows like Ferdinand or SpongeBob, and suggested that the training data might include YouTube, Facebook, and Instagram videos. Mira hesitated and said she was unsure about the data sources. Some people criticized her for declining to answer questions that were squarely within the realm of technology, despite being the company’s Chief Technology Officer. She also avoided questions about the environmental impact and energy cost of running Sora.

The full interview is here.

 
