Sora, the next audiovisual revolution?

March 12, 2024

DALL-E, the AI that generates images from text, was already very impressive, and the arrival of Sora is set to shake up the creative industry even further. We take a closer look at this new AI, what it should be capable of, and its limitations.

A video from a simple sentence

As with image generation, Sora needs only a simple sentence to generate a video. Sam Altman shared some generated videos, and the results are already astounding.

Video generated by Sora, prompt (instruction sentence): Two golden retrievers podcasting on top of a mountain

The video contains some flaws, but the result is already very realistic. The shadows are handled well, the dogs blend naturally into the setting, and the motion is smooth.

A video from an existing video

Sora can also generate videos from a reference: you give it an existing video and ask it to generate a similar one. This could be very useful for artists who want to create videos in a particular style.

Original video

Video generated by Sora, prompt (instruction sentence): Change the setting to the 1920s with an old school car. make sure to keep the red color

Video generated by Sora, prompt (instruction sentence): Rewrite the video in a pixel art style

The results stay faithful to the original video while following the new artistic direction.

Being able to generate a video from an existing one opens up a world of possibilities. We can imagine providing it with entire video sequences and asking it to do the editing for us. This remains speculative, of course, and the costs would likely be very high.

The limitations of Sora

This new AI is not perfect. For now, we are told it will generate videos of up to 1 minute. That limits what can be done with it, but it is more than enough for platforms like TikTok or Instagram.

Some videos are also poorly generated, with duplicated objects, strange movements, and physics-defying actions. It is very likely, however, that these flaws will be corrected in future versions.

Video generated by Sora, prompt (instruction sentence): Archeologists discover a generic plastic chair in the desert, excavating and dusting it with great care.

The price?

OpenAI has not yet officially announced the price for generating a video, but one thing is almost certain: it will not be free.

It is very likely that the price will be based on the length of the video, with some estimating a cost of between $0.01 and $0.10 per second of generated video.
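To get a feel for what those estimates would mean in practice, here is a minimal sketch that multiplies a video's length by the two speculative rates. The rates and the one-minute cap come from the figures quoted in this article, not from any official OpenAI pricing, and the function name `estimate_cost_range` is purely illustrative.

```python
# Speculative per-second rates quoted in the article, in US dollars.
# These are NOT official OpenAI prices.
LOW_RATE_USD = 0.01
HIGH_RATE_USD = 0.10
MAX_DURATION_S = 60  # reported one-minute cap on video length


def estimate_cost_range(duration_s: float) -> tuple[float, float]:
    """Return the (low, high) estimated cost in dollars for one video."""
    if not 0 < duration_s <= MAX_DURATION_S:
        raise ValueError(
            f"duration must be greater than 0 and at most {MAX_DURATION_S} seconds"
        )
    # Round to whole cents to hide floating-point noise.
    low = round(duration_s * LOW_RATE_USD, 2)
    high = round(duration_s * HIGH_RATE_USD, 2)
    return low, high


if __name__ == "__main__":
    for seconds in (15, 30, 60):
        low, high = estimate_cost_range(seconds)
        print(f"{seconds:>2} s video: ${low:.2f} to ${high:.2f}")
```

Under these assumptions, a full one-minute video would cost somewhere between $0.60 and $6.00.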

It is also conceivable that its use might be included in the ChatGPT Plus subscription, but all this remains purely speculative.