technology

What’s next for AI in 2024 – MIT Technology Review

STEPHANIE ARNETT/MITTR | ENVATO

2

Generative AI’s second wave will be video

It’s amazing how fast the fantastic becomes familiar. The first generative models to produce photorealistic images exploded into the mainstream in 2022—and soon became commonplace. Tools like OpenAI’s DALL-E, Stability AI’s Stable Diffusion, and Adobe’s Firefly flooded the internet with jaw-dropping images of everything from the pope in Balenciaga to prize-winning art. But it’s not all good fun: for every pug waving pompoms, there’s another piece of knock-off fantasy art or sexist sexual stereotyping.

The new frontier is text-to-video. Expect it to take everything that was good, bad, or ugly about text-to-image and supersize it.

A year ago we got the first glimpse of what generative models could do when they were trained to stitch together multiple still images into clips a few seconds long. The results were distorted and jerky. But the tech has rapidly improved.

Runway, a startup that makes generative video models (and the company that co-created Stable Diffusion), is dropping new versions of its tools every few months. Its latest model, called Gen-2, still generates video just a few seconds long, but the quality is striking. The best clips aren’t far off what Pixar might put out.

Runway has set up an annual AI film festival that showcases experimental movies made with a range of AI tools. This year’s festival has a $60,000 prize pot, and the 10 best films will be screened in New York and Los Angeles.

It’s no surprise that top studios are taking notice. Movie giants, including Paramount and Disney, are now exploring the use of generative AI throughout their production pipeline. The tech is being used to lip-sync actors’ performances to multiple foreign-language overdubs. And it is reinventing what’s possible with special effects. In 2023, Indiana Jones and the Dial of Destiny starred a de-aged deepfake Harrison Ford. This is just the start.  

Away from the big screen, deepfake tech for marketing or training purposes is taking off too. For example, UK-based Synthesia makes tools that can turn a one-off performance by an actor into an endless stream of deepfake avatars, reciting whatever script you give them at the push of a button. According to the company, its tech is now used by 44% of Fortune 100 companies. 

The ability to do so much with so little raises serious questions for actors. Concerns about studios’ use and misuse of AI were at the heart of the SAG-AFTRA strikes last year. But the true impact of the tech is only just becoming apparent. “The craft of filmmaking is fundamentally changing,” says Souki Mehdaoui, an independent filmmaker and cofounder of Bell & Whistle, a consultancy specializing in creative technologies.