AI generates photo-realistic 3D scenes and lets you edit them too

Artificial intelligence that creates realistic three-dimensional images could run on a laptop and make it faster and easier to create animated movies


June 22, 2022

Artificial intelligence models could soon be used to instantly create or edit nearly photo-realistic three-dimensional scenes on a laptop. The tools could help artists who work on games and CGI in movies, or be used to create hyper-realistic avatars.

AIs have been able to produce realistic 2D images for a while now, but 3D scenes have proved trickier due to the massive computational power required.

Eric Ryan Chan at Stanford University in California and colleagues have created an AI model, EG3D, that can generate arbitrary high-resolution images of faces and other objects, along with an underlying geometric structure.

“It’s one of the first [3D models] to achieve a rendering quality approaching photorealism,” says Chan. “In addition, it generates finely detailed 3D shapes and is fast enough to run in real time on a laptop.”

EG3D and its predecessors use a type of machine learning called a generative adversarial network (GAN) to produce images. These systems pit two neural networks against each other by using one to generate images and another to judge their accuracy. They repeat this process many times until the result is realistic.
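To make the generator-versus-judge loop concrete, here is a minimal sketch in plain Python. It is a toy, not EG3D's code: the "images" are single numbers, the generator has one learnable offset `b`, and the judge (the discriminator) is a simple logistic function. The function name `train_toy_gan` and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def train_toy_gan(real_mean=2.0, steps=3000, batch=64, lr=0.05, seed=0):
    """Toy 1D GAN. Real samples ~ N(real_mean, 1); fake samples = noise + b."""
    rng = random.Random(seed)
    b = 0.0          # generator parameter (learned offset)
    w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)

    for _ in range(steps):
        real = [rng.gauss(real_mean, 1.0) for _ in range(batch)]
        fake = [rng.gauss(0.0, 1.0) + b for _ in range(batch)]

        # Discriminator step: minimise -log D(real) - log(1 - D(fake)),
        # i.e. learn to score real samples high and fake samples low.
        grad_w = grad_c = 0.0
        for x in real:
            err = sigmoid(w * x + c) - 1.0  # d(loss)/d(logit) for a real sample
            grad_w += err * x
            grad_c += err
        for x in fake:
            err = sigmoid(w * x + c)        # d(loss)/d(logit) for a fake sample
            grad_w += err * x
            grad_c += err
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: minimise -log D(fake),
        # i.e. shift b so the discriminator rates fakes as more real.
        grad_b = sum((sigmoid(w * x + c) - 1.0) * w for x in fake) / batch
        b -= lr * grad_b

    return b, w, c


if __name__ == "__main__":
    b, w, c = train_toy_gan()
    print(f"learned generator offset: {b:.2f} (real data mean is 2.0)")
```

In this setup the generator's offset should drift toward the mean of the real data as the two sides take turns improving. Real image GANs follow the same alternating pattern, with deep neural networks in place of the one-parameter generator and logistic judge.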

Chan’s team took features from existing high-resolution 2D GANs and added a component that can convert these images into 3D space. “By splitting the architecture into two parts…we solve two problems at once: compute efficiency and backward compatibility with existing architectures,” Chan says.

3D faces generated by the EG3D artificial intelligence

Jon Eriksson/Stanford Computational Imaging Lab

Models like EG3D can produce 3D images that are almost photo-realistic, but the results can be difficult to edit in design software: although the output is an image we can see, how the GAN actually produces it is a mystery.

Another new model could help here. Yong Jae Lee at the University of Wisconsin-Madison and his colleagues created a machine learning model called GiraffeHD, which attempts to extract manipulable features from a 3D image.

“If you’re trying to generate an image of a car, you may want to control the type of car,” Lee says. The model may also let you determine the car’s shape and color, and the background or landscape in which it is located.

GiraffeHD is trained on millions of images of a specific type of object, such as cars, and looks for latent factors – hidden features in the image that correspond to categories, such as car shape, color or camera angle. “The way our system is designed allows the model to learn how to generate these images in a way that separates these different factors, such as controllable variables,” Lee says.
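The idea of separated, controllable factors can be sketched with a small data structure. This is an illustrative assumption, not GiraffeHD's actual interface: the `CarLatents` class, its field names and the `toy_generate` stand-in are hypothetical. In a real model the separation is learned during training, whereas this toy simply hard-wires it.

```python
import random
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class CarLatents:
    """Hypothetical latent codes, one per controllable factor."""
    shape: tuple
    color: tuple
    camera_angle: tuple
    background: tuple


def sample_latents(rng, dim=4):
    """Draw a random code for every factor."""
    def code():
        return tuple(rng.gauss(0.0, 1.0) for _ in range(dim))
    return CarLatents(code(), code(), code(), code())


def toy_generate(latents):
    """Stand-in for a trained generator: each output attribute
    depends on exactly one latent code (perfect separation)."""
    return {f.name: sum(getattr(latents, f.name)) for f in fields(latents)}


rng = random.Random(0)
original = sample_latents(rng)

# Edit a single factor: swap in a fresh color code, keep everything else.
recolored = replace(original, color=sample_latents(rng).color)

before, after = toy_generate(original), toy_generate(recolored)
changed = [name for name in before if before[name] != after[name]]
print("attributes that changed:", changed)
```

Because each factor lives in its own code, swapping one code changes only the matching attribute of the output, which is the kind of targeted control Lee describes.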

These controllable features could eventually be used to edit generated 3D images, letting users adjust specific properties of a scene.

Details of these models will be presented at the Conference on Computer Vision and Pattern Recognition in New Orleans, Louisiana, this week.

EG3D and GiraffeHD are part of a wider move towards using AIs to create 3D images, says Ivor Simpson at the University of Sussex, UK. However, there are still issues to be solved in terms of wider applicability and algorithmic bias. “They can be limited by the data you enter,” Simpson says. “If a model is trained on faces, and if someone has a completely different facial structure that has never been seen before, it can’t generalize very well.”
