Researchers unveil new AI-based technique that can create short videos based on single images

Why it matters: Researchers continue to find new ways to leverage artificial intelligence and machine learning as the technology evolves. Earlier this week, a Google scientist announced Transframer, a new framework that can generate short videos from a single input image. Technology like this may one day augment traditional rendering solutions, allowing developers to build virtual environments using machine learning.

The new framework's name (and, in a sense, its concept) is a nod to Transformer. First introduced in 2017, Transformer is a neural network architecture that generates text by modeling and comparing the relationships between the words in a sentence. The model has since been built into standard deep learning frameworks such as TensorFlow and PyTorch.
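For a concrete picture of the underlying architecture, here is a minimal sketch of invoking a Transformer encoder in PyTorch. The dimensions and data are illustrative placeholders, not details from the Transframer work:

```python
import torch
import torch.nn as nn

# A single Transformer encoder layer: self-attention lets every token
# in a sequence attend to every other token.
layer = nn.TransformerEncoderLayer(d_model=64, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)

# Illustrative input: a batch of 1 sequence of 10 tokens, each a
# 64-dimensional embedding (e.g., embedded words in a sentence).
tokens = torch.randn(1, 10, 64)
contextualized = encoder(tokens)  # same shape: (1, 10, 64)
print(contextualized.shape)
```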

Just as Transformer uses the context of language to predict likely outputs, Transframer creates short videos from context images with similar attributes, combined with query annotations. The resulting videos move around the target image and render accurate perspectives, even though no geometric data was provided in the original image input.
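Conceptually, the workflow looks something like the sketch below. Everything here is a hypothetical illustration: the function name, the stand-in "model" that merely averages the context, and the angle-based annotations are invented to show the shape of the idea, not DeepMind's actual code:

```python
import numpy as np

def predict_frame(context_frames, query_angle_deg):
    """Hypothetical stand-in for the learned model: given context
    frames and a requested camera angle, return a predicted frame.
    A real model would condition an image generator on the context;
    here we just average the context images as a placeholder."""
    return np.mean(context_frames, axis=0)

# One context image (64x64 RGB) and a sweep of query angles that
# together describe a short orbit around the subject.
context = [np.random.rand(64, 64, 3)]
video = [predict_frame(context, angle) for angle in range(0, 360, 30)]
print(len(video), video[0].shape)  # 12 predicted frames, each 64x64x3
```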

Demonstrated on Google's DeepMind AI platform, the technique works by analyzing a single context photo to extract the most significant pieces of image data and generate additional images. During this analysis, the system identifies the photo's framing, which helps it predict the photo's surroundings.

The context image is then used to predict what the scene would look like from different angles. The prediction models the probability of additional image frames based on the data, annotations, and any other information available in the context frames.
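In probabilistic terms, each new frame is sampled conditioned on the frames and annotations that came before it. The autoregressive loop below is a rough sketch under that assumption; the stand-in sampler simply perturbs the previous frame, where a trained model would sample from a learned distribution over pixels:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_next_frame(frames, annotation):
    """Stand-in for p(next_frame | previous_frames, annotation).
    A trained model would output a distribution and sample from it;
    here we perturb the most recent frame as a placeholder."""
    return frames[-1] + 0.01 * rng.standard_normal(frames[-1].shape)

frames = [np.zeros((64, 64, 3))]      # the single context image
for t in range(1, 8):                 # autoregressively add 7 frames
    annotation = {"timestep": t}      # e.g., requested viewpoint/time
    frames.append(sample_next_frame(frames, annotation))
print(len(frames))  # 8 frames total: 1 context + 7 predicted
```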

This framework represents a major step forward in video technology, providing the ability to generate reasonably accurate video from a very limited set of input data. Transframer also shows very promising results on other video-related tasks and benchmarks, such as semantic segmentation, image classification, and optical flow prediction.

The impact on video-based industries such as game development could be enormous. Current game development environments rely on core rendering techniques such as shading, texture mapping, depth of field, and ray tracing. Technologies such as Transframer could use AI and machine learning to build environments while reducing the time, resources, and effort required to create them, potentially offering developers entirely new development paths.

Image credit: DeepMind