Sora, OpenAI’s Video Generation Software, Expands Capabilities Beyond Text Prompts

In a picturesque scene in Burano, Italy, colorful buildings line the canal streets, capturing the attention of a camera. A charming Dalmatian peers through a window on the ground floor of one of the buildings, adding a touch of endearment to the setting. Meanwhile, a bustling activity unfolds as numerous individuals stroll and cycle along the canal streets, creating a lively atmosphere in this charming town.

OpenAI, the creator of the renowned ChatGPT chatbot, has extended its capabilities with Sora, a video generation software. In a recent blog post, the company announced that Sora can now animate not only from text prompts but also from still images.

This development follows the success of ChatGPT, which gained popularity in late 2022 for its proficiency in composing emails, writing code, and crafting poems, showcasing the versatility of OpenAI’s advancements in generative AI.

Meta Platforms, the social media giant behind Facebook, enhanced its image generation model Emu in the past year. The upgrades introduced two AI-based features that enable Emu to edit and generate videos based on text prompts. In a bid to stay competitive in the rapidly evolving generative AI landscape, Meta Platforms is positioning itself against tech giants like Microsoft, Google from Alphabet, and Amazon.

While Sora is a work-in-progress, OpenAI acknowledges that the model may encounter challenges in accurately interpreting spatial details in a prompt and may struggle to follow a specific camera trajectory. The company is actively working on addressing these limitations and is also in the process of developing tools that can identify videos generated by Sora.