
Google's New AI: The Anything-to-Anything Game Changer

Originally reported by The Verge

My recent engagement with Google's Omni AI model involved sending my child's stuffed animal on a virtual rafting trip and placing a deepfaked version of myself in front of the Eiffel Tower. While impressive, these capabilities do not yet signify the dawn of a technological singularity.

Last year, I experimented with generative AI by crafting a "vacation" for my child's plush deer, Buddy, to replicate scenes from a Google Gemini advertisement. Though I never shared these fabricated adventures with my four-year-old, the exercise proved insightful, prompting me to reflect on the fine line between harmless AI-driven amusement and outright fabrication. Whether that distinction is clear or blurred, one undeniable truth emerged: the tools for generating realistic videos are remarkably effective, demanding surprisingly little effort or specialized knowledge. This trend continues robustly into the Gemini Omni era.

Omni represents a new suite of generative models, envisioned to eventually transform any input—be it a photo, video, or text—into any other desired output. Initially, however, its focus is solely on video creation. Omni Flash is the inaugural model released by Google in this family, now integrated into the company’s AI video generation and editing platform, Flow. While the preceding model, Veo, remains an option, Omni introduces several key enhancements.

A notable feature of Omni is its ability to accept both an uploaded video and a text prompt as the foundation for an AI-generated creation. Google asserts that Omni also incorporates a broader base of real-world knowledge when producing videos, leading to improved character consistency throughout the generated content. To verify these claims, I once again enlisted AI Buddy, preparing him for another digitally rendered adventure.

The results were remarkably inconsistent, even baffling. Some clips showed significant improvement, with greater consistency and closer adherence to my prompts than in my tests with Veo five months prior. Yet even Omni's most polished creations still contained unsettling "AI jump scares," such as Buddy abruptly changing orientation mid-skydive.

For another video, I granted Omni creative license with the prompt: "Create a montage of Buddy packing for a vacation and embarking on a cruise ship for a tropical vacation. The mood is cute and playful. Buddy packs something funny in his suitcase that comes into play later in the clip." The AI depicted Buddy packing a jar of honey, which he later mistook for sunscreen, humorously squirting it onto his hoof with an "Uh oh."

While the comedic premise was decent, the execution suffered from glaring inconsistencies. The honey container repeatedly transformed throughout the video, shifting from a jar to a clear squirt bottle filled with water, then back to a squeeze bottle with honey. Furthermore, the final frame of the video appeared as a chaotic amalgamation of elements from the preceding sequence, almost as if the model had simply discharged its visual data.

Omni allows for text-based prompts to suggest video edits, and to Google's credit, this functionality is superior to my experience with Veo 3. With Veo, editing was so problematic that generating an entirely new video from scratch was often more efficient. Omni genuinely attempts to incorporate edits, but the outcomes are not always successful.

For instance, when I requested an emphasis on Buddy’s facial reactions in his vacation clips, the results looked peculiar. The model also sporadically added antlers to Buddy, a feature he does not possess (Buddy is a baby, thank you very much). When I prompted it to remove the antlers from one scene, it complied, only to then introduce them into all the other scenes.

It is important to note that these generative processes are not free. Generating a video costs between 15 and 40 credits, depending on scene length and initial inputs, and a single round of edits costs 40 credits. My $20-per-month AI Pro plan includes 1,000 credits monthly, and my balance had dropped to 145 credits after I generated approximately 20 clips and performed a few edits. Achieving a precise creative vision can thus entail a costly and iterative exchange with the model.
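To make the credit math concrete, here is a rough sketch using only the figures quoted above (1,000 monthly credits, 15 to 40 credits per clip, 40 credits per round of edits, 145 credits remaining). The function names are hypothetical and have nothing to do with Flow itself, and the clip count is treated as exactly 20 even though the real number was approximate.

```python
# Back-of-the-envelope credit budget, using the figures quoted in the article.
# All names here are illustrative; this does not call any Google API.

PLAN_CREDITS = 1000                       # monthly allowance on the AI Pro plan
CLIP_COST_MIN, CLIP_COST_MAX = 15, 40     # per-clip cost range
EDIT_COST = 40                            # cost of one round of edits


def spend_range(clips, edits):
    """Return the (lowest, highest) possible spend for a session."""
    low = clips * CLIP_COST_MIN + edits * EDIT_COST
    high = clips * CLIP_COST_MAX + edits * EDIT_COST
    return low, high


def edits_consistent_with(spent, clips):
    """Return the edit counts for which `spent` is a possible total."""
    counts = []
    for edits in range(spent // EDIT_COST + 1):
        low, high = spend_range(clips, edits)
        if low <= spent <= high:
            counts.append(edits)
    return counts


spent = PLAN_CREDITS - 145                # credits used so far
print("Credits spent:", spent)
print("Edit rounds that fit 20 clips:", edits_consistent_with(spent, 20))
```

Under these assumptions, the spend is consistent with anywhere from 2 to 13 rounds of edits, which shows how quickly a handful of revisions can eat into a monthly allowance.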

One of Omni’s advertised strengths lies in its ability to integrate AI-generated elements into real videos. Shifting focus from Buddy, I ventured into deepfaking myself. Beginning with a neutral selfie video, I prompted Omni to create clips of me eating spaghetti, sitting in an airplane, and standing before the Eiffel Tower while biting a baguette. I can genuinely say I wasn’t prepared for what I saw.

My deepfake videos contained subtle AI tells: the clink of the fork against the pasta bowl sounded somewhat artificial, and a woman in the background of the airplane video appeared twice. However, despite these minor glitches and a vague sense of the uncanny, the videos were remarkably convincing.

I presented the pasta clip to my husband, who was aware I was testing an AI video tool but uninformed about which elements were AI-generated. He genuinely believed I was sitting in front of a camera eating pasta, noting only that the bowl looked unfamiliar. The act of eating pasta itself was realistic enough to convince my husband—a man who has observed me in real life virtually every day for the past decade.

My other deepfakes varied in quality, with the best of them "good enough to fool people on social media." While a couple of the Eiffel Tower clips appeared slightly cartoonish, one was sufficiently convincing that it might require multiple viewings to discern its AI origin. I personally recognized it wasn't me when the AI version turned its head to reveal hair pulled back in a ponytail, a detail I knew to be incorrect. However, I doubt others would notice the discrepancy, which leaves me with a peculiar feeling.

To be candid, I find myself somewhat exhausted by this progression. I was genuinely shocked by the realism Veo 3 could produce, and consistently surprised by the ease of generating fake individuals in fake photos over recent years. While Omni’s capabilities are still impressive, the initial sense of shock has diminished. It is not yet effortless to produce an AI-generated cinematic masterpiece, contrary to what Google might suggest. However, Omni undeniably improves upon Veo in discernible ways. With a Google account and a credit card, one can effortlessly transform a video of themselves sitting at home into a scene of them flying to Maui. While not quite the "foothills of the singularity," we are certainly deep in the uncanny valley.

All images and videos featured in this story were generated using Google Gemini.

Editorial Staff, Editor

The Editorial Staff at AIChief is a team of professional content writers with extensive experience in AI and marketing. Founded in 2025, AIChief has quickly grown into the largest free AI resource hub in the industry.