Generative AI is coming for videos. A new website, QuickVid, combines several generative AI systems in one tool to automatically create short YouTube, Instagram, TikTok and Snapchat videos. Given as little as a single word, QuickVid chooses a background video from a library, writes a script and keywords, overlays images generated by DALL-E 2, and adds a synthetic voiceover and background music from YouTube’s royalty-free music library.
QuickVid creator Daniel Habib says he’s building the service to help creators meet the “ever-growing” demand from their fans.
“By providing creators with tools to quickly and easily produce quality content, QuickVid helps creators increase their output of content, reducing the risk of burnout,” Habib told australiabusinessblog.com in an email interview. “Our goal is to empower your favorite creator to keep up with the demands of their audience by leveraging advancements in AI.”
But depending on how they are used, tools like QuickVid threaten to flood already overcrowded channels with spammy and duplicate content. They also face potential backlash from creators who choose not to use the tools, either because of the cost ($10 per month) or out of principle, but still have to compete with a slew of new AI-generated videos.
Looking for video
QuickVid, which Habib, a self-taught developer who previously worked at Meta on Facebook Live and video infrastructure, built in a matter of weeks, launched Dec. 27. QuickVid can piece together the components that make up a typical informative YouTube Short or TikTok video, including captions and even avatars.
It’s easy to use. First, a user enters a prompt describing the topic of the video they want to create. QuickVid uses the prompt to generate a script, taking advantage of GPT-3’s generative text power. From keywords either automatically extracted from the script or entered manually, QuickVid selects a background video from the royalty-free stock media library Pexels and generates overlay images using DALL-E 2. It then outputs a voiceover via the text-to-speech API from Google Cloud – Habib says users will soon be able to clone their voice – before combining all these elements into a video.
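The steps described above can be sketched roughly in Python. This is an illustration only, not QuickVid's actual code: every function and class name here is hypothetical, and each stage returns a placeholder string where the real service would call GPT-3, Pexels, DALL-E 2 or Google Cloud's text-to-speech API.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Tiny stopword list, assumed purely for this illustration.
STOPWORDS = {"here", "are", "a", "few", "about", "the", "of", "for"}


@dataclass
class VideoPlan:
    prompt: str
    script: str
    keywords: list[str]
    background_clip: str
    overlay_images: list[str]
    voiceover: str


def generate_script(prompt: str) -> str:
    # Placeholder for the text-generation step (GPT-3 in the article).
    return f"Here are a few facts about {prompt}."


def extract_keywords(script: str, limit: int = 3) -> list[str]:
    # Naive extraction: unique non-stopwords in order of appearance.
    seen: list[str] = []
    for word in re.findall(r"[a-z]+", script.lower()):
        if word not in STOPWORDS and word not in seen:
            seen.append(word)
    return seen[:limit]


def select_background(keywords: list[str], prompt: str) -> str:
    # Placeholder for a stock-video lookup (Pexels in the article).
    return f"stock-video:{keywords[0] if keywords else prompt}"


def generate_overlays(keywords: list[str]) -> list[str]:
    # Placeholder for image generation (DALL-E 2 in the article).
    return [f"generated-image:{keyword}" for keyword in keywords]


def synthesize_voiceover(script: str) -> str:
    # Placeholder for the text-to-speech step.
    return f"voiceover:{len(script)} characters"


def build_video(prompt: str, keywords: Optional[list[str]] = None) -> VideoPlan:
    script = generate_script(prompt)
    # Keywords are either supplied manually or extracted from the script.
    chosen = keywords if keywords is not None else extract_keywords(script)
    return VideoPlan(
        prompt=prompt,
        script=script,
        keywords=chosen,
        background_clip=select_background(chosen, prompt),
        overlay_images=generate_overlays(chosen),
        voiceover=synthesize_voiceover(script),
    )


if __name__ == "__main__":
    print(build_video("cats"))
```

Calling `build_video("cats", keywords=["kittens"])` skips the automatic extraction, mirroring the manual keyword option the article mentions.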
QuickVid certainly doesn’t push the boundaries of what’s possible with generative AI. Both Meta and Google have shown off AI systems that can generate completely original clips from a text prompt. But QuickVid merges existing AI systems to take advantage of the repetitive, templated format of b-roll-heavy short videos, getting around the problem of having to generate the footage itself.
“Successful creators have an extremely high quality bar and aren’t interested in putting out content they don’t think is in their own voice,” said Habib. “This is the use case we’re focusing on.”
That may be the goal, but QuickVid’s videos are generally a mixed bag in terms of quality. The background videos tend to be a bit random or only marginally related to the topic, which isn’t surprising considering QuickVid is currently limited to the Pexels catalog. The images generated by DALL-E 2, meanwhile, exhibit the limitations of current text-to-image technology, such as garbled text and odd proportions.
In response to my feedback, Habib said QuickVid is being “tested and tinkered with daily.”
According to Habib, QuickVid users retain the right to commercially use the content they create and are permitted to monetize it on platforms such as YouTube. But the copyright status around AI-generated content is… vague, at least at the moment. The US Patent and Trademark Office (USPTO) recently moved to withdraw copyright protection for an AI-generated comic, for example, saying that copyrighted works require human authorship.
When asked how the USPTO decision could affect QuickVid, Habib said he believes it only pertains to the “patentability” of AI-generated products and not to creators’ rights to use and monetize their content. Creators, he pointed out, don’t often file patents for videos and usually lean into the creator economy, allowing other creators to reuse their clips to expand their own reach.
“Creators care about having high-quality content in their voices that will help their channel grow,” said Habib.
Another legal challenge on the horizon could affect QuickVid’s DALL-E 2 integration and, by extension, the site’s ability to generate image overlays. Microsoft, GitHub and OpenAI are being sued in a class action lawsuit accusing them of violating copyright law by allowing Copilot, a code-generating system, to regurgitate portions of licensed code without providing credit. (Copilot was co-developed by OpenAI and GitHub, which Microsoft owns.) The case has implications for generative art AI systems like DALL-E 2, which similarly copy and paste from the datasets they’re trained on (i.e. images).
Habib is unconcerned, arguing that the generative AI genie is out of the bottle. “If there’s another lawsuit tomorrow and OpenAI goes away, there are several alternatives that could power QuickVid,” he said, referring to the open source DALL-E 2-like system Stable Diffusion. QuickVid is already testing Stable Diffusion for generating avatar photos.
Moderation and spam
Legal quandaries aside, QuickVid could soon have a moderation problem. While OpenAI has implemented filters and techniques to prevent them, generative AI has known issues with toxicity and factual accuracy. GPT-3 spouts misinformation, especially about recent events, which are beyond the bounds of its knowledge base. And ChatGPT, a refined offspring of GPT-3, has been shown to use sexist and racist language.
That is especially concerning for people who would use QuickVid to create informational videos. In a quick test, I had my partner – who is much more creative than me, especially in this area – enter a few offensive prompts to see what QuickVid would generate. To QuickVid’s credit, obviously problematic prompts like “Jewish New World Order” and “9/11 Conspiracy Theory” didn’t produce toxic scripts. But for “Critical Race Theory That Indoctrinates Students,” QuickVid generated a video suggesting that Critical Race Theory could be used to brainwash schoolchildren.
Habib says he relies on OpenAI’s filters to do most of the moderation work, claiming that it’s up to users to manually review each video created by QuickVid to make sure “everything is within the bounds of the law.”
“As a general rule, I think people should be able to express themselves and create whatever content they want,” Habib said.
That apparently includes spammy content. Habib argues that the video platforms’ algorithms, not QuickVid, are best placed to determine the quality of a video, and that people who produce low-quality content are “only harming their own reputation.” The reputational damage will naturally discourage people from creating massive spam campaigns with QuickVid, he says.
“If people don’t want to watch your video, you don’t get distribution on platforms like YouTube,” he added. “Producing low-quality content also causes people to view your channel in a negative light.”
But it’s instructive to look at ad agencies like Fractl, which in 2019 used an AI system called Grover to generate an entire site of marketing materials, reputation be damned. In an interview with The Verge, Fractl partner Kristin Tynski said she envisioned generative AI that would enable “a massive tsunami of machine-generated content in every niche imaginable.”
In any case, video-sharing platforms like TikTok and YouTube have not had to moderate AI-generated content on a large scale. Deepfakes, synthetic videos that replace an existing person with someone else’s likeness, began populating platforms like YouTube a few years ago, powered by tools that made creating deepfake footage easier. But unlike even the most convincing deepfakes today, the types of videos QuickVid creates are not obviously AI-generated in any way.
Google Search’s policy on AI-generated text could be a preview of what’s to come in the video domain. Google treats synthetic text no differently than human-written text when it comes to search rankings, but takes action against content that is “intended to manipulate search rankings and not to help users.” That includes content stitched together or combined from different web pages that “[doesn’t] add enough value,” as well as content generated by purely automated processes, both of which may apply to QuickVid.
In other words, AI-generated videos may not be completely banned from platforms if they take off, but simply become the cost of doing business. That probably doesn’t allay the fears of pundits who believe platforms like TikTok are becoming a new home for misleading videos, but – as Habib said during the interview – “the generative AI revolution is unstoppable.”