AI: Making TV Great Again?

BY Yako Molhov

As the hype around AI (artificial intelligence) has accelerated, companies from all industries, including content production, have been scrambling to promote how their products and services use AI. Often what they refer to as AI is simply one component of AI, such as machine learning. AI requires a foundation of specialized hardware and software for writing and training machine learning algorithms.

What, in fact, is AI? Britannica defines it as “the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings. The term is frequently applied to the project of developing systems endowed with the intellectual processes characteristic of humans, such as the ability to reason, discover meaning, generalize, or learn from past experience.”

Chinese broadcasters have been among the first to use AI in TV shows. The Lunar New Year is undoubtedly the most important festival for China’s 1.4 billion people. The Spring Festival Gala produced by China Central Television (CCTV, CMG) has tried to re-establish itself with some fresh elements, such as incorporating new visualization technology and creating interactive experiences for the audience. AI+VR naked-eye 3D technology has started to appear on the stage of the Spring Festival Gala and, through special shooting methods and production techniques, it offers a breakthrough experience for those watching on TV or on their mobile devices. In 2019, the four well-known human hosts – Beining Sa, Xun Zhu, Bo Gao, and Yang Long – were each joined by an “AI copy” of themselves: in effect, their very own digital twin. The “personal artificial intelligences”, created by ObEN Inc, were touted as the world’s first AI hosts. Rather than simply being computer-generated avatars, the virtual copies of the celebrities were “rebuilt” from the ground up using AI technologies including machine learning, computer vision, natural language processing, and speech synthesis.

ObEN CEO Nikhil Jain commented that personal AIs – PAIs – could revolutionize many areas of society. In fact, the company was already working to create AI-powered doctors, nurses, and teachers, as well as its highly publicized virtual celebrities.

Back in August 2017, AI VS Human (Ji Zhi Guo Ren), a science challenge program that pits top human talents against AI or robots, aired throughout China. Also that year, a robot “invaded” Super Brain, the Chinese version of The Brain franchise, which dedicated the whole of season four to the theme of supercomputers versus humans. Xiaodu, the smart, AI-powered bot built by search engine giant Baidu, took on human competitors in complex trials involving face and voice recognition, facing off against four people and other clever computer programs.

CCTV introduced a new AI virtual anchor for one of its news television shows in 2022. Every Monday, the anchor — known as Xiao C — presents sporting events such as football, basketball, volleyball, and swimming on the network. Dressed in a pink t-shirt with her hair tied in buns, Xiao C interacts with human sports commentators, talks about game tactics, and poses questions to the audience. Baidu developed her as an early example of a virtual human market that has been forecast to be worth USD38.5 billion by 2030, according to industry services platform QbitAI. That revenue will come, says QbitAI, from virtual celebrities and service-oriented virtual humans.

“With breakthroughs made in artificial intelligence-powered algorithms, the production cost of digital humans will be reduced by 10 to 100 times, and the production period will be shortened from several months to a few hours,” a Baidu spokesperson told the China Daily in September 2022. China’s government launched an action plan in August 2022 to drive the digital human sector and to develop one or two leading virtual human companies by 2025.

Chinese-owned streamer iQiyi has also been ramping up its use of AI across productions. Kelvin Yau, head of southeast Asia for iQiyi International, told delegates at Singapore’s Asia TV Forum (ATF) in December last year that the Baidu-owned streaming service had explored “new opportunities” in AI because of the COVID crisis, which has had a crippling effect across numerous industries in China as a result of strict, enduring government restrictions.

AI is also being used in voice conversion technology for dubbing. Yau explained that two voice actors can be used to create 10 voices: “Most voice actors don’t want to be packed in recording rooms, so we invented this tech so we can use a limited number of voice actors but adapt [their voices] into different tones,” said Yau. “That’s important for us to expand into southeast Asia.”

During ATF, Singapore’s Infocomm Media Development Authority (IMDA) revealed that it is launching an SGD5 million ($3.6 million) Virtual Production Innovation Fund, designed to help the local media industry develop the capabilities needed to harness virtual production technology. The technology uses LED screens, powered by a video game engine, to display realistic background environments for TV or film scenes, so that the camera is able to capture actors and visual effects in real time.

One of the most popular “names” now connected to AI is ChatGPT. The NYT reported in March this year that, for the first time in more than 40 years, Alan Alda and Mike Farrell sat down for a table read of a new scene of M*A*S*H. But the script wasn’t by Larry Gelbart or any of the other writers who shaped the television show over more than a decade; it was the work of ChatGPT.

Alda, who hosts a podcast called Clear+Vivid, had decided to ask the tool to write a scene for M*A*S*H in which Hawkeye accuses B.J., his right hand man and fellow prankster, of stealing his boxer shorts. The result, after plenty of behind-the-keyboard prompting from Alda, was a brief, slightly stilted scene between the two men, recorded for the podcast while the actors were on opposite coasts. Did it work? Not quite, Alda acknowledged. While M*A*S*H was known for its snappy humor and lively dialogue, ChatGPT’s effort was hollow and its jokes leaden at best.

One of the most widespread uses of AI is determining the optimum video quality for each user depending on network speed, as in the case of Netflix’s smart video compression technology. Another important contribution of AI to the world of streaming is the help it provides in quality assurance and control. This ranges from simple checks of whether media content meets technical parameters to more thorough moderation for compliance with local age restrictions, privacy legislation, and the like.
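The idea of matching video quality to network speed can be illustrated with a minimal sketch. It is not Netflix’s method, which the article does not describe; the bitrate ladder, the `pick_rendition` function and the 0.8 safety factor are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    label: str
    bitrate_kbps: int


# Hypothetical bitrate ladder; real services tune these values per title.
LADDER = [
    Rendition("240p", 400),
    Rendition("480p", 1200),
    Rendition("720p", 3000),
    Rendition("1080p", 6000),
    Rendition("2160p", 16000),
]


def pick_rendition(measured_kbps, ladder=LADDER, safety=0.8):
    """Return the highest-bitrate rendition that fits the measured throughput.

    Only a fraction of the measured speed (the assumed safety factor) is
    treated as usable, to leave headroom for fluctuations.
    """
    budget = measured_kbps * safety
    affordable = [r for r in ladder if r.bitrate_kbps <= budget]
    if not affordable:
        # Nothing fits: fall back to the lowest quality instead of stalling.
        return min(ladder, key=lambda r: r.bitrate_kbps)
    return max(affordable, key=lambda r: r.bitrate_kbps)


if __name__ == "__main__":
    for speed in (100, 5000, 25000):
        print(speed, "kbps ->", pick_rendition(speed).label)
```

A production player would re-measure throughput continuously and switch renditions between video segments; this sketch shows only the single selection step.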

As part of its AI Production project, the BBC has set up a special website where it publishes information on machine learning in production. The work applies equally to the audio and video that the BBC records, edits and broadcasts and to the “metadata” that describes this media and makes it possible to find, search and re-use it. Many production tasks are repetitive, or even formulaic. These tasks could instead be performed by machines, freeing up creative people to spend more of their time being creative.

For instance, editing programs is a deeply creative role, but an editor’s first task when putting a show together involves finding good shots among a huge number of video assets. An hour-long program is usually edited down from many hours of “rushes”. Sorting through those assets to find good shots isn’t the best use of the editor’s time, nor is it the fun, creative part of the job. The BBC thinks that AI could help to automate this for them.

In fact, at the 2018 Edinburgh Television Festival, Microsoft’s Tony Emerson, who was the company’s long-term Head of Media and Entertainment, revealed how Big Brother used tools provided by the company: the producers had all of the output from the previous day by the next morning, including all transcripts, emotions and face recognition data. Producers could take this content and distribute it over the cloud to editors, who no longer had to be on site, so that things could be put together more quickly. AI tools can also help companies with huge catalogs to go back and “mine” them, Emerson noted.

Another key aspect of AI deployment in CTV (connected TV) is programmatic ad buying and selling. Global programmatic display ad spend is expected to reach $558 billion in 2023 – a 13.1% increase over 2022 and almost double what it was in 2019. Currently, marketers are putting more than 50% of their media budget into programmatic advertising. In the US alone, the size of the market in 2023 is expected to reach $148.83 billion, up 16.9% year on year. Programmatic ad buying benefits marketers by allowing them to break free from gross rating points (GRP). Moreover, it offers a more intelligent and precise way of placing ads in front of the right viewers at the right time. As for publishers, who are typically wary of impression scarcity, programmatic transactions are more cost-effective. This kind of ad buying is designed to sell all available ad spots to the most suitable buyers and to minimize waste.

AI’s key contribution to CTV’s development lies in moving services away from default, one-size-fits-all content libraries. Thanks to data-driven analysis, OTT services are capable of delivering personalized recommendations to their audiences.

Netflix, Hulu and Amazon Prime monitor their customers’ journeys down to the smallest details to find new ways to fine-tune content. For instance, they offer tailored trailers based on interest in certain actors, genres, reviews, and countries of origin. So, if a user has recently finished binge-watching The Queen’s Gambit with Anya Taylor-Joy on Netflix, that user is likely to be offered Peaky Blinders, with the same actress on the show’s cover art.
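A toy version of this kind of recommendation can be sketched as follows. It is an illustrative assumption, not how any of these services actually works: the `CATALOG` entries, the genre tags and the `recommend` function are hypothetical, and titles are ranked simply by how many tags (actors, genres) they share with what the viewer has already watched.

```python
# Hypothetical catalog: each title maps to a set of tags (actors and genres).
CATALOG = {
    "The Queen's Gambit": {"Anya Taylor-Joy", "drama", "period"},
    "Peaky Blinders": {"Anya Taylor-Joy", "Cillian Murphy", "drama", "crime", "period"},
    "Example Cooking Show": {"reality", "food"},
    "Example Space Documentary": {"documentary", "science"},
}


def recommend(watched, catalog=CATALOG, top_n=3):
    """Rank unwatched titles by the number of tags shared with watched ones."""
    # Build the viewer's profile from the tags of everything already watched.
    profile = set()
    for title in watched:
        profile |= catalog[title]

    scored = []
    for title, tags in catalog.items():
        if title in watched:
            continue
        overlap = len(profile & tags)
        if overlap:  # skip titles with nothing in common
            scored.append((overlap, title))

    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_n]]


if __name__ == "__main__":
    print(recommend(["The Queen's Gambit"]))
```

Real services combine many more signals, such as viewing time, ratings and the behavior of similar users; tag overlap is only the simplest starting point.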

AI is also changing TV by giving broadcasters ways to offer greater coverage of live events. The technology is learning to create a series of shots for live recordings that appear natural to the viewer. AI is also proving useful for scouring large amounts of data for news stories. It can also be used to enhance the experience for visually impaired and hearing-impaired viewers.

AI is also used in news gathering. A smart production system developed by the Japan Broadcasting Corporation (NHK) is designed to comb social media and environmental monitoring systems to report on newsworthy situations. The system is taught to look for certain words and phrases divided into a series of different news categories. These are then grouped to allow the production team to view high-frequency reports.
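The keyword-and-category approach described above can be sketched in a few lines. This is not NHK’s system; the categories, keywords, `categorize` and `high_frequency` names and the reporting threshold are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical news categories and the words that signal each of them.
CATEGORY_KEYWORDS = {
    "fire": {"fire", "smoke", "burning"},
    "flood": {"flood", "flooding", "overflow"},
    "earthquake": {"earthquake", "tremor", "shaking"},
}


def categorize(post):
    """Return the set of categories whose keywords appear in the post."""
    words = set(re.findall(r"[a-z]+", post.lower()))
    return {cat for cat, keywords in CATEGORY_KEYWORDS.items() if words & keywords}


def high_frequency(posts, min_reports=2):
    """Group posts by category and keep categories with enough reports."""
    counts = Counter(cat for post in posts for cat in categorize(post))
    return [(cat, n) for cat, n in counts.most_common() if n >= min_reports]


if __name__ == "__main__":
    sample = [
        "Smoke rising near the station",
        "Big fire downtown!",
        "River flooding on main street",
        "Lovely weather today",
    ]
    print(high_frequency(sample))
```

Here two posts fall into the hypothetical "fire" category, so it is the only one surfaced to the production team; the single flooding report stays below the threshold.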

Face detection systems are being used to help with the cataloging of actors in various TV shows. Advances in AI allow for better recognition, even in situations where lower lighting or obscure angles would otherwise make it difficult.

Speech recognition is also useful for cataloging news items. The software is currently able to transcribe regular broadcast speech automatically. However, it still struggles with some interviews where the subject may be talking faster or less clearly.

Automated audio descriptions for the visually impaired can also be created using AI for live programs such as sports broadcasts. The system generates a script of the game’s progress that accompanies the audio broadcast.

The technology can also be used to create audio commentary for sports for radio or television using a speech synthesizer. This has uses for both sports and news reports where facts and statistics can easily be generated. Speech synthesis technology is rapidly improving, and a natural voice tone is now possible.

Computer-generated sign language is also possible as a way of displaying broadcast information. Rather than a live signer, a CG animated character can now render the content in sign language.

The uses for AI and machine learning in broadcast are constantly expanding to enhance the user experience. AI technology is even being used to speed up the process of recoloring black and white footage. This makes it possible to reduce the time to colorize five seconds of film from 30 minutes down to 30 seconds.

However, not everyone is happy with the advances of AI in TV and cinema. Last month, the Screen Actors Guild - American Federation of Television and Radio Artists (SAG-AFTRA) said that if producers use artificial intelligence to simulate an actor’s performance, they will have to bargain for it. “The terms and conditions involving rights to digitally simulate a performer to create new performances must be bargained with the union,” the guild said in a statement.

“These rights are mandatory subjects of bargaining under the National Labor Relations Act,” the guild stated. “Companies are required to bargain with SAG-AFTRA before attempting to acquire these rights in individual performers’ contracts. To attempt to circumvent SAG-AFTRA and deal directly with the performers on these issues is a clear violation of the NLRA.”

The WGA announced a similar position, and will seek to “regulate use of material produced using artificial intelligence or similar technologies”: “Governments should not create new copyright or other intellectual property exemptions that allow artificial intelligence developers to exploit creative works, or professional voices and likenesses, without permission or compensation. Trustworthiness and transparency are essential to the success of AI.”

SAG-AFTRA also noted that its Global Rule One, which requires members to work under its contract on projects shot anywhere in the world, “covers entering into any agreement with an employer to digitally simulate a member’s voice or likeness to create a new performance. As such, members should not assign these rights to any employer who has not executed a basic minimum agreement with the union.”

At the same time, the WGA is ready to allow the use of ChatGPT as long as it does not affect writers’ credits or residuals. It proposed to the AMPTP, which represents the studios, that AI-generated material not be considered “literary material” or “source material.”

“The WGA’s proposal to regulate use of material produced using artificial intelligence or similar technologies ensures the Companies can’t use AI to undermine writers’ working standards including compensation, residuals, separated rights and credits. AI can’t be used as source material, to create MBA-covered writing or rewrite MBA-covered work, and AI-generated text cannot be considered in determining writing credits. Our proposal is that writers may not be assigned AI-generated material to adapt, nor may AI software generate covered literary material. In the same way that a studio may point to a Wikipedia article, or other research material, and ask the writer to refer to it, they can make the writer aware of AI-generated content. But, like all research material, it has no role in guild-covered work, nor in the chain of title in the intellectual property. It is important to note that AI software does not create anything. It generates a regurgitation of what it’s fed. If it’s been fed both copyright-protected and public domain content, it cannot distinguish between the two. Its output is not eligible for copyright protection, nor can an AI software program sign a certificate of authorship. To the contrary, plagiarism is a feature of the AI process,” WGA said in a statement on Twitter.