AI in radio — looking beyond the hype

Juha Lahti is an executive producer at Yle

HELSINKI - In December 2023, the Finnish Broadcasting Company (Yle) aired a 20-hour broadcast featuring two high-profile talents, Viki and Köpi. The dialog, however, was written by artificial intelligence, and AI also created the duo’s voices. The only thing humans did was manage the broadcast equipment. The main reason for using the voices of two celebrities was to show our audience how far AI technology has already come. The guys have a dedicated fan base, so we reached a large audience without any additional marketing.

A very simple idea made it possible. Text produced by ChatGPT was fed to ElevenLabs to convert it to audio. ChatGPT is a tool that can generate human-like text, and ElevenLabs can convert any text into lifelike speech. Because the broadcast was so long, we collaborated with a software development company to automate the process between the two tools. The project was done in a couple of weeks, and the team consisted of just three people. The talents only lent their voices and, most importantly, their brand.
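The automation between the two tools can be pictured roughly as follows. This is a minimal sketch, not the code used in the Yle project: the functions `write_line` and `synthesize` are hypothetical stand-ins for a text generator and a text-to-speech service, and no real ChatGPT or ElevenLabs API call is shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Segment:
    """One piece of the broadcast: who speaks, what is said, and the audio."""
    speaker: str
    text: str
    audio: bytes


def build_segments(
    topics: Sequence[str],
    speakers: Sequence[str],
    write_line: Callable[[str, str], str],      # hypothetical text generator
    synthesize: Callable[[str, str], bytes],    # hypothetical text-to-speech step
) -> List[Segment]:
    """Alternate between speakers, generating text and then audio per topic."""
    segments = []
    for i, topic in enumerate(topics):
        speaker = speakers[i % len(speakers)]
        text = write_line(speaker, topic)
        audio = synthesize(speaker, text)
        segments.append(Segment(speaker, text, audio))
    return segments


# Stand-in implementations so the sketch runs without any external service.
def fake_write_line(speaker: str, topic: str) -> str:
    return f"{speaker} talks about {topic}."


def fake_synthesize(speaker: str, text: str) -> bytes:
    return f"[{speaker}] {text}".encode("utf-8")
```

In a real setup, the two stand-in functions would be replaced by calls to whichever text and speech services are in use, and the resulting audio would be queued for playout.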

AI and the news

What did the audience say? In a way, it does not matter, because AI is so hyped at the moment that companies want to be among the first innovators, and in some cases, using AI has become more important than the quality of the content. This project was basically a joke that lasted 20 hours. The audience thought it was funny, and our show built a reputation for using cutting-edge technology. Our show has over 100 episodes yearly, so this was just one of its hopefully funny stunts. As an industry, we are just starting the journey with AI, so we must run tests and make mistakes. Building on top of previous ideas and concepts will get us far.

Many radio stations worldwide are experimenting, for example, with AI reading the news and weather. News programs don’t need emotional depth, so an anonymous AI-generated voice can be a good solution. With entertainment, audiences still prefer to hear their favorite talents and familiar people. Using AI to generate the voice of a major talent and make short, simple radio presentations is very easy. There are already solutions that use the voice of a major celebrity across multiple local radio stations. But it is more challenging if we want many AI-generated talents to converse. The human brain is built to detect emotions and messages beyond words. AI cannot generate those yet, but OpenAI has said it is working on a new AI model that can mimic a real person in a very realistic way.

Using AI voice clones, we can soon generate content that was previously impossible. We can use AI to gather big stars in the same podcast and add a unique twist by cloning younger versions of their voices. Even language is no longer a barrier. We can use AI to translate the dialog in their own voices and make it sound very realistic. With this idea, it is possible to bring together celebrities from different countries and make them speak, for example, Spanish or Japanese at the push of a button.

Technically, this is easy to model, but the challenge is crafting a suitable agreement and getting permission to use their voices. We can get permission to use a voice for specific audio content, but because we are using third-party software, we can’t be sure if their voices will be used to train the language model. We also can’t be responsible for those platforms’ security. We need to know what the voice data is used for.

The ethical question

With AI tools, we can do big things with small teams. A voice can be cloned using 15 seconds of audio. Using AI to write the dialog is also very fast and easy. Some companies sell production-ready solutions, but achieving the same results in a more cost-effective way is possible. In Finland, Bauer Media has created an AI radio station engineered by one person. Of course, it takes time and dedication, but we can build and develop broadcast-ready solutions in-house with these new tools. Bauer has also explored ideas for using AI in its online radio web players. Imagine an analyst tracking the habits of your listeners 24/7, with the data used in millions of ways. The possibilities are limitless.

Eventually, we face the ethical side. We have a lot of questions to answer. How far can we go with synthetic human voices? How do we tell the audiences that we are not using real people? Is it more acceptable to use AI in entertainment productions than in news? What are we giving away if we use more and more synthetic audio? Is the end goal just to be more cost-effective, or is this a way to elevate the quality of the content?

Another concern is how easy the new AI tools are to use. Anyone online can take an audio clip and make a synthetic version of whoever they want. The internet is full of clips of AI versions of presidents and other celebrities making funny statements. The internet has no rules, but our industry must regulate itself. We need to make ethical agreements and rules. That does mean we will always be a couple of steps behind people who can use whatever tools they want.

Hopefully, we will soon have more answers and even more impressive tools. In the meantime, we can try to have fun and develop amazing new ideas and concepts.

The author is an executive producer at Yle in Finland.
