Beyond Audio: The Convergence of Audio and Video in Podcasting

  • Dec 11, 2023
Nathan Gwilliam

Nathan Gwilliam: Welcome to the Podcasting Secrets show where successful creators share their best stories, secrets, and strategies. I'm your host Nathan Gwilliam.

Hello, incurable creators. In this episode, I am joined by Rob Greenlee. Rob is the founder of Spoken Life Media, and he has had a prolific career working for many leading podcasting-related companies.

For example, he served as the Vice President of Content and Partnerships at Libsyn. He was the Head of Partnerships at Spreaker. He's the co-host of the New Media Show podcast, where he discusses trends and news from the podcasting world. He recently launched Podcast Tips with Rob Greenlee on StreamYard, and he's even the former chairperson of the Podcast Academy, which presents the Ambies awards.

Thank you so much for joining us on this episode, Rob.

Rob Greenlee: Thank you for having me. I appreciate you inviting me to join you, Nathan. It's great to be on your show.

Nathan Gwilliam: Absolutely. And in this episode, we're going to have a discussion about the convergence of audio and video strategy.

And it's really interesting. I talk about podcasting with people, and podcasting, if you look up the definition, technically means recording and publishing an audio file. But it feels like, more and more, that term is becoming synonymous with creating content and publishing it in numerous different ways, including audio, but also video and social and newsletters and blogs and a lot of different ways.

Tell me about how you've seen that kind of transformation of the term podcasting and all that it encompasses.

Rob Greenlee: Yeah, let me take you back a little bit further. If you go back to the earlier days of the podcasting medium itself, it was an audio and video medium. So that definition that you're quoting there is a fairly recent perception of podcasting. Back in, let's say, the 2007, 8, 9, 10, 11, 12 timeframe, podcasting was probably 30 percent video podcasting, which was the same thing as audio. So you could put an audio file in an RSS feed or you could put a video file in an RSS feed. The standard is open to even PDF files.

In the past, it was more common than it is now for a person to publish a video file into one RSS feed, separate from an audio file in another RSS feed. So you could have the same program available in Apple Podcasts, which you can do even today. There would be an audio version and a video version, and you could pick which medium you wanted to consume it in.

So when YouTube launched in approximately 2007 or so and started to grow, a lot of those video podcasters started to publish their episodes over to YouTube. And so we saw a decline in the number of video podcasters because of YouTube. And so, to come full circle with this, we went through an era, like the last 10 years or so, where the medium has been perceived to be primarily an audio medium, but it really hasn't always been that way.

So in some ways this convergence conversation that we're having today is almost like going back to the origins of this medium, to look at podcasting as a more complete open syndication distribution technology or platform or whatever you want to call it. If you look at some of the early podcast hosting platforms, like Libsyn, Podbean, Blubrry, they still support video podcasting today.

So my New Media Show that you mentioned has been a video and audio podcast since about 2013 or something like that. So it's been active in that area, and there were whole media companies that were created in the early days of podcasting that were just doing video podcasts. And some of those companies sold to big networks, like the Discovery Channel.

So there's a whole other era of this podcast medium, because it's 20 years old, right? It's gone through a lot of evolutions, and if you weren't around back then paying attention to the medium, you probably wouldn't know any of this. So that's why we're seeing what's happened today. And why I'm talking about this convergence is that we're coming back to the origins to some degree.

But it is, to many people, a significant development because, like what you just said, podcasting is being seen by audiences as an online content distribution strategy, right?

Not so much entirely linked to RSS, which historically has been the case. But now we're starting to see people perceive podcasting on platforms like YouTube or whatever as being podcasts. And guess what YouTube has done is they've named some of that content to be podcasts on their platform, which is only propelling that perception.

Nathan Gwilliam: Okay, so let's talk about this convergence. Maybe you could start off by just giving a high-level overview. What is the convergence of audio and video strategy in podcasting, and what's driving it?

Rob Greenlee: Yeah, I think that the big thing is just looking at it from a more complete perspective, right? Some people like to consume audio, some people like to consume video. And if you're creating a program or a show, it's an optional strategy to look at that and say, can I make this video show good for video consumption, and can I also take that same show and put it out as an audio version as well?

And how do I do that? What's that approach that I need to take? And what do I need to think about in being able to produce a piece of content that is useful in an audio and video consumption experience? And then also, how can I take that content that I'm making and make it available in other tools and other platforms, like a TikTok or Reels or in Spotify or whatever?

And that's where the audience is perceiving. This is really what we're talking about here: it's more of what the audience perceives this content to be, if you're doing a program like what we're doing here that doesn't have an RSS feed distribution strategy behind it, but it's only available in, like, Spotify. A great example of this is Joe Rogan. Joe Rogan started out as an audio-only podcast, then started getting into video. And then he did a licensing deal with Spotify to put his program exclusively on Spotify. And guess what he's done inside of Spotify? He's created a convergence strategy.

He's kept that going, actually, is what he's actually done. But he still has his RSS feed account with Libsyn, but he's not using it. He doesn't need to use it because he's on an exclusive distribution deal with Spotify. So many in the podcasting space look at Joe Rogan and say he's actually technically not a podcast, right? Because he doesn't have an RSS feed. He has no need for an RSS feed. But people still consider him to be the biggest podcaster in the world.

So that's an example of what I'm talking about with convergence, right? People are perceiving things as a type of content. And I've been arguing this point for many years: progressively, we have been seeing this development that the audiences are driving the perception of the medium, not the content creators. The content creators get caught up in this. It's "I don't need to know about RSS" or "I need to know about RSS." It goes both ways. My contention now is that it's all content.

It's what the audience perceives is what's most important.

Nathan Gwilliam: Yeah, definitely. That perception is driving it.

Let's talk about the whys for a second. Why are some people not adopting this convergence with video and audio and why should we? What are the biggest benefits that should be driving us in this direction?

Rob Greenlee: Yeah, I think it's a very good question actually, because I think as a content creator and as a person that's thinking about getting involved in creating online content, I think you do need to think about what your passion is and what you want to do is a big part of this, right?

Especially if you're just getting started, video can be a challenging medium, right? Many people feel like they have to build up to that. And then there are other people that are very talented on video that can just jump right in.

And so we've seen this approach, and you can clearly see it on YouTube, where there are very talented people that are just so good on camera, and they just have natural talent in that way, that they tend to just gravitate towards that and do that only. And oftentimes they don't think about just putting out an audio version of it because they've built such a following on YouTube and that's working for them.

But then there are other people that feel shy or don't want to put their face out on video and don't feel confident in that, so maybe they feel more confident just doing audio to get started, right? Or they do audio because they feel like they can create the best kind of content, because they can do more post-production editing and things like that.

Cause editing video can be a little more challenging and it can also be punishing in the consumption experience if you do too much editing, right? But in audio, you can do as much editing as you want, right?

So it's all about how you want to create content, what you feel comfortable doing, what your topic is, and what the demographic breakdown for that content is. YouTube consumption tends to be younger, though I think it's moving a little older. And in podcasting, the fastest growing age demographic is young people; 16 to 24 is the fastest growing consumption demographic for podcasts. And radio obviously tends to be a lot older. So you have that demographic piece of it too. We're not really talking about radio, but radio is an interesting contrast to what's happening in podcasting right now and why we're seeing a lot of people moving away from radio and going towards more on-demand and podcasting experiences. So it really gets back to what you want to do.

What are your passions? What are your goals? What do you feel comfortable with? What do you have knowledge of already and then build up to these different strategies, right?

So this convergence strategy is probably the best way to go if you know how to do it and you feel confident and have the time to do it, because it is a time commitment to do both of these things and to do them.

But it can be very effective and some of the very largest content creators and some of the most successful ones do a convergence strategy and have been doing it for many years.

Nathan Gwilliam: Let me do an aside really quick and just comment on something you just said. You were talking about radio.

And we're not talking about radio here, but I'm seeing a lot of convergence of radio into this audio and video. I've personally worked with multiple very popular radio hosts who are seeing that they're having to go to podcasts, to audio and to video, and they are going to need to transition.

Yeah, radio is converging with this audio video convergence as well.

Rob Greenlee: Because they want to stay relevant. As the demographics of media change, they need to be following that trend, and that's what's happening with radio right now. It's not like they're falling off a cliff right now, but it's gradually changing as the demographic consumption of media changes.

And as more older people stop being consumers of radio, the younger people are coming through the pipeline and they're adopting the new on demand or live streaming or streaming technology platforms increasingly, which is eroding their market share.

Nathan Gwilliam: Yeah, definitely.

Okay. You have also talked about some of the challenges that hosts and creators are facing as they try to implement this convergence of audio and video together into their platform. Maybe you can talk a little bit more about some of those challenges. What are some of the most common challenges and what are some of the best ways to overcome those?

Rob Greenlee: I think one of the biggest challenges is just figuring out what the best way is to get started.

If you have a lot of online content experience coming from a prior job, or if you worked in media or something like that, or if you were an online presenter and you've done a lot of Zoom-type content or just been on a lot of Zoom calls for your job, and you have decent recording equipment, a good webcam or a good camera, and a good microphone, or one that at least picks up good sound, I think you have a good foundation to get started with this.

Then it really boils down to what your distribution and syndication strategy is. And then obviously the content: it's the why, it's what's your message. What you are trying to build for yourself is the other huge component to that.

But the biggest challenge for many is just getting started and really coming up with a content approach that fits with you, fits with your passions, maybe aligns with your personal brand goals, if you've even established those yet, and working through a little bit of a focus. Now, granted, you don't have to have all this stuff figured out to get started.

I think one of the things that a lot of people do is they get bound up in this dilemma of indecision, right? Because they can't make up their mind on something. And oftentimes it's better to just get started with a simple concept. Be willing to accept a certain period of time when you're not going to have a large audience.

So if you have a small audience at the beginning, then you have less risk, right? So it just depends on where you are in your career and your personal brand, and how free you are to just start creating content at maybe a lower level of performance or quality or something like that to get started. If you're a high-profile personality, you probably need to take it a little bit more seriously at the beginning.

But if you're a person that is relatively unknown, I think you can get away with a bunch of experimentation and trying new things because you're not going to be exposed to a lot of people right out of the gate. So I think those are the key things to think about, especially at the beginning. And then increasingly we're seeing these new online tools like StreamYard and some of these that are enabling you to just have a webcam and a microphone and then just use their software platform. You create an account and you can do live streaming, you can do a recording, you can do video, audio, all that stuff in one tool, and you don't have to have a big fancy studio setup like what I have here.

But I think those are the biggest things to think about, especially at the beginning. Now, granted, as you move through this process, which can be a multi-year process, there can be challenges that come up. Increasingly, content creators are faced with trying to figure out if they're going to use AI tools to help them with their post-production or their pre-production or their content planning, things like that. And then that's a whole other level of complication. And then where's your content hosted, and what platforms do you want to distribute to? Increasingly, content creators are also faced with this option of converting their program into multiple languages.

AI is giving us the ability to start doing that. And so that could appeal to a much bigger audience, but it does take a different kind of a content syndication strategy. It's a little more of an advanced concept at this point. I think just getting started with your native language is probably the safest way to go and the easiest way to go to get started.

Nathan Gwilliam: Yeah, I agree with that. I've seen so many people who just get bogged down. They can't figure out exactly the right name, or they can't figure out exactly the right equipment. You don't need a lot of fancy equipment to get started. I may have some of this equipment, but next week I'm going to be traveling on the road and I'm going to be recording.

And I take a few little pieces of equipment. I can still record just fine. And you can definitely get started with basic equipment. You can definitely get started, even if you have to change the name of your show. Even if you get 10 or 20 episodes in and you decide you want to go and do something else with your show.

That's okay. And getting partway down the road will help you make that decision. Sometimes just doing it is the best way to get where you want to be, not getting where you want to be first before you do it.

Rob Greenlee: Yeah, and oftentimes, you can just start with your smartphone. That's typically got a very good video camera in it, and it's got the ability to bring in high-quality audio too. You can just get a microphone or a lavalier type of setup that plugs right into your iPhone. And just keep it really simple: get a little smartphone tripod stand that you can put on your desk and get a little inexpensive kind of light.

I have studio lights above me that are creating this balanced look with my background and things like that. So those are advanced concepts. But you can record in front of an open window or something like that, which gives you light that's very similar to a studio light.

So there are inexpensive ways that you can get started with this. And a lot of the platforms are increasingly supporting mobile content production and live streaming through your mobile phone. So you can do a lot of this stuff with the equipment you probably have right now.

Nathan Gwilliam: Let's not make it too complicated.

Okay. Can you share with us some examples of people that have been very successful in this convergence of audio and video?

Rob Greenlee: I think there are examples of them out there. If you go to YouTube, there are a bunch of shows in there, and if you get attached to a particular video podcast that is doing content like what we are, that's a little bit long form, pop over to the Apple Podcasts platform and do a search for their show name and see if they're in there.

What we are seeing is increasingly those folks that are building successful programs on YouTube are tending to start thinking about audio distribution and maybe you would prefer to consume that program that's on YouTube as an audio show, maybe you can listen in your car or whatever, if you like the content.

And what the big difference is really between these two mediums is how visual they are, right? How much information is being communicated on the visual side versus the audio-only side. And that's what the content creator has to really think about as it appeals to a listener: that you're giving both sides the same value as much as you can. Now, granted, it's never going to be exactly the same, because there's no way of being able to replace facial expressions and hand gestures and things like that.

You can't replace that in an audio program. And that may be how you, as a human, like to experience people: with that facial expression, because it does communicate a certain amount of authenticity, or whether this person is believable, or if there's just kind of a connection there that's at a deeper level than you get with audio. And I think some people prefer the video.

Really, I know I've increasingly moved in that direction myself. I consume far more podcasts on YouTube today than I do in Apple Podcasts, just because I like to sit down in front of a big-screen television and consume that content there. And many of those programs have an audio version too. But that's just me. That's not how millions of other people are making choices like that.

Nathan Gwilliam: Yeah, you're definitely right. There are some huge quality advantages with video, where you can connect in a way that's harder to do in audio. And on the flip side of that, some of the advantages with audio are the multitasking ability, right? When I'm driving, when I'm exercising, right? I can do two things at the same time.

Nathan Gwilliam: Yeah you're definitely right. There's some huge quality advantages with video where you can connect in a way that's harder to do an audio. And on the flip side of that, some of the advantages with audio are the multitasking ability, right? When I'm driving, when I'm exercising, right? I can do two things at the same time.

Okay, what are some of the biggest mistakes you've seen people make with this convergence of audio and video and how do we avoid them?

Rob Greenlee: I think a lot of them happen unintentionally because maybe they weren't fully accepting or understanding of the additional processes and work that go into it, right? I think you get caught up on the video side and that can take a lot of time. And then the audio side takes up a certain amount of time too, especially if you want to create a slightly different version of your show on the audio side than you do on the video side. Oftentimes it requires separate editing on the video and the audio, because you do have the option to edit differently between the audio and the video. But there are platforms out there that will allow you to edit the audio and the video at the same time.

And I think we're going to see more of that kind of AI integration around content editing going forward, where there will be selections in the software that you can select to remove certain filler words. So if I say a filler word, or if I say "like," the AI will recognize those and remove them from the audio and the video.

So when you export your publishable version from these AI platforms, the audio will be edited the same way that the video is. So there is something compelling about that, but I do think that you have to be careful, because sometimes if you do too much editing, that takes away from the authenticity and the trust of the content too.

But the counterbalance to that is that you're maybe being more conscious or more aware of your listeners' time and not wasting their time with filler words or big silence gaps or things like that. Which is a double-edged sword too, because if you think about broadcast radio and a lot of the best hosts, especially solo hosts that tell stories, they like to tell a story and then take a pause so the listener really has a little bit of time to think about that point or that message without the next thing hitting them. And that's the balance that you have to strike with these editing tools: are you trying to drive emotion and reaction from your audience by what you're saying?

Or are you just trying to crunch it down into the shortest possible amount of time that you can to save them time? That may be removing a lot of the emotional impact that is possible in the content. So I think that's the juggle that we're having right now.

Nathan Gwilliam: Yeah, definitely. I had not thought about that. That's a huge negative of using those filler word and empty space removal AI tools. We've got to be careful that the AI doesn't take away the humanness of that.

Rob Greenlee: And then all of us start sounding like voice clones.

Nathan Gwilliam: Let's talk about live streaming for a second. Live streaming has become a huge part of this audio video convergence.

What do we need to do to be successful with live streaming in this convergence?

Rob Greenlee: Yeah, I personally really enjoy live and I always have, because I started creating content on live radio. So I'd go into a radio station and do it live and had to somewhat manage the audio board.

Plus I had to answer the phone when I got a caller that wanted to call in and join the show. So I've been doing this multitasking thing with live for many years, and the show that I do, the New Media Show, has been live on all the streaming platforms for over 10 years.

So it's something that is picking up steam because it's a way to have an interactive experience with an audience. And it's a way to connect with people in real time. And it's like being a sports athlete where you're in the locker room before the game and you get kind of butterflies and you get excited. I'm a former college basketball player.

I played for Pacific Lutheran University in the Seattle area. And so we had games that we would travel to and stuff like that. So it was all about mental preparation, right? So each game you had a strategy, right? The coach had worked with you all week prior to the game and you had your certain role that you had against this upcoming opponent, right?

So podcasting in a lot of ways is like that, especially live streaming content, where you have preparation beforehand so you know what you're going to do. You have a plan, you have an outline, whatever, and you have to mentally prepare for that. And there's a whole technique around getting yourself mentally prepped for the competition, or in this case it's clicking the live streaming button, and all of a sudden you're in front of 100 people or 20 people or however many people that is, and it's real time, right?

And what you say goes out right there. There's no taking it back. There's no seven-second pause button. Some content creators have that, right, where they can bleep you out if they need to. So you have to be really on your game and really focused, and that's the thing about live streaming that you have to think about: are you ready for that?

Have you practiced enough in just recording and getting comfortable with the process? It is a different mindset, and you might try being a guest on some live programs. I think it will get you set up in a way that you are in a mental mindset to bring your A game when that red light goes on, right?

That you're there to perform and you need to deliver on valuable content. You need to be focused. You need to be on top of it, but you also increasingly have to be able to manage the production too. Cause oftentimes if you're doing it alone, you have to manage all the knobs and buttons on the screen to be able to play overlays or to bring in a comment from a social media platform displayed on the screen and then talk about that comment and just paying attention to everything requires you to be a multitasker.

But sometimes that takes a little practice and sometimes that takes a little bit of feeling confident that you can pull it off. And probably the first couple of shows are maybe a little bit of a disaster but that may be what you have to do to actually learn how to do it.

Nathan Gwilliam: From experience, you definitely will not be confident the first few shows.

Rob Greenlee: Unless you've done it as much as I have over the years.

And it even pushes me. After I'm done with an hour-long live show, I'm beat, because you're really focused, and sometimes you just have to step away and go take a break for a while. It's like playing on the basketball floor: you've got four quarters that you have to perform and win the game.

And if you lose the game, guess what? You're regretting your performance the whole next week. So it's very similar.

Nathan Gwilliam: But even though I was not confident when I started live streaming, the only way to get confident is to do it.

Rob Greenlee: So you have to jump in with both feet and take a risk. It's not necessarily a bad approach to just be honest with your audience about what's going on. Just tell them that you're new to this and there may be things that don't go quite right: "But just ride along with me and I'm going to bring you the best content that I can. And each episode as we move forward, we'll get better and better as I get better and better."

And that's a common thread with podcasting just in general: episode one is not going to be as good as episode 30. That's just how it works. You're going to get better. Hopefully, you only produce one episode at a time, and that gives you time to learn and improve from episode to episode, and take feedback from your audience and be open to that.

Don't be overly sensitive to people saying your audio sounded like crap. Get in there and see if you can improve it.

Nathan Gwilliam: You talked a little bit about AI, and that's a huge part of the future of podcasting we're living today. Where else do you see the future of this audio video podcasting convergence? Where will it take us?

Rob Greenlee: I think it's going to increasingly move towards video. I think that's the pattern that we're seeing. I'm not a hundred percent sure about what the future of audio is in the bigger picture of things. I think that there will always be a certain amount of consumption of audio because, just like you said earlier, there are just so many more places where it can be consumed than video.

So I think that the pattern going forward is that content creators are increasingly going to have tools that are easier and easier to use. And we're already starting to see some of that stuff happen: platforms like Humano or whatever that are using AI tools, cloud production tools, things like that, that can process the audio.

Which means you can record your content pretty much anywhere. You don't need special equipment or special microphones and mixers and all these wires and all these kinds of things. You can just have this very simple lavalier-type device, clip it on and voice it. And then it's done: it goes up to the cloud, it gets voice processed. It sounds like you're in a studio, but you're outside by a waterfall, or you were in other places. And we're starting to see this kind of technology start to take hold.

And so it may be much easier. And I think this move towards creating content with your mobile device is increasingly going to become more and more of a thing as well. And then these AI tools will augment that production, clean it up and optimize it, and create different experiences with it more automatically.

Now, the content creator still needs to be in charge of the process. If the AI doesn't do the right job, I'm a firm believer that the AI output needs to be able to be modified, and that you give it your creative input. Don't be 100 percent reliant on any kind of AI tools. Always review the output that it creates. Increasingly we're probably going to be using very good voice cloning tools. We're already seeing that right now as a way to edit audio. You can add audio to a piece of content with your voice without actually having to use your voice; you just type it in.

Nathan Gwilliam: And even your mouth moving according to those words you typed in.

Rob Greenlee: Exactly. You could use a younger version of yourself or something like that. There are all sorts of things that this is going to unlock. And this whole concept of deepfakes and things like that, I think, is a realistic concern.

And it's going to be hard for us sometimes to know what's a hundred percent real and what's been created for us. And I'm not sure exactly how we're going to mitigate that quite yet. I'm not sure if there are going to be laws, or if there are going to be some sort of rules around identification or labeling this as AI generated, some rules that will preserve the integrity of human-created content like we're doing right now.

Nathan Gwilliam: Thank you for being with us today and for sharing your time and wisdom with us. If our audience enjoyed this and they want to learn more about you and your products and services, what are the best ways for them to do that?

Rob Greenlee: I have a website at and that's g r e e n l e e dot com, Rob Greenlee.

And I'm on Twitter as well, or X at Rob Greenlee. You can actually find me by my name on pretty much all the social platforms, whether it be Instagram, on YouTube, you can see a bunch of the content that I'm making over on YouTube as well at Rob Greenlee. And I have a show on the StreamYard channel it's called Podcast Tips with Rob Greenlee.

And that's typically every Thursday at 7 p.m. Eastern, 4 p.m. Pacific, and the New Media Show is live Wednesday at 3 p.m. Eastern, and that would be noon Pacific. And that's at newmediashow.com. Those are two of about the five shows that I'm involved in. I could be here for a long time telling you all the stuff I'm working on, but that's a good summary.

Nathan Gwilliam: With everything you have going on, thanks for making time and being here.

Rob Greenlee: Yeah, thank you, Nathan, and good luck with your projects as well.

Nathan Gwilliam: I really appreciate that.
