Apple Award Winning Podcast
Frank Schneider explains how software and artificial intelligence can create powerful opportunities by listening to thousands of conversations simultaneously
Frank Schneider is the CEO of artificial intelligence listening company Speakeasy AI, whose mission and technology are based upon the premise of listening to understand, not merely respond.
Born and raised in Philadelphia, a city where listening is equal parts human empathy and survival, Frank spent the bulk of his 22 professional years in roles where active listening is of paramount importance.
Frank has taught elementary, middle and high school and worked with adult ‘English as a second language’ students fleeing war-torn countries, and teens who were court adjudicated. He has coached basketball players and sales reps, counselled convicted felons, teachers and corporate teams in conflict resolution and peer mediation.
Frank explains the history of listening software, which began with typed conversations between humans and chatbots. Now we’re speaking vocally to artificial intelligence, most famously to assistants like Alexa and Google Home. However, your voice is still transcribed word for word and sent into algorithms very similar to those that powered chatbots. This is listening to transcribe, not listening for meaning or understanding.
Advanced listening AI, the kind Frank works with, attempts to understand what we’re saying from the moment we say hello. Real listening examines the type of language that’s being used and also incorporates context. In this way, it should be accurate, helpful and effective at large scale.
Frank also talks about why listening is pivotal to being a basketball coach. No matter how you coach, you’re not playing the game, so you need to listen to your team to get on-the-field knowledge about what’s happening. The players have something to say, so in order to give the best advice, guidance and direction, you need to have their input onboard.
Tune in to Learn
- How software listens differently to humans, but what we can learn ourselves
- Why listening is important for serving others
- Why groups can solve their own problems when they are in a listening environment
- How comedic impressions can provide valuable insight
- The power of software to listen to thousands of conversations simultaneously
Episode 34 – Deep Listening with Frank Schneider
Hi, I’m Oscar Trimboli, and this is The Deep Listening Podcast Series, designed to move you from an unconscious listener to a deep and productive listener. Did you know you spend 55% of your day listening, yet only 2% of us have had any listening training whatsoever? Frustration, misunderstanding, wasted time and opportunity, along with creating poor relationships are just some of the costs of not listening. Each episode of the series is designed to provide you with practical, actionable, and impactful tips to move you through the five levels of listening. I invite you to visit oscartrimboli.com/facebook to learn about the five levels of listening and how others are making an impact beyond words.
You’re looked to for leadership. It’s hard to be patient and fall back and say, “Let’s listen and figure out what this is and what’s needed before I prescribe something, or before I jump into a solution.”
What we found is that there was a gap in the performance of these voice AI systems, and that gap, interestingly enough, is that AI isn’t properly listening. Artificial intelligence is listening to transcribe, almost like a court stenographer. It’s listening to simply type out word for word what you are saying, and then push it somewhere, like push it to a human.
How does listening weave its way through your career?
In this episode of Deep Listening: Impact Beyond Words, we get to hear from Frank, the CEO of an artificial intelligence listening company. He explains how his early career as a teacher, conflict resolution mediator, and basketball coach helped to form the way he thinks about listening through the lens of software. He explains how he combined listening to a customer from the past, and how that helped him create listening software for the future. Frank does a great job of exploring what AI, artificial intelligence, can listen to and what it can’t in the current generation of software. Learn about the parallels between machines and how they listen, and how this replicates the mistakes of the way humans listen when they just listen to content and aren’t listening to context within the dialogue, and ultimately what’s unsaid. Let’s listen to Frank.
When I think about the good listeners, I had a violin teacher who my parents were kind enough to get me private lessons outside of school, who very much crafted lesson plans that had goals and benchmarks in mind for where he wanted me to get to, as a coach or instructor. The pacing of how I was going to get there and the ability to figure out the true learning moments along the way were done based on him listening to me, not just as a musician and as a player, but even feedback on simple things like, “This is fun.”
I remember coming up with … He had me playing these Irish jigs, these really fast Irish songs, and I told him one day I felt like that we were racing. We would play other instruments while I was playing the violin, so he would play the flute or the clarinet and I would play the violin. I said, “I feel like I’m racing you,” and so he turned that the next week into a lesson plan where we kind of raced through pieces. It was a way to have me understand pacing and syncopation and all these fun music concepts, and also read music at a quicker, more rapid pace as well. It was super fun. I remember those violin races, and this was a long time ago, like nine, ten years old when I was doing this. But I don’t think he did races until I mentioned that, and sure enough he found a way to weave it into his lesson plan and have me stay really into the violin, which wasn’t easy for a kid who most of the time would prefer to be playing baseball. He was a really great guy.
Thinking about that situation, what lessons do you take forward now in terms of your ability to listen as a result of that?
That, in general, impacts me a lot. It’s something that I feel like every day I’m imperfectly working towards, in regards to being a better listener, and that’s across the board, professionally, in my roles in executive of my company. I’m very passionate about coaching and I coach a high school basketball team still, and listening there is of paramount importance. Then with my wife and my three daughters, it’s everywhere. It’s that idea that I’m a firm believer in service leadership, and at the core of service leadership is the concept of elevating people by delivering things for them, by doing things for them, by serving them.
Giving someone what they need at the right moment is so hard. It’s so hard to know what that is. Constantly trying to listen to both verbal and nonverbal cues, to not be in a hurry to prescribe what that help is or jump to a solution, to recognize that sometimes the way to serve someone or help them is simply to provide the listening: that’s a constant challenge, especially when you’re in a leadership position. Whether it’s coaching or my role at my current company or with my children, you’re looked to for leadership. It’s hard to be patient and fall back and say, “Let’s listen and figure out what this is and what’s needed before I prescribe something, or before I jump into a solution.”
In those time critical moments as a basketball coach on game day, where are the moments you’ve noticed catching yourself where you haven’t been listening, and where are those moments where you were really proud and listened well?
It took me a while to know that sometimes in those critical moments … So you’re down three, maybe there was a bad referee call or a turnover or something where you need to call time out, and you have either 30 seconds or a minute to get what you need out. It’s really hard to be patient when there’s literally a clock on your communication. You bring in these five kids and you have to get another 10 or so that are on the bench to stand up, and they all come in a circle and physically get around, and everyone’s ready to hear your words, sometimes the best thing you can do is let someone who’s living the moment in a way that you can’t …
No matter how you coach, you’re not playing that game. You’re not participating in the contest at that level. Only those five active players on the floor really have 100% knowledge of what’s occurring. Like a good coach, you’re going to instruct and guide, try to help, facilitate successful outcomes for those five, but those five have to do it. When they come off the floor, oftentimes they have something to say. If you can get that input, it can help make sure that when it’s time for you to deliver your message, “Okay, this is the strategy we’re going to try, or the play we’re going to run, or here’s something that I’ve noticed that you guys haven’t noticed in regards to the physicality or what have you.” If you don’t give them a chance, you’re going to stub your toe in many a huddle. It’s not like in the movies where the coach says seven amazing words and everyone goes, “Yes, let’s go. Go, fight, win. Everything’s magic.” It doesn’t work like that.
I’d say early in my coaching career, I worried about, from a control standpoint, a player coming in from the court and maybe they’re flustered and they’re frustrated, that communication starting with them making the huddle spiral out of control. It took time to realize, especially as you develop trusting relationships with your players, that sometimes them initiating that conversation or initiating that strategy discussion can help you in a critical moment come to the decision that’s needed for the team. Because you have to decide what are we going to do with these 30 to 60 seconds that’s going to help us when we run back out into the fray?
It’s a lovely example with parallels in the workplace, because I sense the very first thing the coach would want to say in that situation to the playing group is, “Listen.” And yet the first thing you’re doing is posing a question to the people who are undertaking the activity about where they’re at. It’s a great parallel for leaders in the workplace: sometimes the most powerful thing is to help people listen to themselves by posing the question so that they can listen. A really good example of “Ask the question and you’ll probably hear more.” But more importantly, the group can usually solve its problems faster than you can as a coach. Have you found that to be true?
For sure. You hit the nail on the head. If the group can arrive at “Here is what we think the path is to get this outcome we’re looking for. Here’s how we want to solve this problem,” or, “Here’s the play we want to run,” it suddenly delivers a level of ownership that you can’t get by suggesting the same thing even in a way that’s incredibly impactful. Like any organization, if you’ve prepared properly and you’ve practiced properly, and then you come into that meeting in a critical moment and say, “What should we do here?” If two or three team members take control of … We’ll have in basketball these dry erase boards, and they take the dry erase marker and say, “Well, here’s what I think we should run because I’ve noticed X, Y, and Z.”
Since I’ve let go, I’ve had players suddenly get really specific, without making the podcast about basketball, and say, “Well, I’ve noticed they’re overplaying screens, so we should backdoor,” so that means three in our language. “Let’s run threes right here from the wing,” and they’ll draw it on the board. It’s the best moment you could possibly have as a coach because all you do is say, “Yup, that’s a great idea. Let’s go.” Even when that huddle ends and the five players run back on the floor, the ten who are on the bench are incredibly uplifted by the fact that that decision came from the people on the floor, and they can envision themselves doing the same thing.
A really good example of simply listening to your team. Often they have much better solutions to their own problems than you do, but more importantly you create an environment for them where they can solve that problem. Don’t underestimate the role you’re playing as a leader or manager when, rather than telling, you’re simply creating that great environment.
One of the funniest things that happened to me about two years ago is, I had been coaching at this high school for long enough now that old players are coming back, which is a lovely treat. It’s a great treat when your old players come back to talk to your new players. There’s a bond there even though the kids didn’t play together. I’ve uncovered that one of my old players thought he did a great impression of me. Once I found out, I had to see it. I had to hear it. It was great.
What I learned about two years ago is if I can find out that someone on the team can do an impression of me, I want to hear it as soon as possible. It’s, of course, super entertaining and funny, but it’s also a really good insight into … Humor is a great insight into what people really think, what they really think they’re hearing from you, and it’s a different kind of audio mirror to try to figure out what am I really like? When people listen to me, especially this audience that I’m leading this team, what are they really hearing and what does it mean for me in regards to how I present the information or the coaching?
Deep Listening: Impact Beyond Words the book is available via Amazon or at oscartrimboli.com/books. It’s organized in a really practical way around the five levels of listening. Whether I’m speaking on stage or hearing from listeners who email me after the podcast, the most common question is, “What’s the most practical tip I can give somebody to improve their listening?” It always starts with level one, listen to yourself. How do you prepare yourself so that you can listen to somebody else in the dialogue? The deeper that you breathe, the deeper you listen. Check out the book, Deep Listening: Impact Beyond Words, and you’ll be able to move from an unconscious listener to a deep and powerful listener.
I started as a teacher and conflict resolution peer mediator, and now I wind up in an artificial intelligence company. But there certainly is a common thread in regards to listening that’s very germane to what we’re talking about today. But specifically for Speakeasy AI, I helped launch this company primarily through an idea and an investment from my former CEO, as well as a product created by a developer from my previous company who is now our CTO. The fourth member of our team was a former customer of mine at my previous company, and listening to him to drive the product placement has been key.
But overall, what I found in my last company was that we were working on these systems commonly known as chatbots or virtual assistants. They’re artificial intelligence algorithms essentially, and software and platforms, that at their core are trying to emulate human conversation. Primarily, these systems are doing that through typing. You might type to one on the web or type to it in a mobile app or message it through Facebook messaging or a Twitter bot or what have you. What we found with these solutions is that with the advent of things like Amazon Alexa and Google Home, suddenly people wanted to talk to things again. We went very quickly from everyone staring at their phone and typing to it, to voice is new again. Suddenly people are willing to pick up the phone or press a microphone and talk to things.
The irony of it all is that these typing systems, which at their core were trying to emulate human conversation, have not easily been adaptable back to where the idea maybe started, which is voice conversation. These conversational systems are difficult to talk to. When you speak to Alexa, she does some cool things, but she’s not overly intelligent. When you speak to her, what actually happens at her core (you can see it on the app if you play with it) is she translates everything you say speech to text, word for word, not in conjunction with each other, not holistically, not in a way where it says, “Let’s wait and hear the whole thing and then translate to figure out what the intent is, what the person means, what I’m trying to understand.” What Alexa does is parse each word out individually, transcribe it, and then push it into a typing conversational AI system, some kind of AI platform, which then tries to take those transcribed words and repeat the process for output. Take the words, somehow figure out what the answer might be, push that answer back, and then turn it back into voice.
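The pipeline Frank describes, transcribe word for word, hand the flat text to a typing-era chatbot, then speak the answer back, can be sketched in a few lines of toy code. Everything here is a hypothetical illustration (the function names, the string stand-in for audio, the keyword-matching bot), not how Alexa or Speakeasy AI is actually implemented; the point is that no stage sees context or intent, only words.

```python
# Toy sketch of "listening to transcribe": three isolated stages,
# none of which knows who is speaking or why.

def speech_to_text(word_audio: str) -> str:
    # Stand-in for per-word speech recognition; "audio" is just a
    # labelled string here.
    return word_audio.removeprefix("audio:")

def chatbot_answer(text: str) -> str:
    # Stand-in for a typing-era chatbot: naive keyword matching,
    # with no memory or context of the caller.
    if "balance" in text:
        return "Your balance is available in the app."
    return "Sorry, I didn't understand that."

def text_to_speech(answer: str) -> str:
    return f"<spoken>{answer}</spoken>"

def voice_pipeline(audio_words: list[str]) -> str:
    # Each word is transcribed individually, then joined and pushed
    # onward, exactly the court-stenographer pattern.
    transcript = " ".join(speech_to_text(w) for w in audio_words)
    return text_to_speech(chatbot_answer(transcript))

print(voice_pipeline(["audio:check", "audio:my", "audio:balance"]))
# -> <spoken>Your balance is available in the app.</spoken>
```

Because the stages are sealed off from each other, any transcription error in stage one is passed downstream unquestioned, which is the gap the next paragraph describes.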
What we found is that there was a gap in the performance of these voice AI systems. That gap, interestingly enough, is that AI isn’t properly listening. Artificial intelligence is listening to transcribe, almost like a court stenographer. It’s listening to simply type out word for word what you are saying, and then push it somewhere, like push it to a human. If you hit the microphone button because you want to text me while you’re driving … I know some people are guilty of doing that. It’s not a good thing. But some people will hit that button, speak, and then send the text along. There’s no context or memory or any kind of understanding of what the intent might be. It’s not going to be understood until the receiver receives it. In the same way, for artificial intelligence systems, what our hypothesis has been and what we think we’re proving out, is that that attempt to understand should start from the moment someone says, “Hello.” From the moment someone starts speaking and uttering things in voice, that listening needs to be, in an artificial intelligence, just like a human’s: authentic and active.
The primary goal is to understand, not to respond. You need to know what is unsaid, as well as said. Where is this person coming from? What have they done previously? What about the language they’re using aligns with other language we’ve heard before? And the technology has to be ready to serve, but not jam the response and learn through failure. So often with these artificial intelligence systems, that back and forth conversation emulation, which isn’t really conversational, is only getting better based on mistakes. It’s only getting better based on failure. “Well, this failed, so what do we do next time?” What we’re proposing is that if you have a true listening capability with an AI platform and a voice conduit, you should be able to show reports and insights and thoughts around this is what was understood and the litany of possibilities that are open based on this understanding. But we didn’t jam the solution to the customer until whoever the administrator or whoever the person is who’s going to oversee this type of AI system has decided that that’s the right solution.
If I’m calling a large financial services brand and I say something like, “I’d like to open a home loan,” because of my Philly accent, a good amount of times a traditional speech to text system that’s just listening word for word and trying to do a transcription is going to say, “I am home alone,” as in all by myself without anyone hanging out with me. Whereas our system is saying, “You’re calling a bank and maybe we know who you are. We know you have a mortgage with us. Maybe we know you’ve been to a branch before.” So when we hear that utterance, rather than saying, “I can’t help you with that because you’re home alone and I’m not really in the business of making you feel like you have company,” we might simply say to you, “Oh, are you asking about a home loan?” That’s a nuanced difference.
It’s not a huge difference for a human. But for an artificial intelligence platform, to leverage the context to say, “This conversation has to be about home loans,” as in mortgages or things of that nature as opposed to someone talking about being alone in their house, what that does is, the accuracy lift is nice, but it’s not a huge difference, right? It’s going to reduce frustration for the customer. But ultimately at scale when you’re fielding … The future will be how many of these voice inquiries will be coming in in a week, in a month for these large brands? When at scale, you can start to make connections on how we can be better listeners with our technology platforms, which is what this is enabling. A big piece of our solution is the ability to get better based on volume, because the more context and memory you have, the more the machine learns and can try to figure out how to connect those dots, and also how quickly you can improve things at scale over time. How quickly can you say, “I know these are the types of conversations that my customer is having”?
Now right there, that’s an example more of accent or speech to text not incorporating any kind of filter around accent or context or nonverbal communication because it’s just looking for word for word. Not so much did it fail, it just didn’t listen in the way that we’re trying to listen, you know? That active listening approach to AI. That ability to listen maintaining context is really what we think the future is.
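The home-loan example can be illustrated with a toy rescoring function: instead of accepting the raw transcription, competing hypotheses are re-weighted by what is already known about the caller. The names, scores, and context terms below are all invented for illustration; nothing in the episode describes Speakeasy AI’s actual mechanism.

```python
# Hypothetical sketch of context-aware disambiguation: pick the
# transcription hypothesis whose words best match the caller context.

def rescore(hypotheses: dict[str, float], context_terms: set[str],
            boost: float = 0.3) -> str:
    """Combine acoustic confidence with overlap against caller context."""
    def score(item):
        text, acoustic_conf = item
        overlap = len(set(text.lower().split()) & context_terms)
        return acoustic_conf + boost * overlap
    return max(hypotheses.items(), key=score)[0]

# The raw transcriber slightly prefers "home alone", but the banking
# context (mortgage holder, prior branch visits) tips it to "home loan".
hypotheses = {"i am home alone": 0.55, "i'd like a home loan": 0.50}
context = {"home", "loan", "mortgage", "bank", "branch"}
print(rescore(hypotheses, context))  # -> i'd like a home loan
```

The design choice mirrors Frank’s point: the accuracy lift on any single call is modest, but at scale the accumulated context makes the system a measurably better listener.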
What do you suggest those things are that we can learn from how software listens versus what can we teach software from how humans listen?
It’s interesting, because when I think about the critical nature of business communications (and if I’m being honest, it’s primarily because of the brand reputation and the monetary connection involved), listening can’t be done in a way that has you jam solutions or jam answers before that understanding is achieved. Understanding is always the goal of listening, and understanding requires you to do more than just listen to the words someone is saying: bring as much to the table about them as you can, acknowledge what you can and can’t do as a listener, and acknowledge that you’re going to have to endeavour to get better with every listening.
I think these elements or variables in play for our software platform are very much human. Maybe in a lot of ways that’s the irony of AI, if AI’s going to be done right, is it’s not trying to replace humans. It’s trying to help humans be more human, if it’s done right. If a business or an organization is having trouble listening to five million people in a certain period of time, but this technology can help them truly listen to them at scale, the opportunities are very exciting and compelling.
All my work is built around the power and healing qualities of listening free of judgment and comparison, and so I recognize in Oscar’s beautiful book the profound truth in what he says. More than this though, he describes the importance of listening in a very accessible and elegant way. It’s hard to deny anything so clearly stated, and the steps to take to improve our own listening seem so clear and compelling. A simple yet powerful book, it should be everyone’s companion on the journey back to a proper balance between speaking and listening, and thus a healthier outlook on the world.
If you’re enjoying the series, the best way to stay up to date is to subscribe via your favourite podcast application, Apple Podcasts, Spotify, and now available on Amazon Alexa. We’d love to hear your feedback as well. We’re always listening for ways to improve the show, so please leave a review on your favourite podcast application as well. You know we’ll be listening.
As I start each interview, I never have a prescribed approach to listening to the guest. I don’t have a script. I don’t have a set range of questions. I simply trust the process that by listening deeply to them, I’ll discover something valuable for you and how it will help you become a better listener. If I didn’t have this approach, I might have missed the amazing range of stories that Frank had to tell about listening as a basketball coach. I probably took as much out of listening to him as a basketball coach as I did from hearing him deconstruct the way software is now programmed to listen. If someone could listen to you often enough that they did a great impersonation of you, I think that’s a brilliant example of deep listening. Not only are they listening to what you say, they’re also listening to how you say it and how your body moves when that happens. I think great impersonators are deep listeners.
One of the things that surprised me in the interview is I learned that software needs to engage with the complete context of the sentence and then the dialogue to make sense of it, especially by listening to what’s unsaid. How much time are you spending just listening to words, rather than taking the time to listen to the complete sentence and the dialogue so you can make more meaning of what’s said? Teaching software to listen for context, the unsaid, and meaning is the next frontier of listening. I think that’s not only true for software, but equally true for humans.
Thanks for listening.