Endless Thread - Bonus: Yanny Vs. Laurel... Vs. Yaley?
Episode Date: May 16, 2018
We bring you a bonus episode that takes on the "Yanny vs. Laurel" debate that's dividing the Internet. ...
Transcript
Produced by the iLab at WBUR, Boston.
Hey, Amory. Hey, Ben. Yeah. Yeah. Team Yanny. Team Laurel. So, obviously,
this is not the usual day that you would get our podcast in your feed, but we are giving you a little
taste of something because the internet, including Reddit, kind of blew up last night around this
big debate involving this piece of audio that went viral and nobody can tell whether the piece
of audio is somebody or a computer saying Yanny or Laurel, right?
Yeah, including us. I mean, we were part of that heated debate last night where it was like,
Yanny or you're dead to me. So this is like every audio nerd's dream, to dig into
something like this, and that's what we're going to do.
And some of the reporting, the original reporting on this,
linked back to comments made on Reddit by Redditors who were reacting to this sound
and trying to figure out what it actually was, what was happening.
So we actually called up one of these Redditors and talked to him about it.
My name's Steve Backenstos.
I'm from Harrisburg, Pennsylvania.
I am a college student and audio engineer.
My Reddit handle is STVNTB.
Are you team Yanny or team Laurel?
I actually heard both.
You heard both even the first time?
Yeah, I heard both at the same time.
They come in and out of different levels of prevalence,
but they were both there from the get-go.
I think because I knew that it was going to be
some sort of weird split phenomenon
from the beginning.
I was prepared for that.
So if you go in having the knowledge, then it's easier to hear both of them?
Yeah, well, I did a little research here and there, and I think it was an Atlantic article that brought up a good point: speech is entirely made up.
Like, it's just sounds that we've given significance to.
They have no innate value.
So when you're presented with something, like you visually see, hey, this is the word you're going to hear, and then you hear it,
you'll probably hear it, and your brain will try and fill in those gaps. Even if you don't
entirely hear it, your brain will be like, yeah, that's what you heard.
Huh, interesting. So when you were going to listen the first time, did you expect to hear both?
Yes, but I wasn't sure to what degree. And after I heard it, I kind of understood why you could hear it different ways.
You know, from just reading about it, you expect it to be like the
black and blue dress or the white and gold dress, where it's going to be just strikingly blatant
right there. You will immediately hear one or immediately hear the other. And you won't be able to
change your ability to hear either one. Well, I think that was most people's experience in the
beginning at least, or at least most people around me just anecdotally. And it was my experience, too.
I first heard Yanny over and over and over and try as I might. I couldn't hear Laurel to save my
life. And then, like, I don't even know. Like, I switched from phone to laptop and listened again,
and it was Laurel, and I couldn't hear Yanny. Yeah, that's when I was listening to it originally,
and when I made that whole post, the prevalent one was Laurel. And after like 20 minutes or so,
after I had gotten your email, I went and talked to my dad and was like, you would not believe the
email I just got and wanted to give him a little background and played it for him. And then I heard
Yanny more prevalent than Laurel.
Was he like, what drugs are you on?
And you were like, no, no, no.
The only drug I'm on is the internet, dad.
Yeah, we both kind of came to the conclusion of,
I should probably go do a little more research,
so I don't say something stupid on the phone.
So can you walk us through your Reddit comment?
Yeah.
And there's really, I think, two big parts to what's happening here.
One is the frequencies of human speech and human hearing, and the other is how that works with audio formats.
Okay.
So human hearing runs from about 20 hertz to about 20,000 hertz or 20 kilohertz.
That's actually the range of human pitch perception.
So just because we can't hear above 20,000 hertz or below 20 hertz as a pitch, it doesn't mean we can't perceive it as something.
So that's why now moving over into audio file formats, you hear a lot of audio guys complaining about MP3s and stuff like that because that's a compression format.
And so in order to make file sizes small, the algorithm that compresses the audio and makes it smaller just says, well, we don't really need anything above 20,000 hertz.
Or, even more so, we probably don't need anything above 16,000 hertz because, likely over the age of,
like, 18, you probably can't really hear above that.
So these algorithms will cut these audio files down more and more,
but it's not just a hard chop.
So some things are left over.
And that's what we refer to as artifacts or digital distortion.
And the best way to probably explain that to the layman is it kind of sounds like
the audio is pixelated.
And if you notice in that recording,
Larry, Larry.
It's very rough, very distorted around the edges.
And so you can tell that that's compressed.
And because it's voice synthesis, it's probably chopped down to between 80 and, I don't know, 4,000 hertz because of where the voice rests.
And then we're going to just take a quick ancillary here and talk about overtones versus fundamentals.
So, like, with the pitch of my voice talking to you right now, you're not actually hearing the fundamental frequency,
because of the way that telephones have traditionally worked.
The phone is only transmitting the overtones of my voice,
but your brain is able to piece that together.
So it doesn't sound like I have a high-pitched voice right now.
My brain paints a full picture,
even though the information is not a full picture,
which is like one of the sort of, I guess,
kind of magical things that the human brain does
in order to help us all make basic sense of the world
without going insane.
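A minimal sketch of the missing-fundamental idea described above, in Python. The 110 Hz voice pitch and the 300 to 3,400 Hz telephone passband are illustrative assumptions, not figures from the interview, and the function names are hypothetical. It models the voice as a stack of harmonics, drops everything outside the phone band, and shows that the spacing of the surviving overtones still implies the original pitch.

```python
from functools import reduce
from math import gcd


def harmonics(fundamental_hz, max_hz):
    """Integer multiples of the fundamental, up to max_hz."""
    return list(range(fundamental_hz, max_hz + 1, fundamental_hz))


def band_limit(freqs_hz, low_hz, high_hz):
    """Keep only the frequencies that fall inside the passband."""
    return [f for f in freqs_hz if low_hz <= f <= high_hz]


def implied_fundamental(freqs_hz):
    """The largest frequency that all remaining overtones are multiples of."""
    return reduce(gcd, freqs_hz)


# Assumed values: a 110 Hz voice and a 300-3400 Hz telephone band.
voice = harmonics(110, 4000)
phone = band_limit(voice, 300, 3400)

print(110 in phone)                # False: the fundamental never reaches the listener
print(implied_fundamental(phone))  # 110: the overtone spacing still points to it
```

The greatest common divisor is only a stand-in for what the listener's brain does; it is not a model of actual pitch perception.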
Yeah, exactly.
And that's really what it comes down to.
When you chop off that top information,
that 4,000 up to 20,000 hertz,
in order to keep the file size small,
you're taking away a lot of that overtone information
that really provides the character of the sound you're hearing.
If you were to take, like, a trumpet and a guitar
and chop all of the frequencies out
except for the fundamental,
you'd just be left with a sine wave,
and it'd all be the same sound.
So it's those overtones that really ultimately give a sound its character.
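The trumpet-and-guitar point can be sketched in Python as follows. The two harmonic profiles are made-up numbers, not measurements of real instruments, and they assume both notes have the same fundamental frequency and loudness. Each tone is built as a sum of sine waves; once everything but the fundamental is removed, the two waveforms come out identical.

```python
import math


def synthesize(fundamental_hz, harmonic_amps, sample_rate=8000, n_samples=64):
    """Sum sine waves at integer multiples of the fundamental.

    harmonic_amps maps harmonic number (1 = fundamental) to amplitude.
    """
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        samples.append(sum(
            amp * math.sin(2 * math.pi * fundamental_hz * k * t)
            for k, amp in harmonic_amps.items()
        ))
    return samples


def keep_fundamental(harmonic_amps):
    """Drop every overtone, leaving a single sine wave."""
    return {1: harmonic_amps[1]}


# Hypothetical overtone profiles standing in for two different instruments.
BRASSY = {1: 1.0, 2: 0.8, 3: 0.6, 4: 0.4}
PLUCKED = {1: 1.0, 2: 0.3, 3: 0.1}

full_differs = synthesize(220, BRASSY) != synthesize(220, PLUCKED)
stripped_same = (synthesize(220, keep_fundamental(BRASSY))
                 == synthesize(220, keep_fundamental(PLUCKED)))
print(full_differs, stripped_same)
```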
There was a tweet out that somebody had pitch shifted the sound up a certain percentage and down a certain percentage.
And when they shifted it up, you could no longer hear Yanny.
Laurel.
And I think that's because the word that I'm assuming the voice was going for,
that they had entered into the software, was Laurel.
And the Yanny was a result of that compression.
And by taking away a lot of the top of it, some of those artifacts that were left over, your brain pieced it together and said, oh, this is Yanny.
You know, I can kind of hear enough of the puzzle pieces to say, oh, it's Yanny.
Yenny.
But when you shift it up, those leftover artifacts are now pushed way out of the realm of human hearing,
and you're left with just that lower end. And, you know, the overtones closer to the fundamental are stronger than the ones away from it.
Sure.
So you're going to end up with things that are a little more prevalent.
So by forcing out some of those artifacts that made it sort of ambiguous, you're left with Laurel.
But when you shift it down, you're taking those artifacts that make you think that it's Yanny and you're putting those closer into like human hearing range and what you're
expecting or into human voice range.
Uh-huh.
And now you're able to hear Yanny a lot clearer because you're bringing all of those
artifacts down into, you know, the realm that you're expecting to hear.
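A rough Python sketch of the pitch-shift reasoning above. The component frequencies and the shift ratios of 2.0 and 0.5 are assumptions chosen for illustration (the tweet's actual percentages are not given), and the 12,000 Hz entry simply stands in for a high-frequency artifact. A real pitch shifter works on sampled audio; this only tracks where each frequency ends up relative to the roughly 20 Hz to 20,000 Hz hearing range mentioned earlier.

```python
AUDIBLE_LOW_HZ = 20.0
AUDIBLE_HIGH_HZ = 20_000.0


def pitch_shift(freqs_hz, ratio):
    """Pitch shifting multiplies every frequency by the same ratio."""
    return [f * ratio for f in freqs_hz]


def audible(freqs_hz):
    """Keep only components inside the assumed hearing range."""
    return [f for f in freqs_hz if AUDIBLE_LOW_HZ <= f <= AUDIBLE_HIGH_HZ]


# Hypothetical components; the last one plays the role of a high artifact.
components = [200.0, 800.0, 3000.0, 12000.0]

shifted_up = audible(pitch_shift(components, 2.0))
shifted_down = audible(pitch_shift(components, 0.5))

print(shifted_up)    # the artifact lands at 24,000 Hz and drops out
print(shifted_down)  # the artifact lands at 6,000 Hz, well within hearing
```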
Something that really kind of complements this story well is the NATO phonetic
alphabet.
You know, when you hear pilots and stuff, say, like, Alpha Delta, Niner and all that stuff,
that whole alphabet was created because when you're using radio and you're using telephones,
where those frequencies are cut, it's very hard to tell.
But if I was talking to you in person, we wouldn't be having a debate over if I'm saying the letter V as in Victor or B as in boy.
I think it's a very similar situation where, you know, it's just a misunderstanding based on the way the audio is compressed.
So it's not black magic.
It's just science.
It's just science.
Yeah.
Steve, thank you so much for explaining this to us.
Well, thanks for having me. This was cool.
All right, so that was Steve. But now we're going to turn to one of our buddies here at WBUR, John Perotti. He's the manager of podcast production here. Hey, John.
Hi, guys.
So you're an audio nerd. This is your, this is kind of your jam, right?
How dare you? How dare you?
I'm so sorry, but this is your jam, right?
This is my domain.
Okay. So you are the perfect person to help us break this down.
Yeah, we just talked about this a lot in theory, but John is here with us, with a computer,
and he did some of his black magic in order to really settle this with us once and for all.
Right. So what's the first thing you're going to play for us? Okay, so this is the original file.
Laurel. Lowell. Okay. So. Which team are you?
I'm team Laurel.
Laurel.
I heard Laurel that time.
Okay.
Okay.
We all agree.
This is amazing.
There's something strange about this file because people are hearing that other thing.
So I was like, let me see if I can get in there and figure it out.
You guys sent me some videos of people using EQ to kind of hear different things.
So I started with that.
And I played around with our iZotope tools here.
And I got to this.
Now, do you hear there's something going on there, right?
Do you hear, like, a high, like, weird ghosty thing?
Yeah, and also, I'm now team Yanny.
Okay.
Because that sounds like Yanny, and it also, I hear, like, digital distortion that makes
me want to jump off a bridge.
Okay.
So what I heard was there's something way up there.
So everybody does the EQ thing, right?
And they're like, okay, but I was like, let's pitch this down and see if we can hear
anything, like, if it's a really high thing.
So I pitched it down.
You're hearing that?
Now it sounds like, yeah.
Yaley.
Yaly.
So I went back to the, I did all that work, and then I was like, okay, well, now I can go back to the original tape.
So here's the original tape.
Laurel.
Laurel.
Take it down an octave.
Now what do you hear?
Really.
Really?
Whoa.
Wait, I'll pitch it up.
Now you're just going to hear Laurel.
Laurel.
Laurel.
And then back down.
Really.
Really.
There are two files on top of each other.
There's somebody saying Yaley,
but in the original here, it's like an octave higher.
So you had to really dig to find it, but it's just pitched up.
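The arithmetic behind "an octave higher" can be sketched in Python. An octave is a doubling of frequency, or 12 equal-tempered semitones. The 200 Hz starting value is a hypothetical number, not a measurement of the clip, and this only illustrates the octave math in John's description, not whether the file really contains two layers.

```python
def shift_semitones(freq_hz, semitones):
    """Equal-tempered shift: every 12 semitones doubles or halves the frequency."""
    return freq_hz * 2.0 ** (semitones / 12.0)


# Hypothetical layer: the same voice pitched up one octave.
lower_layer_hz = 200.0
upper_layer_hz = shift_semitones(lower_layer_hz, 12)

# Pitching the whole clip down an octave brings the upper layer
# back to where the lower layer started.
brought_down_hz = shift_semitones(upper_layer_hz, -12)

print(upper_layer_hz, brought_down_hz)
```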
Yaley and Laurel are both right, because they're happening at the same time.
But wait, Yaley is still different than Yanny, which a lot of people hear too.
Yanny, I think, is the combo that people are hearing, but I mean, I think it's pretty clear that we're hearing Yaley here.
Yaly.
Yaly.
Yaly. I can even take it up a bit.
Yaly.
And then you can hear the point at which they start crossing right around here.
Yally.
So hearing both?
Yerry.
Yerry.
Yally.
Yally.
Yally.
My brain is breaking.
It's hard when things are solved right in front of you like magic.
I'm not giving up on Yanny, though, yet.
I'm not ready to give up on that.
You're telling me...
No, because I have clearly heard...
Yes, I agree.
That sounds like Yaley.
But I have heard Yanny.
And I have heard Laurel.
I want to tell you something as a fellow musician.
No, listen, sometimes you have two instruments playing together
and they kind of make this third sound. That's what's happening here.
But really, when you take it out, that's just a bass and a guitar.
Two things, doing two different parts, just at the same time.
So we're hearing, like, a new timbre because of two combined timbres.
Yes.
But you're saying that your theory of what it actually is, is Yaley.
Yeah, it's Yaley and Laurel at the same time with Yaley being pitched up an octave over it.
Which makes Yanny and Laurel.
Yes.
Solved.
Solved, or this is just the beginning and we're going to hear from a lot of people who say that now we're crazy.
Of course they're going to say that.
But that is very clearly Yaley.
That's Yaley.
I'm team Yaley now.
Yali.
Yaly.
Even if I go lower, it's even more clear.
Yali.
Lower.
Yeah.
Lower, it does sound more like Yaley than Yanny.
It becomes more separated from Laurel the more I pitch it down,
because then Laurel becomes so low you can't hear it.
Guys, I think my acid just kicked in.
No, but this is like a weird thing.
We should say this is very much like the gold and white or blue and black
dress debate, in which it's really about, partially, how people perceive sound. And in this case,
according to our homie, John Perotti, this is the case of two words being layered on top of
each other that actually end up creating a unique sound that can be
misheard by people, or heard by people, as two other different words, essentially. Is that what it is?
I think so. I guess my last question, John, would be: why is it that the first time I listened
to this audio file at all, I so clearly heard Yanny? Ten minutes later, same computer, same headphones,
same everything, I heard Laurel as clearly as I had originally heard Yanny. So why is it
that, with the same ears and all of the same equipment,
I'm hearing two very distinctly different things at different times?
I just think I'm not a neuroscientist, so I really can't.
Why not?
Yes, it just didn't try hard enough.
No, I just think it's like anything, right?
Like, you know, sometimes when you're listening to music
and you hear like rhythms, people will hear the rhythms differently.
It's just kind of where you jump in.
So if you just heard that Y sound first,
like, your brain does the rest of the work for you.
That's, I'm throwing that out there.
Mm-hmm.
And I now am more at the point where I hear both at the same time.
Yanny and Laurel, which you say at its core is really Yaley.
It is Yaley.
All right.
We're team Yaley.
Are we all team Yaley now?
We're all team Yaley.
Okay.
