Everything You Need To Know About Digital Audio Signals

Throw out what you think you know about digital audio signals and get your facts straight in 20 minutes with Monty.

This should be required viewing for music and audio people.

YouTube video description:

Monty at Xiph presents well-thought-out and clearly explained real-time demonstrations of sampling, quantization, bit depth, and dither on real audio equipment, using both modern digital analysis and vintage analog bench equipment.
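The dither part in particular is easy to play with in code: quantizing a very quiet tone to 16 bits without dither turns the error into harmonic distortion, while TPDF dither turns it into a benign, signal-independent noise floor. A rough numpy sketch, with tone level and the peak/mean metric chosen by me purely for illustration:

```python
import numpy as np

fs = 48_000
t = np.arange(fs) / fs
x = 1e-4 * np.sin(2 * np.pi * 1000 * t)        # a tone only a few 16-bit steps tall

q = 2 ** -15                                    # one 16-bit step (full scale = +/-1.0)
plain = np.round(x / q) * q                     # quantize with no dither
tpdf = (np.random.uniform(-0.5, 0.5, len(x)) +
        np.random.uniform(-0.5, 0.5, len(x))) * q
dithered = np.round((x + tpdf) / q) * q         # add TPDF dither, then quantize

for name, y in (("no dither", plain), ("TPDF dither", dithered)):
    err_spec = np.abs(np.fft.rfft((y - x) * np.hanning(len(x))))
    # undithered error piles up in a few harmonic spikes; dithered error is spread flat
    print(f"{name:12s} error spectrum peak/mean: {err_spec.max() / err_spec.mean():.0f}")
```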
grinter says...

That was excellent.
I got a stairstep on an oscilloscope coming out of an analog output a while ago. Spent a few days trying to fix it, and never could. I wonder if my 'analog' oscilloscope was actually digital?

MilkmanDan says...

This goes beyond my knowledge level of signals and waveforms, but it was very interesting anyway.

That being said, OK, I'm sold on the concept that ADC and back doesn't screw up the signal. However, I'm pretty sure that real audiophiles could easily listen to several copies of the same recording at different bitrates and frequencies and correctly identify which ones are higher or better quality with excellent accuracy. I bet that is true even for 16bit vs 24bit, or 192kHz vs 320kHz -- stuff that should be "so good it is impossible to tell the difference".

Since some people that train themselves to have an ear for it CAN detect differences (accurately), the differences must actually be there. If they aren't artifacts of ADC issues, then what are they? I'm guessing compression artifacts?

In a visual version of this, I remember watching digital satellite TV around 10-15 years ago. The digital TV signal was fine and clear -- almost certainly better than what you'd get from an analog OTA antenna. BUT, the satellites used (I believe) MPEG compression to reduce channel bandwidth, and that compression created some artifacts that were easy to notice once somebody pointed them out to you. I specifically remember onscreen people getting "jellyface" anytime someone would nod slowly, or make similar periodic motions. I've got a feeling that some of the artifacts that we (or at least those of us that are real hardcore audiophiles) can notice in MP3 audio files are similar to an audio version of that jellyface kind of issue.

PHJF says...

I feel like he just went through four years of audio engineering classes in twenty minutes. I may not have understood half of what he said, but it sure sounded interesting!

bmacs27 says...

I'm still worried about phase. The argument is that he can represent any phase he wants; I challenge him to represent different phases of his Nyquist frequency without the reconstruction losing power. He keeps saying "band limited", which I don't believe to be exactly true. I agree the ear can only detect power at frequencies below 22.1 kHz, but I'm not convinced its ability to detect phase shifts is limited in the way you would expect from a digital signal with a cutoff at that frequency.

For instance, the human ear can localize an impulse with accuracy down to about 10 microseconds. I can't see how a Dirac function can be localized that accurately by a sampled wave unless the system acted like a 100 kHz-sampled system.

The latter, IMHO, is supported by the neuroanatomy. There are different mechanisms for identifying pitch and onset. The quote-unquote Calyx of Held neurons carry the phase information and are designed to fire with astonishing precision, much more temporal precision than would be predicted from the "Nyquist frequency" of the place-coding subset of 8th-nerve ganglia. I understand that this is what he was trying to address with his bit at the end, but he kept insisting on "band limited" inputs. Pressure waves aren't band limited dodge-rammit.
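For what it's worth, sub-sample timing is not destroyed by sampling: a band-limited click delayed by 10 microseconds, well under the 44.1 kHz sample period, is still recoverable from the samples, because the delay appears as a phase ramp across the whole band. A rough numpy sketch, with every parameter chosen by me for illustration:

```python
import numpy as np

fs = 44_100.0                 # CD-style sample rate
delay_s = 10e-6               # 10 us shift, well under one sample period (~22.7 us)
n = 4096
t = (np.arange(n) - n // 2) / fs

def bandlimited_pulse(t, cutoff=20_000.0):
    # sinc pulse band-limited to `cutoff`, tapered with a Hann window so it's finite
    return np.sinc(2 * cutoff * t) * np.hanning(len(t))

a = bandlimited_pulse(t)              # reference click
b = bandlimited_pulse(t - delay_s)    # the same click, delayed by 10 us before sampling

# Estimate the delay from the 44.1 kHz samples alone: zero-pad the cross-spectrum
# (band-limited interpolation of the cross-correlation) and locate its peak.
up = 64
xc = np.fft.irfft(np.conj(np.fft.rfft(a)) * np.fft.rfft(b), n * up)
lag = int(np.argmax(xc))
if lag > n * up // 2:
    lag -= n * up
print(f"true delay {delay_s * 1e6:.1f} us, estimated {lag / (fs * up) * 1e6:.1f} us")
```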

hamsteralliance says...

Going from 16 bits to 24 bits will lower the noise floor, which you can hear ever so slightly if you have the audio turned up enough. It's not a huge difference and you're not going to hear it in a typical song. It's definitely there, but it's already insanely quiet at 16 bits. An "Audiophile" on pristine gear may notice the slight change in hiss in a moment of silence, with the speakers cranked up - but that's about it.

As for pushing up the sampling rate, once you get beyond 44.1 kHz you're not really dealing with anything musical anymore. All you're hearing, if you're hearing it at all, is "shimmer" or "air". It sounds "different" and you might be able to tell which is which, but it's one of those differences that doesn't really matter in practice. A 44.1 kHz track can still make ear-piercingly high frequencies - the added headroom just makes it glisten in a really inconsequential way.

This is coming from 17 years of music production. I've gone through all of this, over and over again, testing myself, trying to figure out what is and isn't important.

At the end of it all, I work on everything in 16-bit 48 kHz - I record audio files in 24-bit 48 kHz - then export as 16-bit 44.1 kHz. I don't enable dither anymore. I don't buy pro-audio sound cards anymore. I don't use "studio monitors" anymore. I just take good care of my ears and make music now.
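The noise-floor difference hamsteralliance describes lines up with the standard rule of thumb of roughly 6 dB per bit. A minimal numpy sketch (tone level and frequency are arbitrary choices of mine, and dither is deliberately left out):

```python
import numpy as np

fs = 48_000
t = np.arange(fs) / fs
x = 0.5 * np.sin(2 * np.pi * 1000 * t)       # a -6 dBFS 1 kHz tone

def quantize(x, bits):
    # round to the nearest step of a signed `bits`-bit grid (no dither)
    steps = 2 ** (bits - 1)
    return np.round(x * steps) / steps

for bits in (16, 24):
    err = quantize(x, bits) - x
    floor_db = 10 * np.log10(np.mean(err ** 2))      # error power re full scale
    print(f"{bits}-bit: noise floor ~{floor_db:.0f} dB "
          f"(textbook full-scale-sine SNR ~{6.02 * bits + 1.76:.0f} dB)")
```

The 16-bit floor lands around -100 dB and the 24-bit floor around -150 dB, which is why the difference only matters in near-silence with the gain cranked.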

MilkmanDan says...

Thanks for the reply and sharing your expertise -- sounds like you'd confirm everything that the video said.

This probably just displays my ignorance more, but specifically with regard to the MP3 format, do you think it adds any noticeable compression artifacts even at high-quality settings? Part of my problem was that I was thinking of MP3 *bit*rate as sampling rate (128 kbit/s = 128 kHz, which is not at all correct). But still, MP3 is a lossy format (obviously, since one can turn a 650 MB CD into ~60 MB of 128k MP3s, and the file-size savings are still large even at 320k), and even my relatively untrained ear can sometimes hear the difference at low (say, 128k or lower) bitrates.

I guess that a music producer wouldn't record/master anything in a compressed format like MP3, so that is sort of entirely separate from the point of this video and your comment. But just out of curiosity, do you think that people can detect differences between a 16-bit 44 kHz uncompressed digital recording (FLAC, maybe?) and a very high-quality MP3 (say, 320 kbit/s)?
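On the file-size arithmetic: the savings come straight from the bitrate ratio between CD PCM (about 1,411 kbit/s) and the MP3 encode. A quick back-of-the-envelope in Python, assuming a 74-minute disc:

```python
minutes = 74                                   # assumed album length
seconds = minutes * 60

cd_pcm  = 44_100 * 16 * 2                      # samples/s * bits * channels = 1,411,200 bit/s
mp3_128 = 128_000
mp3_320 = 320_000

for name, rate in (("CD PCM", cd_pcm), ("128k MP3", mp3_128), ("320k MP3", mp3_320)):
    print(f"{name:8s}: {rate * seconds / 8 / 1e6:5.0f} MB")
# CD PCM  :   783 MB    128k MP3:    71 MB    320k MP3:   178 MB
```

The roughly 11:1 ratio between the PCM rate and 128 kbit/s is where the tenfold-ish shrink comes from; at 320 kbit/s it is still better than 4:1.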

jmd says...

I am still going through his last video, in which I think he takes on compression, but I can tell you right now, as for MP3: it is CD quality, but it is not CD. Even at high bitrates, the high frequencies get hit hard. It is pretty sad we continue to let our music suffer, with a lot of people still compressing to MP3. If you look hard enough, though, you will find people using FLAC, and Apple's high-bitrate AAC files are great. Anime fansubs, which are probably more fickle about quality and standards than the Hollywood movie pirate scene, are now all using AAC in their MP4 files instead of bad old MP3. Although in its defense, MOST movie rips are AC3/DTS, or at least offer it alongside the MP3 stereo track.

hamsteralliance says...

tl;dr: No, not really and no, probably not.

-

MP3 compression methods are pretty good these days. A well-encoded MP3 sounds quite good at 224k. 320k is ideal, but 224k sounds fine to me.

I think most people would be incredibly hard pressed to tell the difference between a well encoded 320k MP3 and a FLAC file.

To showcase this and hopefully answer your question through demonstration, I've put together an odd sound file here for ya: http://dl.dropboxusercontent.com/u/837649/soundtest.wav

It's a 24-bit 48 kHz WAV file of a piece of bright and full audio thrown together just for this (using 24-bit 48 kHz audio sources). The audio loops a few times and each time it loops it's in a different format or quality.

The odd part is that I've dropped the audio volume down all the way to just barely above the 16-bit noise floor before exporting into each format, then cranked the volume back up again. Just to see what would happen.

Anyway, the play order is as follows:
1. Original 16-bit audio (sounds normal, as it should.)
2. 16-bit audio re-gained (noise city - the 16-bit FLAC was the same.)
3. 24-bit audio re-gained (Sounds as good as the original.)
4. FLAC 24-bit re-gained (Sounds as good as the original.)
5. MP3 8 k re-gained (What?)
6. MP3 64 k re-gained (Sounds like a bad MP3, because it is. But do note it's mostly just dull and a bit unstable sounding, not all weird like the 8k one.)
7. MP3 128 k re-gained (Pretty good, but still a bit dull. Not horrible though.)
8. MP3 224 k re-gained (Sounds as good as the original? Pretty close, I'd say.)
9. MP3 320 k re-gained (Sounds as good as the original as far as I'm concerned.)

This is just one test though. There are most certainly songs or sounds out there that wouldn't fare as well as this one. No idea what those would be though, as everything I've MP3-ified in the last decade or so sounds absolutely fine to me.
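If you want to poke at the gain-drop/re-gain trick without downloading the file, here's a rough numpy sketch of the idea behind items 2 and 3, using a stand-in tone and an assumed 90 dB drop (the MP3 legs need an external encoder and are omitted):

```python
import numpy as np

fs = 48_000
t = np.arange(fs * 2) / fs
x = 0.5 * np.sin(2 * np.pi * 440 * t)        # stand-in for the "bright and full" source

drop_db = 90.0                                # push the signal down near the 16-bit floor
gain = 10 ** (-drop_db / 20)

def to_bits(x, bits):
    steps = 2 ** (bits - 1)
    return np.round(x * steps) / steps        # plain rounding to the grid, no dither

for bits in (16, 24):
    y = to_bits(x * gain, bits) / gain        # attenuate, store at `bits`, re-gain
    err_db = 10 * np.log10(np.mean((y - x) ** 2))
    print(f"{bits}-bit round trip after a -{drop_db:.0f} dB drop: error {err_db:.1f} dB")
# The attenuated signal sits below one 16-bit step, so 16 bits comes back grossly
# degraded ("noise city"); at 24 bits roughly 8 bits of resolution remain above the
# floor, so it survives the trip.
```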

MilkmanDan says...

@hamsteralliance - Great comment and demo file, above and beyond the call of internet forum duty!

To my ears and with my speakers, I agree with your comments. I think that I can *just* distinguish between the 224k and 320k, but I don't have much confidence that I could do so reliably in a blind test. 128k versus 224k or 320k I think I could do with reasonably high accuracy. 320k versus the original -- I must admit I'd just be blindly guessing.

Again, thanks for the demo file and for going above and beyond to answer my question and let me (and anyone else here) see real results of the various settings for myself!

jmd says...

Hamster, I find it's heavy cymbal use, especially during fairly busy pieces, where MP3 falls apart. Even at 320k it still suffers from a warbling artifact.

charliem says...

Great video, but I'd like to see him expand this into digital modulation, a la QPSK/QAM, and how we use it to deliver data streams of whatever the hell we want.

He didn't focus enough on fundamental-frequency noise products either: carrier triple beat and second-order (and higher) distortions. Pretty important, because a single band can cause noise in heaps of places across the spectrum.

....I work in telecoms; this audio video seems like fundamentals to me.
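For readers who haven't met intermodulation products before, here's a small numpy sketch of charliem's point, with tone frequencies and nonlinearity coefficients invented for illustration: two clean carriers pushed through a mildly nonlinear stage spray energy across the spectrum.

```python
import numpy as np

fs = 96_000
t = np.arange(fs) / fs                        # one second of signal, 1 Hz bins
f1, f2 = 5_000.0, 7_000.0
x = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

# a mild second- and third-order nonlinearity, standing in for an amp or transmitter
y = x + 0.05 * x ** 2 + 0.02 * x ** 3

spec = np.abs(np.fft.rfft(y * np.hanning(len(y)))) / len(y)
freqs = np.fft.rfftfreq(len(y), 1 / fs)
print(np.unique(np.round(freqs[spec > 1e-4] / 1000)))
# the two input tones come out with components near 0, 2, 3, 5, 7, 9, 10, 12, 14,
# 15, 17, 19 and 21 kHz: harmonics plus sum and difference products
```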

jmd says...

charliem, my guess is that was not the intent. The issues you bring up seem to be a non-issue when dealing with digital audio editing and playback for music/movies on a PC, in either a consumer or professional environment. His first video touched on a number of subjects dealing with video formats, audio formats, containers, and how their streams are packed.

Januari says...

Yeshhhh... not too many videos make me feel as dumb as this one did. It seemed like he was trying to explain it as simply as possible, and I'd be lying if I said I wasn't completely lost by about the 3-minute mark.

CreamK says...

It's been tested, and the "best" audiophiles can't hear differences between 14-bit and 16-bit, nor can they hear differences between 44.1 kHz and ANYTHING higher. In some tests they could get away with 12-bit sound at a 36 kHz sampling frequency... The differences they hear are inside their heads; thus the descriptions of improved sound are always "air", "brilliance", "organic", etc. Don't be fooled by their fancy gear; most of it is for nothing. Cables: I am always willing to bet a month's salary on double-blind tests, a €10,000-per-metre cable against a coat hanger, no audible differences. It's all about confirmation bias: you think there's a change and suddenly you hear it.

About MP3s vs PCM:
Here we have audible differences. But: put in high enough energy, i.e. turn your amp up high enough, and suddenly double-blind studies can't find which is which. It can be audible, though; MP3 is a lossy format, and even 320 kbps can be heard. Not with all material; it's right at the limits of human hearing. Some might hear high-end loss if they're in their twenties. Once you hit 40, everything above 17 kHz is gone, forever. You will never hear 20 kHz again. And to really notice the difference, you need good gear. Your laptop's earphone output most likely won't even output anything past 18 kHz well, and its dynamic range can be represented with 8-bit depth. It can be just horrible. Fix that with a USB box, around €80: you can take that box anywhere on the planet to the most "hi-fi" guy out there and he can't hear the difference between it and his €10,000 A/D converter. In fact, a €5 A/D converter can produce the same output as a €3,000 one... That's not why I said to buy an external box; it's more to do with RF and other shielding, and protection against the noise a computer makes, than with A/D conversion quality. Note, I'm talking about audible differences; you can find faults with measuring equipment, and 95% of the gear price is about "just to be sure".

If you want good sound, first treat your room: dampen it, shape it. If you spent 10k on a stereo and 0 on acoustics, you will not have good sound no matter what you do. Spend the same amount on acoustics as you do on your equipment; the room makes a lot more difference than the gear. Next come speakers; they are the weakest link in the chain by a large margin. Quality costs, but I still wouldn't go to extremes here either; the changes are again "just to be sure", not always audible. Then amps: beefy, low noise, class A/B. You don't need to spend a huge amount of money, but some. Then cables: take the €50 version instead of the €300 or €3,000 one. Build quality, connectors, durability - those are the reasons to buy something more expensive than €5, not sound quality. There will always be a group of people who swear they can hear the differences; that's bullcrap. The human ear CANNOT detect any changes, and even meters have a REALLY hard time detecting any. You need to either amp the signal up to the saturation point, or use frequencies in the MHz range, thousands of times higher than what media needs, to get any change between the cheapest crap and the high-end scams.

Audiophiles can't be convinced they are wrong; they suffer from the same thing anti-vax people do: give them facts and they will become even more convinced they are right.
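CreamK's high-frequency point is easy to test on yourself: render a slow sine sweep through the 8-22 kHz range and note where it vanishes for you. A throwaway sketch using Python's standard-library wave module (sweep range and level are my own choices; keep the playback volume modest):

```python
import numpy as np
import wave

fs = 48_000
dur = 10.0
t = np.arange(int(fs * dur)) / fs

# linear sweep from 8 kHz to 22 kHz; note the point where it seems to disappear
f0, f1 = 8_000.0, 22_000.0
phase = 2 * np.pi * (f0 * t + (f1 - f0) * t ** 2 / (2 * dur))
x = 0.25 * np.sin(phase)                      # about -12 dBFS

with wave.open("sweep.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)                         # 16-bit PCM
    w.setframerate(fs)
    w.writeframes((x * 32767).astype(np.int16).tobytes())
```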

VoodooV says...

I only got about a third of the way through it before it went over my head. Still too much jargon that I didn't understand.

I need the 101 class
