How It Works: Audio Compression

The term “compression” is often a source of confusion when discussing digital music, because there are two kinds of compression. The first is the kind used to reduce the size of files; this is data compression. It comes in two forms: lossy compression, used with MP3 and AAC files, and lossless compression, used with the FLAC and Apple Lossless formats.
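
To make the lossy/lossless distinction concrete, here is a minimal Python sketch. It is only an illustration under some loud assumptions: the test tone is made up, zlib stands in for a lossless codec like FLAC, and throwing away the low bits of each sample is a crude stand-in for lossy coding (MP3 and AAC actually use perceptual models, which this does not attempt).

```python
import math
import zlib

# Hypothetical 16-bit test tone (not real music).
samples = [round(20000 * math.sin(2 * math.pi * i / 50)) for i in range(1000)]
raw = b"".join(s.to_bytes(2, "little", signed=True) for s in samples)

# Lossless: decompressing gives back exactly the bytes we started with.
# zlib is an assumption here, standing in for a codec such as FLAC.
packed = zlib.compress(raw)
restored = zlib.decompress(packed)

# Crude lossy stand-in: discard the low 8 bits of every sample.
# The data gets simpler to store, but the original values are gone for good.
lossy = [(s >> 8) << 8 for s in samples]

print("lossless round trip identical:", restored == raw)
print("lossless size:", len(packed), "of", len(raw), "bytes")
print("lossy samples identical:", lossy == samples)
```

The point is only that the lossless round trip is bit-for-bit identical, while the lossy version cannot be turned back into the original.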

But the other kind of compression, dynamic range compression, is the much-derided method of limiting the amount of dynamic range in music. The point of dynamic range compression is to reduce the difference between the quietest parts of a piece of music and the loudest parts. Most music is compressed as part of the recording and mastering process, because moderate compression does sound a lot better, and keeps you from blowing out your speakers. But over-compressing music makes it sound like crap.
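
Here is a rough sketch, in Python, of what reducing that difference means in numbers. It assumes the simplest possible compressor: a fixed threshold and ratio applied sample by sample, with no attack or release smoothing. The function name, the -20 dB threshold and the 4:1 ratio are my own illustrative choices, not settings from any real mastering tool.

```python
import math

def compress_sample(x, threshold_db=-20.0, ratio=4.0):
    """Static downward compressor for one sample in [-1, 1] (hypothetical helper)."""
    if x == 0.0:
        return 0.0
    level_db = 20.0 * math.log10(abs(x))
    if level_db <= threshold_db:
        return x  # below the threshold: left alone
    # Above the threshold: every `ratio` dB of input becomes 1 dB of output.
    out_db = threshold_db + (level_db - threshold_db) / ratio
    return math.copysign(10.0 ** (out_db / 20.0), x)

quiet, loud = 0.05, 1.0  # made-up sample values
before_db = 20 * math.log10(loud / quiet)
after_db = 20 * math.log10(compress_sample(loud) / compress_sample(quiet))
print(f"quiet-to-loud gap before: {before_db:.2f} dB, after: {after_db:.2f} dB")
```

With these assumed settings, the quiet sample passes through untouched and the loud one is pulled down, so the gap between them shrinks. A real compressor then usually adds make-up gain, which is what makes the whole track louder.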

The best way to understand dynamic range compression is to look at a couple of audio waveforms. The screenshots below were made using Rogue Amoeba’s Fission audio editor.

Here’s a song which is free on iTunes today. I chose this one because, well, any free pop single is likely to be heavily compressed, and this example shows that I’m not wrong.

[Screenshot 001.png: Fission waveform of the free pop single]

You can see two things in this waveform. The first is that the song is almost universally loud; the height of the waves shows the loudness. The second thing to notice is that there is a lot of clipping: audio that hits the top of the available range and gets cut off there. This is bad. As Wikipedia says:

Music which is clipped experiences amplitude compression, whereby all notes begin to sound equally loud because loud notes are being clipped to the same output level as softer notes.
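
A small Python sketch of that effect, using made-up sine waves as stand-ins for a softer and a louder note. The helper names and the amplitudes (1.5 and 3.0, both above a limit of 1.0) are assumptions chosen just to show the idea.

```python
import math

def sine(amplitude, n=100, cycles=2):
    """A hypothetical 'note': a plain sine wave at the given amplitude."""
    return [amplitude * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]

def hard_clip(samples, limit=1.0):
    """Cut off anything beyond +/- limit, as a full-scale digital signal would."""
    return [max(-limit, min(limit, s)) for s in samples]

def clipped_fraction(samples, limit=1.0):
    """Share of samples sitting at the limit."""
    return sum(1 for s in samples if abs(s) >= limit) / len(samples)

softer_note = hard_clip(sine(1.5))
louder_note = hard_clip(sine(3.0))

print("peaks after clipping:", max(softer_note), max(louder_note))
print("fraction clipped:", clipped_fraction(softer_note), clipped_fraction(louder_note))
```

Both notes come out with the same peak level, even though one went in twice as loud as the other; the louder one simply spends more of its time flattened against the limit.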

Excessive compression has led to what is known as the loudness wars. This is when record producers make their songs louder and louder so they stand out against other songs. Generally, the human brain perceives louder music to be better, so additional loudness can make a song more compelling. But, in the end, all this has done is make lots of loud, clipped songs.

Here’s an example of a song which is not heavily compressed. This is Pink Floyd’s Wish You Were Here:

[Screenshot 002.png: Fission waveform of Pink Floyd’s Wish You Were Here]

You can see the difference in two places in this screenshot. In the overall timeline at the top of the window, you can see that the music has a shape; in the first screenshot of the free pop single, it’s just one long mass of sound. And in the actual waveform, you can see that there is modulation, and no clipping, in the Pink Floyd song.

The difference is that you may play your Pink Floyd song at a louder volume, in order to hear the quiet parts of the song, but the louder parts will be, well, loud. In the first song, the entire song is loud, and you’re likely to become fatigued more quickly after listening to music like that.
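
If you want a number rather than a picture, one rough proxy is the crest factor: the ratio of a signal's peak to its average (RMS) level, in decibels. The Python sketch below uses synthetic signals, not the actual songs above, and the "dynamic" and "crushed" versions are my own made-up examples.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB; lower values mean 'loud all the time'."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak / rms(samples))

n = 1000
tone = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]

# Hypothetical dynamic track: quiet first half, loud second half.
dynamic = [(0.1 if i < n // 2 else 1.0) * s for i, s in enumerate(tone)]

# Hypothetical crushed track: boosted by 4x, then clipped at full scale.
crushed = [max(-1.0, min(1.0, 4.0 * s)) for s in tone]

for name, sig in [("dynamic", dynamic), ("tone", tone), ("crushed", crushed)]:
    print(f"{name}: crest factor {crest_factor_db(sig):.2f} dB")
```

The crushed signal ends up with the smallest crest factor and the dynamic one with the largest, which matches what the two screenshots show: one long mass of sound versus music with a shape.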

For good examples of audio that is not compressed – or only very slightly – watch a movie. In general, movie audio is not compressed; this is why the dialog is often too soft, but the special effects are too loud. This is why you often need to adjust the volume for movies with lots of explosions, otherwise your ears hurt. (You may have an AV receiver which has a dynamic range compression feature; if you’ve turned this on, you may not hear such large differences in volume.)

Dynamic range compression isn’t a bad thing; it’s just bad when it’s overdone, as is the case in much popular music today.

10 thoughts on “How It Works: Audio Compression”

  1. Check out the waveform of Donovan’s wonderful Atlantis. That’s a song that really would be destroyed by amplitude compression…

  3. After finding a pile of my old CDs in the garage a while back, I compared the dynamic range of a large portion of my album collection with their remastered counterparts. Across the board, the earlier releases had better dynamic range. I think this is a big reason that albums I used to listen to for long periods of time never caused listening fatigue, but fatigue sets in quickly when I listen to the remastered versions of the same tracks/albums. The remasters also sounded very flat and dull compared to the original releases, likely because everything is pretty much at the same loud volume in the remasters, with little to no “headroom.” In the end I’m glad I was able to find my original collection of pre-loudness war era CDs. Unfortunately, there isn’t much I can do about the poor dynamic range & production of new releases, other than vote with my wallet and support artists that pay attention to these things, or just listen to what I already have.

  5. There are good reasons to employ audio (amplitude) compression; some are commercially driven and some are to improve the playback experience. As for the playback experience, virtually nobody listens to music in a serenely quiet environment, and there is a need to raise the level of low-level content above the noise, be it in a car, the home or wherever. Broadcasters are challenged by overly dynamic content and invariably require some compression. Recording and mastering engineers are always working to strike the right balance to make sure that what they produce sounds good on the bulk of playback systems, ranging from smartphones to car stereos to mainstream systems. People with audiophile-quality systems are not given priority in terms of target audio quality, and compromises are not limited to amplitude compression. I really would like it if there wasn’t so much deriding of perceptual encoding through the term lossy compression, though; criticizing the removal of content that is not reliably detectable by listeners is missing the mark. The greatest impact on the quality of the audio experience has more to do with the things brought to light by Kirk.

  7. In defense of the loudness wars, the ultimate problem is the technical limitations in consumer playback systems. It’s not the music producers who are to blame, it’s the CD player makers. And today’s music is still mastered for CD.

    If you look at your example of listening to a movie and the dialogue and music has too high a dynamic range, the fix for that is to put a dynamic range adjustment in the playback system. So you give the consumer the full dynamic range in the content, and their playback system enables them to apply dynamic range compression to suit their room and taste and playback volume. That hasn’t existed on consumer music players until recent iPods, and that feature — SoundCheck — is not enabled yet by default because it also requires that the music content be remastered with a high dynamic range. That is what Apple’s “Mastered for iTunes” program is about. Apple is asking music producers to stop giving them 16-bit 44.1kHz (CD mastered) audio with the dynamic range totally crushed, and instead provide 24-bit 96kHz (music studio standard) audio with the dynamic range intact. Then SoundCheck can be enabled by default on all Apple players and listeners can hear all the songs in a playlist at a similar volume while still enjoying a much larger dynamic range.

    Keep in mind that in music, people often listen to 10 artists an hour, whereas in movies, it is one artist every 2 hours. If you have to adjust the volume control at the beginning of every new movie, it is not really a problem. Same with CD to some extent because you were more likely to listen to just one album in an hour. But if you are listening to an MP4 playlist of 10 artists and you have to adjust the volume control at the beginning of every new song, it is a problem. The listener would be riding the volume control the whole time they listen.

    In the early days of the Web, the browsers could only show 256 colors at a time. So the Web always looked posterized. It wasn’t that Web developers wanted to suck all the colors out of their 16 million color images and show them to Web readers in only 256 colors — that is what the technical limitations of the browsers demanded of them. It is the same with music producers crushing the dynamic range of their music. When there is a consumer playback system that can play 24-bit 96kHz full dynamic range music from 10 different artists as a listenable mix, then we will have surpassed those technical limitations. Just like the Web of today can now show you 16 million colors.

    In short, it all comes down to the fact that we are still in the CD era of music, which is 1980 technology. Even though we are using MP4 files, they are all basically rips of a CD. We have to move the producers and the music players into the post-CD era to get past the problems of the CD era such as the loudness wars.

  9. I often run sound for beach weddings through small, battery powered speakers. They don’t have a lot of volume (SPL kind… not spatial) because the internal amp is powered by a small battery. So I *heavily* compress (in Soundtrack Pro) all the music that will be playing from an iPod (I also EQ out the lows since the speaker cone isn’t very large). That way I get significantly louder music on the beach, which competes with the wind and the waves much better, and I don’t have to worry too much about quality because of the background noise.
