Lossy compression
Lossy audio compression is used in an extremely wide range of applications. In addition to direct applications such as MP3 players and computers, digitally compressed audio streams are used in most video DVDs, digital television, streaming media on the internet, satellite and cable radio, and increasingly in terrestrial radio broadcasts. Lossy compression typically achieves far greater compression than lossless compression (output of 5-20% of the size of the original stream, rather than 50-60%) by simplifying the data. Because bandwidth and storage are always limited, in many applications the reduction in audio quality is outweighed by the ability to transmit or store more information. (For example, one can fit many more songs on an iPod using lossy rather than lossless compression, and a DVD might hold several lossy audio tracks in the space needed for one lossless audio track.)
In both lossy and lossless compression, information redundancy is reduced, using methods such as coding, pattern recognition and linear prediction to reduce the amount of information used to describe the data. For example, suppose you wanted to record twenty house numbers along one side of a street, each 2 greater than the one before. If the first address is 14461, which has five digits, the uncompressed stream would need 20 times 5 bytes, or 100 bytes, counting one byte per digit. You could instead take advantage of the repetition and simply say: begin at 14461, increase by 2, repeat 19 times. Now the data are captured losslessly in just 8 bytes (five digits for the starting number, one for the step and two for the repeat count).
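The house-number example can be sketched in a few lines of Python. This is only an illustration of the idea (store a start value plus runs of equal differences), not a real audio coding scheme; the function names `encode` and `decode` are hypothetical.

```python
def encode(numbers):
    """Describe a sequence as a start value plus runs of (step, count)."""
    start = numbers[0]
    runs = []
    for prev, cur in zip(numbers, numbers[1:]):
        step = cur - prev
        if runs and runs[-1][0] == step:
            # Same difference as the previous pair: extend the current run.
            runs[-1] = (step, runs[-1][1] + 1)
        else:
            runs.append((step, 1))
    return start, runs


def decode(start, runs):
    """Rebuild the original sequence exactly from the description."""
    out = [start]
    for step, count in runs:
        for _ in range(count):
            out.append(out[-1] + step)
    return out


# Twenty house numbers, each 2 greater than the one before.
houses = [14461 + 2 * i for i in range(20)]
start, runs = encode(houses)   # start is 14461, runs is [(2, 19)]
restored = decode(start, runs)  # identical to houses: nothing was lost
```

Because `decode` reproduces the input exactly, this description is lossless: the saving comes purely from removing redundancy.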
The innovation of lossy audio compression was to use psychoacoustics to recognize that not all data in an audio stream can be perceived by the human auditory system. Most lossy compression reduces perceptual redundancy by first identifying sounds which are considered perceptually irrelevant, that is, sounds that are very hard to hear. Typical examples include high frequencies, or sounds that occur at the same time as other louder sounds. Those sounds are coded with decreased accuracy or not coded at all.
While removing or reducing these 'unhearable' sounds may account for a small percentage of the bits saved in lossy compression, the real savings come from a complementary phenomenon: noise shaping. Reducing the number of bits used to code a signal increases the amount of noise in that signal. In psychoacoustics-based lossy compression, the key is to 'hide' the noise generated by the bit savings in areas of the audio stream where it cannot be perceived. This is done, for instance, by using very few bits to code the high frequencies of most signals, not because the signal has little high-frequency information (though this is often true as well), but because the human ear can perceive only very loud signals in this region, so that softer sounds 'hidden' there, including the added noise, simply aren't heard.
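The link between bit count and noise can be illustrated with a minimal sketch. It assumes a plain uniform quantizer and a 440 Hz test tone sampled at 44.1 kHz, both chosen only for illustration; it shows the noise that fewer bits introduce, not the noise shaping a real codec would apply to hide it.

```python
import math


def quantize(x, bits):
    """Uniformly quantize a sample in [-1, 1] using the given number of bits."""
    levels = 2 ** (bits - 1)
    return round(x * levels) / levels


def snr_db(signal, bits):
    """Signal-to-noise ratio in decibels after quantizing to `bits` bits."""
    noise = [s - quantize(s, bits) for s in signal]
    signal_power = sum(s * s for s in signal)
    noise_power = sum(n * n for n in noise)
    return 10 * math.log10(signal_power / noise_power)


# Hypothetical test tone: 0.1 s of a 440 Hz sine at a 44.1 kHz sample rate.
signal = [0.9 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]

for bits in (16, 8, 4):
    # The ratio falls as bits are removed, i.e. the noise grows.
    print(bits, "bits:", round(snr_db(signal, bits), 1), "dB")
```

A perceptual codec accepts this extra noise and then tries to place it where the ear is least sensitive.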
To illustrate this by continuing with the example, suppose the data were more complex, with a difference of 4 in one instance, between the tenth and eleventh houses, and 2 everywhere else. Lossless coding would require something like this: begin at 14461, increase by 2, repeat 9 times, increase by 4, increase by 2, repeat 9 times. So 10 bytes, rather than 8, are needed to store the data. But if your model of lossy compression determines that the difference is not relevant for the application, it might simplify the data to ignore the variation and increase the compression. Some data are lost in the process, however, because the original data cannot be reconstructed from the lossy compression scheme; only an approximation of that data, judged sufficient for the application, can be recovered.
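A minimal sketch of the lossy variant follows. It assumes a deliberately crude "model" that keeps only the most common step and discards any deviation from it; the names `lossy_encode` and `decode` are hypothetical and chosen for illustration.

```python
from collections import Counter


def lossy_encode(numbers):
    """Keep only the start, the most common step and the number of steps."""
    steps = [b - a for a, b in zip(numbers, numbers[1:])]
    common_step = Counter(steps).most_common(1)[0][0]
    return numbers[0], common_step, len(steps)


def decode(start, step, count):
    """Rebuild an approximation of the sequence from the lossy description."""
    return [start + step * i for i in range(count + 1)]


# Twenty house numbers with a gap of 4 between the tenth and eleventh houses.
first_ten = [14461 + 2 * i for i in range(10)]
last_ten = [first_ten[-1] + 4 + 2 * i for i in range(10)]
houses = first_ten + last_ten

approx = decode(*lossy_encode(houses))

# The first ten numbers survive exactly; the rest are each off by 2.
max_error = max(abs(a - b) for a, b in zip(houses, approx))
```

The description is as short as in the simple case, but `approx` no longer equals `houses`, and nothing in the encoded form allows the original to be recovered.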
If reducing perceptual redundancy does not achieve sufficient compression for a particular application, further lossy compression may be required, with a loss of quality that a user can perceive more readily. Most lossy compression schemes allow compression parameters to be adjusted to achieve a target data rate, usually expressed as a bit rate. Again, the data reduction is guided by some model of how important each sound is as perceived by the human ear, with the goal of efficiency and optimized quality for the target data rate. (Many different models are used for this perceptual analysis, and some are better suited to particular types of audio than others.) Hence, depending on the bandwidth and storage requirements, the use of lossy compression may result in a perceived reduction of audio quality that ranges from none to severe. Of course, that trade-off is usually intentional.
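The effect of a target bit rate on storage is simple arithmetic, sketched below. The uncompressed figure assumes CD-quality audio (44.1 kHz, 16 bits, two channels); the 128 kbit/s target and the four-minute duration are assumptions chosen only for illustration.

```python
def size_bytes(bit_rate_bps, seconds):
    """Size in bytes of a stream with a constant bit rate."""
    return bit_rate_bps * seconds / 8


# Uncompressed CD-quality audio: 44,100 samples/s x 16 bits x 2 channels.
PCM_BPS = 44_100 * 16 * 2      # 1,411,200 bit/s
LOSSY_BPS = 128_000            # hypothetical target bit rate

seconds = 4 * 60               # a four-minute song

pcm = size_bytes(PCM_BPS, seconds)      # 42,336,000 bytes
lossy = size_bytes(LOSSY_BPS, seconds)  # 3,840,000 bytes
ratio = lossy / pcm                     # about 0.09, i.e. roughly 9% of the original
```

Under these assumptions the lossy stream falls within the 5-20% range quoted earlier for typical lossy compression.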
Because data are removed during lossy compression and cannot be recovered by decompression, some people prefer not to use lossy compression for archival storage. Hence, as noted, even those who use lossy compression (for portable audio applications, for example) may wish to keep a losslessly compressed archive for other applications. In addition, compression technology continues to advance, and taking advantage of a state-of-the-art lossy codec requires starting again from the lossless, original audio data and compressing it with the new codec. This is because the nature of lossy compression (for both audio and images) results in increasing degradation of quality each time data are decompressed and then recompressed using lossy compression.