I’ve spent the last 90 minutes reading a bunch of DSP geeks at dsprelated.com argue about whether 192 kHz audio is worthwhile.
Here are a few references for context: the Nyquist-Shannon sampling theorem and pulse-code modulation.

Many argue that since most humans can’t perceive sounds above 20-22 kHz, it doesn’t make sense to sample at more than twice that rate; CD audio is defined as 44.1 kHz. The Nyquist-Shannon theorem does say that sampling at more than twice the highest frequency present is sufficient to capture a signal, but only under ideal conditions. The input must be perfectly bandlimited before the ADC to avoid aliasing, and the DAC must reconstruct its output with ideal sinc interpolation to avoid interpolation error. Real filters are not ideal, and increasing the sample rate mitigates the resulting error by leaving more room between the audible band and half the sample rate.
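To make the aliasing point concrete, here is a minimal Python sketch. The function names (`alias_frequency`, `sample_sine`) and the 30 kHz test tone are my own choices for illustration. It shows that a tone above half the sample rate, if it is not filtered out before sampling, produces exactly the same sample values (up to a sign flip) as a lower tone inside the audible band.

```python
import math

def alias_frequency(freq_hz, sample_rate_hz):
    """Return the frequency in [0, sample_rate/2] that freq_hz folds onto."""
    folded = freq_hz % sample_rate_hz
    if folded > sample_rate_hz / 2:
        folded = sample_rate_hz - folded
    return folded

def sample_sine(freq_hz, sample_rate_hz, count):
    """Sample a unit-amplitude sine wave at the given rate."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]

fs = 44100                      # CD sample rate
tone_hz = 30000                 # hypothetical ultrasonic tone, above fs/2
alias_hz = alias_frequency(tone_hz, fs)

tone = sample_sine(tone_hz, fs, 16)
alias = sample_sine(alias_hz, fs, 16)

# The two sample lists are mirror images: tone[n] == -alias[n].
max_diff = max(abs(a + b) for a, b in zip(tone, alias))
print(alias_hz, max_diff)
```

Once sampled, the two tones cannot be told apart, which is why the filtering has to happen before the ADC.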

There’s certainly a reason that many prefer the sound of vinyl to that of a CD, and the term most people use to describe the difference is “warmth”. To my ears, frequencies in the mid range (200-800 Hz) can end up sounding muddy on a CD. A CD stores discrete samples, and a simple DAC holds each one until the next, producing a piecewise staircase that its output filter then has to smooth (high-end DACs do this better), whereas the information stored on vinyl is a continuous analog curve. To put it in simple terms, the more samples there are, the closer that raw staircase comes to the analog curve before any filtering; visualize the steps as shallower and closer together, with more intermediate steps.
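For comparison, here is a rough sketch of the gap between the staircase picture and the ideal sinc reconstruction mentioned earlier. The test tone, the sample rate and the number of samples are arbitrary assumptions, and the sum is truncated, so the result is only approximate. It estimates the signal halfway between two samples, once by holding the previous sample and once by summing sinc functions.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, first_index, position):
    """Truncated Whittaker-Shannon interpolation at a fractional sample position."""
    return sum(value * sinc(position - (first_index + i))
               for i, value in enumerate(samples))

fs = 8000.0        # assumed sample rate for the demo
freq = 1000.0      # assumed test tone, well below fs / 2
half_span = 500    # samples kept on each side of index 0

samples = [math.sin(2 * math.pi * freq * n / fs)
           for n in range(-half_span, half_span + 1)]

position = 0.5     # halfway between sample 0 and sample 1
true_value = math.sin(2 * math.pi * freq * position / fs)
held_value = samples[half_span]                       # staircase: sample 0 held
sinc_value = reconstruct(samples, -half_span, position)

print(true_value, held_value, sinc_value)
```

The held value misses the true value badly, while the sinc sum lands close to it even though no sample was taken at that instant. Real DACs only approximate this filter, which is where the argument for extra headroom comes from.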

I suspect the middle frequencies get muddied up because it takes a series of samples to represent any one frequency, and at any given moment in a piece of music many frequencies must be reproduced simultaneously. Picturing a histogram helps: the result of a discrete Fourier transform has only so many bins. If a frequency is not an integer multiple of the bin spacing (the sample rate divided by the number of samples analyzed), its energy spreads into the neighbouring bins; spectral leakage, I believe it’s called. Strictly speaking, the bin spacing depends on how long a stretch of audio is analyzed, so a higher sample rate over the same stretch adds bins above the old upper limit rather than between the existing ones. Finer frequency resolution comes from longer analysis windows, and that matters most for music that involves non-equal-temperament tunings.
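The spreading into adjacent bins described above (usually called spectral leakage) is easy to reproduce. This is a small sketch using a deliberately naive DFT and toy numbers of my own choosing: 64 samples at 64 Hz, so that each bin is exactly 1 Hz wide. One tone sits exactly on a bin and the other sits halfway between two bins.

```python
import cmath
import math

def bin_spacing(sample_rate_hz, num_samples):
    """Width of one DFT bin in Hz."""
    return sample_rate_hz / num_samples

def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns magnitudes for bins 0 .. n/2."""
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples)))
            for k in range(n // 2 + 1)]

def sine(freq_hz, sample_rate_hz, count):
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(count)]

def occupied(magnitudes, threshold=1.0):
    """Indices of bins holding a noticeable amount of energy."""
    return [k for k, m in enumerate(magnitudes) if m > threshold]

fs, count = 64, 64                                  # toy values: 1 Hz per bin
on_bin = dft_magnitudes(sine(8.0, fs, count))       # exactly on bin 8
off_bin = dft_magnitudes(sine(8.5, fs, count))      # between bins 8 and 9

print(occupied(on_bin))
print(occupied(off_bin))

# One second of audio gives 1 Hz bins at either sample rate.
print(bin_spacing(44100, 44100), bin_spacing(192000, 192000))
```

The on-bin tone lands in a single bin, while the off-bin tone smears across many. The last line shows that, for the same duration of audio, the bin spacing does not change with the sample rate.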

Another fine example of reduced audio quality is MP3 compression. Some may argue that the frequency response of an MP3 is identical to that of the uncompressed audio, but IMHO that’s not a very good argument. One only has to crunch audio down to a low bitrate to hear the horrors of MP3 compression and what it does to the signal. That weird tinny sound in the upper frequency range is the result. It can be reduced by discarding (bandlimiting) the higher frequencies before encoding, but MP3 compression is clearly lossy, so any argument to the contrary is absurd. MP3 “gets rid” of “things we normally can’t hear” by transforming the audio into the frequency domain and quantizing the components more coarsely wherever a psychoacoustic model predicts we won’t notice.

Ultimately I think it boils down to what people enjoy. Personally I think there’s room for more research in sampling. It’s not that Nyquist was wrong, just that most people take for granted the part about things needing to be “ideal”. Multi-channel audio is certainly better than monophonic or stereophonic audio, and I think it’d be nice to see an audio format that stores different frequency ranges in different channels, like the subwoofer channel on a DVD. Let’s have four channels for four ranges of frequencies (low, low-mid, mid-high, high), and two sets of those to make stereo. Or maybe use a single channel for the lowest range, since directionality is less perceptible at those frequencies, for a total of seven channels. Or even crazier, what if we separated the channels by odd/even frequencies?

One last thought about the issue of higher resolution: digitized audio can be described as a sum of periodic functions. Not too long ago I set about improving my sub-frequency sine-wave samples for use in music production. When generating sine waves at low frequencies in equal temperament, I ran into the issue of period variance. Each sample had to be trimmed differently because the zero-crossings varied: the period of most notes is not a whole number of samples, so the crossings fall between sample points. Like an irrational number, some periods simply cannot be represented by a finite series of digits.
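Here is a rough sketch of that trimming problem. The helper names are hypothetical, and I am assuming the usual convention of MIDI note 69 = A4 = 440 Hz at a 44.1 kHz sample rate. It computes how many samples one period of a note takes, then searches for the number of whole cycles whose total length comes closest to a whole number of samples.

```python
def note_frequency(midi_note, a4_hz=440.0):
    """Equal-temperament frequency, assuming MIDI note 69 is A4."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)

def period_in_samples(freq_hz, sample_rate_hz):
    """Length of one cycle measured in samples (usually not an integer)."""
    return sample_rate_hz / freq_hz

def best_loop(freq_hz, sample_rate_hz, max_cycles):
    """Find the cycle count whose length is nearest to a whole number of samples."""
    best = None
    for cycles in range(1, max_cycles + 1):
        length = cycles * sample_rate_hz / freq_hz
        error = abs(length - round(length))
        if best is None or error < best[0]:
            best = (error, cycles, round(length))
    return best[1], best[2]

fs = 44100
a1 = note_frequency(33)          # A1, 55 Hz
a_sharp1 = note_frequency(34)    # A#1, an irrational frequency

print(period_in_samples(a1, fs))         # not a whole number of samples
print(best_loop(a1, fs, 20))             # cycles and samples for a clean loop
print(period_in_samples(a_sharp1, fs))   # no exact loop length exists
```

A1 happens to loop cleanly after 11 cycles because 55 Hz divides evenly into the sample rate once enough cycles are taken. The neighbouring semitone never does, so any trim point is a compromise.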

Some argued that it’s all about taking money from naive consumers, and I think that’s an unfair remark. Yes, the record industry is out to make a profit, and I disagree heavily with some of the legal tactics they’ve used to achieve that goal, but offering audio sampled at a higher resolution isn’t dishonest in the least. For now, in my book, analog ultimately wins the quality battle, and one can only hope that more research will be undertaken by those who care equally about empirical quality and listener satisfaction.

