CD Audio Sampling and Reconstruction

CD audio in a nutshell

The original audio signal is sampled at a rate of 44.1kHz. Each sample is digitised, using an analogue to digital converter (ADC) with 16-bit resolution. By clever means this binary data is stored on a plastic disc. In the CD player a laser diode reads off the data, and via various decoding and error correction methods, the original 16-bit data is accurately recovered. This data is fed to a digital to analogue converter (DAC) which recovers the original audio signal. This is then amplified in the usual way before going to loudspeakers or headphones.

The big complication comes from a fundamental restriction on sampling. The Nyquist criterion says that the original signal can only be recovered if the sampling rate is more than twice as high as the highest frequency component in the audio. Putting it the other way round, for a given sampling rate the audio must stay below the Nyquist limit of half the sampling rate. So at a sampling rate of 44.1kHz, as used by CD, the audio must stay below 22.05kHz. A low-pass filter is needed to ensure this. Ideally it should pass everything up to about 20kHz, attenuate anything from 20 to 22.05kHz, and completely block anything above 22.05kHz. This is called the anti-aliasing filter. Any signals above 22.05kHz getting through will produce a digital output which is indistinguishable from that produced by a lower frequency, reflected around the Nyquist limit. So a signal at 23kHz would look exactly like one at 21.1kHz. This is aliasing. No filter is perfect, and the constraints of causality mean that a sharp amplitude response causes problems with phase response. Nevertheless, we can assume that a CD contains little signal above 20kHz and it cannot carry any signal above 22.05kHz.
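The reflection around the Nyquist limit can be put into a few lines of Python. This is only a minimal sketch: the function name alias_frequency is made up for illustration, and it assumes ideal sampling with no anti-aliasing filter in front.

```python
def alias_frequency(f_hz, fs_hz=44100.0):
    """Frequency (Hz) that a tone at f_hz looks like after ideal sampling at fs_hz."""
    # Sampling cannot tell f apart from f plus any multiple of fs.
    f = f_hz % fs_hz
    nyquist = fs_hz / 2.0
    # Anything in the upper half of the band is reflected around the Nyquist limit.
    return fs_hz - f if f > nyquist else f


print(alias_frequency(23000.0))  # 21100.0
print(alias_frequency(5000.0))   # 5000.0 (already below the limit, so unchanged)
```

The 23kHz case reproduces the 21.1kHz figure quoted above.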

An issue similar to aliasing arises in the CD player. The bare output from the DAC includes both the wanted audio signal and unwanted image signals at higher frequencies. This can be seen in the 5kHz figure. The dark blue trace shows the original input sine wave. The red trace is the output from the DAC. The magenta trace shows the difference between this and the desired output (green); it is clear that the residual contains high frequency components. The CD player thus needs a low-pass filter, known as the reconstruction filter.

CD player technology

Early CD players used essentially the technique outlined above, but quickly acquired a reputation for poor sound quality. It was thought that the problem came from the reconstruction filter. A sharp cutoff at 20kHz caused phase problems at lower frequencies.

Philips developed a new technique, called oversampling. This replaces the sharp analogue filter with an even sharper digital filter. Digital filters can achieve results which would be difficult or realistically impossible with analogue filters. Phase correction can deal with unwanted side-effects. The output from the digital filter is, in effect, sampled at a higher rate than the input. Philips used x4 oversampling, but other ratios have been used. This means that the reconstruction filter only has to deal with image signals clustered around 176.4kHz and its multiples, the lowest of which sit at about 156.4kHz (176.4kHz minus 20kHz). Between 20kHz and about 156kHz there will be no signals, so the analogue filter has a much easier job to do. By clever use of this technique, Philips managed to get 16-bit performance from 14-bit DACs in their early players. These sounded much better than the conventional units.
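To see how much room oversampling gives the analogue filter, here is a rough Python sketch of where the first image band falls. The function name and the 20kHz audio limit are assumptions for illustration, and the digital filter is taken to be ideal (nothing left between the audio band and the first image).

```python
def first_image_band(ratio=1, audio_max_hz=20000.0, fs_hz=44100.0):
    """Lower and upper edge (Hz) of the first image band at the DAC output."""
    # The first image is the audio band mirrored either side of the DAC sample rate.
    dac_rate_hz = fs_hz * ratio
    return (dac_rate_hz - audio_max_hz, dac_rate_hz + audio_max_hz)


print(first_image_band(ratio=1))  # (24100.0, 64100.0)
print(first_image_band(ratio=4))  # (156400.0, 196400.0)
```

Without oversampling the analogue filter has to get from pass to stop between 20kHz and 24.1kHz; at x4 the first image energy does not start until about 156kHz, so a gentle filter will do.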

Oversampling became the standard method used in CD players. It is assumed that the phase problems in sharp analogue filters mean that oversampling will always give better reproduction. A more extreme form of oversampling is now sometimes used, with very high ratios and low-resolution DACs.

Recent ideas

In the mid-1990s, some people started to question the oversampling consensus. They argue that the sharp digital filter creates problems. So they want to either use a softer digital filter, in the technique known as upsampling, or get rid of oversampling altogether and return to simple DAC conversion but without a sharp analogue filter. Does the sharp digital filter cause audible problems? If it does, then what can be done at the CD player to improve things, as the CD recording equipment will have already used a similar filter?

In the peculiar way some people handle language, the move away from oversampling is known as 1x oversampling or non-oversampling (also known as NOS - which to me means New Old Stock). This has a following among those who want to get back to simplicity. They argue that any sharp reconstruction filter, whether analogue or digital, creates problems. They like to show the waveform of a square wave with pre- and post-edge ringing, and say that this proves that sharp filters damage the sound. In reality this is merely showing what happens when you band-limit a square wave. Digital audio necessarily requires band-limiting filters, at least at the recording end, but fortunately real music does not contain square waves. The square wave test signal they use would never appear on a real music CD, so it is an artificial test.

In a NOS DAC the actual DAC chip feeds into a relatively simple low-order analogue filter which removes some of the higher frequencies (but sometimes even this simple filter is omitted). The result, they say, is clean pure sound - even though it is theoretically wrong. This technique is becoming popular among audio DIYers.


What happens if a DAC is not followed by a reconstruction filter? You get lots of ultrasonic components, with the details depending on the nature of the DAC. Virtually all DACs use the zero-order hold (ZOH) technique, which means that the output stays steady at one value until the next value comes along. Unfiltered, this gives a step waveform - see the red trace in the figure. The desired audio signal is there, but so is lots of other stuff. The main undesired signal is the first image, so 20kHz audio will be accompanied by a 24.1kHz image. The image will be smaller than the signal, but not much smaller. The DAC output looks like two tones beating together. It is perhaps surprising that this beating is not audible. NOS enthusiasts say that the system loudspeakers and the listener's ears are an adequate low-pass filter. In a conventional system the 24.1kHz image would be filtered out by the reconstruction filter. NOS people can hear 20kHz perfectly but 24.1kHz not at all? Kusunoki, one of the inventors of the NOS DAC, admits that his idea will not work if people can hear above 20kHz.
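The reason the image appears at all is that the stored samples cannot distinguish the wanted tone from its image. A small Python sketch (helper name and sample count chosen only for illustration) shows that a 20kHz cosine and a 24.1kHz cosine give the same sample values at 44.1kHz:

```python
import math

FS = 44100.0  # CD sampling rate in Hz


def sampled_cosine(f_hz, n_samples=16, fs_hz=FS):
    """First n_samples of a unit cosine at f_hz, sampled at fs_hz."""
    return [math.cos(2.0 * math.pi * f_hz * n / fs_hz) for n in range(n_samples)]


tone_hz = 20000.0
image_hz = FS - tone_hz  # first image of the tone

tone = sampled_cosine(tone_hz)
image = sampled_cosine(image_hz)

# Largest difference between the two sample sequences (rounding error only).
max_diff = max(abs(a - b) for a, b in zip(tone, image))
print(image_hz, max_diff < 1e-9)

# The two components in the unfiltered output beat at their difference frequency.
beat_hz = image_hz - tone_hz
print(beat_hz)
```

Since both tones fit the same samples, an unfiltered DAC puts out both, and the 4.1kHz difference between them is the beat pattern seen in the waveform.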

Critics of NOS say that it is theoretically wrong, and that the excessive ultrasonic signal will cause intermodulation in amplifiers and possible damage to tweeters. I believe the truth is somewhere between these two.

The unwanted ultrasonic components will be attenuated by the ZOH output from the DAC. This gives a low-pass frequency response which falls off to zero at the sampling frequency. It is already at -3.2dB by 20kHz. Low frequency signals will have higher frequency images, which will be more attenuated. See the figure: images above 40kHz (i.e. audio below 4.1kHz) are about 20dB down. The strongest audio signals occur at low frequencies, so the total ultrasonic output is not as high as might be expected. Amplifiers and tweeters are probably safe!
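The ZOH roll-off is the familiar sin(x)/x shape with x = pi*f/fs, so the figures above are easy to check. A minimal Python sketch, assuming an ideal zero-order hold that lasts the full sample period (the function name is made up for illustration):

```python
import math


def zoh_gain_db(f_hz, fs_hz=44100.0):
    """Gain in dB of an ideal full-period zero-order hold at frequency f_hz."""
    x = math.pi * f_hz / fs_hz
    if x == 0.0:
        return 0.0  # no attenuation at DC
    return 20.0 * math.log10(abs(math.sin(x) / x))


for f_hz in (20000.0, 24100.0, 40000.0):
    print(f_hz, round(zoh_gain_db(f_hz), 1))
# 20000.0 -3.2
# 24100.0 -4.8
# 40000.0 -19.9
```

On these assumptions the 24.1kHz image of a 20kHz tone is only about 1.6dB below the tone itself, while images at 40kHz are close to 20dB down.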

Why does NOS, apparently, sound good? In audio, simplicity is often a virtue but this is not the full explanation. Low frequency signals will have images which are low in amplitude and high in frequency, so they are likely to be inaudible and harmless. What about high frequencies? Here the story gets interesting!

There is some evidence (also discussed here, and here) that the presence of ultrasonic signals, although inaudible, does improve the perceived quality of sound reproduction. These high frequency signals typically relate to transients or percussive sounds, and are not heard as musical tones. Is it possible that the preference of some listeners for NOS DACs is due to the image components just above 22.05kHz being perceived as part of a percussive signal? They will occur at exactly the same time as the below-22.05kHz signals which cause them, and we cannot perceive pitch details in this frequency range. Maybe they are to some extent replacing/impersonating ultrasonic components which were present in the original audio signal before the anti-aliasing filter in the CD recorder? If so, then NOS might work but not for the reason claimed!

People sometimes say that NOS has an "analogue" sound. This may be just the effect of the gentle sinc roll-off in the upper frequency range: -3.2dB at 20kHz. This filter, caused by the ZOH output from the DAC, has linear phase so should sound quite benign.

DAC tricks

No DAC is perfect, so people have thought of clever ways to get better performance out of them. One trick is to put several DACs in parallel. The idea is that minor random errors in the internal resistor ladders will be averaged out, as each DAC will have different random errors. This seems to make sense, and is often used with the older cheaper DACs such as the TDA1543. Doubling the number of chips halves the likely error power, which cuts the error itself only by a factor of about 1.4 (the square root of 2), so the law of diminishing returns soon sets in. The trick is to keep them cool, and provide a good power supply. A popular idea with NOS is passive I/V (i.e. a resistor); many DACs output a current which needs to be converted to a voltage. I have (theoretically) explored passive I/V with the TDA1543.
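The diminishing returns from paralleling DACs can be illustrated with a quick simulation. This is only a sketch under a strong assumption: each chip's error is modelled as an independent Gaussian value with the same spread, which real ladder errors need not follow. The function name, trial count and seed are arbitrary.

```python
import math
import random


def simulated_rms_error(n_dacs, trials=20000, seed=0):
    """RMS error of the average of n_dacs outputs, relative to one DAC (model only)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        # Each DAC contributes its own independent unit-spread error; paralleling averages them.
        avg = sum(rng.gauss(0.0, 1.0) for _ in range(n_dacs)) / n_dacs
        total += avg * avg
    return math.sqrt(total / trials)


for n in (1, 2, 4, 8):
    # Simulated value next to the 1/sqrt(n) prediction.
    print(n, round(simulated_rms_error(n), 2), round(1.0 / math.sqrt(n), 2))
```

In this model each doubling halves the error power, so the RMS error falls by about 0.71 each time, and it takes four chips to halve the RMS error and sixteen to quarter it.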

Some people have tried running DACs in antiphase, as a form of push-pull. This has the same effect as running them in parallel, for random errors. It has the additional advantage of providing some cancellation of even-order wider-range errors. It will not help with odd-order errors, so there is a risk that odd-order will lose its even-order masking.
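The even-order cancellation can be shown with a toy transfer curve. In this Python sketch the DAC is modelled as a polynomial with small second- and third-order terms; the coefficients a2 and a3 are invented for illustration and do not describe any real chip.

```python
def dac_output(x, a2=0.01, a3=0.005):
    """Toy DAC transfer curve: ideal output x plus even- and odd-order errors."""
    return x + a2 * x**2 + a3 * x**3


def push_pull_output(x, a2=0.01, a3=0.005):
    """Two identical toy DACs fed x and -x, with the outputs subtracted and halved."""
    return 0.5 * (dac_output(x, a2, a3) - dac_output(-x, a2, a3))


x = 0.5
single_error = dac_output(x) - x          # contains both a2 and a3 terms
push_pull_error = push_pull_output(x) - x  # only the a3 (odd-order) term survives
print(single_error, push_pull_error)
```

The x squared term has the same sign in both DACs and so drops out of the difference, while the x cubed term flips sign with the input and passes through untouched, which is the odd-order residue mentioned above.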

One idea I saw used several DACs, but with the signal delayed on some of them by half a sample interval. I think the idea was to reduce the step size from the ZOH output, and thus reduce high frequency components. You get some of the advantage of x2 oversampling. The snag is that, in essence, you are simply adding a delayed version of the signal to itself and this will cause attenuation and phase shifts at higher audio frequencies. This may be an attempt to simulate a DAC with first-order hold (FOH) output (i.e. a linear slope between sample points instead of a step waveform) but this ruins the frequency response - it becomes sinc² rather than sinc.
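The cost of the half-sample delay trick can be estimated by comparing three idealised responses. This Python sketch rests on assumptions that are mine, not taken from any particular design: both DACs hold each value for a full sample period, their outputs are averaged, and the FOH case is ideal linear interpolation.

```python
import math

FS = 44100.0  # CD sampling rate in Hz


def sinc(x):
    """Normalised sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def to_db(gain):
    return 20.0 * math.log10(abs(gain))


def zoh_db(f_hz):
    """Single zero-order-hold DAC."""
    return to_db(sinc(f_hz / FS))


def half_delay_pair_db(f_hz):
    """Average of two ZOH DACs, one delayed by half a sample interval."""
    # Averaging a signal with a copy delayed by T/2 adds a cos(pi*f/(2*fs)) factor.
    return to_db(sinc(f_hz / FS) * math.cos(math.pi * f_hz / (2.0 * FS)))


def foh_db(f_hz):
    """Ideal first-order hold (linear interpolation): sinc squared."""
    return to_db(sinc(f_hz / FS) ** 2)


f_hz = 20000.0
print(round(zoh_db(f_hz), 1), round(half_delay_pair_db(f_hz), 1), round(foh_db(f_hz), 1))
```

At 20kHz this gives roughly 3.2dB of droop for the plain ZOH, about 5.6dB for the delayed pair and about 6.3dB for a true FOH, so in this model the delayed pair sits between the two but much closer to the FOH.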


updated 7 July 2009: remove some of my confusion!