From the beginning, there were concerns about the push of 1-bit systems into the consumer space, along with claims that 1-bit PDM should form some kind of archival foundation for music. Critics included Lipshitz and Vanderkooy - see their paper "Why Professional 1-Bit Sigma-Delta Conversion is a Bad Idea" from the September 2000 AES. The next year, in May 2001, they followed up with "Why 1-Bit Sigma-Delta Conversion is Unsuitable for High-Quality Applications". Even Bob Stuart chimed in on the unsuitability of DSD for "high-resolution audio" back in 2004. This is no surprise since Meridian was firmly in the DVD-A camp, including developing the MLP compression system which was subsequently licensed by Dolby and renamed TrueHD; it looks like Dolby and Meridian had an arrangement dating back as far as 1998.
These concerns around fidelity and the unsuitability of 1-bit PDM as an editable format in audio production are why in the professional world, we see audio recorded and edited in 24/352.8 "DXD" and Sony's own "DSD-Wide" (8-bit/2.8MHz) instead of DSD64/1-bit "DSD-Narrow".
While this was playing out in the academic/professional arena, the advertising industry including the "mainstream audiophile media" championed DSD and published all kinds of flowery words suggesting that it sounded "more natural" or "analogue-like" compared to PCM. While I don't think we can put an exact date on when DVD-Audio officially died as a viable commercial product, I think by 2005 it was quite clear that hi-res physical formats were not going to be mainstream, and DVD-A did not have as many titles available as SACD. My sense is that the hybrid-SACD feature, with both DSD and CD-compatible layers, was a major differentiating factor, which is why we still see a trickle of SACDs released these days.
I'm bringing this stuff up now as an extension to the discussions around SoX-DSD and the Philips Test SACD articles last year during my series on the Topping D90SE review, because I've been thinking about how best to standardize the DSD test signals I use when testing. Different DACs tend to handle DSD playback differently and I wanted to make sure that my test signal parameters are at least somewhat in line with the music encoded on an SACD or maybe DSD128 download these days.
In that previous article on SoX-DSD, I mentioned that I ran into some issues around signal instability with higher-order noise shaping settings when testing the Topping DX3 Pro V2. For example, I was unable to obtain a glitch-free DSD256 1kHz THD+N test and had to drop the modulator down to a 5th-order setting, while higher-order noise shaping still worked well enough for DSD64/128 testing.
While not spoken of among audiophiles or in the mainstream press, once you start playing around with DSD encoding, you realize that there are intricacies to the modulation system to be mindful of. We are no longer dealing with a digital system that treats each sample as an amplitude level, like the 65,536 possibilities of a 16-bit PCM sample digitized tens of thousands of times per second, tracing out these levels to show a facsimile of the audio waveform. Rather, we have to think about the sound encoding as signal "density" over time, consisting of single bits of '1' (up) and '0' (down) values sampled millions of times per second - the "average" of which determines the "modulation index" and the eventual analogue output level from your DAC.
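To make the "density" idea concrete, here is a minimal Python sketch of a first-order sigma-delta loop. Treat it as a toy for illustration only: real DSD encoders (including the SoX-DSD modulators discussed in this post) use much higher-order noise shaping, and the function names here are made up for the example.

```python
def first_order_sdm(samples):
    """Encode samples in the range -1.0..+1.0 as a list of 1-bit values (0 or 1).

    Toy first-order sigma-delta loop: the integrator accumulates the difference
    between the input and the previous output bit (fed back as -1.0 or +1.0).
    """
    bits = []
    integrator = 0.0
    feedback = 0.0  # no previous output bit yet
    for x in samples:
        integrator += x - feedback
        bit = 1 if integrator >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def modulation_index(bits):
    """Return (ones - zeros) / total: -1.0 is all '0', +1.0 is all '1'."""
    ones = sum(bits)
    zeros = len(bits) - ones
    return (ones - zeros) / len(bits)


if __name__ == "__main__":
    # A constant input at half of full scale, encoded as 28,000 one-bit samples.
    bits = first_order_sdm([0.5] * 28_000)
    print("first 28 bits   :", "".join(str(b) for b in bits[:28]))
    print("fraction of ones:", sum(bits) / len(bits))
    print("modulation index:", modulation_index(bits))
```

With a constant input of 0.5, the loop settles into roughly three '1's for every '0', so the modulation index lands close to 0.5 even though no individual bit carries an amplitude value.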
[Figure: Typical illustration of PCM vs. DSD from Wiki. Note that in good PCM playback, unless specifically NOS, we do not see stair-stepped quantization levels like this in the analogue output; I understand what they're trying to convey here though as a discussion on the nature of the digital data.]
While we've talked over the years about DSD noise and how that might limit audio fidelity, we have not specifically addressed another issue: the peak level in a DSD stream. The formal specifications for SACD/DSD64 are the "Scarlet Book" specs which we can examine on the Internet Archive. If we look at the file "SACDspecP2audio_200 contents.pdf", in Annex D we can see how Sony/Philips specified SACD/DSD64 peak levels:
D.1 is a reference to filtering for the noise which we've talked about before; particularly with DSD64 and a recommendation to apply low-pass filtering around 50kHz.
D.2 tells us that "SACD 0dB" (let's call it 0dBDSD) = 50% peak amplitude of theoretical maximum DSD signal level. If we think about this in the PCM world, this would correlate to -6dBFS.
D.3.1 then is important because it tells us specifically how they defined the maximum "density" of bits allowed; the maximum modulation level. Notice that they're looking at 28-bit chunks of the DSD bitstream, and officially within any of these runs of consecutive bits there cannot be more than 24 or fewer than 4 "1's". With 24 ones and 4 zeros, that is a Peak Modulation Level of (24 - 4)/28 = 20/28 (71.43%). From 0dBDSD at 50% modulation to the maximum 71.43% is +3.1dB. Peak signal levels therefore "cannot" officially go beyond +3.1dBDSD, or the equivalent of -2.9dBFS in the PCM world. Beyond this peak limit, "there be dragons": potentially higher distortion from modulator overload on DAC playback.
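The arithmetic behind these figures can be sketched in a few lines of Python. The constants are the ones quoted from Annex D above; the function names are hypothetical, and the sliding-window check is my own assumption about how one might inspect a bitstream, not something taken from the spec or from any tool.

```python
import math

WINDOW_BITS = 28              # window length quoted from D.3.1
REFERENCE_MODULATION = 0.5    # "SACD 0dB" (0dBDSD) = 50% modulation, per D.2
PEAK_MODULATION = 20 / 28     # Peak Modulation Level quoted from D.3.1


def modulation_to_db_dsd(modulation):
    """Level relative to 0dBDSD (50% modulation)."""
    return 20.0 * math.log10(modulation / REFERENCE_MODULATION)


def modulation_to_dbfs(modulation):
    """Level relative to a 100% modulated (full-scale) bitstream."""
    return 20.0 * math.log10(modulation)


def peak_window_modulation(bits, window=WINDOW_BITS):
    """Largest |ones - zeros| / window over every run of `window` consecutive bits.

    Assumption: the limit is checked on a sliding window, one bit at a time.
    """
    if len(bits) < window:
        raise ValueError("need at least one full window of bits")
    ones = sum(bits[:window])
    peak = abs(2 * ones - window)
    for i in range(window, len(bits)):
        ones += bits[i] - bits[i - window]  # slide the window by one bit
        peak = max(peak, abs(2 * ones - window))
    return peak / window


if __name__ == "__main__":
    print(f"0dBDSD in PCM terms : {modulation_to_dbfs(REFERENCE_MODULATION):+.1f} dBFS")
    print(f"peak in DSD terms   : {modulation_to_db_dsd(PEAK_MODULATION):+.1f} dBDSD")
    print(f"peak in PCM terms   : {modulation_to_dbfs(PEAK_MODULATION):+.1f} dBFS")
```

Feeding `peak_window_modulation` a list of 0/1 values from a decoded stream and comparing the result against `PEAK_MODULATION` gives a rough indication of whether a file stays inside the Annex D limit.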
This information is important and well known within the SACD production world. Merging Tech's Pyramix SACD Production Guide for example is very clear about this; let's highlight a few important facts:
1kHz THD(+N) DSD Test Signal...
sox.exe "+3.1dBDSD 1kHz Sine (32-192kHz).wav" "+3.1dBDSD 1kHz Sine (DSD64, clans-8).dsf" rate -v 2822400 sdm -f clans-8 -t 32 -n 32
One more detail about the test tone. If you have a look at the 24/96 PCM file, at the start of the audio, we see this:
RightMark DSD Test Signal...
I hope you found this foray into the intricacies of SACD/DSD/1-bit PDM interesting. After all these years, there's still something akin to reverence among audiophiles towards DSD as if there are "ideal", "right", even "more analogue" qualities about the sound. I don't think this is the case. (We can think about other technologies which some audiophiles will idealize, even fetishize, as well.)
As usual, there's no need to believe audio technology has any magical properties (the gift of the snake oil salesmen). While there are some extra details to be aware of such as peak output level and care with selecting noise-shaping modulator settings, this should be handled behind-the-scenes by manufacturers and audio production folks. The fact that I'm writing this post is simply part of the "evolution" of my evaluation system which forces me to think about how best to encode the tests I use.
Yes, 1-bit DSD can sound very good because the science works, in particular the noise shaping it needs. As I've expressed over the years, for perfectionistic audiophiles, I recommend DSD128+, which allows noise to be pushed further out beyond the audible spectrum and provides a lower noise floor with less aggressive noise shaping. Compare that to DSD64, where the rapid increase in noise starts just beyond 20kHz and could result in distortions reaching into the audio band depending on your system.
While fine for consumer music consumption, for those of you who create, record, mix, and master music, 1-bit DSD is simply not good for audio production purposes and I think these days no serious archivist would use DSD64 in particular. Due to the lack of flexible editing ability for DSD data, almost all recordings have to go through some kind of multibit/PCM step. With PCM almost always used, it's no surprise that many SACDs are basically transcoded/upsampled PCM content. By the way, the vast majority of the new pop SACDs coming out of Asia in the 2020s that I have examined over the last few years are just upsampled 44.1/48kHz PCM as well.
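As a quick reference for how much ultrasonic "room" each DSD rate has, here is a small Python sketch. It assumes the usual convention that DSDn runs at n times the 44.1kHz CD rate (consistent with the 2822400 rate in the SoX command above); the function names are illustrative only.

```python
CD_RATE_HZ = 44_100        # DSD64 = 64 x 44.1kHz = 2.8224MHz
AUDIO_BAND_HZ = 20_000     # nominal upper edge of the audible band


def dsd_sample_rate_hz(multiple):
    """1-bit sample rate for DSD64, DSD128, DSD256, DSD512, etc."""
    return multiple * CD_RATE_HZ


def ultrasonic_headroom_ratio(multiple):
    """How many times wider than the audio band the stream's Nyquist bandwidth is."""
    return (dsd_sample_rate_hz(multiple) / 2) / AUDIO_BAND_HZ


if __name__ == "__main__":
    for multiple in (64, 128, 256, 512):
        rate = dsd_sample_rate_hz(multiple)
        print(f"DSD{multiple:<3}: {rate / 1e6:.4f} MHz, "
              f"Nyquist {rate / 2 / 1e3:.1f} kHz, "
              f"{ultrasonic_headroom_ratio(multiple):.1f}x the 20kHz audio band")
```

Each doubling of the rate doubles the bandwidth the modulator has available for shaped noise, which is why the noise rise can start further from 20kHz at DSD128 and above.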
[This kind of thing reminds me of vinyl these days with the majority of albums sourced as digital then played back into analogue for LP cutting - tell me, what DACs are used to play back the digital data? Are they using multi-multi-thousand-dollar audiophile DACs in the cutting room or even a very high resolution ES9038 Pro device? ;-]
Needless to say, I believe it's best when recording/producing music to go with hi-res PCM such as DXD (24/352.8) from the start (like the way 2L does it) and then convert the final master to DSD or downsample to hi-res PCM as desired.
Sadly, much of my critique of DSD from back in 2013 remains the same including limited implementation of lossless compression. WavPack 5 is still the only freely available compression system I'm aware of - as discussed here.
One last thing ;-).
Please dear audiophiles, don't just throw out comments like "Feedback in audio sounds bad!" without specific examples, and then in the next breath claim stuff like "DSD sounds like analogue - much better than PCM!"
Realize that the noise shaped modulators used in DSD/SACD are examples of high feedback! Without noise shaping, 1-bit DSD64 at 2.8MHz (64 = 2^6) has only 6 bits of dynamic range (~36dB). I'm not saying feedback is always good or always bad, but in a situation like DSD, it's simply necessary for hi-fi quality within audio frequencies! Almost everything in this world must be approached with nuance.
For more reading on the technical details around 1-bit PDM/DSD/SACD, have a look at Reefman & Janssen's review "One-bit Audio: An Overview" (2003), which includes discussions on the Trellis algorithm. The article by Måns Rullgård, "PCM and DSD" (2020), might also help clarify things further.
Alright dear audiophiles, that's all for now. Next week, let's finish the last part of the S.M.S.L. DO100 DAC review including a look at DSD performance based on the discussions above.
Hope you're all enjoying the music!
*** Achieving stability with higher-order modulator settings on test signals could be a real pain in the derriere; a few times I've had to agonize over the balance between noise level and potential stability issues. While there are some general trends, the results can be wildly unpredictable which is why it took quite a bit of trial and error to create and decide on my standard set of test signals (not just 1kHz, but also the 1/10 Octave Multitone 32, and collection of converted RightMark signals).
Some general observations using the SoX-DSD CLANS/SDM modulator settings looking at Trellis convergence failure data suggested that:
1. CLANS-5 seems very good at controlling overload across the board from DSD64-512 even with excessively loud +5.9dBDSD output level. When in doubt, use this. It still provides 120+dB dynamic range through the audible spectrum and sounds great. CLANS-5 was particularly stable with DSD256/512 compared to SDM-5.
2. Above the 5th-order setting, sometimes SDM is better than CLANS; it depends on the audio data. For example, the RightMark test signal converted to DSD64 had fewer issues with SDM-7 than CLANS-7, which I could not have predicted (although I still decided on using CLANS-8 for this to remain consistent with the 1kHz tone).
3. DSD128 seems more finicky to encode than DSD64 given the same amplitude level and even with lower-order modulator settings. I was surprised by this as it stood out as unusual compared to DSD256 and DSD512 encodes. I don't know if this observation extends beyond the freely available SoX-DSD software though.
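For a sense of scale on the +5.9dBDSD level mentioned in point 1, this short Python sketch converts a dBDSD figure back into a modulation percentage, assuming only the 0dBDSD = 50% modulation definition from the Scarlet Book. The function names are hypothetical.

```python
import math


def db_dsd_to_modulation(level_db_dsd):
    """Convert a level in dBDSD (0dBDSD = 50% modulation) to a modulation fraction."""
    return 0.5 * 10.0 ** (level_db_dsd / 20.0)


def db_dsd_to_dbfs(level_db_dsd):
    """Same level expressed relative to a 100% modulated bitstream."""
    return 20.0 * math.log10(db_dsd_to_modulation(level_db_dsd))


if __name__ == "__main__":
    for level in (0.0, 3.1, 5.9):
        print(f"{level:+.1f} dBDSD = {100 * db_dsd_to_modulation(level):.1f}% modulation "
              f"= {db_dsd_to_dbfs(level):+.2f} dBFS")
```

By this arithmetic, +5.9dBDSD works out to roughly 98.6% modulation, only about 0.1dB shy of a completely full-scale bitstream, so it is a demanding stress test for any modulator.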
Seeing these disparities and nuances, I wonder how studios decide on the modulator settings they use when producing albums. Do they actually run a few encodes for example with various order settings and sit down to listen, then make a choice for the final product? Audible differences may be noticeable especially with DSD64 using higher-order modulators so this could be rather tedious work to listen for such subtleties! Unless truly obsessive about the sound quality, I have doubts as to whether artists and audio engineers would bother for every project.
Addendum: June 20, 2022
Just wanted to add another set of measurements that I thought was interesting. Here's the RME ADI-2 Pro FS R Black Edition - DSD-Direct mode activated (no volume control, no DSP) with 50kHz DSD Filter.
As you can see, I've increased the ADC sampling to 384kHz to show bandwidth out to 192kHz. We can appreciate the relative magnitude of the ultrasonic noise with DSD playback compared to PCM and of course as we go from DSD64 up to DSD256; noise pushed further out as the DSD sample rate increases.
A nice example of improved objective performance as we go from DSD64 to DSD256. There's also only a 1dB difference between PCM and DSD playback output level (-0.93dB is the expected change in the volume bypass "direct" mode based on the datasheet).
Addendum: June 21, 2022
Yet another update ;-).
Up to this point, notice that I've published results from the test signals of real-life measured DACs only. Let's in a way take a step back and look at what we expect from a "Perfect DAC" when it comes to reproducing the test signals. We can do this by running a software decode of the DSD data through foobar2000's SACD plugins (much of this was discussed a while back). Let's use the highest quality decode with 64-bit floating point and create 32/352.8 output to analyze. A +6dB offset is added to the decode. No DSD filtering is applied (for example, the RME ADI-2 results above are with a 50kHz filter).
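Conceptually, a software decode boils down to low-pass filtering the bitstream and rescaling it. The Python sketch below is a deliberately crude stand-in, not how the foobar2000 plugin works: it uses a plain block average where a real decoder uses proper low-pass filters, so its output would be far noisier. The function name and parameters are made up for illustration.

```python
def decode_dsd_to_pcm(bits, decimation=8, gain_db=6.0):
    """Crude DSD-to-PCM decode: average each block of bits, then apply a gain offset.

    bits        -- sequence of 0/1 values
    decimation  -- bits averaged per PCM sample (8 takes DSD64 at 2.8224MHz to 352.8kHz)
    gain_db     -- offset applied after averaging (+6dB, matching the decode described above)
    """
    gain = 10.0 ** (gain_db / 20.0)
    bits = list(bits)
    pcm = []
    # Step through whole blocks only; any leftover bits at the end are dropped.
    for start in range(0, len(bits) - decimation + 1, decimation):
        ones = sum(bits[start:start + decimation])
        mean = (2 * ones - decimation) / decimation  # block average, -1.0 .. +1.0
        pcm.append(mean * gain)
    return pcm


if __name__ == "__main__":
    # 75% ones = 50% modulation = 0dBDSD.
    pcm = decode_dsd_to_pcm([1, 1, 1, 0] * 16)
    print(pcm)
```

The point of the +6dB offset shows up here: a stream at 50% modulation (0dBDSD) averages to 0.5, and the offset brings it to within a hair of PCM full scale.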