Archimago's Musings: ANALYSIS: DSD-to-PCM 2015 - foobar SACD Plug-In, AuI ConverteR, noise & impulse response...

Saturday, 11 April 2015

Noise characteristics of PCM vs. DSD - image found here.

In my post last week looking at the various DSD-to-PCM converters, Solderdude Frans made a good suggestion... Let's have a look at the newer SACD plug-in, which has superseded the DSDIFF plug-in as the converter of choice these days for foobar. Yuri Korzunov, the author of AuI ConverteR 48x44, also suggested that I have a look at his converter package.

I. DSD-to-PCM - foobar SACD plug-in & AuI ConverteR

So, I downloaded the newest SACD plug-in, currently at version 0.7.7 (dated 2015/03/16). I deleted the DSDIFF plug-in from my computer so there are no interactions, and installed the new files.

Notice that the SACD plugin has a configuration panel for settings:

Because this plug-in does not directly output to 24/96, I figure let's try with the highest output (352.8kHz) and I will use the best samplerate converter (SRC) I have (the excellent iZotope RX 4) to bring it down to 24/96 for analysis as before. Here are the settings I used in iZotope RX 4:

Sharpest "max" filter for 24/96 in iZotope RX 4, linear phase without suppression of pre-ringing - nothing fancy...
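
For reference, going from 352.8kHz to 96kHz is not an integer ratio, which is part of why a good SRC matters for this step. Here is a minimal Python sketch of the arithmetic only; the function name is made up for illustration, and it says nothing about how iZotope RX 4 actually implements its resampler:

```python
from fractions import Fraction


def resample_ratio(src_hz: int, dst_hz: int) -> Fraction:
    """Return dst/src as a reduced fraction L/M (interpolate by L, decimate by M)."""
    return Fraction(dst_hz, src_hz)


ratio = resample_ratio(352_800, 96_000)
print(ratio)                             # 40/147
print(resample_ratio(352_800, 88_200))   # 1/4
```

Conceptually, a rational resampler would interpolate by 40 and decimate by 147 to reach 96kHz, whereas 88.2kHz would be a clean divide-by-4.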

The other parameter we can play with in the SACD plug-in is the DSD2PCM mathematics setting. By default, it's the standard fixed-point integer mode. Let us also analyze the result from the highest precision mode "Multistage (Double Precision)".

As well, I downloaded the AuI ConverteR 48x44 software. The current demo is version 4.1.20. Other than setting the output for WAV 24/96, I left the rest of the settings to default.

Using the exact same procedure as last week, here's a summary of what I got:

Interesting... It looks like the SACD plug-in is actually about the same as the old DSDIFF 1.4 (<1dB difference) in terms of noise level and dynamic range. Notice that, just like last week's results from XLD, going from fixed-point to double-precision floating-point calculations made no difference here.

AuI ConverteR resulted in some impressive numbers! Let's see what it's doing in detail...

As you can see, AuI ConverteR is using a sharp "brick-wall" filter right at ~20kHz to remove essentially everything above 20kHz. As such, have a look at what it does with the noise profile:

Wow. That is an impressively sharp, precise filter at 20kHz! I can approximate that effect with iZotope RX 4's EQ plug-in with a low-pass at 20kHz, high Q of 25 or so (not shown) but AuI ConverteR looks even cleaner with less noise floor irregularity.

That little bit of high-frequency "rippling" with DSDIFF is probably a result of the resampling algorithm. Otherwise, foobar DSDIFF and the newer SACD plug-ins appear very similar.

Basically, this is what we can say at this point...

1. foobar SACD plug-in works about the same as the old DSDIFF plug-in. I would not be surprised if the algorithm (DSD2PCM) is essentially the same if we look "under the hood".

2. The AuI ConverteR software puts up some impressive numbers. This is done with a very strong low-pass filter. If you feel there is no need to retain frequencies >20kHz, then this will clearly get the job done.

II. All that noise!

But wait, there's more! SACD Plug-in also has a 30kHz lowpass mode - "Direct - (Double Precision, 30kHz LF)". Hmmm, I wonder how that looks?

Engaging the 30kHz lowpass mode really resulted in a step down in calculated accuracy. Here's a look at the graphs:

Indeed, the 30kHz low-pass filter is doing the job (yellow).

We can see the effect of that 30kHz filter on the noise floor... Certainly not the prettiest filter out there! Realize that although the differences are there in these graphs with a synthetic test signal, we're talking about noise down below -150dB (below 20kHz). It's just not an issue in terms of audibility.
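
To put "-150dB" into perspective, here is a small sketch that converts decibels into a linear amplitude ratio, alongside the usual textbook estimate for the dynamic range of ideal N-bit PCM (6.02 x bits + 1.76 dB). Both helper names are made up for illustration, and the formula is the standard approximation for a full-scale sine wave, not something measured in this post:

```python
def db_to_amplitude(db: float) -> float:
    """Convert a level in dB (relative to full scale) to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


def ideal_pcm_dynamic_range_db(bits: int) -> float:
    """Textbook estimate for ideal N-bit PCM with a full-scale sine (assumption)."""
    return 6.02 * bits + 1.76


print(db_to_amplitude(-150.0))          # roughly 3.2e-08 of full scale
print(ideal_pcm_dynamic_range_db(16))   # roughly 98 dB
print(ideal_pcm_dynamic_range_db(24))   # roughly 146 dB
```

Keep in mind that the level of an individual FFT bin on a graph is not directly comparable with a broadband noise figure, so this is only a sense-of-scale exercise.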

Now, let us see if we can do it better by using iZotope RX 4 to do the 30kHz low-pass filtering instead of the algorithm used by SACD plug-in. Here's a simple setting:

Low-pass filter at 30kHz as seen, Q = 5.0 (not too steep), linear-phase FIR with FFT size of 32k.
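
For anyone wanting to experiment without RX, here is a minimal sketch of a linear-phase FIR low-pass using the classic windowed-sinc method with a Blackman window. This is not iZotope's algorithm, it does not model the Q setting, and the 255-tap length is an arbitrary choice for illustration; only the 30kHz cutoff and the 352.8kHz input rate come from the setup above:

```python
import math


def lowpass_fir(cutoff_hz, fs_hz, num_taps):
    """Design a linear-phase low-pass FIR by the windowed-sinc method (Blackman window)."""
    if num_taps < 3 or num_taps % 2 == 0:
        raise ValueError("num_taps must be odd and at least 3")
    fc = cutoff_hz / fs_hz            # cutoff in cycles per sample
    mid = (num_taps - 1) // 2         # index of the centre tap
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal (infinitely long) low-pass impulse response, sampled at offset k
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        # The Blackman window tapers the truncated response to tame the ripple
        phase = 2.0 * math.pi * n / (num_taps - 1)
        window = 0.42 - 0.5 * math.cos(phase) + 0.08 * math.cos(2.0 * phase)
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC


def gain_db(taps, freq_hz, fs_hz):
    """Magnitude response of the FIR at a single frequency, in dB."""
    w = 2.0 * math.pi * freq_hz / fs_hz
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return 20.0 * math.log10(max(math.hypot(re, im), 1e-300))


FS_HZ = 352_800
taps = lowpass_fir(30_000, FS_HZ, 255)
for f in (1_000, 20_000, 30_000, 60_000):
    print(f, "Hz:", round(gain_db(taps, f, FS_HZ), 1), "dB")
```

Because the taps are symmetric around the centre, every frequency is delayed by the same amount (linear phase), which is also why this kind of filter rings both before and after an impulse.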

Voilà:

This is what a good low-pass filter can do for the results. As you can see, when we use the 30kHz iZotope low-pass filtering, the calculated noise level drops substantially on the RightMark analysis... This also tells us that the reason Saracon and JRiver measured so well is because they implement very good quality noise filtering algorithms beyond just the DSD-to-PCM calculations.

The SACD Plug-in's DSD2PCM algorithm is excellent, and when we pair it up with iZotope's SRC and 30kHz low-pass filtering, we get some fantastic results easily on par with what Saracon does:

III. Impulse Response and DSD-to-PCM Converters

Despite the inherent noise in DSD, we can drop overall noise levels substantially with a good low-pass filter. In fact, since a picture is worth a thousand words, this is what a 15kHz (-12dBFS) sine wave looks like comparing the unfiltered foobar SACD plug-in output at 352.8kHz with the 30kHz iZotope RX 4 low-pass filtering (again, this is with Saracon as the encoding software for PCM-to-DSD):

This is what all that extra high-frequency noise looks like in DSD when you don't filter it out at all. Notice that DSD128 is significantly less noisy. The question is, just how much noise reduction should we actually do? (You can also see the noise through an analogue oscilloscope - as shown here.)
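
To see where that ultrasonic noise comes from, here is a toy first-order 1-bit delta-sigma modulator in Python. It is only a sketch under simplifying assumptions: real DSD encoders use much higher-order noise shapers, and nothing here reflects Saracon's actual design. The point is simply that the output is only ever +1 or -1, yet its average tracks the input; the large quantization error ends up mostly at high frequencies, where a low-pass filter can remove it.

```python
def delta_sigma_1bit(samples):
    """Toy first-order delta-sigma modulator: floats in [-1, 1] -> list of +1.0 / -1.0."""
    integrator = 0.0
    previous_bit = 0.0  # nothing is fed back before the first sample
    bits = []
    for x in samples:
        # Accumulate the difference between the input and the last 1-bit output
        integrator += x - previous_bit
        # 1-bit quantizer: only the sign of the integrator survives
        previous_bit = 1.0 if integrator >= 0.0 else -1.0
        bits.append(previous_bit)
    return bits


dc_input = [0.25] * 8000           # a constant (DC) test level, chosen arbitrarily
bits = delta_sigma_1bit(dc_input)
mean = sum(bits) / len(bits)
print(sorted(set(bits)))           # only -1.0 and 1.0 ever appear
print(round(mean, 3))              # close to the 0.25 input level
```

Averaging the bitstream is the crudest possible low-pass filter; the converters compared here differ mainly in how much more carefully they do that filtering step.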

As noted by Juergen in the comments to the previous post, there is this matter about time-domain behaviour as well which can be skewed as we apply various filters.

Let's see what a 24/96 impulse looks like after going through the DSD encoder [Saracon] and most of the decoders I looked at (DSD-to-PCM converter output set to 24/352 for each, AudioGate's max was 192kHz):

In the top left panel, this is what a 0dBFS 24/96 "impulse" would look like with a typical linear-phase oversampling interpolation showing symmetrical pre- and post-ringing. Even though Adobe Audition renders the interpolation, the actual PCM data itself is a simple, single "pulse" (see Addendum below for screenshots using Audacity).

When I convert this waveform to DSD64 and DSD128 with Saracon and then back to PCM with the foobar SACD plug-in to 24/352 unfiltered (retaining all that ultrasonic noise), you get the 2nd and 3rd left images. Notice again the amount of noise in the signal and again, we see the superiority of DSD128. From a time domain perspective, the SACD conversion process is excellent. The shape and timing of the impulse would be completely retained since the 2.8224 MHz sampling rate of DSD64 provides ~29 samples within each 96kHz time period.
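
The "~29 samples" figure is simple arithmetic. A quick sketch, with illustrative helper names and DSD rates taken as multiples of 44.1kHz:

```python
CD_RATE_HZ = 44_100


def dsd_rate_hz(multiple: int) -> int:
    """DSD64 runs at 64 x 44.1kHz, DSD128 at 128 x 44.1kHz, and so on."""
    return multiple * CD_RATE_HZ


def dsd_samples_per_pcm_period(multiple: int, pcm_rate_hz: int) -> float:
    """How many 1-bit DSD samples fit inside one PCM sample period."""
    return dsd_rate_hz(multiple) / pcm_rate_hz


print(dsd_rate_hz(64))                          # 2822400
print(dsd_samples_per_pcm_period(64, 96_000))   # 29.4
print(dsd_samples_per_pcm_period(128, 96_000))  # 58.8
```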

When we use iZotope RX 4 with 30kHz low-pass filtering (4th left image from the top), the "impulse" amplitude is significantly reduced and we see the corresponding ringing pattern as the high frequency noise is removed and no longer obstructing the picture.

AudioGate and Saracon both look very similar. Both use linear phase filters with characteristic symmetrical pre- and post-ringing. Whereas AudioGate allows high frequencies through (and thus a well-formed impulse), we see the effect of Saracon's filter (pre- and post-ringing ~30kHz). JRiver looks like it uses an intermediate phase filter (with 24kHz or 30kHz low-pass) which minimizes but does not remove pre-ringing. Comparatively, we see that DSD Master is using a form of minimum phase filter that removes the pre-ringing but the post-ringing is augmented.

AuI ConverteR is an interesting case. As we saw above with the RightMark tests, it implements a very sharp ~20kHz low-pass filter. This impulse response looks to be linear phase with accentuated pre- and post-ringing due to the sharpness of the filter; the "price" to pay I suppose.
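
The link between filter sharpness and ringing can be put into rough numbers. A commonly quoted rule of thumb (usually attributed to fred harris) estimates the number of FIR taps as N ≈ (fs / transition width) x (stopband attenuation in dB / 22). The sketch below applies it to two hypothetical filters at 96kHz; the transition widths and the 100dB attenuation are assumptions picked purely for illustration, not AuI ConverteR's actual parameters, which are not known from these measurements:

```python
def estimated_fir_taps(fs_hz: float, transition_hz: float, stopband_db: float) -> float:
    """Rule-of-thumb FIR length estimate: N ~ (fs / transition) * (attenuation_dB / 22)."""
    return fs_hz / transition_hz * stopband_db / 22.0


def impulse_span_ms(fs_hz: float, transition_hz: float, stopband_db: float) -> float:
    """Approximate total duration of the filter's impulse response, in milliseconds."""
    return 1000.0 * estimated_fir_taps(fs_hz, transition_hz, stopband_db) / fs_hz


FS_HZ = 96_000
# Hypothetical "brick-wall" filter: 500 Hz wide transition band
sharp_ms = impulse_span_ms(FS_HZ, 500.0, 100.0)
# Hypothetical gentle filter: 8 kHz wide transition band
gentle_ms = impulse_span_ms(FS_HZ, 8_000.0, 100.0)

print(round(sharp_ms, 2), "ms for the sharp filter")
print(round(gentle_ms, 2), "ms for the gentle filter")
```

With a linear phase design, roughly half of that span falls before the main peak and half after it, so narrowing the transition band lengthens the pre- and post-ringing in direct proportion.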

I'll leave you to decide how you feel about this information and whether you think the relative time domain effects resulting from implementation of the filters are audible. Back in 2013 I had a listen to some filter settings off the TEAC DAC and had difficulty noticing much of a difference; again here, I listen and fail to convince myself that I have any clear preferences among the converters including using ABX Comparator. So far I'm using headphones (Sennheiser HD800 + TEAC UD501 DAC, ASIO driver playing DSD64 converted to 24/352kHz) so perhaps I need to try again with the speaker system. You guys up for an internet "blind" test to see if there's a preference towards linear phase vs. minimum phase upsampling???

IV. Conclusion

I hope we can appreciate the compromises we face with DSD to PCM conversion. How much noise can we tolerate from the 1-bit quantization when we move the signal to PCM? What's the best frequency to set a low-pass filter assuming one believes it's necessary? What parameters should we use to filter (minimum / linear / intermediate phase, sharp vs. gradual roll-off...)? What's the best sampling rate to spit out the PCM data (eg. do we need to produce >96kHz files if we roll-off before 48kHz)?

As I noted last week, I really am not convinced that these differences are audible beyond volume level changes and whether the ultrasonic noise causes problems for one's audio system (eg. intermodulation distortions, interaction with tweeter ultrasonic peaks, and other non-linearities). This is why I don't think there's any point in "crowning" any software package as being superior. Although it's interesting to demonstrate and experiment with, I suspect these are all rather obsessive academic results of interest to audio geeks :-).

If I had to choose, I remain partial to Saracon and JRiver because of the excellent results from the low-pass filtering used by default with those programs; one-step easy conversion using very reasonable parameters. As you can see, I can get similar results with the foobar SACD plug-in creating 24/352 output and running that through iZotope RX 4 with high-precision samplerate conversion and low-pass filtering, which indicates that the underlying free DSD-to-PCM algorithm works well. AuI ConverteR is interesting in that the default setting I looked at resulted in a very clean output, so long as one does not feel there is any need for >20kHz signals to be retained and is not concerned about the effect on the impulse tracing.

----------

You can perhaps imagine, after "penning" these last 2 posts, I'm pretty well done with talking about DSD for a while. The most interesting question for me currently, as suggested by the discussions with the previous post, is this whole notion of just how much significance we should place on resolution in the time-domain irrespective of audible frequencies.

If it is significant (I hesitate to use the word "important" since that should be obvious by now if it is the case), then how much is enough? Should we take research like this paper by Kunchur (2008) seriously? Or is it possible that for practical purposes, it doesn't really matter that much when we're listening to real music as opposed to test signals? In any case, I have a strong suspicion that we will be revisiting this in the days ahead since this seems like an area that will be brought out when Meridian's MQA becomes available as I suspect they will emphasize time parameters, digital filter types, and samplerate given their apparent satisfaction with 16-bit resolution.

----------

Finally, it has come to my attention that there was much unhappiness regarding a recent blog post on the importance of noise (here also) in digital audio reproduction to the point of using speculation to support an underlying belief that expensive ethernet cables could somehow impart beneficial effects (as you know, I found no evidence of significance in my testing with various types of ethernet cables). As usual, no empirical data or real-life examples were provided and support came from more testimony from the like-minded and some links that are at best tangential to high-fidelity audio. It looks like bans from commenting were issued for what seems like rather fair statements calling out the obvious lack of substance. I guess that's how people not felt to be "true believers in the audiophile experience" are dealt with. IMO, this is unfortunate behaviour for a site reporting on mature audio computing technology.

There is much that can be said, argued and refuted in that article, but I think for most reasonable audiophiles it's rather obvious and many excellent points can be found in the comments... What is of relevance to this blog entry is that if one believes that expensive ethernet cabling can reduce noise in the "system" (in a way that appears difficult for these people to produce empirical evidence for), why would any audiophile who subscribes to this theory even want to listen to DSD64 (where the noise is obviously demonstrable and a potential cause of distortion)? Or even consider DSD64 superior to 24/192 at times? Would it not be just as likely that some folks actually like the ultrasonic noise and what it actually is doing through the system? Perhaps similar to how some tube-lovers talk about certain types of distortion being unobjectionable? In fact, back in late 2013, I posted on my impressions with realtime PCM-to-DSD transcoding with JRiver 19 and felt that DSD64 did impart a subtle change to the sound. I wouldn't say that I felt the sonic difference compelled me to convert all my PCM files to DSD for listening, but it was an interesting effect. Maybe that's why some people would prefer a DAC that purposely converts PCM to DSD like the PS Audio DirectStream DAC (the signal is purposely downsampled to DSD128, and then only noise filtered by 80kHz according to this review).

I hope you enjoyed this exploration into the world of DSD (again)... I got a few projects piled up to work on so might not be around as much for the next couple weeks. I'll also be in Boston in the next little while so if anyone has a recommendation on a music store I should check out near downtown, let me know!

I'm also thoroughly enjoying David Byrne's book How Music Works (2012, with 2013 update) - check it out for entertaining reading!

Enjoy the music folks :-).

----------
Addendum:
Note that Adobe Audition renders the PCM data with a linear phase interpolation filter. Here are renderings of some impulse waveforms using Audacity which does not do the fancy interpolation for reference:

24 comments:

Mnyb, 11 April 2015 at 22:06
A comment regarding low-pass filters: the Sony Scarlet Book specifies some sort of low-pass filter for SACD players (DSD64, I assume), so the intention has always been that some of this noise should be removed.

What are we doing when converting to PCM? If we fail to use a filter or don't filter enough, we can end up with playback that includes more ultrasonic noise than the creators of DSD ever intended?

Is it OK to stick with Sony's proposed filter for hardware players when doing DSD to PCM? Or can we do better with current knowledge? I liked your idea about the iZotope filter at 30k; personally I would have used 25k.

I also notice that it can be a bit silly: none or only some of this might yield any audible differences, but sticking to specifications can be a good thing as it gives predictable results.

Evgeniy, 12 April 2015 at 09:10
Hi
Did you test the foobar SACD plug-in in "Direct 30 kHz filter" mode or in "Installable 30 kHz filter" mode?
In my measurements the "Installable 30 kHz filter" is much more precise than "Direct".
The results are independent of the type of calculation (integer, float, double).

Evgeniy, 12 April 2015 at 09:15
(my results)

small comparison of converters.

1) generate test files, RMAA 6.4
24bit 88.2 kHz

2) convert to dsd64 and dsd128 :

software : Saracon
gain: 0db
8th order

3) conversion to pcm 24bit 88.2 :
(all tests in 24 / 88 mode)
3.1) Saracon
gain 0db
tpdf dither

3.2)
foobar old version: 1.1.8 (with foobar new: 1.3.5 - some problems)
sacd plugin: 0.7.3 + filters

4) analyze in RMAA

overall results:

All files were encoded to DSD by Saracon. RMAA sampling mode: 24-bit, 88 kHz.

Columns (decoder used for DSD-to-PCM):
A = Saracon+Saracon
B = Saracon+foobar-old multistage
C = Saracon+foobar-old direct 30khz
D = Saracon+foobar-old I FIR 30khz
E = Saracon+foobar-old I FIR 40khz
(The dsd64 file labels carry the suffixes -p2d-d2p for A and -p2d for B to E.)

dsd64:
Testing chain: Saracon+foobar dsd64 (88 kHz 24-bit)

Test                            A        B        C        D        E
Noise level, dB (A)        -141.9   -140.9   -119.0   -144.9   -144.0
Dynamic range, dB (A)       133.1    132.9    119.7    133.3    133.3
THD, %                     0.0000   0.0000   0.0001   0.0000   0.0000
IMD + Noise, %             0.0002   0.0002   0.0004   0.0002   0.0002
Stereo crosstalk, dB       -140.9   -140.0   -117.9   -145.1   -144.3

Frequency response (from 40 Hz to 15 kHz), dB: -1000.00, +1000.00 in all five columns.

dsd128:
Testing chain: Saracon + foobar-old comparison dsd128 (88-24)

Test                            A        B        C        D        E
Noise level, dB (A)        -142.0   -143.8   -101.2    -95.6    -69.2
Dynamic range, dB (A)       133.1    133.2     98.1     92.5     66.7
THD, %                     0.0000   0.0000   0.0010   0.0016    0.017
IMD + Noise, %             0.0002   0.0002   0.0075    0.012    0.146
Stereo crosstalk, dB       -142.2   -142.0   -114.0    -99.4    -64.8

Frequency response (from 40 Hz to 15 kHz), dB: -1000.00, +1000.00 in all five columns.

full data with graph in archive : https://mega.co.nz/#!AtAiQBJZ!hAO63e...85QU6r_vepFIAU

Bill Lord, 12 April 2015 at 11:31
Cambridge, Just off the Red Line, both near Harvard/MIT.

http://www.theaudiolab.com

http://qaudio.com

Bill Lord, 12 April 2015 at 16:29
I've been following your great analysis! Excellent work!

I request that, should you return to DSD measurements, you use your 24/96 signal, convert it to DSD using Korg's AudioGate 3.XX and burn a DSD Disc (DVD). Place the DSD Disc into a PS3 and measure the HDMI and analog outputs. The PS3 plays DSD Discs but converts to 24/192 PCM on the fly.

This would be excellent for archiving LPs should the performance be acceptable.

Yuri Korzunov, 13 April 2015 at 04:59
Hi Archimago,

Thank you for testing my audio converter AuI ConverteR 48x44.

I want to add some details:

1. The "brick-wall" filter, with its cutoff just under 20 kHz, sidesteps any possible ultrasonic trouble in some hardware and/or playback software (working as a single system with the hardware).

I suppose any hardware performs at its best in the 0 … 20 kHz range.
But higher (ultrasonic) frequencies may be shifted and mirrored into the audible range.

Thus AuI ConverteR 48x44 does "half the work" of the hardware. It can be considered as optimizing the audio stream for the equipment.
It also avoids extra distortion caused by ultrasound in the original (non-converted) stream.
This technology is also used for PCM-to-PCM conversion.

In general, I suppose it should improve the sound even on cheap equipment.

More than once, AuI users have suggested extending the cutoff up to 25 kHz and/or using a less steep filter (where possible).
I plan this for future experiments.

Yuri Korzunov, 13 April 2015 at 04:59
2. About high-frequency noise in some high-resolution recordings
As I assume, and as written in the article http://www.realhd-audio.com/?p=1739,
DXD recordings are not pure PCM. DXD has high-frequency noise which, I assume, is inherited from DSD.

I have received feedback from AuI users about removing this noise by converting with AuI ConverteR without resampling.

I assume that AuI ConverteR 48x44's removal of everything above 20 kHz allows such files to be played back on any software and hardware.

A demonstration of noise removal for a hi-res file is in this video: http://www.youtube.com/watch?v=67_3Qmbq8Y4

After the cut we do not lose transparency, but we get sound without significant noise and reveal weak details that were hidden in the noise.

3. Why AuI ConverteR 48x44 has minimal settings

AuI uses manually tuned resampling filters for each combination of input/output sample rates.
Through measurements, all the conflicting characteristics (steepness, ringing, performance, etc.) are adjusted to an optimal combination of values.
With regard to resampling, only two options are available to the user: turning DSD ON/OFF for unchanged sample rates, and switching between linear and minimum phase filters.

Yuri Korzunov, 13 April 2015 at 05:00
4. AuI ConverteR 48x44's minimum phase filters

AuI uses linear phase filters by default. In the Settings window / General tab it is possible to switch to minimum phase filter mode.
The phase response of AuI's minimum phase filter is very close to linear.

Keep in mind that with a minimum phase filter the pre-ringing artifacts are moved behind the front of the output (converted) signal.
Thus we get twice as much post-ringing energy as with a linear phase filter. The same can be seen in the ringing comparison pictures in the article.
It is all in accordance with conservation of energy :)

AuI's minimum phase filter can be applied to any conversion, PCM as well as DSD.

In the several years since I began offering minimum phase filters, I have not received any unambiguous feedback that anybody prefers them.
I am not speaking only about AuI's minimum phase filters here.
More than once I have received feedback preferring linear phase filters, and not only AuI's.

5. AuI ConverteR 48x44 and DSD128 (5.6 MHz), DSD256 (11.2 MHz), DSD512 (22.5 MHz)

Indeed, DSD64 has a noise/distortion level better than CD's: -120 … 130 dB (varying by DSD encoder/decoder combination).
I compared DSD128 vs. 24-bit PCM. DSD128 has a significantly lower noise level: -177 dB vs. -146 dB for 24-bit PCM.
32-bit float PCM (which I always use for measurements when testing software) is superior to DSD128: -201 vs. -177 dB.
Details here:
http://samplerateconverter.com/content/how-impact-audio-quality-pcm-dsf-conversion-1-bit-dsf-vs-pcm

DSD256 and DSD512 have such low noise that (I assume) it is limited by 32-bit floating-point precision.
For now I do not have a sufficiently precise tool to measure DSD256 and DSD512 correctly and check this hypothesis.

About the level of ringing

In general, I suppose that the ringing artefacts, phase and other distortions from offline conversion of the final master are not comparable in energy with the same kinds of distortion introduced during mixing and post-production, including various effects, infinite impulse response filtering, real-time algorithm simplifications, …

However, we have the possibility of getting an audio stream better optimized for playback on the available equipment, even cheap equipment.

Best regards,
Yuri Korzunov

A Pair Of Eyes, 26 April 2015 at 16:24
Hey, Archimago!

Which Precision setting did you use in foo_sacd on your final experiment (the one where you added the iZotope RX 30kHz filter)?

Steve Elkins, 4 December 2019 at 07:44
I have now used EZ CD Music Converter because its Ultra High precision DSD to PCM gives me the best audio quality. Have you tested it? It is a free download at https://www.poikosoft.com/music-converter

Thanks for the great professional audio blog.