Archimago's Musings: BLIND TEST Results Part 2: "Do digital audio players sound different playing 16/44.1 music?" - Relative objective performance. [And a few words about "Legends"...]

Saturday, 11 May 2019

Last week, I revealed the four CD-resolution / 16/44.1 playback devices I used for this blind test. (By the way, if you want to have a listen to the original 16/44.1 track excerpts used, I added a link as an addendum to the post last week.)

While I'm still doing some counting, calculations, and writing the summary of the data, I think it's best this week to start by having a better look at those devices and seeing what objective results tell us about them so we can hypothesize what we might find when we examine the blind test results.

While it may be controversial to some in the audiophile world, I think we should keep our minds open to the idea that science and technology for digital audio playback have already surpassed the human auditory system. As discussed years ago, the human perceptual and cognitive systems do not have infinite resolution. I'm of course not discounting that the ears and mind have excellent abilities when it comes to appreciating minuscule differences, yet I think we have to remain humble, even if we believe we have "golden ears". Also, even if we don't believe measurements capture everything, it's not unreasonable to accept that the vast majority of what is heard can already be quantified in terms of fidelity to the source. This is why I think we need to explore the objective performance first... Then we can see if the subjective preferences from the respondents line up with expectations. I think for many of the well-respected audio engineers that design our hi-fi gear, this sequence makes sense. Ensure that the measurements are decent first, then verify and tweak with subjective listening.

Over the years, I've measured most of these devices in the blind test separately with different ADCs. For this post, let's run each one through the RME ADI-2 Pro FS and compare objectively using the exact same "measuring stick". For completeness, here's what the measurement chains look like with each player/DAC:

Device A:
Music stored on Firecuda HDD as FLAC, played using Foobar 1.4 with WASAPI driver, ASRock Z77 Extreme 4 motherboard, Intel i7-3770K CPU, 16GB DDR3 RAM, nVidia 1080 GPU, rear motherboard analogue audio output --> 6' phono to RCA cable --> RME ADI-2 Pro FS --> 6' shielded USB --> Windows 10 laptop

Device B:
Music on iPhone 6 as ALAC files, played with iTunes (iOS 12.2) --> 6' phono to RCA cable --> RME ADI-2 Pro FS --> 6' shielded USB --> Windows 10 laptop

Device C:
Music on Windows Server 2016 as FLAC, Intel i7-770K CPU computer, Roon 1.6 software --> gigabit ethernet --> Oppo UDP-205 as Roon endpoint --> 6' XLR cable --> RME ADI-2 Pro FS --> 6' shielded USB --> Windows 10 laptop

Device D:
Music burned on Memorex CD-R at 16X, Sony SCD-CE775 SACD/CD player --> 6' RCA cable --> RME ADI-2 Pro FS --> 6' shielded USB --> Windows 10 laptop

To start, let's look at the impulse response of each device:

Each of these was captured with the ADC sampling at 768kHz. As you can see, we have 2 devices - iPhone 6 and the Oppo UDP-205 - using minimum phase filtering, while the ASRock motherboard and Sony SACD/CD player are using linear phase filters. Based on the amount of ringing, we can predict that the iPhone and Oppo filters are relatively steep, certainly steeper than the ASRock motherboard's.
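
To make the linear versus minimum phase distinction concrete, here is a minimal Python sketch. It does not model any of the four measured devices: `windowed_sinc`, `pre_ringing_fraction`, the 63-tap length, and the damped cosine standing in for a minimum phase response are all illustrative assumptions. The only point is that a linear phase FIR filter has a symmetric impulse response, so some ringing arrives before the main peak, while a minimum phase response puts its peak first and rings only afterwards.

```python
import math

def windowed_sinc(num_taps, cutoff):
    """Hann-windowed sinc low-pass FIR (linear phase).

    cutoff is expressed as a fraction of the sample rate (0.25 = fs/4).
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def is_symmetric(impulse, tol=1e-12):
    """True when the response mirrors around its centre (the linear phase signature)."""
    return all(abs(a - b) <= tol for a, b in zip(impulse, reversed(impulse)))

def pre_ringing_fraction(impulse):
    """Share of the total energy that arrives before the largest sample."""
    peak = max(range(len(impulse)), key=lambda i: abs(impulse[i]))
    total = sum(v * v for v in impulse)
    return sum(v * v for v in impulse[:peak]) / total

# Hypothetical examples, not the filters measured in this post.
linear_phase = windowed_sinc(63, 0.25)
minimum_phase_like = [0.8 ** n * math.cos(0.5 * math.pi * n) for n in range(63)]

print(is_symmetric(linear_phase), pre_ringing_fraction(linear_phase))
print(is_symmetric(minimum_phase_like), pre_ringing_fraction(minimum_phase_like))
```

The symmetric filter reports a non-zero pre-ringing share, while the causal decaying response reports none, which is the visual difference one looks for in an impulse response plot.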

[Thanks to Eric W. aka ezman for sending me the Sony SCD-CE775 service manual. I see that this SACD player is using TI/Burr-Brown DSD1702 DAC chips (3 of these in there for multichannel SACD). A standard delta-sigma architecture that can handle both PCM and DSD dating from 2001. 8x digital oversampling linear phase digital filter.]

The relevance of minimum phase vs. linear phase is in my mind questionable. Unless the frequency response is grossly altered (like for the Pono Player in the past), the blind test here a few years back didn't show significant subjective preference, even with steep minimum phase filter settings that would have altered the phase by quite an amount.

Using RightMark testing, we can get a good sense of some standard device measurements like frequency response, noise level, stereo crosstalk and distortion:

Notice that 16-bit measurements these days tend to yield little variation between decent devices when summarized like this. Perhaps predictably, the ASRock motherboard's measurements do show the least flat frequency response, highest overall noise level, and higher THD and IMD+N distortion results. We see that the stereo crosstalk is highest with the iPhone 6; remember though that -76dB is still very good especially with today's dynamically compressed, noisy masterings in mobile listening situations where ambient noise tends to be high. Besides, I have never come across any real music where this would present as an audible limitation (remember, phono cartridges have something like 20-40dB channel separation in comparison).

We can visualize these numbers better and in greater detail with the graphs...

As you can see, when we look at graphs, we can pick out details not readily apparent with just the single-dimensional numeric summary. For example, we can see that the ASRock MoBo and iPhone 6 have similar frequency roll-off, which isn't a big deal overall. Compare that to the remarkably flat response the Oppo achieves.

What is a much bigger deal is the low frequency noise. It's markedly higher with the motherboard, where a 60Hz hum is much more prominent, peaking at around -104dBFS. The Sony SACD player also demonstrates some 60Hz hum, but 10dB lower with a peak at -114dBFS. The iPhone, being a mobile device not attached to a 60Hz power outlet, is free from hum, and likewise the Oppo's power supply is excellent at rejecting the hum and achieves an impressively low noise level.
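
For a sense of scale, the dBFS figures can be converted to linear amplitude relative to digital full scale. This is a small sketch under the usual convention that 0dBFS corresponds to an amplitude of 1.0; the helper names `dbfs_to_ratio` and `ratio_to_db` are made up for illustration.

```python
import math

def dbfs_to_ratio(level_dbfs):
    """Amplitude relative to digital full scale (0 dBFS = 1.0)."""
    return 10 ** (level_dbfs / 20)

def ratio_to_db(ratio):
    """Inverse conversion: amplitude ratio back to decibels."""
    return 20 * math.log10(ratio)

# Hum peak levels quoted in the text above.
motherboard_hum = dbfs_to_ratio(-104)
sony_hum = dbfs_to_ratio(-114)

print(f"motherboard hum: {motherboard_hum:.2e} of full scale")
print(f"Sony hum:        {sony_hum:.2e} of full scale")
print(f"amplitude ratio: {motherboard_hum / sony_hum:.2f}x")
```

A 10dB gap works out to an amplitude ratio of roughly 3.16, so the motherboard's hum peak is about three times the Sony's in amplitude, while both remain a few millionths of full scale.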

As mentioned above, we see that the iPhone has higher stereo crosstalk than the others but it's consistently flat while the ASRock motherboard varies a bit and has a pattern of increased crosstalk in both the low bass and above 6kHz; at least keeping crosstalk low in the frequencies where hearing is the most sensitive. Finally, the iPhone has a higher IMD+N sweep followed by the Sony SACD player while the motherboard performed quite well from 2.5kHz to 20kHz. The Oppo performed fantastically.

And how about jitter?

I suppose we can say that the Sony SACD/CD player performed the "worst" on the jitter test with symmetrical sidebands which could look like "skirting" on a lower resolution FFT (I'm using a 1M-point FFT in these plots), but notice that the sidebands are only reaching up to -120dBFS and they're congregated very close to the primary signal, hence in reality they would be masked when listening even if the sidebands were at significant amplitude levels. Remember that the J-Test was designed to stimulate jitter sidebands, which means that in reality, most music playback will not even result in such observable anomalies. When we compare these tiny anomalies with say the magnitude of hum with both the ASRock motherboard and Sony SACD/CD player, we get a sense of just how small of an issue this is. Over the last 15 years, I've noticed that reputable devices typically do very well while back in 2001 when the Sony player was released, jitter was worse. The Oppo UDP-205 does have a number of sidebands that are not likely jitter related at very low levels - below the jitter modulation signal down at -130dBFS.
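
The reason a long FFT resolves closely spaced sidebands where a short one would show "skirting" comes down to bin width. The sketch below works through the arithmetic under two assumptions that are not stated in this post: that "1M-point" means 2^20 points, and that the sample rate is 44.1kHz (the actual capture rate used for the J-Test plots is not given). The processing gain figure ignores window effects.

```python
import math

def bin_width_hz(sample_rate_hz, fft_points):
    """Frequency spacing between adjacent FFT bins."""
    return sample_rate_hz / fft_points

def fft_processing_gain_db(fft_points):
    """Roughly how far the per-bin noise floor sits below the total noise
    for a real-valued signal (window effects ignored)."""
    return 10 * math.log10(fft_points / 2)

FFT_POINTS = 2 ** 20       # assumption: "1M-point" means 1,048,576 points
SAMPLE_RATE_HZ = 44_100    # assumption: capture rate is not stated in the post

print(f"bin width:       {bin_width_hz(SAMPLE_RATE_HZ, FFT_POINTS):.4f} Hz")
print(f"processing gain: {fft_processing_gain_db(FFT_POINTS):.1f} dB")
```

With bins a small fraction of a hertz wide, sidebands sitting close to the primary tone land in separate bins, and the plotted noise floor drops well below the device's total noise level, which is worth keeping in mind when reading the dBFS values off these graphs.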

Looking at the J-Test graphs above, one can also appreciate the Oppo UDP-205's extremely low noise floor achieved through the balanced XLR output. This is clearly superior to all the other devices and this allows us to appreciate the low level J-Test jitter-modulation signal (229Hz low level square wave) rising out of "silence" even better.
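
For readers unfamiliar with the J-Test, its two components sit at fixed fractions of the sample rate: a primary tone at Fs/4 and a low level square wave at Fs/192, which is where the 229Hz figure comes from at 44.1kHz. The sketch below computes those frequencies and builds one 192-sample period of a 16-bit version. The exact sample values are an assumption based on my understanding of the commonly described 16-bit pattern, not something taken from the measurements here, and the function names are illustrative.

```python
def jtest_frequencies(sample_rate_hz):
    """Return (primary tone, LSB square wave) frequencies in Hz."""
    return sample_rate_hz / 4, sample_rate_hz / 192

def jtest_period_16bit():
    """One 192-sample period of a 16-bit J-Test style signal (assumed values).

    The 4-sample pattern gives a tone at Fs/4; shifting every sample down
    by one LSB for the second half of the period adds the Fs/192 square wave.
    """
    high_half = [-16384, -16384, 16384, 16384] * 24   # 96 samples
    low_half = [v - 1 for v in high_half]             # same tone, one LSB lower
    return high_half + low_half

tone_hz, square_hz = jtest_frequencies(44_100)
print(tone_hz, square_hz)
print(len(jtest_period_16bit()))
```

At 44.1kHz this gives a primary tone at 11025Hz with the square wave at 229.6875Hz, so jitter-induced sidebands would be expected at multiples of that spacing around the main tone.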

Considering the various factors, I think audiophiles would probably suspect that from "best" to "worst", supported by objective results, the sequence would be something like this:

Device C - Oppo UDP-205 XLR out - also remember, no digital volume adjustment of the recorded audio for level matching

Device B - Apple iPhone 6 headphone out - low noise but higher stereo crosstalk and distortions

Device D - Sony SCD-CE775 SACD/CD player RCA out - 60Hz hum @ -114dBFS, little more jitter, middle of the road distortion levels

Device A - ASRock Z77 motherboard, i7 CPU, nVidia GPU, switching power supply, headphone/phono out - nastiest 60Hz hum @ -104dBFS with harmonics, least flat frequency response, higher THD

Depending on the emphasis put on the objective parameters, I think we could say that the iPhone and Sony SACD/CD player perform "in the middle" of the pack. The performance difference between "best" and "worst" (Oppo vs. ASRock motherboard) is objectively quite obvious!

So then, now that we know this, next time let's dig into the results from the 101 respondents. Is there a correlation between what is measured and what is heard by the group? Can we even assume that "better" objective results as discussed here will correlate to sonic preferences?
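
One simple way to put a number on "correlation between what is measured and what is heard" is a rank correlation between the objective ordering above and whatever ordering the listeners produce. The sketch below uses Spearman's rho (the no-ties formula) and deliberately contains no listener data: it only compares the objective ranking against itself and against its exact reverse to show the two extremes. The function name and dictionary layout are illustrative assumptions, and this is not necessarily the analysis used for the actual results.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two rankings of the same items (no ties)."""
    n = len(rank_a)
    d_squared = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Objective "best to worst" ordering from this post: C, B, D, A.
objective = {"C": 1, "B": 2, "D": 3, "A": 4}

# Placeholder rankings to show the range of the statistic, not survey results.
identical = dict(objective)
reversed_order = {"C": 4, "B": 3, "D": 2, "A": 1}

print(spearman_rho(objective, identical))       # perfect agreement
print(spearman_rho(objective, reversed_order))  # perfect disagreement
```

With only four devices, rho can take just a handful of values, so even a high figure would need to be read cautiously alongside the per-listener data.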

We'll see :-).

For now, what I can say is that with a number of demographic variables collected and a number of subgroups to analyze, it'll take me a good chunk of time this next week to assemble the results... Stay tuned.

----------------------------

To end off, let's see what's in the news this week...

Munich High End has started so I'm sure we'll see a slew of product announcements and show reports. As usual, impressions of sound quality would be hard to pin down given the showroom limitations. It will be interesting to see if there are any exciting new developments beyond specific products on the horizon. Based on observations from previous years, after Munich, there's usually a bit of a slowdown as we get through the late spring and summer months, with the news cycle picking up again in early September with Rocky Mountain Audio Fest (Sept 6-8 this year) for us North Americans at least.

Another interesting post I saw this week was the interview with the "legendary" Ted Smith of PS Audio on AudioStream. While I think there are some who have achieved legendary status in hi-fi audio - I particularly think of all those who have contributed technically to hi-fi and the interesting people who have given of their thoughts, time, circuit and speaker designs - we can all consider for ourselves if Ted Smith fits our definition. While the word "legend" is OK I think, I do strongly feel that there are no gods, popes, imams, muftis, mullahs, ayatollahs, rabbis, ministers, monks or high priests in audio.

I find it fascinating that just like in 2014, Mr. Smith continues to harp on the harms and audibility of jitter. Let me remind everyone that I am not a "believer" that jitter has been a big deal over the last decade plus. As you can see above with the J-Tests, even with inexpensive motherboard audio output, there's no good indication of anything to worry about. Have a listen to some samples of simulated jitter from late last year if you haven't already.

I suspect that Mr. Smith is simply "talking his book" about why he thinks his FPGA-based, upsampled-to-DSD DACs are great (don't get me wrong, the stuff sounds good when I've heard them at the dealers). He has the right to sell his products and tell interesting back stories about his life and how his designs impressed Paul McGowan... (Every "legend" and "hero" needs a good origin story!) But when it comes to his assertions that improvements in sound are due to jitter suppression, I wish he would just stop talking. Hasn't he said enough? As an engineer, for the sake of not (perhaps) inadvertently sowing fear, uncertainty and doubt, why don't you just show us what jitter you're dealing with? (I am of course assuming that FUD isn't the intended marketing tactic being used here.) Why don't you just put it into context what level of distortion you're correcting? Since jitter is a temporal phenomenon, what marker of temporal anomaly are you using to fine-tune the engineering beyond claims that "the bass may not feel right" when jitter is high or "(people) start listening" with lower jitter as per the interview?

[Notice his bizarre story around 24:40 where he spoke of hearing high and low jitter CDs at some demo that he doesn't seem to understand and then just runs with it to perpetuate an unsubstantiated, unconfirmed, unsourced myth... I personally don't think this is befitting of his "legendary" status. Also, observe how the interviewer doesn't produce any challenging questions around these claims! Why's that?]

Sure, the name "Obsidian" for the new DAC is cool - hard like a strong foundation, dark, mysterious, very masculine - but for US$20-25,000, regardless of exotic components, how big and heavy the boxes are, how "uncompromising", or how the thing looks, can he just show us where and how much the performance improved? Specifically, where is the improvement in jitter he's been talking about all this time, especially compared to his own DirectStream DAC priced around US$6-7,000? Notice the objective limitations of the PerfectWave DirectStream DAC as measured by Stereophile. It's too bad they didn't check the J-Test when the firmware was upgraded in 2015 (we can maybe guess why). And we have not seen any further measurements since with the various firmware/FPGA upgrades. Mr. Smith, please show us some facts next time.

A shout out to an interesting cable company in Australia - NB ("No BS") Speaker Cables. Be careful with the Snake Oil, and maybe someone can let us know how the Burn In worked out. Love the honesty and humor, guys. Good luck with the business!

Wishing you all happy listening and a great week ahead!

** Part III: Listener Results posted **

13 comments:

NB Speaker Cables, 11 May 2019 at 17:56
Quite the spike in traffic this morning! Traced back to this blog :). Thanks for the shout out mate. You've got some solid content in here. It's great to see someone taking the time to take measurements, analyse them and share/explain them.

Flick us an email at sales@nbspeakercables.com.au and we'd be happy to send you out a free pair of cables of your choosing! We can even apply some snake oil and burn them in for you too ;).

Thanks again!

ezman, 12 May 2019 at 13:15
You wrote,

> The relevance of minimum phase vs. linear phase is in my mind questionable.

One of the major bits of hoopla that vendors have been using to distinguish their products in the last decade is the type(s) of digital filters in their product. I've been completely ignoring this.

My mind was recently changed about the audibility of minimum or linear phase filters, but not the relevance. I bought a UDP-205 and a yggdrasil, to find the best one to replace my ailing DAC1PRE. This gave me the chance to play with the seven or so different filters supported by the ESS ES9038PRO chip in the Oppo. I *could* differentiate between them, just as I could the devices in this test. But my preference varied from recording to recording. I tended to prefer the linear phase filters, but not often enough for it to be statistically significant.

I hypothesize that some recordings sounded better with one or the other because the signatures or artifacts of the filters or other processing used during the recording process are either exaggerated or complemented by the playback filter. This makes the ultimate playback effect of a filter recording-dependent. It's even more muddled since each stem in the recording may have different artifacts than the others. This is like the problem of absolute polarity: you can't always correct for a recording with the "wrong" polarity because some instruments may have correct polarity while others are reversed.

Filters have a very minor effect. Audible, yes. Consistent and relevant, no.

OTOH, I rated the Sony SQ best, with the Oppo next best. Could be the Sony's linear phase filter. :)

> Unless the frequency response is grossly altered

I'm also puzzled by the audibility of the different filter types, because the passband ripple is typically minuscule at the cutoff frequency, and less further down into the passband. I found some graphs of the FR of the filters in the ESS chip and the worst ripple was under 0.1dB at cutoff, a frequency (~20kHz) few of us can hear, let alone being able to discriminate a FR anomaly this small.

Anonymous, 13 May 2019 at 12:51
Thanks for the measurements. Apart from power supply spikes and probably GPU pollution, this particular mainboard with ALC898 seems really better than ALC892, especially IMD and jitter. With ALC898 those 250Hz square wave spikes are clearly visible in the 16-bit J-Test rather than blended with the elevated noise floor in ALC892.

Also interesting to know the Sony SACD player uses TI chips. Your Teac has a newer TI chip but there is no intersample headroom at all! From your old blog entry it clearly shows that the Sony has some headroom. That means Sony, as one of the designers of the CDDA format, cared about this phenomenon from the very beginning.

About smartphones, one potential issue is output level. I have a Xiaomi and it only has 0.3Vrms max output, a Samsung with 0.5Vrms, and a Nokia Windows phone with 0.6Vrms.

https://www.phonearena.com/phones/HTC-One-M8_id8242/benchmarks
Click the "AUDIO OUTPUT" section and use the "Add phone" drop-down list to search through their database and see the output voltage of different phones.

Allan Folz, 15 May 2019 at 08:47
The proof will be in the survey results, of course, but to my mind crosstalk and distortion should be of higher importance than noise and vastly higher than jitter.

Apple is able to skate on its use of battery power for noise, but nevertheless I feel the highest crosstalk and distortion numbers should put it at the bottom of the objective list.

Between the ASRock MB and the Sony, I'll call a tie until the survey results come in. That mains hum is ugly relative to the competition, but in absolute terms, -105dB should be completely inaudible, unless I've gotten confused on scales. The distortion products may be enough to bump the ASRock above the Sony. OTOH, cross-talk is higher across the spectrum. These differences actually make for a pretty interesting test case.

Cheers.