Last week in Part 2, we reviewed the objective measurements of the 4 devices. I hope the readership recognizes the importance of doing this to set the context for what we're looking at this time as we dive into the results from the blind test respondents. As with many things in life, it is only once we have the facts at our disposal that we can make comparisons and develop ideas on that foundation of knowledge.
Part I: Demographics of the Respondents
Looking into the data set, the "high end" at >US$100,000 consisted of a system based on the Benchmark DAC3 (~US$2000), Meridian 818v3 "Audio Core" (~US$16,000) driving Meridian DSP8000SE (>US$60k) active flagship speakers. Furthermore, this tester used a Roon Nucleus server and microRendu streamer. He also listened with the Benchmark's HPA4 (~US$3,000) headphone amp with Audeze LCD-4z headphones (~US$4,000). Nice!
An example of the "low end" of the price range is a respondent who used a FiiO X1 (capable of up to 24/192) and Sennheiser PX 100-II, and another respondent with what looks like a similar type of source (not specified) with Audio-Technica headphones.
In between we have all kinds of gear. Here's most of what the respondents used (somewhat in the order of the submission, apologies if I missed a few devices here and there):
Speakers: ATC SCM40A, Joseph Audio Pulsar, Dynaudio 62, Linkwitz Orion active, Piega Coax 30.2, DIY transmission line, Tannoy DC10ti, Omega Alnico, PMC IB2, Infinity Renaissance 90, Elac Debut B4, Definitive Studio SM45, Magnepan MMG, Impulse Model 24, Totem 100, PSB Alpha PS1 + SubSeries 100, Elac FS407, PMC Fact.8, Harbeth M40, Vivid Giya G3, ProAc D40/R, Amphion Krypton3, Dali Zensor 1, Linkwitz LX-Mini, Quad ESL 63, JBL LSR305, Focal Chorus 726, Rogers LS3/5A, Linkwitz LXmini+2, Harbeth P3ESR, KEF LS50, Magico S3 MkII, KEF Blade, Ino Audio piP, Dynaudio Special Forty, System Audio Pandion 20, JK Acoustics Optima IV, Zaph ZRT 2.5, Wavecor Facette, Goldenear Triton 5, Genelec 7070A sub, Amphion One18, Martin Logan Vista + SVS PC-Ultra subs, Kreisel Sound Quattro Cinema, Focal Electra 1028 Be, AVI DM12, Martin Logan ESL, Usher Dancer Mini 2, Magico V3, Verity Sarastro II, Paradigm Persona 5F
Headphones: 1More Quad IEM, AKG K701, Sennheiser HD650, Stax SR202, Ultimate Ears Reference IEM, Beyerdynamic DT1350, Oppo PM3, Audio-Technica ATH-M50x, B&W P3, Klipsch Heritage HP-3, Bose Quiet Comfort 2, Denon AH-D600, AKG K271, Sennheiser HD280Pro, MEE Pinnacle P1, AKG K518LE, FiiO F5, Sony MDR-Z900, Sennheiser HD800, AKG K702, NAD Viso HP50, MrSpeakers Aeon, Etymotic HF5, Sennheiser HD380, HifiMan HE-400i, Sennheiser Momentum 2, LZ Audio A4, Beyerdynamic DT 1350 Pro, FiiO F9 Pro, Beyerdynamic DT 770 Pro, Mad Dog modified Fostex T50rp, Beyerdynamic DT 1990 Pro, Sonus Faber Pryma, Shure SRH440, Sennheiser HD700
DACs: RME ADI-2 DAC, Oppo BDP-105D, Monarchy M24, Allo DigiOne with Pi, Oppo UDP-205, Aune X1S, Denafrips Terminator, TEAC UD-503, Schiit Yggdrasil, BlueSound Node2, Audio-GD NFB 11.28, iFi nano iOne, Mytek Liberty, Yamaha WXC-50, Linn Akurate DSM, Benchmark DAC2 HCG, Objective DAC, Lynx Hilo, Topping D50, Holo Audio Spring DAC L2, RME ADI-2 Pro FS, Light Harmonic Vi DAC, Slimdevices/Logitech Transporter, Cambridge Azur 851D, Tascam US-2x2, Berkeley Alpha DAC 2, T+A DAC 8
Amps: Devialet 120 Expert, ATI 6012, Pass Aleph 3, TACT, NAD M32 with Bluesound MDC module, Bryston 4B SST2, Bow Technologies Walrus, Parasound A23, Yamaha RX-V2092, "vintage" Classe, Cambridge Audio CXR120, Cyrus 8 XPd, Marantz PM6006, Pioneer SC-LX79, Devialet 250 Pro, Devialet D440 Expert Pro, Benchmark AHB-2, Cambridge A1, B&W AV5000, Quad 606, Wadia 151 PowerDAC, Pioneer Elite VSX-30, Hypex UcD180 modules, Peachtree nova150, Gainclone amp, Hypex NC252MP module, XTZ A2-300, Devialet Expert 200, Devialet Expert 1000Pro, Simaudio Moon 240i, NAD C390DD2, JK Acoustics Active 65, NAD M22 V2, Denon X7200, Q-Watt DYI, Red Dragon S500, DA&T A38, Ayre MX-R Twenty, Coincident Dragon 211P
Headphone Amps/DAPs: Oppo HA-2, Stax SRM212, Geek Out 1000, Topping NX4 DSD, Naim DAC-V1, iFi nano DSD, Dragonfly V1.2, Behringer U-Phoria UMC204HD, Schiit Fulla 2, SMSL iDEA, iFi xDSD, Chord Hugo, Pono Player, FiiO Q1MkII, iFi iDAC, Focusrite Scarlett, FiiO X1, Schiit Jotenheim, Chord Mojo, Sennheiser HDV 820
Preamps/Others: MiniDSP DDRC-22D, Lyngdorf DPA-1 preamp, Bow Technologies Warlock, miniDSP 4x10HD, Quad 34, Hypex DLCP, Daphile music server, Pink Faun 2.16 streamer, Chromecast Audio digital out, RME DigiFace, iFi iSilencer 3.0, JK Acoustics Reference PreAmp, Marantz AV 8003, iFi iTube2, Ayre KX-R
Whew! What a list... As you can see, some of the devices are combined DACs/pre-amps/amps, so I just listed them in the most relevant category based on the system description. Notice that the list includes quite a variety of gear ranging from vintage to modern HiFi, commercial products and DIY projects, well-known and esoteric brands... Most importantly, I think this is a nice cross-section of the devices "real people" in the audiophile world use, at least the guys (and gal) who are interested in audiophile technical discussions and participate in testing on a blog like this!
Going through each response and reviewing the equipment lists allowed me to check that the entries were complete and that the gear looked reasonable for the hi-res (24/96 playback) demands of this blind test. This also gave me a sense of the lengths many went to in performing the listening test as well as the caliber of the audio systems used! Some of you are clearly adept in DIY audio, constructing devices from Hypex amp modules and Linkwitz designs, and I see custom speakers with Scan-Speak drivers and such. Some described their treated and custom sound rooms. Some tried listening with and without DSP room correction. Some of you used ABX testing and other blind-test software. The impression I get is that overwhelmingly this is a group of audiophiles who know what they're doing, regardless of the price tag listed. Thank you for doing the "work"!
Part II: Was it easy to hear a difference? What device(s) did the listeners prefer?
A. ALL Respondents (n=101)
Was it easy to hear a difference between the devices?
I think it's clear from the graphs above that for most listeners, the audible differences between the devices were small; this was not an "easy" test. Keep this in mind as we go through all the subgroups! The first graph asked whether respondents could subjectively quantify the difference between what was thought to be the "best" device and the "worst". As you can see, about 20% thought there was either a "huge" or "big" difference, while 80% thought the difference was at best "small". Within that 80%, the majority (58% of all respondents) felt either that the difference wasn't worth spending money on an upgrade or that they heard no difference at all.
An even harder question was whether the "best" and "second best" devices sounded different, as shown in the second graph. The reason I asked this was that the price difference, use of XLR cabling, and the lack of post-digitization volume correction with the Oppo UDP-205 test samples all in theory might have separated the Oppo from the rest of the devices, thus potentially creating a significant gap between the "best" and "second best" sounding devices. Hypothetically, if this happened, then this second question's results could have been similar to the first graph; those who thought the difference was "huge" or "big" might have detected that most of the difference was between the Oppo and everything else. As you can see, there did not appear to be any special ability to differentiate between "best" and "second best". Only 3% thought there was a "big" difference. Only 14% thought the difference was small but worth spending money on to achieve an upgrade. And a large majority, 84% of respondents, thought either that there was no noticeable difference or that any difference present was "not worth money to upgrade".
So what devices did respondents prefer?
Without any filtering of responses or looking at sub-samples in the 101 respondents, this was what the result looked like:
This graph is the average score if we were to assign the number 1 as "best" and 4 as "worst" for each device "voted" on by the respondents. Therefore a lower score means that on average more listeners ranked the device as sounding "better". Notice that Device B (iPhone 6) on the whole scored best followed by Device A (ASRock motherboard)!
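The scoring method is simple enough to sketch in a few lines of Python. Note that the ballots below are made-up examples for illustration only, not the actual survey data:

```python
# Each respondent assigns rank 1 ("best") through 4 ("worst") to
# devices A-D; a device's score is its mean rank across respondents.
# A lower average score means the device was ranked "better" overall.
# These ballots are hypothetical examples, not the real responses.

ballots = [
    {"A": 2, "B": 1, "C": 3, "D": 4},
    {"A": 4, "B": 2, "C": 1, "D": 3},
    {"A": 3, "B": 1, "C": 4, "D": 2},
]

def average_ranks(ballots):
    devices = sorted(ballots[0])
    return {d: sum(b[d] for b in ballots) / len(ballots) for d in devices}

print(average_ranks(ballots))
```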
Let's not get too excited about this and crown Apple the winner just yet :-). Remember that often we do need to look deeper into the numbers to discern what's actually going on... Instead of just averaging things out, let's actually count the number of votes and look at the preference pattern for each device:
Isn't that interesting? For each device, the largest number of "votes" was in the order of presentation! Many respondents, specifically the ones who thought there was "no noticeable difference" simply voted "A-B-C-D" to create this pattern. In fact, for those who thought there was "no noticeable difference", the "A-B-C-D" pattern of response from best to worst accounted for almost 80% of those votes. This is basically "noise" that needs to be filtered out if we are to hopefully understand the true preferences of those who felt they could hear a difference.
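The filtering step itself is trivial; here's a hypothetical sketch (again, the ballot data is made up for illustration, not taken from the actual responses):

```python
# Drop respondents who reported "no noticeable difference", since
# ~80% of that group defaulted to the presentation-order "A-B-C-D"
# vote, which just adds noise to the preference tallies.
# These ballots are made-up examples.

ballots = [
    {"heard_difference": False, "ranking": ["A", "B", "C", "D"]},
    {"heard_difference": False, "ranking": ["A", "B", "C", "D"]},
    {"heard_difference": True,  "ranking": ["C", "D", "B", "A"]},
    {"heard_difference": True,  "ranking": ["B", "C", "D", "A"]},
]

# How many of the "no difference" group simply voted in order?
default_votes = sum(
    b["ranking"] == ["A", "B", "C", "D"]
    for b in ballots if not b["heard_difference"]
)

# Keep only those who felt they heard a difference.
kept = [b for b in ballots if b["heard_difference"]]
print(default_votes, len(kept))
```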
B. Respondents who reported hearing a difference (n=73)
If we filter out the "no noticeable difference" group (80% of whom simply voted "A-B-C-D" as mentioned above), the total number of respondents to analyze goes down to 73, and here are the average scores:
The average scores can then be expanded in the same way as above to show the preference patterns for each device. To keep it simple, if we assume it's all random, then with 73 raters distributed 4 ways (best to worst) for each device, we would expect an average of 73/4 = 18.25 "votes" at each level of preference. Statistically, we can run a simple chi-square (χ²) test with 3 degrees of freedom (4 ranks per device, minus 1) and compare the respondent preferences against the "null hypothesis" of a purely random distribution.
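For those who want to try this at home, here's a small self-contained sketch of the test. The vote counts below are hypothetical stand-ins (the real tallies are in the graphs); for 3 degrees of freedom the chi-square survival function has a closed form, so no statistics library is needed:

```python
import math

def chi2_sf_df3(x):
    # Survival function (p-value) of the chi-square distribution with
    # 3 degrees of freedom; closed form avoids needing scipy.
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def chi2_uniform(observed):
    # Chi-square statistic against a uniform "random voting" null.
    expected = sum(observed) / len(observed)
    stat = sum((o - expected) ** 2 / expected for o in observed)
    return stat, chi2_sf_df3(stat)

# Hypothetical best..worst vote counts for one device (73 respondents,
# so the null expects 18.25 per rank) -- not the actual survey data.
votes = [10, 14, 19, 30]
stat, p = chi2_uniform(votes)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```

A skew this size toward "worst" would land comfortably under the p < 0.05 threshold, which is the kind of pattern seen for Device A.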
As you can see, based on the usual p<0.05 threshold of significance, the pattern of preference shown for Device A (the ASRock motherboard) is indeed significant! What it suggests is that the blind test respondents ranked this device as "worst" to a statistically significant degree.
In comparison, none of the other devices had a pattern that significantly deviated from the random "null hypothesis". However if you examine the distributions, we see that both the Oppo and Sony SACD players had fewer people ranking them as the "worst" sounding devices which is why on average they scored better than the iPhone 6. In this group of 73, there did appear to be a trend with preference for the sound of the old Sony SCD-CE775 player but the pattern was not statistically significant.
C. What can we say about other subgroups?
Did the older groups (41+ years) compared to younger age groups (<41 years) have different preferences?
If we look again at the complete data set of 101 respondents and pick out just the "younger" folks <41 years old, here's what their demographics look like (there were only 24 respondents in this "younger" age group):
We can compare this to the "older" group of 41+ who were more numerous (n=77):
Although the sample size for the "younger" testers is smaller, the results do support the impressions we might have as audiophiles that the younger folks are more inclined to be listening with headphones, generally have less expensive systems (many in this age group used systems in the US$200-500 range), and interestingly felt the Joe Satriani "Crowd Chant" (more "modern" production, lower DR sound from 2006) was more resolving of differences if they had to choose one of the tracks.
For the "older" age groups 41+ (I would be included in this category), "we" tended to listen to our music through speakers, had more pricey sound systems (largest number of systems in the US$1000-2000 range), and more respondents thought the Maxi Priest "Wild World" track from 1987 provided better sonic differentiation. The "older" age group thought the least of the Joe Satriani track as audibly different between devices.
As for the magnitude of audible difference heard, in both the younger and older subgroups ~60% of respondents said the difference between the "best" and "worst" sounding devices was either unnoticeable or too small to justify spending money on an upgrade.
As above, if we now filter out the listeners who were unable to hear a difference (and their tendency to vote "A-B-C-D"), how did the younger and older subgroups rank the devices?
While both the younger and older subgroups were able to rank Device A (ASRock computer motherboard) as lowest quality, the older subgroup as a whole ranked Device C (Oppo UDP-205) as being "best" followed by the Sony SACD player and then the iPhone 6!
What's interesting is that the younger group ranked the iPhone 6 as being tied as "best" with the Sony player. In particular, it's the "30 somethings" who selected the iPhone as "best". Perhaps it's tempting to think that younger folks are more "used" to the sound of ubiquitous devices like our cell phones? With only 18 "younger" respondents who thought there was a difference in sound, the number is too small for statistical significance. A finding to keep in mind though for future consideration and further testing perhaps.
As above, we can look deeper at the 41+ "older" subgroup with 55 respondents and examine their preferences in greater detail:
Despite the reduction in total respondents from 73 to 55, the motherboard's pattern of being rated "worst" came very close to the p<0.05 cutoff (p=0.055). In this subsample, the Oppo UDP-205 did quite well, with most respondents consistently ranking it "best" or "second best" and few thinking it sounded "worst".
How did the musicians, audio engineers, and audio reviewers/writers rank the devices?
Remember that I had asked the respondents whether they had musical performance experience, had formal training in audio engineering / sound evaluation, and whether they published audio equipment reviews. It's complicated in that there is some overlap between these groups, as you might imagine; about half of the audio engineers and audio equipment reviewers also had musical performance backgrounds. Let's keep this simple and examine each subgroup separately despite the understood overlap.
Once we filtered out the "no noticeable difference" group, here's how they ranked the devices:
Not bad. All 3 subgroups were able to rank Device A (ASRock motherboard) as the "worst", again consistent with objective testing expectations. I am impressed that the "musicians" as a subgroup had a particularly strong tendency to vote the motherboard as sounding "worst"! Here's their breakdown of preferences:
Despite the smaller number, it's obvious that the musicians had a strong distaste for the ASRock motherboard, to the point where the result is clearly statistically significant!
For all 3 subgroups consisting of listeners with extra expertise in audio and music, both the Oppo UDP-205 and Sony SACD player again beat out the iPhone 6 and took turns at being "best" sounding.
Remember though that this is after excluding those who could not hear a difference. Of all the engineers/trained listeners, 85% felt they could hear a difference of any magnitude. Of the musicians, 66% thought there was an audible difference. And finally, of the equipment reviewers, 64% felt there was a difference.
I wondered which music track each subgroup thought was best to hear a difference with (when they reported hearing a difference of course):
Whereas the audio equipment reviewers and audio engineers were more agnostic as to which track was best, the musicians focused on the ones with more dynamic range, especially the Stephen Layton & Britten Sinfonia "Handel Messiah" track (DR15). Again, an interesting contrast with the "younger" <41 y.o. group above who used the Joe Satriani track (DR9) more and picked the iPhone 6 as equivalent to the Sony as the "best" sounding device.
Let's examine one more subgroup...
How did those with more expensive audio systems fare? Let's focus on the groups using >US$10,000 worth of gear...
Remember that we have multiple overlapping variables here. Keep in mind the age correlation for example:
As expected, those with "higher end", more expensive systems are generally older and in this test, there was a large cohort in the 51-60 age group using US$10k+ of equipment for this test. This group was 100% male. While not shown here, 84% of the respondents in this group used speakers, while 16% used both speakers and headphones. It would be rather unusual to have just a headphone system cost over $10k although given today's prices, far from impossible!
Within this group, 28% could not hear a difference and another 24% thought any difference was so slight that there was no point paying to upgrade. So even with rather expensive sound systems, 52% were not reporting significant differences in the sound. Also, notice that none of those using >US$10,000 worth of gear for this test thought the difference was "huge".
If we now take out the 28% who reported not hearing a difference, this is the device preference ranking (n=18):
|Remember: lower average number means ranked as "better".|
This ranking is similar to the 41+ "older" group (again, remember the overlap between age and price of sound system). Though subtle, this "more expensive system" subgroup's average score for the Oppo was slightly lower ("better") and the ASRock motherboard's slightly higher ("worse") than in the 41+ age group; in other words, the "spread" widened. While I would not be able to make a case for statistical significance, this is at least consistent with the idea that better equipment might provide better resolution to make differences more noticeable. Alternatively, maybe the "serious" audiophiles who buy expensive equipment were more attentive or more capable of teasing out the sonic differences.
Part III: Discussions
What are the take-home messages from this blind test?
1. I think these 101 respondents provided an interesting demographic glimpse into what I suspect are a rather "serious" bunch of audiophiles ;-).
I don't think it's too bold to suggest that audiophiles/music lovers reading this blog are likely better educated about computer technology for handling the downloaded files and digital playback than the average audiophile. So while some demographic factors might be skewed, I think we're seeing the results from a discerning group of respondents interested in high-quality audio reproduction. Other than a small number of submissions that had to be removed or duplicate submissions corrected, the respondents answered all questions completely and appropriately.
As a publicly distributed test, there are limitations due to the lack of controls one might have in a formal "lab" situation. We can't be sure the hardware was set up properly or is adequate for 24/96 playback, can't confirm that drivers are bit-perfect or that the files transferred without error, can't check the auditory acuity of the listeners, and can't rule out software inadvertently altering the sound... However, respondents listening in the comfort of their own homes and using their own equipment allowed for a level of familiarity and intimacy one would not be able to replicate in an artificial test site. Despite the limitations, I think there's value in this kind of "naturalistic" distributed blind test which can give us an idea of how things sound "in the wild" with "real audiophiles".
2. On the whole, despite the expected disparity in sound quality between a computer motherboard, Apple iPhone 6 headphone output, Oppo UDP-205 (with current "flagship" ES9038Pro DAC) XLR output, and a Sony SACD player with RCA out, the results suggest that audible differences from 16/44.1 playback are not easily heard.
About 60% of audiophiles in this sample of 101 either did not report hearing a difference or did not think the difference was worth spending money on an upgrade for. Only 20% thought there was a large difference.
I referred to the Steve Hoffman Forum poll in Part 1 which started me thinking about doing this blind test. Clearly from the results I've presented here, the idea of "CD players" or more generally 16/44.1 digital players having (audibly) significant "sonic signatures" is not a simple answer that can be spoken of in a binary "yes" or "no" fashion. Objectively, from last week's post we can easily see that there indeed is a different "signature" to each device with various noise levels, distortion amounts, crosstalk differences, jitter, etc... But by the time we ask people to listen, it's no longer so obvious with factors like nonuniform hearing acuity, age, experience, musical preference and equipment used influencing the final result.
3. Despite small differences reported by many, the data did suggest a significant ability for those who heard a difference to identify the computer motherboard as sounding "worst"; this is consistent with the objective measurements.
However, the Apple iPhone 6, Oppo UDP-205, and Sony SCD-CE775 SACD players were essentially equivalent with no statistical advantage between them although there were some trends depending on the subgroups. This suggests that there is such a thing as a point of diminishing returns, a "threshold" in quality (and probably price) beyond which it's unlikely that listeners would be able to differentiate between reasonably competent devices.
Where exactly that "threshold" lies is of course up for debate and would depend on the listener him/herself, the quality of the equipment, and perhaps listening experience as suggested by the subgroups. Looking at the objective results from last week, my feeling is that the most likely audible difference is the relatively poor noise floor of the ASRock Z77 Extreme4 motherboard with that CPU / GPU / power supply combination; in particular, the 60Hz hum with its various harmonics and other low frequency noise. If indeed it is the hum, this listening test suggests that the threshold of audibility as a group effect for those who reported hearing a difference is somewhere below -104dBFS and perhaps above the -114dBFS of the Sony SACD player (since many ranked the Sony highly in sound quality). Noise level must also be referenced to the music used. Remember that with more dynamic music, typically with lower average amplitude, one will need to increase the playback volume when listening. This could be why subgroups that focused on more dynamic music (eg. the musicians who preferred the Handel Messiah track with average RMS volume of -20.5dB and DR15) were able to tease out the imperfections of the ASRock motherboard remarkably well. While ideally the noise floor should be as low as possible and hum should be absent, it's good to have these numbers in mind as reference points when doing objective testing.
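The arithmetic behind this point can be sketched quickly. The -20.5dBFS and -104dBFS figures come from the discussion above; the -9dBFS "loud modern master" is an illustrative assumption of mine, and treating the margin as a simple dB difference deliberately ignores playback gain and loudness weighting:

```python
# Back-of-the-envelope sketch: when tracks are level-matched to a
# common loudness, a track with a lower average (RMS) level gets
# played back louder, which raises a fixed noise floor along with it.
# So a dynamic track leaves a smaller gap between the music's average
# level and the hum, making the hum relatively easier to hear.

def noise_margin_db(noise_floor_dbfs, music_rms_dbfs):
    # dB gap between the music's average level and the noise floor;
    # a smaller gap means the noise sits closer to the music.
    return music_rms_dbfs - noise_floor_dbfs

hum_dbfs = -104.0  # ASRock motherboard hum level (from the measurements)

dynamic = noise_margin_db(hum_dbfs, -20.5)  # Handel Messiah, DR15
loud = noise_margin_db(hum_dbfs, -9.0)      # hypothetical loud master
print(dynamic, loud)  # 83.5 vs 95.0 dB: the dynamic track leaves less room
```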
Remember that the Realtek ALC898 DAC on the ASRock motherboard is one of Realtek's better audio chips so this suggests that less expensive sound chips in budget motherboards like the ALC892 might perform worse if put into a blind test like this. As you can see in last week's comments, it is quite possible that my nVidia GTX 1080 GPU card is a major source of the poor noise floor. On paper at least, newer solutions like the ALC1150 should sound better with lower noise assuming that its other technical qualities are good.
As a reminder, note that "statistical significance" just means that with a large enough number of trials, we can detect that the "roll of the dice" appears to be weighted unevenly; specifically in this blind test, against the ASRock motherboard to a degree that approaches or surpasses a typical 95% confidence (p value of 0.05). Remember this does not imply that any one person necessarily should or even could hear the difference in the same pattern. Also, it doesn't imply that the difference is strong! As sample size increases, the "power" of the study improves at picking out subtle differences as a group effect.
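The sample-size point can be illustrated with a quick Monte Carlo sketch. The bias probability, cutoff, and sample sizes below are my own illustrative assumptions, not values estimated from the survey:

```python
# A weak group-level bias against one device -- one that no individual
# listener reliably hears -- becomes easier to detect as the number of
# listeners grows. Each simulated listener ranks the "biased" device
# "worst" with probability 0.32 (pure chance would be 0.25); we count
# how often the excess of "worst" votes clears a rough one-sided
# z > 1.645 cutoff (~p < 0.05).
import random

def simulate_detection(n_listeners, bias=0.32, trials=2000, seed=1):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        worst_votes = sum(rng.random() < bias for _ in range(n_listeners))
        mean = 0.25 * n_listeners
        sd = (0.25 * 0.75 * n_listeners) ** 0.5
        if (worst_votes - mean) / sd > 1.645:
            hits += 1
    return hits / trials  # fraction of simulated studies reaching "significance"

for n in (20, 75, 300):
    print(n, simulate_detection(n))
```

The detection rate climbs steadily with the number of listeners even though each individual's "edge" stays the same, which is exactly the statistical-power effect described above.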
Always remember this issue of actual perceptible magnitude when you look at studies reporting statistical significance and especially when conclusions are drawn using large sample sets such as meta-analyses (consider this one from 2016 combining all kinds of studies looking at hi-res audio).
4. It's interesting that the "younger" audiophiles (<41 y.o.), who were also more likely to be using headphones, ranked the Apple iPhone 6 higher. Although the sample is small, this was the only subgroup that ranked the iPhone quite highly (tied for first place with the Sony SACD/CD player). It is tempting to wonder if there might be some subtle familiarity with the sound itself given the ubiquity of Apple devices these days. Perhaps it's simply the use of headphones. While headphones remove room effects and can improve clarity, the presentation of the sound (eg. the impression of a "soundstage") is different. This difference could affect how listeners evaluated the sound and what qualities listeners paid particular attention to.
Another interesting difference with the "younger" listeners is the music sample they thought was most useful for detecting differences in sound between devices. My kids are starting to enjoy pop and rock, and the production quality is much different these days (louder, more bass-heavy, "harder", "crunchier", more "synthetic" sounding) than when I was growing up. Maybe this also affected device preferences. The <41 y.o. respondents thought a modern mastering like the Joe Satriani "Crowd Chant" was better for hearing differences, compared to the 41+ group who preferred the more dynamic older pop (Maxi Priest) and the multilayered vocals of the "Chorus" from Handel's Messiah. Again, could this preference affect what was listened for when comparing?
For completeness, here are graphs highlighting this difference in track preference:
Who knows, maybe there's an interdisciplinary dissertation here on studying the effects of modern audio production, audio hardware, human perception, and sociological trends. :-)
In any event, regardless of age, use of headphones or speakers, and musical experience (eg. musicians, audio engineers), each subgroup agreed that the sound from the motherboard was the "worst" as per item (3) above.
5. Unlike previous blind tests where I could identify "golden ear" individuals who scored 100%, this is not that kind of test.
However if we believe that the Oppo UDP-205 "should" be the best device in this field of 4 as shown objectively, then I'd have to give the "Golden Ear Award" to the "older" listeners 41+ years of age as a subgroup.
With 55 in that subgroup reporting hearing a difference, they were able to rank the ASRock motherboard as the "worst" sounding essentially to a statistically significant level, while on the whole suggesting that the Oppo sounded the "best", with the Sony CD player and iPhone in the middle of the pack. This is also the group who used loudspeakers more than headphones, overall spent more money on the sound systems, and felt that more dynamic music helped differentiate the sound.
Also, I must congratulate the musicians in this blind test who reported hearing a difference. With only 19 respondents, you "nailed it" by selecting out the lower quality motherboard DAC. Phenomenal job!
Part IV: Conclusions

Remember that 16-bit PCM and the 44.1kHz sampling rate were not parameters selected out of thin air. Within the limits of technology available at the time, significant research and listening tests helped Sony and Philips develop the Compact Disc and their claim of "Pure, Perfect Sound Forever". These days, after many generations of products with refinements in low-level linearity, lower noise floor, filter optimizations, and jitter reduction, the logical expectation is that 16/44.1 DACs have matured to the point where one has to assume that it should not be easy to detect differences between devices. The idea of "transparency to the digital source" should result in a "common" kind of sound among many devices (especially those with "high fidelity" aspirations) once we have equalized listening levels.
Considering the multitude of DACs and players out there, of course some devices can sound vastly different. However, it's also quite likely that such a different-sounding device is either of very low quality (like this) or a company purposely "colored" the sound to differentiate their product (perhaps like this). Needless to say, do not assume that an expensive "high end" DAC/player is necessarily also of "high fidelity" because it sounds different or is subjectively preferred by some!
While opinions are plentiful, unfortunately documented blind tests are precious few. Thanks to vwestlife on the Steve Hoffman Forums, we see this little article from January 1997 in Stereo Review:
|BTW - if anyone's looking for a huge catalog of back issues of Stereo Review to remember what audio journalism looked like and what audiophiles discussed back in the day, check it out here!|
Despite the fact that a number of respondents could not hear a difference, for those who felt they could, it was certainly interesting to show that the data wasn't all just random. Even if the objective superiority of the Oppo UDP-205 and its ESS ES9038Pro "flagship" DAC could not be demonstrated to a significant degree in the listening test, the obvious objective and in turn subjective limitations of the computer motherboard were heard by a significant number.
This is overall good news I think.
For those who don't want to spend too much money on audio gear, this is good news because it means that reasonably low noise, low distortion digital audio devices sound great already (the Oppo, Sony SACD player, and iPhone 6 all did well). Therefore, one should prioritize upgrading other parts of the system like the room acoustics, speakers, and amplifiers (probably in that order) which will yield far greater improvements in the sound.
For those who want to spend more money on the digital source - by all means! There's nothing wrong with seeking out essentially ideal objective performance when we compare the Oppo UDP-205 with something like the Sony SACD player. There are even hints in our results that blinded listeners using more expensive (and presumably better) systems were able to show a preference towards the Oppo with its superior objective performance. The important thing is to be mindful of diminishing returns, linked with value. In the big picture, a device like the Oppo UDP-205, which I bought at around US$1300 (before it was discontinued), is still "cheap" considering its performance compared to so much of the "high end" these days!
I think this blind test reminds us that ultimately it is important to be realistic and not boast about remarkable audible differences. I've generally felt that flowery descriptions and dramatic backstories in equipment reviews of DACs and various players simply come across as hard to believe and lacking in credibility. I suspect that anyone who has tried blinded listening tests like this, or taken the time to do volume-controlled, blinded "shoot-outs" of various hardware, has also found that even when differences are there, they are typically subtle between competent devices these days. (Yeah, I know some companies, certain press people, and their ad departments don't like to hear this...)
Remember that science is based on empirical observations to confirm or reject results. Consider these test results as "data points". Nothing here is dogma. In time, perhaps the conclusions may change with further systematic testing.
Feel free to do your own blind tests and document what you found. I believe there is no better way to keep oneself honest! Let me know how it goes.
A word of thanks again to all the participants/respondents. You guys (and gal) are to be commended. I hope the blind test provided an interesting experience, with the objective results last week helping to "calibrate" what you heard. Regardless of whether you heard a difference, I know it wasn't a simple exercise - if it were obvious, what gain or challenge would there be? I certainly respect those who take up challenges that promote reality-based perspectives, especially in our little corner of the universe called the audiophile hobby where sometimes reality, industry-sponsored hype, and pure fantasy can be difficult to tease apart. Even worse, at times reality-testing appears to be discouraged by some.
Have a great week ahead. Enjoy the Victoria Day long weekend fellow Canucks. Time for me to spend time with the family, relax, and enjoy the music after this long write-up :-). Cheers!
** Part IV: Listener Subjective Responses posted **