In this installment, let's have a look at the results from the 24-bit vs. 16-bit listening test among respondents.
First I need to remind everyone that the test procedure was not easy. As demonstrated in Part I, the sonic difference between the original 24-bit track and the 16-bit dithered version is down below -90dB. This makes the test much more difficult than the previous high bit-rate MP3 test from last year... Whether you were able to detect the 24-bit version or not, I applaud your efforts and input.
As I noted previously, there were 140 total respondents, and looking at the transfer statistics from my FTP server, I know the test was downloaded at least 350 times. Based on my FTP server transfers alone, the response rate was therefore at most about 40% (140 of 350) of all who downloaded. The actual response rate would likely be significantly lower since there were other download sites.
First let us consider the characteristics of the respondents taking this blind test. Given that this was an internet test that involved downloading 200MB worth of high-resolution audio data in FLAC, and given the audiophile forums where the test was advertised, it is reasonable to conclude that many if not most respondents are tech-savvy audiophiles rather than the "average" music listener.
Not surprisingly, the vast majority (98%) were men (just have a look around audio clubs, audio shows, etc.) - thanks to the 2 ladies who responded!
The survey also asked if some of the respondents belonged to specific categories such as musicians and those with audio engineering experience. This could be useful in the sub-analysis to see if there were more "golden ears" in these groups:
II. Were the 24-bit audio files distinguishable from the same files dithered down to 16-bits (and fed into the DAC in the 24-bit container) by the respondents as a whole?
III. How certain were the respondents that they answered correctly (ie. able to identify the 24-bit sample)?
IV. Were the respondents who felt more certain about their answer more likely able to identify the 24-bit audio?
V. Were the subgroups (musicians, sound engineers, hardware reviewers) able to identify the 24-bit audio better?
VI. Were those with more expensive hardware able to identify the 24-bit audio better?
VII. Did headphone use improve accuracy?
VIII. Did age have any effect on the accuracy?
This survey was targeted to audiophile enthusiasts who in general reported using equipment beyond typical consumer electronics. The majority (77%) were using audio systems reported in excess of US$1,000 and 22% were listening with systems in excess of $10,000. Furthermore, 20% used an ABX utility in the evaluation process suggesting good effort in trying to discern sonic differences. There were no surprises in terms of demographics with the vast majority being males, with an age distribution centred around 41-50 years old.
Subgroup analysis of "musicians" and those who work with the technical aspects of recording, editing and mixing ("engineers") did not demonstrate evidence of special abilities at discerning the 24-bit audio. The "engineers" group did perform slightly better overall. The small group of individuals who identified themselves as writing hardware reviews did not show an increase in accuracy.
About 50% of respondents admitted that they had low confidence in their ability to discern differences. Conversely, 25-30% (depending on which musical sample) of respondents reported a strong sense of "certainty" that they were correct in identifying the 24-bit sample. Nonetheless, analysis was not able to demonstrate improved accuracy despite claims of increased subjective confidence by the respondents.
Furthermore, analysis of those utilizing more expensive audio systems ($6,000+) did not show any evidence of the respondents being able to identify the 24-bit audio. Those using headphones likewise did not show any stronger preference for the higher bit-depth sample. No difference was noted in the "older" (51+ years) age group data (not surprising if there is no discernible difference even with potential age-related hearing acuity changes).
Limitations of the study include the fact that this was an open test distributed via the Internet in an uncontrolled fashion. This allowed the opportunity for test subjects to analyze the audio files objectively rather than through pure listening. However, this is also the mechanism of delivery for high-resolution downloads and the test participants would likely be using the same equipment to listen. The benefit of course is that the results may reflect realistic feedback from potential consumers (if not the target audience) of high-resolution audio. Respondents were able to listen in their own home using their own equipment rather than in an artificially controlled environment. The absence of a time limit (other than a 2 month window to gather survey submissions) should also have made this a less stressful experience for the testers.
140 participants is not a particularly large number of data points but it was adequate to demonstrate an even 50/50 split in preference across the 3 musical samples; a level of consistency which adds to the idea that listeners were unable to differentiate 24-bit audio from the dithered 16-bit counterpart. Replication of the results is of course advised.
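For readers who want to check how far from 50/50 a result with 140 listeners would need to be before it stops looking like guessing, here is a minimal sketch of an exact two-sided binomial test. The function name and the example counts (84 and 75 correct) are hypothetical illustrations, not figures from this survey.

```python
from math import comb


def binomial_two_sided_p(correct: int, total: int, p_guess: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than the
    observed one, assuming each listener guesses with probability p_guess.
    """
    probs = [
        comb(total, k) * p_guess**k * (1 - p_guess) ** (total - k)
        for k in range(total + 1)
    ]
    observed = probs[correct]
    # Small relative tolerance so ties are not lost to floating-point rounding.
    tail = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, tail)


if __name__ == "__main__":
    n = 140  # number of respondents in this survey
    for hypothetical_correct in (70, 75, 84):  # illustrative counts only
        p = binomial_two_sided_p(hypothetical_correct, n)
        print(f"{hypothetical_correct}/{n} correct: p = {p:.3f}")
```

With a sample of this size, a split a few points either side of 50% is entirely compatible with coin-flipping; only a lopsided result (roughly 60% or more choosing the same file) would start to suggest an audible difference.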
As expressed previously in "High-Resolution Expectations" (See "Good Enough Room?" section), there is no good rationale for a dynamic range of greater than 16-bit digital audio in the home environment. The results of this survey appear to support the notion that high bit-depth music (24-bits) does not provide audible benefits despite the fact that objectively measurable DACs capable of >16-bit resolution are readily available at very reasonable cost these days.
If 24-bit audio imparts no audible benefit when listening to music compared to the same data dithered down to 16-bits, how certain can the audiophile consumer be that higher sampling rates (eg. 88/96/176/192kHz) would make much of any audible difference? This perhaps should be the target for another blind test. Methodologically, it would be extremely difficult to maintain the blind testing condition over the internet since it would be trivial to run the audio files through a spectrum analyser, with no easy mechanism to conceal the bandwidth limitation of lower sampling rates (eg. content ending at about 22kHz for 44.1kHz sampling). The reader is encouraged therefore to explore the effect of higher sample rates for him/herself.
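The bandwidth limitation mentioned above follows from the Nyquist limit: a sampled signal can only carry content up to half its sampling rate. A small sketch, assuming the rates referred to are the usual nominal values (44.1, 88.2, 96, 176.4 and 192kHz):

```python
def nyquist_khz(sample_rate_khz: float) -> float:
    """Highest frequency (kHz) representable at a given sampling rate."""
    return sample_rate_khz / 2.0


if __name__ == "__main__":
    # Assumed nominal rates; the text abbreviates them as 44/88/96/176/192kHz.
    for rate in (44.1, 88.2, 96.0, 176.4, 192.0):
        print(f"{rate:g} kHz sampling -> content up to {nyquist_khz(rate):g} kHz")
```

A file downsampled to 44.1kHz will therefore show nothing above roughly 22kHz on a spectrum plot, which is what would give it away in an internet-distributed test.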
One final comment in closing. Notice that the Goldberg track was soft and had a peak amplitude of -10.35dB as demonstrated by the DR Meter (see PROCEDURE post). This means that the full potential dynamic range was not being utilized, and for the 16-bit dithered sample the dynamic range can be encapsulated in <15-bits. Even with this limitation, there was no evidence that respondents were significantly able to identify a difference in aggregate or within subgroups.
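The "<15-bits" figure can be checked with the usual rule of thumb that each bit is worth about 6.02dB (20·log10(2)) of dynamic range. This is a rough sketch that ignores the noise added by dither and any noise shaping; the function names are my own for illustration.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB of dynamic range per bit


def bits_unused(peak_dbfs: float) -> float:
    """Bits of headroom left unused when the peak sits below full scale."""
    return -peak_dbfs / DB_PER_BIT


def effective_bits(bit_depth: int, peak_dbfs: float) -> float:
    """Approximate bits actually exercised by a track with the given peak."""
    return bit_depth - bits_unused(peak_dbfs)


if __name__ == "__main__":
    peak = -10.35  # Goldberg track peak reported by the DR Meter
    print(f"Unused headroom: {bits_unused(peak):.2f} bits")
    print(f"Effective depth of the 16-bit file: {effective_bits(16, peak):.2f} bits")
    print(f"Theoretical 16-bit range: {16 * DB_PER_BIT:.1f} dB")
```

A -10.35dB peak leaves a little under 2 bits unused, so the 16-bit version of that track is effectively using somewhat more than 14 bits.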