Previous - Part II: Results
So, what does this all mean?
Firstly, it's important to keep in mind the limitations of this survey. Because it was an attempt to gather testers from around the world, there are numerous uncontrolled variables, including varying degrees of technical savvy among users and of competence in maximizing the sound quality of their gear. Having said this, looking at the responses I got, I believe most respondents did give the test a fair trial, and where equipment was listed, it's clear that the cohort doing this test is beyond the average consumer of audio electronics in terms of quality of hardware. For the most part, even where the equipment used was described as <$100, the models chosen are generally highly regarded within the price bracket.
Despite the lack of control over equipment or listening methodology, this test is 'naturalistic' and captures the preference of the "audiophile" in his/her own room, with his/her own equipment. Even if unfamiliar with the music, the listener is familiar with the sound of the gear and the room, which one would expect to help with sound quality evaluation. Furthermore, plenty of time was afforded, so there should have been no stress: this was not a time-limited task, nor were the respondents forced to choose one or the other (as I said in the instructions, I was also interested in those who did not think they could hear a difference).
As I noted in the PROCEDURE page, the MP3 encoding is somewhat unorthodox in that the parameters used were chosen to mask certain anomalies easily detected in MP3 files sourced with standard settings. Nonetheless, I believe the resulting quality still reflects approximately the same lossy characteristic as a direct 320kbps encode. In fact, one might even suspect that these test files could actually be worse (from an accuracy perspective in comparison to the lossless source) because the audio was run through the psychoacoustic process twice (once at 400kbps, then a second time at 350kbps), and because, in retaining the full 16/44 audio spectrum, significant portions of the bitrate were devoted to encoding inaudible frequencies rather than to representing the audible ones more accurately.
Reading the comments on the various message boards, I believe that I have been successful in maintaining the anonymity of the MP3 files. There was one board where someone commented on how the frequency spectrum appeared unusual but was not able to identify which Set was MP3-sourced.
As for the test itself (a blind AB comparison) and the survey question "which Set sounds inferior", the respondent has to make 2 choices:
1. Is there a difference between the two Sets of audio? If not, the respondent can vote "no difference".
2. If there is a perceived difference, which Set is "inferior"?
For question 2 above, intellectually we can imagine that "lossy" compression implies the music has been altered such that the loss is somehow bad or a degradation in quality. Likewise, the general consensus in media (as per my links in Part 0) suggests MP3 should be "bad sounding". But isn't it also possible that running music through a psychoacoustic model may "clean up" the sound by retaining a focus on the most relevant signals? One might imagine that this could come across as a less noisy background or as reduced ultrasonic intermodulation distortion, since high frequencies are often filtered out. An alternate model like the ABX paradigm would have separated these two concurrent decisions, but ensuring the integrity of the blind test would have been impossible.
Even based on the result from this admittedly small survey of 151 respondents, there was a significant preference for the sound of the MP3 Set (i.e. most thought the lossless Set sounded "inferior"). The fact that a significant result was achieved suggests that high bitrate MP3 is NOT strictly "transparent", since transparency would imply exactly the same sound and presumably a random, insignificant result. The fascinating suggestion from this dataset therefore is that in a blind test, most listeners would actually consider the MP3 tracks as sounding better! This pattern of preference surprisingly appeared EVEN STRONGER in those using more expensive equipment for the evaluation. Furthermore, respondents who thought there was a greater difference in the more "noisy" and distorted track 'Keine Zeit' also showed an even stronger preference for the MP3 encoded version (some were very vocal in noting how "obvious" this was), even though from an objective perspective this was the most difficult track for MP3 encoding.
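For readers who want to see the logic of "transparent should look random" worked through, here is a minimal sketch of an exact two-sided binomial (sign) test in Python. It is not necessarily the analysis used for the actual results. The function name and the vote counts below are hypothetical placeholders for illustration only, NOT the survey's real numbers, and the sketch assumes that "no difference" votes are simply left out so that only respondents who picked a Set are counted.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial test.

    Returns the probability, under the null hypothesis that each vote
    goes to a given Set with probability p, of seeing an outcome at
    least as unlikely as k votes out of n.
    """
    # Probability of every possible vote count 0..n under the null.
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Sum all outcomes no more likely than the observed one
    # (small tolerance so floating-point ties are counted).
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)


# HYPOTHETICAL counts, made up for illustration (not the survey data):
hypothetical_lossless_rated_inferior = 60
hypothetical_mp3_rated_inferior = 40
n_votes = hypothetical_lossless_rated_inferior + hypothetical_mp3_rated_inferior

p_value = binomial_two_sided_p(hypothetical_lossless_rated_inferior, n_votes)
print(f"two-sided p-value: {p_value:.4f}")
```

If the two Sets were truly indistinguishable, votes would be expected to split close to evenly and the p-value would usually be large. A small p-value only says the split is unlikely to be chance; it says nothing about which direction is "correct", which is why the direction of the preference has to be read from the counts themselves.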
As with any survey / study based on group results, even though the consensus points to one conclusion, this does not necessarily apply to everyone. To be clear, there were a few respondents who appeared very sure of their perception in the survey and proved to have been correct.
Going into this endeavor, I expressed that my reason for doing this test was to find out whether MP3 encoding resulted in significant deterioration in sound quality. From what I can tell with 151 responses from around the world, a majority did not find a significant deterioration, and surprisingly most thought it sounded superior! Let me know if you've seen any other tests showing such a bias.
Thanks again to all the respondents in contributing their time! :-)
Continue to - Part IV: Subjective Descriptions