As we close off discussions and posts around the Internet Blind Test of devices playing 16/44.1 music, I want to publish some of the subjective comments from respondents who undertook this test: impressions, in the respondents' own words, that they shared when they submitted their results to me.
Remember that these are subjective. Human perceptions, especially when differences are at the margins of what we can hear, are of course tough to describe. And when we compound that with the limited utility of words to describe ephemeral experiences (even with codifying the terminology as was attempted years back), it's no surprise that meaning can often only be conveyed as impressions. It's great to see the respondents trying their best, and in many instances I certainly appreciate the impressive use of language to express the experiences. Let's have some fun with these!
I think the best way to do this is to present the comments organized from those who thought the difference was "huge" or "big", down to those who ultimately thought there was no audible difference. Remember that while objectively we can say something about which device measured better and I'm happy to say "well done" to those who appear to have "golden ears", there is ultimately no absolute "right" or "wrong" to the preferences we make. I'll focus on the longer comments, especially ones that described the perception of specific devices, and add some of my thoughts where applicable.
I. Those who thought the difference was "huge" or "big":
"If I was to listen to these devices on their own, without the ability to make blind comparisons, I’d be happy to own D or C. I’d pick up the problems with A and B, even in isolation.
General comments: The language in my subjective descriptions below must be taken as relative, not absolute! However, I think your recordings or methodology may have issues. Comparison with Original 16/44 sources: I sourced original CD copies of these tracks, but I did not listen to them until after evaluating your 24/96 tracks. I turned R128 loudness compensation on (after evaluating your tracks) because the originals were between 3-5dB louder. I heard specific improvements in the original Handel and Maxi Priest tracks versus your recordings:
Handel: better stereo spread, much better individual voice definition, more solid continuo.
Wild World: better mono/center focus on vocals (lead and backing), bass, drums. The other two tracks just seemed clearer overall. I didn’t scrutinize them very hard. Since the only thing common to the 16 recordings is your ADC, this would indicate that the crosstalk and noise in the ADC, and perhaps frequency imbalance between channels, hurt the recordings. Anyways, I’ll leave this as an open issue until you release the actual tracks you used. It’s possible the ones I sourced are different masterings.
Salvant: There is a very low level single squeak at 0:47 just before a note is struck on the piano. This is audible in D only. There is what appears to be noise from the seat of a chair around 0:20-0:30 that is clearly audible on C/D, but only barely on A/B. Starting at 0:50, the singing of “le mal de vivre” is muted on all but D. This line is repeated starting at 0:57, where it shows clipping distortion on A/B but not on C/D. A seems quieter and less weighty, perhaps with a response dip in the low mids. Vocal sibilants are very sloppy, turning "s" into “shhh” sounds, very prominent at 1:38-. B is the noisiest, same sibilance problem, and with distorted vocal transients too. The sibilance on C is very clean but is somewhat separated from the lower vocal harmonics. I'm pretty sensitive to this common phenomenon. D has the longest reverb tail in the first phrase of the track, seeming like one extra reverberant bounce is audible versus the other devices. C is very clean, but D is more realistic and gets my vote for best. During the lead-in, D seems to have the highest noise floor, but this is not apparent elsewhere on the track. For better or worse, my initial ranking was derived from this track didn't change as I listened to the other tracks which spotlighted other differences.
Handel: D has the best focus on the voices and violins, and the strongest continuo. C is close, A/B have very hashy highs in the voices. (Note that the original 16/44 has much better delineation of individual voices and groups in the choir. To be confirmed...)
Wild World: After the intro (after 0:26): C has best, solid bass line, most depth to the vocal, but it is somewhat hashy or grainy. D has the best vocal, with good depth but excellent clarity. A’s lead and backup vocals buzz!
Satriani: No difference in the main components of the music: the kick drum and the lead guitar. The clapping on the backbeat, once the guitar comes in, sounds like clapping on B/C/D, and sounds like noise on A. The crowd chanting is clean on C/D, not so on A/B.
Original 16/44.1 CD music:
What this means is that if you as a listener / tester hear a significant channel imbalance, the imbalance is a result of the playback Device (like the ASRock motherboard) or there might be an imbalance in your playback system that you're listening with, not a fault of the RME ADC.
Another thing to keep in mind is that playback at 24/96 of the ADC recordings might sound different from a straight 16/44 playback on one's DAC due to your playback machine's characteristics. The 24/96 recordings will have captured much of the digital filter characteristics of the test devices (up to 48kHz or so) whereas a 16/44 playback of the source would be using your DAC's digital filter (or even no digital filtering if you're using a NOS DAC).
"D - sharp, detailed. I liked it more, but I put it below."
"Device A sounds plastic and not clear. Devices B and D give a feeling of constant tinnitus in the background. In addition D sounds rough. Device C is cleanest and doesn't put immediate annoyances to front."Interesting descriptions. Well done with selecting Oppo > iPhone > Sony SACD > ASRock motherboard in your response.
"Device A (#4)
Prominent treble and bass (i.e., V-shaped)
Strikes me as very “bright”
Would be uncomfortable to listen to over time
Device B (#3)
More laid back (i.e., less treble) than A
“Sweet” / warm sound
Better balance overall than A
Loud parts got a bit splatty, though
Device C (#2)
Less sweet than B (i.e., more neutral) - balance between treble / mid / bass about the same as Device B
Less “splatty” than Device B on loud parts
B and C very similar otherwise
Device D (#1)
Immediately noticed more “spacious” sounding than any of the others
Very precise positioning of various voices / instruments
Loud parts well controlled
Overall the most detailed and pleasant to listen to"
"First of all, I believe there is a problem with Device B - something is wrong with imaging - is the only track where music comes far outside left/right from the speakers, + too much left image. Phase error?
Device A has difficulties with placing of the instruments, most in the center and difficult to place the voices - mostly noticeable on 'For unto us a child is born'
Device C is very controlled but sounds the most digital (sometimes slightly harsh). Also limited in soundstage depth.
Device D fullest sound, nice depth, especially more palpable voice from Cécile McLorin Salvant.
The difference between (Device) D & C was bigger with source 2 (Roon without filter, i.e. DAC filter is active) , than with Source 1 (DSDIn HQplayer, 256 HQplayer poly-sinc-xtr-lp).
For me the difference between the filters is far more noticeable, than upsampling rate, ...
As I am not capable of hearing anymore >10kHz , there cannot be a direct link what a filter is doing in the frequency area of 20kHz and up (maybe an indirect impact)
So I still do not quite understand how filters really work (keeping in mind that I normally use the same dithering as well)
So, would be interesting to see a test/blog on this subject.
(with HQplayer, I am convinced that with music I know well, I can pass a blind test between poly-sic-xtr, poly-sinc-ext2 and ClosedForm filters)"
Interesting comment... Not sure if I hear that much difference with Device B. Remember that the iPhone 6, like all Apple products I have tested recently, uses relatively steep minimum phase filters. Not sure if that has anything to do with it though. (Remember that back in 2015, we tested minimum vs. linear phase filters here with the same steepness and I could not find significance for preference... Might be a highly individual thing.)
"Between the best and worst device I could hear a difference in a variety of things such as overall sound, detail of the voice and background, timbre of the singer, the dynamic range- it felt more powerful in the best device, more spacious but also more realistic- felt as if the singer was in front of me.
For the best and 2nd best there were small but obvious differences: the best device felt more 'real' as if I was hearing it live- all the detail of the recording was faithfully reproduced but felt less in the 2nd best device. Another major difference was the sound felt far more powerful in the best device whereas the 2nd best had a slight less power to it."
Thanks man. Looks like you selected the Sony SACD player as your favourite.
"Best was full bodied last place was not liked at all. 2nd place had crisp sound but not as full as the best. This is my partners system."Thank you! You were the only woman in the blind test and you liked the iPhone > Oppo > Sony > motherboard.
"I came to this test with the strong bias that I probably would not be able to differentiate and yet... C was consistently the one I preferred.
The difference is not dramatic, but everything sounded cleaner/more defined on C I would definitely pick C over A and D at all times.
Personal experience includes 9 pairs of speakers in the house, 4-5 amplification combos and probably a dozen of sources (from Pi + Khadas Tone Board to Linn Akurate DSM). Not a high-end fetishist, some combos definitely sound better than others when mixing and matching, but price doesn't always win by far. If I have that much gear it is because I am a compulsive buyer, not a pilgrim in search of audio nirvana. And not a golden ear either, can't reliably (if at all) detect hi-res from cd quality.
For all I know, C could be cleaned up/processed and less "authentic", but that is the one I prefer ;)
thx for the blog!"
Well, surprise surprise! You, sir, obviously picked the Oppo and I suspect you're more of a "golden ear" than you thought :-). Be careful with compulsive buying though!
"Device A sounded weird for lack of a better word on my mostly Martin Logan system. Like there was a halo around the performers. Not as obvious on the Squeezebox monitor system, but still there. I thought Device D was better, but it was kind of "soft" sounding to me for lack of a better word. Not quite as detailed as B or C. I thought B&C were pretty close to each other. Definitely a step above A and D. If I had to pick one of those two it would be C. It just seemed to have the detail of B, but without a very slight brightness that seemed to show up sometimes in B."
You clearly did well. I see you selected Oppo > iPhone > Sony > ASRock motherboard. Nicely done.
"I have compared to the original 16/44 tracks that you used for the comparison and I must say that all four sound better than your converted samples! Therefore it was not as easy as with a good source to evaluate a ranking."Right, remember we have to be careful about comparing with the original source recordings as there have been significant changes to the levels made. The comments above with the first respondent applies here. Plus of course I'm recording off a motherboard in one of them :-).
Just to give you an idea of how much change, between the original CD rip of "Handel Messiah" and the Oppo 24/96 recording, in order to equalize volume to an average of -18LUFS target using EBU R128, the original rip would have needed -0.92dB reduction, while the RME ADC recorded version of the Oppo needed +2.3dB gain. That's a pretty significant volume difference of >3dB which if not properly compensated will typically bias against the ADC recorded 24/96 version which is of lower amplitude.
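As a quick check of the arithmetic above, here is a minimal sketch that works out the level gap implied by those two R128 adjustments. The function and variable names are just illustrative assumptions on my part; only the -0.92dB and +2.3dB figures come from the measurements quoted above.

```python
def level_gap_db(gain_a_db: float, gain_b_db: float) -> float:
    """Loudness gap between two tracks, given the gain each needs to reach the same target."""
    return abs(gain_a_db - gain_b_db)


def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


# Gains needed to reach the -18 LUFS target (figures quoted in the text above)
cd_rip_gain_db = -0.92   # original CD rip had to be turned down
oppo_rec_gain_db = 2.3   # RME ADC recording of the Oppo had to be turned up

gap_db = level_gap_db(cd_rip_gain_db, oppo_rec_gain_db)
print(f"Level gap: {gap_db:.2f} dB")
print(f"Amplitude ratio: {db_to_amplitude_ratio(gap_db):.2f}x")
```

The gap works out to a little over 3dB, which is why an uncompensated comparison will tend to favour the louder original rip.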
"Wild World - A sounded like was playing in a cave, very small sound stage; D had some weird sibilance, noticable with the bell; B/C sounded similar, good separation with C being the best. Le Mal de Vivre - A sounded a bit nasal; D again had some weird sibilance; B/C sounded very close, with C sounding "more present / life like". For Unto Us A Child Is Born - track content was a too busy to distinguish between A-D.Crowd Chant - I did not like this track so did not use it."Great job with the listening! Oppo > iPhone > Sony > ASRock motherboard.
II. Those who thought the difference was SMALL, but worth spending some money to upgrade the sound.
"I noticed very SLIGHT differences in tone and dynamics. Overall the differences were extremely small and if one of the players cost say $100 at the low end and a few thousand at the high end the tiny performance jump wouldn't be worth that sort of a price increase now if the "cheap" one is $100 and the "expensive" one is say $250 or so then it may be worth it. But honestly it wouldn't surprise me if my "favorite" pick was the least expensive option. Anyway great poll and test had a lot of fun doing it."
"From these tester songs and past experience, I think a higher sample rate mainly benefits highs, but the clarity overall, to me, seems noticeable. Also thank you for introducing me to La Mal de Vivre, beautiful. Please validate my purchases."Hmmm, not sure if the results validated your purchases!? But I see you did pick the Oppo as best and the motherboard in 3rd place :-).
"The bottom end on Maxi Priest was the best differentiator, followed by the sibilance on Cecile."I see you preferred the Sony CD player > iPhone > Oppo > Motherboard based on listening to the bass and sibilance impressions.
"I started with listening for less than 5 seconds to the first track of each device. That first impression told me I could not hear any differences and it would be very hard to hear any differences. All 4 devices sounded good.
After listening to all songs on all devices, I noticed that with device B I was sometimes distracted and least involved with the music.
With device A and D I listened to music. The difference between A and D are zero and had to choose which one came first.
Device C sounded the best. But the differences are very subtle. Small details where easier to hear and to follow. Also details where more alive and had more speed/presence. Device C got me the most involved with the music.
It is always a pleasure to read blogs and I learned a lot, Thank you,
Arjan V******g The Netherlands"
Great work from The Netherlands, Arjan! Clearly you liked C (Oppo) best.
"Soundstage varied slightly, instruments seems placed slightly varied positions, some voices and instruments weren't well separated in D, very similar in ABC."
"Strongly disliked A. Sounded very shouty and thick. Loved B, which was liquid and deep. C and D were closer. C seemed less resolving than B, and at first I was sure D was better, more lifelike. Then on another listen C didn't seem that deficient. That's when I quit as trying to rank the two of them wasn't much fun."Yeah, like I said... This ain't easy :-). The "strong dislike" of A in this comment and among others is what ultimately made the results significant in pushing the ASRock motherboard as the "worst" sounding.
"C seemed to me to be the most natural and well articulated, especially on vocals (solo and choral). It seemed to define the sound stage a bit more fully, e.g. the tenor & bass (male) voices in Child seemed more clearly behind the sopranos (females) and the oboe line was both more distinctly separate from the voices and perhaps a hair more realistically "reedy" without being hash.
I consistently found C to be more realistic and rich than A (and, to a slightly lesser extent, B). The lower fundamentals from the piano in Le Mal de Vivre seemed clearer, a bit more distinct, and less "bundled" within the harmonic structure, which was also the case for D vs A. My proven impression on initial listening was that the highs were less distinct and maybe even less extended on A than the others, with B also below C and D in this regard."
I think ESS Technology and Oppo very much would be pleased by this listener response :-). I see you selected Oppo > Sony > iPhone > motherboard. Good job.
"I want to note that while I hear (or think I hear) small differences between the samples, I cannot point which one is the "correct" sound. Ask you raised the question, I can order them in subjective preference. However, if I had a reference (e.g. original file) I would prefer the sound that's closest to it - no need to sugar-coat audio. We're chasing "high fidelity" after all, right?
I also want to add that I did not attempt to ABX the samples to confirm that I am hearing the difference, so take this with a grain of delicious, subjective salt.
Impressions (I'm overstating the differences for the sake of comparison. I found the overall differences fairly small):
A - I consistently found this to sound the "harshest" and to lose clarity in the dynamic parts, compared to the other samples. It's my least preferred sound of the four.
B - I found this to sound very close to D. Compared to A and C I found it more "spacious" and "reverby" - this could mean more of the ambience in the recordings came though, or that the transients were not as clean - not sure. I found the imaging and dynamics better than A.
C - I found this to sound the "cleanest", maybe what one could call "dry". Bass seemed the tightest, transients the clearest of the bunch. My gut feeling is this file added the least amount of coloration to the original sound - although I can't be sure of it without a reference.
D - Again, sounds very close to B, maybe having the tiniest bit more definition.
Additional comments:
- I heard the least amount of difference on "Le Mal de Vivre"
- I had the chance to listen to Joe Satriani live last year. Definitely prefer the sound quality of the recordings compared to the live sound.
- I was only aware of Mr Big's version of "Wild World". This one has a great sound as well!"
Another wonderful comment and excellent observations! Yes, it's hard to know what is "correct" isn't it? But yet in reality we make adjudications all the time despite none of us really knowing what the "actual" sound was like in the studio/live (unless we attended the concert, but even then, it all depends on where the mics were placed). We "guess" at what is "correct" every time we go into a dealership and listen to a device we might be interested in. Since most audiophiles do not run objective tests, most would not have the benefit of instrumentation to tell them if their devices actually perform with low noise or can confirm low distortion. For this test, Device C (Oppo) is objectively the "most correct" based on measured fidelity.
If you pull the 24/96 recording up in an audio editor, you'll be able to confirm that it has the cleanest ultrasonic profile on account of the high quality digital filter. And if you compare the FFT with Device A (ASRock motherboard), you'll also see that Device A in quiet passages will have the 60Hz hum seeping through and has high low-frequency noise.
Despite the uncertainty, it looks like you picked the Oppo still as the "best". Oppo > Sony > iPhone > Motherboard. Well done.
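For anyone without an audio editor handy, the 60Hz hum check mentioned above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the two signals are synthetic stand-ins (a 1kHz tone with and without a faint 60Hz component at a made-up level), not the actual test recordings, and the helper names are my own. It measures the level at one frequency by correlating against a sine and cosine at that frequency, which is a single-bin DFT rather than a full FFT.

```python
import math


def tone_level_db(samples, sample_rate, freq_hz):
    """Level of one frequency in dB relative to a full-scale sine (single-bin DFT)."""
    n = len(samples)
    w = 2 * math.pi * freq_hz / sample_rate
    re = sum(x * math.cos(w * i) for i, x in enumerate(samples))
    im = sum(x * math.sin(w * i) for i, x in enumerate(samples))
    amplitude = 2 * math.hypot(re, im) / n
    return 20 * math.log10(max(amplitude, 1e-12))  # floor avoids log10(0)


def synth(sample_rate, seconds, tones):
    """Sum of sine tones; `tones` is a list of (freq_hz, amplitude) pairs."""
    n = int(sample_rate * seconds)
    return [sum(a * math.sin(2 * math.pi * f * i / sample_rate) for f, a in tones)
            for i in range(n)]


SAMPLE_RATE = 96_000  # matches the 24/96 recordings
# Hypothetical stand-ins: a tone alone vs. the same tone plus a faint 60 Hz hum
clean = synth(SAMPLE_RATE, 0.5, [(1000, 0.5)])
hummy = synth(SAMPLE_RATE, 0.5, [(1000, 0.5), (60, 0.001)])

print(f"60 Hz level, clean: {tone_level_db(clean, SAMPLE_RATE, 60):.1f} dB")
print(f"60 Hz level, hummy: {tone_level_db(hummy, SAMPLE_RATE, 60):.1f} dB")
```

To try this on a real file you would need to load the samples from a quiet passage yourself (FLAC decoding is outside the standard library), and a windowed FFT in an audio editor will give a fuller picture of the low-frequency noise.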
"A - had a small treble spike or something that made it sound exciting, but tiring. Great imaging on "Wild World" though B - The vocals seemed a little thin C - Best vocals, did not seem fatiguing D - Seemed too smooth sometimes, very mellow and neutralAnother vote for ESS Tech and Oppo. I swear I'm not picking out the Oppo comments! These are just the folks leaving their detailed impressions...
All in all, the differences were very small, and perhaps imaginary in some cases. Device C would probably be my favorite, though I would avoid that Crowd Chant thing. ;)"
"The only noticeable difference I hear, is in the high frequency details of the / 'S'-s, 'SH'-s /, otherwise not much of the difference between the 4 devices."Oppo > Sony > Motherboard > iPhone.
"In 'For Unto Us A Child Is Born' there seems to be a better location and separation of the different voices and words are more clearly pronounced with device B than with the others. The differences between C and D are very small, but the bass seems more precise on C than on D. Device A seems a little flat or unengaging? 'Le Mal de Vivre' is a beautiful recording and all 4 devices sound good when you enjoy the music. Only the piano and 'space' might be better with device B. There are small differences regarding the voice character, but hard to say what is most correct.
In general differences between devices seem small. If the test should be an even better help to choose between devices I would prefer more examples of well recorded classical music and acoustic jazz.
All together this test is very interesting and I'm looking forward to see how much I've been fooled. Thank you for this initiative and your other great work!
Best regards Peter, Copenhagen Area, Denmark"
Thanks for the comments and feedback, Peter of Denmark! You selected iPhone > Oppo > Sony > Motherboard.
"'A' seemed more clearly differentiated from the rest than they were from each other. "A" was a bit veiled overall, with less clear separation among instruments & voices. "C" may have been a hair more rich and lifelike than "B" or "D", but "A" was more different from the rest than any of these distinctions."Nicely done. Another example of why A (motherboard) scored poorly overall. Oppo > iPhone > Sony > motherboard.
III. Those who thought the difference was very small and not worth spending money for an upgrade...
"I can hear very very faint hints of differences but I can not decide which is better as I have not heard the music live...An interesting example where the listener put the Sony and Oppo on different ends of the spectrum. What's interesting here is that B and C (iPhone and Oppo) are minimum phase filter devices (the Oppo defaults to minimum phase but can be easily switched to linear phase), and it seems this listener has a preference towards the two linear phase devices.
"A" feels everything is on the same level...
"B" is similar but has slight details but lower authority...
"C" has everything on a different level...
"D" less fatiguing for me...
For me D ~= A > B ~= C"
Respondent, you might want to check out the Minimum Phase vs. Linear Phase blind test from 2015 and confirm if you might be sensitive to the digital filter phase setting.
"B Device is faster, better timing, better pace, base tighter, more air C Device nearly same pace as B but less air A Device is same as B but slower D Device is less musical of them but pace is same as B DeviceThanks man! Appreciate the feedback and perceptions. I definitely think the press needs more writers who can be critical and debunk much of the silliness out there. Otherwise it's really up to the audiophiles, the "grassroots" participants in the hobby to find truth and balance ourselves. Unless things change (I hope it can!), I think this is likely the foreseeable default trajectory.
Keep the good work Archimago. The industry needs more person like you to debunk the myths. Wish you good health."
"I set up a play list with all 16 tracks so I could easily skip around via Roon remote. I listened on 3-4 occasions also employing my 12yr old daughter and wife. At times I felt I could hear a subtle difference yet it was not consistent and not something I could articulate. My daughter and wife thought they all sounded the same (but I can't say they were totally invested in the experiment). For all intents and purposes they were identical. If my life depended on it I would say device B seemed to be the most different but I have about 5% confidence in that. Certainly my equipment is not fancy but I think it gives pretty good bang for the buck and almost certainly outperforms any mass market consumer brand systems."Thanks man... Based on writers in magazine articles, I always thought that the wives had golden ears and could casually just hear cable changes from the kitchen! And the kids probably can tell hi-res from CD quality upstairs in their bedroom playing Minecraft!
You've confirmed for me that wives and kids really are not much better that the "man of the house" :-).
"Here are a couple of ABX tests along with some of my comments. Bottom line while I could tell some differences between the devices, but without concentration and over speakers, it is unlikely I could tell any difference with casual listening over speakers and probably headphones without ABX'ing the tracks. My sorted preference above is just a guess :-) While I like all of the tracks, I went with the Satarini track as it is the music I am most familiar with, has high frequency transients over a continuous repetitive sound, which is a good cue when switching ABX tracks. I wish I had more time as I would have done all the ABX permutations and then identified the devices for which ones were a bit brighter or the stereo image was a bit different. In the end though, they are so close in sound that all 4 devices sound virtually identical to my ears, especially if I took the ABX tests out of the loop, it is unlikely that I could tell the difference between any of them.
File A: A - Crowd Chant.flac
File B: B - Crowd Chant.flac
ASIO : ASIO Lynx Hilo USB
22:25:47 : Test started.
22:28:16 : 01/01
22:28:58 : 02/02
22:30:01 : 03/03
22:30:15 : 04/04
22:30:32 : 04/05
22:31:41 : 04/06
22:32:05 : 04/07
22:32:45 : 05/08
22:33:03 : 06/09
22:33:43 : 07/10
22:34:06 : 08/11
22:34:36 : 08/12
22:35:13 : 09/13
22:35:31 : 09/14
22:35:55 : 09/15
22:36:29 : 10/16
22:36:29 : Test finished.
Probability that you were guessing: 22.7%
-- signature --
My notes. I could not tell you which one, but the hi-hat seems louder or slightly different tone than the other. Also. one had a better phantom center and the other had a wider presentation or a different phantom center. Filtering differences I presume. It is the slightest difference too. I listed at my regular volume, which is not that loud, and took me over 10 minutes. On some it took a lot of switching back and forth. On others, sometimes I catch it on the very first switch.
File A: C - Crowd Chant.flac
File B: D - Crowd Chant.flac
ASIO : ASIO Lynx Hilo USB
22:40:33 : Test started.
22:41:11 : 01/01
22:41:50 : 01/02
22:42:25 : 01/03
22:42:49 : 02/04
22:43:07 : 03/05
22:43:36 : 04/06
22:44:04 : 05/07
22:44:24 : 06/08
22:44:51 : 07/09
22:45:02 : 07/10
22:45:22 : 08/11
22:45:42 : 09/12
22:46:06 : 10/13
22:46:19 : 10/14
22:46:49 : 11/15
22:47:03 : 12/16
22:47:03 : Test finished.
Probability that you were guessing: 3.8%
-- signature --
My notes. Again, one sounded a bit brighter than the other or a wider sound stage or both. Could not tell you which one, but there is an audible difference to my ears and took 7 minutes compared to the other one. My ears got tired towards the end as it after the A versus B test. I suspect that any differences are due to the different filters frequency and phase response. Ever so slight and you need to know what to listen for the hear a just noticeable difference. Like I say, take away my ABX testing and I would not be able to tell the difference."
Thanks Mitch, now that's dedication; I like how you roll. ABX testing to statistically show that there was only a low probability of "guessing" - now that's serious :-). Like you noted, this is with a good amount of concentration on the sound (rather than just enjoying the music). In "normal" listening, I agree that the sonic difference is far from a "slam dunk" in audibility.
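For those curious where the "probability that you were guessing" figures in ABX logs like these come from, they appear to match a one-sided binomial tail: the chance of scoring at least that many correct out of 16 trials by coin-flip guessing. Here is a minimal sketch of that calculation. The function name is my own, and I'm assuming (not asserting) that this is how the ABX tool computes it.

```python
from math import comb


def abx_guess_probability(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` of `trials` right by pure 50/50 guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


# Scores taken from the two ABX logs quoted above
for label, correct, trials in [("A vs B", 10, 16), ("C vs D", 12, 16)]:
    p = abx_guess_probability(correct, trials)
    print(f"{label}: {correct}/{trials} -> {p:.1%}")
```

This reproduces the 22.7% and 3.8% figures in the two logs. By the usual 5% convention, only the C vs D run would count as a significant result.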
"D - sounded most detailed and open A - produced the most "warm" sound - great midrange B - was as warm sounding as A but otherwise highs and lows were boring C - it was the most boring and a bit unfocused but as I stated earlier the differences were minor"
"I recognise that at my age I should expect a deterioration in my hearing. However after a hearing test 6 months ago I was shown that my hearing range was far higher for my age - which was very pleasing. Last year I introduced the high resolution Benchmark Line Amplifier into my system and this made a big difference in subjectively hearing a difference between the USB and SPDIF feeds. Maybe it's a question of timing versus noise anyway the Auralic Aries (WiFi input) sounds better (more drive, pace, finer detail) than the Linn (LAN input).
In your blind equipment test I really struggled to discern any real difference after some considerable time of evaluating."
Congrats on the good results of the hearing test 6 months ago! Thanks for taking the time and giving this a spin :-).
"I only really used 2 tracks, Wild World primarily and a cross check with Le Mal de Vivre. None were really my taste. Would have loved something electronic!
Anyway I was torn between saying all the same or subtle, but I did feel there were subtleties but again the tracks were unfamiliar and repeating - playing in different order abcd dcba a d random I could really be sure. Ultimately I needed longer than the hour I had but certainly I was trying to find something rather than it be obvious.
My main notes jotted down were:
b and c very similar but both better than a which seemed slightly ‘flatter’
d seemed similar to b and c but with more bass detail.
Again, that was on one round, in later ordering I was less sure other than a sounding slightly worse. (Less defined voices)
great test - it’s a lot of effort for you to setup so thanks - and fiddly to listen to with a 2 1/2 year old, but looking forward to results."
Nice work, you were able to consistently pick out "A" (motherboard) being worse... Have fun with the 2.5 year old :-). Kids grow up fast! Make sure to take lots of photos and videos.
"Overall extremely similar sounding. Device C seemed to have less resolution but I kept wondering whether it was just a touch softer. In some cases I thought I heard more detail from 1-2 devices but it sounded a little harsh, so depending on the music I preferred one device over another. I mostly listened in blind "shootout" mode to keep me honest and found the test very difficult."Yeah, not easy. Here's a comment that ESS Tech and Oppo should not use as an endorsement. :-)
"Device A definitely sounded slightly flatter, less involving than the others. Better separation of instruments, vocals on B, C and D - but very little difference between those three. Definite feeling of more 'air' around voice in 'Le Mal de Vivre' in these three devices as well, possibly device D being the best in this respect. Soundstaging very similar across all devices. Have not had the chance to listen on my main system yet - Roon/Mac Mini via Schiit Eitr into Naim Supernait driving Leema Xone speakers."
Sony > iPhone > Oppo > Motherboard.
"'B' was the only one that I think I could hear a difference. A, C and D very similar. Compared to the others, I found B had a more open, airy sound with deeper bass. At times A, C and D all sounded (very slightly) compressed, constrained and distorted."Interesting vote for the iPhone 6.
"I have listened to all devices/tracks through speakers and headphones as well. I could not hear significant differences between the devices by my 60 years old ears. I can not say definitely, one of four devices sounds better/worse than the others. :-(Thanks for the input. I have not read any good articles to suggest that the expensive, special hi-fi racks will do anything for digital equipment.
Tip for testing (for you) - there are specialized racks (some of them pretty expensive) for hifi equipment. Do you think, they may have any serious influence on the sound quality, specially from digital sources?"
We obviously want good looking stands and sturdy racks, but not necessarily because this would affect the electrical properties of things like DACs... Unless of course one has vacuum tubes in the digital device which could be microphonic and "vibration control" could have an effect. In this regard, I can also appreciate the importance of vibration isolation for things like turntables (as you can see, I put mine on the floor!).
"Device B sounded musically more realistic and alive, but the difference was just barely noticeable. My wife was of the same opinion. I am not sure the distinction would survive a double blind test, but we did not try that."More iPhone 6 lovers :-). Again, looks like the wife's ears also did not show any magical abilities unlike reports elsewhere!
IV. Those who thought there was no audible difference between device recordings.
"Listened to all tracks with my wife, and we could not hear any audible difference."Yet again! The wife has let the audiophile down. :-(
"More umff in C but its probably wrong."Well well well... Looks like you were perhaps "right" in picking out the Oppo! :-)
"My opinion....mmmm ...Didn't attempt to listen to my main system with SoundLab speakers as I no longer believe that difference will be audible once you match the output. Subjectively, I liked B but only got 8/16 on a quick blindtest. Maybe because it slightly brighter or louder."The iPhone 6 strikes again!
"I HAVE BEEN INVOLVED IN MANY DAC AND CD COMPARISONS OVER THE YEARS AND IT ALWAYS AMAZES ME HOW SMALL, IF ANY, THE DIFFERENCES ACTUALLY ARE. IT IS CRITICALLY IMPORTANT, OF COURSE, TO MATCH VOLUME LEVELS AND I BELIEVE YOU HAVE DONE AN EXCELLENT JOB IN THIS. I ONLY USED THE MESSIAH AND CECILE FOR EVALUATION AS I HAVE THIS MUSIC IN MY COLLECTION AND KNOW THEM WELL. INCIDENTALLY, THE SOUND QUALITY OF THESE TRACKS PLAYED FROM MY NAS THROUGH JRIVER MC IS FAR SUPERIOR TO THE SOUND FROM YOUR DOWNLOAD. THERE DOES APPEAR TO BE A SIGNIFICANT LOSS OF RESOLUTION IN YOUR DOWNLOAD. ANYWAY, THANKS FOR TAKING THE TIME TO DO THIS; IT HAS BEEN FUN."Thanks for the note! I've included the original digital files for anyone to download and listen in Part 1. It's also possible that there are different masterings of the music... You're right about the importance of volume leveling (see note above). Remember that there should be a loss of resolution in some of these devices, especially the Device A recordings from the computer motherboard! (Remember that the respondents had no idea just how "poor" some of the devices I was recording from were.)
"Sorry, but I could find no noticeable difference, not with the ATMOS setup nor with a direct analog stereo from my DAC on the 1st setup. Sure not with expensive or mid-priced headphones and not on my entry price level 2nd setup. Did you really present different recordings or did you trick us to find differences where are no difference at all?
I am able and could prove that I can tell the difference of a 256 kbps MP3 from 320 kbps MP3 in 9 out of 10 cases. I am able to [hear] the difference between a 320 kbps OggVorbis (Spotify) from a MQA on Tidal in 8 out of 10 cases. I am not able to find differences between 320 kbps OggVorbis and CD or Tidal Hifi in a statistical relevant manner."
There is a case to be made that the psychoacoustic bitrate reduction of MP3 would result in even more distortion than these hi-res device recordings. Also, the psychoacoustic algorithm, while good, might not be transparent for everyone, with some listeners being more sensitive...
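For readers wondering how scores like the "9 out of 10", "8 out of 10" and "8/16" mentioned by respondents above compare with plain guessing, here is a minimal sketch of the usual one-sided binomial calculation. It assumes each trial is an independent two-way choice with a 50% chance of a lucky guess, which may not match how every respondent actually ran their tests; the function name `p_at_least` is just an illustrative choice.

```python
from math import comb

def p_at_least(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """Probability of scoring at least `correct` out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# Scores quoted by respondents above (assumed to be independent 50/50 trials).
for correct, trials in [(9, 10), (8, 10), (8, 16)]:
    print(f"{correct}/{trials}: p = {p_at_least(correct, trials):.3f}")
# 9/10: p = 0.011
# 8/10: p = 0.055
# 8/16: p = 0.598
```

Under these assumptions, 9/10 would be hard to get by luck, 8/10 sits right around the conventional 5% line, and 8/16 is exactly what coin-flipping tends to produce.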
"Interesting test! Don't think my hearing is bad at all, but I've always said that the difference between DACs are so small nowadays that it doesn't matter. Hope this blind test can prove this once and for all ;)"I think you'd be happy with the results. Obviously the test was not easy and other than the relatively poor output quality from the motherboard, the others (iPhone 6, Oppo UDP-205, Sony SACD/CD player) were certainly difficult to differentiate! IMO, there is a "threshold" of accuracy in reproduction beyond which noticeable differences are hard to tease out.
"Quantitatively and qualitatively I found negligible difference across the test tracks. I thought I could hear differences but these went as soon as they came as I continues to switched between tracks using the ABX software - results copied in below for reference. I will be interesting the see the results, for pride and price. p.s. "le mal de vivre" sounded wonderful on the LX system.
Acer Aspire 5738 output to Behringer UCA222 using Lacinato ABX shootout
Etymotic HF5 custom fit
1 time (20.00%) C - Crowd Chant.flac)
0 times (0.00%) B - Crowd Chant.flac)
3 times (60.00%)D - Crowd Chant.flac)
1 time (20.00%)A - Crowd Chant.flac)
1 time (20.00%)D - For Unto Us A Child Is Born.flac)
2 times (40.00%)B - For Unto Us A Child Is Born.flac)
2 times (40.00%)A - For Unto Us A Child Is Born.flac)
0 times (0.00%) C - For Unto Us A Child Is Born.flac)
1 time (20.00%)D - Le Mal de Vivre.flac)
3 times (60.00%)C - Le Mal de Vivre.flac)
0 times (0.00%) A - Le Mal de Vivre.flac)
1 time (20.00%)B - Le Mal de Vivre.flac)
1 time (20.00%)A - Wild World.flac)
0 times (0.00%) B - Wild World.flac)
2 times (40.00%)C - Wild World.flac)
2 times (40.00%)D - Wild World.flac)
2 times (40.00%)A - Crowd Chant.flac)
2 times (40.00%)D - Crowd Chant.flac)
1 time (20.00%)C - Crowd Chant.flac)
0 times (0.00%)B - Crowd Chant.flac)
2 times (40.00%)D - For Unto Us A Child Is Born.flac)
1 time (20.00%)C - For Unto Us A Child Is Born.flac)
1 time (20.00%)B - For Unto Us A Child Is Born.flac)
1 time (20.00%)A - For Unto Us A Child Is Born.flac)
2 times (40.00%)B - Le Mal de Vivre.flac)
2 times (40.00%)A - Le Mal de Vivre.flac)
0 times (0.00%) D - Le Mal de Vivre.flac)
1 time (20.00%)C - Le Mal de Vivre.flac)
0 times (0.00%) A - Wild World.flac)
1 time (20.00%)D - Wild World.flac)
2 times (40.00%)C - Wild World.flac)
2 times (40.00%)B - Wild World.flac)
line in to Hypex DLCP x6 UcD180 driving LXmini+ 2
1 time (20.00%)B - Wild World.flac)
2 times (40.00%)C - Wild World.flac)
2 times (40.00%)D - Wild World.flac)
0 times (0.00%) A - Wild World.flac)
1 time (25.00%)A - Crowd Chant.flac)
1 time (25.00%)D - Crowd Chant.flac)
2 times (50.00%)C - Crowd Chant.flac)
0 times (0.00%)B - Crowd Chant.flac)
1 time (20.00%)A - For Unto Us A Child Is Born.flac)
0 times (0.00%) B - For Unto Us A Child Is Born.flac)
2 times (40.00%)C - For Unto Us A Child Is Born.flac)
2 times (40.00%)D - For Unto Us A Child Is Born.flac)
0 times (0.00%) A - Le Mal de Vivre.flac)
2 times (50.00%)B - Le Mal de Vivre.flac)
1 time (25.00%)C - Le Mal de Vivre.flac)
1 time (25.00%)D - Le Mal de Vivre.flac)"
Wow. Very cool testing procedure and nice selection of gear used! Even though you could not hear a significant difference, I certainly appreciate the time spent.
"For hearing test we often use this: https://tech.ebu.ch/news/ebu-cds-now-online-31oct08Cool, someone familiar with the EBU sound testing method. Thanks for the link to the test material!
"I did think I noticed differences (unfocused female voice in B for example) but I was simply hearing more detail in the recordings. Every time when I tried another device the same effect was there.
I did hear more quiet non repeatable clicks than I expected in the X1.
I find inexpensive audio electronics to be very acceptable nowadays, but less so speakers and indeed unchecked rips from dodgy computer drives...
I value and enjoy your articles.
Many Thanks and Kind Regards"
Thanks Chris! You bring out an important phenomenon.
When we concentrate and listen intently, we often pick up those little subtle sounds that we thought we heard for the first time. However, when we go back to listen again to the other devices, we realize that they were there all along! Without the ability to do quick A/B testing, the listener typically will not have a chance to go back and confirm...
This results in a bias problem and comes up, for example, with the cable tests that manufacturers put together at audio shows. They'll use a "poor" generic cable, then prime listeners with expectations of how good the expensive one will be, then play the same song with the $5,000 cable, then ask "Did you hear the difference? Notice the fantastic natural reverb!". Not exactly fair if the listeners can't ask to quickly go back to the "poor" cable and double check on that reverb, is it?
And so are born those classic (if not cliché) reviewer comments like: "I heard things I never heard before thanks to this new DAC/preamp/amplifier/speaker/cable!" :-)
V. Parting comments on blind testing and critiques...
I hope you've enjoyed reading the subjective comments above. It was actually lots of fun digging through the results and going through these impressions from the respondents, answering any questions I can along the way.
Of course, I hope the blind test participants had "fun". Like I said last week, I believe this was not an easy test, and I hope the experience was ultimately of value: you can now say you've taken part in a type of blind test, and the results are out in cyberspace for all to see. I doubt most of us want to perform blind tests regularly (rather masochistic I think to want this, and there's only so much time in life to enjoy good music!). However, I do hope that in one's lifetime as a "hardware audiophile" seeking high quality audio reproduction, one does have the opportunity to try a few of these once in a while. Remember, I recently quoted J. Gordon Holt's reference to the importance of "basic honesty controls" in the audiophile hobby. Measurements and blind tests have important roles to play by providing reality testing and keeping tabs on boastful claims.
By all means, hang out at one's audio dealership occasionally, perhaps make a pilgrimage to an audio show to hear the "latest and greatest", and go over to a fellow audiophile's home to check out his/her gear while listening to music. Just keep in mind to try an "honesty test" every once in a while! Doing this is probably good for an audiophile's soul to maintain equilibrium.
I believe "rational audiophiles" (nay, simply rational human beings) appreciate that there is a world of "objective truth" out there that we should be tapping into, and that not everything in this world is about one's own preferences or beliefs, even if these are often the most important factors in decision-making. As we've seen in the news recently, there have been sad examples of solipsistic thinking leading to poor outcomes. Consider the unsubstantiated and irrational fears around immunization and measles outbreaks that should never be happening especially in the developed world in 2019. Though obviously as audiophiles we don't generally have to worry about disastrous outcomes (I hope nobody drains the bank account too much, feels too ripped off by "snake oil", or experiences marital discord on account of poor "acceptance factor" purchases!), it is important to keep an eye on achieving reasonable goals around "hi-fi" rather than going extreme into what J. G. Holt called and Brent Butterworth recently wrote about - "My Fi".
Something I have noticed over the years when conducting blind tests here is that some individuals are very much opposed to the exercise (especially the act of blinding). These are individuals who will come out and say that the "ADC isn't good enough", or the "90 second sample isn't long enough", or that the test will likely yield no significant results because of some characteristic they don't like without actually having any reference to evidence. While I appreciate the use of critical thinking skills, I wonder, do these individuals apply the same critical thought process to their own subjective impressions before expressing them online? Do they critically consider the subjective opinions of those they read in the media, the comments expressed on YouTube audiophile channels (like this and this), or the testimony of those in the Industry with financial interests? IMO, the fear of testing is a sign of insecurity much of the time.
"Pure subjectivist" audiophiles often have no qualms about writing in reviews that they heard "obvious" changes, how "thick veils" were lifted, how a cable or component rejuvenated their system, how much the bass improved, or the "sweetness" of the treble was elevated. But the moment one wants to suggest a test to prove the veracity of these claims in a controlled, blinded fashion, they either disengage, disparage, or just whine. In some instances, they'll bring up a failed test from ages back and suggest that somehow this is representative of controlled, blind testing for all time (ahem... JA and his 1978 Quad experience). Despite blind testing being widely recognized as an essential part of most serious research (let's not split hairs at this time about whether single or double blind, ABX, etc...), many audiophiles prefer to find faults and make unreasonable excuses (like this).
Seriously guys, if some commonly-held subjective claims like changing USB cables resulted in massive differences (as recently claimed by Paul McGowan in his video ~5:30 where "every head" turned, asking "what just happened?"), then how hard would this be to prove with a blind test? Shouldn't that be even easier than picking out Device A as a computer motherboard in this test based on the "strength" of that story? Why not use basic "honesty control" techniques which can be used to elevate countless anecdotes such as this into the realm of evidence that can be replicated and hopefully substantiated?
I read with interest Steven Stone's recent article ("An Audio Test That May or May Not Prove Something") on our test here. I appreciate Mr. Stone for letting his readers know about the test; I suspect the article added a few more respondents to the database of 101. As you can see, the article starts not with trying to recognize the importance of independent blind testing but rather, by implication and language used, seems to want to cast doubts for readers as to what it means "if no discernible difference was found". What if "no discernible difference" simply means exactly that? There was indeed nothing to find! Sure, we have to be critical about research design, proper data collection, reasonable analysis, consider the respondents, etc., but isn't it OK also that results are just negative? (In fact, in the research world there are valid concerns about "positive publication bias".)
When it comes to high fidelity audio, with the evolution of digital technology getting better, faster, cheaper, and the human auditory system obviously not evolving any quicker in response, should there not come a point where the technology is more than "good enough" and testing at the margins of audibility will show no further difference? By the way, reading that first paragraph by Mr. Stone, it sounded like even before any results were known or published, he expected our results here to be negative - O ye of little faith!
Since Mr. Stone's article summarized nicely some criticisms of this blind test I've also read expressed elsewhere, let's address them briefly, but I hope satisfactorily.
One criticism is around the ADC/DAC steps involved... Folks, have a listen to this 8th generation AD/DA comparison done by esldude on Audiophile Style if you think modern AD/DA conversions are so terrible that they result in massive changes/coloration in sound. For the blind test here, I'm only doing a single generation audio capture in 24/96 of 16/44.1 playback and the listeners will have heard 2 DA steps (once from the Device playing while digitized by the ADC, and the other from their own DAC in 24/96 high resolution). Again, I refer the reader to the comparison above where I showed the exact left-right balance for peak and average amplitudes when we compare the original CD rip with the Oppo's ADC recording. IMO, there is no big problem here.
As for the other critiques from Mr. Stone, such as "the device or App used for the actual volume adjustments was not noted" - well, why should I need to say every little detail at the outset? If this was such a big deal, thinking that I was going to use some poor software or "app", anyone could have just posted a comment on the Test Invitation page and I would have answered. I trust there's overall satisfaction with 32-bit volume adjustments using Adobe Audition CS6 which I explained in Part 1 of this series.
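To make the idea of level matching concrete, here is a minimal sketch of measuring peak and average (RMS) levels and computing the gain needed to bring one recording to the level of another. This is not what Adobe Audition does internally; the helper names, the 1 kHz test tone and the 3 dB offset are all assumptions made up for illustration, and the math is done in ordinary floating point on samples scaled to ±1.0.

```python
import math

def rms_db(samples):
    """Average (RMS) level in dB relative to full scale (1.0). Assumes non-silent input."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

def peak_db(samples):
    """Peak level in dB relative to full scale (1.0). Assumes non-silent input."""
    return 20 * math.log10(max(abs(s) for s in samples))

def apply_gain(samples, gain_db):
    """Scale samples by a gain given in dB, using floating-point math."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]

def match_gain_db(reference, capture):
    """Gain in dB to apply to `capture` so its RMS level equals that of `reference`."""
    return rms_db(reference) - rms_db(capture)

# Hypothetical data: one second of a 1 kHz tone, and a "capture" that is 3 dB quieter.
fs = 44100
reference = [0.5 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
capture = apply_gain(reference, -3.0)

gain = match_gain_db(reference, capture)
matched = apply_gain(capture, gain)
print(f"gain needed: {gain:.2f} dB")
print(f"reference RMS {rms_db(reference):.2f} dB, matched RMS {rms_db(matched):.2f} dB")
print(f"reference peak {peak_db(reference):.2f} dB, matched peak {peak_db(matched):.2f} dB")
```

Running the same two measurements separately on the left and right channels is one simple way to check that channel balance survived the capture.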
Also, "For me the main issue in this test was the use of 16/44 files". Huh? Why is this the main issue? As I explained, these days, are we not still collecting and playing 16/44 digital audio primarily? And why would "whether the original source file was a digital file or CD" matter? Although I cannot be sure, it seems to me that he is hinting at the insecurity of the "bits are not bits" crowd who still seem to think that a bitperfect rip from a CD somehow "sounds different" from an otherwise exact digital file, as if things like jitter can somehow travel with the data, or maybe FLAC conversion affects the sound even on modern hardware (looked into awhile back). I hope this is not what he's getting at because it obviously flies in the face of how digital audio works.
Remember that digitizing the analogue output from 16/44.1 playback to 24/96 is not the same as taking a digital signal and upsampling in a non-integer fashion (not that these days that's even a big problem!). There is no issue - remember, digital audio playback is not "stair-stepped" as some ridiculous ads might show or misinformed commentators might insinuate. The analogue output is smooth and the 24/96 digitization just sampled that.
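For anyone who wants to see the "no stair-steps" point in numbers, here is a small sketch using ideal (Whittaker-Shannon) sinc reconstruction. It takes a tone sampled at 44.1 kHz, rebuilds the waveform at the instants where a 96 kHz ADC would sample it, and compares against the true sine at those instants. This is an idealization for illustration only: real DACs approximate this behaviour with finite filters, the 1 kHz tone and the window of 1000 samples per side are arbitrary assumptions, and the small residual error comes from truncating the infinite sum.

```python
import math

FS = 44100.0       # playback sample rate in Hz
ADC_FS = 96000.0   # capture sample rate in Hz
TONE = 1000.0      # test tone in Hz (illustrative choice)
HALF_WIDTH = 1000  # samples used on each side of the reconstruction point

def sample(n):
    """Value of the tone at the n-th 44.1 kHz sampling instant."""
    return math.sin(2 * math.pi * TONE * n / FS)

def sinc(x):
    """Normalized sinc function: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(t):
    """Band-limited value at time t (in 44.1 kHz sample periods), truncated sinc sum."""
    center = int(round(t))
    return sum(
        sample(n) * sinc(t - n)
        for n in range(center - HALF_WIDTH, center + HALF_WIDTH + 1)
    )

def worst_error(points=50):
    """Largest gap between the reconstruction and the true sine at 96 kHz instants."""
    worst = 0.0
    for m in range(1, points):
        t = m * FS / ADC_FS  # position of the m-th 96 kHz sample, in 44.1 kHz periods
        true_value = math.sin(2 * math.pi * TONE * t / FS)
        worst = max(worst, abs(reconstruct(t) - true_value))
    return worst

print(f"worst reconstruction error: {worst_error():.6f} (full scale = 1.0)")
```

The reconstructed values between the original samples land on the smooth sine curve to within the truncation error, rather than holding the previous sample's value as a stair-step picture would suggest.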
Nonetheless, it's good to see that Mr. Stone was able to conclude that "this one has at least a chance of arriving at something..." (despite "My initial response to this on-line test was decidedly negative" - why!?). As you read last week, indeed there are findings worth thinking about. I wonder if Mr. Stone himself might have been surprised by these results. But why is it that so many in the seemingly "subjective" camp are so eager to find faults with more objective methods when really IMO they should be putting more thought into their own "blind spots", biases, perceptual and cognitive limitations? Do some listeners think they have no such limitations? What preconceived notions do some hold about the nature of sound and human perception? Do they honestly think that the best engineers in this world who conceived the devices we listen to will turn a blind eye to objective verification of their designs?
I can only assume that some actually believe they are immune to psychological biases - for example, consider this "professional scientist"... For those curious, here's a little review to consider (even if you're not a medical student) and a more scholarly paper on confirmation bias.
I know that Mr. Stone also writes for The Absolute Sound. Considering that TAS is quite widely circulated here in North America, it is unfortunately also a place where objective exploration of technological products is nowhere to be found. Or when it is, it's honestly kinda weird.
As much as I enjoy running these kinds of tests and exploring the results with you, I've always believed that it should be the "professional" media's role to educate and to try to find truth in the claims the Industry makes. Instead of finding truth, sadly these days, the media often act as nothing more than perpetuators of many questionable claims and as advertisers for such products. IMO, they should be the ones engaging with the audiophile hobbyists to show what is important and what isn't. Would it not be noble as actual journalists to sniff out the snake oil, correct misconceptions, squash myths, in effect presenting a balanced picture for consumers? Are the audiophile press (both in print and online) doing this at least to some extent? If not, then I think audiophiles should be asking whose interest the media serves.
I recognize of course that the audiophile media is but a tiny grain of sand considering all the places where these same questions can (and should) be asked.
That's all for now. Hope you enjoyed the blind test and reviewing the results... A final thanks and congrats to all the 101 participants for a job well done.
Until next time, enjoy the music!