Saturday, 13 June 2020

BLIND TEST RESULTS Part III: "Is high Harmonic Distortion in music audible?" Subjective Descriptions

As we've seen in Part II last week, based on the preference data, there was a pattern for the respondents to this blind test to choose the samples with lower added distortion as sounding "better". I believe this is encouraging for audiophiles who seek "high fidelity" and "accuracy" in the reproduction of music. It's a demonstration that correlates objective levels of distortion with a subjective preference.

Today, as we end off the write-up for this blind test, let's consider the subjective descriptions of what was heard by the respondents. Words describing experience and feelings can be difficult and imprecise, but by correlating how listeners expressed themselves with how they ranked the samples, perhaps we can appreciate the scope of adjectives used when people listen to content with significant harmonic distortion...

I. The "Golden Ears"

To start, why don't we examine what the 5 "Golden Ears" said? These are the individuals who achieved a "perfect" score and were able to list the samples from "least" to "most" distortion and were able to correlate them from "best" to "worst" sounding...

For reference, remember the samples and how much THD was applied to each:
Sample A = -50dB/0.3%
Sample B = -175dB/0.0000002%
Sample C = -75dB/0.02%
Sample D = -30dB/3.2%
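As a quick check on the figures above, here is a minimal sketch of the dB-to-percent conversion. It assumes the dB values are amplitude ratios relative to the fundamental (the usual 20·log10 convention for THD); the function name is just for illustration.

```python
def db_to_percent(level_db: float) -> float:
    """Convert a distortion level in dB (amplitude ratio, assumed) to percent."""
    return 100.0 * 10.0 ** (level_db / 20.0)

# THD levels of the four test samples as listed in the post
samples = {"A": -50.0, "B": -175.0, "C": -75.0, "D": -30.0}

for name, level_db in samples.items():
    print(f"Sample {name}: {level_db:g} dB = {db_to_percent(level_db):.7f}%")
```

Under that assumption the results round to the percentages listed for each sample.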
Of the group of "Golden Ears", only 2 provided a subjective description. Remember, this is an international group of listeners, so it's not unusual to receive no subjective comment, perhaps because English is a second language for the respondent.

Here are the 2 comments:
A sounded slightly more "bright" than B. C sounded very nearly the same as B, very hard to tell the difference. D sounded more "blurry", transients not as well defined.
I heard distortions at the the high and even more at the low range of my speakers.
Both of the respondents felt that the differences were only "small". Unlike typical flowery reviews of equipment with all kinds of words about how soundstage varied or depth of perception changed, notice that the descriptions were far from highly detailed, for better or worse :-). There are hints of anomalies like tonality difference ("bright"), resolution loss ("blurry"), and perhaps temporal change ("transient" affected).

That's all!

II. Those who heard a "HUUUUGE" difference between samples

Okay then, let's consider the folks who felt that there was a "huuuuge" difference between tracks, what did they hear? A sample of some descriptions:
'A' was horrible. I DNF'd on it with Tootsie, and almost DNF'd on Horse. It was like listening to cassette tapes, which I've always hated. 
'B' was slightly better than 'D' but maybe almost too much of a good thing. When people talk about clinical, I could see them meaning that in regard to 'B' vs. 'D', esp. with poorly mastered material. 'D' felt ever so slightly smoothed-over. I'm not sure I would pay for 'B' improvement over 'D'. 
'C' wasn't overtly objectionable, but compared 'B' and 'D' it was not as good. I would pay to upgrade from 'C'. 
Interesting story time: I listened to them in reverse order. So, I listened to the cello piece last. I had pretty much decided on the order by then, but I was reserving judgement on 'B' vs. 'D'. In the cello piece on a couple of his notes (for whatever reason I decided to max vol for the last test) it caused two items touching each other in our living room to resonate. At first I thought it was A's bad recording. Then with 'B' it was worse, almost broken speaker time. 'C' not so bad, by which time I was pretty sure it what was causing it, but I didn't want to get up out of my chair. It came back for 'D', but not as bad as 'B', which "settled it" for me between 'B' & 'D'. Hope that's not cheating. ;) 
This was fun. I'm going to have my son re-randomize the files and I'll re-run & re-submit the test with a couple of my headphone setups.
Thanks for organizing!
Wow. Thanks for the detailed description! Indeed, A, although not the sample with the highest level of distortion, had the second highest. And I see this respondent picked the -175dB sample as "best", choosing B > D > C > A.

This same respondent also included another comment after listening with headphones:
Hey Arch, I finally re-ran the test with headphones (had my son re-randomize the files for me). It was much, MUCH harder to differentiate with headphones than my loudspeakers from when I tested  the first time. With a pair of ATH-M50's I couldn't tell /any/ difference. With the Beyerdynamics DT-1350's, I felt I could tell a difference, but it seemed slight. 
As I recall my preferences from the first time using my loudspeakers (best to worst) were: C -- B / D -- A vs. C -- D -- A -- B this time. 
Last time I did not like (A) at all. I said I would pay $ to improve my system to have any of the others relative to it. 
I preferred (C) a slight amount but not enough to pay for it.
This time it seems I still preferred (C), but not enough to pay for it, and (A) wasn't even last! Wow. 
It's interesting (surprising) that I experienced such a big difference between headphones and loudspeakers. I absolutely expected headphones to be more revealing/make me more picky. I guess if one is a glass half-full kind of guy, everything sounds great on headphones! ;~)
Thanks for taking the effort with headphones and commenting on the difference! Not easy at all and I think with different transducers, one could certainly come out with different preferences.

Here's another respondent who thought the samples sounded hugely different:
I'm not really sure about the rankings. When I used ATH-M40x, I could hear very subtle differences, and it was very hard to tell which was better. Using ER4SR the differences were huge, but still hard to tell which is more distorted. I think C sounded cleaner than the rest, more detailed on the "Horse", with more clear separation of instruments on "Tootie", and more clear sound of background players' breathing on "Clavier"
B sounded noticeably more "mellow", less harsh than the others, with less attack, and much less pronounced voice reverb on "Horse". I think it was maybe the most distorted of the four, as it sounded the most unlike C, but it was more pleasant than A or D due to its mellowness.
A and D sounded dirty, just dirty. Creschendo on "Rhapsody" turned to mess, Drums on "Tootie" sounded "brickwalled", etc.
I think I'd have hard time separating A from D, with C clearly on one side of this spectrum and B on the other distortion-wise.
"Cleaner", more "separation of instruments" used as subjective differentiators. He described the highest THD samples A & D as "dirty, just dirty" sounding. Ultimately he chose C > B > D > A - not bad at all in that he was able to separate the 2 groups: C & B are indeed the least distorted, while D & A were the higher distortion group.
Best C has very clear and full sounding harmonics of piano, stings, percussion, voice. All other samples exhibit increasing amount of destruction of these harmonics, not natural and not full sounding.
I wonder how much cleaner the clean sample could sound with some equipment upgrades, such as amplifier.
This respondent ranked from "best" to "worst": C > D > B > A. This is an example of one of a number of respondents who felt the -75dB/0.02% sample sounded best. The amplifier used is the Yamaha R-N500 which is certainly not bad at all!

III. Those who felt there was a "big" difference

Next are the folks who felt the differences were still quite audibly significant in a "big" way.
Dear Αρχιμαγο, (since it's Greek), sample B, was clearly the best sounding in my system. You will tell us why, I hope not because of the distortion, hahaha. It became clear to me when I focused on Jenifer's vocals, the center image and placement on the Hootie's song, and very much on the last piano sample. I won't go into analyzing how bad were the worst samples, except saying that sample A was almost annoying. 
I rip my cd's, I listen 90% files, I don't stream, and i don't care about the bitrate, I love DSD, BUT It's all in the recording. 
I do listen to the same song, in 44 and 192 to determine which is best, and I don't mind admiting the truth when  44 is the winner. 
I love great Dynamic Range, and hate highly compressed Hi-Rez files! 
I'm reading your pages for many years now, it's been eye-opening and educating for me. I believe that something that measures well, sounds well too, though I hate to admit I've heard great music from guys with valve amps and old speakers. 
It's an oxymoron, that although you are a 'Mago, you mostly demystify things in this magical hobby of above all, matching things!
Thank you for the nice response my Greek friend! Love the spelling of the name :-). Yes, absolutely, it's much more about the mastering of music than the technical stuff like 44kHz or 192kHz or DSD. Guess what... You're "right"! Sample B is objectively clearly the "best" by a rather wide margin as this was the -175dB/0.0000002% THD track. I see you rated the samples: B > D > C > A.

By the way, you're right that "Archimago" is the name of the conjurer in Spenser's Faerie Queene. In high school, I liked the name and used it as my alias playing D&D. And continued using the name online as my handle over the years :-). These days, for the audiophile consumer, I hope to demystify things (which is not all that magical of course!), but at the same time make it less comfortable for those who sell snake oil and appear to be dealing with magic.
All in all, the destruction of the scene, the hiding of sounds, the loss of overtones and the fullness of music with harmonics.
A good description of what high harmonic distortion might do to music. This respondent rated: C > B > D > A. Again, like one of the respondents above, able to prefer the lower distortion tracks over the higher ones.
Bass tends to move
Distortion especially at the top
Better balance in the better versions
The worse versions tended to have a bland sound
Interesting subjective descriptions. I assume "distortion especially at the top" might be referring to higher frequencies felt to be affected more? This listener rated: C > B > A > D. Again, a general preference for the lower distortion tracks over the higher ones.
Track 4 (Lang-Lang): My twelve-year-old plays the piano and rehearses several times a day. Therefore, I got an idea of how the piano sounds when listening to it from different rooms of the house. 
Sample A sounds as listening to the piano from the room above the room where the piano is located, doors open (no reverb left) 
Sample B sounds as listening to the piano from the room next to it, door closed (sound subdued) 
Sample C sounds as listening to the piano from sitting next to it, lid closed (more direct sound) 
Sample D sounds as listening to the piano from the room next to it, door open (more indirect sound) 
Nevertheless, I always enjoy listen to my daughter playing the piano, no matter in which room I am in! 
What amazed me most was that only Sample A of the 4 tracks created a slight stress for my brain in putting a sound image together. For every other sample B-D my brain could obviously refer to a listening experience enabling it to create a realistic sound image.
On the other hand, I probably would not buy an amplifier which makes the music sound as it was coming through closed doors…
Interesting use of a real-life comparison of how the sound was spatially experienced! This listener selected: C > D > B > A. Another preference for the -75dB/0.02% THD track as "best".
more detail on question 10: the "Horse" track seemed hardest to hear differences, seemed mostly like a vague sense of "natural presence" vs "distance" (maybe "veiled"). The others varied by what you could hear and for what ranges, but all three seemed easier to assess than that one. 
second comment: I think that having the distortion in the same order for all tracks may bias the results. Having first evaluated one set, the listener is then biased to what to expect in the next set. I found myself listening for anticipated changes and seeking reasons to put them in the same order, instead of truly listening to differences. I wonder if you did the same test, but with all sets of tracks randomly sorted, if you would get a different set of answers.
Yeah, good point about maintaining the same order on all tracks. Alas, I find it's a balance between simplicity of the test for data collection and also trying to make it not too complicated for listeners, hence the decision to keep the distortion settings in the same order. I think if I were to run this in a lab situation, I would certainly randomize the sequence. You ranked: A > D > B > C. Interesting selection which suggests that you might like the changes in sound that harmonic distortion makes. Nothing wrong with that of course... You might want to play with Distort more and listen to the effect of harmonic distortion to verify if this is the case.
I find the more distortion there is (in my opinion), the less clearly each musician or singer stand apart from each other, there is more perceived blending (or bleeding) and dynamic range is reduced. There is no place where I could say: Ah, here this sound is distorted (except maybe the fortissimo piano chords of Lang Lang). Good choice of test files with a female singer having some sibilance, a baritone with a grainy voice and slow bowing from Yo Yo Ma also having some grain. As for Lang Lang, he has a hard touch and those block chords stimulate the piano inharmonics a lot, so there is inherently something recalling distortion in all four pieces. 
It was not very hard to select the best and worst, but the others were harder to differentiate, so I'm less sure of those.
Thanks for the feedback on the music selection and what you heard. Here's what you selected: C > A > B > D. A mixed selection but, like a number of others, you also thought the -75dB/0.02% THD tracks sounded "best" and were correct in identifying D as "worst".
I noticed that the higher ranked recordings of D and A had a much broader sense of space and depth that the lower ranked recordings.
Interesting. Samples D and A are in fact the ones with more distortion added. You ranked the samples: D > A > C > B. Fascinating - you ranked the samples "correctly" but from highest distortion to lowest! This is actually not unexpected since we are looking at subjective preferences. Increased harmonic content for some people will make the sound "fuller" and this can be interpreted as "space and depth", I suspect.

IV. Those who felt there was "only a small difference"

Of all the categories, this is the largest group with 21 individuals. For brevity, I'll select the most detailed descriptions in this group to discuss...
I found that throughout all music samples track B was clearest and most subtle. I would love to know whether B this was the one with the lowest or no distortion added. C has for me the roughest (hoarse sounding) and sometimes blurry sound. Between A and D I could not really make my mind up which one is more natural.
Yup, indeed Sample B had the lowest distortion added. And both A and D were the tracks with more distortion added. Your ranking: B > D > A > C.
I think B sounded slightly more transparent, more obvious "air", more detail, less rounded, less full.  B seemed leaner than the other tracks.  My wild guess is that as you add more and more harmonic distortion the sound becomes fuller, warmer, less transparent.  As a result, I am guessing B has the least amount of harmonic distortion.
Nice work, another vote for B (-175dB) sounding best. Certainly with less distortion added, there could be the sense of "leanness" - literally less content being added to the signal.  Overall, you voted: B > C > D > A.
No matter how hard I tried, I couldn't put my finger on a specific difference between the samples. I certainly couldn't say "I hear distortion". However, there was something said to me "I like this" and "I don't like this" almost immediately (within a few seconds) for each sample. Consistently I did not like Sample C. Consistently I liked Sample A. B and D were somewhere in between, and I would be happy listening to them.  
Because I felt that confirmation bias might be creeping in (because I had decided I liked A and didn't like C right from the first track), I asked my son to anonymize the tracks before I tried again using headphones. I still consistently didn't like C (it was bottom or second from bottom across all tracks). For Track A again I picked it as my favourite 3 times out of 4 - but for some reason on Track 2 I placed it in position 3 with D as my favourite. 
Complete results in terms of order of preference for each track on headphones: 
Track:    1  2  3  4
A:        1  3  1  1
B:        2  2  3  2
C:        3  4  4  3
D:        4  1  2  4
Wow! Thanks for the extra work on this and the data! As you've experienced, this is not an easy task and speaks to the subtlety of harmonic distortions when we're listening to actual music rather than just test tones. I see your "final answer" was: A > B > D > C.
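For anyone wanting to collapse a per-track table like this into a single order, one simple option is the mean rank per sample. This is only a sketch under that assumption; the respondent did not say how the final answer was derived, and the function name is hypothetical.

```python
# Per-track preference ranks (1 = favourite) from the respondent's
# headphone session; list positions correspond to Tracks 1-4.
ranks = {
    "A": [1, 3, 1, 1],
    "B": [2, 2, 3, 2],
    "C": [3, 4, 4, 3],
    "D": [4, 1, 2, 4],
}

def mean_rank_order(ranks):
    """Return samples sorted by mean rank (best first) plus the means."""
    means = {sample: sum(r) / len(r) for sample, r in ranks.items()}
    return sorted(means, key=means.get), means

order, means = mean_rank_order(ranks)
print(" > ".join(order))  # A > B > D > C
print(means)
```

For this table the mean-rank order happens to match the submitted "final answer" of A > B > D > C.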
B and C are indistinguishable to me,  similarly A & D sound about the same. 
First pass, C seemed the "cleanest" generally, and I stuck with that initial impression (although I could be wrong: who knows what the actual recordings sound like without running through an amp of some kind?) 
The versions I marked low (A&D) sounded a bit clangy in the piano chords and a bit edgy on voice sforzandi (JW's "and the NIGHT...") and a tad less clean. 
Clavier revealed almost no differences.  I didn't listen to Hootie because I don't like them. 
I think choral music and string sections might be more revealing: that always seems to me where systems fall down.  Try listening to Hyperion's 40-voice motet "Spem in Alium" on different setups (especially headphones vs speakers+room.)
I'm interested to see the results, as from best to worst, the audible effects to my ancient ears are smaller than I expected.
I might revisit with more revealing headphones (Koss ESP950s)... I doubt that I would hear much difference over speakers.
Your answer was: C > B > A > D. Great work! Another example where the respondent was able to identify the two low-distortion and the two high-distortion tracks as pairs, even if not in exact sequence, as revealed in the first sentence of this response! Interesting comment about the "clangy" piano chords as a marker of the harmonic distortion.
Re Q11 - I debated with myself how to answer that one.  After all, how do I know for sure why I liked one version over another?  Maybe I like harmonic distortion, or maybe not.  Lets just say I am looking forward to the results.  To be clear, I have answered what I believe to be true, but I have not discounted the possibility this belief might be wrong.  I could always PX the Devialet for a valve amp I guess :-)  Finally, for the record I do have public performance experience, but only as an ex DJ, which doesn't really count for much here …..  Thanks for offering this test, fascinating, and I look forward to being humiliated by the results.  Whatever the results, it will be educational, so much appreciated.
Final response: D > C > B  > A. Not easy! Good discussion on the debate around whether we might like harmonic distortion or not. We'll talk more about this at the end...
In the more distorted versions the bass feel more extended, fuller, some sounds seem to overlap, to mix. Not clear in some examples. 
To be honest, I think that a piece or two of full orchestral music (modern soundtrack or classical symphony)  and a more modern track (not only acoustical) instrument would have been more representative. The four samples are limited to very few acoustic instruments with a lot of added distortion by its own nature. Maybe, just maybe, more complex tracks or less "acoustics" instruments would have shown clearer differences.
You could be right about the tracks selected... Debatable of course, since I find large orchestral music also has its own potential confounding factors, including the venue of the recording, and the complexity itself might make harmonic distortions less audible compared to a more intimate vocal track or sparse individual instruments, where we might be more familiar with what a lone voice or instrument "should" naturally sound like. The response here was D > A > B > C. Interesting, since there was a preference towards the tracks with more harmonic distortion. Perhaps subjectively it was the other way around, and the less distorted tracks were actually the ones this listener experienced to be "more extended" and "full"?
I heard a certain harshness in the track I liked the least and a certain hissing that made the vocal intelligibility worse, but only in direct comparison with the other tracks. This was best heard on my headphones, closely followed by my AV receiver in "Extended Stereo" mode. 
It was fun, even though I'm not sure if I actually noticed the harmonic distortions.However, the selected pieces of music do not make harmonic distortion particularly clear.  I am curious about the results.
Answer: C > D > B > A. You liked the -50dB/0.3% track the least. Not sure if this is the reason for the "hissing", but it was certainly one of the more distorted samples.
Traces of scratchiness/edginess on some of the tracks.
Answer: D > A > C > B. Interesting! You selected the opposite order in terms of added harmonics! Could the "scratchiness/edginess" actually be things like finger movements on the instruments, subtle vocalizations, or fret noise that were actually obscured by the added distortions?
This was a tough test! The differences were small (even though I suppose the differences in harmonic distortion were large :-) ) and initially I found myself attracted to the "liveliness" of C which  I eventually decided was due to more distortion and I changed my ranking. My old ears are not so acute anymore!
You said: D > A > C > B! Again, like the respondent above, your selection was in the inverse sequence of the amount of distortion added! Maybe your initial attraction to the "liveliness" of C is actually a reflection that you indeed liked low distortions... But you second-guessed yourself thinking that this was actually due to added harmonics?
This is Taylor C. I had an interesting experience because I forgot that the order of best to worst should end up being the same for all songs, so after listening and ranking my first three songs, I looked back at my notes and realized that I had A and B as the worst two and C and D as the best two. I figured I was developing a preference somehow related to the order of listening because the differences were subtle enough that they very well could have been imagined. But then I re-read your instructions and was pleasantly surprised to see that they were supposed to be like that, which increased my confidence level that I was actually hearing differences rather than imaging them. 
Regardless, if I couldn't switch back and forth quickly, I'm not confident I would notice even the highest amount of harmonic distortion you added, which I subjectively interpreted as maybe a very faint buzzing character or less-smooth quality to the tones. Maybe I would notice such a thing for a song I'm very familiar with, but even that is questionable.
Hey there Taylor. Yeah, as you can see, not easy eh? Even with the ability to switch quickly between samples... How much more difficult it would be if one had to switch cables or move components in and out of the system! I see you ranked: C > D > A > B.
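Several rankings in this section ran partly or fully opposite to the amount of distortion added. One way to put a number on that is a rank correlation against the true order B > C > A > D (least to most THD). The sketch below uses Kendall's tau and a simple "did B and C land in the top two" check; both are my assumptions for illustration, not the scoring used in the survey analysis.

```python
from itertools import combinations

# Samples ordered from least to most added THD (B = -175dB ... D = -30dB)
TRUE_ORDER = ["B", "C", "A", "D"]

def kendall_tau(ranking, reference=TRUE_ORDER):
    """Kendall rank correlation between a best-to-worst ranking and the reference."""
    pos = {s: i for i, s in enumerate(ranking)}
    ref = {s: i for i, s in enumerate(reference)}
    concordant = discordant = 0
    for x, y in combinations(reference, 2):
        # A pair is concordant when both orders place x and y the same way round
        if (pos[x] - pos[y]) * (ref[x] - ref[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

def separates_groups(ranking):
    """True if the two low-distortion samples (B, C) occupy the top two places."""
    return set(ranking[:2]) == {"B", "C"}

for text in ["B > C > A > D", "C > B > A > D", "C > D > A > B", "D > A > C > B"]:
    ranking = text.split(" > ")
    print(f"{text}: tau = {kendall_tau(ranking):+.2f}, "
          f"low-distortion pair on top: {separates_groups(ranking)}")
```

A tau of +1 corresponds to a "perfect" ordering, -1 to the exact inverse (as with D > A > C > B), and values in between to partial agreement.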

V. Those who felt there was "very little difference" - basically very subtle...

Next, here are some comments from those who felt the differences were minute and "not worth an upgrade" if this is the difference between products. 
At first I listened with my speakers to all the tracks straight through, ranking each one from best to worst. It was hard noticing any difference, but I was pretty sure all the A samples were the worst (most distortion). As for the other ones, it was really all over the place. 
Then I listened with my headphones. This time I listened in reverse order – from D to A for each set, instead of A to D. It was actually harder to tell the difference and I was no longer sure that A as the worst. 
I finally put every set of samples into audacity (I DID NOT look at the spectrogram, I promise) just so I can jump between them quickly and to a/b test between different samples (using headphones and speakers). This was the least revealing test, and even jumping quickly back and forth while looping A second of audio yielded no noticeable difference. 
I tried also listening at different volumes. It didn't help.  
My ranking from best to worst is pretty much a guess, except from sample A which I ranked the worst, which was the most consistent thing that I could hear (though not in the quick a/b testing).  but the difference could be imagined for all I know. I chose the piano peace as the most revealing, though this is also just speculation and could be an imagined effect. 
Subjectively, this is all music I never listen to normally (I'm a proud metalhead). I wouldn't know how it supposed to sound or what am I supposed to listen for when I listen to it. The only thing close to "audiophile" experience I had with this music, is the piano piece had the most realistic imaging (funny thing – when I put on the sample of the rhapsody with the headphones the first time, at first I thought I was still hearing it from the speakers). I think this is just due to it being a single instrument recorded in a way that realistically captures the acoustic space around it. Still pretty boring stuff, though ;-)
Thanks for taking the time Mr. Metalhead. Clearly not the "usual" type of music for you :-). Your final ranking: B > C > D > A. Congratulations! Despite all the years of head banging and listening to loud (distorted) music, you did a great job with selecting the least distorted tracks as "best"! Keep rockin'...
Если честно,  то заметил я лишь (по моему убеждению) максимальный уровень искажений и только на одном жанре, с насыщенной басовой партией (Tootie by Hootie),  все остальные практически невозможно было ранжировать - расставил практически интуитивно. 
Translation: "To be honest, I noticed only (in my opinion) the maximum level of distortion and in only one genre, with a rich bass part (Tootie by Hootie), it was almost impossible to rank all the others - I arranged it almost intuitively."
Nice, a response written in Russian (I believe the survey system tagged this as a response from Ukraine). As I mentioned previously, this is an international effort :-). The respondent answered: B > A > C > D. Exactly as the comment suggested. He correctly picked the "best" as lowest distortion Sample B, and the "worst" as highest distortion Sample D. Difficult to arrange the middle samples. Well done!
Horses: The "Sss" seems cleanest a I sorted, however the difference is not reliably detectable (ABX compare) Ranking: Best: D, B, C ,A: Worst 
Rhapsody: the ranking is different again not reliably detectable: Best: C, B, A, D: Worst. In rhapsody the parts with octave jumps and thins sound is the best for me to test.The ABX Test failed in all 4 pieces. I could not reliably detect in the blind test the ranking (: 
This means the distortion is not at all obvious. 
In a personal experiment I detected that with music rather high distortion is not detectable in the amount of 3-5%. With a sinewave, the distortion was detectable at an amount of around 0.5%. The sine sounded harsher with distortion. 
Conclusion: In music the masking effect is strong.
Thanks for commenting on your ABX results! Your final ranking was: D > B > C > A. It is interesting that your ranking for "Rhapsody" would have been closer to the sequence from low to high distortion added. Thanks also for commenting on the personal experiment and your finding that harmonic distortion is significantly easier to hear with a pure sine wave, becoming detectable by the time 0.5% THD is added. I have done some tests myself and indeed my result is similar, although it depends on the specific amount of each harmonic, and higher harmonics are less easily masked by the fundamental.
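For readers who want to try a sine-wave experiment like this themselves, here is a minimal sketch that builds a 1kHz tone with a chosen THD and verifies the level with a single-bin DFT. The harmonic profile (equal 2nd and 3rd harmonics), sample rate, and function names are my assumptions for illustration; this is not the profile used for the blind test samples, nor necessarily what the respondent did.

```python
import math

def distorted_sine(freq_hz, thd_percent, harmonics=(2, 3), rate=48000, n=4800):
    """Unit sine at freq_hz plus equal-amplitude harmonics totalling the given THD."""
    thd = thd_percent / 100.0
    # Split the distortion equally: sqrt(k * a^2) = thd  ->  a = thd / sqrt(k)
    a = thd / math.sqrt(len(harmonics))
    out = []
    for i in range(n):
        t = i / rate
        s = math.sin(2 * math.pi * freq_hz * t)
        s += sum(a * math.sin(2 * math.pi * h * freq_hz * t) for h in harmonics)
        out.append(s)
    return out

def amplitude_at(samples, freq_hz, rate=48000):
    """Amplitude of one frequency component via a single-bin DFT."""
    n = len(samples)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / rate) for i, x in enumerate(samples))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / rate) for i, x in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

def measured_thd_percent(samples, freq_hz, max_harmonic=5, rate=48000):
    """THD as the RSS of harmonics 2..max_harmonic relative to the fundamental."""
    fund = amplitude_at(samples, freq_hz, rate)
    harm = math.sqrt(sum(amplitude_at(samples, h * freq_hz, rate) ** 2
                         for h in range(2, max_harmonic + 1)))
    return 100.0 * harm / fund

# 4800 samples at 48kHz hold exactly 100 cycles of 1kHz, so no windowing is needed
tone = distorted_sine(1000, 0.5)
print(f"Measured THD: {measured_thd_percent(tone, 1000):.3f}%")
```

Changing `harmonics` to higher orders, for example `(5, 7)`, is a simple way to hear how the same THD figure can sound quite different depending on which harmonics carry it.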
Room characteristics and noise floor will be just as important as how much is spent on equipment. And the type of equipment. My MC225 probably adds more distortion that what was added to the samples. 
Big fan of your blog, been a reader for many years.
Thanks for the feedback man. Yup, indeed all kinds of factors will play a part in the final sound / preference. Indeed, your McIntosh MC225 tube amp will have its own unique harmonic signature which could be quite high, potentially further modified by the type and age of the tubes used. I see your final selection was: A > D > B > C.
First of all, I really tried to find any difference as that seemed to be very hard. After a while I became convinced that track A was sounding warmer, while D was more "airy", less "filled" with sound.
Cool descriptions of the 2 tracks with the greatest amount of distortion added. I see you selected: A > C > B > D. Yup, the "airy" track is the one with the most distortion, as your ranking suspected.
The ordering I performed above is totally bogus. For Clavier, I felt track A was "best", and for Rhapsody it was track D. For the other 2 pieces I could not rank them. I might have heard slight differences in "transparency", "fog", or "artificiality", but on different listenings these would be apparent on different tracks - no consistency. So, for the most part I heard no rankable differences. 
I will admit that I have little patience for A-B type tests. I do much better hearing differences in long-term listening. The other day I was playing an album I am very familiar with - the MQA version on Tidal. After about 5 cuts I realized that it just wasn't moving me the way I expected, the enjoyment was lacking. I switched to the same album in my local library and immediately noticed that it was more alive, had more transparency, and better dynamics. I thoroughly enjoyed the second half.  (I am not blaming MQA here, but rather the different mastering employed. Not massive compression/equalization, but enough to rob the album of  some of its vibrancy.) 
Anyway, thanks for creating and administering this test. I am looking forward to the results and your analysis of them. Pardon me but now I have to go listen to Famous Blue Raincoat.
Thanks for the "bogus" rankings man :-). I see you selected A > D > C > B. Possibly a preference for the sound of the samples with increased harmonic distortion? Totally agree with your comment on mastering of course. Perhaps the Tidal MQA version used a more compressed master - I have seen this happen over the years with the MQA version being slightly louder, more fatiguing. If this hypothesis is true, I bet that if you could quickly A/B compare the same song from your local library and the Tidal MQA version, you would easily hear the difference without needing any kind of long-term listening and also be able to decide which one sounds "better" as well.

Hope you got a chance to enjoy Famous Blue Raincoat again!

VI. In conclusion...

As I mentioned, I didn't include all the comments I received; the ones I did include above, I believe, represent the general flavour of my data set. This should give the reader a glimpse into the different ways that listeners subjectively described the same set of songs and sample variations.

I think this brief comment from a listener who heard only a "small difference" sums things up quite succinctly:
Harmonic Distortion seems not to have a big influence on sound quality - to my surprise!
There were a few "surprised" comments like this, where the listener was unable to hear much of a difference despite having high expectations. This is why I believe it's important for hobbyists to try listening for themselves instead of just reading (and potentially repeating) what others claim. We see comments about "harmonics" all the time in magazines and among online posts. I suspect if any one of us were told that our amplifier had "3% harmonic distortion", we'd be rather aghast at the terrible performance! But how many of us have taken the time to measure our own equipment to appreciate the level of nonlinear distortion present, or listened with this knowledge for ourselves to experience what the distortions actually sound like?

Although I received the results from 67 listeners, I know many more have downloaded the test tracks and likely have listened as well but not filled out the survey. I'll leave the download as linked in the Blind Test Invite up as long as I can. By all means, use it like the demo tracks I made for jitter and for the "bits are bits" discussion. As shown in a few responses above, it's quite possible that some listeners preferred the sonic effects of the nonlinear distortion added. On the other hand, others assumed that characteristics like "fullness" or "air" might represent the effect of these distortions but in fact were found in the tracks with lower distortion once we "unblinded" the selection. Such is subjectivity and speaks to the limits of the human perceptual and cognitive systems.

At the end of the day, I think we can see from the subjective descriptions that there really isn't some common language shared in these responses. Nor is there a universal type of preference for each of us. Despite this, as shown last week, when we look at the average responses as a "group of audiophiles", we can see that there is still a preference for the "low distortion" samples B (-175dB/0.0000002%) and C (-75dB/0.02%) over the "high distortion" samples A (-50dB/0.3%) and D (-30dB/3.2%). This is the "power" of increasing sample size and blind testing when teasing out subtle effects.
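
As a quick check on the figures quoted for each sample, here is a minimal Python sketch of the dB-to-percent conversion. It assumes the levels are amplitude ratios relative to the fundamental, so percent = 100 × 10^(dB/20); the function name is illustrative only and is not taken from any particular tool.

```python
def thd_db_to_percent(level_db: float) -> float:
    """Convert a distortion level in dB (amplitude ratio) to percent."""
    return 100.0 * 10.0 ** (level_db / 20.0)


# Distortion levels of the four test samples, as listed in the write-up.
samples = {"A": -50.0, "B": -175.0, "C": -75.0, "D": -30.0}

# Print from least to most distortion.
for name, level_db in sorted(samples.items(), key=lambda item: item[1]):
    percent = thd_db_to_percent(level_db)
    print(f"Sample {name}: {level_db:6.0f} dB -> {percent:.7f} %")
```

Under that assumption, -30dB works out to about 3.16% and -50dB to about 0.32%, in line with the rounded figures quoted for samples D and A.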

Remember to consider these facts when we see testimonies from self-proclaimed "Golden Ears" and individuals who might be outspoken about what they believe they hear. Unless one knows that the individual has had experience as a "trained listener", keenly disciplined and aware of what distortions actually sound like, we can never be sure that what sounds "good" for any one individual actually translates to a broad group of audiophiles. For all we know, the individual could have a preference towards the sound of higher distortion!

What I can say from the "objective" results based on this blind test is that as a group, listeners still prefer the sound of lower nonlinear distortion, and one can certainly determine whether audio components add distortion through objective measurements. I believe this knowledge is of benefit to the "more objective" and rational audiophiles.

Again, thank you to the 67 respondents - see, no need to fear blind testing! :-)

And of course a special thanks to Paul K for Distort and the thoughtful collaboration over the last few months!


As the world turns, often in very unpredictable ways, I wish everyone health, safety, and peace - while enjoying music along the way as we head into summer.


  1. I'm the guy that heard "huuuuge" difference on ER4SR. ;) After reading the previous part I re-visited the notes I was making while performing the blind test, and from the notes it's obvious that I heard (almost) no difference between A and D (and they sounded the worst), but I heard quite some difference between B and C (and, naturally, between each of those and the more distorted pair). My notes show (as does my rating) I liked C better than B, because, and it literally says so in my notes, B sounded "less alive".

    Now that I know that B was the cleanest, I think I liked those first harmonics that "spiced-up" the C. A and D was obviously too much (my notes say "DIRTY" in caps) for me, to the point where I couldn't quite tell them apart.

    Thank you for your awesome work! I never fancied I had a good ear, and now (thanks to you) I know for a fact that I hear at least the harmonic distortion — even on my modest gear. I never considered myself an audiophile either ("what, audiophiles? those guys that listen to the gear instead of listening to the music? meh!"), and I may just have started thanks to you!

    1. Thanks for the note Evgeny,
      Also thanks for participating. Great that you kept the notes to refer back to. It's certainly this iterative process of trying and obtaining feedback that we can appreciate what we hear and learn about what we like!

      One thing I've learned over the years is never to underestimate the quality of "modest" priced gear in hearing distortions. While much high priced gear is highly transparent, that's certainly *not* a given. One only has to hang out at high-end audio shows to see/hear examples where >$50,000 gear sounds no better than more modest <$10,000 systems!

  2. Ooops! I posted my subjective comment one day too soon in Part II...

    1. Thanks Gilles,
      As usual, well done! And I agree with Serge last week that this is a great self analysis.

      I agree that it's never a guarantee that we will like the lowest distortion version even when we do hear a difference. Sometimes we will hear those little subtle anomalies in the music itself. Sort of like subtle blurring in digital photos of portraits... Perhaps better not to see the sharpness of each individual pore :-).

  3. Archimago, thank you for putting together this test and the excellent analysis that followed! I had my own expectations going into it and while the results confirmed some, others really surprised me. Unexpected, surprising results are always something I cherish as these represent an opportunity to learn and possibly to discover something new. Looking forward to more blind tests and more opportunities to learn. Keep up the good work and stay safe!

    1. Thanks Paul,
      Hope things are settling in your part of the world.

      I don't think it's inappropriate for me to let readers know that in fact you were one of the "Golden Ears" for this blind test...

      Great software, great gear, and I certainly appreciate the quality of your ears :-).


  4. I'm Mr. Metalhead, and I see that I was THIS close to getting a perfect score! If I had switched places between D and A I would be in the golden ears category – despite the fact that I heard very little difference. I almost put down not hearing any difference at all, but for the sake of fairness I did hear something, though I believed it was probably my imagination or some sort of bias. I don't think that people who heard a huge difference are totally honest with themselves, and the golden ears probably got it right by accident mostly.

    Together with the borderline results, the sample size here is not big enough to conclusively show that harmonic distortion is as audible as some people think. There may be a trend towards less distortion sounding better, but more research on many more participants is needed. And we need to remember that many people still adore audio products that have measured THD of 1% and above.
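
For readers curious how often a perfect ranking could turn up by luck alone, here is a rough Python sketch. It assumes, purely for illustration, that a guessing listener picks one of the 4! = 24 possible orderings of the four samples at random and that all 67 respondents guess independently. The actual survey scoring may not match this simplification, and the function name is hypothetical.

```python
from math import comb, factorial


def prob_at_least(k: int, n: int, p: float) -> float:
    """Binomial probability of k or more successes in n independent tries."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_listeners = 67                 # respondents reported in the write-up
n_perfect = 5                    # "Golden Ears" with a perfect ranking
p_guess = 1 / factorial(4)       # ASSUMPTION: one random ordering out of 24

expected_by_chance = n_listeners * p_guess
tail_probability = prob_at_least(n_perfect, n_listeners, p_guess)

print(f"Perfect rankings expected from pure guessing: {expected_by_chance:.2f}")
print(f"Chance of {n_perfect} or more by guessing alone: {tail_probability:.3f}")
```

The script prints both the expected number of lucky perfect scores and the probability of seeing at least as many as were observed, which is one way to gauge how much weight the "Golden Ears" count can carry on its own.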

    1. Hi

      if you have a huge difference, an ABX test gives you 100% right answers.
      If you have just 49% or 51%, it's pure chance, which means no difference is audible.
      I wonder what are the percentage of submitted ABX results.
      On my part I did ABX and the result was, even with -30dB, unreliable, pure chance 50%, 50%.
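
To put numbers on the ABX reasoning above, here is a minimal Python sketch of the usual one-sided binomial calculation: the probability of scoring at least a given number correct if every trial were a coin flip. The 16-trial session is a hypothetical example, not something reported in this test, and the function name is illustrative.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting `correct` or more right out of `trials`
    when each answer is a 50/50 guess (one-sided binomial tail)."""
    favourable = sum(comb(trials, i) for i in range(correct, trials + 1))
    return favourable / 2**trials


trials = 16  # hypothetical session length
for correct in (8, 12, 16):
    p = abx_p_value(correct, trials)
    print(f"{correct}/{trials} correct: p = {p:.5f}")
```

With these assumed numbers, 12 of 16 correct corresponds to 2517/65536, just under 4%, while a score near half correct is entirely consistent with guessing.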

      The work of Zwicker (1960) gives quite an insight about the masking effect of distortion.
      So I do not wonder anymore. My conclusion is that the ear is not at all a precise measuring instrument.

      Best regards


    2. Hey Metalhead and Dipolaudio,
      Yes, agree, will need larger numbers to better define the results.

      I do hope though that the data already demonstrates to those wondering just how subtle such distortions are - which scientific reports over the decades have already shown. And also how fortunate we are these days to have access to inexpensive audio equipment (primarily DACs and amplifiers) that achieve excellent performance!

  5. Thanx Αρχιμαγο, it was very exciting to put my ears to the test, and to know that my Mytek-Parasound-ProAc combo is so accurate!
    Greetings from Greece!

    1. Nice combination of gear!

      Thanks for participating... Greetings from Vancouver, Canada!

  6. Hi
    I just discovered that around 1960 Zwicker and Feldtkeller, TU Munich did a lot of studies on masking of distortion.
    This can explain some of the effects we see here.
    masking of Distortions is SPL dependent and frequency dependent :)

    See here (in german)

    Best regards


  7. here the link which was missing above:

    Best regards


    1. Thanks Peter for the information and link.

      Some good reading through the German --> English translator. Will need to have a look at some of the original papers from Zwicker!

  8. Thanks for this important part of the results. What I saw from the reports/results:

    (1) The dependence between level of distortion and subjective scores does exist, but it is not vividly expressed in the results. Because of the two main sources of variance in data, I think:

    - not all sound samples reveal this particular type of distortion equally (some sound samples may hide the distortion, others - emphasize it).

    - each listener has their own musical taste and listening practice, even perception of consonances and dissonances is affected to some extent by the musical environment/culture the listener grows up in; so, perception of distortion is individual, varies from person to person and may evolve with time.

    Though, I'm pretty sure that in case of more participants the dependence will be more pronounced.

    (2) Even small distortions (-75dB in this case) can be perceptible but their annoyance is another story. According to the above reports -75dB is a very acceptable level of distortion. It should be noted that the indicated (overall) level of distortion does not remain the same throughout a test sample, some parts are distorted more (my measurements with time window show this).

    I would like to write a short article about this listening test with some additional objective measurements of the samples. Original (untouched) sound samples would be very helpful for the purpose (I've sent kind request to your outlook mail box, Archimago, but probably to spam folder )). If possible I would also look at the reports of listeners from that h16 group. Not for publishing, just for research (if I need citing I will ask permission).

    1. Hi Serge,
      No problem! Let me have a look at the E-mail shortly... I typically only check the blog E-mail every couple of weeks or so on weekends due to the busyness of weekdays :-).

    2. Hi Serge,
      Didn't see an E-mail - just checked my box. Anyways, feel free to get in touch.

    3. I've just resent my email to

    4. My emails do not reach your mailbox for some reason, so I will ask here. Could you provide, please, the reference sound samples that were used in your listening tests:

      - INTERNET BLIND TEST: Do digital audio players sound different? (Playing 16/44.1 music.)
      - INTERNET BLIND TEST: Is high Harmonic Distortion in music audible?

      You can send download links privately to my email ssmirnoff(at)soundexpert(dot)info or through the contact form -

      Thanks in advance,

  9. Archimago, I posted this in a previous thread, but too late I guess as the thread went quiet:

    Thanks for the work. I'm trying to work out the implications of your results for tube amplification.

    I use Conrad Johnson Premier 12 tube monoblocks and have always had the impression they have a slight coloration that I enjoy. You can see the Stereophile measurements here:

    Do I have it right in thinking the distortion elements, impedance interactions with speakers, may well be into the audible zone? (Depending on the speaker, of course). It sure sounds like it to me (and I enjoy it).

    1. Hi Vaal,
      Yes, absolutely, the tube amplifiers will add their own "flavor" to the sound and I see from the link that indeed these will also have frequency/distortion fluctuations with the speakers as shown in the graphs of the simulated load (a reflection of the output impedance / damping factor).

      One of the respondents above used the McIntosh MC225 and there too I think is another example of the effect that tube amps will have on the result of a test like this. I see that the McIntosh spec sheet lists THD as "0.5%".

      Ultimately, great to hear that you're enjoying the amp!

    2. Thanks Archimago.

      That's what I suspect too. Though as a strict empiricist I'm always wary of the confounding variables (bias and the like).

      Tube amps aren't usually well received in the accuracy-go-by-measurements camp. The reply is usually "Look, it doesn't make sense to add any distortion, even if pleasant, at the amp stage. Then it's uncontrolled and added to everything. It's much more logical to start with an accurate amp and source and if you want to flavor your sound add an equalizer. Then you can tweak it when you want, or turn it off when you want!"

      Which, of course, DOES make sense for some people - especially to the type of person expressing that sentiment. But in cases like mine (and many others who like the classic tube amp sound) I find I prefer the subtle tube amp characteristics pretty much across the board - I like all music via the tube amp vs the solid state amps I've tried. I never have any desire to "tweak" the sound differently, so why would I bother going to the trouble of adding an EQ in to the system? Many ways to skin a cat given divergent tastes and goals in this hobby.

    3. Your sound preferences are clear and appreciated. May I offer you then a thought experiment and corresponding question. Imagine if the sound signature of your tube amp is modeled with DSP inserted into fully transparent audio path/system. It is modeled so precisely that you can not tell apart the sound of this system and your tube amp. What audio system will you choose - hardware based (tube amp) or simulated? And why?

    4. Serge,

      Good question!

      The first reply is that, if someone were to offer me a DSP system you describe I wouldn't bother with it because I already have the gear that does what I want. No need to change.

      But getting more to the heart of your question, stepping back and making a choice between owning my tube amps vs a digital system that could produce the same sound: I would still choose the tube amps.

      Because they bring to the table some other things I value that the DSP system doesn't fulfill. Similar to the reasons I can often enjoy playing vinyl even though I also have available a wonderful digital source.

      I really like the basic romantic, old-school vibe of tube amps. The general concept of how they work is appealing to me. I really LOVE the way a good tube amp looks, especially the glowing tubes. Just as owning and holding a vinyl record cognitively maps the idea of the sound to that physical object in a way that seems satisfying, turning on the tube amps, seeing the beautiful glow in the tube and the fact it represents the musical signal going through those tubes is also aesthetically/conceptually satisfying.

      There are various aspects like that which are not satisfied by just inserting a black box in the system, similar to how streaming 1s and 0s to my DAC doesn't offer some of the conceptual and experiential aspects of records that impact my experience.

      (Sort of like "why would you own an elaborate mechanical watch when you can get even more precise time from a cheap digital watch?" Well, it's because, for "watch lovers," the mechanical watches can satisfy certain engineering/conceptual/aesthetic aspects that add to the experience of owning a watch, which aren't typically found in cheap digital watches.)


    5. Thank you, Vaal,

      Definitely your reply is better than my question. And I'm pretty sure most audiophiles will say something very similar describing their listening practice/experience. I see that a huge part of this experience is, say, a “technological satisfaction”. As if beauty of music converges with the beauty of technology. Maybe this is the main reason why audiophiles are mostly male - while women are very sensitive to music, they are usually less interested in tech aspects of sound reproduction.

      You mentioned mechanical watches ... I can add cars and weapons - areas where technological beauty matters. It's interesting why in some areas such technological attractiveness exists and is absent in others. For example video processing/playback is less exciting and “touching”; as well as work of computers. Are there some psychological reasons underneath or is it purely by chance?

    6. Serge,

      Yes, you can see the importance of concept/engineering/aesthetics in hobbies everywhere.

      And it's amazing how hard it can be for some people to understand another person's choices. For instance, in the more "objectivist" crowd (I count myself one) when I write about enjoying vinyl even though I also have a very good digital source, some just think it's nuts. Why would you "bother" with all the work, aggravation and extra crap required for vinyl when you can get "better (more accurate)" sound with a cheap DAC?

      Well, it's because for me many aspects of the vinyl experience actually enhance my pleasure in the hobby and in listening to music. I love the physicality of an LP in the hand, and often how attractive it is as a physical product. I LOVE the look and feel of a well made turntable and it, like a mechanical watch, has aesthetic/engineering/conceptual appeal. And I get to interact with that cool object every time I play a record. Basically all the stuff one person finds to be a bother and unattractive are things I like about playing records and enrich the experience. But some just can't see beyond their own values and goals and simply, by their lights, judge others as "irrational" or just "hipsters/sheep" for enjoying vinyl.

      I appreciate that you asked your question with an open mind.

      BTW, I generally value aesthetics in my hi-fi gear, in particular speakers.
      Speakers become essentially a permanent piece of furniture, which face you the entire time every time you listen. The idea of staring at a big, ugly, cheaply made box with ugly drivers is depressing to me. I don't select speakers merely on looks of course (far from it; sound first!), but fortunately there are plenty of options that offer great sound and great looks. Just as an example, my current speakers, Joseph Audio Perspectives, are to me drop dead gorgeous in design, fit and finish, and I love the way the "high class" looks fit with the high sound quality. Beautiful to listen to, beautiful to look at.

    7. Hey Vaal,
      I had to butt in on your conversation here. As an electrical engineer I lean to the objectivist side myself, it's my training. However I also understand that personal tastes can have a big impact on what sounds best to a person. And your explanation of why tubes and vinyl are so appealing to you really strikes a chord with me. I'm still using my 1987 Yamaha Receiver that I bought new. I have thought about replacing it, but I had to ask myself why. I like the way it sounds, looks and the fact that it has tone controls and a variable loudness control (the horror). I did quite a bit of research when I bought it, and I get a sense of satisfaction every time I turn it on and it continues to operate like the day I brought it home. Yep, nostalgia adds to my enjoyment of listening to music for me. And I still have and play my CDs. I do stream music quite a bit, but when I really want to do some focused listening, I love grabbing a CD, opening the jewel case, pushing a button and watching the drawer open and close, and then letting a whole album play without interruption. Sure, I could rip, catalog, and set up a NAS to play my music, but again I asked myself why. And once again, I came to the conclusion that nostalgia adds to the music listening experience for me. I also am hunting for some speakers partly because I have come to despise the black ash finish of my Paradigms. How the speakers look will play a large part in what I purchase.

    8. Yes, I agree that some sort of connection ritual is important while preparing yourself for focused/attentive listening. I think it is necessary even in case of using less fascinating audio equipment or when the equipment is hidden completely and the only visible part of it is a tablet in your hands with endless choice of hi-rez music. The connection ritual in such cases could be for example reading of some additional info about the particular piece of music and its author or - why not - listening to some warm-up tracks of the same genre.

      Talking about the aesthetics of audio equipment ... to some extent this is a continuation of the tradition of live performances, where the appearance of the musical instruments and performers is an integral part of a show. I can easily assume that the exterior of the audio equipment correlates with the type of music a listener prefers. Fitting the room interior and general aesthetic taste matter as well. But I think all this is very flexible and personal and the only thing that can be universally agreed is that there is no sense in having audio equipment whose exterior spoils your listening experience )).

  10. Thanks for keeping the tracks up. Just downloaded and listened. I'll need to enlist a family member to shuffle the order since I already read Part I and II. Listening non-blind, tracks A and D seem to get on my nerves as the track plays through. On the Warnes tracks the vocal reverb seems too distinct from the singer on A and D. B and C seemed a bit more natural with C the most natural and least fatiguing. On some notes the instruments on tracks A and D sound twangy; not how a real one should sound. Decays don't seem quite right. Will be interested to see how I do when I try this blind.

    1. Yup, try it blind and with family if you can!

      The human mind can be affected subtly and subconsciously when we know things (like A&D vs. B&C) so you'll need to randomize and try again to see if you can for sure hear those differences.
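
One simple way to randomize, sketched here in Python under the assumption that a helper runs the script and keeps the answer key out of the listener's sight. The neutral labels X1 to X4 and the function name are hypothetical, chosen only for illustration.

```python
import random


def make_blind_key(labels, seed=None):
    """Shuffle the sample labels and map neutral names (X1, X2, ...) to them."""
    rng = random.Random(seed)   # pass a seed only if the order must be repeatable
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return {f"X{i}": label for i, label in enumerate(shuffled, start=1)}


# The helper generates the key and plays the samples in X1..X4 order.
key = make_blind_key(["A", "B", "C", "D"])

# Only the neutral names are shown to the listener.
print("Playback order for the listener:", ", ".join(key))

# The helper keeps `key` hidden until the listener has written down a ranking.
```

Generating a fresh key for each song avoids carrying over a guess about the order from one track to the next.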

      Have fun :-).

  11. I tried it blind, managed by my totally indifferent teenaged daughter. To my ears the high harmonic distortion track was readily distinguishable/not preferred on two of the recordings- but not on the "ballad of the runaway horse" or the piano recording. In fact, on the Warnes track I preferred the distorted one :-( , as both the voice and double bass sounded more "real". On the piano recording I couldn't tell a difference between the distortion levels.
    On the others there was a noticeable "fuzziness" that I found particularly obvious on loudspeakers, but less so on headphones. Otherwise- I could not tell a difference.
    Interestingly enough I've also previously conducted tests on harmonically distorted multi and single tone signals and I can detect a 0.5% level with somewhat better than a 50% confidence level (about 70%- but on a limited number of tests) and 1% with 100% certainty, so the results, alas, are consistent with that.
    So much for having "golden ears".

  12. I think it says quite a lot when the few who actually managed to rank the samples correctly think the difference was small (even at 3.2% THD!), and the guys that thought the difference was huge didn't rank it correctly. Distortion doesn't play that huge of a role that everyone thinks, and people that often say "I can hear a huuuge difference between X and Y" are often overestimating their own abilities quite a lot. And those people are sadly way too many among audiophiles :\

    Thanks for doing these blind tests :)

  13. Sirs/Mesdames, of necessity I had to resample to 44100Hz in order to play the Clavier samples (first listening) and found my dB meter showing much less than -30dB difference between samples B and D. Does playing these tracks on CD-level equipment negate my listening results in terms of THD detection?

  14. P.S. It should be added that I tend to make quick judgements when liking a type of music.