Thursday 25 July 2024

SUMMER MUSINGS: Defining "subjective" and "objective" audiophile evaluations.

Hey everyone, I thought I'd make a "quick" post in response to this comment in the recent "SUMMER MUSINGS: On the perils of subjective opinions in High-End Audio (dCS v. GoldenSound)" article:

SY 24 July 2024 at 13:07

Could I beg and plead with you to join me in refusing to misuse the term "subjective" to mean "uncontrolled?" Something can be subjective and absolutely valid and rigorous (e.g., subjective reactions made with basic ears-only controls) or subjective and absolutely invalid (e.g., subjective reactions made with peeking, preconceptions, and non-auditory inputs).

Subjective =/= uncontrolled. So much fuzzy thinking has arisen because of that conflation of terms.

Greetings SY,

Sure! I agree with you that "subjective" (when referring to audio hardware reviews specifically) is simply a reference to the form of evaluation and does not imply whether the evaluation is valid or invalid, nor whether controls were applied. Certainly some subjective opinions are clearly valid when the difference is obvious to the listener. I trust no audiophile worth his street credibility would have difficulty telling the difference between AM mono and FM stereo sound quality, for example, and would accurately identify the FM stereo playback as the higher-fidelity one.

Indeed, subjective reviewers listening under controlled conditions can also produce highly valid reports. I hope I have not confused that point over the years. For clarity, let me expand on the position in some (pedantic) detail for those who like reading this stuff. πŸ˜‰

The adjective "subjective" just means that the opinion was derived from human (mental) perception and could incorporate the individual's emotions (eg. mood when listening), cognitive biases (eg. "I prefer monoblocks over integrated amps"), prejudices (eg. brand names, MSRP, country of origin), previous history ("LS3/5A speakers remind me of the old days, and I love that sound!"), relationships in the industry ("Ted Denney is my buddy and came over to set this up!"), intent (eg. gaining favor from a company, making a name for myself, seeking more clicks, increasing monetization of content), etc.

Humans, being complex psychological creatures, do not have full conscious awareness of all our mental processes (i.e. the subconscious is very powerful). The level of insight varies between individuals, from those who are cocksure they "know exactly" what they're thinking and why, to those with a bit more humility. Some listeners may have better hearing acuity (as discussed, factors like age play a role), and some may communicate their experience more thoroughly because of their intellectual or language abilities. Personality traits may also affect the way the sound is described - obviously some people are more dramatic than others, or prone to exaggeration!

Subjective listening can be tightened with the use of controlled conditions, like making sure one is listening at the same output level, or visual blinding to conceal the identity of the devices. This removes some of the biases and reduces the effect of confounding variables. Blinded listening using methods like repeated A/B or A/B/X trials can be statistically analyzed to see if the person can consistently tell a difference. If they can, especially for subtle differences, then this is evidence that a real difference was heard. However, blind listening tests are still a form of subjective listening because they use the human mental system. The listener can still be influenced by internal psychological states which affect acuity, he can still pick up subtle cues if the blinding was done poorly, and he can still render preferences based on emotional factors.
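For those curious how such A/B/X trial results get "statistically analyzed", the standard approach is a one-sided binomial test against chance (50% per forced-choice trial). Here's a minimal Python sketch; the trial counts are made up for illustration:

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value: the probability of scoring at least
    `correct` out of `trials` forced-choice A/B/X trials by pure
    guessing (probability 0.5 per trial)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Hypothetical session: 14 correct out of 16 trials
print(round(abx_p_value(14, 16), 4))  # 0.0021 - very unlikely to be guessing
```

A result like p = 0.0021 tells us the listener is very probably hearing a real difference; it still says nothing about whether he *prefers* one device, which remains subjective.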

I've seen subjective reviewers inappropriately claim that their assessments are objective when they're clearly not! For example, The 13th Note and, I believe, Jay's Audio Lab make this claim erroneously in some of their videos (and no, Jay still doesn't have a "lab"). At best, they might perform subjective controlled listening, but inevitably these are comparisons affected by mental biases and prejudices; hence, subjective.

In many circumstances, subjective evaluations are essential and much more important than objective analysis. No machine will tell us if art is "beautiful"; it cannot "enjoy" music or tell us if wine tastes "good". Reviews of art demand subjective eyes/ears/taste/minds because art is meant for human/mental consumption.

To make sure my intent is clear, I often qualify my articles with statements like "where I'm coming from as an audiophile hobbyist who is aiming for high-fidelity audio reproduction as the ultimate goal", as can be seen near the start of that dCS v. GS article. For me, audiophilia has always been intricately linked to the concept of fidelity. I believe "high-fidelity" as the goal is NOT art. It's a technically definable intent of achieving transparency to the source material as the ideal. Thus, whether a device or system is capable of high fidelity is a measurable, quantifiable matter that can be empirically observed and subjected to scientific examination, even if there are artistic elements to the appearance of the device, such as the steampunk-looking Dan d'Agostino amps or the organic shape of the B&W Nautilus. A transparent high-fidelity device will simply pass the music on to you, the listener, complete with all the nuances and without coloration; whether you enjoy the subjective experience is up to you. If the music/art is not up to your expectations or standards, then by all means take it up with the artists or audio engineers who might have butchered the sound with dynamic range compression, for example! πŸ˜’

Objective testing uses tools to measure devices so we can compare them with the same "measuring stick". Results can be confirmed "independent of a mind", without all the emotions, biases, opinions, etc. While no two persons are exactly alike, machines can be calibrated to very close specifications with a definable error range. Testing can therefore be reliably replicated. (Yes, we could and should train human listeners to improve subjective inter-rater reliability when running listening tests; but we won't get the same quantifiable accuracy as with high-resolution machines.)
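As a trivial illustration of that replicability, repeated runs on a calibrated analyzer can be summarized with a mean and a run-to-run spread, something no panel of human listeners can match. The numbers below are invented purely for illustration:

```python
import statistics

# Hypothetical repeated THD+N readings (dB) of the same DAC on a
# calibrated analyzer; values are made up for illustration only.
runs = [-112.1, -112.3, -111.9, -112.2, -112.0]

mean = statistics.mean(runs)
spread = statistics.stdev(runs)
print(f"THD+N: {mean:.1f} dB (run-to-run std dev {spread:.2f} dB)")
# THD+N: -112.1 dB (run-to-run std dev 0.16 dB)
```

A fraction-of-a-dB spread across runs is the kind of "definable error range" I'm talking about; any real difference between devices that is larger than this can be trusted.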

While high-fidelity reproduction may be defined and measured scientifically, this does not mean each of us cannot, or should not, apply our own subjective will and at times purposely deviate from hi-fi ideals. "Euphonophilia" is a perfectly acceptable pursuit for those mixing-and-matching components to satisfy intended preferences such as system tonality. For example, even though the Rogers LS3/5A speakers may not measure particularly well by today's standards, nobody should be dissuaded from giving them a listen or buying a pair if that frequency response suits their taste. However, when talking to other audiophiles, if one believes that the LS3/5A is the best speaker ever made, it's important to express that this is a personal subjective opinion, not an objective fact. Even if one conducted a detailed blind listening comparison versus, say, a modern KEF LS50 Meta (which measures quite well) and still preferred the LS3/5A, that preference remains the result of a subjective assessment.

Most of the time, there's simply no need to debate subjective preferences when objective testing confirms significant differences (such as between speakers). There's absolutely nothing wrong with holding subjective opinions that differ from those of others (assuming they're not bizarre, grossly delusional, or likely to lead to illegal actions, of course!). One is free to claim that the Rogers LS3/5A subjectively beats the sound of a fancy new hi-tech BΓΈrresen 01 stand-mount speaker, for example; yeah, I'm sure they will sound different in one's room and to one's ears.

The biggest areas of contention among audiophiles arise when objective testing clearly demonstrates high performance, such as with likely "perceptibly perfect" DACs, yet certain individuals claim massive differences (or even claim to hear major deficits, as per dCS v. GoldenSound). Then there are things that make no meaningful objective difference, like cables. And ultimately, there are devices or tweaks that make no objective difference at all, yet some will subjectively claim they hear significant effects - Class A snake oil like bags of rocks, foil stickers, or green marker pens.

As "rational audiophiles", I hope we can agree that snake oil products and those who are so uninsightful (or unethical) as to falsely hype such items should not be promoted prominently in the mainstream audiophile hobby.

Hopefully that's all adequately clear and consistent... Thanks again for the comment SY.

--------------------

Pssst, want to try some new music?

My 19-year-old son's favorite artist is Porter Robinson, and he just dropped this new album SMILE! :D (2024, DR5 stereo - ouch!, DR12 multichannel/Atmos), which I guess would be of the indie synth-pop/rock genre.

Hmmm, let's see if my subjective enjoyment of music resonates with those of a younger generation. πŸ™‚

Hope you're enjoying the music!

19 comments:

  1. Hi Archimago
    "For me, audiophilia has always been intricately linked to the concept of fidelity. I believe "high-fidelity" as the goal is NOT art. It's a technically definable intent of achieving transparency to the source material as the ideal". Very nicely put, indeed.
    An interesting analogy comes to mind. One may think of music as a great work of art (because music IS art), let's say the Mona Lisa. Fidelity is then the way the Mona Lisa appears behind the protective cover which has saved it from vandalism attempts until now. So, if the glass, plexiglass, or other cover (low-end, average, or high-end equipment) is fully transparent, distortion-free, and UV-absorbing (or has any other property as required), it will convey the full and accurate image of the original.

    1. Hey there ML,
      Yup, exactly; in fact, I discussed a very similar analogy a while back with the Mona Lisa, relating the art itself to the "support" structures around it, which of course serve an important role in the presentation.

      Thankfully, the transparency of our audio systems these days is, I think, much better than the plexiglass and boundaries placed around the actual Mona Lisa as shown in the picture in the link above. πŸ˜‰

  2. None of the below contradicts what you're saying, just further thoughts on the topic.

    To say that preference is subjective is tautological. Many arguments between subjectivists and objectivists don't even get to preference, because the objectivists argue that there is no audible difference.

    "Audibility" questions necessarily involve both measurements as well as how the sound is perceived by people. But just because they rely on human auditory processing doesn't mean that audibility can only be determined subjectively. If the test is perfectly blind, level matched, etc., and a person can differentiate the DUTs to a statistical certainty, then the difference is objectively audible.

    1. Thanks Neil,
      Yeah, I certainly see where you're coming from. Indeed "preference is subjective" is tautological and I trust it should be obvious for all of us even without needing it to be said. Alas, it seems that some might need to be reminded. 😁

      Indeed, if we can provide a perfect blind, absolutely level matched, consistent environment, and listeners with ears of high enough acuity, then they could become equivalent to "audio measurement devices" and act essentially as accurate, objective "instruments".

      Alas, in the complex world we live in, that's hard if not practically impossible! So generally, I think it's fine to just say that human listener impressions, even with controls in place, are subjective. Let's reserve "objective" for evaluations where we run tests, compare, and quantify using instruments.

    2. I agree that in many situations it's often practically impossible. A great example is Goldensound's ABX test of a high-tap sinc filter and ASR's collective meltdown over it. https://www.audiosciencereview.com/forum/index.php?threads/goldensounds-passes-apparently-abx-test-for-dacs-not-really.54079/

      Alas, all the controversy in subjective vs objective approaches boils down to audibility, so unless we are content with non-overlapping magisteria, we eventually must look for the objective in the subjective.

    3. Interesting discussion and link, Neil,
      Yes it does boil down to audibility...

      However, it also has to do with whether we have trust. That GS video looks like BS to me, man. Maybe we can analyze that in further detail in another "Summer Musings" later. But what I see there seems super-human and given the kind of things GS says such as in the dCS video, I have grave concerns about the man's ability to be "more objective" as a subjective listener.

      Given such examples, I have serious doubts about his claims. External, independent verification needed.

    4. This comment has been removed by the author.

    5. > However, it also has to do with whether we have trust. That GS video looks like BS to me, man.

      Yes, I think that was the consensus at ASR after they examined it from all angles and could not find anything wrong with the methodology and verifiability. Of course what he provided doesn't rule out purposeful cheating, and that is certainly a possibility! But what can I say? He just doesn't seem like a cheater to me. I put the likelihood of cheating around 15%, but of course reasonable minds could go much higher.

      But basically, all his test boiled down to was being able to tell the difference between Redbook -> NOS filter DAC vs. Redbook -> high-tap oversampling -> NOS filter DAC. That doesn't seem that crazy to me, especially for a 26-year-old who can still hear beyond 20kHz. And Amir said that Meridian published a study in AES (darn that paywall) with the same finding. What makes it strange is that they ran DeltaWave on the samples, and the differences were *only* in the high frequencies and should have been completely masked. But do we have empirical literature on psychoacoustic masking of ultra-high-frequency sounds? That sounds awfully esoteric to me, and given how much trouble I have tracking down audibility studies for anything, I'd venture to say we don't know much about that.

      At any rate, it's much more likely than the results of your recent DAC test, where the average participant age was *old as heck*, all three devices measured superbly, and the filter differences were presumably much more subtle than sinc filter vs. NOS. In particular, the Apple Dongle and the Majik measure nearly identically both in terms of noise as well as THD. And yet you got statistically significant preference data from that. I'm still skeptical. I wonder if something as simple as not randomizing the A, B, and C presentation order could confound the results. But in your analysis, you conclude "Yes, audiophiles can tell the difference between higher performing DACs like the Linn streamers compared to the Apple USB-C dongle with preference for the Linns." If you can believe that, why can't you believe GS?

    6. "after they examined it from all angles and could not find anything wrong with the methodology and verifiability."

      On the contrary, I pointed out one HUGE hole in the methodology and verifiability. And it was consistent with his ABX logs (which should have raised more eyebrows than they did).

      It's nice that people always want to believe the best of others, but sometimes skepticism is warranted.

    7. I PMed you last month after you referred to an obvious loophole in an ASR thread. You pointed out that he could have cheated because we don't have a 360-degree view of his work area and he could detect the differences in HF by looking at a spectrum analyzer. In other words, you pointed out a way that he could cheat.

      I can think of other ways he could have cheated as well. For instance, he could have used a DSP to boost the high frequency data and "fold" it back into the audible spectrum.

      Since I pointed out that he could have cheated, I obviously didn't mean that the test was verifiable in the sense that we can prove that no shenanigans were involved. But c'mon, compared to the average YouTube blind test, this went *way* beyond what we normally see.

      So it comes down to whether you believe him or not. What goes into that calculation? Intuition and experience for sure. Does the person have a track record of lying? Does he have anything to gain by lying? And how outlandish is the claim? I think for many people at ASR, it's a rather extraordinary claim that these HF sounds would not be masked by lower frequency content, and extraordinary claims require extraordinary evidence.

      I'm on board with that. This test obviously doesn't *prove* anything. Obviously we would really want to see him replicate the results using someone else's rig, with a skeptic witnessing it. And if someone claims to be able to do something that ought to be impossible, it's natural to assume deceit over supernatural powers.

      But I dunno. Again, I haven't seen any empirical evidence about ultrasonic masking. I don't know if we are taking general ideas about masking from frequencies normally considered in the audible range and assuming that it works the same for ultrasonic. We also have Amir chiming in saying that there's an AES paper with the same result.

      BTW, having rewatched his companion video, I see that the comparison is between a "typical" oversampling filter and a high tap filter, not high tap vs. no oversampling at all. A considerably more difficult task in the abstract, though the Deltawave analysis already showed exactly how hard the task would be anyway.

    8. "I can think of other ways he could have cheated as well. For instance, he could have used a DSP to boost the high frequency data and "fold" it back into the audible spectrum."
      Possible. I don't remember all the details, but if he published the files and showed the hashes in the log, that would take more effort to pull off.

      "Does he have anything to gain by lying? And how outlandish is the claim?"
      Yes, absolutely.
      And highly outlandish, especially given the rapidity of the claimed detection.

      "But c'mon, compared to the average YouTube blind test, this went *way* beyond what we normally see."
      That says more about the people on YouTube than it says about reality.

    9. Interesting discussion guys. And yeah, ultimately it comes down to trust in the person.

      Are there secondary gains for GS by "showing" the world that he has the ability to blind test this? Of course. For one, he has his name on the Ferrum WANDLA "GoldenSound Edition" DAC so there's something to be said about him wanting to portray a level of competence and confidence in the eyes of both subjectivists and objectivists. His role therefore is not just being a reviewer of devices, but he is also a salesman as he has crossed that bridge. Not to say that one can't be honest and transparent still, just that the job has expanded now, at some level he needs to be mindful of sales numbers for this device, and also if he wants endorsements elsewhere, needs to show some level of success in maintaining public interest while ultimately moving the units.

      On a related note, the fact that he has teamed up with a brand also means the audience must be mindful of what he writes and says about competing DAC devices. Conflict of interest can be subtle especially with subjective reviewer comments (which as we've seen with his comments about the dCS can easily trump what was found on objective measurements!). As an example, imagine if John Atkinson were to endorse his name on a product! I'm sure that would have changed our perception of neutrality in him as a reviewer in Stereophile over the years. (Not that he isn't affected by Industry as I believe we witnessed by his MQA stance... But much less overt than would a "John Atkinson Edition" DAC 😬.)

      Regarding the blind test, it's great that he goes the full length to show the ABX logs and videos to build confidence. He also knows the results of the DeltaWave analysis with the difference being mostly >20kHz. Having thought about all that, don't you find it just a little odd/convenient that he didn't check his supposed hearing frequency response until that snip with his friend later on? Then somehow surprisingly states "I should have checked that...", then his friend says something about the two test tracks "Oh my God... so this entire time, it's not that you have like the ability to hear a microsecond of time difference, it's that you can hear that one has more magnitude!" (really? more magnitude with just a bit of low-level 20+kHz stuff?), and GS states they "should have come to that conclusion 3 hours ago" or something like that... Hmmm, weird.

      If he had thought through all the variables, aware that digital filters are really about the magnitude and slope of the low-pass filtering around Nyquist (not freaking out about stuff like pre-ringing), isn't it odd that checking his hearing frequency response beforehand wasn't part of the due diligence and of developing insight into his own hearing abilities? He must have known that, successful or not, this all hinges on that high-frequency difference, and that success would be a demonstration of "golden ear" hearing ability up in those very high frequencies, right?

      Now that he knows this, perhaps other blind tests to verify that claimed 20+kHz audibility ability (eg. determining the audible threshold and correlating to whether this would make sense in normal music listening volume) could then be proposed...

      Bottom line: While I'm not a YouTube video guy, what GS showed in the video can be replicated without much difficulty if one is motivated to do so with some thoughtful planning. Seeing is not necessarily believing, just like everything else on the Internet! At this point, I think it would be wise to at best "Trust, but verify."

    10. One more thing guys about the AES paper referenced by Amir.

      It's about taking a hi-res track provided by 2L, downsampling it to 44.1 or 48kHz, and seeing whether listeners could tell the difference between no dithering (truncation) vs. RPDF (suboptimal, should have used TPDF) vs. keeping 24 bits. This is the opposite of what GS is doing here, taking standard 44.1kHz CD-quality content and upsampling to 176.4kHz using various anti-imaging settings.

      Whereas the paper is actually filtering out and removing hi-res content, GS is really just upsampling and not adding anything into the signal other than the small difference in how much extreme high-frequency material is preserved based on the steepness of the filter.

      As we can see in the link above, there was some controversy about whether the work was all that meaningful... The most interesting things for me were:

      1. Maintaining 24 bits did not sound/score differently from either 16-bit truncation or 16-bit RPDF. (At least with this 2L track, 24 bits were not needed.)

      2. Scores showed really no difference whether downsampled to 44.1 or 48kHz. I believe 2L typically records in DXD (352.8kHz, part of the 44.1kHz "family"), so asynchronous downsampling to 48kHz didn't really result in meaningful audible degradation as some audiophiles sometimes fret over!
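      To illustrate the truncation vs. dither distinction the paper tested, here's a minimal Python sketch of 24-to-16-bit word-length reduction with optional TPDF dither. This is my own simplification for illustration, not the paper's actual procedure:

```python
import random

LSB = 1.0 / 32768  # one 16-bit quantization step for full-scale +/-1.0 audio

def reduce_to_16bit(sample: float, tpdf_dither: bool, rng: random.Random) -> float:
    """Quantize a float sample onto the 16-bit grid. With TPDF dither
    (the sum of two uniform +/-0.5 LSB sources) added before rounding,
    the quantization error becomes benign noise instead of distortion
    correlated with the signal."""
    x = sample / LSB
    if tpdf_dither:
        x += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return round(x) * LSB

rng = random.Random(42)
print(reduce_to_16bit(0.5, False, rng))  # 0.5 - lands exactly on a 16-bit step
print(reduce_to_16bit(0.5, True, rng))   # within +/-1 LSB of 0.5
```

      The point is that both paths land on valid 16-bit values; the dithered path just trades a tiny bit of noise for freedom from truncation distortion, which is why TPDF would have been the better choice in the study.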

    11. Thanks for the info on the AES paper. I'm basically in agreement with your other reply. Given your thoughts on this, what do you think about the results of your high-end DAC test? What do you suppose could account for the ability to differentiate the DACs from each other?

    12. Hey there Neil,
      Good question... Maybe this will make another good "Summer Musings" post in the next few weeks :-).

  3. "That GS video looks like BS to me, man." #metoo. And when I pointed out that there were obvious ways to cheat (not saying that he did, but we do have to consider the possibility when evaluating the quality of ears-only controls when such extraordinary claims are made), my observations and speculations were not warmly received.

    When claims are made by people who, by the nature of what they do, are self-promoters, controls need to include review by experts in cheating, aka magicians. Targ and Puthoff's work is an example of a "research" line that could have been discredited more quickly had Randi been brought in at the outset.

    BTW, thanks for the nice words you had on my little article about "subjective" vs "objective" testing a few years back.

    1. Hey there SY,
      After you brought out the video and the ASR comments, I went in to have a look at the initial video as well as his video doing the blind test. Lots of good comments on the ASR thread.

      Is it possible that he achieved that? Sure, there could be extreme outliers of listening ability given the billions of people on Earth. I don't doubt that the HiFiMan Susvaras can reproduce the ultrasonic frequencies that differentiate the two filter settings.

      Is it probable in this case? Hmmmm, age 26, looking at the music he's using, examining the delta between the upsampling differences, I don't think I'd put money on him being able to do it again if there were independent observers involved.

      Not saying that for sure there's some sleight of hand going on, but as an outside observer, it would not be difficult to replicate what he did in the videos or the results he showed in the ABX log if one knows how to get things done for YouTube purposes to signal to everyone that one has "Golden Ears". πŸ˜‰

      Note that I don't disagree with GS in the summary which is that the effect must be very subtle for anyone who is able to detect the differences. It certainly won't be a meaningful "musically enjoyable difference" or anything one could dramatically claim even if one could hear it relatively easily simply because the >20kHz content is minuscule.

      Hope you continue to add to the rational audiophile conversations online SY.

  4. "it would not be difficult to replicate what he did in the videos or the results he showed in the ABX log if one knows how to get things done for YouTube purposes to signal to everyone that one has "Golden Ears". πŸ˜‰" Exactly. The time stamps on the ABX logs should have been a clue. It's also possible there was a transitory tell he knew about but was not obvious to someone who didn't (analogous to marking cards). But if it were me and I were going to game this, the way I pointed out would be a whole lot simpler and impossible to detect from that video.

    1. Indeed, I'm sure if a few of us sat down for a session, there would be a number of ideas on how to achieve this; some easier than others πŸ€”.

      Magicians of course never reveal their secrets. Keeps it entertaining that way. (Beyond our esoteric audiophile circles, I'm pretty sure the vast majority would not find anything of this nature remotely interesting or entertaining!)
