For today's MUSINGS, let us take a few moments to think about our sense of hearing. Though the auditory system is not as complex as the visual architecture, its extensive interconnections and the association areas they intersect with provide us with an altogether unique experience integrating emotions and memories. The "drug of choice" for us audiophiles is in the auditory domain. The music we hear adds to the quality of life, and in so doing, satisfies at least to some extent our need for a sense of meaning in our existence. Much of this is a mystery, though perhaps in time, with the development of neuroscience, some secrets may be unlocked... But even if these remain impenetrable mysteries, there is of course no denying the value and truth of human subjectivity. That joy and meaning is wholly ours alone; it is our right as sentient beings to own and cherish.
As humans, we also recognize that we are limited in every conceivable dimension and sensory modality. We cannot smell as well as dogs, we cannot see as well as cats, we cannot hear as acutely as bats, our proprioceptive abilities pale in comparison to even more primitive primates swinging in the trees... Over the years, I have posted on the importance of being aware of these limitations so as to be insightful about our abilities especially when making claims about what is heard or not heard. For example, I think it's useful to try something like the Philips Golden Ear Challenge as a way to evaluate our own hearing acuity. Having an appreciation of dynamic limits also helps us appreciate the importance of sound levels and silence in the listening environment. Knowing the limits as a human being helps us to understand the importance (or unimportance!) of developments like high-resolution audio and what we should expect.
Part I: Physiological Mechanism of Hearing
Our brain is mapping the world. Often that map is distorted, but it's a map with constant immediate sensory input.
---- E. O. Wilson
As a review, remember that the hearing mechanism is analogue (note that this is not absolutely true in that neuronal action potentials do operate based on thresholds). While analogue in an ideal world allows infinite "levels" to represent a signal like a sound wave, unlike the quantization steps in digital, in real life there are limits to our amplitude discrimination. These limits are a result of noise which "blurs" details and lowers resolving ability. Though our ears can implement "dynamic contrast" adjustments through the activity of the tensor tympani and stapedius muscles, our hearing has a range of approximately 130dB from the absolute softest sound level to the limit of pain. Remember though that we actually do not need to be able to encode all of this because in truth nobody (in a healthy state of mind!) should be listening to music regularly at extreme levels, as one would risk instantaneous hearing loss above 120dB - assuming for a moment one even has a powerful amp and robust speakers! Furthermore, remember that dynamic resolution varies through the audible frequency spectrum; this is the message of the classic Fletcher-Munson contours:
Notice that our hearing is optimal from around 200Hz to about 5kHz. This is the frequency range where high-fidelity audio really must "get it right". Beyond that optimal range, our ability to appreciate frequency nuances drops off significantly on either side. Notice also that the shape of the curves shifts depending on loudness level; this is an important consideration in music production and also significant in terms of "reference" volume. These are just some of the many considerations we all have to keep in mind when doing listening comparisons, to make sure the differences we're hearing are not just physiological factors at work.
For the tiny sum of $15,000USD, I have seen claims of 100dB dynamic range from phono cartridges (let's just say I'm a little more than skeptical). A typical LP is unlikely to achieve better than 60dB, maybe slightly over 70dB max. Remember that 16 bits is already very reasonable, with an effective dynamic range up to 110dB or so with dithering (this varies with the different algorithms, and ones like noise-shaped dithering take advantage of our lower high-frequency acuity). High-resolution 24-bit PCM can encode down to the thermal noise floor. Absolutely overkill for playback, but great for studio work to maintain precision of course.
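To put rough numbers on the bit-depth comparison, here is a minimal Python sketch. It assumes the standard textbook formula for an ideal quantizer driven by a full-scale sine wave (about 6.02 dB per bit plus 1.76 dB). The function names are my own for illustration, and the sketch does not model dither or noise shaping, which is where the higher "effective" 16-bit figure quoted above comes from.

```python
import math


def pcm_dynamic_range_db(bits: int) -> float:
    """Theoretical SNR of an ideal N-bit quantizer with a full-scale sine.

    Equivalent to the familiar 6.02 * N + 1.76 dB rule of thumb.
    Assumption: no dither and no noise shaping are modelled here.
    """
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


if __name__ == "__main__":
    for bits in (16, 24):
        print(f"{bits}-bit PCM: {pcm_dynamic_range_db(bits):.1f} dB")
    # The ~130dB span of human hearing expressed as a pressure ratio
    print(f"130 dB is an amplitude ratio of about {db_to_amplitude_ratio(130):,.0f}:1")
```

Under these assumptions 16-bit lands at roughly 98 dB and 24-bit at roughly 146 dB, which is why the latter is already beyond the ~130dB span of hearing discussed earlier.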
We also must be aware of the limits to hearing as we age. Occupational noise-induced hearing loss is an issue for some, but even without this, realize sadly that as most audiophiles are men, on average, we will deteriorate more than women:
Graph of average loss of frequency acuity with age. It is unfortunate that we do not have more women doing reviews! See Roger Russell - Hearing and Listening.
[If you're interested in the nitty-gritty physics of music and hearing, have a look at these Physics 406 "Acoustical Physics of Music" lecture notes from the University of Illinois.]
Probably the most difficult dimension to get a handle on is the time domain. Just what is the threshold in time for our ability to detect changes? I briefly alluded to this in my assessment of DSD decoding. There have been articles like this one from Kunchur (2007) and a related study by the same author in 2008 using different methodologies. In the latter article's introduction to the topic we see thresholds ranging from 2ms for "gap detection" in noise, through 200μs, down to "2-16μs" as per Kunchur himself. Most studies seem to suggest a theoretical 10μs threshold value. If you look at the paper, the experiments to determine this threshold follow various paradigms: dual tones with the same spectra, different spectra, varying amplitudes, clicks and click-pairs. Both papers focused on a ~7kHz stimulus waveform subjected to either subtle lowpass filtering that changes the rise time or slight speaker misalignment. Specialized equipment was used to create and verify the very precise stimuli of course.
"We don't listen to test signals!" is a common argument I have often heard against measurements and objective testing. Recognize that these research experiments embody this criticism. Specially calibrated and constructed equipment is used to create test tones in the lab for listeners in blind testing analyzed for statistical significance. This is obviously not music.
[EDIT: I'll leave the original text above but with "strikethrough" for the sake of full disclosure. As Måns noted below in comments and as I was corrected on the message forums, time domain performance for a signal below Nyquist is actually a function of bit depth and not, as I had originally written above, a reflection of the need for even higher sample rates. For CD 16/44, we're looking at 1/(44,100 × 2^16 × 2π) = ~60 picosecond resolution for signals well within the audible spectrum - thanks Adamdea. More than enough for the Kunchur estimates above... Nonetheless, the ideas and claims in the MQA articles below are interesting.]
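The arithmetic in the EDIT is easy to check. The sketch below simply evaluates the expression quoted there and, for contrast, the raw interval between samples that is often mistaken for the timing limit. The function names are hypothetical, and the 10μs figure is the threshold cited from the studies above rather than anything derived here.

```python
import math


def sample_period_s(sample_rate_hz: float) -> float:
    """Time between successive samples (NOT the timing resolution)."""
    return 1.0 / sample_rate_hz


def timing_resolution_s(sample_rate_hz: float, bits: int) -> float:
    """Evaluate the expression quoted in the EDIT: 1 / (fs * 2^bits * 2*pi)."""
    return 1.0 / (sample_rate_hz * (2 ** bits) * 2 * math.pi)


if __name__ == "__main__":
    fs, bits = 44_100, 16
    threshold_s = 10e-6  # the ~10 microsecond threshold cited in the text

    print(f"Sample period:       {sample_period_s(fs) * 1e6:.2f} microseconds")
    print(f"Quoted resolution:   {timing_resolution_s(fs, bits) * 1e12:.1f} picoseconds")
    print(f"Margin vs threshold: {threshold_s / timing_resolution_s(fs, bits):,.0f}x")
```

Evaluated this way the expression comes to roughly 55 picoseconds, the same order as the "~60 picosecond" figure, and several orders of magnitude finer than either the 22.7μs sample period or the 10μs threshold.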
Despite the lack of clear demonstration that these experiments correlate with the need for higher time domain accuracy in digital audio, this has not stopped the audiophile world. Consider this series of articles about Meridian's MQA from the May/June 2015 The Absolute Sound. In them, Meridian (through the aid of Robert Harley) claims that "temporal blur" has now become the benchmark:
"If Meridian were forced to characterize the quality of a digital audio system with a single metric, it would be how much temporal blur the system adds, measured in microseconds or milliseconds."
As you can see in the text and Figure 1, 10μs has become the target for Meridian. Considering that few have actually heard an A/B comparison using MQA and, as far as I know, blind testing results have not been released, I watch with curiosity the outcome in the days ahead now that MQA-enabled DACs like the Meridian Explorer2 have been released. It seems they're behind schedule with the music roll-out since we're into the start of the fourth quarter now (the article was looking at 2nd quarter and the only MQA-related news at IFA recently seems to be the Pioneer firmware announcement).
I anticipate that time domain resolution is going to be the big "push" in the days ahead, thus the focus lately on digital filter measurements and the blind test a few months back. Related to this as well have been the discussions on DSD, which by virtue of its very high sample rates has an edge over PCM in time domain performance.
Part II: Cognition and Listening
That is why I use these parables,
For they look, but they don’t really see.
They hear, but they don’t really listen or understand.
---- Matthew 13:13 (NLT)
"Hey... I told you to do that last week! Weren't you listening?!"
---- My wife the other day
Whether a reflection of wisdom from two thousand years ago or domestic demands of yesterday, we know intuitively that there is a difference between what our neural mechanisms hear, and whether we actually are listening to it - the stuff that actually makes it into our memory, subconscious, and of course conscious awareness. This leads us into the broad, complex, and marvelous domain of cognition/psychology in hearing/listening. This is a topic which should really be at the forefront of audiophile discussions but is so often relegated to background chatter, or even scoffed at when brought up as a potential explanation for certain subjective claims.
To start, remember that the auditory memory buffer, also known as echoic memory, is actually very brief. Echoic memory is where a detailed, unprocessed representation of what we just heard is temporarily stored and available for analysis and interpretation. Studies suggest this storage is in the primary auditory cortex itself; the duration of the "buffer" is about 4 seconds and it can linger around in memory for maybe up to 20 seconds without distraction. This is important because if we're doing blind testing, these limits suggest that snippets of audio should be brief, and we need to quickly switch between samples for best accuracy. Of course this does not mean we cannot listen to something clearly, process the impression, and then later compare based on the gestalt in long-term storage. This isn't difficult for clear or obvious differences, but subtle differences will not be so easily detected, remembered and recalled. This is important of course when we read reviewers talking about hardware comparisons of devices they used to own or have not heard in days/weeks/months/years.
As suggested above, distraction can be an issue. This leads us to how attention plays a major role in how we perceive and evaluate our world. Here's a video of an example which we should keep in mind as an analogy:
As per the title of the video, this is an example of selective attention. Of course this is in the visual domain but you can imagine a similar phenomenon when we evaluate audio for subtle changes. So often, we hear people commenting about how they "didn't notice" the presence of an instrument until some SUPER-USB-CLEANER tweak was inserted, or how ULTRA-SPECIAL-CABLES "made" the percussion seem like it was "30 feet behind the wall" or had "deeper bass". Realize that every time we listen analytically, we shift our attention (scan) to listen for changes and because there is no way to exactly recall the complexity of music (unless we're seriously doing a controlled test adhering to the limits of echoic memory to maximize detection of subtle sounds), it's no surprise that we report "noticeable" differences. Indeed it could be "true" that the person heard what they claimed... But the likelihood is that those sounds were always part of the playback; the only difference being whether the listener actually paid attention to them or not.
I would say that this kind of phenomenon is "utilized" quite a bit in audio shows especially with cable demonstrations. Inevitably, the demonstrator will ask the room whether participants heard a difference after some hardware change (which takes many seconds during which the sales rep probably talked about how much more expensive and theoretically better the new cable/device is) and someone's going to come out and say they heard some element clearer, or the bass seemed cleaner, or the soundstage seems wider, etc... Like I said, these impressions could be all "true" but it's not unreasonable to question whether the impression is only valid in the mind of the listener rather than an actual change in external reality. As much as some audiophiles make it a contentious issue, it is only when we account for variables in a more formalized fashion or repeat testing to verify that we can truly be sure...
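As a concrete illustration of what "repeat testing to verify" can look like, here is a minimal Python sketch. It assumes a simple forced-choice blind trial (ABX-style, my assumption rather than a protocol described in this post) where a listener who hears no difference has a 50% chance of being right on each trial. The function name and the example scores are hypothetical.

```python
from math import comb


def p_value_guessing(correct: int, trials: int) -> float:
    """One-sided binomial p-value for a two-alternative blind test.

    Returns the probability of scoring `correct` or more out of `trials`
    if the listener were purely guessing (50% chance per trial).
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical scores, purely for illustration
    for correct, trials in [(9, 16), (12, 16), (15, 16)]:
        p = p_value_guessing(correct, trials)
        print(f"{correct}/{trials} correct: p = {p:.4f}")
```

A single "I heard it!" in a demo room is one trial; the point of the sketch is that only a run of repeated, blinded trials gives a score that can be distinguished from guessing.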
Even more fascinating are multimodal perceptual interactions. Watch this:
As you can see in these examples, how we cognitively process sound can be highly influenced by other modalities such as what we see before us (and vice versa). Even when we know the "trick" such as the McGurk Effect or Shepard Tone, the mind subconsciously associates an interpretation from which it is extremely difficult (impossible?) to disentangle ourselves. Remember, humans are primarily visual creatures. We dedicate way more neurons to visual processing and association than to audio. Though perhaps not as easily tested or demonstrated, what happens when a reviewer is in front of an impressive looking sound system? What happens when he knows that it costs $200,000USD? What if he's friends with the designer, who is personally showing off the gear in his room at the audio show? Do we not think that biases can be induced subtly if even the underlying physiology can be so plainly "fooled"? I know, I know, cognitive biases are things that happen to other people, right? Surely 2,000 audiophiles can't be wrong! (As I saw implied by a manufacturer recently.)
Part III: Wrapping Up...
Know Thyself.
---- Temple of Apollo at Delphi
I think it is fair to say that with our perceptual and cognitive limitations, insight into truth is never complete. It is refreshing to see articles such as this on Computer Audiophile describing one man's journey, discovery, and, perhaps not unreasonably, wisdom.
As humans, we have remarkable cognitive ability. I would never make light of this... It is this ability to integrate feelings and cognition that gives us our fantastic subjectivity; the gift of understanding, insight, joys & griefs, sense of purpose and value to experiences. No machine (currently) can hope to even appreciate this magnitude of beauty, art, creativity - in sum, sentience.
But because we can be biased in our perceptions, even down to the infrastructure of our physiology and cognitive ability, I think it is wise to be mindful of our limitations. For example, an audio recording/analysis machine would never be fooled by the McGurk "bar" vs. "far" in the video above. It cannot be affected by physiological illusions or emotional biases. And no doubt a decent modern recording device could "archive" audio with precision beyond the recollection of any human being's echoic memory or long-term memory storage. This is why I believe objective measurements are essential when we want to know just how accurate a piece of equipment is. This might not be all we want to know in a review but I do believe it's important to anyone who cares about "high fidelity". Even more important, and as a corollary, objective analysis will also allow us to figure out if a device/cable/tweak made any difference at all and, if so, the magnitude of the effect. I feel that this is an essential part of the evaluation of some devices and cables that have no reason for existence other than claims of being able to impart sonic change, claims based on no "evidence" beyond testimony.
On a related note, over the years, I have either heard from or argued with folks who think that the whole purpose of high fidelity is "enjoyment" (and that this therefore discounts or reduces the need for objectivity). Sure, the hedonistic goal is important and I regularly sign off my posts with a wish that we all find enjoyment in our audio. However, I trust that the definition of "audiophile" is more than "music lover". We do not go to "audiophile shows" to chat with music salesmen nor do we typically read audiophile magazines to get the latest scoop on favourite artists and new albums of the week... No folks, an "audiophile" is more than a music lover; let's face it, "he" is a lover of the hardware and the technology. He is a practitioner of "high fidelity". While there are elements of art and design in hardware, we buy these devices for what they do. The technology and science inside the box is meant to produce "good" sound. While we may disagree on what "good" sound is, I choose to use technical accuracy as my guide (the 'ideal') to what high fidelity means... Others may choose a more "euphonic" character and that's fine as well so long as we understand and can communicate our goals. Art and science, subjective enjoyment and engineering virtuosity are complementary and together represent the fulfillment of this hobby (not to mention modern life!)...
Bottom line: As the saying goes "to err is human..." - therefore verify; especially ephemeral auditory impressions.
As usual... Happy listening everyone.
Oh yeah, one more thing. Realize that it is not only in audiophilia where reliability of the sensory system can be questioned once tests are held in controlled settings... Consider oenophilia. Cheers!