Archimago's Musings: MUSINGS: Do we "need" those >20kHz ultrasonic frequencies for high-fidelity audio?

Saturday, 1 April 2017

MUSINGS: Do we "need" those >20kHz ultrasonic frequencies for high-fidelity audio?

Recently I received this excellent question and link about ultrasonic frequencies from the Computer Audiophile site:

Hi, Archimago. Visit your blog frequently and find your posts enlightening and entertaining at times without the usual smoke and mirrors.

What caught my eye in the musings on MQA utilizing the Mytek Brooklyn DAC was the selection of a musical reference which had ultrasonic, specifically musical, content. After reading the paper by James Boyk <https://www.cco.caltech.edu/~boyk/spectra/spectra.htm>, I have been interested in the subject of ultrasonics and their (potential) effect on the listening experience fully recognizing that these frequencies are well above the capability of human hearing. Included would be identification of recordings that have musical ultrasonic content.

Given that formats such as PCM (24/96 or higher) and DSD (2x or higher) [Ed: remember that even DSD64 1x can go >20kHz] have the potential to capture musical content above 20kHz, I am intrigued by the possibilities. As with all HiRes formats, I understand that not only all components of the recording chain but also the reproduction chain must have frequency response greater than 20kHz to accomplish this. I do see where speakers are being offered which are spec'd to 40kHz and as high as 100kHz. There are also numerous add-on "supertweeters" being offered which have this capability as well.

IF the topic were of interest to you and worthy of your time and consideration, I for one would be most interested in your musings on the subject.

My apologies for using this Musing as a portal for my inquiry but did not know how else to contact you with the proposal.
FWIW, given the potential of the existing HiRes formats to capture the musical experience if fully realized, I too am less than interested in MQA as the latest flavor-of-the-day.

Frank Zawacki
Connecticut Audio Society

Thank you Frank for the link, interesting discussion and question. I try to do what I can to collect information and synthesize posts to provide hopefully reasonable thoughts on these matters; mixed with some measurements and personal subjective impressions as appropriate.

As is essentially characteristic of all biological traits, there is a "normal" distribution of threshold for frequency detection; some of us will be able to hear to 19kHz, others have ears that poop out by 14kHz. Even if it's not "hearing" the tone, but rather "sensing" the presence of it, this concept remains valid. Since I trust none of the readers here are aliens from another planet who may have extraordinary sensory perception, I believe we can find answers based on studies that are out there exploring this physical limitation of Homo sapiens.

You've brought up one of the fundamental claims in the audiophile world when companies embarked on going beyond CD-level resolution back in the late 1990's - that ultrasonic frequency reproduction improves the audible quality of high fidelity reproduction. This is of course on top of claims that going from 16 to 24-bits made a big difference... For more on the bit-depth discussion, have a look at the blind test from 2014 (and also general concepts of expectations for high-definition audio).

I remember reading about the supposed benefits of ultrasonic frequency reproduction around 1999. "Super tweeters" with response usually at least an octave beyond 20kHz started showing up on the scene; devices like this Fostex T90A horn, or even more easily available the Radio Shack 40-1310 which I have somewhere in my electrical parts box. Offerings grew in the early 2000's with the advent of SACD and DVD-A. Since sampling rates overcame the 44.1kHz (22.05kHz Nyquist) limits of CD audio, the opportunity was there to promote add-on transducers like the Tannoy SuperTweeter, Townshend Audio Supertweeter (2004) or this Audiosmile Supertweeter by 2008. As noted by Frank, there were some multi-way speakers over the years with super tweeters incorporated or the tweeters themselves capable of extended ultrasonics (Tannoy Dimension TD12, Linn Majik 140 for example among others like B&W's with ultrasonic-capable tweeters and JBL M2 rated to 40kHz). Check out the current Madisound list of super tweeters for sale.

Despite this ~20-year history, notice that only a minority of speakers these days explicitly aim for flat frequency extension significantly over 20kHz, even though many, especially those with metal tweeters like titanium and beryllium ones, can reach it to varying degrees. When was the last time you saw a speaker with a reasonably flat extended frequency response like -6dB at 40kHz viewed as a major selling point? Stereophile measures speakers to 30kHz but there's no suggestion that this extension is important (rather, it's a nice buffer to ensure everything looks good to 20kHz). By definition, ultrasonic refers to frequencies we humans cannot hear; at least not based on research with typical pure-tone tests.

As a starting point, a good document to analyze which originated back during the advent of "high resolution" digital audio is this Tannoy white paper "The Need for Extended High Frequency Bandwidth - or Why You Need A Supertweeter" from 1999. In there you see the link to the James Boyk measurements as well.

We can see from the white paper 3 major claims:

1. The fact that instruments have very high ultrasonic frequency harmonics - especially piccolos, oboes, triangles, cymbals. Fair enough, nobody is contesting this information from Boyk. It's obviously objectively measurable.

2. Time coherence is of importance... Yes, the concentric speaker system like in the Tannoy (or KEF, etc...) is great. Indeed, if one were to have a super tweeter incorporated into a multi-way speaker, it would be good that it remains time/phase coherent with crossovers typically up at the 5kHz-7.5kHz range. Again, I think nobody generally contests this since it applies to all speakers, not just those with super tweeters (and remember, we can use DSP processing to improve time domain performance in our sound rooms).

3. Human perception of ultrasound is possible... Ahhhhhh, here lies the contentious claim!

Is there any evidence that what are typically considered ultrasonic frequencies (>20kHz) are perceptible?

If you look around at audiophile articles, just like the Tannoy white paper, you will no doubt run across the "Oohashi Hypersonic Effect". Basically, this was a series of papers published at the turn of the millennium by Tsutomu Oohashi and colleagues, with some concepts initially discussed at an AES convention way back in 1991 before making their way to the Journal of Neurophysiology in 2000 with further additions. I'm not going to spend much time talking about this again because I already wrote about and analysed the papers (including a more recent 2014 paper where the researchers used DSD128) in the January 2015 post "MUSINGS: What Is The Value of High Resolution Audio (HRA?)". The bottom line IMO is that this stuff is simply too speculative and clearly more complicated than it first appears, so it should not be used as proof that reproducing ultrasonic frequencies will always result in a beneficial effect. For example, the recent 2014 presentation actually considered safety issues, and both "positive" and "negative" hypersonic effects on EEG activity were reported. My sense is that until there is actual paradigmatic understanding (not just observational reports of neurophysiological change), there is no point debating the Oohashi stuff as a pro or con. (For more on this along with links to other studies, including those that failed to replicate Oohashi, see the Wiki on "Hypersonic Effect".)

As for the other research referenced by that Tannoy whitepaper, the 1991 paper by Lenhardt is worth thinking about. Realize that this research involved placing a transducer directly against the subject's body for bone conduction (see this New York Times article related to the research). The theory is that ultrasonics are not "heard" by the usual cochlear mechanism but rather the saccule which is an inner ear organ of the vestibular system used in balance / positional equilibrium. Although I don't know how far the research has progressed, we can see based on this recent 2013 paper by Kagomiya that they used a ceramic transducer placed over the mastoid bone behind the ear and had the non-deaf subjects determine the "audibility" of "sound" transmitted through a 30kHz carrier. We're not told how much power was needed to create the ultrasonic vibrations which the subjects were able to detect (ie. what SPL in the air would be equivalent to cause these vibrations?). Suppose we accept that ultrasonic stimulation through bone conduction is beneficial for high-fidelity appreciation, how loud does a 30kHz signal from the super tweeter need to be in order to vibrate our skull so that it even has a chance to be appreciated!? Perhaps someone can let us know but I have a feeling the typical amplitude of ultrasonic material during an acoustic musical concert is far less than the amount that the researchers in this experiment utilized (obviously I'm not talking about lo-fi ear-splitting amplified rock concerts filled with distortions and all kind of wide bandwidth noise).

Finally, the Tannoy whitepaper speaks of a 10dB peak at 30kHz being audible in their internal testing. So did they publish this finding? If not, why not?! This would be fascinating, would have prompted replication research, and likely would have spurred on sales of their SuperTweeter. Talk is cheap, Tannoy.

Remember that recently there was also the meta-analysis by Joshua Reiss to determine perceptibility of high-resolution audio (Journal of the AES, June 2016). He cites 4 papers (out of 31 utilizing auditory experiments) which suggested the potential for hearing >20kHz. Two of these 4 are from Ashihara. Have a look at this 2006 paper where they tried to determine hearing thresholds beyond 20kHz for 15 subjects.

For convenience, here is Figure 6 from the paper where they showed the results from the 4 out of 15 subjects with best high-frequency hearing acuity:

As you can easily see, there is a massive jump in threshold between 16-20kHz across the subjects - remember the logarithmic nature of the dB scale. They were able to detect thresholds of audibility up to 22kHz for 6 listeners, and the best 4 of them had detectable thresholds up to 24kHz (Fig. 6 above). What's the catch? First, the tones were obviously hard to detect with thresholds up around 90dB SPL for a pure tone. Second, guess what... These were young people from 18-33 :-). The majority - 9/15 of young folks tested could not hear beyond 20kHz even at high SPL up to ~90dB. How many audiophiles who claim to hear benefits from super tweeters are in that age range? And besides, who plays music at such levels that their speaker would be producing close to 90dB SPL at 24kHz!? What kind of music has this much ultrasonic content even!?

So far, we see little evidence that ultrasonic frequencies are actually audible by assessing the claims at least put forth by the Tannoy whitepaper. Likewise, I see no evidence that typical audio playback could result in bone conduction to any significant degree as would be suggested by the Lenhardt work.

Here are a few further points to consider:

1. Again, only the young can likely hear up to 20kHz with any significance. Let's be honest guys. By the time we're 35 years old, on average, hearing acuity of an 8kHz tone is somewhere on the order of -10dB compared to 1kHz. We basically don't hear anything by 20kHz. Remember that the ladies among us are blessed with more graceful decline in acuity. Even the Ashihara paper above showed only a minority (6/15) of young folks up to 33 years old had a measurable threshold to 22kHz (Nyquist limit of CD), and nobody in their population was able to hear pure tones beyond 24kHz up to ~90dB SPL. It is unfortunate that they did not publish demographic information like the mean age or gender of those they found with best high frequency acuity.

The evidence, at least based on pure-tone analysis, suggests that a sample rate around 48kHz (24kHz Nyquist) is the absolute most that any adult can even have hope of perceiving; any frequencies that need a higher sample rate are beyond benefit barring the vague Oohashi stuff. As Figure 6 above also showed, if there were noise or other frequencies in the signal, masking happens and makes detection even worse. I think we can round up to 50kHz PCM sample rate and state with good confidence that this is all we need to capture for human consumption in the frequency domain.
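To make the sample rate arithmetic explicit, here is a minimal sketch of the Nyquist relationship used throughout this post. The function names are my own for illustration, and this ignores the transition band a real anti-aliasing filter needs just below Nyquist.

```python
def nyquist_khz(sample_rate_khz):
    """Highest frequency a PCM stream can represent: half the sample rate."""
    return sample_rate_khz / 2.0


def min_sample_rate_khz(highest_freq_khz):
    """Lowest sample rate that can still represent the given frequency."""
    return 2.0 * highest_freq_khz


if __name__ == "__main__":
    # Rates discussed in the post: CD, 48kHz, the rounded-up 50kHz, and 96kHz.
    for rate in (44.1, 48.0, 50.0, 96.0):
        print(f"{rate}kHz sample rate -> {nyquist_khz(rate)}kHz Nyquist")
    # The best listeners in the Ashihara figure had thresholds up to 24kHz.
    print(f"24kHz tone needs at least {min_sample_rate_khz(24.0)}kHz")
```

So the 24kHz ceiling seen in the pure-tone data maps onto a 48kHz sample rate, which is why rounding up to 50kHz leaves a little room to spare.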

2. Microphones typically are bandwidth limited to around 20kHz. You can easily see this in the vast majority of recordings (synthetic sounds of course can have all kinds of unintentional ultrasonics). An example of what one typically sees from a pop/rock studio is something like Joni Mitchell's "Both Sides Now" from the DVD-A released in 2000 (24/96):

Coincidentally, the graph also demonstrates just how noisy this recording is. Although it's presented as 24-bits on the DVD-A, there's no evidence that it needs anything more than 16-bits given the high noise floor clearly way above -96dB. Furthermore, in the production chain (I don't know if it was recorded to analogue tape, went through an analogue mixer, or had a noisy ADC), there is a rather high level 29kHz noise peak. Clearly this is not part of the music, nor would anyone claim they "should" hear this. So why bother reproducing this ultrasonic signal?
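For reference, the -96dB figure comes from the usual rule of thumb for the ratio between full scale and one quantization step. A rough sketch, assuming an ideal quantizer and ignoring dither and noise shaping (the function name is just for illustration):

```python
import math


def ideal_dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB: 20*log10(2^bits)."""
    return 20.0 * math.log10(2 ** bits)


if __name__ == "__main__":
    for bits in (16, 24):
        print(f"{bits}-bit: about {ideal_dynamic_range_db(bits):.1f}dB")
```

Each extra bit buys roughly 6dB, so a recording whose noise floor sits well above -96dB is not using anything 16-bits could not already hold.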

Even in bona fide high-resolution recordings, ones like this 2L DXD sample found here (I downsampled it to 96kHz to better demonstrate the roll-off in the recording at one of the more dynamic segments of the music):

Clearly this is a much better recording than "Both Sides Now" with very low noise floor so 24-bit resolution could be reasonable. But time and again, this is what you see in the spectrum of essentially all recordings. There's just nothing much up in the high frequencies beyond 20kHz in the music we buy; even those that are "high resolution" with high sample rates like 192+kHz.

I mentioned a bit more about microphones in the post last year: "MUSINGS/ANALYSIS: Is there any value in 176.4 and 192kHz Hi-Res audio files?".

Finally, just to complete these illustrations, here's a very good sounding SACD - Christina Pluhar & L'Arpeggiata's La Tarantella: Antidotum Tarantulae:

I'm showing the FFT at one of the most dynamic parts of the music - notice how the high frequency naturally drops off. As another natural sounding acoustic recording, this time to DSD, notice that there's no evidence of anything significant beyond 25kHz being recorded by the microphones. Without filtering, the DSD64 quantization noise is rather nasty as well, obviously adding nothing of value to "high fidelity" sound...

3. Adding super tweeters makes the speaker more complicated and increases the risk of poor integration. They cost more and bring the potential for suboptimal cross-overs. Why bother unless there is proof of value?

4. No transducer is perfect. Speaker non-linearities result in intermodulation and subharmonic distortions that may be audible <20kHz. This is reason not to record too much ultrasonic content nor try to reproduce it. This is especially true with ultrasonically noisy content like unfiltered SACD/DSD64 as above which is part of the noise shaping used in the technology. Large amounts of noise amplified and sent to the speakers in no way provides benefits to the audiophile who desires true high fidelity! (Refer to Monty's "Intermod Tests" and have a listen to demonstrate to yourself why too much ultrasonic content might be bad in your own system.)

5. High frequencies are attenuated through air. Let's just focus on this for a bit because I think it's interesting and perhaps we don't think about it enough. Tell me, friends, how many of you enjoy listening to an orchestra or jazz band sitting from a vantage point just 4 feet in front of a cymbal, triangle, or trumpet? Nobody, I hope :-). Realize that this is the distance the measurement microphone (a 1/4" B&K 4135 condenser) was placed in the Boyk article analyzing the high-frequency content of instruments.

Assuming we could hear the ultrasonics, and our microphones are able to record >20kHz at high fidelity, and the playback chain including the speakers can reproduce >20kHz frequencies accurately, there is still significant attenuation of these high frequencies due to the air between us and the speakers. In fact, you can have fun calculating this here. As I type this on a sunny day in Vancouver, the humidity indoors is around 40%, and it's about 20°C room temperature in my sound room. Calculated attenuation per meter at 20/25/30 kHz looks like this for "atmospheric absorption":

Considering that I sit 3m (about 10 feet) from my speakers, this means that at 20kHz, the sound reaching my ears attenuates another -1.7dB, at 25kHz it's -2.3dB, 30kHz -2.8dB, and -3.7dB by 40kHz. Here's a graph demonstrating the attenuation curve 10' away in a 20°C, 40% humidity room as an average benchmark for absorption in this part of the world (typically lower temperature and humidity will increase absorption):

Comparatively, this is not a huge amount at 10-feet (3m) of course. However, if we're talking about frequencies produced by real instruments, and we're trying to replicate the sound of an actual performance at a venue, we must ask ourselves then the question "what distance should the recording sound like it's coming from?" This is important for the tonal balance of high frequencies due to the non-linear disproportionately high amount of atmospheric absorption.
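For those who would rather compute this than use an online calculator, below is a rough sketch of the pure-tone atmospheric absorption formula as I understand it from ISO 9613-1. Treating the calculator mentioned above as ISO 9613-1 based is an assumption on my part, and the function name and default arguments are mine, so take this as an illustration rather than a reference implementation.

```python
import math


def absorption_db_per_m(freq_hz, temp_c=20.0, rel_humidity_pct=40.0,
                        pressure_kpa=101.325):
    """Pure-tone atmospheric absorption in dB per metre (ISO 9613-1 form).

    Assumption: this is the model behind the figures quoted in the post.
    """
    t = temp_c + 273.15          # air temperature in kelvin
    t0 = 293.15                  # reference temperature (20 C)
    t01 = 273.16                 # triple point of water
    p_rel = pressure_kpa / 101.325

    # Molar concentration of water vapour (%), from relative humidity.
    c = -6.8346 * (t01 / t) ** 1.261 + 4.6151
    h = rel_humidity_pct * (10.0 ** c) / p_rel

    # Relaxation frequencies of oxygen and nitrogen (Hz).
    fr_o = p_rel * (24.0 + 4.04e4 * h * (0.02 + h) / (0.391 + h))
    fr_n = (p_rel * (t / t0) ** (-0.5)
            * (9.0 + 280.0 * h
               * math.exp(-4.170 * ((t / t0) ** (-1.0 / 3.0) - 1.0))))

    f2 = freq_hz ** 2
    classical = 1.84e-11 / p_rel * math.sqrt(t / t0)
    oxygen = 0.01275 * math.exp(-2239.1 / t) / (fr_o + f2 / fr_o)
    nitrogen = 0.1068 * math.exp(-3352.0 / t) / (fr_n + f2 / fr_n)
    return 8.686 * f2 * (classical + (t / t0) ** (-2.5) * (oxygen + nitrogen))


if __name__ == "__main__":
    distance_m = 3.0  # listening distance used in the post
    for khz in (20, 25, 30, 40):
        loss = distance_m * absorption_db_per_m(khz * 1000.0)
        print(f"{khz}kHz over {distance_m}m: {loss:.2f}dB")
```

At 20°C and 40% humidity this should land close to the 3m figures quoted above, and changing the temperature or humidity arguments lets you try the conditions in your own room.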

As someone who enjoys going to the orchestra, when I listen to the Vancouver Symphony at the Orpheum Theatre in downtown Vancouver, the instruments are clearly much further away than the 10-feet or so between me and the speakers in the sound room. Suppose I sat front-and-center at the Orpheum, I would estimate that the guy playing the cymbal and triangle emitting all those ultrasonic frequencies at the back of the orchestra would be at least 40' away. Assuming 20°C, 40% humidity, this means that a 20kHz tone would have already attenuated by at least -6dB even if this was direct with nothing but air in the way. By 30kHz, there's >11dB of loss at this seating position; again without all the rows of musicians between me and the cymbal emanating >20kHz material. That's assuming I'm sitting close to the orchestra as an audience member - how much worse is the attenuation sitting further back with the bodies of other patrons in the rows in front of me?!
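The concert hall estimate is just the sound room figures scaled by distance, since air absorption in dB grows linearly with path length. A quick sketch using only the 3m numbers quoted earlier (the dictionary and function names are mine, and this covers air absorption only, not the normal drop in level with distance that affects all frequencies):

```python
FEET_TO_M = 0.3048

# Air absorption over 3m at 20C / 40% humidity, as quoted in the post (dB).
LOSS_AT_3M_DB = {20: 1.7, 25: 2.3, 30: 2.8, 40: 3.7}


def air_loss_db(freq_khz, distance_m, ref_distance_m=3.0):
    """Scale the quoted 3m absorption figure linearly to another distance."""
    return LOSS_AT_3M_DB[freq_khz] / ref_distance_m * distance_m


if __name__ == "__main__":
    hall_distance_m = 40 * FEET_TO_M  # front-row seat to the percussion
    for khz in sorted(LOSS_AT_3M_DB):
        loss = air_loss_db(khz, hall_distance_m)
        print(f"{khz}kHz over {hall_distance_m:.1f}m: about {loss:.1f}dB")
```

This gives a bit under 7dB at 20kHz and a little over 11dB at 30kHz for the 40-foot path, in line with the "at least -6dB" and ">11dB" estimates above.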

It's worth thinking about this idea of perspective as a listener (not just for the frequency response reason of course). If our home stereos are supposed to reproduce sounds in "high fidelity", based on what sounds "natural" around us, the reality is that acoustic music (IMO the type of music most appropriate for "high resolution" reproduction) heard as an audience member naturally rolls off the highs. This is potentially why research into "target curves" or "house curves" used in room correction tends to de-emphasize frequencies from 10-20kHz. An example is the "Harman Target Curve" (as described in this 2015 paper):

This data was produced empirically by allowing subjects to use tone control to find subjectively preferred tone curves starting with a flat-calibrated room/speaker system. Notice how trained listeners preferred a gradually descending frequency response of the speakers. Wouldn't you think that if ultrasonics were important, that experimental results, especially with trained listeners would at least suggest flat frequency response to 20kHz for most music?

I've often thought about the rationale for preference of rolled-off high frequencies in these target curves (the classic B&K curve has similar roll-off). I wonder whether close-mic'ed studio productions may be adding to this these days. Remember, we typically sit many meters away from the artists in a live performance. Our ears/mind naturally expect high frequency roll-off in such a venue and would also expect the same in the home sound room. When artists are recorded in a studio, close-mic techniques where the microphone is often placed a foot away from instruments could result in a rather unnatural tonal response, picking up more high frequency energy than a normal listener would hear from many feet away and lacking the kind of roll-off demonstrated by acoustic recordings like the Magnificat (2L) and La Tarantella tracks shown above.

High-fidelity speaker systems capable of flat response to 20kHz may actually sound too "harsh" or "analytical" when playing these close-mic'ed studio recordings. Who knows, maybe this is why some objectively "inferior" speakers with early high-frequency attenuation can sound natural and more "musical" in certain situations. Perhaps this is also why suboptimal digital converters like NOS DACs and those using early roll-off filters like the PonoPlayer could be preferred despite the imaging distortions above Nyquist that come along with those filter settings. The idea about close-mic'ed studio recordings and harshness is speculation on my part, so I would be interested in others' thoughts. Considering recent developments, I think it is possible that it's not "time domain" quality and short impulse response graphs that are important when looking at digital filters... Rather, it's the high frequency roll-off that can sound more "natural" (eg. PonoPlayer / Ayre filter starts rolling off by 7kHz and about -4.5dB by 20kHz for 44.1kHz material). Perhaps these digital filters are acting as mild tone control compensating in an era where our hi-fi gear no longer has those control knobs.

For completeness, I know that some vinyl lovers will talk about the superiority of frequency response compared to CD. While it is true that a good quality LP can contain frequencies well above 20kHz, it's not like you see much ultrasonic content on most LP's. My personal experience with vinyl rips, whether with Denon DL-110, Shure M97xE, or Ortofon Cadenza Black cartridges, has shown that LP playback typically rolls off the highs up to 25-30kHz, reaching the noise floor with little content above. I have no qualms with down-sampling my vinyl rips to 48kHz these days.

The Bottom Line...

However one looks at the evidence, I think it's fair to say that the likelihood of perceiving (not necessarily even hearing) a difference in sound material >20kHz is simply dubious. Then there's the question of whether this actually benefits the sound even if perceptible.

Yes, while there are the occasional research papers suggesting differences in audibility between sample rates (like this one comparing 88.2kHz vs. 44.1kHz from 2010), I have yet to see clear evidence that ultrasonic frequencies themselves have resulted in audible differences as opposed to sonic differences because of the DAC or ADC operating at different sample rates.

Among the research, I found the negative NHK study (Nishiguchi et al. 2009) particularly fascinating using the Pioneer/TAD PT-R9 super tweeter, B&W Nautilus 801 speakers, dCS digital gear, Sony and Marantz amps. In that study, they used 36 subjects ranging from teenagers to those in their 50's; a huge proportion (33/36) with audio engineering experience, 6/36 women, and the musicians who recorded the test audio also participated. There was no evidence of a difference whether the super tweeter was activated with >21kHz content. Interestingly, one 17-year old female subject actually did really well in the research trials but subsequent further trials with this individual did not pass statistical levels of significance.

Yes, I know there are some folks with strong testimonies out there. I remember interacting with a "Golden Ear" who claimed he could hear up to 30 kHz (with no evidence). Anyone can have an opinion but opinions aren't facts. Even if this fellow tried a 30kHz tone test, it's possible that he heard harmonic distortion below 20kHz but didn't know it. More often than not, proponents of super tweeters claim that they improve the "air" of the sound system, improve the spatial dimension or clarity of the treble, some even claim they improve the "transient" accuracy of bass - vague claims indeed. But time and again, the scientific literature questions the audibility and benefits wherever we look. Whether it's our human physiology (which deteriorates with age - especially frequency response), recording equipment specs (like roll-off from typical microphones used), the music signal not likely containing much >20kHz, or even the air we breathe absorbing high frequencies, all these factors collude to limit the likelihood of significant ultrasonic content in live acoustic performances, within the recorded music itself and ultimately potential audibility.

My personal experience resonates with the science. For years now I have been examining "high resolution" albums from SACD, DVD-A, Blu-Ray to digital downloads and have rarely seen what looks like actual recorded ultrasonic content especially in good acoustic recordings. I have done my own ABX listening tests with what should be very high quality recordings like those from 2L on different systems (such as discussed here). These days, in my mid-40's, I just see no point in purposely pursuing a system with ultrasonic frequency response. A system with the ability to reproduce sound reasonably flat to around 20kHz is great. Beyond that, I'll do my own tweaking of the room, and find the best sounding mastering of albums...

If we take Ashihara's research in pure-tone audibility in young people as true, we might say a sample rate of 50kHz is absolutely all we would ever need. In fact, this is why I honestly think that for the purpose of streaming "high-resolution" audio these days, what's wrong with just 24/48 FLAC? This is what I would prefer instead of a restrictive scheme like MQA.

Having said this, as a "perfectionist audiophile", I like the idea of high-resolution in the sense that deeper bit-depth (ie. 24-bits) and higher sample rate to 88.2/96kHz (because of universal compatibility at these sample rates) will allow us to capture everything that decades' worth of research reports have shown to be the limits of human hearing and more. As one who likes to "own" my own music collection, this would be my preference. Whatever arguments are being made about digital filter effects would be moot at the 88.2/96kHz sample rates as well. Realize that playback even of 88.2/96kHz material could result in non-linearities with poor speaker systems. So long as the high resolution recordings do not contain inordinate amounts of ultrasonic noise and respect the natural high frequency roll-off, this should not be an issue. No need in my mind for sample rate higher than 96kHz as a consumer.

Despite claims over the decades, I have yet to see super tweeters considered "must have" features of most speaker designs. Likewise, I'm not sure if there is any data to support the idea that headphones capable of frequency response significantly above 20kHz are considered to sound "better". Given the mere centimeters distance between the headphone transducer and the auditory organ, ultrasonic frequencies would be essentially free from air attenuation. Also, why is nobody (that I know of) asking for hybrid air/bone conduction headphones that can handle up to 20kHz or so by air conduction and offer ultrasonic stimulation through bone conduction* (as per the Lenhardt research)?!

Needless to say, I believe we can all enjoy the music fully without "super tweeters". No need for concern or regret; fearing that we're missing out.

I hope this provides some food for thought, Frank... Cheers and greetings to the Connecticut Audio Society.

*BTW, there are bone conduction headphones out there like this one, but if you look at reviews, one cannot expect sound quality to be as good as air conduction even though they do provide benefits (eg. low isolation so you still hear what's going on around, possibly easier for fit and comfort in some situations).

----------------------------------

Over the years, I have wanted to do a blind test comparing content with >20kHz vs. filtered audio for you guys (something like the 24-bit vs. 16-bit Internet Blind Test in 2014). Unfortunately, it would be very difficult to ensure adequate blinding because anyone these days could pull up the file in an audio editor or see results on a spectral analyzer and determine which is which. I would need to ensure subjects are honest in order to have faith in the results if we did such a trial :-)...

Well, Dynamic Range Day 2017 just passed on March 31st. Always good to remember that at the end of the day, quality of the mastering (of which a big part is the retention of good dynamic range) is an essential piece of the joy of high-fidelity sound! While you're on the web site, do check out the Loudness War Research page for some great info.

This week I've been getting into the guitar blues of Hubert Sumlin (1931-2011). Perhaps best known as a member of Howlin' Wolf's band, this guy "rocks" on his solo work as well. His 1998 album I Know You (DR13) I found very enjoyable.

As usual, have a great week ahead everyone and hope you're all enjoying the music!

Addendum:
In response to jhwalker below in the comments about DSD and the 25kHz roll off... Not true, DSD64 can encode frequencies way beyond 25kHz. Remember those SACD ads claiming 100kHz frequency response? Those ultrasonic frequencies however then get buried in the rising noise floor - this is the price to pay for low-resolution 1-bit sampling, and why high sampling rates in the MHz range are needed. Many DAC's will implement analogue filtering so the amp/speakers don't suffer through trying to reproduce all that high frequency noise! Software like JRiver for example will by default also digitally low-pass filter DSD --> PCM playback around 24kHz.

The La Tarantella DSD track was converted with no filtering done to the conversion. Here's an example of "synthetic" music with >25kHz components in DSD64:

As you can see, that's from the Beck album. Synthetic instruments, close mic recording, studio tricks with all kinds of ultrasonic frequencies (likely noise), much of it buried below the typical DSD64 quantization noise from 35kHz onward.
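As a quick sanity check on the "MHz range" comment, here is the arithmetic for the DSD rates, where DSD64 means 64 times the CD sample rate of 44.1kHz. The function name is just for illustration, and keep in mind the theoretical bandwidth says nothing about how much of it sits above the shaped quantization noise.

```python
CD_RATE_KHZ = 44.1


def dsd_rate_khz(multiple):
    """1-bit sample rate for DSD64/128/256: a multiple of the CD rate."""
    return multiple * CD_RATE_KHZ


if __name__ == "__main__":
    for multiple in (64, 128, 256):
        rate = dsd_rate_khz(multiple)
        print(f"DSD{multiple}: {rate / 1000.0:.4f}MHz sample rate, "
              f"{rate / 2000.0:.4f}MHz theoretical Nyquist")
```

So DSD64 runs at about 2.8MHz, far above anything needed to encode 25kHz; the practical limit is the rising noise floor, not the sample rate.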

Addendum 2: (April 2, 2017)
A friend just E-mailed to me his disappointment with Bob Dylan's Triplicate available as 24/192 at the usual places. Apparently there's a nasty 27-28kHz noise that runs through the songs:

I don't know if others have "heard" this. If nobody has complained, I would take that as a sign that a frequency like this above 24kHz is inaudible and undetectable.

Sadly, this is yet another example of why many of the pop/rock "high resolution" albums IMO are worthless. Why pay more for 24/192? With a noise peak like that, might as well downsample the whole thing from 192kHz to 48kHz. Plus the noise floor doesn't demand >16-bits, and the album is compressed to DR8 so by all means also dither it down to 16-bits.

53 comments:

Mark's Blog, 1 April 2017 at 12:24
Another informative post. Thanks.
F.Y.I. Hubert Sumlin's Healing Feeling, featuring the great Ronnie Earl, is fabulous - one of my all time favorites. Don't know the measured DR but the sound quality (44/16 CD rip) is quite good.
jhwalker, 1 April 2017 at 12:50
Great write-up, as usual.

WRT the L'Arpeggiata recording, though: isn't 25kHz pretty much a normal rolloff / filter point for DSD64, in any case? Pointing out that a DSD64 recording rolls off at 25kHz is basically like saying a CD rolls off at 22.1kHz - yes, and?

IOW, it's entirely possible had the original recording been done at DSD128 or DSD256, etc., there might have been substantially more supersonic content, with filtering beginning at a much higher frequency.

Just a thought.
solderdude 2 April 2017 at 01:25
Some things to consider when looking at the Joshua Reiss analysis.

It shows the audible threshold of artificially generated tones (not music).
It is clear that 4 individuals could well detect up to 14kHz with an SPL between -5dB and +10dB.
It probably took quite some time of having total silence around them to be able to do so in these cases.

For those that wonder how one can hear below 0dB SPL: 0dB SPL is considered the average threshold of human hearing, NOT the absence of sound pressure. 0dB SPL still corresponds to 20 micropascals of sound pressure and is just a chosen reference point.

The research did show some people 'detecting' a presence (feels more like a pressure on the ears/head than an actual tone) of 20kHz at 80dB SPL.
Up to 24kHz needing 90dB SPL.

Quite interesting in the sense that it is objectively possible to hear up to 24kHz. 20% above the considered audible range no less.
I remember from my own experiments, in my much younger years, that I could 'detect' up to 21kHz when that tone was played very loud.
Now it starts to cut out at around 17kHz for me while my kid is covering his ears in agony.

To put the analysis in perspective (when it comes to music) one should have a look at the amplitude of high frequency contents in recordings with lots of actual harmonics.
The high frequency energy in recordings is easily 30dB to 40dB lower in amplitude relative to bass and mids.
This means that in order to 'detect' these >20kHz frequencies (@ 90dB SPL) the bass and mids in a music signal would have to be 120dB to 130dB SPL in order for that >20kHz content to become just 'barely detectable'.
The rest of the music is just happily pounding away on your eardrums at extremely loud levels when the >20kHz content would be 'just noticeable'... when coming from silence.
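The arithmetic in the paragraph above can be written out as a short sketch. The function names are invented for this illustration; the only inputs are the figures quoted in the comment (a 90dB SPL detection threshold and HF content sitting 30dB to 40dB below the bass and mids) plus the standard 20 micropascal reference.

```python
P_REF_PA = 20e-6  # 0dB SPL reference pressure, 20 micropascals


def required_program_level_db(detection_threshold_db: float,
                              hf_below_program_db: float) -> float:
    """Bass/mid level needed for the HF content to just reach the threshold."""
    return detection_threshold_db + hf_below_program_db


def spl_db_to_pascal(spl_db: float) -> float:
    """Convert a level in dB SPL to RMS sound pressure in pascals."""
    return P_REF_PA * 10 ** (spl_db / 20)


for hf_offset_db in (30, 40):
    level = required_program_level_db(90, hf_offset_db)
    print(f"HF {hf_offset_db}dB down: bass/mids at {level:.0f}dB SPL "
          f"({spl_db_to_pascal(level):.1f} Pa)")
```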

Detecting artificially generated tones is one thing, but we are talking about high frequency contents in MUSIC and the audibility of that high frequency content which may well yield different results.
Some research has shown that in order to actually have a beneficial effect in SQ the >20kHz content must be harmonically related to audible fundamentals and/or overtones.
This is not the case in single-tone tests which some research may already have 'shown'.

While I am convinced some people have better trained ears than others and/or have better hearing (less damage over the years or being much younger) I doubt the claimed hearing acuity of some 'audiophiles' is as good as they 'know' it is.

All I know is that all of my own experiments on this subject have shown me that I have no benefits from playing files having content well above 20kHz.
For others the possible 'ease of mind' effect when knowing they play higher res files may give them more enjoyment.

For me it just allows me to carry more music around on cheap memory cards and saves me space on hard discs.

On required bandwidth:

As Frank Zawacki and Archimago already mentioned, to reproduce 20kHz at a proper level/quality the actual bandwidth of the transducers (microphones/speakers/headphones) and amplifiers SHOULD exceed 20kHz.
This alone is a good reason to have gear (amplifiers, transducers) with a bandwidth of at least 30-50kHz so it can accurately reproduce 20kHz bandwidth-limited files.
That fact doesn't mean you also actually need music with content far above 20kHz, except perhaps during recording to allow for post-processing.
mats c 3 April 2017 at 02:35
Many interesting facts here, but it has been mentioned that higher sample rates allow simpler and gentler digital filtering, which has less impact on the impulse response.
I also have a Teac UD-501. With my former amp and speakers, the NOS filter mode was often too dull for my liking, but with my current setup (Wharfedale Denton / Rega Elex-R) it sounds natural, though sometimes a bit laid-back.
BigGuy 3 April 2017 at 08:24
BTW, issue with not being able to successfully REPLY appears to have been use of Firefox. Explorer works fine. :-)

FRANK
Vincent 3 April 2017 at 10:35
One of the few arguments I could conjure up in favor of > 44kHz is that you don't need that steep brickwall filter right after 20kHz.
What are your thoughts on this?
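To put a rough number on "steep", here is a sketch using fred harris's rule-of-thumb estimate for FIR filter length, N ≈ (fs / transition width) × (attenuation in dB / 22). The 20kHz passband edge, the stopband edge at Nyquist and the 100dB attenuation are assumptions chosen for illustration, not the parameters of any particular DAC.

```python
import math


def estimate_fir_taps(sample_rate_hz: float, passband_edge_hz: float,
                      stopband_edge_hz: float, stopband_atten_db: float) -> int:
    """Rule-of-thumb FIR length: N ~ (fs / transition width) * (atten_dB / 22)."""
    transition_hz = stopband_edge_hz - passband_edge_hz
    return math.ceil((sample_rate_hz / transition_hz) * (stopband_atten_db / 22))


# Assumed targets: flat to 20kHz, 100dB down by the Nyquist frequency.
for fs in (44_100, 96_000):
    taps = estimate_fir_taps(fs, 20_000, fs / 2, 100)
    print(f"{fs} Hz: transition band {fs / 2 - 20_000:.0f} Hz, about {taps} taps")
```

Under these assumptions the 44.1kHz filter has only about 2kHz of transition band to work with, versus 28kHz at 96kHz, which is why the higher rate can get away with a much shorter and gentler filter.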
tnargs 3 April 2017 at 23:15
If the date of this post is central to its credibility, then you have been too clever for me!

IMO Oohashi's hypersonic effect needs to go in the box with cold fusion and other experiments that are "desperately in need of verification but only seeing contra-indications". When a new experimental finding contradicts a large body of prior research, the first thing needed is independent verification. When that doesn't happen, it goes in the category of outlier or anomaly, and best quietly ignored by engineers and implementers.
Teodoro Marinucci 5 April 2017 at 01:05
Hi,
a question totally unrelated to the (excellent) post...
It seems that you like Roon. Why? What are the features that make it (in your opinion) excellent?
Given that I'm not interested in multi-room, I see none.
I have already expressed my opinion about "favorites", talking about them as a new way of "owning" albums: no longer physically at my place, but "in the cloud".
So "owning" thousands of albums could become not so unusual. How do you handle them?
I handle and play my collection of "liquid" music (ripped CDs and more) using JRiver Media Center.
Don't you think that something similar would be beneficial for the task ("I would like to listen to Mahler's Third Symphony, conducted by ... I don't remember, let me see the list ...")?
I'm writing something like that. Qobuz offers excellent APIs. Then I shall use BASS and SQLite to handle metadata. Worthwhile?
Thanks
Audio_Tony 5 April 2017 at 03:39
I remember when I was much younger, reading about the so-called benefits of an ultrasonic response, and experimenting with super tweeters / piezo tweeters etc., and I truly couldn't tell any difference, both with analogue and digital recordings.

I'm beginning to wonder if, in 'x' years' time, we will suddenly realise that 16 bits really was enough.

Published on an interesting date by the way! (01 April)... :-)

StevenS 5 April 2017 at 12:03
Great article. Relieved to see you are not onboard the Oohashi bus (he's kind of a *crackpot* actually, if you look up all of his published work and resume). But your link to the Harman (Dr. Floyd Toole) AES paper on target curves isn't working (for me at least) -- but this one does
http://www.aes.org/e-lib/browse.cfm?elib=17839
StevenS 5 April 2017 at 12:12
Re DSD, nota bene that, at least in the days when SACD players were being pushed as the new thing, it was common for them (and recommended in the Scarlet Book spec) to low-pass filter the output as a final step, at either 50kHz or 100kHz, to eliminate at least some of the ultrasonic nastiness. I don't know if this is still true of hardware SACD players.
Honza 6 April 2017 at 02:41
We really cannot hear > 20 kHz. 88.2/96 kHz makes sense mainly for 2x upsampling from 44.1/48 kHz, since many common DACs oversample much worse than SoX or Ferocious Resampler https://github.com/jniemann66/ReSampler/releases/tag/v1.2.6 do, or for recordings that are then used for CD creation. Sample rates > 96 kHz are total overkill and have no benefit. Recording at 24/48, 24/88.2 (if pure CD output is desired) or 24/96 (if multiple outputs like computer playback, DVD and CD are desired) is the gold standard and should not be pushed further. With good plain TPDF, sloped, or moderately noise-shaped dither, even 16-bit is very usable for playback on common equipment.
Honza 6 April 2017 at 02:54
P.S. Especially 16/44.1 upsampled to 24/88.2 or even 16/88.2 (with redithering, plain TPDF or sloped TPDF) sounds very good - not because it contains additional information, but because many common DACs do not filter 44.1 as well as 88.2. Upsampling by an even factor also preserves the original "time points" as half of the new samples (even though I know that PCM is not time-aware). Anyone can try it by using Ferocious Resampler (link above) to upsample 16/44.1 to 24/88.2, or 16/88.2 for mobile. Theoretically even 96 kHz can be used for that purpose, but the results are not as elegant. Of course the upsampled version does not contain any sound above 20 kHz, but that does not matter.
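For readers unfamiliar with the "plain TPDF" dither mentioned above, a minimal sketch follows. It is not how SoX or Ferocious Resampler implement it; the function name and the float-in, integer-out interface are assumptions made for this example, and no noise shaping ("sloped" dither) is attempted.

```python
import math
import random


def tpdf_dither_to_16bit(samples, rng=None):
    """Quantize floats in [-1.0, 1.0) to 16-bit integers with plain TPDF dither."""
    rng = rng or random.Random()
    out = []
    for x in samples:
        scaled = x * 32768.0
        # Sum of two independent uniform values gives a triangular PDF
        # spanning +/-1 LSB, the usual choice for plain TPDF dither.
        dither = (rng.random() - 0.5) + (rng.random() - 0.5)
        quantized = math.floor(scaled + dither + 0.5)
        # Clamp to the signed 16-bit range.
        out.append(max(-32768, min(32767, quantized)))
    return out


print(tpdf_dither_to_16bit([0.0, 0.25, -0.5], random.Random(1)))
```

The dither randomises the rounding decision so the quantization error behaves like steady low-level noise rather than distortion correlated with the signal.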
BigGuy 6 April 2017 at 07:28
Wow! Based on the learned comments regarding the efficacy of reproducing ultrasonics, I feel like I have been living in a cave...without cold fusion. :-D Not sure why but most of the articles I came across in googling on the topic over the past year were more positive than both this Musing and the subsequent comments. Appreciate the links to additional white papers on the topic. Glad I only spent $75 on the BatPure modules! ;-) FWIW, my speakers are Martin Logan CLX Art e:stats which are spec'd to 23kHz. All of my system components have Frequency Response well into ultrasonic range making them more than capable of reproducing 20Hz-20kHz and beyond assuming there is anything of value.
jmsent 7 April 2017 at 19:41
One other point regarding reproduction of supersonic frequencies by loudspeaker systems: It's no secret that a driver's dispersion narrows substantially with increasing frequency. One only needs to look at the lateral response family charts on JA's Stereophile speaker measurements to see what I mean. When you get up into these ranges, most tweeters, even those designed for ultrasonic reproduction, will produce only rather narrow beams of energy. It's simply a function of the frequencies radiated vs the radiating area of the diaphragm. The upshot is that the speakers have to be carefully aimed directly at the listener, and the "sweet spot" where the ears will be in the path of the ultrasonics coming from both speakers will be quite narrow.
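The relationship between frequency and radiating area can be illustrated with the dimensionless quantity ka (wavenumber times diaphragm radius). This sketch treats the tweeter as an ideal rigid circular piston, which real domes are not, and the 25mm diameter is just an assumed example value. As a loose rule of thumb, a piston starts to beam once ka climbs well above 1.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # in air at roughly 20 degrees C


def wavelength_mm(frequency_hz: float) -> float:
    """Wavelength in air, in millimetres."""
    return SPEED_OF_SOUND_M_S / frequency_hz * 1000


def ka(frequency_hz: float, diaphragm_diameter_mm: float) -> float:
    """Wavenumber times radius for a circular piston of the given diameter."""
    wavenumber = 2 * math.pi * frequency_hz / SPEED_OF_SOUND_M_S
    radius_m = diaphragm_diameter_mm / 2000
    return wavenumber * radius_m


for f in (5_000, 20_000, 40_000):
    print(f"{f} Hz: wavelength {wavelength_mm(f):.1f} mm, "
          f"ka = {ka(f, 25):.2f} for a 25 mm dome")
```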
solderdude 9 April 2017 at 02:12
Thank you for the wonderful spam !
I will certainly click your link now and make use of everything you have to offer.

solderdude 15 April 2017 at 00:19
An interesting experiment... not blind and listening out for ultrasonic sound using a CDP which, by definition, has no signal above 21kHz to begin with.
In those days it would have been better to have used vinyl instead with a cart exceeding 30kHz, a pre-amp that did not roll-off >20kHz and records with (known) recorded ultrasound.

The modifications may have made way for aliasing, adding unwanted signals in the audible range.
This may not have resulted in the sound having more 'air' though.

Curiously enough you felt you experienced more 'air'. Could be expectation bias (knowing the tweeter was on) but possibly the horn tweeter, which may have had a 3 dB or more higher efficiency, may well have given more SPL in the upper audible range than the tweeter in your speaker did.
Most likely you just may have heard a rise in the upper part of the frequency band.
Unknown 9 May 2017 at 07:27
Thanks for the wonderful analysis. I think an important issue that is missed in the argument is that ultrasonic harmonics interact in a larger continuum of harmonics and can create new effects within the audible range. I think that the conventional model of psychoacoustic testing with sine waves keeps this aspect from the forefront of the issue.

With regard to increasing sample rates, the audible difference would become more prominent if the frequency bandwidth of each component in the sound recording/reinforcement process allowed ultrasonic harmonics to pass through without being truncated. What we have now is merely increasing samples per second of the lowest resolution (frequency bandwidth) component in the recording system.

For me, a great example of what is missing in the argument is demonstrated by listening to a gong player live, then comparing it to a gong recording... striking difference!
raoultrifan 9 May 2017 at 21:55
Hi all and many thanks for such a good read!

In past years I have read on many forums about the possibility that some people can actually feel some harmonics above 20 kHz, though I'm not sure how this could change how the music is perceived. Perhaps it would be great to have studio master recordings able to catch absolutely every sound, no matter its frequency. This would most likely make everyone happy, I'm pretty sure about it. :)

Now, getting back to our home audio equipment: having an entire system able to correctly reproduce the whole frequency spectrum from 1 Hz to 100 kHz would probably be a little bit dangerous. I am imagining the "perfect subwoofer" powerfully playing a few Hz, then my chest and walls vibrating a lot and me thinking it's an earthquake. Also, if a bad recording introduces ultrasonic noise, then nobody will be able to predict how the DAC/amplifier/speaker combo will reproduce it. I would expect either oscillations from the amplifier, or the speakers' filters failing, or the speakers' drivers being unable to correctly reproduce the music's full dynamics, especially if the ultrasonic noise has a high amplitude (I don't even want to think about the price of speakers able to correctly reproduce up to 50 kHz or even more).

Also, AFAIK, low-pass and high-pass filters are always included inside amplifiers: to avoid very long woofer excursion when very low frequencies occur (there's no reason to let 10 Hz get to the speakers as long as the driver/cabinet resonance frequency is much higher), but also to suppress very high frequencies at the amplifier's output to prevent oscillations. I have read on audio forums about people realising that their amplifiers were oscillating a lot, but they didn't care, because they really liked the output sound very much. :)

I believe Frans can explain much better why it is so important for audio amplifiers to have a properly designed low-pass filter (and perhaps a high-pass filter too, though this could be done with ease by choosing the proper cap values in the input stage).

Best,
Raul.
solderdude 10 May 2017 at 09:00
The reason why many amps roll off in the lowest frequencies is often that so-called 'coupling caps' are in the signal path.

Another reason (for DC-coupled amps, which can even amplify DC voltages) could be the presence of a DC servo. This circuit may be needed to ensure speakers do not receive DC, which is potentially harmful to woofers. That circuit has a roll-off point which dictates the bottom part of the frequency response, more often than not well below 1Hz. The point of such a design would be NOT to have coupling caps in the signal path.

Older amplifiers often had (have) subsonic filters. This was essential for playing vinyl records, where unwanted (and highly amplified) subsonic frequencies could make your speakers sway back and forth without you knowing. This could lead to overheated woofers or distortion.
That's why the subsonic filter is/was there.
So IF you play vinyl, a subsonic filter may be a good thing to have,
possibly built into the RIAA pre-amp.
For CD and other digital sources this is not needed.

Another reason for having a roll-off in the bass is to prevent possible DC voltages from certain sources from being amplified when the amplifier circuit itself would be able to amplify them. In those cases a coupling cap is present to remove possible DC output voltages, which can easily and silently burn out woofers when not detected.
When no coupling caps are desired, a DC servo is another solution.

High frequencies are often limited solely by the design of the amplifier.
Certainly when output transformers (think tube amps) are used.
Amps do not start to oscillate when high frequencies are fed into them.
BUT they certainly can amplify them without you knowing/realising it.

There is no reason why designs wouldn't be able to reach 1MHz. This SHOULD not be a problem for the amp nor speakers.

Still, some amplifiers have low-pass filters on their input. These are usually there to prevent very high frequencies from 'saturating' the amplifier circuit when feedback is used in the amp. The dreaded TIM.
When squarewaves are applied this could cause problems in the amp.
Fortunately they do NOT exist in music... unless you use unfiltered NOS DACs, in which case they can.

When unfiltered NOS DACs are used, large amounts of hypersonic energy may be present, leading to IM distortion and/or blown tweeters.
In some hi-res recordings, high amounts of 'tape bias' signal or other HF signals could be present that one cannot hear but that could potentially damage tweeters; even 'Boucherot filters' in amps could go up in smoke.

Why is having a wide bandwidth still important for amps and (less so) speakers/headphones but not for digital audio files?
If one wants to reproduce 20Hz to 20kHz FLAT (meaning well within 0.5dB), the bandwidth, which is usually given at -3dB, needs to extend from a few Hz to about 100kHz to stay within -0.5dB between 20Hz and 20kHz.
Also, to ensure correct phase response for the lowest frequencies and for instruments' harmonics, the phase should not shift too much at these extremes.
For this, too, roll-off points should be chosen well above and below the audible limits.
That needs to be this way in order to accurately reproduce a file which is itself limited from 10Hz to 21kHz.
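The relation between a -3dB bandwidth figure and the droop at the band edges can be sketched as follows. This assumes a single first-order (6dB/octave) roll-off at each end, which is a simplification not stated in the comment; real amplifiers and transducers with several cascaded roll-offs need more margin, which fits the rounder "about 100kHz" figure above.

```python
import math


def lowpass_corner_for_droop(edge_hz: float, droop_db: float) -> float:
    """-3dB corner of a first-order low-pass that is droop_db down at edge_hz."""
    return edge_hz / math.sqrt(10 ** (droop_db / 10) - 1)


def highpass_corner_for_droop(edge_hz: float, droop_db: float) -> float:
    """-3dB corner of a first-order high-pass that is droop_db down at edge_hz."""
    return edge_hz * math.sqrt(10 ** (droop_db / 10) - 1)


upper = lowpass_corner_for_droop(20_000, 0.5)
lower = highpass_corner_for_droop(20, 0.5)
print(f"-0.5dB at 20kHz needs a -3dB point near {upper / 1000:.1f} kHz")
print(f"-0.5dB at 20Hz needs a -3dB point near {lower:.1f} Hz")
```

Under this single-pole assumption the corners come out at roughly 57kHz and 7Hz, so even the simplest case needs a -3dB bandwidth nearly three times wider than the audio band at each end.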
Archimago 10 May 2017 at 22:45
Good discussion guys.

Mytubbie - welcome! Interesting experiment with NOS and the on/off switch to the high frequency horn. I agree that there is subjectively an "air" to the effect from ultrasonic noise. Another way to "hear" this is with PCM to DSD64 conversion such as in JRiver to a DSD DAC. I can appreciate if some like this effect even though the "sound" isn't part of the original intended signal!
Unknown 14 May 2017 at 11:57
Thanks for this fascinating discussion! So many factors in play. I'm coming from a perceptual (composer/musician) and medical background. Do you mean by 'liveness', the simulation of immersive spatial qualities of sound?

What I am thinking about with regard to the passing through of ultrasonic harmonic interactions, is related to spatial qualities and harmonic/phase relationships which can be lost or truncated anywhere in the (<20KHz) recording/reproduction process (chain of hardware, software encoders, compression formats, amps, speakers, etc). Optimally, these interactions can create a psychoacoustic gestalt in which the perceptual experience is greater than the sum of isolated parts (harmonics, phase relationships).

I also wonder about the limitations in accuracy of current digital recording formats, to encode and render the inharmonics (non integer multiple harmonics), and aperiodic waveforms of the live gong sound.
vintologi.se 15 July 2017 at 12:08
I did a simple 44.1 vs 96 blind test earlier and I preferred the 44.1 kS/s version 30 out of 43 times; in all other blind tests the result was the opposite, but then I never got good statistics.

My system is really cheap (Dynavoice DM5 + T-amp), so I cannot rule out IMD products. I only noticed a difference in very specific types of music, very strange.
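For anyone wanting to check what "good statistics" would look like for a result such as 30 out of 43, here is a sketch of an exact one-sided binomial test against pure guessing. The function name is made up for this example, and the test says nothing about why a preference appeared (IMD in the playback chain would produce the same numbers) or about the other, unrecorded test runs.

```python
from math import comb


def binomial_tail_p(successes: int, trials: int, p_chance: float = 0.5) -> float:
    """Probability of at least `successes` hits in `trials` by chance alone."""
    return sum(
        comb(trials, k) * p_chance ** k * (1 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


p_value = binomial_tail_p(30, 43)
print(f"One-sided p-value for 30/43: {p_value:.4f}")
```

For this single run the one-sided p-value comes out under 0.01, so 30/43 on its own would be unlikely from coin-flipping; the contradictory results from the other runs are what make the overall picture inconclusive.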