Archimago's Musings: MUSINGS / ANALYSIS: Is there any value to 176.4 and 192kHz Hi-Res audio files? A practical evaluation...

Saturday, 4 June 2016

Check out this article from 1998 written by the founder of Earthworks, David E. Blackmer (1927-2002).

Although I believe some of the contents in the article above are debatable, in the years since 1998, high-resolution, high-samplerate audio has of course become an everyday reality for audiophiles. As I expressed years ago, I do like the 2xCD samplerates like 88.2 and 96kHz. But after realizing that 176.4 and 192kHz songs were not being streamed properly by my Logitech Media Server with the BrutefirDRC set-up described a few months back, I started asking myself: what would we be missing if these albums were downsampled to 88.2 and 96kHz?

Put another way, we could ask "Is there something musical about the highest octave in these 4x samplerate files?" This highest octave spans 44.1kHz to 88.2kHz in 176.4kHz files, and 48kHz to 96kHz in 192kHz files.
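As a small worked example of where those numbers come from: a sample rate can represent content up to half its value (the Nyquist frequency), and the top octave is the upper half of that range. The function name below is made up for illustration.

```python
def top_octave_hz(sample_rate_hz):
    """Return (low, high) edges in Hz of the highest octave a sample rate can hold."""
    nyquist = sample_rate_hz / 2.0   # highest representable frequency
    return nyquist / 2.0, nyquist    # one octave down from Nyquist, up to Nyquist


for fs in (176_400, 192_000):
    low, high = top_octave_hz(fs)
    print(f"{fs} Hz: top octave spans {low / 1000:.1f} kHz to {high / 1000:.1f} kHz")
```

Halving the sample rate discards exactly this band and nothing below it.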

Now ideally, if a file contains no audio information in the highest frequencies, we should see a very clean - low and flat - noise floor... Something like this:

Frequency FFT

Spectral Frequency Display

This is the track "Grande Sonate Op.25 I-Andante Largo" by Ricardo Gallén from the album Fernando Sor: Guitar Sonatas which was recorded in DSD256 and converted to a 24/176 PCM file. (You can get this album from NativeDSD.com.) If you watch in realtime the frequency FFT as the song plays, you'll notice that the music is contained up to about 30kHz; that is, the noise essentially stays flat from 30kHz to 88kHz on that graph. This can be easily seen on the "Spectral Frequency Display" in Adobe Audition where I also placed a line at 44kHz: if we were to downsample by half, the natural target would be an 88.2kHz samplerate, which can hold content only up to about 44kHz. As you can see, there's basically no musical content at all in that highest octave.

Another example of a relatively clean noise floor would be something like this:

Frequency FFT

Spectral Frequency Display

This one is from 2L records, the beautifully recorded vocal track "Nato Canunt Omnia" from Psallat Ecclesia by Schola Solensis downloaded as 24/192 (itself downsampled from an original DXD 24/352). While the noise does creep higher from about 60kHz up, it remains low overall. Just like the track above however, if we watch the frequency spectrum in realtime during playback, we see that there is no apparent musical information above 40kHz or so; it's just noise which stays essentially static. Again, beyond the 2x samplerate of 96kHz (48kHz line), there's nothing in the uppermost octave that appears to represent any musical or ambient information.

More often than not however, you see files which are contaminated with noise peaks probably picked up in the analogue recording chain. Something like this:

Frequency FFT

Spectral Frequency Display

That was taken from the Linn 24/192 "Studio Master" release of Carol Kidd's album Tell Me Once Again (the track was "Moon River"). It sounds fine on playback of course but you can see that the ultrasonic spectrum is contaminated with peaks up to -68dB or so especially around the 60-70kHz regions; easily seen as a band across the Spectral Frequency Display. If you consider however that there does not appear to be anything but noise above 30kHz, one certainly should start wondering if it might not be best to just filter out all that extra crud since there is potential that intermodulation distortion could affect the audible frequencies in playback.
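For readers who would rather put a number on "is there anything in the top octave?" than eyeball a spectrum, here is a minimal sketch. It is not how the graphs above were made (those came from an audio editor); it is a plain-Python illustration that windows one frame of samples, takes a naive DFT, and reports how much of the frame's energy sits between fs/4 and fs/2. The function names are hypothetical, and a real tool would use an FFT library (for example `numpy.fft.rfft`) and average many frames.

```python
import cmath
import math


def top_octave_share_db(samples):
    """Energy in the top octave (fs/4 to fs/2) relative to total energy, in dB."""
    n = len(samples)
    # Hann window to limit spectral leakage from strong low-frequency content.
    windowed = [s * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    split_bin = n // 4                  # DFT bin that sits at fs/4
    below = top = 0.0
    for k in range(1, n // 2):          # skip DC, stop just below Nyquist
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        power = abs(acc) ** 2
        if k < split_bin:
            below += power
        else:
            top += power
    total = below + top
    if total == 0.0 or top == 0.0:
        return float("-inf")
    return 10.0 * math.log10(top / total)


def tone(freq_hz, sample_rate_hz, n):
    """Synthetic test signal: a unit-amplitude sine wave."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]


fs = 192_000
print(top_octave_share_db(tone(1_000, fs, 512)))    # audible tone: strongly negative
print(top_octave_share_db(tone(60_000, fs, 512)))   # ultrasonic tone: close to 0 dB
```

Note that a high share does not by itself mean "music": a steady noise peak like the 60-70kHz band in the Carol Kidd track would also raise it, which is why watching whether the content moves with the music still matters.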

I started to look at just how many of these "very high-resolution" albums (that is, 176.4/192kHz albums) I had in my Logitech Media Server library. As expected, relatively few: only five 24/176 albums and another 35 24/192 albums out of about 8,000 total albums (0.5%), excluding singles and vinyl rips. Many of these I collected from 2008 to 2012 as high-resolution downloads off HDTracks, or they are my own rips of DVD-As purchased in the early 2000s. Remember that it was during that time frame when high-resolution audio started becoming available online and DVD-A Explorer came out for DVD-A ripping (~2008). Well, I loaded the first 14 albums into my audio editor and made notes about whether I thought there was musical content in the highest octave afforded by the higher sampling rate.

- Neil Young - American Stars 'N' Bars - 24/176 DVD-A rip was just an upsample, happily resampled to 88kHz
- Neil Young - On The Beach - 24/176 DVD-A rip was also just an upsample, brought down to 88kHz
- Neil Young - Archives Volume 1 - Blu-Ray 24/192 rips. I didn't go through every song but the ones I looked at all looked like 96kHz upsamples at best.
- Bill Evans Trio - Waltz For Debby - 2011 HDTracks 24/192. Easily resampled down to 48kHz and dithered to 16 bits without fear of losing anything. Old recordings with a high noise floor do not benefit from 24 bits.
- Bob Marley - Legend: The Best of Bob Marley & The Wailers - 2012 HDTracks 24/192 download, no content worth keeping so downsampled to 24/96.
- Fleetwood Mac - Tango In The Night - 2011 HDTracks 24/192 download - clearly not worth more than 16/48 due to high noise floor and no high frequency content.
- Fleetwood Mac - Tusk - 2011 HDTracks 24/192 download - like above, no point keeping >16/48.
- Neil Young - Harvest - 2002 DVD-A 24/192 rip - upsampled from 96kHz.
- Carly Simon - No Secrets - DVD-A 24/192 rip - awful DR9 remaster, just get the DR13 first press CD or maybe Audio Fidelity remaster I see is available.
- The Eagles - Hotel California - 2001 DVD-A rip - feel free to go down to 24/96.
- Cat Stevens - Tea For The Tillerman - 2012 HDTracks 24/192, happily downsample to 96kHz.
- Steely Dan - Everything Must Go - DVD-A 24/192 rip. Basically a 96kHz upsample.
- Muddy Waters - Folk Singer - 1999 Classic Records HDAD 24/192 - lovely recording but just noise >25kHz. In fact 16-bits would also be enough for a recording of this vintage with high noise floor.
- Antonio Forcione & Sabina Sciubba - Meet Me In London - 2011 Naim 24/192. Sounds great but even an "audiophile" release like this has a high noise floor and nothing but noise beyond 48kHz as well as some moderately strong high frequency tape bias signals.

... and so it goes; that was just the first 14. Even classical and acoustic music from 2L and Linn, which the graphs above suggest are genuine high-resolution captures at 192kHz, could easily be downsampled to 96kHz, and I don't think an audiophile should be concerned about content loss. I know that we often joke on forum posts that maybe these very high resolution audio files would be preferred by cats and dogs, but honestly, I suspect one's dog and cat would prefer a downsample taking out all that noise and the high-pitched continuous tones in the highest octave found in many of these albums (like the Carol Kidd tune demonstrated above)!

As I went through what I had, the only time I considered that 192kHz might capture more musical information was with Trondheimsolistene's In Folk Style 24/192 Blu-Ray rip, where the track "Grieg Two Nordic Melodies Op. 63: II. Kulokk and Stabbelaten (Cow Call and Peasant Dance)" reaches up to 48kHz and maybe just barely surpasses it. Obviously, any information going above 48kHz is at very low levels and unlikely to be audible given human hearing limitations, so it's all rather academic and perfectionistic.

Years ago, Monty at Xiph.org already laid out the technical case against 24/192. Some excellent points there, but even just looking at the realtime spectrum analyser while playing the music, one gets the sense that there's just nothing there to even bother with.

Finally, if you look at how we make recordings, it gets down to the basic fact that there are not many microphones with the capability for extended frequency response. For example, the recent Sennheiser MKH8000-series models can go up to 60kHz, but more typically to around 50kHz. And as noted by Demian Martin in the comments of a previous post, we need to be careful about drop-off when recording off-axis (he gives the example of the B&K 4133 and off-axis drop-off based on the data sheet page 7). Even for Earthworks' own demo material to show off their microphones' high frequency response, the wav audio download is presented as 16/44 files with obviously no content above 22kHz. Are they suggesting that higher microphone frequency response might improve tonality in the audible range?! If we look at the specs for a typical list of "recommended" studio microphones, the vast majority are rated to 15-20kHz; the bottom line is that there's not much high frequency material being captured even if one were to argue that some instruments are capable of strong ultrasonic harmonics. Sure, there could be "life above 20kHz" based on a CalTech paper (you can still find it on Google cache), but how many studios actually capture this, and at what fidelity? What I have seen suggests that there's little being retained, whether by choice or from the limitations of the microphones used.

[On a side note, of course we do have microphone technologies that can record ultrasonics quite well. For example, in the biosciences, a number of research papers on animal vocalizations, such as this paper on rat communication, use the Avisoft-UltraSoundGate CM16, which is capable of up to 200kHz within the limitations of the polar diagram. Great for bird chirps, rats, echolocating bats, and understanding porpoises; not quite necessary for music recordings as far as I can tell :-).]

If you're wondering what settings I use to downsample from 192kHz to 96kHz, the obvious answer is that it really doesn't matter with modern sampling rate converters since accuracy would be excellent. My favourite program for this is iZotope RX using a relatively gentle linear-phase filter (compared to the typically steep settings for 44kHz, where "Filter steepness" would be around 30 to keep frequencies intact to 20kHz) beginning just beyond 40kHz and essentially free of aliasing.

Sounds great and works for me... Of course I could use an even gentler filter but I had this romantic notion that it would be nice to keep frequencies as flat as possible to 40kHz (two times the typical ideal top frequency response for human hearing at 20kHz) and to use linear phase setting ("Pre-ringing" at 1.00) to prevent phase shifting since we don't know if the DAC will further oversample with its own minimum phase algorithm at playback. Not that this likely matters being ultrasonic and all...
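To make the idea concrete, here is a minimal plain-Python sketch of a 2:1 downsample with a linear-phase low-pass filter. It is emphatically not iZotope RX's algorithm and does not reproduce its "Filter steepness" or "Pre-ringing" controls. It assumes a Blackman-windowed sinc with 255 taps and a 44kHz cutoff (both arbitrary choices for illustration), which should stay roughly flat to a little past 40kHz and be well into its stopband before 48kHz. The function names are hypothetical.

```python
import math


def lowpass_taps(num_taps, cutoff_hz, sample_rate_hz):
    """Symmetric (hence linear-phase) windowed-sinc low-pass taps, Blackman window."""
    fc = cutoff_hz / sample_rate_hz          # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for i in range(num_taps):
        t = i - mid
        ideal = 2.0 * fc if t == 0 else math.sin(2.0 * math.pi * fc * t) / (math.pi * t)
        window = (0.42
                  - 0.5 * math.cos(2.0 * math.pi * i / (num_taps - 1))
                  + 0.08 * math.cos(4.0 * math.pi * i / (num_taps - 1)))
        taps.append(ideal * window)
    gain = sum(taps)                         # normalise to unity gain at DC
    return [tap / gain for tap in taps]


def downsample_by_two(samples, sample_rate_hz, cutoff_hz=44_000.0, num_taps=255):
    """Low-pass filter, then keep every second sample (e.g. 192kHz -> 96kHz)."""
    taps = lowpass_taps(num_taps, cutoff_hz, sample_rate_hz)
    half = num_taps // 2
    out = []
    for centre in range(0, len(samples), 2):  # only compute the samples we keep
        acc = 0.0
        for j, tap in enumerate(taps):
            idx = centre + j - half           # filter centred on the output sample
            if 0 <= idx < len(samples):       # treat audio outside the file as silence
                acc += tap * samples[idx]
        out.append(acc)
    return out


fs = 192_000
audible = [math.sin(2.0 * math.pi * 1_000 * i / fs) for i in range(2_000)]
ultrasonic = [math.sin(2.0 * math.pi * 70_000 * i / fs) for i in range(2_000)]
print(max(abs(x) for x in downsample_by_two(audible, fs)[100:900]))     # stays near 1.0
print(max(abs(x) for x in downsample_by_two(ultrasonic, fs)[100:900]))  # close to zero
```

Because the taps are symmetric and the filter is centred on each output sample, every frequency is delayed equally (here, not at all), which is the "linear phase" property; the price is symmetric pre- and post-ringing around transients, at the ultrasonic cutoff frequency.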

In summary: Yes, there is "life above 20kHz". But I see little if any sign of life above 44/48kHz (or 88/96kHz samplerate) in recordings out there. IMO, this is further justification for the music industry, if it chooses to be serious about providing truly high-resolution offerings, to consider "keeping it simple" and standardize on 24/96 masterings.

In the comments, let me know what albums you've found to contain recorded harmonic frequencies benefiting from 176 and 192kHz samplerates... I'd love to have a list for demo and assessment purposes!

-----------------------

Before closing off, remember that in this post I'm only reanalyzing my 176.4/192kHz files. In truth, there are many 88.2/96kHz files out there that could probably just as well be downsampled to 44.1/48kHz simply due to lack of actual content. I've many times come across 96kHz files, for example, that look like they may have been put through some kind of analogue mixer but whose underlying musical content originated at 44.1/48kHz. Since these contain nothing above that beyond noise picked up in the ultrasonic range, and unless one wants to argue that said ultrasonic noise has beneficial and audible effects, I'd be inclined to downsample these as well. This is of course the natural outcome when the music industry cannot standardize on a technically quality-controlled product for what constitutes "high resolution". Also, I believe many audiophiles are tempted to buy the "high resolution" file simply because of the "bigger is better" mindset in the absence of objective standards and analysis (is it any surprise that websites like HDTracks don't allow user review comments for folks to discuss the value of these downloads?).

Finally, remember that I'm not at all saying anything negative about the value of higher technical capabilities like 24-bits or >88/96kHz sample rates. There is of course a time and place where 24-bit resolution is essential (eg. in the studio for dynamic range overhead during production) and where high sample rates are used to optimize mastering quality and archiving. It is important, however, to remember that the needs of the studio do not necessarily apply to consumer home playback of the final product. Sure, there might be some psychological satisfaction in owning a "studio master" better than CD-resolution regardless of the technical standard of the content. I look around my collection and see albums like Miles Davis' Kind Of Blue in 24/88 and 24/96, or a 24/96 copy of Harry Belafonte's Belafonte At Carnegie Hall; both are from 1959, and clearly neither has a noise floor low enough to require more than 16 bits, nor frequency extension beyond what a 48kHz samplerate can hold at the very most. I also see Bob Dylan's Highway 61 Revisited (2015 MFSL SACD) in 24/88 from 1965 - again 16/48 would have been enough. But due to the "classic" status of albums like these, I'm OK with keeping them in a bigger bit "container"... Remember though that it's one thing to accept oneself as human and have idiosyncratic "values" but I would certainly not argue that these albums would sound any worse downsampled to 16/44! This is an example of what it means for me to be "more objective"; maintaining a foundation of decision-making and understanding based on the science, but at the same time allowing oneself a license to take joy in one's own subjective psychological idiosyncrasies.

Have a great week ahead. And of course I hope you're all enjoying the music.

Addendum (06-10-2016): Based on the discussion below with sk16 about Steven Wilson's production, a reader sent me some information about the recent album Hand. Cannot. Erase. (HDTracks 24/96, 2015).

DR11 is quite good for a modern album of course. The lower graphs are for the HDTracks 24/96 download of the first song "First Regret". As you can see there's nothing beyond 29kHz demonstrated on the FFT and Spectral Frequency Display. Based on what I see on playback, I believe resampling and dithering down to 16/48 would have been just fine while saving a good amount of storage space.
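Since "dithering down to 16 bits" comes up several times in this post, here is a minimal sketch of what that step does. It assumes floating-point samples in the range -1.0 to 1.0 and plain TPDF (triangular) dither of ±1 LSB peak with no noise shaping; commercial converters offer more elaborate dither options, and the function name is hypothetical.

```python
import random


def to_16bit_with_tpdf_dither(samples, rng=None):
    """Quantize floats in [-1.0, 1.0] to 16-bit integers with TPDF dither."""
    rng = rng or random.Random()
    out = []
    for s in samples:
        scaled = s * 32768.0                      # full scale in 16-bit LSB units
        dither = rng.random() - rng.random()      # triangular PDF between -1 and +1 LSB
        q = round(scaled + dither)
        out.append(max(-32768, min(32767, q)))    # clip to the 16-bit range
    return out


# A steady signal of only a quarter of one 16-bit step:
quiet = [0.25 / 32768.0] * 20_000
dithered = to_16bit_with_tpdf_dither(quiet, random.Random(0))
print(round(0.25))                     # plain rounding loses the signal entirely
print(sum(dithered) / len(dithered))   # with dither the average stays close to 0.25
```

The dither trades the signal-correlated rounding error for a small amount of constant noise, which is why detail below one 16-bit step survives on average. On recordings whose own noise floor is already well above the 16-bit level, that added noise is buried.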

52 comments:

Mans R, 4 June 2016 at 12:20
Cue the ignorant "it's not about frequencies, it's about improved time-domain response" comments.

Unknown, 4 June 2016 at 12:41
I have an album by Papa Bue’s Viking Jazz Band in 192kHz that has a lot of sound above 48kHz, especially when the trumpet is played.

Anonymous, 5 June 2016 at 06:51
I don't know why record labels don't use software to create plausible higher frequency components in their higher res versions so that people can't tell they're upsampled, or to give the impression that their studio has used exemplary techniques to capture the ultrasonics.

Sombody, 5 June 2016 at 11:24
What about time domain?

Say 44.1k vs 88.2k, same bitdepth, you have 2x the info in time domain, less error for digital filter downstream.

Btw, Shannon sampling theory dictates 2x minimal to reproduce band limited signal. However, the more the better for less distortion. Just another two cents.

Blumlein 88, 6 June 2016 at 23:57
My comment seems to have been lost.

Anyway, I have taken high sample rate files and done a steep filter removing everything below 20 kHz. The reverse of the normal brickwall. That leaves only the ultrasonic material. I then slow it down to 25% of normal speed for playback. This is almost like upping your hearing to 80 kHz. Even when you do this, there just isn't much up there to hear. It isn't loud at all. To think this, masked by the music, and well above your hearing will make a big difference becomes a clearly ludicrous idea at that point.

sk16, 7 June 2016 at 07:42
Arch,
Great article as usual. Well done.
Have you ever analyzed any of Steven Wilson's recordings (solo and Porcupine Tree)? I regard Wilson as the gold standard of modern rock recording; I think others may as well. He has released on bluray at 5.1 24/96 for some years now. The two most recent releases, The Raven Who Refused to Sing and Hand.Cannot.Erase sound really, really good. Wondering if you have measured them?
Wilson is also noteworthy for remastering Aqualung, King Crimson, Yes, XTC, and Tears for Fears in 5.1 24/96.
Highly recommended.
Best wishes...

Unknown, 10 June 2016 at 13:20
"It is important to however remember that the needs of the studio does not necessarily apply to the consumer home playback of the final product".
Well, that is only true for people running a NOS dac which surely isn't done by "the consumer" in any significant numbers. What you're saying is just brick wall (or gentler) the original or upsampled material, then feed it to a dac that upsamples itself. This is completely the opposite of what you'd be doing if you want less junk in and less fooling around with your data. So this, IMHO, is more psychologically biased than not downsampling at all.

Unknown, 10 June 2016 at 15:27
Well, for mobile use those 192k files are indeed big. The point I'm trying to make is that all dac chips convert all 44k1 to at least 8 times that much. Now if you have a NOS dac, that wouldn't be true of course, but not many have. Apart from the discussion of whether someone's audio chain is helped or not (in the NOS case) by higher bit rates, it most certainly doesn't help your dac chip to feed it with modified (lower sample rate) data if what it does itself is always upsample (to at least 352.8KHz). I presume everyone here knows how dac chips work and that the first stage upsample process in a dac chip is, generally, the most important one. If not: usually the dac chip is either ladder or sigma delta. All of them work at way higher sample rates than what you feed them (44k1 and such, but also 192KHz is considered a too low native bit rate). So what a dac chip does is upsample 44k1 8 times. Most of the time it does this in steps. Like input frequency times 2, then one more time times 2 and the 3rd stage also 2. So you have an 8 times oversampling dac in the case of a ladder dac. A sigma delta dac is different, but let's just suffice with saying that 8 times is more likely to be 64 or even higher. What you do by removing the high sampling frequency is make the dac chip work harder to get to that same sampling frequency. So it has to work harder (=more errors because of internal precision), and you add errors because of the downsample process. This doesn't make sense to me. Sorry for ranting and it's not my native language, nor am I a dac chip designer, so do take this with a small grain of salt: things might be slightly different in real life. The theory stands though!

Unknown, 10 June 2016 at 15:40
To further cast doubt: the principle of sampling theory is that the length of time a sample is being taken of a signal must be very (almost infinitely) small. What happens if I sample at 44k1 and I feed it a 1KHz signal? It distorts. That's why a lot of dac manufacturers state their harmonic distortion plots at an integer division of the sample frequency. As in: Fs = 44.1KHz -> test signals = 1.027Hz (divided by 44). This helps understanding the fact that the higher the sample rate, the better the sampling theory works out. There are also hardly any oscilloscope manufacturers that sell oscilloscopes that comply to the bare minimum of the Shannon theory. They sample at least 3 to 5 times higher. It's being done because it's really hard to filter everything above half the sampling frequency with an analog filter that is ridiculously steep. Easier to do it less steep and up the sampling frequency. At least: nowadays that's easy.

sk16, 10 June 2016 at 22:00
Re Wilson...yes the Porcupine Tree releases were 24/48. I believe the first 24/96 product was the live Anesthetize bluray.
The first PT DVD-A releases were engineered by Eliot Scheiner. Wilson watched and decided he could do it better.
Grace for Drowning was the first 24/96 product. I think he did his research and realized 96k was what was required as you write above.
Good listening.

Honza, 10 June 2016 at 23:53
Well, in the past I was also kind of thinking that more could be better. Last year I optimized my audio chains a little bit (although I have not much money to spend) and have to say that anything above 48k is overkill concerning audible frequencies. We can store our recordings at 96k to be safe, but for common usage 48k/44.1 is completely OK. The case with 16-24 bit is different, since theoretically the dynamic range could be perceived and moreover 16bit requires dither, which is an artificial element, although inaudible under common conditions. 176-192 kHz is totally pointless, though.

Honza, 11 June 2016 at 00:00
And one more thing. Whether the DAC oversamples or not is a matter of the DAC designer's decision and they for sure can do that. But this does not say anything about the storage/playback format, which should be fed to any compliant DAC available. And there, 48/44.1 is completely OK.

Honza, 11 June 2016 at 00:08
And second more thing :), by expressing opinion to 16-24 bit I do not mean that 16 bit cannot be used well, actually the difference to 24 bit would be inaudible in 99 % tracks. But the bit rate could be more important than expanding frequency.

Generally I would recommend to do and store studio recordings at 24/96 or 24/48; if space is not a concern use 24/48 or 24/44.1 for playback; if space matters or CD is desired then with peace of mind use Red Book 16/44.1. In either case our ears will not miss much, if anything :-]

Unknown, 16 June 2016 at 17:13
On the subject of dither, I did a little piece last year for a hifi magazine here in Australia demonstrating its effect by downsampling a short piece of music from 16 bits to 8 bits, undithered, dithered with vanilla noise, and dithered with shaped noise. Far easier to hear what's going on than when downsampling from 24 to 16. Piece is here: http://hifi-writer.com/wpblog/?page_id=4544

Unknown, 16 June 2016 at 17:16
Whoops, should have added my name since I don't think I have a compatible "Comment As" profile (and I hate anonymous commenting): Stephen Dawson

StevenS, 30 June 2016 at 16:13
I guarantee you this will be the next 'talking point' from the hi-rez cheerleading squad:

"A Meta-Analysis of High Resolution Audio Perceptual Evaluation"

http://www.aes.org/e-lib/browse.cfm?elib=18296