Archimago's Musings: SUMMER MUSINGS: Post Hi-Res Audio. Why hi-res is often not for the best. Resampling, dithering and de-clipping.

Saturday, 25 July 2020

SUMMER MUSINGS: Post Hi-Res Audio. Why hi-res is often not for the best. Resampling, dithering and de-clipping.

It has been a good summer around here thus far with time off, working around the house, and of course time to enjoy the warm weather for a bit. Here in Vancouver, the late fall to early springs are typically dark and rainy so I'm happy to catch a few photons when I can :-).

With the pandemic, this will certainly be an unusual summer/year. Unless things change substantially, this will be the first year in 2 decades that I won't be traveling off the continent for vacation or work - heck, I can't even visit the USA at this point without at least 2 weeks of self-quarantine back in Canada. In that spirit of staying put and cleaning up this year, I thought I'd talk about a few related items that have been on my mind pertaining to my music library.

A few months ago, I wrote an article about remembering that for home audio, we must always think about the big picture triad of PRODUCTION - REPRODUCTION - PERCEPTION. So often, we in audiophilia spend disproportionate amounts of time on the hardware aspects of reproduction (what's new? how do these devices measure/perform?), or speak of subjective perceptual experiences. Much of that I think is a reflection of what our magazines, online sources, and forum topics revolve around. Maybe that's what the Industry also wants us to think/talk about. We so often forget that in fact, a huge amount of what we perceive - or can perceive - has already been "baked into the cake" from when favourite recordings left the studio (or perhaps specifically the mastering engineer's workstation).

The other day, I saw the brief summary from Mark Waldrep ("Researching HD-Audio: The Truth") about his blind test comparing hi-res vs. standard CD resolution conducted over the last year with hundreds of responses. Yet more evidence that what I suggested a little more than 6 years ago (how time flies!) at a time when the Industry was pushing for hi-res audio is true. "Hi-Res" audio is really of no value unless the production is top notch which is what Mark brought to the table with his research - tracks that were of true hi-res provenance. And even then, as Mark's research is suggesting, it's doubtful that the majority of listeners will be able to perceive a difference. Furthermore, remember that with listening tests, an audible difference is no guarantee that it sounds "better".

One disclosure I want to add is that interestingly, I was able to blind-test and identify correctly 5/5 of the tracks I downloaded from Mark in the blind test. I used my ASUS Essence One DAC (with NJR MUSES02 opamp upgrade) and combination of Sony MDR-V6 and Sennheiser HD800 headphones one evening. I was probably just lucky but even if I were to have faith that I could hear the difference, this was all at best subtle; nothing I would say provided more "joy" in the listening experience of the hi-res tracks.

No surprise, right? Over the years, research reports like Meyer & Moran (2007) questioned benefits although there are legitimate concerns about the recordings they chose. On this blog in 2014, the "24-Bit vs. 16-Bit Audio Blind Test" likewise showed no evidence of improved bit-depth having significant benefits among 140 respondents using hi-res music from 2L. In the past, I've discussed questions about fidelity and audibility of ultrasonics (consider for example air absorption). I've shown examples of hyped "hi-res" albums back in the day that in no way should be considered as such (like this). Even the "positive" results like J. Reiss' AES meta-analysis (2016) IMO have been unimpressive to suggest hi-res to be of benefit; those of you who work in academia will know that statistical significance does not automatically imply real-world significance or value (also discussed here).

As I have argued over the years, the "value" of hi-res audio to us as human listeners needs to be considered because there is a cost difference. Since I believe value should be reflected in price, I never thought that "hi-res" audio should cost much more (if anything) than the same CD-resolution version. My suggestion over the years was that music labels release hi-res audio as "advanced resolution" versions that used elevated mastering sensibilities, rather than putting exactly the same master into a bigger hi-res bit bucket.

Remember though that while it looks to me like hi-res music is of little value, this doesn't imply that we should not use high resolution-capable hardware. For example, I argued that it's good to have a hi-res DAC because we might want to do digital processing during playback. It's good to have a reproduction chain be capable of better resolution so we have latitude for the best possible upsampling/transcoding (eg. with something like HQPlayer), volume adjustment (eg. ReplayGain), EQ, and room correction.

I thought about the recent poll looking at music library size. It's good to see that most people are not as much of a hoarder as I am :-).

As expected, the available storage is skewed higher than the amount of audio data (not including backups as specified in the poll question). It looks like the vast majority of the >200 respondents (>82%) have <4TB of audio data. I think it'll be interesting to see how this need for storage might shift over the years as more audiophiles turn to online streaming services as their primary source of music instead of building a digital collection.

Within all that audio data we own, I wondered about the amount of hi-res audio people are accumulating. This is worth thinking about because 24/96 data will take up 3-4x more space than 16/44 CD-resolution FLAC data. Without downsampling (eg. from 96 --> 48kHz), just dithering from 24-bits to 16-bits, we're typically looking at a 50+% reduction in storage space because those lowest 8 bits likely just contain difficult to losslessly compress noise.
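To put rough numbers on that, here is a small sketch of the raw (uncompressed) PCM data rates. This is only arithmetic on bit depth and sample rate; actual FLAC sizes depend on how compressible the content is, which is why real-world savings from dropping the noisy bottom 8 bits can be larger than the raw figure below. The function name is just for illustration.

```python
def pcm_bitrate(bits, sample_rate_hz, channels=2):
    """Raw, uncompressed PCM data rate in bits per second."""
    return bits * sample_rate_hz * channels

cd_rate = pcm_bitrate(16, 44_100)       # CD resolution
hires_rate = pcm_bitrate(24, 96_000)    # typical "hi-res" download
sixteen_96 = pcm_bitrate(16, 96_000)    # same sample rate, 16-bit

# How much bigger is raw 24/96 than raw 16/44.1?
ratio_hires_vs_cd = hires_rate / cd_rate

# Raw saving from going 24 -> 16 bits without downsampling (before FLAC)
raw_saving_24_to_16 = 1 - sixteen_96 / hires_rate

print(f"24/96 vs 16/44.1: {ratio_hires_vs_cd:.2f}x")
print(f"24 -> 16 bits at 96kHz: {raw_saving_24_to_16:.0%} smaller (raw PCM)")
```

The raw ratio lands a bit above 3x, which is consistent with the 3-4x range once lossless compression is factored in.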

When I look at my hi-res audio folders, which include quite a number of 24-bit vinyl rips, I'm seeing almost 1.5TB of data. If I were to convert all that to standard resolution FLAC, this would easily take it down to 600GB. This number would have been even lower a few years back when I used to have 24/192 music (gradually almost completely weeded out). In fact, storage needs would be even less because I already have CD-resolution versions of many of the same albums, to be replaced only if the downsampled/dithered hi-res version is better mastered. The important thing is in spotting which version of the music is the best sounding by listening and perhaps comparing dynamic range data as appropriate.

So then, after years of making vinyl rips, buying hi-res content off HDTracks, exploring SACD and DVD-A rips, more recently extracting Blu-Ray hi-res content, here are some general principles I have come to follow before buying any more hi-res material.

1. As discussed 6 years back, do not bother getting hi-res versions of highly compressed audio. Over the years, looking at my own recordings of solo instrumental work, choirs and live bands, uncompressed acoustic music will have DR values typically from 13-16. I know the classic DR "crest factor" measurement needs to be evaluated in context and other means like Roon's R128 "Loudness Range" are available. However, the DR measurement is free, convenient, widely quoted and IMO good enough.

Generally I suggest that if you see a DR of less than 10, simply do not bother buying such an album in hi-res since it's highly likely that this music was manipulated to no longer have natural levels of dynamic contrasts. My personal threshold would be DR12 and above. Also, I highly recommend that you consult the on-line DR Database first to see if others have reported the specific album before taking out the credit card.

Remember that there is a place for compressed dynamic range music. If you're in your car, on a bus, or riding the subway, highly compressed music can help accentuate details in the recording to overcome the ambient noise level. The shift to mobile listening has been significant and I think this has perpetuated the use of low-DR mastering; IMO the majority of mobile and headphone listening situations simply take place in contexts that do not require hi-res.
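For those curious about what a "crest factor" style number is getting at in point 1, here is a minimal sketch of peak-to-RMS ratio in dB. To be clear, this is NOT the DR meter's actual algorithm (which works on the loudest portions of a track rather than the whole thing); it's a simplified illustration, and the test signals are synthetic sine waves I made up for the purpose.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB over the whole signal (simplified, not the DR meter)."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

# One second of a full-scale 1kHz sine at 48kHz
sine = [math.sin(2 * math.pi * 1000 * n / 48_000) for n in range(48_000)]

# The same sine crudely "limited" by hard-clipping at half scale
limited = [max(-0.5, min(0.5, s)) for s in sine]

cf_sine = crest_factor_db(sine)
cf_limited = crest_factor_db(limited)

print(f"sine:    {cf_sine:.2f} dB")
print(f"limited: {cf_limited:.2f} dB")
```

The limited version has a smaller peak-to-RMS ratio, which is the same direction of change we see in DR values when music is heavily peak-limited.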

2. Hi-res is about subtleties. Are there genres or types of music that really cannot benefit from those subtleties? Is there need in electronic dance music? Do we think that modern pop with all kinds of complex studio work, DSP effects, etc. will benefit from 24-bits and >44kHz? How about intentionally "lo-fi" and "grunge" music? When was the last time you heard live recordings with low ambient noise level?

With regards to that last question, I quite like the ambiance of live recordings and have a number here. However, there's really no point having Bruce Springsteen & The E Street Band's Live in New York as hi-res. Absolutely no need to get excited about Santana & McLaughlin's Live at Montreux 2011 in 24/96, or The Who's Live At The Royal Albert Hall hi-res. What's the point of keeping Leonard Cohen's Songs From The Road in 24/96 or Jackson Browne's Running On Empty DVD-A rip as 24/96? They're noisy whether stereo or 5.1 multichannel and typically there's not even evidence of "musical" content above 22kHz in those recordings.

Just like many SACD albums appear to be PCM upsamples, don't think that just because an album advertised as 24-bits really means anything in terms of actual quality. Many HDTracks downloads, DVD-A and Blu-Rays are from rather poor remastering jobs. Many are IMO worse than the respective CD's - for example, Jackson Browne's Running on Empty DVD-A was "Loudness War'ed" to DR9 when the original CD sounded more "live" on a good audio system with DR13.

Bottom line: Don't bother with genres like live "hi-res" recordings. Likewise, I have yet to see any synthpop, electronica, mainstream Top 40, etc. that looks or sounds like it could benefit from hi-res. Stick with acoustic music for your hi-res purchases.

3. Vinyl does not require 24-bit quantization. Remember this discussion from 2017 on vinyl resolution. Yes, LPs can have ultrasonic content over CD's limit of 22.05kHz (44.1kHz sampling). But again, by definition, ultrasonics are not audible. We'll talk more about vinyl and ultrasonic content below.

4. Was the music originally captured with hi-res equipment and produced with care? This implies very low noise microphones, recording devices, mixing and production chain components also capable of good >22kHz frequency response! Is the studio making the product capable of that? From what I have seen, there are only a few labels that consistently produced hi-res quality recordings; Reference Recordings (the recent Tchaikovsky No. 4 / Leshnoff Double Concerto is excellent), AIX, 2L, Channel Classics among the small group...

Analogue master tapes may approach a dynamic range of 80dB, typically much less, so until the arrival of digital recordings, there was no such thing as resolution higher than the 16 bits of a CD even if this gear captured ultrasonic content above 22.05kHz. Since we're talking about fidelity, beyond dynamic range and frequency response, let's not forget about the inaccuracies in the time domain such as wow and flutter in analogue playback/recording, which are way worse than the little bit of jitter many audiophiles fret about.
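As a quick sanity check on the "80dB fits inside 16 bits" point, here is a sketch using the textbook figure for an ideal quantizer with a full-scale sine wave (roughly 6.02 dB per bit plus 1.76 dB). This ignores dither and noise shaping, so treat it as a ballpark only; the function names are mine.

```python
import math

DB_PER_BIT = 20 * math.log10(2)      # about 6.02 dB
SINE_OFFSET = 10 * math.log10(1.5)   # about 1.76 dB

def ideal_dynamic_range_db(bits):
    """Theoretical SNR of an ideal N-bit quantizer for a full-scale sine."""
    return bits * DB_PER_BIT + SINE_OFFSET

def bits_needed(dynamic_range_db):
    """Smallest whole number of bits whose ideal range covers the given dB figure."""
    return math.ceil((dynamic_range_db - SINE_OFFSET) / DB_PER_BIT)

range_16 = ideal_dynamic_range_db(16)
range_24 = ideal_dynamic_range_db(24)
bits_for_tape = bits_needed(80)      # a very good analogue master tape

print(f"16-bit: {range_16:.1f} dB, 24-bit: {range_24:.1f} dB")
print(f"80 dB of dynamic range needs about {bits_for_tape} bits")
```

So even an optimistic 80dB analogue source sits comfortably within 16 bits.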

High quality digital recordings and playback gear were typically at best 16-bit resolution until the 1990's (sure there was research and early >16-bit work back in the late '80s). I suspect there was a gradual transition and some of the recordings from classical labels like Telarc by the mid-1990's represented early true hi-res quality even though SACD and DVD-A were not introduced yet.

On average, audiophiles are not a young demographic so the Industry has released many of the "classics" like The Who, Grateful Dead, Bob Dylan, Pink Floyd, CCR, CSN, Beach Boys, Bowie, Neil Young, etc. as hi-res for repurchasing. Needless to say, even more geriatric golden oldies pop like The Zombies or Herman's Hermits isn't even "hi-fi" to begin with, contain no ultrasonic content even related to the music, yet are available as SACDs as if "hi-res"!

Let's go one step further and examine some "data" from old recordings like these. Consider this FFT from the track "Ripple" (from the Grateful Dead's 1970 American Beauty album, one of my first hi-res purchases back in 2001 on DVD-Audio):

Notice the noise spikes at around 29kHz and 31.5kHz. Depending on the track on that album, we see variations of noise like this in the ultrasonic range. You see this kind of anomaly with many albums, in fact. Some of it could be intentional AC bias from the original tape recording; other times this could just be spurious noise from the production process. As such, unless we know an album was recorded/mixed/produced in a pristine hi-res way, do we actually "want" those anomalies to be on the recordings, much less be reproduced in playback? Should we not consider the possibility that hi-res albums like this might sound worse than if they were just downsampled to say 48kHz?!

Remember that while the noise above is ultrasonic, our equipment, particularly amplifiers and transducers (speaker, headphones) are not absolutely linear devices. As a result, anomalies like intermodulation distortions can show up in the audible range. Here's a demo to download and listen for yourself:

Archimago's Ultrasonic Intermodulation Demo

This is a 24/96 track which contains 2 primary frequencies - 26kHz + 28kHz, with modulation around these frequencies. Theoretically, you should not hear anything with a perfect DAC/amp/transducer system since it's all above 23kHz as per this FFT:

However, since our gear is unlikely to be perfect, you will likely hear intermodulation distortion through your speakers. Have a listen to the demo on your system with bit-perfect playback, making sure there hasn't been resampling or filtering, and turn up the volume slowly (WARNING: Start soft and be careful to turn the volume back down before playing normal music so you don't blow your speakers/headphones!!! I obviously cannot take responsibility for user error...).

The signal plays back on both channels first, then will alternate side-to-side every 3 seconds then end off with both channels again. For most of us, we'll be able to hear what sounds a little like a soft original Star Trek tricorder scan due to intermodulation distortion.
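If you want to see why two ultrasonic tones can land in the audible band, here's a minimal simulation. It is not the demo file itself: I'm assuming a made-up device with a small second-order nonlinearity (y = x + k2*x^2), and the amplitudes and k2 value are arbitrary choices for illustration. With 26kHz and 28kHz going in, the squared term produces a difference tone at 2kHz.

```python
import cmath
import math

FS = 96_000   # sample rate in Hz
N = 9_600     # 0.1 s, so 2, 26 and 28 kHz all complete whole cycles

def tone_amplitude(samples, freq_hz, fs=FS):
    """Amplitude of a single frequency component via a one-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / fs)
              for n, s in enumerate(samples))
    return 2 * abs(acc) / len(samples)

def two_tone(amp=0.5, f1=26_000, f2=28_000):
    """Sum of two equal-level ultrasonic cosines."""
    return [amp * (math.cos(2 * math.pi * f1 * n / FS) +
                   math.cos(2 * math.pi * f2 * n / FS)) for n in range(N)]

def slightly_nonlinear(samples, k2=0.1):
    """Hypothetical amp/transducer with a small second-order term."""
    return [s + k2 * s * s for s in samples]

clean = two_tone()
distorted = slightly_nonlinear(clean)

audible_clean = tone_amplitude(clean, 2_000)          # perfectly linear chain
audible_distorted = tone_amplitude(distorted, 2_000)  # with the nonlinearity

print(f"2kHz level, linear chain:    {audible_clean:.6f}")
print(f"2kHz level, nonlinear chain: {audible_distorted:.6f}")
```

With a perfectly linear chain there is nothing at 2kHz; add even a modest nonlinearity and a 2kHz tone appears out of purely ultrasonic input. Real gear is of course more complicated than a single squared term.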

While ultrasonic noise in most recordings will not be as high as the -30dBFS peak level of this demo shown in the FFT above, the point is, do we actually "want" albums where the ultrasonic content could distort into audible frequencies? Would it not actually be better to just filter out most or all of this stuff in the first place and just keep 44.1/48kHz samplerate audio instead?

The idea that ultrasonic signals might be audible, so potentially beneficial (see this previous post about the "Oohashi hypersonic effect" papers as one of the sources used by many audiophiles) is somewhat theoretical; but the demo above reminds us that ultrasonic signals can create real audible distortions that were not meant to be heard and are not just theoretical.

I believe the answer is likely "no". We do not want or need those high frequencies in general and likely down-sampled versions at 44.1/48kHz could sound better than the "hi-res" version by relieving us of the potential for distortion. A position that is very much counter to the Industry lobby for "Hi-Res Audio".

A gradual process of downsampling and dithering of unworthy "hi-res" music...

As I surveyed my music collection in the last month, time permitting, I've been converting 24/96 LP rips to 16/48 and doing the same when I come across hi-res albums clearly not satisfying the criteria above. The size of my music library has been shrinking by gigabytes every week compared to when I posted. I know that storage is cheap so it's not like there's any pressure to do so, but since I like my music library well managed and in optimal shape (including tagging of course), I think it's good to weed out storage-wasting albums.

Practically speaking, let's discuss the resampling and dithering process...

For the most part, 48kHz sampling rate has become a "friend" of mine :-). It's a little bit higher than 44.1kHz so I can retain a little more from 88.2/96kHz downsamples up to the 24kHz Nyquist frequency. Non-integer downsampling from 88.2kHz is no problem these days with good asynchronous resampling software. The extra ~2kHz (above the 22.05kHz Nyquist of the 44.1kHz sample rate) can give digital filters a little more room to operate in the top frequencies. Furthermore, these days 48kHz is ubiquitous among modern DACs (it would be very odd for a DAC to play 96kHz and not 48kHz) so there should be no compatibility issues.
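For anyone who likes to see the arithmetic, this little sketch works out the Nyquist frequencies and the conversion ratios involved. It's just fractions, not a resampler, and the helper names are my own.

```python
from fractions import Fraction

def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at a given sample rate."""
    return sample_rate_hz / 2

def resample_ratio(src_rate_hz, dst_rate_hz):
    """Output/input sample ratio as a reduced fraction."""
    return Fraction(dst_rate_hz, src_rate_hz)

extra_bandwidth_hz = nyquist_hz(48_000) - nyquist_hz(44_100)
ratio_from_96 = resample_ratio(96_000, 48_000)     # simple integer decimation
ratio_from_882 = resample_ratio(88_200, 48_000)    # non-integer conversion

print(f"Extra bandwidth at 48kHz vs 44.1kHz: {extra_bandwidth_hz:.0f} Hz")
print(f"96k -> 48k ratio:   {ratio_from_96}")
print(f"88.2k -> 48k ratio: {ratio_from_882}")
```

The 96kHz case is a clean 1/2, while 88.2kHz to 48kHz works out to 80/147, which is why that conversion leans on a good resampler.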

For years now, I have used 64-bit iZotope RX as my main high-quality conversion tool, currently at version 7. It has a convenient batch function to convert a whole album at a time. Here's what I use for a straight conversion between hi-res to my most common default target of 16/48:

There are all kinds of settings one could use, these are what have worked well for me over the years. As you can see, the batch processing steps include a couple of optional gain settings. Sometimes, it's good to add a small amount like -0.5dB to the signal before resampling if I know there could be some intersample overloading. Other times, it's nice to add something like +2dB before the dithering step if I know the 24-bit audio is a bit soft and I can increase the volume to optimize the available dynamic range. This is all guided by the peak level of the music from DR data or amplitude statistics from audio editing software.
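The gain decisions above boil down to simple dB arithmetic. Here's a sketch, with the caveat that it only considers the reported sample peak; intersample peaks can run higher, which is exactly why a little extra attenuation before resampling can be prudent. The function names and the example peak values are mine.

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)

def gain_is_safe(peak_dbfs, gain_db, ceiling_dbfs=0.0):
    """True if applying gain_db keeps the sample peak at or below the ceiling."""
    return peak_dbfs + gain_db <= ceiling_dbfs

attenuation = db_to_linear(-0.5)   # small cut before resampling
boost = db_to_linear(2.0)          # boost for a "soft" 24-bit master

soft_album_ok = gain_is_safe(peak_dbfs=-3.0, gain_db=2.0)   # peaks at -3dBFS
loud_album_ok = gain_is_safe(peak_dbfs=-1.0, gain_db=2.0)   # peaks at -1dBFS

print(f"-0.5dB = x{attenuation:.4f}, +2dB = x{boost:.4f}")
print(f"+2dB on a -3dBFS peak is safe: {soft_album_ok}")
print(f"+2dB on a -1dBFS peak is safe: {loud_album_ok}")
```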

Remember that while one is of course free to play around with digital filters on the DAC including minimum or intermediate phase settings, it's best to keep the resampling process linear phase so as not to introduce frequency-dependent group delay in the signal ("pre-ringing 1.0" in iZotope RX). I've used a steepness setting of 100 and the "Cutoff shift" moves the filter roll-off back a bit so that by Nyquist, it's about 20dB down. Doing this will result in roll-off starting around 23.5kHz. I started doing this a couple of years ago because I noticed that friends who were using various consumer ADCs at 96kHz sometimes introduced a noise spike right at 24kHz in many of their vinyl rips. Here's an example:

Various ADCs seem to add a 24kHz noise spike when capturing at 96kHz samplerate.

So I might as well allow some attenuation before 24kHz and suppress noise around there. Here's the resampled result of that same track using my iZotope setting:

Anomalies right at 24kHz aren't uncommon to see in many "hi-res" downloads. Here are a few - it didn't take long to find!

Notice for Return to Ommadawn, there's really no content beyond 24kHz.

As for the 24-bit to 16-bit dithering setting, I'm not a big fan of algorithms with lots of noise shaping. Sure, we can drop the noise floor significantly in the lower frequencies where the ears are more sensitive, but this results in higher noise level in the upper frequencies to compensate (no free lunch guys!). My preference is to keep noise level relatively stable throughout with only a small amount of noise shaping. 16-bits provide plenty of dynamic range already and dithering IMO doesn't need to be strong, hence my use of "Lowest" noise shaping, and "Low" dither amount with iZotope's MBIT+ algorithm. The 24 vs. 16-bit Blind Test in 2014 used a basic triangular dither without noise shaping and listeners already could not tell a difference with hi-res 2L music.
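To illustrate what dither buys us, here is a bare-bones sketch of plain triangular (TPDF) dither with no noise shaping, along the lines of what the 2014 blind test used. This is emphatically not iZotope's MBIT+ algorithm; it's a generic textbook version with names and test values of my own choosing. The test signal is a constant level sitting at a quarter of a 16-bit LSB, which simple rounding would erase completely.

```python
import math
import random

STEP = 256  # one 16-bit LSB expressed in 24-bit units (2**8)

def quantize_to_16(sample_24, rng=None):
    """Requantize one 24-bit integer sample to 16 bits.

    If rng is given, TPDF dither spanning +/-1 16-bit LSB is added
    before rounding; otherwise the sample is simply rounded.
    """
    noise = 0.0
    if rng is not None:
        # Difference of two uniform values gives a triangular distribution
        noise = (rng.random() - rng.random()) * STEP
    value = math.floor((sample_24 + noise) / STEP + 0.5)
    return max(-32768, min(32767, value))  # clamp to the 16-bit range

rng = random.Random(0)   # fixed seed so the run is repeatable
quiet_level = 64         # 0.25 of a 16-bit LSB, in 24-bit units

plain = [quantize_to_16(quiet_level) for _ in range(20_000)]
dithered = [quantize_to_16(quiet_level, rng) for _ in range(20_000)]

mean_plain = sum(plain) / len(plain)
mean_dithered = sum(dithered) / len(dithered)

print(f"Without dither, average output: {mean_plain}")
print(f"With TPDF dither, average output: {mean_dithered:.3f}")
```

Without dither, the sub-LSB level simply vanishes. With dither, the individual samples are noisy but their average still carries the original level, which is how 16-bit audio can encode signals below its nominal floor.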

Here's what the noise floor looks like comparing my 16-bit dither settings (yellow) to original 24-bits (red, which was dithered from 32-bit). For fun, I've included an example of the noise floor from a 45rpm mint vinyl single after being cleaned with a VPI machine, using the excellent Ortofon Cadenza Black cartridge, ripped with my RME ADC allowing -6dBFS peaks. For the vinyl recording, there's some audio in the lower frequencies, but you can clearly see the noise level from 5kHz up:

Notice that dithering to 16-bits is still capable of encoding the -110dB signal. Noise shaping causes that dip in noise floor below 14kHz while shifting the noise level slightly higher above 15kHz. [Note the Addendum below about using stronger dither settings.]

There's plenty of dynamic range between the vinyl rip and the dithered 16-bit digital... Sure, the noise level will vary depending on what turntable you use, the stylus, phono cartridge, preamp, and quality of the vinyl itself. However, even if you have the best of the best, vinyl is not hi-res. From the perspective of noise level and dynamic range, it doesn't even challenge 16-bit digital. Buy vinyl for the physicality, collectability, rituals and subjective preference for the sonic colorations; not under any pretensions that the sound quality is of higher fidelity regardless of the claims some make.

By the way, if you're wondering about the amount of noise shaping used for CD releases, we can see that in some instances, strong amounts have been applied. For example, here's a quiet portion of the Tsuyoshi Yamamoto Trio's "Another Holiday" from What A Wonderful Trio! advertised as an original DXD recording:

That's an example of very strong noise shaping where they were able to keep noise level very low in the 1-6kHz region where our hearing is the most sensitive at the expense of rising noise from 14kHz up to -100dBFS by 22kHz. I can understand why this is done based on human physiology, but like I said above, I generally prefer a more steady noise floor across the audible spectrum when I've experimented with various settings. This album sounds great, but if you look at the FFT during playback, the music really doesn't encroach down into those noise limits below 14kHz, so I think the setting is excessive.

Most of the time, we see less aggressive noise shaping, for example a popular one is Sony and their Super Bit Mapping (SBM) used in many CD releases:

There's some low-frequency audio in the FFT above, but we can see the SBM process keeping the noise floor low, down to around -140dB for most of the audible range, and rising from 14kHz.

Beyond just noise anomalies around 24kHz, there are all kinds of "hi-res" recordings with abnormalities suggesting that we really should just downsample these... Here's a selection found in the last couple of weeks:

EOB's Earth is a nice recent album from the Radiohead guitarist if you like alt rock/experimental rock sounds and was featured recently in Stereophile. I don't think there's a point getting the hi-res version due to high compression as in the waveform display (album DR7). Furthermore, there's nothing but ultrasonic noise and anomalies like those spikes in the FFT.

Elton John & Leon Russell's The Union also doesn't need to be 96kHz. There's also no point keeping this at 24-bits when dynamic range has been compressed to an unfortunate DR8; resampling this to 16/48 will save space, and remove any potential of distortion introduced by ultrasonic noise.

This same situation exists for multichannel "hi-res" tracks. Here's the title track from Dire Straits' Brothers In Arms 2005 DVD-A remaster/remix to 5.1 24/96:

This album from the mid-80's was at best a 16/44.1 multitrack digital recording anyways so we would not expect any hi-res content. Surely, those ultrasonic noise spikes cannot make the sound "better"! I would happily downsample this to 5.1 16/48 as per the iZotope RX 7 settings above (16/44 would be fine also). iZotope RX handles multichannel just fine although they need to fix issues with losing Unicode tags which has been a problem for a while.

While we're talking about multichannel, in the last few years, Steven Wilson has done some excellent remix/remastering work with classic progressive rock albums. For example, the recent 2015 Blu-Ray of Yes' Fragile contains an excellent 5.1 multichannel version (along with remixed stereo and flat original stereo transfers). Have a look at the remixed 5.1 24/96 "Roundabout":

What we see is that the music actually rolls off by 25kHz. Beyond that is noise which peaks out at 31-32kHz. Remember, this is a >45 year old analogue recording that is not "hi-res" at its core. Clearly just because the audio data rate can be pushed to 24/96 doesn't mean that the content ever needed it. Perhaps a reasonable analogy would be if we took the movie Dirty Harry (also released in 1971), filmed on old 35mm, digitized that into 4K, regraded the colors for "pop", perhaps enhanced contrast for HDR brightness. No matter what you do, clearly the movie would still not look like a pristine, clean image as if it were filmed on modern hi-resolution video cameras (some discussions on this here). So too, old recordings done before the hi-res digital era would never truly need the full benefits of 24-bits or >48kHz even if some enhancements were added in the remix/remastering.

Remember that compression and questionable hi-res doesn't just affect rock/pop these days. Unfortunately, soundtracks which are some of the best "modern classical" recordings have been affected as well. Here's something from Michael Giacchino's Star Trek: Into Darkness (2013):

Again, no point keeping such compressed audio as 24-bits and all that ugly stuff >24kHz will do nothing to improve sound quality!

I love John Barry's soundtrack to Dances With Wolves (1990). It has decent dynamic range at DR11, however look at the FFT for the first track "Main Title - Looks Like A Suicide" from the SACD rip (released in 2000):

Anyone want to argue that we need anything more than 48kHz samplerate for that? Those large 26.5kHz and ~29kHz noise spikes are featured throughout the album across every song I checked. Again, this kind of ultrasonic noise is common with analogue recordings - good thing we can't hear it, and no point keeping it there! ;-)

Audiophile labels are not immune from questionable hi-res recordings. Consider Chesky Records and this from Christy Baron's Take This Journey (2002, HDTracks 24/96 Studio Master) :

Not only do we see some ugly ultrasonic junk above 30kHz, but there's also spurious noise in the 15.5kHz region presumably from a device along the recording/mixing chain.

Cookie Marenco from Blue Coast Records touts the benefits of DSD and says she uses a hybrid DSD/analogue system in the studio. I only have one of the label's SACDs - Blue Coast Collection: The E.S.E. Sessions (2007). If we just picked the first track and converted that from DSD to 24/88.2 PCM, this is what we see as an average spectral content of the audio:

Notice that above 22kHz, there's little ultrasonic content (to be expected) but what is there consists of non-musical noise, the strongest around 23.5kHz (and there again is the common 24kHz noise). I suspect this was all introduced by the analogue studio process. While Marenco might claim that looping the audio through an analogue tape system sounds better, the final digital data would sound just as good (if not better) downsampled to 24/48 IMO (assuming you believe 24-bits convey benefits here; I don't).

One last quick example... While it's nice to own well done, remastered "classic" jazz like the immortal voice of Ella Fitzgerald, remember that recordings from the golden age of analogue do not need hi-res. Here's Ella on "S'Wonderful" from Ella Sings The George and Ira Gershwin Songbook (2013 HDTracks 24/96 release):

There's obviously a static noise peak at 29kHz through the song. But also notice the high noise floor throughout, sitting above -95dBFS. While the album is a good DR11, this high noise level inherited from an old recording does not demand 24-bits.

One potential benefit with using sub-optimal 24-bit, severely compressed recordings - do your own de-clipping?!

There are some albums which are badly produced and clipped available as 24-bits (typically a waste of money and storage of course). While we can't confirm that below 16-bits there's anything of value in that data, one interesting use for 24-bit data is the ability to do one's own manipulations with the best accuracy possible. For example, back in 2017, I mentioned that one could take digital audio and "fix" severe compression by allowing software to expand the dynamic range. Over the years, I've continued doing this with a handful of very bad recordings.

Here's an example using that idea with the 24/96 version of Tool's album Fear Inoculum (2019) and specifically the track "Chocolate Chip Trip" (interestingly, this track was used in a recent Stereophile Volti Audio speaker review):

As you can see from the waveform display, much of the track has been peak-limited. At least nothing terrible >20kHz on the FFT. Here's the album's DR analysis:

Not too terrible with DR9 average, but being <DR10, not a good candidate for buying as 24-bits.

So, let's do the De-Clip procedure in iZotope RX as per previous discussions:

And then afterwards I'll just run it through my downsample and dither to 16/48 as documented above.

Here's what "Chocolate Chip Trip" looks like on the waveform display after the de-clipping / resampling / dithering process:

And the new DR analysis:

The album has now been "restored" to a DR12 average with most of the clipped portions extended using the best quality 24/96 data. While it would have been better that the album not be mastered with so much dynamic compression at the studio, the "de-clipped" version at least relaxed some of the aggressive claustrophobic sound. I would suggest the folks at Volti Audio give this a try for their demos with the volume pumped up!

Over the years I've found a few other albums worth de-clipping. For example Aerosmith's Get A Grip from 1993 was a paltry DR7 (an early casualty of the Loudness War). De-clipping the 24/96 HDTracks version gets us to a decent DR10 which also sounds slightly better to me. I like the de-clipped version of Wilco's The Whole Love better also.
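If you want a rough way to check whether a track is a de-clipping candidate before firing up an editor, here's a simple sketch that looks for runs of consecutive samples pinned at (or very near) full scale. This only detects suspected clipping; it does not reconstruct anything, and it has nothing to do with how iZotope's De-clip works internally. The threshold and minimum run length are arbitrary values I picked for illustration.

```python
import math

def find_clipped_runs(samples, threshold=0.99, min_run=3):
    """Return (start, end) index pairs where |sample| stays at or above
    the threshold for at least min_run consecutive samples."""
    runs = []
    start = None
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Handle a run that continues to the very end of the signal
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - 1))
    return runs

# One cycle of a 100Hz sine at 8kHz, driven 1.5x too hot and hard-clipped
hot = [max(-1.0, min(1.0, 1.5 * math.sin(2 * math.pi * 100 * n / 8_000)))
       for n in range(80)]

# The same cycle at a sensible level
clean = [0.5 * math.sin(2 * math.pi * 100 * n / 8_000) for n in range(80)]

hot_runs = find_clipped_runs(hot)
clean_runs = find_clipped_runs(clean)

print(f"Clipped runs in hot signal:   {hot_runs}")
print(f"Clipped runs in clean signal: {clean_runs}")
```

The overdriven cycle shows one flat-topped run on each half of the waveform, while the clean one shows none. Real masters are often limited rather than hard-clipped, so a check like this is only a first hint; the waveform display and DR numbers remain the better guide.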

Let's wrap up...

I think that's long enough for a "Summer Musing" :-).

The bottom line is this, guys and gals: do not forget the PRODUCTION side of the equation as audiophiles. Too often as audio enthusiasts, we gravitate to the "latest and greatest" in hardware that claims ever-improving sound quality. As I discussed last summer, there is a limit to what we truly "need" in terms of fidelity regardless of what magazine reviewers, or company advertising might be selling to us. Once we hit a threshold for hardware (and related issues like our room limitations), we reach a point of diminishing returns regardless of how much money is spent in upgrades. We must expand our thinking into the media itself. Remember the importance of picking and choosing the best album mastering if you have options. While we have plenty of great sounding recordings out there, many of my favourite artists actually have poor recordings which I can enjoy but would never hold up as examples of "demo" tracks to show off hi-fi playback.

For years we have been sold on the idea that "Hi-Res Audio" somehow represents an improvement in the sound quality of the music we buy. After all these years, I believe the most poorly thought out sales pitches and claims came from the days when Neil Young tried to market Pono (remember this stuff back in 2015?). He presented an incorrect and simplistic view of sound quality being linked to the amount of data used, holding the idea of 192kHz sampling rate as some kind of threshold for quality. Since the Internet never forgets, new audiophiles looking at articles will still be exposed to this kind of inaccurate and simplistic marketing. Thankfully, I believe most seasoned/thoughtful audiophiles are operating beyond that level of intellectual inadequacy and can see through these kinds of arguments. Incidentally, I have not seen any need for hi-res with Neil Young's own recordings.

Evidence like Mark Waldrep's reminds us that there is little if any benefit to "hi-res". I think it's time as an audiophile community to recognize a "post-Hi-Res Audio" mindset less interested in the hype just because something is 24-bits or >48kHz, and more focused on the potential which can only be realized in a minority of circumstances using immaculate recording and production values. It has been 20 years since the release of DVD-A and SACD. 12 years since HDTracks launched to get your hi-res downloads. Long enough to experience whether these tracks are of value for ourselves. Long enough to see the market's appetite - from what I can tell, low mainstream demand simply because it never made a significant audible difference.

With the gradual process of resampling and dithering, I suspect by the time SSDs are inexpensive enough to house the terabytes of my music library in a few years, the size of that library will be smaller still than it is today as I do more cleaning. It's quite possible that I've reached "peak audio storage" in my music collection. I remind myself too of the suggestion from 24bitbob earlier this year to focus more on the artists, the "core music" I truly love and dispense with simply hoarding a collection.

A couple of final points as I end...

I think these ideas should also have repercussions when it comes to lossless streaming services. Generally, I see no need for "hi-res" streaming either. Simple 16/44.1 or 16/48 lossless streaming is already great. So long as you don't mind the extra data utilization of Qobuz Hi-Res or Amazon Music HD, that's fine although I think unnecessary. There's still no need to entertain nonsensical MQA through Tidal which is of no benefit to the consumer (more like a worthless tax).

Remember that I am not saying hi-res is worthless in all circumstances and that 24-bits, 96+kHz be banished; far from it! There is a time and place for hi-res audio. For example, I will support record labels that produce "true Hi-Res" recordings I enjoy. I will continue to do my own recordings and rip vinyl as 24/96 because I've found my ADCs perform very well at the 96kHz sample rate. Furthermore, the ADC's low-pass filter will be operating well outside the audible range. Likewise, studio recordings should definitely be done in hi-res. Hi-res "raw" captures are the foundation from which the final product is created, providing the audio engineer with great latitude in the production process to manipulate the sound while maintaining highest fidelity. The point is that the final polished albums we buy need not be tied to 24-bits or >44.1/48kHz for full expression of artistic intent or best quality.

[As a side analogy... As digital photography enthusiasts, we might want to capture our images in camera RAW but how often would we feel compelled to send the final processed product to family / friends / customers in such a format!?]

I hope you're able to enjoy the summer (or winter for the Southern Hemisphere visitors!).

Enjoy the music...

PS: Here ya go... They're working on the ultimate hi-fidelity audio - direct-to-brain streaming. I think I'll pass on the first generation of whatever this product will look like. I suspect it'll take a while to get the fidelity right and then wait for audiophile-approved interface cables of course. :-)

PPS: Seeing Unknown's comment below, a friend sent me the 24/192 track "American Idiot" (off Green Day's American Idiot of course) this morning. Based on the DR Database, it looks like the hi-res version enhanced dynamic range with DR9 vs. a most unsatisfactory DR5 for the original CD. FFT not good:

I think it would only be "right" to immediately downsample that to 16/48 to keep everything you need and discard the ultrasonic "Studio Master" junk best left in the studio! Alt. rock is not exactly a genre where I think we will ever need "hi-res" anyway. :-)

[Addendum: 2020/08/03]
I appreciate the input from the readership! Honza made a good point about whether "Lightest, Low" dithering settings in iZotope RX 7 might not be adequate dithering strength for the 24-bit to 16-bit conversion.

I experimented with various signals and the effect of truncation. Indeed, setting the dither and noise shaping a step higher to "Light, Normal" does help so here's the setting and noise level again:

In general, this is all just "perfectionist audiophile" stuff ;-). Considering how "unworthy" so much of the 24-bit "hi-res" audio is, in practice whether one uses this stronger dither setting or the less-strong one in the original text IMO is not going to be audible...
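
To make the role of dither a little more concrete, here is a minimal Python sketch of a 16-bit requantization step with and without dither. It assumes plain flat TPDF dither of +/-1 LSB peak and no noise shaping, so it is not what the iZotope RX "Lightest, Low" or "Light, Normal" presets do internally; the function names are made up for the illustration.

```python
import math
import random

def quantize_to_16bit(x, dither=True, rng=random):
    """Quantize a float sample in [-1.0, 1.0] to a 16-bit integer.

    With dither=True, TPDF noise spanning +/-1 LSB (the sum of two
    independent uniform values of +/-0.5 LSB) is added before rounding.
    """
    scaled = x * 32768.0  # express the sample in units of 16-bit LSBs
    if dither:
        scaled += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    q = int(math.floor(scaled + 0.5))   # round to the nearest integer
    return max(-32768, min(32767, q))   # clamp to the 16-bit range

def mean_output(x, n, dither):
    """Average quantized value over n conversions of the same input level."""
    rng = random.Random(1234)  # fixed seed so the sketch is repeatable
    return sum(quantize_to_16bit(x, dither, rng) for _ in range(n)) / n

# A steady level of 0.3 LSB: far below what 16 bits can represent directly.
tiny = 0.3 / 32768.0
undithered = mean_output(tiny, 20000, dither=False)
dithered = mean_output(tiny, 20000, dither=True)
print(undithered, dithered)
```

Without dither, a steady level of 0.3 LSB rounds to zero every time and is simply lost. With TPDF dither the individual samples are noisy, but their average comes out close to 0.3 LSB, which is the sense in which dither preserves information below the last bit.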

59 comments:

Danny, 26 July 2020 at 00:37
Pretty much accept everything you wrote. A couple of comments:

1a) I have a lot of hi-res/DSD remasters that are the best sounding version I know of. This includes legacy vinyl material. This is almost certainly due to the mastering, not the format, but it's a reason to obtain the hi-res.
1b) There are some instances where the hi-res version is mastered as audiophile (generally less added compression) and sounds better/different than other versions.
Some Paul McCartney albums, and Green Day's American Idiot, come to mind. Also the 24/96 Blu-ray version of the White Album is a different (and better sounding, IMO) version than the CD and even the 24/96 download.
2) I have also noticed a phenomenon where when listening to a hi-res version of an album I hear small details that I hadn't noticed before. Even albums I know well. But then, when I go back to the Redbook version, I can hear those same details - that I hadn't previously noticed. There seem to be 2 possible explanations for this: a) the hi-res version/master makes existing detail easier to hear; b) intense listening to the new hi-res version allows perception of certain detail that wasn't previously noticed....Once heard, it can't be unheard, as it were.
Nobody, 26 July 2020 at 02:41
High-resolution audio is the highest-generation copy in the production chain that has survived. In general, owners will not release it as raw material; one way or another, we will see manipulated and relabeled releases. If the highest copy is an analog tape, it will degrade or be lost with time. So it could happen that the low-res vinyl you own becomes the highest surviving copy (pressing) in the production chain and can be labeled accordingly.
sk16, 26 July 2020 at 10:45
Arch,
Since you referred to Steven Wilson, have you measured one of his Blu-ray releases, Hand. Cannot. Erase. for instance (24/96 5.1)?
DRs are high and his production and recording standards are quite high, I believe.
All his releases have been Blu-ray 24/96 5.1 since the late Porcupine Tree days.
Stay safe!
Fluffy, 26 July 2020 at 22:54
The ultrasonic intermodulation demo is very surprising. I tried it in my system, and I can definitely hear the distortion products. A recording with a microphone showed artifacts at frequencies as low as 2 kHz. I'm puzzled as to how this works. You said it's a result of gear not being perfect, but what parts of the chain are actually causing this? And how can intermodulation distortion produce artifacts at a lower frequency?
I noticed something else when trying this out: I played the demo and slowly raised the volume until, at a certain point, the audible distortions just came to be, after some crackling. They didn't start soft and grow with the volume; rather, they just appeared at a specific volume level.
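
On the question of how ultrasonic tones can end up as artifacts at audible frequencies: one common mechanism is a non-linearity somewhere in the chain (the DAC output stage, the amplifier and the tweeter are all candidates, and which one dominates will depend on the system). A minimal Python sketch, assuming two hypothetical tones at 30 kHz and 33 kHz and a made-up transfer curve with a small squared term:

```python
import cmath
import math

FS = 192_000             # sample rate in Hz (assumed for this illustration)
N = 19_200               # 0.1 s, so every tone below fits a whole number of cycles
F1, F2 = 30_000, 33_000  # two ultrasonic tones (hypothetical values)

def tone_amplitude(signal, freq):
    """Amplitude of the component at `freq`, measured with a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, s in enumerate(signal))
    return 2.0 * abs(acc) / len(signal)

# Ideal signal: two ultrasonic sines, each at half of full scale.
clean = [0.5 * math.sin(2 * math.pi * F1 * n / FS) +
         0.5 * math.sin(2 * math.pi * F2 * n / FS) for n in range(N)]

# A mildly non-linear stage: a small squared term added to the ideal output.
distorted = [x + 0.1 * x * x for x in clean]

diff_clean = tone_amplitude(clean, F2 - F1)          # level at 3 kHz before
diff_distorted = tone_amplitude(distorted, F2 - F1)  # level at 3 kHz after
print(diff_clean, diff_distorted)
```

The squared term multiplies the two tones together, and the product of two sines contains their difference frequency, here 33 - 30 = 3 kHz, at an amplitude of 0.1 x 0.5 x 0.5 = 0.025. The clean signal has essentially nothing at 3 kHz. Real gear has more complicated curves (third-order terms give products such as 2 x f1 - f2), and distortion that only appears above a certain volume would be consistent with a stage being pushed out of its linear range.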
relmu, 27 July 2020 at 06:43
Is it possible to resample from 88.2 kHz to 48 kHz without loss or failure in the conversion?
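
The arithmetic behind such a conversion can be sketched in a few lines of Python. This only shows the rate ratio a resampler has to implement; the helper name is made up, and it says nothing about the quality of any particular resampler's filter.

```python
from fractions import Fraction

def resampling_ratio(src_hz, dst_hz):
    """Return (up, down): interpolate by `up`, then decimate by `down`."""
    ratio = Fraction(dst_hz, src_hz)  # Fraction reduces to lowest terms
    return ratio.numerator, ratio.denominator

up_882, down_882 = resampling_ratio(88_200, 48_000)
up_96, down_96 = resampling_ratio(96_000, 48_000)
up_441, down_441 = resampling_ratio(44_100, 48_000)
print(up_882, down_882, up_96, down_96, up_441, down_441)
```

For 88.2 -> 48 kHz the ratio reduces to 80/147, so conceptually the audio is interpolated by 80, low-pass filtered and decimated by 147, whereas 96 -> 48 kHz is a plain 1/2. Everything below the new 24 kHz Nyquist limit can in principle be preserved to within the accuracy of the filter; content between 24 and 44.1 kHz is necessarily discarded.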
Charles King, 27 July 2020 at 22:40
Regarding intersample overs: I thought this was only really an issue when UPsampling, as there's a possibility that this will introduce interpolated samples that go over 0dB. I suppose there's a possibility of producing overs from asynchronous non-integral downsampling (i.e. from 88.2 -> 48), but thought this wasn't something to worry about when performing simple integral decimation from 96 -> 48. Am I mistaken?
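
One way to see why even a simple 96 -> 48 kHz conversion can raise peak levels: the low-pass filter that must precede decimation removes ultrasonic content, and that content may have been holding the peaks down. A minimal Python sketch, assuming a hypothetical test signal (a 10 kHz tone plus a 30 kHz third harmonic) and an idealized filter that simply deletes the 30 kHz component:

```python
import math

F0 = 10_000      # fundamental in Hz (hypothetical test signal)
A3 = 1.0 / 3.0   # relative level of the 30 kHz third harmonic

def sample(n, fs, with_ultrasonic):
    """Sample n of the test signal at rate fs, with or without the 30 kHz part."""
    x = math.sin(2 * math.pi * F0 * n / fs)
    if with_ultrasonic:
        x += A3 * math.sin(2 * math.pi * 3 * F0 * n / fs)
    return x

# 96 kHz "source": both components, scaled so the largest sample is at full scale.
raw_96k = [sample(n, 96_000, True) for n in range(960)]
gain = 1.0 / max(abs(v) for v in raw_96k)
peak_96k = gain * max(abs(v) for v in raw_96k)

# 48 kHz "result": 30 kHz is above the new Nyquist limit and has been filtered
# out; the same gain is applied to what remains.
out_48k = [gain * sample(n, 48_000, False) for n in range(480)]
peak_48k = max(abs(v) for v in out_48k)
over_db = 20 * math.log10(peak_48k)
print(peak_96k, peak_48k, over_db)
```

The 96 kHz version peaks exactly at full scale, yet with the 30 kHz component removed the 48 kHz samples reach about 1.06, roughly 0.5 dB over. That is the motivation for leaving some headroom, or for resampling in floating point and adjusting the level afterwards before saving to an integer format.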
Techland, 28 July 2020 at 06:01
Arch wrote: "While ultrasonic noise in most recordings will not be as high as the -30dBFS peak level of this demo shown in the FFT above"

I find this statement a bit misleading. The peak level of the file's signal is -1 dBFS (around -5 dBFS RMS), so the signal is really hot. This information is not stated anywhere (unless I overlooked it).
DColby, 28 July 2020 at 07:47
Looking at it the other way around: let's say that the hi-res version of a recording is the 'raw' version, straight from the digital recorder with no processing. Then it should be cheaper than the CD version, which needs to be processed (downsampled), and cheaper still than the MP3 version for streaming, which needs even more processing..!
Honza, 28 July 2020 at 23:04
As for intersample overs, I would rather decrease volume AFTER resampling than before it. When doing resampling in 32 or 64 bit float, no information should be lost with overs from resampling, and after that the volume can be reduced so that there is nothing (or little) above 0 dB when saving as 16 or 24 bit integer.

As for dither, I doubt whether "Low" dither in iZotope is mathematically correct for dithering. I experimented with RPDF dither (less noise) and the results were mixed. Maybe 0.9 bit TPDF dither is enough, as the noise fluctuation at that level is very, very small and the quantization error is compensated. I usually use violet noise TPDF dither (1 bit, with error-correction feedback), or some very light shaping for 16 bit. However, plain (white noise) TPDF sounds a bit "grained" to me, and I do not know why.

I do not see why any aliasing should be allowed, especially at 48 kHz. I would therefore set the filter to no aliasing (full filtering) and not only -20 dB at the Nyquist point. If there are any reasons for, or improvements from, allowing aliasing at this sample rate, we can discuss them.

Anonymous, 29 July 2020 at 05:44
I have done my own recordings here in 16-bit/96kHz and 24-bit/96kHz, and I have only achieved a -80+ dB noise floor here in my home studios with the mics closed. Once the ambient room noise gets in, it goes higher depending upon what I am recording. Most of the masters I do are 24/96 as I like the smoothness of 96kHz; I can go to 24/192 but I can't hear the improvement. I have many commercial recordings where the opening silence reveals a noise floor often at -55 to -60dB. I don't feel bad about where I am at recording-wise.

I have not seen the HF spikes in my FFT display, but the HF noise surely extends consistently way out past 30kHz, and I have read that it can cause some issues with certain amps. In my display most of it is down -90dB or more, so it has not been an issue here.

I own a number of SACDs, some recorded in DSD as masters and others like the RedSeal remasters, and all of them sound great to me. I even have some of the RedSeal LPs to enjoy. I used the Stereophile recording of K622 recorded by Tony Faulkner on tape and on DSD. I own the LP and the SACD with the CD layer and they all sound excellent, but I do enjoy the SACD most.

I just bought and am enjoying the Project S2 DAC box. I used it to upgrade the sound of my old Sony DVP NS 755V players, and even the 24/96 recordings I've made do sound better IMHO. It is certainly better than my 17 year old disc spinners, but only slightly better than my 2007 Yamaha S1800, which does surprise me. I am bouncing between filters 2 (linear/fast roll off) and 5 (Hybrid) mostly now. I still think that 24/96 is my best compromise in terms of SOTA sound. I do have some great sounding CDs, so mastering and recording have certainly gotten better. I am listening to Jorge Federico Osorio from Cedille Records out of Chicago and the solo piano is excellent.

What I do miss is being able to go out and record new things. This lockdown is certainly sad and so many have not made it through. That is terrible.
Anonymous, 29 July 2020 at 05:54
Of course, then Stereophile and JA1 review the Weiss DAC502 at $9850 in their August issue, and I have to work to get over myself about my $300 SOTA????? S2 DAC. I am pretty sure I will never really hear it all.
tnargs, 29 July 2020 at 22:20
I am actually getting pretty interested in MCH DSP up-mixing for 2-channel stereo product. (Which, like you say, might justify paying for high-res stereo product.)

My interest is in doing it in an intelligent way, i.e. not creating musicians-everywhere, but de-rooming and enhancing envelopment and spaciousness in a way that doesn't diminish the intent of the original, but goes some way to creating an 'audience perspective' MCH experience. Looking forward to sophisticated and subtle 2-ch music upmixers, that demur on spectacle and insist on authentic-seeming soundfield generation and room 'removal'.

I think the intent of Lexicon's Logic7 and Logic16, and AnthemLogic, are what I am after, which is not to endorse their implementation or success -- I don't know enough about them to do that.
Prep 74, 30 July 2020 at 00:45
Hi Arch

Well written article, as usual.

This fetish with ultrasonic frequencies that some 'audiophiles' have, rather than focusing on frequencies that matter to humans, is rather amusing. Although 100 years of data have long established the upper limit of human hearing at 20 kHz for healthy 18 year olds (plus or minus a couple thousand hertz), it is somewhat ironic that the typical audiophile is a middle-aged or older man who would be fortunate to hear anything above 14 kHz.

The other puzzling aspect of this search for frequencies outside the scope of human hearing is why the focus is on ultrasonics rather than subsonics. At least with the latter there actually can be a perception of these low frequencies as they shake internal organs and room furniture. My guess is that the ultrasonic fetish was started by the vinyl crowd knowing that CD cuts out at 20 kHz, but they can't handle the fact (and never discuss the fact) that CD can get right down to 0 Hz and be within 0.5 dB while doing it, an impossible task for vinyl recording and playback. In any event, all digital formats can record and play back subsonic frequencies and do so with accuracy.

But the story goes that even though we cannot hear these ultrasonic frequencies, they affect the frequencies we can hear. Well, let's set aside the nonsense of this claim and entertain that this is a fact. If these ultrasonic frequencies are affecting the frequencies we can hear, why would that be relevant to playback of a recording rather than the live event? By definition the audible effect of the ultrasonic frequencies will be below 20 kHz (otherwise it would not be audible), captured by the microphones and baked into the recording. Taking this thought bubble further, one can then argue that ultrasonics are bad for playback (assuming your stereo can actually reproduce those frequencies) as they will double down on the effect that is already baked into the recording. In other words, they will cause distortion.

Of course this also assumes there is ultrasonic musical content in the actual recording. To what extent has any ultrasonic or, for that matter, subsonic content been recorded on most popular music, particularly analog recordings? There is an interesting thread on the Steve Hoffman forum regarding the sound quality of the Beatles' White Album. As in the link below, one of their (more rational) members, who is also a sound engineer, posted the specs of the recorders used for the White Album.

https://forums.stevehoffman.tv/threads/the-beatles-the-white-album-sound-quality.421749/page-22#post-24523975

Presumably, the Abbey Road studio would have used state of the art machines, but note the frequency response of tape recorders and tape formulations back in the day: 30 Hz to 15 kHz at +/- 2 dB. In fact, I believe that it wasn't until the late 70s or 80s that studio recorders could reliably record past 20 kHz without a serious drop in amplitude. However, despite having no ultrasonic or subsonic content on any music made prior to, say, the 1980s, many albums of that era, like the White Album, sound great.

If there is ultrasonic music content, presumably only a post-2000 24/96 recording could possibly capture it within a reasonable dB drop. These are recordings such as those produced by Mark Waldrep of AIX Records, but neither he nor the test subjects in his controlled test hear any difference between 24/96 and 16/44 on playback, and his business is producing and selling hi-res recordings.

Turrican, 30 July 2020 at 02:35
Regarding electronica, I watched an interview with Robert Babicz in which he talked about his production chain. At one point he mentioned that generally the audio will be low-passed somewhere between 16 and 17 kHz, and he advocates that in his mastering classes for electronica genres.

Turrican
julian67, 30 July 2020 at 13:40
Reading this blog prompted me to run dr14_tmeter on my lossless music collection. It's just short of 1300 albums, all FLAC, mostly derived from CD, some from SACD, and an increasing number from buying files. These range from 16-bit 44.1kHz through 24-bit 48kHz to 24-bit 96kHz & 192kHz. 90 of these are from SACD, with the DSD layer extracted using a modified Sony Blu-ray player. In some cases I also have plain original CDs of the same albums, so I can compare DSD, 24-bit 88.2kHz FLACs derived from DSD, Red Book audio from CD and my own conversions from DSD to 16-bit 44.1kHz.

Blah blah blah!

Anyway, in terms of dynamic range it seems to me that there is *zero* correlation between bit depth/high sample rate/format/container and DR. Many of the DR17 albums I have are plain old CD. I don't have any really bad SACDs but I do have some stinkers on CD. In the most general terms modern Concert Music ("classical, choral, early, chamber" etc) is very nicely recorded and presented, with very few exceptions and nobody ever joined in the loudness war. Chandos & Bis deserve medals for what they can achieve on CD. Popular genres are a lucky dip, though you're probably OK with acoustic stuff.
Anonymous, 30 July 2020 at 14:43
I have no proof, but I think that most putting their music on SACD would do a better mastering job in that transfer...except when a Norah Jones transfer got caught as no better than redbook. Whoops. Some people were watching early on.
VladHV, 1 August 2020 at 10:36
Archimago,
What is the purpose of the two Gain utilities with a value of 0 dB in the conversion?
Mork, 4 August 2020 at 05:10
Nice article and quite eye-opening! I'm not an obsessive-compulsive HD music collector, but I tend, if possible, to purchase the highest resolution available in case I need (as I do) to convert audio to lossy formats (namely AAC) for portability. Don't you think converting from higher-resolution formats may result in better lossy files (i.e. the algorithm works better)? Moreover, instead of a massive batch music conversion, why not a real-time downsample while listening? Thanks again!
VladHV, 4 August 2020 at 22:35
Hi, Archimago
Why 2 times Gain?
Gain 0 dB does not change the gain.
What's the big idea?
VladHV, 7 August 2020 at 01:53
Hi Archimago,
Thank you very much for the full and thorough response. I also use iZotope RX 7 for conversion.
Anonymous, 11 October 2020 at 03:29
Great article as usual. It's really depressing that the music industry works their "magic" in such fraudulent ways. I am glad my suspicions get confirmed by others who have the technical gear and understanding to prove it, but dang, it is not good.

As mentioned elsewhere, Santana's wonderful album Caravanserai is probably the most terrible recording of all the countless albums I own, but an example of gigantic greed and stupidity is the SACD of Rolling Stones Let it Bleed. The master is from 1971, and does certainly not contain any shred of music worth converting to any other format. The CD layer sounds just like the LP, but the SACD layer is utterly useless.

But as consumers we are left with whatever pleases the suits in the industry, who are most likely smirking because they can keep squeezing money out of oblivious buyers. They should be sued for what they do to the music, the bastards.

Luckily, having a great stereo system takes away some of the anger and frustration, because plain Red Book CDs can sound amazing as long as the sound engineer involved has got something other than air inside his head, and thank God, there are a handful of them.

My system consists of: Accuphase E-370 integrated amp with the DAC-50 board, Dynaudio Contour 20 speakers, and Oppo UDP-203. My old and faithful turntable, a modded Luxman PD-121, is about to get another upgrade, as I have ordered a new tonearm: Audiomod Series VI. It will start out carrying a DL-103R cartridge, but I plan to upgrade that to a Hana ML later on. For now I am looking into perhaps going the SUT route once I have listened carefully to the new arm/cart combo.

With this equipment, who needs hi-res?
WVAudio, 7 February 2021 at 08:55
Very nice article on a subject that is generally little discussed. I have investigated this topic too and I have compiled some examples of issues that can be problematic in Hi-Res. You will find the results in 2 articles of my blog: https://wvaudio.blogspot.com/