Sunday, 14 May 2023

RESULTS: Internet Blind Test of 24-Bit vs. Dithered 16-Bit Part Deux - Daft Punk Edition


Well ladies and gentlemen, the time has come to reveal the results of the 2023 24-bit vs. 16-bit Internet Blind Test as laid out in the post: "Internet Blind Test: 24-Bit vs. Dithered 16-Bit Part Deux - Daft Punk Edition!".

This test was launched on March 4th and remained open to May 5, 2023; plenty of time I trust for folks who wanted to perform the test to listen and give me their results. As indicated in that invitation post, this test was created as a response to discussions on a message forum where it was said that 24-bit audio could be audibly differentiated from 16-bit files. It was offered in the discussions that the Daft Punk track "Giorgio by Moroder" from Random Access Memories, an album well-known to audiophiles, is an example of a well-produced, modern recording that benefits from 24-bit "high resolution". For years, it has been available for download at places like HDtracks and Qobuz.

In this post, let's discuss how the test samples were created, reveal which sample is the high resolution version, and go through the results from the respondents. Grab a drink, let's have a peek... ;-)

0. Let's talk about how I created the two samples for testing.

I started with the 24-bit official hi-res download of Random Access Memories off HDtracks at 24/88.2, and selected a portion about 2:30 in length consisting of a mixture of human voice, background incidental sounds, and music. That should be long enough for adequate A/B testing. It was very convenient that this track had such diversity!

I then used iZotope RX 10 Advanced, a well-respected and mature audio editing/restoration package, to downsample from 88.2kHz to 44.1kHz and dither to 16 bits with settings I use regularly.

Here's a look at the iZotope "Module Chain" along with specific settings applied in this test:

iZotope module chain consisted of the "Resample" and "Dither" functions.

As you can see, I applied steep, linear phase resampling (200 steepness, 1.00 pre-ringing which is linear phase), and the dithering applied was a "Normal" amount with a gentle "Lightest" level of MBIT+ noise shaping which provides subtle improvement to the noise floor at the expense of slightly higher noise level <1kHz and >15kHz; a very reasonable compromise given human auditory sensitivity. Noise shaping is commonly used in commercial CD releases, for example Sony's Super Bit Mapping, which I believe applies a stronger amount of noise shaping than this.

You might be wondering why I downsampled to 44.1kHz. First, this is to reduce file size for download by 50%. Furthermore, I was able to hide the fact that a small amount of noise shaping was added to the dithering process. Random Access Memories is basically a 48kHz recording as it has little musical content >24kHz and the noise shaped dithering would have shown up quite obviously in the 16-bit version if I didn't downsample.

A peek at what the original 24/88.2 FFT looks like during quieter portions of the music.

This is a bit-depth resolution blind test; therefore, as long as we keep the samplerate constant, whatever difference heard would be a result of the change from 24-bits to dithered 16-bits.

One more thing. In order to further keep the test "blinded", I converted the 16-bit dithered file back to 24-bits so both files would be feeding the DAC with 24-bit data and the 16-bit version would be padded with "0"s in the lowest bits. Plus, I added a very small amount of white noise in the 24th (least significant) bit of the 16-bit version. The reason I did this was to prevent someone from easily cheating by loading up the file into an audio editor like Adobe Audition and looking at the amplitude statistics:


In an open, Internet-distributed, blind test like this, it's important to close off potentially easy ways to "cheat" without affecting audibility. As you know, state-of-the-art DACs these days can achieve dynamic range up to 21-bits so the 24th-bit noise would be well below the noise floor for any consumer/audiophile device.

One other good "side effect" of adding that very low level 24th bit noise to the 16-bit file is that the FLAC algorithm has a harder time with lossless compression, resulting in larger file size. It would be very easy to just look at the file size of the 2 samples and say the 16-bit one takes up less space!
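To make the padding giveaway concrete, here is a minimal sketch in Python. It is not the actual processing used for the test files (that was done in iZotope RX and the exact noise method isn't spelled out above); the function names and the random-bit approach are assumptions chosen purely for illustration. It shows why a 16-bit file padded to 24 bits is trivially detectable, and how randomizing only the lowest bit hides that.

```python
import random

def pad_16_to_24(samples_16):
    """Place signed 16-bit samples in a 24-bit container; the low 8 bits become zero."""
    return [s << 8 for s in samples_16]

def add_lsb_noise(samples_24, rng):
    """Hypothetical stand-in for the 24th-bit noise: randomly set only the lowest bit."""
    return [s ^ rng.getrandbits(1) for s in samples_24]

def low_byte_always_zero(samples_24):
    """True if the lowest 8 bits of every sample are zero (the padding giveaway)."""
    return all((s & 0xFF) == 0 for s in samples_24)

rng = random.Random(0)
original_16 = [rng.randint(-32768, 32767) for _ in range(1000)]  # stand-in audio data
padded = pad_16_to_24(original_16)
noised = add_lsb_noise(padded, rng)

print(low_byte_always_zero(padded))   # plain padding is easy to spot
print(low_byte_always_zero(noised))   # the LSB noise hides it
```

The change is at most one 24-bit step per sample, roughly 144dB below full scale, which is why it should have no bearing on audibility.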

As far as I can tell, there was no "cheating" going on that would grossly impact the results, although I know some folks analyzed the files with various tools over the last couple of months.

Whenever we do blind/controlled testing, it's essential that the amplitude be matched. No problem with this test:


Everything is an exact match with 0dBFS peaks and average -11dB RMS amplitude. This also allows the user to try something like the ABX Comparator Foobar plug-in for testing. While the average DR is measured at 8, as I mentioned in the test invitation, the first part of the track where Moroder is speaking and you hear some incidental noises in the background (restaurant noise and chatter) actually averages DR11.

Here's the unblind: Track A = 24-bits, Track B = 16-bits.

To keep this as "blind" as possible, I did not monitor the results during the course of the survey (heck, I was busy going away for a few weeks anyways ;-) just in case I might say anything in the forums to sway respondents. I trusted that there were no major issues...

So, with that preamble, let's get to the listening test results!

I. Who responded?

In total, I received 121 responses for this blind listening test. This is pretty normal based on previous experience. While I can obtain >1000 responses for a detailed survey like last week, once it gets more complicated, the numbers drop off given the "work" that's needed to load up the files and take the time to listen. I appreciate the work that everyone put into this; thanks to those who also sent me their ABX logs and detailed descriptions of multi-sitting testing regimens!

As you can see in the map above, most respondents were from North America or Europe. The top 10 countries were USA (26), Canada (13), UK (11), Germany (7), Netherlands (6), France (5), Switzerland (4), Russia (4), Italy (4), and Czech Republic (4).

I suggested that respondents could tell me about their systems and they were very open with sharing this information. Let's have a look at the kinds of hardware listeners used:

Headphones: Earsonics Corsa, Sennheiser HD650, Etymotic ER4XR, Dunu Vulkan, AKG K371, Beyerdynamic T1, Dan Clark AEON Noire, Westone W80 V2, TRUTHEAR x Crinacle Zero, HiFiMan Arya Stealth, Dan Clark Ether CX, Sennheiser HD600, Sennheiser HD580, AKG K702, AKG K240, Sennheiser HD800S, HiFiMan HE400i, Sony MDR-7506, Sennheiser IE 600, Meze Liric, Sennheiser HD 6XX, Philips SHP9500, Denon AH-D7000

Speakers: Canton Reference 3.2.DC, ScanSpeak Revelator custom build, Dynaudio LYD 48 + Dynaudio 18 sub, Magnapan MG 1.7, Q Acoustics 3050, KRK RP-5 G2, Dynavoice Definition DF-5, Martin Logan Aerius, KEF Reference 207/2, Kii THREE, B&W DM603 S2, Linkwitz LX521.4, Elipson Facet 6BT, Wharfedale Diamond 12.4, ELAC FS 409, Genelec 8010A, ELAC BS41, ProAc DT8, KEF LS50 Meta

Amps (headphone and speakers): Hidizs S9 Pro, Burson DAC/headphone amp, Audiolab 6000A, Electrocompaniet ECI 5, Denon PMA-707, Meier Audio Corda Classic, NAD C316BEE, BVAudio PA300SSE, Topping L30 II, Anthem STR, Levinson No.23.5, Bryston 4B, Hypex NC252MP, iBasso DX200, JDS Labs Element III, Luxman L-505, Sound Blaster X7 LE, Cambridge CXA 80, SMSL SP400, NuPrime IDA-8, Hypex NC400, Purifi Eigentakt, Topping LA90, Schiit Jotunheim, Schiit Heresy, NAD C298, Graham Slee Solo Ultra-Linear DE

Streamers/DACs: Bluesound Node2, Auralic Vega DAC, Innuos Zenith Mk2, Topping E50, Cambridge CXN v2, Mojo 2 + Poly, FiiO X3 GenII DAP, Audiolab M-DAC, Topping D10s, Topping D70, Topping D30, SMSL DO100, Bluesound Node2i, Linn Numerik, Schiit Bifrost 2/64, Cambridge CXN v1, Soncoz SGD-1, MOTU M2, miniDSP 4x10 HD, Logitech Transporter, SOtM sMS-200, SMSL M300 Mk2, Rockna Wavelight, Fosi Audio Q5, Dragonfly Red, Mola Mola Tambaqui, SMSL Sanskrit 10th MkII, RME ADI-2 DAC FS, E1DA 9038D6K, Oppo BD-105D

I think that captures most of the reported gear without repeating the same items. Topping and RME DACs, Dan Clark headphones, and Sennheiser HD*** products were quite popular. There were some DIY speakers and amplifiers as well that did not have enough of a description for me to list. I think it's fair to say that the respondents are hi-fi enthusiasts running this kind of hardware!

I didn't see anyone testing with Apple EarPods/AirPods or Beats headphones this time around. ;-)

II. Overall, which of the two tracks did listeners prefer? And how many had no preference?

Wow. Look at that. Almost evenly divided into 1/3's! 38 preferred Sample A (24-bits), 40 preferred Sample B (16-bits), and 43 had no preference of one being "better" than the other.

Obviously with numbers like that, we're not going to achieve statistical significance between those who chose Sample A and Sample B. Already we can surmise that even if there is an audible effect between 16-bits and 24-bits, a large number - 36% - felt that whatever difference they heard, if any, did not warrant a preference.

III. How much difference did the listeners find?

Let's see if we can go deeper by asking listeners to tell us how much difference they thought they heard:

We already know that about 1/3 reported no preference above, so it's no surprise that on a 0-10 scale where 0 is "no difference" and 10 is "obvious difference", the results would be skewed towards 0. Notice that the number who chose "0" was 27%. Since 36% indicated that they had no preference, this means that 9% heard differences that were not significant enough for them to pick one they thought sounded "better". This is not unexpected. IMO, often even between something like amps, one might have the impression of audible differences, but it could be hard to choose one as a clear "winner".

Of the 64% of listeners who chose either Sample A or Sample B, let's look at the distribution of how much difference they subjectively heard:


Good to see that there's internal consistency as none of the ones who made a choice between A or B chose "0". Notice the skew towards the left though, as 66% scored their perception of audible differences at <5/10 and the weighted average was below 4/10.

Some reported hearing more of a difference hence the "bump" at 7-8/10, and another group at 10/10. Let's have a look at those who were more confident and chose 6+/10. Given the subjective impression of hearing more difference, which of the samples did these 22 respondents think sounded "better"?


That's interesting! Those who thought they heard more of a difference (6+/10), actually preferred the 16-bit dithered version. I found this surprising.

Statistically, assuming a 50:50 probability for either Sample A or B being chosen if random, the chance of achieving at least 14/22 preferring the 16-bit sample is 14% (p=0.14). Not "statistically significant" (p < 0.05 is typical threshold), but that's certainly an interesting trend which could have been significant if there were more samples and this proportion held.
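For anyone who wants to check that figure, the one-sided binomial tail can be computed in a few lines of Python. This is just a sketch of the arithmetic described above (the function name is my own), assuming each of the 22 respondents independently picks A or B with equal probability under the null hypothesis.

```python
from math import comb

def binomial_tail(k, n, p=0.5):
    """Probability of k or more successes in n independent trials with success chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# At least 14 of the 22 "6+/10" respondents preferring the 16-bit sample by chance
p_value = binomial_tail(14, 22)
print(f"{p_value:.3f}")  # 0.143
```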

IV. Between headphone and speaker listeners, any difference?

For this test, I see that slightly more listeners (53%) used wired headphones:

This is reasonable given that I think most audiophiles know by now that 24-bit vs. 16-bit would not be an easy test and headphones can be more intimate and resolving for these kinds of assessments.

One user used wireless headphones, reported that it was "very hard to tell difference", and selected Sample A as "better", which was indeed the 24-bit sample.

So, what were the preferences for headphone vs. speaker listeners?


Again, interesting! The speaker guys were pretty balanced in choosing either the 24-bits or 16-bits sample, but it was the headphone users who pushed that tendency towards picking the 16-bit sample as sounding "better". Also, many more of the headphone users conceded that they had no preference.

I suppose we could speculate on a number of hypotheses as to why this happened. For now, let me just leave you with the numbers to consider.

V. What characteristics of the sound seemed to change between the two samples?

If we look at just the 38 respondents who preferred the hi-res 24-bit sample (Sample A), were there any specific sonic characteristics (e.g. noise level, bass quality, soundstage...) the listeners thought changed?

As before, we're using a 0-10 scale going from "no audible difference" to "substantial" audibility with each color representing specific sonic characteristics for the listener to consider: noise level, vocal quality, bass quality, treble quality, soundstage, and perceived resolution. As you can see, differences were not identified as large given the skew towards the left.

Of the characteristics surveyed, notice that few thought "NOISE level" changed much between their preferred 24-bit file and the 16-bit sample. Note the very low weighted average of only 0.47/10 for this characteristic. Since in principle it's the noise floor (and dynamic range) that changes most when we go from 24-bits down to 16-bits, the implication here is that noise level / dynamic range of a 16-bit file was clearly adequate when listening to the Daft Punk track for those who still preferred the 24-bit sample.

Of the other characteristics, the sense of "perceived resolution" variation (weighted average 2.89) was rated slightly higher than the others.

On the other hand, how about the 40 respondents who preferred the 16-bit sample? What characteristic differences did they describe hearing?

Again, the listeners did not indicate hearing that the noise floor changed much between 24 and 16-bits. On average, these respondents thought there was more of a change in "perceived resolution" even though the preference was for the dithered down 16-bit sample.

As an exercise in consistency, let's look at the profile of folks who had no preference for which sample sounded better (n=43):

As expected, these respondents generally did not hear a difference between the two samples (lots of "0"). However, there were "blips" suggesting maybe they still thought they could hear audible differences here and there, but ostensibly not enough to declare either the 24-bit or 16-bit file "better" sounding.

VI. Were there any golden ears?

As with my other blind tests, I look for evidence of extraordinary individuals. :-)

There were 2 respondents out of the 121 who selected Sample A (24-bits) as preferred and at a level of certainty of 10/10! 

One person submitted the response from Latvia but did not leave any details for me to report on unfortunately. The other person submitted from Slovakia, assessed the sound using headphones, stating that he (presumably male) heard significant differences in vocal, bass, treble, soundstage, and resolution, and left a detailed message (emphasis mine):

Hi,
I am a highly non-standard participant. Since you don't know me, I'll write something about me. My lineup is extremely beyond anything that people can even imagine.

I mean the quality of the reproduction, not the price. My hearing is extremely superior to anything humans can imagine. To eliminate this, I first used commonly available, not expensive, but high-quality headphones. The differences were immediately and easily audible. Then I used the best audio system I have. The differences are huge. It's like comparing a VW versus a Ferrari. Space, reverberation, separation of tones, airiness, impact and so on.

I can bet a million dollars that if someone plays me one of those songs, I'll tell them correctly which one is playing. This implies that you chose an excellent sample, or that the dither was not optimal.

So that you can verify me in the future, my digital ID is HEEKH44A-8W7SUOVFVR3SD7QQDN4KLKF2. It's the equivalent of the public key in the communications app I use.

If you want, you can publish my answer.

P.S. A is absolutely unquestionably sonically superior to B. Period.

What can I say!? Assuming we're not looking at a very sophisticated AI analyzing the resolution of the data (can't be sure these days!), the man's got cojones and he gave me permission to publish the comment. Well, he's right. I therefore award this respondent the "Golden Ears Award". Great job!

Unfortunately he did not add a full description of what hardware he used (other than "high-quality headphones") for the listening test so maybe he can leave a comment below along with some tips as to whether there were portions in the music he specifically used to differentiate the samples.

Congrats! In recognition of our golden ears! :-)

"I can bet a million dollars that if someone plays me one of those songs, I'll tell them correctly which one is playing." is a pretty bold statement, so someone might want to take up this wager. :-)

Now to be balanced, I looked in the data and found 2 listeners who preferred the 16-bit sample and also rated the audible difference as 10/10. One thought that the 24-bit version had "irritating, rough edges" when listened to using high quality IEMs. Another person works as an audio engineer, used very high quality speakers that cost close to US$15k/pair, and felt that the subjective sense of "resolution" was better with the 16-bit sample.

Conclusion...

Overall, looking at the results as a whole, I simply could not find evidence that there was a bias among the 121 audiophile respondents towards the 24-bit sample using this Daft Punk Random Access Memories track compared to the iZotope RX dithered 16-bit version.

If anything, there seemed to be an interesting tendency for the headphone listeners to choose the iZotope RX 10 dithered 16-bit file as the "better" sounding one, although this was not statistically significant. I can only vaguely speculate as to why this was the case. (Interestingly, in the 24-bit vs. 16-bit test in 2014, we saw one track where listeners also seemed to have a slight preference for the 16-bit version.)

Even when listeners felt they could hear a change, their ratings for magnitude of difference tended to be small, with few listeners rating >6/10 (where 10/10 was defined as "obvious difference"). Audible differences therefore were likely better described as subtle than overt.

As usual with previous blind tests, within the data set, we can find the rare "golden ears" as per Section VI. It's good to find traces of these individuals with potentially very sensitive hearing since these are the folks who should be invited for further study and to ascertain the reliability of their hearing abilities.

In this blind test, the music used is quite different (modern electronica / pop), and the procedure has changed (much simpler with 1 track, and an option for "no difference" heard) compared to our classical music 24-bit vs. 16-bit test done in 2014 using 2L demo tracks. Nonetheless, results remain similar in that in the big picture, there was basically no audible preference for the higher resolution 24-bit version.

While intellectually we can still desire the "best" 24-bit resolution file we can get for our favourite albums, this test adds to the body of evidence that suggests the absence of audible benefits with "high-res" audio. Other tests like Mark Waldrep's comparing not just 24-bits but also increased samplerates (2020) have not been able to ascertain meaningful audible superiority for hi-res (also see the meta-analysis from Reiss in 2016, not impressive results as discussed previously).

I think over the course of time as we audiophiles have had more experience with hi-res content, the suggestions to be critical of expectations (such as this from 2014), as well as other discussions we've had on the blog, have proven to be true. I remain "post hi-res" these days and keep 24-bit tracks only for the best sounding albums I own. While I don't think most albums need to be >16-bits, high-resolution 24-bit DACs can still be valuable when we do things like perform DSP and apply gain control. Likewise, in audio production 24/32-bits should be standard these days to achieve the best final mix with least unintended distortions.

For those who might have missed it from the original test invite, here's the 8-bit version of the Daft Punk sample for a listen. Notice the obvious difference between 8-bit and the 16/24-bit versions. This is what lower bit-depth sounds like. There is an increase in the audible noise level as the dynamic range drops, with no significant change in frequency response, general musical tonality/timbre, "speed" of the sound, bass content, or soundstage.


We can see in the graphic above the dithered noise floor comparing 24-bits, 16-bits, and 8-bits. The non-flat 16 and 8-bit noise floor is reflective of the small amount of noise shaping added using iZotope RX in this test.

Finally, a little Public Service Announcement for audiophiles and some practical considerations because ultimately, bit-depth is about dynamic range. And when we're listening to music, this correlates to the range of sound pressure levels (SPL) we can hear:

Make sure not to listen too loud! Protect your hearing in order to achieve a lifetime of hi-fi enjoyment.

For my own listening, I don't push the average sound level above 80dB(A) SPL. In the quietest domestic sound rooms, we'll be very fortunate to achieve 25dB SPL ambient noise. IMO, 100dB peak SPL for transients is more than enough for enjoyment if your speakers are able to hit this level without gross distortions.**

Assuming we listen at an average level of 80dB SPL, a 100dB peak would be 20dB above that, adequate for some of the most dynamic classical recordings. This implies a dynamic range of ~75dB in basically an ideal setting (100dB peak - 25dB ambient noise). In lesser settings like a car, a 50dB range might be all you can comfortably achieve, which is roughly what the 48dB dynamic range of 8-bit digital provides.

** Note 1: We can estimate how much power is needed to drive our speakers. If we want to achieve 100dB SPL from a speaker with 90dB/W/m in-room sensitivity, allowing for 3dB amp headroom so there's no clipping, seated a comfortable 3m/10' away, we'll need around 180W. Since the decibel scale is logarithmic, 93dB/W/m speakers would need half that, 90W, to hit 100dB. Here's a calculator to play with. This is just an estimate though, so I would still recommend doing your own measurement.
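The arithmetic behind Note 1 can be sketched in Python as follows. This is my own simplified model, not necessarily what the linked calculator does: it assumes a single speaker, simple inverse-square distance loss (6dB per doubling of distance) and no room gain; the function and parameter names are made up for this example.

```python
import math

def required_amp_power_w(target_spl_db, sensitivity_db_1w_1m, distance_m, headroom_db=3.0):
    """Estimate amplifier power (watts) needed to reach target_spl_db at the listening seat."""
    # Level lost between the 1m reference distance and the listener (inverse-square law)
    distance_loss_db = 20 * math.log10(distance_m)
    # Gain needed above the 1W reference, including headroom against clipping
    gain_db = target_spl_db + distance_loss_db + headroom_db - sensitivity_db_1w_1m
    return 10 ** (gain_db / 10)

print(round(required_amp_power_w(100, 90, 3)))  # 180
print(round(required_amp_power_w(100, 93, 3)))  # 90
```

In a real room, boundary reinforcement and a second speaker usually reduce the power needed, which is another reason to measure rather than rely on the estimate.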

Note 2: See CDC chart on loudness and risk to hearing. A 50-minute exposure at 95dB SPL can lead to damaged hearing. Constant 100dB SPL exposure risks hearing loss after 15 minutes.

Comparatively, the best dynamic range that vinyl LPs achieve would be around 70dB assuming an excellent cartridge and supporting playback gear, which is why we can hear the noise floor with vinyl playback. Don't believe the idealistic vinyl lovers who say they "never" hear noise or hiss - they're either exaggerating, listening at below-normal levels, have poor hearing, or are listening in a room with high ambient noise.

As a consumer product, 16-bit digital audio data (CD-resolution, download files) already provides 96dB dynamic range for audio engineers and musicians to exploit. As audio consumers, I think we should be very thankful that Sony/Philips chose to standardize on 16-bits when they created the CD!
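The 48dB and 96dB figures follow from the usual rule of thumb of roughly 6dB of dynamic range per bit. Here's a small Python sketch of that calculation, compared against the ~75dB listening window estimated above. It is a simplification: it ignores the small penalty from dither and the perceptual benefit of noise shaping, and the names are my own.

```python
import math

def pcm_dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM: 20*log10(2**bits), about 6.02dB per bit."""
    return 20 * math.log10(2 ** bits)

for bits in (8, 16, 24):
    print(bits, round(pcm_dynamic_range_db(bits)))  # 8 48, 16 96, 24 144

# Listening window from the text: 100dB SPL peaks over 25dB SPL ambient noise
listening_window_db = 100 - 25
bits_needed = math.ceil(listening_window_db / pcm_dynamic_range_db(1))
print(bits_needed)  # 13
```

By this rough estimate, 16-bits already leaves about 20dB of margin beyond what an ideal domestic room can use.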

--------------------

For the record, to be fair, since I ask others to submit their details, here's my Foobar ABX result listening with Sennheiser HD800 headphones, using the Drop+THX AAA 789 amp, fed by Topping D10 Balanced DAC:
foo_abx 2.1 report
foobar2000 v1.6.16
2023-04-05 17:35:44

File A: 01 - Daft Punk - A - Giorgio by Moroder (16-bits or 24-bits).flac
SHA1: 02d9134e168d7946004a9be93e8c4477544e415a
File B: 02 - Daft Punk - B - Giorgio by Moroder (16-bits or 24-bits).flac
SHA1: fc5e0adac56cd504c7fd9c48acfe00fec593680b

Output:
Default : Speakers (2- TOPPING USB DAC)
Crossfading: NO

17:35:44 : Test started.
17:36:27 : 01/01
17:37:07 : 01/02
17:37:44 : 01/03
17:38:15 : 02/04
17:40:11 : 02/05
17:40:40 : 03/06
17:41:36 : 03/07
17:42:17 : 04/08
17:43:04 : 04/09
17:44:03 : 04/10
17:44:47 : 05/11
17:45:52 : 05/12
17:46:38 : 06/13
17:47:47 : 06/14
17:48:40 : 07/15
17:49:13 : 08/16
17:49:13 : Test finished.

 ---------- 
Total: 8/16
p-value: 0.5982 (59.82%)

 -- signature -- 
5c15a09ccd0957c61dd8b97967690986d680751a

Statistically, the result is literally a "flip of the coin" and I would have selected "No preference" for this Daft Punk track as I'm well aware that I was basically guessing.
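The p-value in the ABX log is the chance of getting at least that many trials right by pure guessing. A minimal Python sketch (my own helper, not the foo_abx source) reproduces the number and also shows how many correct answers out of 16 would have been needed to get below the usual 0.05 threshold:

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of scoring `correct` or more out of `trials` when guessing 50:50."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(f"{abx_p_value(8, 16):.4f}")  # 0.5982

# Smallest score out of 16 that would reach p < 0.05
min_correct = next(c for c in range(17) if abx_p_value(c, 16) < 0.05)
print(min_correct)  # 12
```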

---------------


IMO, the whole 24-bit vs. 16-bit debate is quite clearly resolved. To me, where this still might matter is with audio streaming. While I am a proponent of lossless PCM since there could still be substantial differences compared to lossy even if psychoacoustic techniques minimize audibility, I'm not sure whether 24-bits or >48kHz is worth it given the increased data rate. Another of those "diminishing returns" situations. If streaming lossless 16/44.1 or 16/48 allows a company to lower costs for consumers while investing in more multichannel content, I think that would be a meaningful path.

[Although we have been waiting quite a while already, I am curious what direction Spotify HiFi might take when it eventually becomes available, or if TIDAL might make changes as it sheds MQA and evaluates the tier pricing.]

I see that Random Access Memories 10th Anniversary is out now, mainly featuring some bonus and outtake tracks. I've also recently finally heard the lossless Atmos version of Dark Side of the Moon (50th Anniversary) - the spatial effect is fantastic on a multichannel Atmos system; instruments and sound effects like the clocks ("Time") and cash register ("Money") seem to emanate from and stay anchored in defined space, floating at various height levels.


Also catching up on some Enigma recently with the most recent release The Fall of a Rebel Angel (2016, DR8). Some pretty cool tracks on there like "Mother" (low bass), "The Die Is Cast" (neat sounds and special effects) and "Amen" (pretty cool tune). "Sadeness, Pt. 2" utilizing Bach's "Toccata and Fugue in D minor" sounded a bit silly to me though, and doesn't rekindle how special the first "Sadeness/Find Love" combo sounded on MCMXC A.D. back in 1990.

Hope you're all enjoying the music!


Addendum: May 15, 2023
Just thought I'd show a comparison of the 16-bit noise floor when we use Audacity 3.3's "Shaped" dithering option going from 24-bit to 16-bit as per the comment from Stephan_M:


Notice that Audacity's "Shaped" setting uses stronger noise-shaping than the iZotope RX 10 setting I applied for this test. With noise-shaping, there's a give-and-take. For Audacity, they've prioritized <6kHz to achieve the lowest perceived noise floor at the expense of higher noise further up, especially above 13.5kHz.

35 comments:

  1. Fascinating considering I upgraded the computer I did this on and lost files. Can't recall but think I chose 16 Bit on headphones. Anyway suffice to say in terms of 24 bit I'm like you and use for best albums. In terms of remastered albums at 24 bit these are different as one can hear changes more easily although some are better and some worse so also a mixed bag.

    Your last piece on noise levels is very interesting. I'm like you run my home system at around 78 to 85 db. We attend live music as often as we can. In terms of live small venue noise most blues and rock currently run at 89 to 110 db.

    Considering I suffer industrial hearing loss I find this too loud for comfort. It seems old blues rock guitarists(in particular) in my view must be suffering hearing loss feel we like it louder. My complaining falls on deaf ears and subsequently attend less and less.

    Attending a Beth Hart concert 5 years ago the levels were around 120 db. Left after 20 minutes. Average age was about 60 years, females were wincing. Beth's acoustic set was great but on the band arrival onstage the volume went through the roof.

    The two best concerts(large venue) for sound quality and levels were Joan Baez and Leonard Cohen I've attended in the last 10 years.

    Replies
    1. Thanks for the note Robocop, and highlighting the discussions on SPL.

      Admittedly in my younger days (20s and 30s), I attended more pop/rock/blues concerts. These days, it's mostly for "nostalgia" or with family/friends that I attend live concerts (like say Elton John last year, or Ed Sheeran later this year with my kids), other than classical (have Vancouver Symphony season tickets).

      I generally find live amplified concerts too loud and really hate the ringing and attenuated hearing for a couple of days after these kinds of concerts. Starting to bring ear plugs although it just seems silly paying good price on tix but have to literally put a barrier in place between what I paid for and my senses! Oh well. I've encouraged my kids to make sure to have hearing protection as well and they generally agree that it's also too loud for them.

      Agree that many remastered 24-bit recordings are a mixed bag. I try not to get my hopes up unless trusted friends and other audiophiles have had a listen and say good things a lot of the time before considering to purchase...

  2. Hello Archimago,

    Since the results are already published it should be fine to talk about this now. Two years ago I published a software that can check whether a high bit-depth file is dithered from a lower bit depth or not. The initial version is already capable of detecting the files you uploaded. Also, I released a new version last week with even more detections:
    https://hydrogenaud.io/index.php/topic,114816.msg1026786.html#msg1026786

    Replies
    1. Very cool Bennet,
      Gave the tool a try and indeed the 16-bit version using oldsCool.exe showing the "LSB Trip: 256" comment.

      Thankfully I didn't see any respondents admitting to using the tool with the blind test at least. ;-)

      Cheers!

    2. Hi Bennet!
      You created a great program!
      However, if a 44/16 original audio file is upsampled to 96/24, the program does not help.

    3. Thanks fgk,

      Resampling would be detected by FFT analyses and there are already a lot of existing software capable of doing this. Naturally, the process of resampling requires a higher bit-depth than the original source and therefore bit-depth (the most extreme case would be conversions between DSD and PCM) can only be expressed in terms like ENOB vs frequency bandwidth instead of exact sample values.

      There are also algorithms like DSEE HX capable of generating nice-looking spectrograms so even traditional FFT analyses cannot tell if the files are upsampled or not.

  3. Damn I don't remember what I answered LOL.
    Pity that I didn't receive any summary @ email after answering ;)

    Replies
    1. Hey Milan,
      E-mail me and I'll look in the data for you. ;-)

  4. In view of the fact that more and more artists and streaming services are offering their music in high definition, and this test clearly showed that 95% of people don't care whether the music is being offered in 88.2kHz/24bit or 44.1kHz/16bit , I would like to shrink my music collection. Unfortunately I only have Audacity as a program. iZotope RX10 is definitely too expensive for me at $1200.
    I probably wouldn't even be able to use it.
    If you could create a guide on how to do this that would produce similarly good results with Audacity I would be very grateful.

    Replies
    1. Hi Stephan_M,
      I'm not an Audacity guru but what you do is dither down to 16-bits as per this, selecting the "Shaped" dither if desired:
      https://manual.audacityteam.org/man/dither.html

      And then export the audio ("File > Export > Export Audio...") as FLAC with 16-bits. Lemme have a peek at what Audacity shaped dither setting does later...

      Check this article out for Audacity batch processing when you have directories of files:
      https://garrysblog.com/2021/04/12/batch-processing-files-with-audacity/
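
For readers curious what "dither down to 16-bits" means at the sample level, here is a minimal Python sketch of plain TPDF (triangular) dither. It is an illustration only: it is not Audacity's implementation, it does not include the noise shaping of the "Shaped" option, and the function name `dither_to_16bit` is made up for this example. Input samples are assumed to be floats in the range -1.0 to 1.0.

```python
import random


def dither_to_16bit(samples, seed=0):
    """Quantize float samples (-1.0..1.0) to 16-bit integers with TPDF dither.

    Hypothetical helper for illustration; no noise shaping is applied.
    """
    rng = random.Random(seed)
    out = []
    for x in samples:
        scaled = x * 32768.0
        # TPDF dither: the sum of two uniform values, each spanning +/-0.5 LSB,
        # gives a triangular distribution spanning +/-1 LSB.
        noise = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        q = int(round(scaled + noise))
        # Clamp to the legal signed 16-bit range.
        out.append(max(-32768, min(32767, q)))
    return out


if __name__ == "__main__":
    print(dither_to_16bit([0.0, 0.25, -0.5, 0.1]))
```

The dither randomizes the rounding error so that it behaves like a steady, low-level noise instead of distortion correlated with the music.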

    2. Just put in a "bonus" graph Stephan to show the noise floor with "Shaped" dithering in the current version of Audacity.

    3. If there are many files to convert, I suggest using SoX.
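
A rough sketch of what a SoX batch conversion could look like, driven from Python. The assumptions here: SoX is installed and on the PATH with FLAC support, the target is 16-bit/44.1kHz, and the helper names `build_sox_command` and `convert_tree` are invented for this example. The SoX options used are `-b 16` (output bit depth), `rate -v` (very high quality resampling), and `dither -s` (noise-shaped dither); adjust to taste.

```python
import subprocess
from pathlib import Path


def build_sox_command(src, dst, rate=44100, bits=16):
    """Return the SoX argument list for one file (hypothetical helper)."""
    return [
        "sox", str(src),
        "-b", str(bits), str(dst),   # output bit depth, then output file
        "rate", "-v", str(rate),     # very high quality resampling
        "dither", "-s",              # noise-shaped dither
    ]


def convert_tree(src_dir, dst_dir):
    """Convert every .flac under src_dir, mirroring the folder layout in dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    for src in sorted(src_dir.rglob("*.flac")):
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # check=True raises an error if SoX fails on a file.
        subprocess.run(build_sox_command(src, dst), check=True)


if __name__ == "__main__":
    # Print the command for one file without running SoX.
    print(" ".join(build_sox_command("in.flac", "out.flac")))
```

Writing to a separate output folder keeps the 24-bit originals untouched until the results have been checked.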

  5. Interesting. I had a very slight preference for A, so I will probably continue to go ahead and buy 24-bit recordings just because the price difference is usually small and storage is cheap. (Also, BIS and other labels release MCH on 24-bit only.)
    Why not? I grew up when recordings and playback were decidedly inferior to my perception/discrimination and I'm pleased to think that sources are now better than what I can distinguish. Limiting factors are now mostly transducers (speakers and microphones, the latter not being discussed enough imho.)

    Replies
    1. Greetings Phil,
      Yeah, I think transducers (and rooms) have been for a long time and likely will remain in the foreseeable future the most challenging fidelity-limiting products in our home systems.

      I think if one is buying music like BIS Records' classical material known to be recorded and produced to high standards, absolutely... Get the 24-bit version! Can't really go wrong and it'll support the artists.

      More run-of-the-mill rock, pop, R&B, dance, electronica, indie, metal, etc. IMO more likely than not does not require hi-res. I maintain the general rule of thumb that if the DR is not ≥12, I probably won't seriously consider keeping it at 24-bits. Silly and counterproductive I think to keep things at 24-bits if on purpose the recording has been squashed of dynamics and we know that the change from 16-bits to 24-bits is to improve potential dynamic range!
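
To put rough numbers on "potential dynamic range", here is a small sketch using the textbook approximation for an ideal, undithered quantizer with a full-scale sine wave: SNR ≈ 6.02 × bits + 1.76 dB. This theoretical figure is a different quantity from the "DR" value reported by the DR meter, and the function name is made up for this example.

```python
def theoretical_dynamic_range_db(bits):
    """Ideal quantizer SNR for a full-scale sine wave, in dB (textbook approximation)."""
    return 6.02 * bits + 1.76


if __name__ == "__main__":
    for bits in (16, 24):
        print(f"{bits}-bit: about {theoretical_dynamic_range_db(bits):.0f} dB")
```

That works out to roughly 98 dB for 16-bit and 146 dB for 24-bit, both far more than a recording with a DR of 8 makes use of.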

      BTW, this of course includes stuff from Neil Young because he's been such a vocal advocate of hi-res for years ;-). I checked out the 24/192 download of the 50th Anniversary Harvest this weekend. No question about it... That is not "high-res"-worthy and can easily be converted to 16/48 without fear of missing anything whatsoever. (Heck with some of the ultrasonic noise, it might actually sound better to filter that stuff out just in case nonlinear distortions might cause effects in the audible frequencies.)

  6. Interesting results! FWIW, the correct expression is "the man has cojones" (not "cajones"). The word cojones means "balls" in Spanish.

  7. Hi Archimago. I may have to relinquish my ‘platinum ears’ that you awarded me in 2017 since I voted for B… Probably a time limit in this department. :-(

    If I may suggest a psychoacoustic reason for the slight prevalence of B choices, it may simply be the normal order of listening: I didn’t know this music, so I sampled it from the A version, and when listening afterward to B, I noticed some more details, thus finding it somewhat better. Then I got stuck in preferring B with successive comparisons.

    So, in order to not cheat, I submitted my result, and then tried a foobar ABX test. I got twice 6/10 for B (p-value 0.377), so not significant but still consistent with my choice. After that, I subtracted both files in Audacity, got a straight line silence (as expected) then normalized that to 0 dB and got a very smooth white noise with no musical content…
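
The p-value quoted for a 6/10 ABX score can be reproduced with a one-sided binomial calculation: the chance of getting at least that many correct by pure guessing, with a 50% chance per trial. A minimal sketch follows; the function name `abx_p_value` is made up for this example.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of scoring at least `correct` out of `trials` by guessing (p = 0.5)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    print(round(abx_p_value(6, 10), 3))  # 386/1024, i.e. 0.377
```

For comparison, 9/10 correct would give a p-value of about 0.011 under the same assumption.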

    Regarding Dark Side of the Moon, I don’t know if you are aware that the streaming Atmos version was available on Amazon and Tidal only on the launch date, after that it was pulled, and many suggested that Apple paid for an exclusive right. I had a chance to listen and even the lossy version sounds extraordinary, especially the guitar solo in Money that seems to come from everywhere at once! Shame on Sony to do such a thing, use the streaming version mostly as a teaser to buy their overpriced set. As someone who cares about value for the money, I guess someone lent you the Blu-ray… ;-)

    Replies
    1. Hey Gilles,
      Sorry to hear about the expired "platinum ear" card ;-).

      Yeah, the position of which track is A/B can have a "precedence" effect depending on psychological state; usually with a slight preference to "B".

      Interesting about the streaming version of DSOTM pulled except Apple. I wonder what the stats look like for multichannel/Atmos streaming demand. Might be meaningful if Apple is pushing for exclusivity in this way.

      Better luck next time for the Golden Ear award friend. ;-)

  8. Ok, so the bottom line is if you claim you can hear a big difference between a 16/44 and a 24/96 file, probably, in the words of the Rolling Stones "It was just my imagination/Running away with me." I can accept that. I sometimes think I can hear some differences, but nothing I would characterize as significant, and I'm open to accepting that it's merely a placebo effect.

    Replies
    1. Greetings Phoenix,
      Yup. Bottom line. At best it's subtle (except for the Golden Ears winners of course ;-).

      At least with Random Access Memories, 24-bit resolution conveyed no real benefits.

  9. Great investigation as always, Archimago! I found it interesting that the "Giorgio by Moroder" track is so highly regarded for sound quality but (only) measures at a DR of 8. Not being familiar with Random Access Memories myself, I looked the album up on the Dynamic Range Database.
    Interestingly, the digital releases, both CD and download, have significantly lower DRs than the vinyl?! Not sure why the disparity between formats, but I have found this to be the case with releases from other artists, e.g., Adele's "25" had a DR <8 on CD but, IIRC, a DR ~12 for the vinyl rip.
    Not sure whether your bit comparison testing might have been more enlightening and/or clearly divided if a higher DR version had been used?!

    Replies
    1. ADDENDUM Happened to get access to the 10th Anniversary release. Unfortunately the DRs are no better than those of other CD/digital releases and "Giorgio by Moroder" has a DR = 7 (JRiver analysis).

    2. The DR meter doesn't really work for vinyl. The spiky waveform of vinyl gives a higher reading even if the production is from the same master. Then there is the question of the vinyl rip and the differences between turntable setups and cartridges. The video below explains it well.
      https://productionadvice.co.uk/tt-meter-not-for-vinyl/

    3. The Ian Shepherd video was quite revealing and compelling. Thank you. Given his objectivity he is certainly cut from the same cloth as Archimago! :-) This video shows him "remastering" another Daft Punk track...
      His closing statement from the first video does say it all... "If you want to find out if the vinyl will sound better, your best bet is to listen." I would suggest replacing the word "better" with "preferable"!
      That being said, although I haven't personally done a CD vs vinyl rip comparison of Giorgio by Moroder, I can say that the vinyl rip of Adele "25" is unquestionably better than the CD release.

    4. Sorry! I omitted the link for my reference...

    5. Hey BigGuy and Prep,
      Indeed, applying DR measurement to vinyl rips even of the same mastering typically results in higher numbers. A few years ago, I wondered about this in the context of using de-clip software:
      http://archimago.blogspot.com/2017/08/musings-increasing-dynamic-range-of.html

      And also the mechanical process of vinyl cutting and playback probably "rounds off", maybe extends (inertia?) some of those clipped peaks.

      Funny no vinyl lover ever questions the quality of the DAC used in the studio to cut the vinyl with when the source is digital!

      As usual, while I can appreciate and enjoy vinyl collecting, the sound quality usually leaves me disappointed.

  10. Did you gather respondents' audio hardware configurations, or just the hardware identification?

    On a Mac, at least, I'm pretty sure the default is for audio hardware to be set to 16-bit. That means any audio sent to such an output device gets automatically reduced to 16-bit. I would expect 16-bit, 24-bit, and 32-bit to all sound the same on such a configuration, maybe with a slight bias to 16-bit since the automatic bit-depth reduction probably isn't as high-quality as the conversion you used.

    It's possible to change it, or check the current configuration, in the built-in Audio MIDI Setup application (which, yeah, means it's kind of buried). I don't know what the equivalent would be on Windows or Linux.

    Anyone who participated in the test (especially if your answer was “they're about the same”) and hasn't already de-blinded themselves might want to check their audio configuration and retest. If your output device is set to 16-bit and not 24- or 32-, you're not using your hardware to its full capability, and the 24- and 16-bit samples sounding about the same is to be expected.

    (I would also recommend making sure your sample rate is set to 44.1 and not 48 or 96 kHz, since the sample audio is 44.1 kHz. Make sure resampling isn't a confounding factor. Every audio device should support 44.1 since that's what CDs use.)

    Replies
    1. Hi Peter,
      Yeah, certainly if users are not setting their gear correctly, 16-bit is often the default. In the Windows world, users should at least make sure to use 24/32-bit output, or playback with ASIO/WASAPI for bit-perfect output.

      And yeah, would be nice for folks to keep at 44.1kHz so as not to resample which might be done poorly by some OS/machines.

      Alas, as an Internet test, there would be no way for me to ensure that proper "hi-res hygiene" was done. I trust that most listeners who tried this are savvy with hi-res playback after these years. For those who used dedicated audiophile streamers/computers, I trust they would have already been pre-set for 24-bit playback.

    2. On a good DAC (and ADC), a 24-bit recording obviously has more detail and better SNR than a dithered 16-bit track. Of course, somebody may prefer the sound of the 16-bit dithered version. And 16-bit dithered is also OK, since you do not lose anything big, and on many recordings there is not much content in those additional 8 bits. So where space/bandwidth allows, I prefer the 24-bit version (also when upsampling offline), but 16-bit dithered is no big loss.

  11. By the way, recently I started to prefer longer filters for 44.1/48 kHz upsampling, e.g. 10-14 percent of the bandwidth starting at 95 percent passband, at the expense of some aliasing.

  12. I see the "peak amplitude" of the samples reviewed is near 0 dB. This is the most difficult case for distinguishing between 16 and 24 bits, because the full range of 16 bits is used. This is not always, or even often, the case in real life. To be fair, samples more representative of the average situation would have been preferable. In my view, 24 bits may be useful and distinguishable when there is real-time signal processing (and this is more and more the case, with Roon for example, and in the audiophile world that cares about 24 bits): parametric equalization, LUFS normalization, etc. In this case, when the signal is amplified, the least significant bits matter. But with 0 dB peak amplitude samples, there is little (or no) room for DSP, so we can't use these samples as we would in real life.

    Replies
    1. Sample peak at or near 0 dB is extremely common in modern recordings and remasters.

      The best 'music' for possibly hearing 16 vs 24 is when the signal is very low level, as in fadeout tails. But the playback level has to be cranked to absurd levels to do so, which isn't at all normal listening. 16 vs 24 bits at the consumer end is just another molehill made into a mountain by 'audiophiles'.

  13. I confess I don't understand why you downconverted from 88/24 to 44/16, then upconverted to 44/24. Why not either 1) downsample from 88/24 to 44/24 and also downconvert from 88/24 to 44/16, and compare those; or 2) downsample from 88/24 to 44/24, and then to 44/16?


    Replies
    1. (to be clear, I don't expect any audible difference between any of these except using extraordinary playback levels of quiet moments)
