Friday, 4 July 2014

24-Bit vs. 16-Bit Audio Test - Part III: SUBJECTIVE COMMENTS & FINAL THOUGHTS

Part I: PROCEDURE
Part II: RESULTS & CONCLUSIONS
Part IIa: The 20 Correct Respondents

Welcome to Part III, the last in my reporting of the 24-bit vs. 16-bit blind test. In this segment, I want to look at the subjective differences reported by respondents to the survey, and then offer a few personal observations and thoughts at the end.


I. Subjective "differences" between 24-bit vs. 16-bit audio?


First, I want to present verbatim the comments I received when respondents were asked to describe what difference they heard between what they believed to be the 24-bit and 16-bit samples:
soundstage, instruments and voices in 3d space, edginess of crescendos, how relaxed I was in listening to the music
Depth, instrument brass and breathing of the voman
nothing specific - overall clarity and air and space
The 24-bit tracks sound more realistic, but the difference is very small. Some tips would be to use the best equipment you can, and do not listen for too long at a sitting.
Space, separation, intensity of instruments
Much more clarity and width/depth
Small parts (10-15 secs) comparison.

24bit was to me which sounded the more colorful
Clarity of sound 
Dynamics, detail. In my experience if tracks are produced and mastered in the same manner there is no significant audible difference in audio format. Differences in producing and mastering are what makes the difference.
Do not listen to a section of any more than ten seconds. Use good headphones and a good DAC.
Detail, space.
Tried to listen for the usual audiophile adjectives, but was ultimately unsuccessful in picking out those qualities.
Use headphones absolutely. Bozza - A Sample seemed more alive and enthralling. Vivaldi - A Sample seemed clearer. Goldberg - B Sample seemed richer and complete.
Listened for imaging, soundstage, clarity, decay of piano notes.
a mix of distortion and resonances
string instruments, background noise "air", human voice neutrality, brass instruments, bells...
Started out listening for/expecting the stuff you specify in Q's 1-3, but mostly could not discern significant differences. In one case, on a different system, I got the opposite answer with the same notes (more "room" and "more resolved".

I went into it not at all confident that I would be able to discern the difference between a properly down-and-upres'd file and the original. Truncating or dithering-down a good quality recording mostly results in removing very little information. Upres can "smooth" things out (e.g. Cambridge Audio 840C - I sold it).

I suspect if this test were done with the sample rate rather than the bit rate, it would be more audible.

I also think a more meaningful comparison might be to simultaneously record a performance in 16 and 24 bit with the same ADC, though my current feeling is that it's best to record and play back at the "native" rates of the ADC/DAC.
Dynamic details, piano string more ohysical reality
Listening for instrument/vocal placement in the soundstage, transient dynamics, general recording ambience (such as reverb decays) and tonal qualities of instruments.
I already played this game with recent recordings that I own in both qualities. The listening is more intense and discerning at night. The morning is the worst moment of the day when everything sounds equally and sadly flat.
The Dynamik, the Clearness, at the End: The Music

The Piano is easy, you hear definetly more mechanical Noises
I tried to listen to dynamic and spatial reconstruction
1. With 24bit more reverberation can be heard.
2. Cymbals or other high tone percussion instruments have more clarity and remain clear when other instruments play loud (e.g. at the end of Bozza)
3. Piano tones have more information in the upper frequency range, especially in the attack.
dynamics and upper mid range 1 khz - 3 khz.
For the first sample it was the cymbals.
For the second and third, it was what sounded more live and natural.
Generally listen for clarity of the highs, separation of instruments and soundstage. 
I'm pretty new to this though.

Really couldn't tell much of a difference in the three samples.
I do attend classical music concerts regularly and just try to compare how the instruments sound on the recordings to how i remember them sounding live - nothing more sophisticated than that.in truth.
I ythought the '24bit' recordings sounded sharper with more precision especially at the top end of the frequency range. But I might be wrong.
small clues such as sound of triangle, fullness of the sound, how fast the transient response is, room ambiance, decay of the piano, etc
I tried to focus on the clarity, on the precision and dynamic of the presentation, but honestly I cannot identify a single difference, even a very subtle one. No way to distinguish from the two versions.
Fullness of sound. Decay of notes.
If I'm right then I would bescribe the 24-bit sound less "spikey" and rounded. Less focused on left/right but more coherent and staged in the middle. I hate to say it, but "more analog" would describe it for me. (If I'm wrong with my choices then I will not buy any 24-bit music ... !!)
The A tracks seemed more life-like and engaging. The B tracks in comparison sounded flatter.
I listened for ringing on piano, realism in voices and "pressence" in all intruments and voices
More "airy", increased clarity and realism.
Smoothness and depth.
I felt the tracks I picked for 24bit had more open dynamics, the 16bit tracks sounded a bit constrained.
Honestly i couldn't tell between them, for "Bozza - La Voie Triomphale" i thought sample A had a bit more fullness/detail, but the other 2 sounded the same to me, if the difference can be heard it doesn't seem my equipment is good enough for me to tell (or my ears i suppose. haha)
High frequencies, airiness, etc
A more dimensional sound, one of the samples usually sounded "flatter" than the other.
I listened to resoluion and depth at upper midrange and high frequencies to specify some differences.

Imagine a bright clear stary night sky. Guess how many gloomy stars our eyes are capeable of.
Which sky might be brighter to your perspective.

What about your reproduction system, will it be able to resolve the higher density transparently and deliver it to your ears.

24bit is nice, not necessary.
This was my first time ever comparing the two. I don't recall listening to a 24-bit recording before.

The 24-bit version, at least what I thought they are, had deeper bass (noticeable in Vivaldi's), richer highs (around 1.5 minute into Bozza's), and crisper notes (Goldberg).
I looked for the air around the instruments and whether the notes lingered a bit longer or ended abruptly (ie not naturally). I tried to also notice the violin, guitar plucks and the piano to see if they sounded more natural.
Sounds closer miked than the 16 bit track, livelier, more rough. The 16 bit sounds more polished. The 16 bit might be a little bit more boring and easier to listen to.
Transparancy
Presence.
more detail overall 

In particular the transients and decays are much easier to follow on the tracks I've identified as 24 bit
richer sound
separation 
realism 
I just felt I detected more detail, "air"
More relaxed, dimensional and resolved.
Nothing particular. I thought it sounded richer.
What sounded better! The last track was the hardest.
Nothing specific...just straight listening and looking for any possible difference whatsoever.
24 Bit: more present, livelier, 3D space, subjectively more dynamical, more resolution of the polyphony
more details of airy sounds. While listening You forget about the media

16 Bit: flat, artificial
I was unable to honestly discern any difference between samples. My markings were a total guess! That has been my previous experience when comparing 16 to 24 bit from the same source.
Mostly imaging and top end clarity.
The sustain of notes and tonal quality
The highs specifically cymbals and sibilance in vocals. That is the usual way of telling a really low bitrate MP3 from lossless. Couldn't tell a difference.
Smoother, more lifelike and natural sounds to the instruments/vocals.
I tried to listen to then deepness of the sound, especially the bass
I listened for 
dynamic range
sound stage/spaciousness
attack and sustain

I am not an experience classical music listener, so not familiar with the instrumentation of these sample audio tracks.

on Goldberg. I preferred B, which I think was 16 Bit
Clarity of tone.
openness of sound seemed to me different.
a bit smoother and less harsh on the cymbals
High resolution; as when listening to a 64 kbit stream and the obvious difference when compared to CD quality
quiet passages
The 24-bit sounds "thinner" because of better separation of instruments, sound is clearer, in particular bass is cleaner and more natural.

Have heard each piece once (no ABXing or similar), and decision was clear. Did a second round to double check, with same result.
I felt the tracks I identified as 24-bit had a *shade* more bass "impact" and less "edge" on the notes (e.g., "rounder", "smoother" sound). It was certainly very subtle in my setup (Geek Out 720, HE-500 headphones).
Smoother and less harsh sound.
vocals, piano warmth trebble
With the first track I listened to the difference between the bass drums. The 24-bit track had a clearer "bang" here and the overall room between the instruments was larger.
The 2nd track was much harder to identify but also on certain passages the 24-bit gave it away through more details and more room.
The third track was the hardest and at the end I identified the 24-bit by the details of the side noise such as the breathing of the artist.
Greater attack, more space between instruments, more tonal colour. The first track had a much grander scale than the second. The third track held the attention much better and allowed the listener to see into the recording more - notes hung and decayed more realistically. On the piano track, the 16-bit sounded as if there was a blanked stuffed inside the piano in comparison with the 24-bit version.
Quietest passages. Headphones.
Too low a bit depth "flattens" the music. It will sound less airy, less dynamic.
I could not identify any differences between tracks. I listened for transparency, potential harshness, resolution and such things.
percussion; voice; extend/sustain
details
originally though i could hear a more realistic sound on piano/voice. i was exactly 50% for all three tracks with 10 trial ABX
At first listening i thought i got it which one is some what "fresher", but after several times played i get confused TBH.
Did not perfome a/b-ing. Got no reason for that. If I can't tell difference within half a minute between two listenings then it is good enough for me.
Was picking sollely by impression of freshness. Not a native english orator so i can't figure more apropriate words.

Must say both Goldbergs is somewhat different sounding compared to my copy of the same recording and that is without a doubt. Unfortunatly got no ma own copy's of other two recordings to make that comparisson as well. Know that was not the question you asked, but thought you might find it interesting enough to mention.
Transients, tonality, space around the music.
no help here. Just listening for some sort of extra clarity for lack of a better word.
I would not expect 24 bit to be different to 16bit, personally, but had a clear preference for A, B and B which surprised me.
Smoother with more air and better intelligibility.
complete
Just initial impressions. Most probably I picked more B tracks since they were the second listen of the musical piece!
Imaging mostly, the ones I prefered had a better, more detailed and stable soundscape. The most obvious (to me) is the Vivaldi. The singer really stand out more in A while in B she is drowned in the accompaniment.

The piano is also more natural and stable in A, even though I hear notes bouncing all over the place in both, a normal phenomenom with a non-point source as the sounboard is, when closely miked.

The band was more difficult, I listened mostly to room decays, and better definition of the instruments.

Of course, I may be completely wrong...
Spaciousness. The sound of the room more than the instruments. The interaction of different sounds.

The difference with the Bozza was vastly more obvious so I assume it may have been a trick question with the lower being deliberately knobbed.

IMHO, the best of 24-bit will come when producers etc. use less dynamic range compression. I look forward to that day.
Maybe a bit fuller and smoother.
Smoothness and presence
track b's were less pleasant to listen to, sounding thin.
fullness of sound. 
soundstage
imaging
smoothness
The A tracks sounded consistently more strident, less focused and less coherent. Vocalist and piano was more diffuse on A tracks. Piano was duller sounding.

The B tracks were slightly more dynamic, more focused and richer in tone. The vocalist was more 3-D.
Upper bass. It is lighter in the 24 bit version.
air, spaciousness around decaying notes and transients
I was listening for any difference in overall listening experience. Did one sound more natural. I was particularly paying attention to the dynamics to see if that extra headroom allowed for greater impact and contrast from the soft to loud sections.I payed attention to noise floor etc. I didn't expect to hear a difference but i tried my hardest to find one and i honestly couldn't . at first i thought track 1 sample B was slightly softer in the quiet sections and had more impact in the louder sections but i used abx test and at best got 60%.
I listened loud obviusly :) tried to hear the ambience but also the 3d space and deep bass and the silence ! if it sounded differently .

But... this is so hard

There were a handful of comments about not hearing a difference which I've left out of the list above, since I wanted to focus on the subjective experience of those who believed they heard a difference. Not surprisingly, the respondents used the standard lingo of subjective audio evaluation with the typical adjectives (I of course offered some of that when I asked which sample they thought was 24-bit earlier in the survey). There are more than 80 comments in the list above, yet despite these subjective impressions, we know there is no statistical evidence that the respondents as a group were able to discern a difference.


II. How "easy" or "hard" was this test?


One of the last questions asked of the respondents was whether they felt the test was easier or harder than expected:

I had no real expectations. I found it impossible to hear any differences on first listen, but after approximately 30 minutes or so, began to sense a bit more air and relaxed presentation with of A samples. It also seemed that I could relax into the music with the As. It will be fascinating to know whether I just made that up in my head or actually heard a difference.
Clearly harder than I thought, although I'm not really used to listening for those small details - I usually just enjoy the music...
harder
Thanks for putting together this test. You obviously put a lot of effort into it. Here's some advice on how to give the results. Document with as much detail as you can the procedure used to create the test files. That way, people can duplicate your test files exactly and show that you weren't trying to cheat them in any way. A lot of trust is required for this kind of test to have any kind of validity, and many audiophiles are paranoid.
Thank you for doing this. Really made me re-think how I feel about high resolution audio. Going into the tests I thought there was not a lot but some clear differences. After the tests I would probably place more emphasis on finding the best recording/mastering of a particular Album rather than just buying the highest resolution files of that Album.
You manipulated this music, when I understand it corretly there could have been
Diffmaker shows only a negligible difference, still hearing difference
Exactly as I expected ;)
It was approximately as simple as expected.
Again THANKS TO YOU looking forward to the seeing the results. Playing Bass since 1968.
About what I expected. Did not expect to be able to tell much of a difference at all. I thought i heard differences the first time through, but the differences became smaller as I became more familiar with the music.
That more difficult than was expected. Thank you !
I expected there to be no difference, and I heard no difference.
Thanks for putting together this valuable survey. I have long suspected there was no difference between hi-rez and redbook formats. Would like to see the same thing with 96 or 192 sample-rates vs. redbook.
It was hard to listen any difference.
Could you manage a similar test between PCM and DSD in the future.
Hmm, without quick comparison AB tracks seem to be the same :-)
As I mentioned, I was not at all sure that chopping off bits and upres'ing them would be audible. And I'm not at all sure what it illustrates. I'm not confident that it proves that there is no difference between 16 and 24 bit recording.
The test was difficult. I'm by no means certain of any of my answers. I have yet to reach the stage where I will habitually buy a 24bit version of an album in preference to the CD, though that's partly because of the price difference!
Assuming that my answers are correct I did find the test a lot easier than what I expected.
It was easy
I expected it would have been hard, it was indeed :)
Harder than expected.
I believe a very good DAC and amplifier are needed to make the difference hearable.
It is much easier with headphones than with speakers.
harder than expected.
Not easy. I think a ABX test to determine if people can actually hear a difference would be useful. Sorry I don't have to software to do the ABX switching you talked about. If there were three sample for each song and Q1 would be which are the same, Q2 would be which do you think is the 24 bit.

I am curious to know if I got anything right!
It was hard, just like expected. I played with open windows and an open balcony door which raises the noise level in the room. But either way, I would be surprised if I got all three correct.

I didn't spend much time on this. I would have preferred being able to more easily forward both to a passage in the sample where more "is happening" and then AB test them quickly with a press on a button.

Those times I've been able to distinguish MP3 in 256kbps vs lossless, I have found it easier when entering a passage in the music where there's for instance more high frequencies.
F'ing hard!
Came with no preconceptions - on one hand hoping 24 bit might offer an inexpensive upgrade on CD quality but likewise if there is no benefit to the "extra 8 bits" why waste disk-space and extra cost of 24 bi files.

Honestly couldn't tell much if any difference.

All samples sounded very well recorded - better than many recordings I have bought from major record lables.

Looking forward to seeing the results !
Slightly harded than expected. I thought I'd be confident about hearing a difference between 16 and 24 Bit.

Thank you for setting this test up. I will be very interested to read the results.

John Allen (UK)
I got the MP3 vs lossless response right. But my confidence is much lower with this test. The difference is too minute for my ears/equipment to reliably detect the difference.

Anyway, thank you for putting this together!
The third sample was more difficult.
This test confirmed what I'm sure about from a long time: that I cannot hear any difference from 16/44 to 24/96 or more files. No way, 16/44 seems to be enough, at least for my hearing system :D
For future tests I suggest you leave the files in wav format uncompressed. I know this is a waste of storage but someone could be influenced by the different file sizes of a flac compression (even though they're not correlated to the original file resolution).
Anyway great test!! Useful for people to understand what they can and what they cannot really hear!

PS: I don't care anonymity so if needed this is my email address: mag1ster@alice.it
Easier than expected. I found the classical pieces chosen to be very revealing- more so than some rock selections I have tried this test with in the past.
Currently I use a Raspberry Pi + HifiBerry DAC. There is much debate about powersupplies (walwart vs BOTW). I would like to see a test about the differences between the use of a "stabilized" PSU vs an ordinary "2-dollar" PSU. Is this auditable.
This was easier than I expected. You don't mention how you dithered the files. I'm wondering whether the differences would be less pronounced with a different or more sophisticated dither algorithm.
harder than expected
very hard
Great fun, thanks for your effort.
I can't tell if it was easy or not because I don't know if I'm right yet XD
I think I can tell them apart when listening side by side but it would be very difficult to guess a track on it's own. The 3rd pair was hard to tell because the recording wasn't as detailed to begin with.
Much harder than expected. I still think the raw material and capture of true ambiance is much more important than the final resolution of the music tracks. I've heard some great CD's and some very bad supposed to be HDTracks
Harder than expected!
for me, it was't easy to detect subjectivly the higher resolution examples. yet I only suppose less or more that I'm right
In terms of how you ask the questions. It might have been better to have a 'not sure' option instead of 2 because that might skew your data analysis.

You could have combined both first questions as having 5 options: Definitely A, Probably A, Not sure, Probably B, Definitely B.


Any questions/feedback, feel free to reach me at ezzatelhalabi@yahoo.com.au

Really keen on the results of this survey.

Regards and best of luck.
It was hard. The differences are not huge (to my ears). Clips should be shorter.
I made a point not to look at file sizes or any other file property. I cwitched the output to 96khz on the Dragonfly. I am not sure I could consistently differentiate between the samples in a real blind AB test.
Much harder than expected.
This was harder than expected. Please do more of these tests in the future :D i enjoyed it greatly
much easier than expected
bit easier!
Easier. (assuming I was correct...!)

Thanks for going to the trouble of providing this survey.
A little harder than I thought it would be.
Differences were subtle. Was the 16 bit file sourced from 24 bit file and downsampled. Or was it a native 16 bit track.
Harder.
The tests were easy but trying to find any difference between 24 and 16 bit was very difficult to the point where my answers were complete guesses. In other words, can't tell the difference.
Easier to decide than thought
I appreciate your ongoing efforts to clarify the vast amount of misinformation surrounding digital audio. I belong to that subset that finds 16/44 more than adequate.
I didn't expect to be able to hear a difference. The unsure results were as I expected.
Harder
Not a big proponent of high-res, at least not on normal gear. I think its benefits are marginal, making this sort of evaluation as challenging as I expected it to be.
Great work
Is there a recommended BC wine to consume whilst performing the test.
Thanks for doing this.
HARDER!
Enjoyed it and looking forward to your findings.

Would really like to know if someone can hear the difference without doing A/B'ing instantaneously.
Well prepared test.
But was not difficult for me (as I expected).
Not sure, if I could distinguish files after high quality up- and down-sampling (or down- and up).
harder than expected.
I just could slightly notice the dynamics more clear but wasnt 100% sure...
Harder - not very confident on any of the samples. But did not expect it to be "night and day", as some "audiophiles" say it is.
Samples 2 and 3 have lost resolution in editing and have become noisy. (digital resolution have been limited)
I just felt curious. I don't think I have "golden ears", so my ABX tests are just from an untrained person.
I am considering upgrading my gear to a pair Sennheiser HD650 headphones in the future, since the sony's are not great for mid and high frequecies according to many reviewers and experts. Also, I am unsure wether a dedicated headphone amp would better help to discern each version.
Assuming that my results are correct I found it so-so. See my comments above. But I knew before that the differences between 16-bit and 24-bit recordings can be quite subtle and therefore not so easy to identify.
Much easier than expected. I was expecting to have to play the tracks a number of times to discern any differences, but I played each track through only once and could pick up the differences within seconds of the second track starting.
I'd like to see more of these tests. Too many people saying that high-resolution is pointless and tests like these prove otherwise.
easier than expected - nothing to hear, move on :)
I switched between A and B extensively on all three tracks, I could not discern even the slightest difference. Not even a hint.

Therefore, I didn't even complete the test by picking X = A or Y = A. It would have been pure guesswork.
dear archimago

first i would like to congratulate you for your awesome work. I greatly enjoy your damn fine blog. You are a champion of rational thought and enlightment in this hifi world which unfortunately is plagued by snake oil, hocus pocus and gullible fools (I once was one myself)

please go on with your excellent work!
This was interesting. It would be so much more with some other equipment...
i expected to get about 50%.
In case my findings are relevant in reality i could confirm my phylosophy about audio - if you must to perform intensiv listening on your thoes in order to find some pluses or minuses then those differences are irelevant when you actually listen to the music itself. Big enough diffs will be transparent from the go and that is when you should think about upgrade or trying another recording of the same peace. YMMV
I found this very hard. After one or two passes I formed a view which I didn't change, but I know if I'd been listening properly blind, I wouldn't have been able to ABX the tracks successfully.

Thanks for setting this up!
Difrents is small.
I was not expecting anything. The differences were small but I am fairly confident in noticing them.
easier
Easier, but still the difference is quite small. I think the sound engineering quality of a recording is the most important aspect. Some very well recorded 16/44.1 CDs sound as good or better than some multi-channel SACDs, DSD being somewhat equivalent to 24/96 PCM. But I also have some amazingly realistic SACDs...
One seemed very obvious but the other two were both very subtle. I wouldn't be surprised if I got them wrong.

You should really try the demo CD (yes - CD) from http://www.earthworksaudio.com/products/microphones/ it's a great demo of how better quality kit shines even through a 'bottle neck'. Personally I think it's more to do with accurate timing than high frequencies.
Just as I expected.
Very nice
Probably some people (who also are certain about their abilities) are going to answer right by chance. Without proper ABX those are of course worthless data points. It has become clear to me that even really rigorous scientific testing doesn't apply to audiophiles or marketeers - people who have a "I want to believe" poster up in their heads just ignore facts and continue spreading myths no matter what.
This is very hard (read impossible).
I would like to se a test where CD quality is compared to hirez.
About what I expected.
Differences were subtle, but noticeable without doing fast A/B switching. Particularly with Bozza, fatigue would set in with track A and not so much with B version.

I think some more musical tracks would be nice, something that most would want to listen to and keep in their collection. More percussion and maybe string bass would be helpful. A mix of Jazz and classical and acoustic guitar would also be helpful.
the file sizes for A and B are different- maybe this factored into people's judgements.

as hard/easy as expected.
You can so a similar test like I made where the ultrasonics are removed only.
So convert 96/24 to 48/24 and back to 96/24.
Make sure to use a good down-upsampler.
Info for this here: http://src.infinitewave.ca
thank you for your hard work in finding out the truths within this myth filled hobby of ours. Your Blog is a breath of fresh air and is truly excellent. Thank you once again, regards, James.
Impossible but i expected that .

But i actually tried hard .
As you can see, there were a variety of experiences described in the comments above. Some felt it was harder, some felt it was as expected, and some seemed surprised by the difficulty. Interestingly, a number thought it was easier and expressed high confidence in their ability to discriminate the sonic difference. Since this is all anonymous anyhow, let's correlate a few scores to the comments above: the person who said "easier" was 1/3 correct, "much easier than expected..." 1/3, "it was easy" 0/3, "easier to decide than thought" 1/3, "bit easier!" 1/3, "it was easier than I expected..." 2/3, "Easier than expected." 0/3, and "It was approximately as simple as expected." 3/3. Easier? Really?
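
As a rough cross-check of the scores just quoted, here is a minimal Python sketch that tallies them. It assumes each of the three tracks is an independent 50/50 guess for someone who cannot hear a difference; the variable names are mine and purely illustrative.

```python
from fractions import Fraction

# (comment, number correct out of 3), as quoted in the paragraph above.
easy_scores = [
    ("easier", 1),
    ("much easier than expected...", 1),
    ("it was easy", 0),
    ("easier to decide than thought", 1),
    ("bit easier!", 1),
    ("it was easier than I expected...", 2),
    ("Easier than expected.", 0),
    ("It was approximately as simple as expected.", 3),
]

TRACKS = 3  # Bozza, Vivaldi, Goldberg

total_correct = sum(score for _, score in easy_scores)
total_trials = TRACKS * len(easy_scores)
hit_rate = Fraction(total_correct, total_trials)

# Assumption: pure guessing, each track an independent coin flip.
chance_all_three = Fraction(1, 2) ** TRACKS
expected_perfect = chance_all_three * len(easy_scores)

print(f"correct picks: {total_correct}/{total_trials} ({float(hit_rate):.1%})")
print(f"chance of 3/3 by guessing: {chance_all_three}")
print(f"expected perfect scores among {len(easy_scores)} guessers: {expected_perfect}")
```

Under that assumption, the eight "easier" respondents got 9 of 24 picks right between them, and one perfect score among eight people is what guessing alone would be expected to produce.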

III. Final Thoughts & Personal Impressions...


After reading the testimony of the respondents above, I find it just as fascinating to watch again the promo video for Pono, in which Neil Young apparently wows his buddies with his car stereo system, allegedly on account of fantastic sounding high-resolution audio (Neil of course seems to have a "thing" for 24/192).



Given the 24-bit vs. 16-bit audio test results, unless high sample rate (i.e. 44.1kHz vs. 192kHz) explains the striking difference so dramatically captured in that video (take any 24/192 track, downsample it to 24/44, and tell me if you are WOW!ed by the difference), I honestly wonder what those celebrities are talking about. In my opinion, if these dramatic Pono testimonies are to be believed as genuine, then Neil Young must either be playing different masterings (e.g. distorted vs. better mastering, or playing them at different volumes), and/or he's playing some ridiculously data-compressed track (64kbps MP3?) versus high-resolution to get that kind of reaction. I'm sure this could be easily answered if Pono would just release a couple of minutes of what was used on those celebs. I'm sure the rest of us would love to experience the apparent glory.

On a side note, CNN listed Pono as a "game-changing gadget" of 2014. Yeah... I guess we'll see about that...

On a personal note, I did try the blind test myself on two systems:
- Windows 8.1 PC --> ASUS XONAR Essence One via USB (ASIO) --> Sennheiser HD800
- Windows Server 2012 R2 or Win 8.1 PC -->  Squeezebox Transporter or TEAC UD-501 DAC --> Emotiva XSP-1 pre-amp --> Emotiva XPA-1L monoblock amps in 35W Class A bias mode --> Paradigm Signature S8 + SUB1 (balanced XLR interconnects, 4' 12G OFC speaker cables; <30dB SPL quiet sound room at night, room correction DSP off)

Total cost of the systems above would be in the $10,000 - 20,000 range. I used Foobar ABX on the PC with HD800 headphones and achieved 6/10, 6/10, and 4/10 correct in identifying the 24-bit sample over 3 listening sessions with each musical piece (that would be 2/3 "correct" I suppose). I could not tell the difference with the Transporter or TEAC DAC played through the full-sized speakers with sub and would easily grade my level of confidence as a "guess" or at best slightly "more". I'm currently 42 years old.
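
To put ABX scores like these in context, the sketch below computes how often pure guessing would do at least as well. It is a simple one-sided binomial calculation that assumes each trial is an independent 50/50 guess; the function name `abx_p_value` is my own and not part of Foobar ABX.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# ABX scores quoted above for the three musical pieces: (correct, trials).
abx_scores = [(6, 10), (6, 10), (4, 10)]

for correct, trials in abx_scores:
    print(f"{correct}/{trials}: p = {abx_p_value(correct, trials):.3f}")

# Smallest score out of 10 that guessing would reach less than 5% of the time.
threshold = min(k for k in range(11) if abx_p_value(k, 10) < 0.05)
print("score needed out of 10 for p < 0.05:", threshold)
```

By this calculation, a guesser scores 6/10 or better about 38% of the time and 4/10 or better about 83% of the time, and it takes 9/10 to get below the conventional 5% threshold.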

Over the 2 months that I was gathering data for the survey, I also tried this test with friends and family of various ages, both male and female (results not entered in the survey). Never did anyone express the opinion that differentiating sample A vs. B was "easy".

I do remain open minded, however. Although I have not met anyone who could easily and accurately detect 24-bit vs. 16-bit audio file differences in a controlled setting, I'm also not saying it's impossible. Who knows, maybe the fellow above who responded "It was approximately as simple as expected." and got 3/3 is one of these. Humans are capable of amazing feats after all... However, I do believe hearing acuity of this magnitude would at best be rare and I suspect most reasonable individuals would recognize this once they try an ABX for themselves or logically figure this out based on understanding of the science. Listening volume would also be a consideration and most people would understand that at normal listening levels, the extra 8-bits would be highly unlikely to be of any benefit, especially if audible ambient noise is present.

Finally, someone asked me the other day whether I thought 24-bit music was therefore some kind of a "con". Well, no, not necessarily. Assuming an album was recorded, mixed, and mastered well with extremely high-resolution equipment, then one could be buying music of the highest fidelity/accuracy (more dynamic range if ever needed, more "complete" ultrasonic frequency and low level details captured from the recording session). It would be hypocritical of me to desire a high-end DAC capable of >16-bit resolution but turn my nose up against a truly high resolution album, wouldn't it?

The key of course is that first, the 24-bit/high-resolution audio file must actually be of high quality (accurate digital chain, superb microphones, music worthy of the dynamic range and frequency response, expert engineer doing the job, all processing maintained at the highest level of resolution). Secondly, the album must be one I truly love, enough to desire the best resolution version and "go the extra mile" to find it in hi-res. Fulfilling these criteria, I would personally find some value in the purchase (how much $$$ over the same CD resolution version is another matter requiring consideration!). The pragmatic reader could just as easily ask "what's the point at all if we can't hear the difference?" and I would not argue with that either. For me, this is still about "perfectionist audio" and I believe one is allowed a certain level of neuroticism in this (and any) hobby... :-)

[Speaking of assessing value, obviously when it comes to music, this is totally a subjective personal matter. But let us also keep in mind that high-resolution 24-bit downloads (any music downloads) are intangibles. There is no "street value" attached (I don't even know if it's legal to sell them 'used' - presumably the laws could be different depending on country). Let me know what's on offer at the local pawn shop if you ever bring over to them a USB stick with your beloved 24/192s, JPEG "covers", and PDF "booklets" assuring them that these are your last copies and transferring all rights/privileges of ownership. In this fashion, "collecting" high-resolution music downloads is quite fundamentally different than having a library of stamps / books / spirits / wines / paintings / vases / cigars / cars / CDs / LPs... To me, there does not appear to be any material "store of value" in the digital music download collection from a monetary perspective.]

What I think consumers should not have patience for is hyped up talk about high-resolution audio applied to questionable old recordings that never had more information in them than what a 16-bit CD was always capable of encoding. Or even worse, new volume compressed recordings and remasterings of low dynamic range especially when sold as 24-bits as if this somehow magically makes it better (like this). Sadly the above conditions cover the majority of what I believe are current "high-resolution" offerings (at least in the pop/rock genres). Remember too that there could be a different mastering used in the high-resolution version like they did for the SACD of Joe Satriani's Engines Of Creation documented in my upsampled SACD list (eg. more dynamic, less clipped or volume compressed) which would make it the preferable one to get as an audiophile... But this would not necessarily be because it's in a 24-bit format.

I suspect the marketplace will figure this all out in time, as it did for the likes of most DVD-A and SACD "high-resolution audio" over the last decade and a half.

A final thanks to all those who helped me put together this test and to all the folks who gave the test a try. I appreciate your willingness to participate in this little experiment! I hope it provided some personal insights beyond the conclusions presented as a group.

Until next time, enjoy the music.

Wednesday, 2 July 2014

24-Bit vs. 16-Bit Audio Test - Part IIa: The 20 Correct Respondents...

As suggested by one of the comments in Part II, I have put together a summary of the respondents who got the 3 sample audio tracks correct (answered B-A-A). Let us see if there are any demographic variables that stand out. Remember, out of 140 respondents, 20 were able to identify all 3 24-bit samples. This is of course not significant (p ≈ 0.30) - by chance alone, one would expect approximately 1/8 to be correct (17.5 out of 140).
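The chance expectation quoted above can be sketched with a one-sided binomial tail, assuming each of the 3 picks is an independent coin flip so that a fully correct B-A-A pattern has a 1/8 probability. The helper name `binomial_tail` is illustrative, and the exact statistical method behind the reported p-value is not stated in this post.

```python
from math import comb

N_RESPONDENTS = 140
P_ALL_CORRECT = 0.5 ** 3  # three independent A/B picks, all guessed right

def binomial_tail(k: int, n: int, p: float) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

expected = N_RESPONDENTS * P_ALL_CORRECT
p_value = binomial_tail(20, N_RESPONDENTS, P_ALL_CORRECT)

print(f"Expected by chance: {expected} of {N_RESPONDENTS}")
print(f"P(20 or more all-correct respondents by chance): {p_value:.2f}")
```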

Here are the graphs direct from the survey site. Feel free to compare / contrast with those in Part II in the "Demographics" section.

95% males with 1 woman in the group. Not unexpected.

Average age calculated from the median age in each category: 45 years. Essentially the same as the 44 years old average for all 140 respondents.

24% musicians as compared to 22.1% for all respondents.

30% do audio "engineering" as compared to 24.3% for all respondents.

None of those who got all the choices correct identified themselves as an audio hardware reviewer. Since this was an optional item, 1 individual did not answer the question.

11/19 (58%) were Windows users (almost exactly the same as the full group at 60%). Again, we see external USB/Firewire DAC being used mainly (13/20 - 65%). The only thing that seemed quite different was a higher proportion of headphone users here at 12/19 (63%) versus about 50% in the total group. Note that there was 1 individual who got all 3 24-bit selections correct who did not go into detail with his reporting, hence the total of 19 since he did not click either the speaker options or headphone option.

Average cost of the systems (based on median price in each price category) used by those who got all 24-bit samples correct: ~$6600. This is lower than the $8160 for all respondents. It's a smaller sample so I wouldn't put too much into this... No evidence in any case that these folks on average had more expensive systems. Like in the main group, the $1000-3000 segment was most common.

The person who used the "$100-$250" system was running what looked like a basic Windows PC with some Sennheiser HD433 (~$25) plugged in. (Remember, I dissuaded folks from listening directly off the computer since often the on-board DACs are poor - worked out well in this case!)

The person using the "$50,000-$100,000" system included the use of a Squeezebox Touch --> Berkeley Alpha DAC, Cary CAD 120S tube amp, Revel Ultima Studio 2s. Not sure about the preamp function. Looks like a nice setup...

Other gear used in this subgroup as I browse through them: balanced cable Sennheiser HD800 headphone, Grado SR80 + Bose QC15 headphones, Parasound amp & preamp, PSB speakers, Senn HD800 connected to Woo Audio WA7 amp, Senn HD600 connected to AudioQuest Dragonfly 1.0, NuForce DDA-100 to KEF Coda 70 bookshelf speakers, Arcam rDAC to Arcam DiVA AVR280 receiver, KEF iQ30 bookshelves, Benchmark DAC2 HGC with Grado RS1i headphones, Wyred4Sound DAC-2 to Senn HD800. Certainly some excellent gear there but not really exotic.

20% (4/20) used a listening tool like the ABX test. Exactly same as total group average.

So... How "confident" were they of the answers picked?
45-50% graded their confidence low - either "guessing" or "2 stars = More than a guess" for each track. Again, overall about the same as the total group. There seemed to be a bit more confidence in the Bozza track however (looked like everyone selected "4" rather than spread out between "3" and "4") and the confidence level dropped off by the time Goldberg was evaluated similar to the previous report. With such small numbers, it's difficult to put strong weight into specific results such as these.

In summary, this demographic doesn't look very different from the general group of all respondents - perhaps proportionally more headphone listeners. Remember however that I was not able to find a significant improvement in general accuracy of identifying 24-bit audio comparing all the headphone-using respondents in the overall group.

Of interest also was the fact that the average cost of the hardware used by these respondents was no higher than that calculated for the total group.

Although this subgroup was accurate in their choice of which sample was the 24-bit audio, confidence overall remained relatively low with 20-25% admitting to "guesses".

Part III: SUBJECTIVE COMMENTS & FINAL THOUGHTS

Friday, 27 June 2014

24-Bit vs. 16-Bit Audio Test - Part II: RESULTS & CONCLUSIONS

See Part I: PROCEDURE for details around the test samples used and how this study was conducted.

In this installment, let's have a look at the results from the 24-bit vs. 16-bit listening test among respondents.

First I need to remind everyone that the test procedure was not easy. As demonstrated in Part I, the sonic difference between the original 24-bit track and the 16-bit dithered version is down below -90dB. This makes the test much more difficult than the previous high bit-rate MP3 test from last year... Whether you were able to detect the 24-bit version or not, I applaud your efforts and input.

As I noted previously, there were 140 total respondents and looking at the transfer statistics from my FTP server, I know the test was downloaded at least 350 times. Response rate just based on my FTP server transfer was therefore about 40% of all who downloaded. The actual response rate would likely be significantly lower since there were other download sites.

Results

I. Demographics:


First let us consider the characteristics of the respondents taking this blind test. Being that this is an internet test, involves downloading 200MB worth of high-resolution audio data in FLAC, and given the target audiophile forums where the test was advertised, it is reasonable to conclude that many if not most are tech savvy audiophiles rather than the "average" music listener.

Not surprisingly, the vast majority (98%) were men which is expected (just have a look around audio clubs, audio shows, etc.) - thanks to the 2 ladies that responded!:
The age distribution likewise isn't a surprise. Audiophiles tend to be a bit older overall, and the average age if we estimate using the median age in each range comes out to about 44 years old. The distribution looks like this:
Nice to see some teenagers and early 20 year olds with the majority in the 41-50 age category. If one were a computer audio manufacturer, the 40-50 age group would be the one to target for maximal effect in 2014.

The survey also asked if some of the respondents belonged to specific categories such as musicians and those with audio engineering experience. This could be useful  in the sub-analysis to see if there were more "golden ears" in these groups:
 
By self report, there were >20% musicians and audio "engineers". Of course these 2 groups were not exclusive and 17/31 musicians also identified themselves as doing audio recording/mixing/editing.

As for the hardware utilized by the respondents, here is the general layout of the type of gear being used to evaluate:
In terms of operating systems, of the 3 main OSs - Windows, Mac, Linux - it's clear that Windows predominated. 129 respondents used one of these 3 OS's and Windows was 60% of that followed by Mac at 23% and Linux 17%. Among streaming devices the Squeezebox was tops. Most respondents used an external USB/Firewire DAC to conduct the evaluation; not surprising that in the computer audio world, SPDIF interfaces are no longer as common and a few used the HDMI interface (surround receiver devices).

There was an even split in respondents using speakers (bookshelf + tower) of 74 and headphones 72 (a few used both).

Here's how the audio system "cost structure" looked (US$):
Weighted average using the median price in each category yields a system price of around $8160 on account of the number of expensive 5-figure systems reported (22% had systems >$10,000). The median audio system price is in the $1000-$3000 range. This is very reasonable and again speaks to the demographic who would download and try a test like this. Objective >16-bit resolution is easily achieved in a $1000-3000 system as demonstrated with even relatively inexpensive DACs measured here over the last year and by having a look at the Stereophile objective results.

Many respondents went into detail describing their systems in the survey.  The first 25 responses included full Meridian active speakers, Sennheiser HD800 headphones with upgraded cabling, custom amplifiers, tube amplifiers, custom ESS9023 DAC, NAD amp, Lyngdorf TDAI 2170 digital amps into Intonation Terzian speakers, Overdrive SE USB DAC, Parasound Halo JC-1 monoblocks, custom ribbon speakers, Cambridge Azure 840E, Focal 1028BE speakers, Sonus Faber Cremona Auditor M speakers, Sony MDR-7509HD headphones, Grado SR325 headphones, Audiolab M-DAC, Chord Hugo DAC, AKG Q701 headphones, Squeezebox Transporter, PS Audio 4.6 preamp, Pass Aleph 5 amplifier, Devialet 170 integrated DAC/amp, Martin Logan Montis speakers, Geek Out 720. Clearly, many respondents used very high quality equipment for this test.

As a reflection of the technological savvy of the respondents, many utilized ABX testing such as the Foobar ABX tool:
20% utilized listening tools to evaluate (ignore that 3rd bar above since it's just a reflection of how many left a description, 29/140 used an ABX tool). Other than Foobar ABX, Mac ABXTester was common, and others described their own script.

II. Were the 24-bit audio files distinguishable from the same files dithered down to 16-bits (and fed into the DAC in the 24-bit container) by the respondents as a whole?

In total, the final result looked like this:




As you can see, in aggregate there is no evidence to show that the 140 respondents were able to identify the 24-bit sample. In fact it was an exact 50/50 for the Vivaldi and Goldberg! As for the Bozza sample, more respondents actually thought the dithered 16-bit version was the "better" sounding 24-bit file (statistically non-significant however, p-value 0.28).

Looking at the individual responses, there were a total of 20 respondents who correctly identified the B-A-A selection of 24-bit samples, and 21 selected the opposite A-B-B. This too is in line with expectations that 17.5 would pick each of these patterns based on chance alone.

III. How certain were the respondents that they answered correctly (ie. able to identify the 24-bit sample)?

24-32% of respondents felt they were unable to hear a difference (1 star = "Guessing"). If we consider that those who chose "2 Stars = more than a guess" also represent a very low level of certainty, then we can see that 45-52% of respondents really had quite low confidence that they were able to tell the difference.

Fewer respondents were "certain" about the solo piano piece (Goldberg), and in general more seemed confident about the Bozza piece. Listening fatigue could account for this result if one were to progress through Bozza-Vivaldi-Goldberg in sequence.

IV. Were the respondents who felt more certain about their answer more likely able to identify the 24-bit audio?

Let us have a look at the results reported by those who rated their confidence level as 4 or 5 ("very confident" to "certain" - 25-30% of all the responses):

"Correct" responses being the ones who were successful in identifying the 24-bit sample. As can be seen, there is no evidence to suggest that even those respondents with a strong sense of confidence were able to identify the 24-bit sample (as sounding better). In fact, for the Goldberg sample, only 44% of those who were quite "certain" selected the 24-bit version correctly.

V. Were the subgroups (musicians, sound engineers, hardware reviewers) able to identify the 24-bit audio better?

Due to the fact that respondents admitting to "guessing" tended to answer with A-A-A and this would severely impact a small sample size, I decided to not count the "guesses" in these smaller subgroups and see if there was any pattern of higher accuracy compared to all respondents.

Musicians:


As a subgroup (total of 31 respondents), the self identified respondents with a "good amount" of musical background did not do well. In fact, this group of respondents consistently scored worse than the combined result. Curiously, the musician group seemed to select the 16-bit dithered Vivaldi as the "better" sounding version (p-value 0.047).

Sound "Engineers" (those with experience recording, mixing, editing):


As a group the "engineers" fared better than the musicians in terms of accurately identifying the 24-bit tracks. This subgroup surpassed the accuracy of the combined respondents marginally. Again, the number of individuals was small (34). There was an overlap between the "musician" and "engineer" group with 17 individuals identifying themselves as both.

Hardware Reviewers:

This was an optional survey item that could be interesting to look at since audiophiles who provide hardware review opinions can have significant influence on sentiment and purchasing decisions.


With only 8 respondents, it would be difficult to draw any firm conclusion other than there is no evidence to suggest this subgroup was any more able to identify the 24-bit from dithered 16-bit audio.

VI. Were those with more expensive hardware able to identify the 24-bit audio better?

In total, there were 44 (31.4%) respondents using $6000+ equipment to perform this test, let us see if they were more accurate than the group average in identifying the 24-bit sample:


As you can see, the ~30% of respondents utilizing equipment costing >$6000 were not able to accurately identify the 24-bit audio track any better than the group average. The Vivaldi track was exactly at 50% accuracy.

VII. Did Headphone Use Improve Accuracy?

72 respondents used headphones in their evaluation. Since headphones can be potentially more accurate (no room acoustics, better noise isolation) at a lower overall cost, it would be interesting to see if accuracy in determining which was the 24-bit sample was any better.

As you can see, headphone use did not result in any appreciable improvement.

VIII. Did age have any effect on the accuracy?

There were 44 respondents 51+ in age. As a group, this is how they did compared to the overall result:


No evidence again of any significant change in accuracy in identifying the 24-bit audio.

Conclusions:


In a naturalistic survey of 140 respondents conducted over 2 months using high quality musical samples sourced from high-resolution 24/96 digital audio, there was no evidence that 24-bit audio could be appreciably differentiated from the same music dithered down to 16-bits using a basic algorithm (Adobe Audition 3, flat triangular dither, 0.5 bits).

This survey was targeted to audiophile enthusiasts who in general reported using equipment beyond typical consumer electronics. The majority (77%) were using audio systems reported in excess of US$1,000 and 22% were listening with systems in excess of $10,000. Furthermore, 20% used an ABX utility in the evaluation process suggesting good effort in trying to discern sonic differences. There were no surprises in terms of demographics with the vast majority being males, with an age distribution centred around 41-50 years old.

Subgroup analysis of "musicians" and those who work with the technical aspects of recording, editing and mixing ("engineers") did not demonstrate evidence of special abilities at discerning the 24-bit audio. The "engineers" group did perform slightly better overall. The small group of individuals who identified themselves as writing hardware reviews did not show an increase in accuracy.

About 50% of respondents admitted that they had low confidence in their ability to discern differences. Conversely, 25-30% (depending on which musical sample) of respondents reported a strong sense of "certainty" that they were correct in identifying the 24-bit sample. Nonetheless, analysis was not able to demonstrate improved accuracy despite claims of increased subjective confidence by the respondents.

Furthermore, analysis of those utilizing more expensive audio systems ($6,000+) did not show any evidence of the respondents being able to identify the 24-bit audio. Those using headphones likewise did not show any stronger preference for the higher bit-depth sample. No difference was noted in the "older" (51+ years) age group data (not surprising if there is no discernible difference even with potential age-related hearing acuity changes).

Limitations of the study include the fact that this was an open test distributed via the Internet in an uncontrolled fashion. This allowed the opportunity for test subjects to analyze the audio files objectively rather than through pure listening. However, this is also the mechanism of delivery for high-resolution downloads and the test participants would likely be using the same equipment to listen. The benefit of course is that the results may reflect realistic feedback from potential consumers (if not the target audience) of high-resolution audio. Respondents were able to listen in their own home using their own equipment rather than an artificially controlled environment. The fact that there was no time limit (other than a 2 month window to gather survey submissions) should have made for a less stressful experience for the testers.

140 participants is not a particularly large number of data points but it was adequate to demonstrate an even 50/50 split in preference across the 3 musical samples; a level of consistency which adds to the idea that listeners were unable to differentiate 24-bit audio from the dithered 16-bit counterpart. Replication of the results is of course advised.

As expressed previously in "High-Resolution Expectations" (See "Good Enough Room?" section), there is no good rationale for a dynamic range of greater than 16-bit digital audio in the home environment. The results of this survey appear to support the notion that high bit-depth music (24-bits) does not provide audible benefits despite the fact that objectively measurable DACs capable of >16-bit resolution are readily available at very reasonable cost these days.

If 24-bit audio imparts no audible benefit when listening to music compared to the same data dithered down to 16-bits, how certain can the audiophile consumer be that higher sampling rates (eg. 88/96/176/192kHz) would make much of any audible difference? This perhaps should be the target for another blind test. Methodologically, it would be extremely difficult to maintain the blind testing condition over the internet since it would be trivial to run the audio files through a spectrum analyser with no easy mechanism to conceal the bandwidth limitation of lower sampling rates (eg. a 22kHz frequency ceiling for 44kHz sampling). The reader is encouraged therefore to explore the effect of higher sample rates for him/herself.

One final comment in closing. Notice that the Goldberg track was soft and had a peak amplitude of -10.35dB as demonstrated by the DR Meter (see PROCEDURE post). This means that the full potential dynamic range was not being utilized and for the 16-bit dithered sample, the dynamic range can be encapsulated in <15-bits. Even with this limitation, there was no evidence that respondents were significantly able to identify a difference in aggregate or within subgroups.
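The "<15-bits" figure follows from the usual rule of thumb that each bit of resolution is worth 20·log10(2), or about 6.02dB. A short sketch of that arithmetic (the variable names are illustrative):

```python
import math

peak_dbfs = -10.35                # Goldberg peak level from the DR Meter
db_per_bit = 20 * math.log10(2)   # about 6.02 dB per bit

unused_bits = -peak_dbfs / db_per_bit   # headroom left unused, in bits
effective_bits = 16 - unused_bits       # bits actually exercised at 16-bit

print(f"Unused headroom: {unused_bits:.2f} bits")
print(f"Effective range of the 16-bit sample: {effective_bits:.2f} bits")
```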
 

-------------

As usual, I encourage others to do their own testing. Feel free to drop a link especially if there are other controlled, preferably blind tests showing a significant audible difference between 24-bit and 16-bit audio.

I will put up a Part III over the next week as well documenting the subjective comments made by respondents and final observations... Stay tuned.


Saturday, 21 June 2014

24-Bit vs. 16-Bit Audio Test - Part I: PROCEDURE

Disclosure: Just in case anyone is wondering, I want to make it clear that I have no affiliation with any audio company. I do not derive any financial benefit of significance from conducting this survey (a few dollars from the ad revenue I suppose). I enjoy the audio hobby and wanted to do some "reality testing".

Over the course of 2 months (April 19 to June 20, 2014), an invitation was extended from this blog (archimago.blogspot.ca) to various "audiophile" forums on the Internet for participants to submit responses to an anonymous survey to see if they can identify which sample of music was the original 24-bit source versus the same piece of music (exact same mastering) dithered down to 16-bits.

Although the following may seem pedantic, I want to lay out the procedure used transparently and in detail so as to be clear of the nature of this test and what was done to collect the data.

The musical samples were taken from freely available sources on the internet; 2 classical pieces from the Norwegian studio 2L recorded in high resolution digital and 1 from the Open Goldberg Variations. For the purposes of this test, the "high resolution" 24/96 file samples were utilized directly from those sources (ie. I did not want to do any manipulation of the data like resample to 48kHz).

Musical samples from 2L (available here):
1. Eugène Bozza - la Voie Triomphale (performed by The Staff Band of the Norwegian Armed Forces): A well recorded orchestral track originally recorded in DXD (32/352.8).

2. Vivaldi - Recitative and Aria from Cantata RV 679, "Che giova il sospirar, povero core" (performed by Tone Wik & Barokkanerne) - String orchestra with female vocals. Also DXD-recorded originally based on the description from the website.

The third sample is taken from the excellent recent recording off the Open Goldberg Variations. Again, I am using the 24/96 high-resolution download as a starting point:

3. Bach: Goldberg Variations BWV 988 - Aria (performed by Kimiko Ishizaka). The recording was done at Teldex Studio in Berlin using the Bösendorfer 290 Imperial CEUS concert grand piano. It has been said by some audiophiles that the piano is an extremely difficult instrument to reproduce well. It's also a much slower piece which provides an opportunity to listen to the note decay quality. Low-level spatial room acoustics are also easily heard on this recording.

Due to the size of high-resolution downloads, each sample was limited to 1.5-2 minutes (the 2L samples were 2 minutes long, 1.5 minutes for the Bach). Some of the more interesting or dynamic portions of the musical samples were selected. The only processing added was a fade-in and/or fade-out of <2 seconds at the beginning and/or end of each track so the cuts would not be too abrupt. FLAC compression was used to decrease file size.

The dithering process was basic. Using an older version of Adobe Audition (version 3.0.1), a flat triangular dither of 0.5 bits was utilized with settings as shown:
The sample rate was kept at 96kHz. These are very conservative settings and no advanced settings like noise shaping were utilized as featured in some of the "better" dithering algorithms like iZotope's MBIT+ or Weiss' POWr, etc. Adobe Audition again was used to convert the dithered 16-bit data back to a 24-bit container.
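To make the idea concrete, here is a rough sketch of triangular dither applied while reducing 24-bit samples to 16-bit values and placing them back in a 24-bit container. This is not Adobe Audition's actual algorithm: the function name, the seeded random generator, and the reading of "0.5 bits" as a triangular noise peak of ±0.5 of one 16-bit step are all assumptions made for illustration.

```python
import math
import random

def dither_to_16_in_24(samples_24, peak_lsb=0.5, seed=0):
    """Requantize signed 24-bit integers to 16-bit steps with triangular dither.

    Returns integers in the 24-bit range whose low 8 bits are all zero.
    """
    rng = random.Random(seed)
    half = peak_lsb / 2
    out = []
    for s in samples_24:
        x = s / 256.0  # the sample expressed in units of one 16-bit step
        # The sum of two uniform values has a triangular distribution
        # spanning -peak_lsb .. +peak_lsb.
        noise = rng.uniform(-half, half) + rng.uniform(-half, half)
        q = math.floor(x + noise + 0.5)   # round to the nearest 16-bit step
        q = max(-32768, min(32767, q))    # clamp to the 16-bit range
        out.append(q * 256)               # back into a 24-bit container
    return out

original = [0, 1, 127, 128, 1000, -1000, 123456, -654321]
print(dither_to_16_in_24(original))
```

Without noise shaping, the added noise is spread evenly across the spectrum rather than pushed toward less audible frequencies, which is why settings like these count as conservative.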

The 24-bit and (effective) 16-bit versions were randomly assigned as Sample A or B and files were enumerated 1 to 6 in the final package downloaded by the respondents.

Due to the fact that this is an "open" test released on the Internet (rather than a listening test in a lab situation where variables could be easily controlled), some measures were implemented to prevent easy differentiation of 24 vs. 16 bit-depth by other means than just listening. (Thanks to Wombat for giving me some ideas.)

1. Files 2, 4 and 6 (Sample B of each track) had 1 ms cut off from the start and files 1, 3, and 5 (Sample A) had 1 ms truncated from the end. This maintains the exact duration of Sample A and B but shifted them temporally. Doing this confounded simple null tests that did not take into consideration the slight timing offset.

2. A very low level -140dB (average RMS power) white noise was mixed into the 16-bit dithered samples (remember, they were placed in 24-bit containers) to affect the LSB so that a simple program that just checked the bit-depth (by looking for "0" in the least significant bits) will think that this is an actual 24-bit resolution file. This small amount of white noise would be inaudible and well below the dithered 16-bit audio noise floor (and below the objective noise floor of actual DACs).

3. FLAC was consistently LESS EFFICIENT at compressing the dithered (effective 16-bit) files resulting in larger file sizes. As a result, one of the 24-bit files was purposely compressed at FLAC level 2 (versus level 8) to make the file size slightly larger than the respective dithered version.

[Of note: the beta-testers wanted me to implement even more than the above to hide the identity of the 16-bit dithered files! I suppose I had more faith in human nature.]
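As an illustration of measure 2, the sketch below shows the kind of naive bit-depth check being defeated and why noise at -140dB is enough to do it. It assumes signed 24-bit integer samples, Gaussian white noise, and a level referenced to the full-scale peak value; the function names are hypothetical, and this is not the actual tool or noise generator used to prepare the files.

```python
import random

FULL_SCALE_24 = 2 ** 23  # peak magnitude of a signed 24-bit sample

def looks_like_16_bit(samples_24):
    """Naive check: are the low 8 bits of every 24-bit sample zero?"""
    return all((s & 0xFF) == 0 for s in samples_24)

def add_white_noise(samples_24, rms_dbfs=-140.0, seed=0):
    """Mix in Gaussian white noise at the given RMS level (assumed dBFS)."""
    rng = random.Random(seed)
    # -140 dB below full scale works out to roughly 0.84 of one 24-bit step.
    sigma = FULL_SCALE_24 * 10 ** (rms_dbfs / 20)
    return [s + round(rng.gauss(0.0, sigma)) for s in samples_24]

# 16-bit values padded into a 24-bit container (low 8 bits are zero).
padded = [v * 256 for v in (0, 100, -2000, 12345, -32000)] * 200
noisy = add_white_noise(padded)

print("Padded file flagged as 16-bit:", looks_like_16_bit(padded))
print("After -140dB noise, flagged as 16-bit:", looks_like_16_bit(noisy))
```

Because the noise is on the order of a single 24-bit step, it toggles the lowest bits while staying far below the 16-bit dither noise floor.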

Knowing the above, if one were to align the files, cut off 2 seconds from the front and end (to account for any slight variation in the fades), we could run the files through a null test and obtain the following amplitude results:
Bozza - La Voie Triomphale
Vivaldi - Recitative & Aria
Bach - Goldberg Aria
As you can see, the null test demonstrates peak amplitude difference down in the -90dB level (and average RMS difference down at -98dB) as a result of dithering from 24 to 16-bits. Also, for those who had a peek, you can see the higher noise floor during quiet portions such as this fade-in portion in the Bach Goldberg (0.501 seconds in):
24-bit
Dithered to 16-bits
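For anyone wanting to repeat the null test, here is a minimal sketch of the alignment step, assuming both samples are already decoded to lists of floats in the -1.0 to 1.0 range. The synthetic 440Hz tone, the plain rounding to 16 bits (no dither), and the function names are stand-ins for illustration only, and this sketch skips the 2 second trim at each end that was used for the figures above.

```python
import math

SAMPLE_RATE = 96_000
OFFSET = SAMPLE_RATE // 1000  # 1 ms at 96kHz = 96 samples

def to_db(value):
    """Convert a linear amplitude (1.0 = full scale) to dB."""
    return 20 * math.log10(value) if value > 0 else float("-inf")

def null_test(sample_a, sample_b, offset=OFFSET):
    """Subtract B from A after compensating for the 1 ms shift.

    Sample A lost 1 ms from its end and Sample B lost 1 ms from its start,
    so B[i] lines up with A[i + offset].
    """
    diff = [a - b for a, b in zip(sample_a[offset:], sample_b)]
    peak = max(abs(d) for d in diff)
    rms = math.sqrt(sum(d * d for d in diff) / len(diff))
    return to_db(peak), to_db(rms)

# Synthetic stand-in for the music: a 440Hz tone and a 16-bit rounded copy.
source = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
          for n in range(9600 + OFFSET)]
rounded = [round(x * 32768) / 32768 for x in source]

sample_a = source[:-OFFSET]   # 1 ms truncated from the end
sample_b = rounded[OFFSET:]   # 1 ms cut off from the start

aligned_peak, aligned_rms = null_test(sample_a, sample_b)
naive_peak, naive_rms = null_test(sample_a, sample_b, offset=0)
print(f"Aligned null: peak {aligned_peak:.1f} dB, RMS {aligned_rms:.1f} dB")
print(f"Unaligned null: peak {naive_peak:.1f} dB, RMS {naive_rms:.1f} dB")
```

The unaligned subtraction leaves most of the music behind, which is how the 1 ms shift confounds a simple null test.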
The resulting samples were also run through the DR Meter (version 1.1.1) in foobar to ensure that the volume levels were equivalent:

DR         Peak            RMS           Duration Track
--------------------------------------------------------------------------------
DR12     -10.35 dB   -26.90 dB      1:30 05-Sample A - Goldberg Aria
DR12     -10.35 dB   -26.90 dB      1:30 06-Sample B - Goldberg Aria
DR13      -0.17 dB   -17.36 dB      2:00 01-Sample A - Bozza: La Voie Triomphale
DR13      -0.17 dB   -17.36 dB      2:00 02-Sample B - Bozza: La Voie Triomphale
DR14      -4.13 dB   -21.41 dB      2:00 03-Sample A - Vivaldi: Recitative & Aria
DR14      -4.13 dB   -21.41 dB      2:00 04-Sample B - Vivaldi: Recitative & Aria

This also demonstrates that the samples were of good dynamic range - DR12 to 14. No major dynamic range compression, clipping or peak limiting in any of the source material as shown below:
Bozza - La Voie Triomphale
Vivaldi - Recitative & Aria
Bach - Goldberg Aria
These "audiophile" samples should therefore provide a good chance to experience dynamic nuances between 16-bit and 24-bit audio. (Much better than the typical compressed, limited audio of modern rock/pop recordings sold as "high resolution" routinely with <DR10.)

The samples were ZIPped together and distributed in a single file (~200MB in size). My FTP server was the primary download source with secondary download sites at privatebits.net (thanks again Ingemar), Uploaded.net, and FilePost.com.

Here then is the randomization used:

01 - Sample A - Bozza - La Voie Triomphale --- 16-bit
02 - Sample B - Bozza - La Voie Triomphale --- 24-bit
03 - Sample A - Vivaldi - Recitative & Aria --- 24-bit
04 - Sample B - Vivaldi - Recitative & Aria --- 16-bit
05 - Sample A - Goldberg --- 24-bit
06 - Sample B - Goldberg --- 16-bit

The 24-bit original audio files for the test samples are therefore B-A-A.
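As a small illustration of how a response can be checked against this key, here is a hedged Python sketch. The dictionary layout, the piece labels and the function name score are my own choices for the example; the survey site did not necessarily store answers this way.

```python
# Answer key from the randomization above: which sample (A or B) was the 24-bit file.
ANSWER_KEY = {"Bozza": "B", "Vivaldi": "A", "Goldberg": "A"}


def score(response):
    """Count how many pieces the respondent matched to the 24-bit sample."""
    return sum(1 for piece, truth in ANSWER_KEY.items() if response.get(piece) == truth)


# Hypothetical respondent who picked B, B, A.
example = {"Bozza": "B", "Vivaldi": "B", "Goldberg": "A"}
print(score(example))  # 2

# Someone flipping a coin on each piece gets all three right 1 time in 8.
chance_all_three = 0.5 ** len(ANSWER_KEY)
print(chance_all_three)  # 0.125
```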

"Advertising" for this test was done through forum invitations extended to:
A few other smaller forums had invitations advertised as well. Invitations included a request for participants NOT to share their findings so as not to affect others, and a warning that this is a 24-bit test, so the participant should try to ensure that the equipment (at least the DAC) is capable of >16-bit resolution. In general, participants were dissuaded from just using a direct computer motherboard/laptop output. I visited the advertisement threads on occasion and also reminded readers of the closure date on June 20, 2014. "Golden eared" audiophiles and those with high-end audio equipment were encouraged to participate. Due to the 2-month window, participants were asked not to rush the listening evaluation.

Participant results were collected through an active, paid account on: http://freeonlinesurveys.com/. Cookies were used to prevent double entries from the same computer. Participants were asked to:
1. Identify what they believe to be the 24-bit sample. (Presumably the "better sounding" track.)
2. Identify their level of certainty for each test track, graded on a 5-point scale (1 = "guess", 5 = "certain").
3. Tell me whether an ABX tool or other instantaneous comparison tool was utilized.
4. Provide demographics: gender, age, "musician" background, audio engineering/editing background, audio hardware reviewer status.
5. Describe evaluation hardware: components, cost of equipment.
6. Provide their subjective input: details on the hardware, any surprises in terms of difficulty, and a description of the audible difference (if any).

As suggested by the nature of this test and the data collected, I wished to answer the following questions (as expressed on April 30th on this thread in the Squeezebox forum):

Primary objectives:
1. How "easy" was it for people to detect (or report) a difference?
2. How accurate were the respondents in detecting the 24-bit sample?

It'll be interesting also to have a look at:
1. Which musical piece was it easier to hear a difference in.
2. Whether more expensive gear resulted in more accurate detection.
3. Whether age was a factor (might be hard to generalize unless I can normalize the gear quality).
4. Whether those who felt confident that they got it right actually did. Perhaps a measure of human ability to self-evaluate.
5. Whether there were more successful results from headphones vs. speakers.
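One way to frame the accuracy question is to compare the number of all-correct responses against what coin-flipping alone would produce. The sketch below assumes every respondent answers all three pieces and that a guesser has a 50% chance on each, so 1 in 8 for all three. The 140 is the response total reported when the survey closed; the function name prob_at_least and the count passed to it are illustrative only, not a result.

```python
from math import comb


def prob_at_least(k, n, p):
    """Binomial tail: probability of k or more successes in n independent tries."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_responses = 140           # total responses received
p_all_three = 0.5 ** 3      # pure guessing on three two-way choices

expected_by_chance = n_responses * p_all_three
print(expected_by_chance)   # 17.5

# Illustrative count only: how surprising would 20 all-correct responses be?
illustrative_count = 20
print(prob_at_least(illustrative_count, n_responses, p_all_three))
```

If the printed probability is large, that many all-correct responses is about what guessing would produce, and says little about audibility either way.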

Thank you to all the "beta testers" involved before the survey went public! Also, thank you again to all the participants who took the time.


24-bit vs. 16-bit Blind Listening Test Closed...

The day has arrived...



The survey for the blind test ended today! Thank you to everyone who had the patience to take the time to listen to the 3 samples and submit your results. A few people admitted to only listening "a few times" but it certainly looks like the majority took the time to seriously listen and I certainly appreciate the detailed responses provided.

In total, I received 140 responses over the 2 months. Here's the map of the countries with submissions:



As you can see, not unexpectedly we have 3 main "clusters" of input from audiophiles - N. America, Europe, and the Pacific region (Asia + Australia + New Zealand). Then there's the single South African submission :-). The breakdown looks like this:

North America: 36 USA + 12 Canada = 48

Europe: 14 UK + 1 Spain + 6 France + 2 Belgium + 8 Netherlands + 12 Germany + 1 Denmark + 4 Sweden + 4 Norway + 1 Finland + 1 Estonia + 2 Austria + 3 Italy + 7 Croatia + 4 Hungary + 1 Bulgaria + 1 Turkey + 1 Cyprus + 1 Israel = 74

Asia & Oceania: 1 India + 1 China + 1 Taiwan + 2 Malaysia + 3 Australia + 1 New Zealand = 9

Africa: 1 South Africa

Unknown: 8 (for some reason the IP could not be traced to a country; I've seen this with Russian IP addresses)

I didn't work out the per-capita numbers but 7 from Croatia caught my eye! Nice.

As in the MP3 Blind Test, I'm going to be posting the results over the next week or two in parts. Coming up in the next 24 hours will be a description of the procedure. This will include the "answers" as to which samples were the 24-bit audio. I'll speak about how the files were created as well as the dithering algorithm used. Following this will be the results and then a discussion of the implications of the findings...

Stay tuned!

Part I: PROCEDURE

Thursday, 12 June 2014

REMINDER: 1 Week Left (24-bit vs. 16-bit blind test)

¡Hola amigos!
Greetings from here:
Swimming with the turtles and stingrays off the coast of the Mayan Riviera...
Thought I'd just put up a little reminder that I'll be closing the blind test on June 20th - approximately 1 week from now. At this point, we're up to 120 responses on the survey (muchas gracias).

Although there are always limits to test methodology, and I certainly do not pretend that all variables have been controlled for (indeed, that is impossible in cases like this where it's being done "remotely" over the internet!), I do believe this is a valuable test for the audiophile community. It's an opportunity to expose one's expectations (yes - 24 bits provide 16 million "levels" vs. the paltry 65536 "levels" of 16 bits) to reality testing in the comfort of one's home; away from Industry biases, suggestions from audio gurus, and group expectations that may be set up when one goes to a showroom or trade show. This is about what audio lovers around the world actually hear in the real world...
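For anyone curious where the "levels" figures come from, here is a short Python sketch of the arithmetic. The function names are just for illustration, and the dB figure is the simple ratio of full scale to one quantization step (about 6 dB per bit), a textbook idealization rather than a measurement of any real DAC.

```python
import math


def levels(bits):
    """Number of distinct sample values at a given bit depth."""
    return 2 ** bits


def step_ratio_db(bits):
    """Full scale relative to one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)


for bits in (16, 24):
    print(f"{bits}-bit: {levels(bits):,} levels, {step_ratio_db(bits):.1f} dB")
# 16-bit: 65,536 levels, 96.3 dB
# 24-bit: 16,777,216 levels, 144.5 dB
```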

Please put in your own response and suggest it to audiophile friends who may want to give this a try before the closing date. Feel free to also put it up on audio(phile) forums you may frequent. Just remember - you'd better have a system that has >16-bit capability.

Golden ears and those with 5+ figure audio systems - I would really love to have your continued survey response! I would also love to get musicians, sound engineers, and reviewers of audio hardware involved.

As usual, test details including procedure and files can be found here:
http://archimago.blogspot.com/2014/04/internet-test-24-bit-vs-16-bit-audio.html

Talk to you all later - likely after the test end date...