Since they're out there already, let us spend some time this week to look at these visual analogies as a way to "think" about what the authors of these works want us to consider/believe. I'm going to screen capture without permission a couple of these images to explore. As usual, I do this out of a desire to discuss, critique, and hopefully educate which I consider "fair use" of copyrighted material; as a reminder to readers, other than a tiny bit of ad revenue on this blog (hey, why not?), I do not expect any other gain from writing a post like this.
First, here's PONO's "Underwater Listening" diagram released around the time of the 2014 SXSW (March 2014):
|PONO: Underwater Listening|
How audio formats would evoke a desire to compare underwater depths remains a mystery to me. Obviously, there's a desire to impress upon the recipient two main messages - a direct correlation between sampling rate (from CD up) with quality, and to make sure the MP3 format gets deprecated as much as possible (1000 ft?!). On both those counts, this image gets it so wrong, it's almost comedic. Clearly, one cannot directly correlate samplerate and bitrate with audio quality because the relationship isn't some kind of linear correlation. Why would CD quality be "200 ft", and 96kHz "20 ft"? Surely nobody in their right mind would say that 96kHz is 10 times perceptually "better". Sure, there is a correlation such that a low bitrate file like 64kbps MP3 will sound quite compromised with poor resolution, but without any qualification around this important bitrate parameter, how can anyone honestly say that all MP3s sound bad? I might as well say that Neil Young's a poor-sounding recording artist because the Le Noise (2010) and A Letter Home (2014) albums are low fidelity.
I suspect that the PONO camp must be a bit ashamed of this diagram since I don't see it around anymore and I don't find it on their website (might have missed it). I don't think the "underwater" diagram made many friends nor sold many machines in any case...
Here's a more recent chart from Meridian circa late 2014:
|Meridian: History of audio quality & convenience?!|
Of course this is the myth that they primarily want to perpetuate because guess what... Buy this "revolutionary" Meridian MQA and that'll make streaming sound awesome!
While in some cases, sure, we can say a very poorly encoded 192kbps MP3 download (like something done in 1999 with XING MP3) could sound significantly worse than CD and a 64kbps stream can be worse than an excellent cassette copy, like the PONO "artwork" above, there are some truly horrible gross generalizations here! Many LPs sound poor due to low quality pressings, many downloads are qualitatively superb, and clearly any reasonable music streaming service sounds better than a cassette tape - who's kidding whom?! Furthermore, a high resolution digital master (like with high-res downloads or encoded on DVD-A/SACD) has the capability to be more accurate than reel-to-reel tape, but of course subjectively, analogue tape can add its own unique signature/color/distortion that can be preferred... (To be able to mix in the digital domain without generational fidelity loss compared to analogue tape is obviously a big plus.)
Of course, it's easy for me to just criticize without putting something forward... Therefore, please allow me to add for your consideration my submission to the "overgeneralized sound quality vs. audio format graph":
The Y-axis therefore represents the "Perceived Fidelity" up to 100%. Exactly how fidelity is measured is not important in this simple diagram but obviously will consist of frequency response, dynamic resolution, low noise floor, low distortion including timing anomalies using the same mastering of a recording of superb quality for all formats. On the X-axis, we have "Effective Uncompressed PCM Bitrate" as a measure of approximately how much data is used for encoding the audio. This is a proxy for how much "effort" is needed to achieve the level of fidelity. Note that the scale is logarithmic, not linear to correspond to the logarithmic perception of frequencies and dynamic range. More data, more storage, more "effort" is needed to achieve any improvement to perceived quality as we go towards the top of the plateau to the right of the graph.
As you can see, the curve plateaus since we obviously cannot hear beyond around "100%". At some point, it really does not matter how much data we use to encode the sound, there just will not be any significant perceivable difference and all we've done is wasted storage. The big question of course is at what point along this curve do we place the capabilities of the various audio formats.
Starting with good old CD, we know that scientific research has shown little evidence to suggest in controlled trials that higher resolution sounds much better (see discussion here). Therefore, I think it's reasonable to put it at point (1) which is quite far along the curve already - this would correspond to the 16/44 stereo PCM bitrate of ~1.5Mbps. It's very close to the 100% point - I don't think it's unreasonable to say around 95% so there is a possibility for some improvement. Where exactly this lies is not that important, it could be 90% for example; the main idea being that qualitative gains beyond the CD format are not going to be really massive. As we go higher to 24/96 (~5Mbps, point 3) and 24/192 (~10Mbps, point 5), we achieve essentially 100% perceived quality and for all the effort in terms of bitrate/file size, relatively little is gained. Although mathematically these high-resolution formats can capture more frequencies and greater dynamic range, the actual auditory benefits are limited.
Where does DSD sit in this? Realize that 1-bit DSD isn't as efficient as PCM (a description I've seen calls each bit of DSD an "additive" refinement to the sound, versus a "geometric" refinement with multibit PCM). Furthermore, noise shaping shifts the quantization noise into the higher frequencies resulting in non-uniform dynamic range across the spectrum; this is generally not a problem because hearing sensitivity also drops in the higher frequencies. From what I have heard and through examining DSD rips, I think that DSD64 is better (more accurate) than CD but not much more (I personally think 21-bit/50kHz PCM, about ~2Mbps, is good enough for DSD64 conversion and avoids encoding all that excess noise) whereas DSD128 is just short of 24/96 but very close. Note that this inefficiency in DSD encoding screams for the use of compression which I have argued should really be implemented in DSD file formats a couple years back.
So what about lossy compression in terms of perceived fidelity? Considering that there has not been good data to demonstrate that many people can differentiate high bitrate MP3 from lossless PCM, I have no issues placing it just shy of CD quality. To keep the graph clutter-free, I just used a single line to denote the MP3 320kbps quality even though I recognize that there could be a wide range to the fidelity depending on quality of the encoder and demands of the music. There are special cases, usually containing high frequency content that can demonstrate limitations with high bitrate MP3 but these are rare and generally will not be evident in actual music. You might ask "why is 320kbps MP3 equivalent to ~1.5Mbps uncompressed PCM!?" The answer is due to psychoacoustics techniques employed. Sure, there is significant data reduction, and yes, taken out of context of the rest of the audio, you can hear the difference (as in "The Ghost in the MP3" project). However the data removal was done with sophisticated algorithms informed by models of human hearing. As encoding algorithms have improved, so too have the sonic quality of the resulting MP3 over the years. This is a good example of how you cannot compare bitrates directly; the way the data is being encoded is obviously very different! And sadly PONO advertising doesn't seem to understand this when they keep using diagrams like this:
Just because a lot of data is used doesn't mean there's much benefit even if the recording were done in true high resolution. By the time we get to 24/192, we're way into the zone of diminishing returns and may in fact as some have suggested entered a point where the sound quality suffers because of potential intermodulation distortion from ultrasonic frequencies and some DAC's may no longer be functioning in an optimal fashion. The fact that technologically we can get this far into the curve is also a reflection of the state of maturity of audio science. Personally I remain partial to 24/96 as a "container" for the highest resolution humans will ever need; one which is already standard on both recording and playback equipment.
Finally, as I indicated in a previous post, vinyl has limitations. Yes, it can of course sound great but there are limitations to accuracy (including differences for outer grooves vs. inner grooves), higher overall distortion, and material imperfections. As a result, there will be a wide range to the sound of LP playback as identified in the graph. Perceived fidelity compared to the original source would be lower but also remember that just like the reel-to-reel tape discussion above, some of the distortion and coloration could be "euphonic" as well - hence preferred by some (many?).
I'm sure a graphics artist could produce a much more pleasing image than what I kludged together above :-). Like the PONO and Meridian pictures, it's simplistic but I think compared to the others, a more realistic representation.
Notice that the Meridian graph above tries to suggest that there has been deterioration of potential sound quality over time (especially when they suggest streaming quality is like cassette tape!). I've seen a number of people parrot this same idea in magazines and forums. I think this is nonsense. Consider that even free Spotify is streaming with Ogg Vorbis 160kbps on the desktop (still very good!). With a premium account, you get 320kbps. And sites like Tidal already do lossless 16/44 FLAC. We're looking at quality either reasonably close or identical to CD quality. Here's my version of the chart:
One final comment to those who feel that just because folks like myself do not believe high bitrate MP3 sounds substantially different from lossless 16/44, that I'm somehow "advocating" for lossy audio. That's not exactly true since I don't think anyone would deny that lossless formats are superior for the best accuracy / fidelity. I still prefer FLAC as my archive format because then I can convert to whatever other format I want without multigenerational lossy degradation. However, I do believe MP3 is the way to go with cars and portable audio even if they support lossless and high resolution. High bitrate MP3's are quick to transfer, take up less space, and there's just no way I will be able to hear a difference in my car or walking down the street. I personally find high-resolution lossless files (or God forbid uncompressed DSD) on a phone or portable device extremely wasteful even if storage size were not an issue. MP3 (and similar formats like AAC, WMA, Vorbis...) has its place as a tool for high quality compression and there are many applications where it's all one ever needs to get the job done completely. Plus MP3s are universally supported.
Bottom Line: Remember the principle of diminishing returns as we're dealing with mature audio technology and limitations of the hearing apparatus. It's important to keep this in mind when assessing the promise of "new technology" and manufacturer claims such as the diagrams above.
(Did anyone see any critical comments from the audiophile press about PONO or Meridian's ad material above? How about Sony's 64GB "Premium Sound" SD card recently? There sadly seems to be a lack of critical thinking in much of the audiophile reporting these days, which only serves to isolate this hobby and solidifies the concept of the pejorative "audiophool".)
Regretfully, I missed a live performance by Cécile McLorin Salvant here in Vancouver last weekend. A friend went and thought the performance was amazing! She seems to be channeling a young Ella...
Check out her albums Cecile (2010) and the Grammy-nominated Womanchild (2013) if you like jazz vocals.
Enjoy the music...