Sunday, 1 March 2015

MUSINGS: Audio Quality, The Various Formats, and Diminishing Returns - In Pictures!

Let me be the first to say that graphs and charts where audio formats are plotted out in terms of unidimensional sound quality ratings are ridiculously oversimplified and can be very misleading! However, they can be fun to look at and could be used as bite-sized "memes" for discussion when meeting up with audio friends or for illustration when people ask about audio quality.

Since they're out there already, let us spend some time this week looking at these visual analogies as a way to "think" about what the authors of these works want us to consider/believe. I'm going to screen capture without permission a couple of these images to explore. As usual, I do this out of a desire to discuss, critique, and hopefully educate, which I consider "fair use" of copyrighted material; as a reminder to readers, other than a tiny bit of ad revenue on this blog (hey, why not?), I do not expect any other gain from writing a post like this.

First, here's PONO's "Underwater Listening" diagram released around the time of the 2014 SXSW (March 2014):
PONO: Underwater Listening
Others have already commented on this of course (here also). I don't know what ad "genius" came out with this diagram, but it is cute, I suppose. I remember being taken aback by this picture initially as it's so far out of "left field" (creative?) that I felt disoriented when I first saw this thing...

How audio formats would evoke a desire to compare underwater depths remains a mystery to me. Obviously, there's a desire to impress upon the recipient two main messages - a direct correlation between sampling rate (from CD up) with quality, and to make sure the MP3 format gets deprecated as much as possible (1000 ft?!). On both those counts, this image gets it so wrong, it's almost comedic. Clearly, one cannot directly correlate samplerate and bitrate with audio quality because the relationship isn't some kind of linear correlation. Why would CD quality be "200 ft", and 96kHz "20 ft"? Surely nobody in their right mind would say that 96kHz is 10 times perceptually "better". Sure, there is a correlation such that a low bitrate file like 64kbps MP3 will sound quite compromised with poor resolution, but without any qualification around this important bitrate parameter, how can anyone honestly say that all MP3s sound bad? I might as well say that Neil Young's a poor-sounding recording artist because the Le Noise (2010) and A Letter Home (2014) albums are low fidelity.

I suspect that the PONO camp must be a bit ashamed of this diagram since I don't see it around anymore and I don't find it on their website (might have missed it). I don't think the "underwater" diagram made many friends nor sold many machines in any case...

Here's a more recent chart from Meridian circa late 2014:
Meridian: History of audio quality & convenience?!
From this, we "learn" that "downloads" have poorer quality than CDs (always?!). Also, I "learn" that LPs sound significantly better than "DVD-A/SACD" (and by extension high-resolution audio). But the most important point is that current streaming audio sounds worse than cassette tapes in quality. Does that make sense to anyone? Is this saying that customers streaming from Spotify, Tidal, Qobuz, etc. are so hung up on convenience that they're willing to listen to sound quality worse than an '80s Walkman?

Of course this is the myth that they primarily want to perpetuate because guess what... Buy this "revolutionary" Meridian MQA and that'll make streaming sound awesome!

While in some cases, sure, we can say a very poorly encoded 192kbps MP3 download (like something done in 1999 with XING MP3) could sound significantly worse than CD and a 64kbps stream can be worse than an excellent cassette copy, like the PONO "artwork" above, there are some truly horrible gross generalizations here! Many LPs sound poor due to low quality pressings, many downloads are qualitatively superb, and clearly any reasonable music streaming service sounds better than a cassette tape - who's kidding whom?! Furthermore, a high resolution digital master (like with high-res downloads or encoded on DVD-A/SACD) has the capability to be more accurate than reel-to-reel tape, but of course subjectively, analogue tape can add its own unique signature/color/distortion that can be preferred... (To be able to mix in the digital domain without generational fidelity loss compared to analogue tape is obviously a big plus.)

Of course, it's easy for me to just criticize without putting something forward... Therefore, please allow me to add for your consideration my submission to the "overgeneralized sound quality vs. audio format graph":

It's a graph of the law of diminishing returns in terms of audio technology and sound quality. I think it's important to take into account the fact that hearing ability is obviously NOT infinite. Due to biological phenotypic variation, there's probably a bell-shaped curve to hearing ability as well as moment-to-moment fluctuations in acuity which is represented by the "Zone of max. auditory acuity" gradient [See comments: probably more of an asymmetrical negatively skewed distribution]. Depending on a person's maximum hearing ability, the 100% point will shift up or down relative to another but let's keep this graph simple and say that for any individual, we can only hear up to 100% based on how we're endowed. Day to day, our hearing acuity changes - everything from current stress level affecting the ability to attend to the sound, to ear wax, to allergies, to sinus/ear infections, to noise induced hearing loss, to tinnitus, to age will result in a decline in the maximum acuity (some of this sadly irreversible). Obviously, mental training can help improve how well we attend and pick up subtle cues.

The Y-axis therefore represents the "Perceived Fidelity" up to 100%. Exactly how fidelity is measured is not important in this simple diagram but obviously will consist of frequency response, dynamic resolution, low noise floor, low distortion including timing anomalies using the same mastering of a recording of superb quality for all formats. On the X-axis, we have "Effective Uncompressed PCM Bitrate" as a measure of approximately how much data is used for encoding the audio. This is a proxy for how much "effort" is needed to achieve the level of fidelity. Note that the scale is logarithmic, not linear to correspond to the logarithmic perception of frequencies and dynamic range. More data, more storage, more "effort" is needed to achieve any improvement to perceived quality as we go towards the top of the plateau to the right of the graph.

As you can see, the curve plateaus since we obviously cannot hear beyond around "100%". At some point, it really does not matter how much data we use to encode the sound, there just will not be any significant perceivable difference and all we've done is wasted storage. The big question of course is at what point along this curve do we place the capabilities of the various audio formats.

Starting with good old CD, we know that scientific research has shown little evidence to suggest in controlled trials that higher resolution sounds much better (see discussion here). Therefore, I think it's reasonable to put it at point (1) which is quite far along the curve already - this would correspond to the 16/44 stereo PCM bitrate of ~1.5Mbps. It's very close to the 100% point - I don't think it's unreasonable to say around 95% so there is a possibility for some improvement. Where exactly this lies is not that important, it could be 90% for example; the main idea being that qualitative gains beyond the CD format are not going to be really massive. As we go higher to 24/96 (~5Mbps, point 3) and 24/192 (~10Mbps, point 5), we achieve essentially 100% perceived quality and for all the effort in terms of bitrate/file size, relatively little is gained. Although mathematically these high-resolution formats can capture more frequencies and greater dynamic range, the actual auditory benefits are limited.
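The bitrates quoted above follow from simple arithmetic: bit depth times sample rate times channel count. A minimal sketch, assuming stereo and the usual 44.1kHz CD rate (the function name is just for illustration):

```python
def pcm_bitrate_bps(bit_depth: int, sample_rate_hz: int, channels: int = 2) -> int:
    """Raw (uncompressed) PCM bitrate in bits per second."""
    return bit_depth * sample_rate_hz * channels


# Formats discussed in the text, as (bit depth, sample rate in Hz).
formats = {
    "16/44.1": (16, 44_100),
    "24/96": (24, 96_000),
    "24/192": (24, 192_000),
}

for name, (bits, rate) in formats.items():
    mbps = pcm_bitrate_bps(bits, rate) / 1_000_000
    print(f"{name}: {mbps:.3f} Mbps")
```

CD works out to about 1.41 Mbps (rounded up to ~1.5Mbps in the text), 24/96 to about 4.6 Mbps and 24/192 to about 9.2 Mbps, so each step up costs several times the data for what is argued to be a small perceptual gain.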

Where does DSD sit in this? Realize that 1-bit DSD isn't as efficient as PCM (a description I've seen calls each bit of DSD an "additive" refinement to the sound, versus a "geometric" refinement with multibit PCM). Furthermore, noise shaping shifts the quantization noise into the higher frequencies resulting in non-uniform dynamic range across the spectrum; this is generally not a problem because hearing sensitivity also drops in the higher frequencies. From what I have heard and through examining DSD rips, I think that DSD64 is better (more accurate) than CD but not by much (I personally think 21-bit/50kHz PCM, about 2Mbps, is good enough for DSD64 conversion and avoids encoding all that excess noise) whereas DSD128 is just short of 24/96 but very close. Note that this inefficiency in DSD encoding screams for the use of compression which, as I argued a couple of years back, should really be implemented in DSD file formats.
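To put rough numbers on that inefficiency, here is a small sketch comparing raw stereo DSD data rates with the 21-bit/50kHz PCM suggested above. It assumes DSD64 means 1-bit samples at 64 times the 44.1kHz CD rate and DSD128 twice that; the helper names are made up for this example.

```python
CD_RATE_HZ = 44_100


def dsd_bitrate_bps(multiple: int, channels: int = 2) -> int:
    """Raw DSD bitrate: 1 bit per sample at `multiple` x 44.1 kHz per channel."""
    return multiple * CD_RATE_HZ * channels


def pcm_bitrate_bps(bit_depth: int, sample_rate_hz: int, channels: int = 2) -> int:
    """Raw PCM bitrate in bits per second."""
    return bit_depth * sample_rate_hz * channels


dsd64 = dsd_bitrate_bps(64)             # 5,644,800 bps
dsd128 = dsd_bitrate_bps(128)           # 11,289,600 bps
pcm_21_50 = pcm_bitrate_bps(21, 50_000)  # 2,100,000 bps
pcm_24_96 = pcm_bitrate_bps(24, 96_000)  # 4,608,000 bps

print(f"DSD64  : {dsd64 / 1e6:.2f} Mbps vs 21/50 PCM: {pcm_21_50 / 1e6:.2f} Mbps")
print(f"DSD128 : {dsd128 / 1e6:.2f} Mbps vs 24/96 PCM: {pcm_24_96 / 1e6:.2f} Mbps")
```

On these assumptions, uncompressed DSD64 carries well over twice the data of the PCM format judged "good enough" to hold it, which is the point about DSD being a candidate for compression.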

So what about lossy compression in terms of perceived fidelity? Considering that there has not been good data to demonstrate that many people can differentiate high bitrate MP3 from lossless PCM, I have no issues placing it just shy of CD quality. To keep the graph clutter-free, I just used a single line to denote the MP3 320kbps quality even though I recognize that there could be a wide range to the fidelity depending on quality of the encoder and demands of the music. There are special cases, usually containing high frequency content, that can demonstrate limitations with high bitrate MP3 but these are rare and generally will not be evident in actual music. You might ask "why is 320kbps MP3 equivalent to ~1.5Mbps uncompressed PCM!?" The answer is due to the psychoacoustic techniques employed. Sure, there is significant data reduction, and yes, taken out of context of the rest of the audio, you can hear the difference (as in "The Ghost in the MP3" project). However the data removal was done with sophisticated algorithms informed by models of human hearing. As encoding algorithms have improved, so too has the sonic quality of the resulting MP3s over the years. This is a good example of how you cannot compare bitrates directly; the way the data is being encoded is obviously very different! And sadly PONO advertising doesn't seem to understand this when they keep using diagrams like this:

Just because a lot of data is used doesn't mean there's much benefit even if the recording were done in true high resolution. By the time we get to 24/192, we're way into the zone of diminishing returns and may in fact, as some have suggested, have entered a point where the sound quality suffers because of potential intermodulation distortion from ultrasonic frequencies and because some DACs may no longer be functioning in an optimal fashion. The fact that technologically we can get this far into the curve is also a reflection of the state of maturity of audio science. Personally I remain partial to 24/96 as a "container" for the highest resolution humans will ever need; one which is already standard on both recording and playback equipment.

Finally, as I indicated in a previous post, vinyl has limitations. Yes, it can of course sound great but there are limitations to accuracy (including differences for outer grooves vs. inner grooves), higher overall distortion, and material imperfections. As a result, there will be a wide range to the sound of LP playback as identified in the graph. Perceived fidelity compared to the original source would be lower but also remember that just like the reel-to-reel tape discussion above, some of the distortion and coloration could be "euphonic" as well - hence preferred by some (many?).

I'm sure a graphics artist could produce a much more pleasing image than what I kludged together above :-). Like the PONO and Meridian pictures, it's simplistic but I think compared to the others, a more realistic representation.

Notice that the Meridian graph above tries to suggest that there has been deterioration of potential sound quality over time (especially when they suggest streaming quality is like cassette tape!). I've seen a number of people parrot this same idea in magazines and forums. I think this is nonsense. Consider that even free Spotify is streaming with Ogg Vorbis 160kbps on the desktop (still very good!). With a premium account, you get 320kbps. And sites like Tidal already do lossless 16/44 FLAC. We're looking at quality either reasonably close or identical to CD quality. Here's my version of the chart:

As you can see, I don't believe there has really been any inverse correlation between sound quality and convenience over time. Note the drop in convenience from CD to DVD-A/SACD which I don't think is a big deal since many DVD-As play in regular DVD players and are easy to rip now (dead format anyway), plus SACDs are often hybrid and play on standard CD players (and can also be ripped these days with some inconvenience). The shift from physical media to "virtual" digital data storage has been tremendously convenient although it brings with it a new skill set - file management, proper tagging, and of course managing backups. Now the shift towards streaming has become even more convenient and "mobile" through wireless data networks (but there's limited ability to customize and tag one's collection, and a limited sense of "ownership" of the music - a problem if one is a "collector"). As far as I'm concerned, the only real qualitative decline was from LP to cassette tapes where convenience in terms of portability improved (can listen in cars and Walkmen, less need for cleaning, but no random access song selection which is why I gave LP a 50, and cassette only an increase to 60 overall). I believe streaming just needs a little more bandwidth and if we can reliably get 24/48 FLAC streaming, we will achieve a quality and convenience beyond what most music lovers and audiophiles would feel they "need" (we'll see if MQA really offers much more). Of course, there's always that desire to have physical artwork and booklets to thumb through while listening to the music - vinyl remains the "king" of album art in that regard.

One final comment to those who feel that, just because folks like myself do not believe high bitrate MP3 sounds substantially different from lossless 16/44, I'm somehow "advocating" for lossy audio. That's not exactly true since I don't think anyone would deny that lossless formats are superior for the best accuracy / fidelity. I still prefer FLAC as my archive format because then I can convert to whatever other format I want without multigenerational lossy degradation. However, I do believe MP3 is the way to go with cars and portable audio even if they support lossless and high resolution. High bitrate MP3s are quick to transfer, take up less space, and there's just no way I will be able to hear a difference in my car or walking down the street. I personally find high-resolution lossless files (or God forbid uncompressed DSD) on a phone or portable device extremely wasteful even if storage size were not an issue. MP3 (and similar formats like AAC, WMA, Vorbis...) has its place as a tool for high quality compression and there are many applications where it's all one ever needs to get the job done completely. Plus MP3s are universally supported.

Bottom Line: Remember the principle of diminishing returns as we're dealing with mature audio technology and limitations of the hearing apparatus. It's important to keep this in mind when assessing the promise of "new technology" and manufacturer claims such as the diagrams above.

(Did anyone see any critical comments from the audiophile press about PONO or Meridian's ad material above? How about Sony's 64GB "Premium Sound" SD card recently? There sadly seems to be a lack of critical thinking in much of the audiophile reporting these days, which only serves to isolate this hobby and solidify the concept of the pejorative "audiophool".)


Regretfully, I missed a live performance by Cécile McLorin Salvant here in Vancouver last weekend. A friend went and thought the performance was amazing! She seems to be channeling a young Ella...

Check out her albums Cecile (2010) and the Grammy-nominated Womanchild (2013) if you like jazz vocals.

Enjoy the music...

Saturday, 21 February 2015

MEASUREMENTS: The Intercontinental Internet Audio Streaming Test...

Time to go intercontinental. :-) [Scene from the old movie War Games.]

After the ethernet cable results last week and the absence of any difference, there was discussion about an extreme internet music server test. What if instead of maybe 100 feet of ethernet cable from server to player, we had thousands of miles of cables in between?

With help from Mnyb from the Squeezebox Forum, we were able to orchestrate a test to demonstrate the extremes of measured performance with the server system basically on the other side of the world. He lives out in Västerås, Sweden approximately 100 km from Stockholm. A direct distance of:

More than 7300 km away from my home in Vancouver, Canada. This test will require the data to be transferred across the Transatlantic internet "backbone" and across North America to get to the west coast of Canada. Considering that the internet infrastructure cables are not straight lines, I suspect the data probably is traveling a substantially longer distance to reach my home than 7300 km.
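As a sanity check on that figure, the great-circle distance can be estimated with the haversine formula. This is only a sketch: the coordinates below are approximate city-centre values assumed for illustration, not the actual locations of either house.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates (assumed for this example).
VASTERAS = (59.61, 16.55)
VANCOUVER = (49.28, -123.12)

distance_km = haversine_km(*VASTERAS, *VANCOUVER)
print(f"Great-circle distance: {distance_km:.0f} km")
```

With these inputs the result lands a little above 7300 km, consistent with the direct distance quoted; the real cable route is of course longer than the great circle.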

Undersea Internet backbone map circa 2011.
I can tell that we're traveling a substantial distance based on the results of data latency. Using the paping (port ping) program, I can 'ping' the TCP port of Mnyb's Logitech Music Server in Sweden to see how long 2-way data takes between the two locales. First within my home, here's how it looks:

As you can see, my server is located internally at, and it takes on average <1 millisecond to go from the media room to my music server and back. Port 9000 is the LMS server control port if you're wondering. But when we reach out to Sweden:
IP address removed of course...

It now shows an average of 213ms data latency. Note that this amount of latency is problematic for real-time interactive data transfer like playing a first-person video game off the server... I'd get slaughtered if it took 1/4 of a second to tell the server what I'm doing in a high speed game of Call of Duty! But remember, music streaming is about bulk data transfers. Let's see if I can find any measurable issue.
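For readers without paping, a similar measurement can be approximated in a few lines of Python by timing how long a TCP connection takes to open. This is a sketch under the assumption that a "port ping" is essentially the TCP connection setup time; it is not a reimplementation of paping, and the demo only talks to a throwaway listener on the local machine.

```python
import socket
import time


def tcp_connect_ms(host: str, port: int, timeout_s: float = 2.0) -> float:
    """Time one TCP connection setup (roughly one round trip), in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout_s):
        pass  # connection established; close it straight away
    return (time.perf_counter() - start) * 1000.0


def average_connect_ms(host: str, port: int, attempts: int = 4) -> float:
    """Average connection setup time over several attempts."""
    samples = [tcp_connect_ms(host, port) for _ in range(attempts)]
    return sum(samples) / len(samples)


if __name__ == "__main__":
    # Demo against a local listener so the sketch runs anywhere.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0 = let the OS pick a free port
        listener.listen(8)
        port = listener.getsockname()[1]
        print(f"loopback average: {average_connect_ms('127.0.0.1', port):.2f} ms")
```

Pointing `average_connect_ms` at a music server's control port (9000 for LMS, as mentioned above) would give a figure comparable in spirit to the averages shown in the screenshots.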

I. Set-up

My local system is set up exactly as described last week. I'll use the green 20' standard CAT 6 UTP cable I measured last week to connect my local switch with the Transporter player.

Mnyb's Server in Sweden <--> generic ethernet patch <--> Netgear GS108 gigabit switch <--> 20' flat ethernet cable <--> Netgear GS105 gigabit switch <--> generic patch <--> Linksys WRT1900AC <--> patch cable to wall socket <--> THE INTERNET >7000 km <--> 3' Cat 6 UTP patch cable <--> NETGEAR Nighthawk R7000 router <--> 30' Cat-6A STP Gigabit cable (in wall) <--> 6' Cat 6A STP patch cable <--> TP-LINK TL-SG1008D gigabit switch <--> 20' Cat 6 UTP cable <--> Logitech Transporter streamer

Whew... As for Mnyb's server computer:
HP ProLiant MicroServer N36L - released in 2010, dual-core 1.3GHz AMD Athlon II NEO, 1GB DDR3 RAM, gigabit port, 250GB OS hard drive, Western Digital "Green" 2TB drive.
Running ClearOS 6 (Linux), Logitech Media Server 7.9.

Note that his server machine is less powerful than what I used in my tests last week with those different ethernet cables (AMD A10-5800K, 16GB RAM). Not a worry though since serving music is a rather straightforward task not needing much CPU speed, especially with an efficient OS like Linux.

Mnyb's ISP is rated 100Mbps down/10Mbps up. Mine is 50Mbps down/5Mbps up.

II. RightMark Audio Analyzer (24/96)

Summary of the calculated values. The first 3 rows consist of measured results from last week using various ethernet cables, the fourth row is the result of the "Intercontinental Test":

As you can see, there's really not much difference! Remember that these tests are very sensitive so little variations can happen just because of cables being moved for example. Furthermore, the "Intercontinental Test" was done a week later after I had put everything away from last week's test. Of interest and importance is that distortion results (THD, IMD) were essentially the same and certainly no worse than having a server in the next room. Again, these results are for a 24/96 "high resolution" audio test.

Some graphs then - same set as last week's:
Frequency Response: Essentially a perfect overlay representative of the Transporter device.
Noise level: There is a small spike at 120Hz I didn't see last week. Little spurious noise like this can be seen due to the sensitivity of the system. In any case, there's the 60Hz power line noise, and nothing else measures greater than -110dB.

IMD: Nothing unusual to write home about!

Stereo crosstalk: Very minor differences with tests run 1 week apart.

III. J-Test (24-bit)

I took the J-Test composite from last week and overlaid the result from today's "Intercontinental Test":

You can make out the result today as slightly brighter than last week's plots. Again, this is a pretty close overlay of the 24/48 Dunn J-Test spectrum. No evidence of any significant jitter being added to the signal during the >7000 km journey.

IV. Conclusion / Discussion

As you can see, objectively, there is no evidence that locating the server >7000 km away did anything to the output quality of the Transporter. Neither the RightMark results nor the Dunn J-Test suggests any change beyond what I normally expect from inter-test variation.

Despite the tests completing without issue, I did have trouble streaming the 24/88 FLAC-encoded Shelby Lynne song "Pretend" from the Just A Little Lovin' SACD rip which I had included in the directory for Mnyb to put up on his server... Because the test signals were short and consisted of quite a bit of silence, they were easily compressed by FLAC and transferred easily with an average data rate of <1Mbps. No issues with keeping the Transporter's buffer adequately full. However, a full length 24/88 song has a much higher average bitrate (>2.5Mbps) and this resulted in the need for rebuffering about every 15 seconds streamed across from Europe. Other than the unfortunate buffer under-run every 15 seconds, the song sounded excellent subjectively during playback.
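The rebuffering behaviour is what you would expect whenever playback drains the buffer faster than the network refills it. A minimal sketch of that arithmetic follows; the buffer size and network throughput used here are hypothetical values picked so the numbers are easy to follow, not measurements from this test or specifications of the Transporter.

```python
def seconds_until_underrun(buffer_bits: float, stream_bps: float, network_bps: float) -> float:
    """Time until a full buffer runs dry when playback outpaces the network.

    Returns infinity when the network keeps up with the stream.
    """
    if network_bps >= stream_bps:
        return float("inf")
    return buffer_bits / (stream_bps - network_bps)


# Hypothetical numbers for illustration only.
BUFFER_BITS = 8_000_000   # assumed 1 MB player buffer
STREAM_BPS = 2_500_000    # roughly the 24/88 FLAC average rate mentioned above
SLOW_LINK_BPS = 2_000_000  # assumed sustained throughput, a bit short of the stream rate

print(seconds_until_underrun(BUFFER_BITS, STREAM_BPS, SLOW_LINK_BPS))  # 16.0
print(seconds_until_underrun(BUFFER_BITS, 1_000_000, SLOW_LINK_BPS))   # inf
```

The point of the sketch is only that a modest shortfall in sustained throughput is enough to empty a buffer within seconds, while a lower-bitrate stream over the same link never underruns.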

Is this surprising? Of course not! Like I said last week, computer networking protocols are robust in terms of data error correction. When it comes to TCP/IP, internet, and ethernet data transmission, everything happens asynchronously and in a bit perfect way. This is what digital is good at - getting data transmitted perfectly with no degradation. And for asynchronous interfaces like ethernet and asynchronous USB where there is bidirectional flow control, there is furthermore no "clock recovery" process like in older digital interfaces like S/PDIF where jitter can be introduced in the source clock (remember, jitter is a property of the digital source and DAC themselves, not the cable assuming we're looking at reasonable cable lengths as far as I can tell). This is why the objective results look fine even though the server is a continent away. Even though the latency was high as demonstrated by the "paping" data above, it doesn't matter so long as the Transporter's data buffer did not "run dry" as it did when I tried playing a full song in high-resolution resulting in annoying pauses. Considering that both his and my ISP are usually able to manage specified speeds (100 Mbps down/10 Mbps up, and 50 down/5 up), it's unfortunate that I wasn't seeing such throughput during the test (maybe because it was ~9:30PM local time and about 6:30AM in Sweden on the first day of the Lunar New Year?!).
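One general reason a single TCP stream over a long path can fall short of the rated line speed, whatever the time of day, is that throughput cannot exceed the receive window divided by the round-trip time. The sketch below applies that bound to the 213ms latency measured above. The 64 KiB window is purely an assumption (the classic maximum without TCP window scaling); the windows actually used by the server and the Transporter were not examined, so treat this as one possible contributing factor rather than a diagnosis.

```python
def window_limited_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on single-connection TCP throughput: one window per round trip."""
    return window_bytes * 8 / rtt_s


RTT_S = 0.213          # average round-trip latency reported by paping above
WINDOW_BYTES = 65_535  # hypothetical: largest TCP window without window scaling

limit_bps = window_limited_throughput_bps(WINDOW_BYTES, RTT_S)
print(f"Window-limited ceiling: {limit_bps / 1e6:.2f} Mbps")
```

Under that assumption the ceiling sits just below 2.5 Mbps, which would be marginal for the 24/88 track yet comfortable for 16/44 FLAC or MP3; larger windows or a shorter round trip raise the ceiling proportionally.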

There are at least a couple important implications. First, this test again reiterates the idea (fact) that ethernet cables will make no difference. Why bother with expensive wires for the last number of feet leading to the player device if there is no evidence that data transfer over >7000 km using standard cables makes any difference when played back? Note that along the way, there have likely been a number of conversions between electrical conduction and optical lines. Even the most expensive ethernet cable will of course not speed up transfers and avoid buffer under-runs.

Second, there's no issue with jitter - bits are bits as it applies to asynchronous interfaces and so long as bit-perfection is achieved and data transfer rate is good enough for the local buffer, there's nothing to worry about. This is good as audio consumption gradually transitions to streaming services for many music lovers / audiophiles. The most that will happen are pauses from connection speed issues rather than qualitative difference to the sound during playback assuming the software isn't somehow messing things up!

(BTW - ever wonder why some audiophiles and reviewers obsess over a few feet of digital cable - especially blaming jitter despite really no evidence to show that it makes a significant difference? Yet these days, with the advent of digital streaming and advertising dollars from places like Tidal, nobody talks about the "potential" for jitter when data is being streamed from miles and miles away? Instead there's just isolated talk about ethernet cables this and that to sell expensive stuff yet no consideration for the truly big picture! Obviously there's something wrong with this whole cable "industry".)

A final thanks to Mnyb for "opening up" his LMS server for me to tap into and helping out with this test! Nice music collection you got there, buddy :-). I managed to stream a couple 16/44 FLAC songs off there with very minimal buffer under-runs (3 rebuffers/song rather than every 15 seconds with the 24/88 tune), and no rebuffering issues at all with 256kbps MP3 which still sounded great. (Don't forget the result of the MP3 test in 2013! High bitrate MP3 sounds excellent and still has a role to play when data speed is restricted or unreliable.)

Enjoy the music as we end off February 2015... Happy lunar new year for all celebrating!

Saturday, 14 February 2015

MEASUREMENTS: Ethernet Cables and Audio...

Remember folks, this is what an "ethernet frame" looks like. It's data. There is no "music" in there until decoded and prepared for the DAC! Notice the CRC bytes for error detection. (The numbers represent how many bytes for each segment.)
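To illustrate what those CRC bytes do, here is a simplified sketch using Python's `zlib.crc32`, which uses the same CRC-32 polynomial as the Ethernet frame check sequence. The "frame" below is just a made-up payload with a checksum appended; real frames include addressing fields, are checked in hardware, and differ in bit-ordering details.

```python
import zlib


def make_frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 to the payload (simplified stand-in for a frame)."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")


def frame_is_intact(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored value."""
    payload, stored = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "little") == stored


frame = make_frame(bytes(range(64)))  # stand-in payload, not captured audio data
print(frame_is_intact(frame))         # True

damaged = bytearray(frame)
damaged[10] ^= 0x01                   # flip a single bit "in the cable"
print(frame_is_intact(bytes(damaged)))  # False
```

A frame that fails the check is dropped rather than passed along altered, and higher layers such as TCP arrange for the data to be sent again, so a marginal cable shows up as lost or delayed data, not as subtly different bits.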

0. Preamble

Hey folks, over the years, I have been critical of high-end audio cables... Previously, I have shown that RCA analog interconnects can result in measurable differences with channel crosstalk changes with long lengths. But the digital interconnects themselves do not result in measurable differences even in terms of jitter (TosLink SPDIF, coaxial SPDIF, or USB). Although my HDMI receiver's DAC isn't as accurate or jitter-free, different HDMI cables don't seem to make any measurable difference either. The only caveat to this being that a digital cable can just plain fail, in which case the distortion to the audio signal has a particular (annoying) characteristic which is clearly audible and not a subtle change (e.g. a poor USB cable sound).

So far, I have not seen any further measurements to suggest my conclusions are inaccurate. I have seen audiophile reviewers and forum posters still claim digital cables make an audible difference and when questioned they provide lots of words but no actual empirical evidence. It has been a while since I've seen any articles claiming objective evidence for cable measurements - haven't come across new ads or audiophile articles although of course I may have missed some.

However, as computer audio expands, there will be opportunities to "brand" more hardware as somehow "audiophile approved" and companies that make audio cables likewise will naturally capitalize on new lines of interconnects / cables... And as expected, cost of these things will be commensurate with "premium" products.

Which brings us to the concept of "audiophile ethernet cables" (see here also, and recent mainstream press exposure of the "madness"). Let me be clear. If I have issues with USB cables, or SPDIF cables, making any significant contribution to audible sound quality (assuming again essentially error-free transmission of data), there is no rational explanation whatsoever that ethernet cables should make any difference. The TCP/IP protocol has error correction mechanisms that allow for worldwide transmission integrity (otherwise Internet financial transactions should be banned!), and is asynchronous so there is no temporal dependence on exact timing mechanisms (jitter is not an issue with an adequate buffer to reclock and feed the DAC). So long as the "protocol stack" is functioning as it should between the devices, there will not be any issue. Systematic errors causing audible distortion mean either hardware failure or poorly implemented communication software. Therefore the expectation if we were to test or "listen to" different ethernet cables is that there would be no difference.
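One way to check the "bit-perfect" claim for yourself is to hash the data before and after it crosses a TCP connection. The sketch below does this over the loopback interface, so nothing actually traverses a cable; it only demonstrates the comparison method, and the function name is invented for this example. The same idea applies to a file copied across a real network.

```python
import hashlib
import os
import socket
import threading


def loopback_transfer(data: bytes) -> bytes:
    """Send `data` through a local TCP connection and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # let the OS choose a free port
        listener.listen(1)
        port = listener.getsockname()[1]

        def serve() -> None:
            conn, _ = listener.accept()
            with conn:
                conn.sendall(data)

        sender = threading.Thread(target=serve)
        sender.start()

        received = bytearray()
        with socket.create_connection(("127.0.0.1", port)) as client:
            while True:
                chunk = client.recv(65536)
                if not chunk:  # sender closed the connection
                    break
                received.extend(chunk)
        sender.join()
    return bytes(received)


original = os.urandom(1_000_000)  # random stand-in for audio data
copy = loopback_transfer(original)
print(hashlib.sha256(original).hexdigest() == hashlib.sha256(copy).hexdigest())
```

If the two digests match, every bit arrived exactly as sent; any cable that lets the transfer complete at all will produce the same result.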

Since I like to make sure objectively, let us at least run a few tests to see if indeed evidence can be found to support the hypothesis.

I. Test Setup

First, we must decide where to place the ethernet cables to test... You see, in any server/streamer system, we expect that there would be a few network cables in the data path. For the sake of ease in measurements, and following the same logic as audiophile beliefs about power cables, let us place the test cables as the last leg of the data path between the streamer and the last ethernet switch (this guy also thinks the last leg is important). Here then is my setup:

Server PC <--> 6' Cat 6 UTP patch cable <--> 20' Cat-6 Gigabit generic cable (in wall) <--> NETGEAR Nighthawk R7000 router <--> 30' Cat-6A STP Gigabit cable (in wall) <--> 6' Cat 6A STP patch cable <--> Test switch <--> Test cable <--> Logitech Transporter streamer

As you can see above, if we trace the route the data takes between server and streaming device, we're usually looking at quite a bit of cable! In a typical "wired" house, much of the cable exists in the wall and would not be amenable to easy rewiring. Since I just did some renovations last year, I made sure to run high quality Cat 6A STP from router to the sound/media room. I am going to not just test a few cables, but I'm also going to try a different ethernet switch! Here are some details:

Server PC: AMD A10-5800K quad core PC, stock 3.8GHz speed, running Windows Server 2012 R2, Logitech Media Server 7.9.0 build 1420794485 [Jan 12, 2015], 16GB DDR3 RAM, built-in Realtek PCIe ASUS motherboard gigabit ethernet interface.

NETGEAR Nighthawk R7000 router: running custom firmware "kongac build 24345". Very stable with >100days uptime currently, underclocked to 800MHz just because I never needed the 1GHz speed.

Streamer/DAC device is the venerable Logitech Transporter. Remember that the Transporter only runs at 100Mbps whereas the rest of the system is capable of gigabit (1000Mbps) speeds.

The "Test switches": for the most part, I will use the inexpensive gigabit TP-LINK TL-SG1008D which I bought at a local computer store slightly more than a year ago (<$30). It's got 8 ports and is fast enough for 100MB/sec (that's 100 megabytes/sec) file transfer through it from server to my HTPC:

The white thing underneath is just my LP carbon fibre brush to lift it a little to photograph the front easier. Ahem... Pardon the dust... :-)

In comparison, for a couple of the tests I will use this little guy:

A LinkPro SOHOHub 4-port 10/100Mbps switch which I believe is about 10 years old (the TC3097-8 interface controller inside came out around 1998). I found it in the attic of the house I bought, powered by a Canon AC adaptor which provided adequate juice.

For both these switches, I will keep my HTPC computer connected to one of the other ports.

The "Test cables":

So, I rummaged around my pile of computer parts and found these cables to test. Note that I was shopping at when I was doing some renovations and getting my network system up. I "standardized" on some rather inexpensive Cat 6A cables on sale there - hence the nGear brand which they carried.

In the top picture, from the left, we have a 1-foot length of Cat 6A STP (<$3.00) - presumably the "best" cable given the short length and excellent shielding. Note that Shielded Twisted Pair (STP) cables are not necessarily better than UTP (Unshielded...); one must make sure the shield is properly connected at each end. Next we have presumably the "worst" cable of the bunch - a generic "freebie" 3-foot length of Cat 5E UTP patch cable that has been sitting around for the last 5 years in my pile of parts. The blue plastic jacket is loose and the quality so flimsy that I could probably pull it apart without much effort. Then we have a 10-foot length of Cat 6A (<$6.00), and finally, a much longer 50-foot length of Cat 6A STP (~$15.00). Cables from the same brand will allow us to see if length makes a difference.

The green cable in the lower picture was one I found in my office. It's a 12-year-old 20-foot generic Cat 6 UTP cable that has been in daily use for the last 12 years... I guess you can call it "burned in"!

Sorry folks, I don't have any Cat 7 cables here. At this point, I don't see any reason to use these since I'm only running a 1 Gbps network. Anyone out there running a 10 Gbps network at home requiring Cat 7 cables? Realize that even Cat 6 is potentially capable of 10 Gbps up to 50m (>160 feet) or so.

I will measure with RightMark (newest 6.4.1) to look at the usual dynamic range, noise floor, distortion along with the Dunn J-Test signal to see if there's any evidence of jitter anomaly in the Transporter's RCA DAC output (rather than the XLR for the sake of convenience). Some well shielded 6' RadioShack interconnects were used (Cable C here). As usual, my E-MU 0404USB device was used for the measurements. All measurements were done in 24/96 (high resolution) or 24/48 for the jitter test.

Let the measurements begin...

II. RightMark Audio Analyzer (24/96)

Here's the summary table of all results with 5 cables with the TP-LINK gigabit switch and 2 other measurements with the old 100Mbps LinkPro switch:

As you can see, there are no significant differences in the audio output at all. Analogue output was measured all the way to 48kHz - well beyond the audible spectrum. It didn't matter whether the cable was 1-foot all the way to 50-feet. Likewise, Cat 5E, Cat 6, Cat 6A, UTP or STP made no difference whatsoever. In 2 of the tests (50' CAT 6A & 3' CAT 5E + LinkPro), I was playing 20Mbps 1080P MKV video concurrently on the HTPC connected to the switch to increase the data rate coming from the server - no difference in background noise or anything else.

A few graphs from which the calculated data were derived:
Frequency Response: Exact overlay.
Noise level: slight 60Hz hum measured down at -115dB, everything else even further below this.

IMD: Again, essentially perfect overlay with the different cables.
Stereo Crosstalk: Would be very bizarre to see any anomaly here!

III. J-Test (24-bit)

Instead of showing 7 individual J-Test graphs, I decided to overlay each one to create a composite image:

As you can see, there is some normal variability in the noise floor around the 12kHz primary frequency but otherwise, nothing sticks out. There's some low-level jitter around 12kHz, some of which I'm sure is related to the E-MU device itself rather than just the Transporter.

No evidence that any of the cables / switch changes resulted in any anomaly using the 24-bit Dunn jitter test. None of the sidebands exceeded -110dB from the primary frequency peak at 12kHz. Note that the peak itself is at -3dBFS, but I measured it a bit lower to avoid the use of the E-MU's input amplifier which would add some noise. Again, no change observed (ie. worsening of noise floor or stimulated jitter sidebands) even when the HTPC was concurrently streaming a 20Mbps movie from the server.
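For readers curious what the J-Test stimulus looks like in the digital domain, here is a minimal sketch. It assumes the commonly described construction: a tone at Fs/4 (12kHz at 48kHz sampling) at about -3dBFS, plus a square wave at Fs/192 that toggles the least significant bit. The function names are made up for illustration, and this is not necessarily the exact file used for the measurements above.

```python
import math

def jtest_samples(count, bits=24):
    """Return `count` integer samples of a J-Test-style signal (assumed construction)."""
    half = 1 << (bits - 2)              # 0x400000 for 24-bit: half of full scale
    tone = [-half, -half, half, half]   # Fs/4 sine sampled at 45-degree offsets
    samples = []
    for n in range(count):
        value = tone[n % 4]
        # Low-rate square wave: lower the value by one LSB for 96 of every 192 samples
        if (n // 96) % 2 == 1:
            value -= 1
        samples.append(value)
    return samples

def tone_peak_dbfs(bits=24):
    """Peak level of the Fs/4 tone implied by the +/- half-scale sample values."""
    half = 1 << (bits - 2)
    full_scale = 1 << (bits - 1)
    peak = half * math.sqrt(2)          # the samples sit at sin(45 deg) of the peak
    return 20 * math.log10(peak / full_scale)

if __name__ == "__main__":
    fs = 48000
    print("tone:", fs / 4, "Hz; LSB square wave:", fs / 192, "Hz")
    print("tone peak: %.2f dBFS" % tone_peak_dbfs())
```

The idea is that any data-dependent jitter in the playback chain would show up as sidebands around the 12kHz tone, which is what the composite graph is checking for.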

IV. Summary / Conclusion

I believe if there indeed is an ethernet audio device that "sounds different" because of different cables being used, then that device should be returned because it is obviously defective. Remember folks, it is like accepting that the earth is spherical or that 2+2=4 - because that's just the way it is. Ethernet communication is an engineered system; the parameters and capabilities of this system are not only understood but designed to be the way they are by humans! You really cannot claim to have "discovered" some combination of dielectric or conductor or geometry that works "better" within an already errorless digital system unless you're claiming improved performance outside technical recommendations (in the case of Cat 6 for gigabit networks, it's 100m or 328 feet lengths within a reasonable ambient electrical noise environment).

It's also worth remembering that audio data bitrates are quite low. Today, I hope nobody is running anything slower than 100Mbps "fast ethernet". Although my music is generally streamed out as compressed FLAC, even if you stream uncompressed WAV files, standard stereo 16/44 CD-quality audio requires <1.5Mbps, 24/96 requires ~4.6Mbps, and stereo 24/192 ~9.2Mbps. Even if we went uncompressed multichannel, 5.1 24/96 would only use up <14Mbps. Considering how cheap gigabit (1000Mbps) networks are, there's no reason not to build upon the gigabit standard these days. There's generally no reason to complain about decent Cat 5E cabling, but splurging a little on Cat 6+ isn't a big deal. The Transporter device used in these tests is almost 10 years old at this point and limited to 100Mbps. I would certainly be surprised and disappointed if a modern audio streaming device measured differently with various cables these days with even faster ethernet interface hardware!
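The bitrate figures above are easy to check for yourself. Here's a small sketch of the arithmetic for uncompressed PCM (bits per sample x sample rate x channels); the function name is just for illustration, and container/protocol overhead is ignored.

```python
def pcm_bitrate_mbps(bits_per_sample, sample_rate_hz, channels):
    """Raw PCM payload bitrate in megabits per second (no container or network overhead)."""
    return bits_per_sample * sample_rate_hz * channels / 1_000_000

if __name__ == "__main__":
    formats = [
        ("Stereo 16/44.1 (CD)", 16, 44_100, 2),
        ("Stereo 24/96", 24, 96_000, 2),
        ("Stereo 24/192", 24, 192_000, 2),
        ("5.1 24/96 (6 channels)", 24, 96_000, 6),
    ]
    for name, bits, rate, channels in formats:
        mbps = pcm_bitrate_mbps(bits, rate, channels)
        # Share of a 100Mbps "fast ethernet" link such as the Transporter's
        print(f"{name}: {mbps:.3f} Mbps ({mbps:.1f}% of 100Mbps)")
```

Even the multichannel case uses only a small fraction of a 100Mbps link, never mind gigabit.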

Ultimately, I'm not suggesting anyone use the cheapest ethernet cable he/she can find. If you like the esthetics and build construction, go for it! Just realize that it's essentially impossible to argue that a functioning (free of data transmission error) ethernet cable will "sound" any different or worthy of significant cost differential based on sonic quality. The idea of specialized "audiophile" ethernet cables (or "ethernet switches" for that matter) is plain nonsense.

For the record, subjectively, I have never heard a difference between ethernet cables on my system. For fun I did have a listen to Shelby Lynne's Just A Little Lovin' (2012 Analogue Productions SACD ripped to 24/88) - sounded great to me even with the Cat 5E freebie cable and cheap LinkPro switch while a 20Mbps movie was playing off my HTPC. I have never tried those expensive cables from AudioQuest or Chord, but seriously, why bother when there's no logical rationale based on understanding of an engineered product and the lack of empirical evidence? Must a person try out or demo every claim made or testimonial uttered when some things are self-evident? Must I double check when someone comes up to me and tells me the world is flat or the sun rises in the west? Should I also try Snake Oil if someone in a crowd around the traveling salesman yelled out that it "Works for me!" without any other evidence?

Well, it looks like Chord got their hands slapped for claims about sound quality with their ethernet cable ads determined to be "misleading advertising", lacking in "substantiation", and "exaggeration" in November 2014. Bravo to the UK's Advertising Standards Authority. Truth is important.

Bottom line: There's no evidence that any of the digital cables make an audible difference be it TosLink, coaxial, USB, or now ethernet within an error-free system.**

As usual, if anyone feels that I am in error, please demonstrate and leave a link to the evidence.

Okay... Now I really have to go do some real work :-). Enjoy the music!


** I was reminded about this post I made using the Squeezebox Touch and EDO plug-in awhile back. In it, I was able to demonstrate measurable differences using an unshielded cheap 3' "zip-cord" RCA cable instead of a proper coaxial cable (I'm sure it's nothing close to the 75-ohm impedance spec). It is a reminder that we of course should be using *proper* cabling and that extreme situations like in this post will allow demonstration of noise phenomena that otherwise would be highly unlikely. Notice also how this poor RCA cable degraded sound quality when pushed to 24/192 which is also outside the usual Squeezebox Touch specification but available thanks to the Triode plugin.

Friday, 6 February 2015

MEASUREMENTS: Bob Dylan's "Shadows In The Night" - when 24-bit HRA isn't! (Qobuz)

Of shadows... and hot air...
A reader gave me a tip about the new Shadows In The Night album from Bob Dylan. The allegation is that the 24-bit high-resolution downloads of this album are in fact NOT true 24-bits as claimed!

To start, here's a video to show what the inversion-null should look like with a true 24-bit audio sample:

Now, consider what happens when we use a 24-bit track from Qobuz versus the same track ripped off the 16-bit CD:

As you can see, there is essentially a complete null using the 16-bit CD version on the 24-bit file when the amplitude is boosted by 0.1dB in 24/32 bits and time aligned! Upsampling like from 44kHz to 96kHz is usually easy to spot but this is one of the first times that a "fake" 24-bit file is so easily spotted! Shame on Sony/Columbia/Qobuz for doing this...
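For those who want to try this kind of inversion-null check on their own files, here is a rough sketch of the idea using synthetic data rather than the actual Dylan tracks. It assumes the two versions are already time aligned, and that the "fake" 24-bit file is simply the 16-bit data padded to 24 bits with a 0.1dB level change (my guess at the direction of the gain; the helper names and test signal are invented for illustration).

```python
import math

def null_residual_dbfs(cd_16, hires_24, gain_db=0.0):
    """Peak of (gain-adjusted 16-bit data padded to 24 bits) minus the 24-bit data, in dBFS."""
    gain = 10 ** (gain_db / 20)
    peak = max(abs(c * 256 * gain - h) for c, h in zip(cd_16, hires_24))
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak / 2 ** 23)

def demo_tracks(n=2000):
    """Build a synthetic 16-bit track, a genuine 24-bit version and a padded 'fake' one."""
    x = [0.4 * math.sin(2 * math.pi * 440 * i / 44100)
         + 0.1 * math.sin(2 * math.pi * 3000 * i / 44100) for i in range(n)]
    cd_16 = [round(v * 2 ** 15) for v in x]        # quantized to 16 bits
    true_24 = [round(v * 2 ** 23) for v in x]      # quantized to 24 bits from the source
    fake_gain = 10 ** (0.1 / 20)                   # assumed 0.1dB level difference
    fake_24 = [round(c * 256 * fake_gain) for c in cd_16]  # 16-bit data in a 24-bit container
    return cd_16, true_24, fake_24

if __name__ == "__main__":
    cd, true_24, fake_24 = demo_tracks()
    print("fake 24-bit residual: %.1f dBFS" % null_residual_dbfs(cd, fake_24, 0.1))
    print("true 24-bit residual: %.1f dBFS" % null_residual_dbfs(cd, true_24, 0.0))
```

With a genuine 24-bit source, the residual is the 16-bit quantization error, sitting around the 16-bit noise floor. With a padded file, only a sub-LSB rounding residue remains, which is the "essentially complete null" described above.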

If we look at the DRDatabase, one sees that the DR values for the LP appear very similar to the digital editions. It's quite possible that the LP was pressed with the same mastering as well.

16/44 CD Dynamic Range log.

As you can see from the DR log for the CD above, the results are a bit of a mixed bag. DR11 is good! (Maybe we're getting somewhere with new releases not being so badly dynamically crushed?) But what's the deal with the peak volume of only -6.45dB? We've basically wasted 1-bit's worth of dynamic resolution which could have just been optimized with a simple normalization step! Instead of being unnecessarily loud, this CD is unnecessarily quiet! In effect, the maximum bit-depth resolution has been reduced to 15-bits; which in this case also means the 24-bit version is no better!
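The "1-bit's worth" estimate follows from each bit being worth about 6.02dB (20*log10(2)). A quick sketch of that arithmetic, with hypothetical function names:

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02dB of level per bit

def unused_bits(peak_dbfs):
    """How many bits' worth of headroom are left unused for a given peak level."""
    return -peak_dbfs / DB_PER_BIT

def effective_bits(container_bits, peak_dbfs):
    """Bits actually exercised by the loudest sample in the container."""
    return container_bits - unused_bits(peak_dbfs)

if __name__ == "__main__":
    peak = -6.45  # peak level reported in the DR log for the CD
    print("unused: %.2f bits" % unused_bits(peak))
    print("effective: %.2f of 16 bits" % effective_bits(16, peak))
```

A -6.45dB peak works out to slightly more than one bit left on the table, hence roughly 15 bits in use.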

I guess if I bought the 24-bit version, I'd be wanting my money back. Note that the 24-bit sample I was sent to analyse was from Qobuz, it would be interesting to know if the HDTracks version is also the same. Maybe others can double check if what I'm reporting here holds for all the tracks on that album.

This kind of thing really cannot be tolerated as it's basically a cash grab for zero benefit. There's a 30% price premium on the 24-bit file on Qobuz! You really have to wonder in this digital download model, who is responsible for quality control - especially for a high profile artist like Dylan?

Bottom line: Buy the Bob Dylan CD if you want the album, but do not bother with the 24-bit high-resolution download until there's clarification that either what I'm seeing is wrong or updated 24-bit files have been uploaded. Also, I wish companies like HDTracks and Qobuz would open up a review/comment system like Amazon for people to share information about the quality of the files since the issue of provenance and concerns such as this (and Beck's Morning Phase last year) do arise.

Sony, so does this Bob Dylan album deserve to be called "Every bit a master."? I think it would be precious if we start seeing this album cover on one of those ads promoting HRA instead of Bennett/Ga Ga! :-)


BTW: On a positive note, as an avid 80's music guy, I like looking out for compilations of stuff I might have missed... Recently I took a chance on the Blank & Jones' "So80s Presents Alphaville" 2 CD set since it had some mixes I had not heard/seen before... WOW! I was impressed. Sounds great and the dynamics were excellent (DR11 both disks). Considering I had been disappointed by previous Blank & Jones compilations, I was pleased by the significant improvement! Keep up the good work, guys...

Thursday, 5 February 2015

MUSINGS: The ongoing Vinyl vs. Digital debate...

The always assertive Batman. But assertiveness isn't necessarily correct...
(Just a shortish post this week... I've covered much of this in the past.)

As I mentioned in the comments section of the previous post on HRA, there have been discussions recently again around the sound quality of vinyl compared to CD. I assume it started from this article "Why CDs May Actually Sound Better Than Vinyl" from the L.A. Weekly. Overall, I think it's a good article! Some excellent quotes from veterans in the audio engineering business like Bob Clearmountain and Bob Ludwig. We even have the pleasure of an interview with James T. Russell - the "father" of digital optical media. Folks... These people know what they're talking about, so do not take their comments lightly.

Despite the fact that there's nothing ground breaking in that article, it's no surprise that the vinylphiles are up in a tizzy. Michael Fremer has now dragged out an old 1996 article about "Does Vinyl Have Wider Dynamic Range Than CDs?" (the PDF in the article). Let's talk about this...

Scroll down to the comment by "comfortablynick" in that blog post for a good discussion about some highly questionable comments made by the author around his beliefs about digital audio. The lack of consideration for the importance of dithering to 16-bits is a particularly egregious oversight. "Phuzzyday"'s comment about the author referring to "41kHz" as the CD sampling rate in the final page of the article also does not help the author's credibility.

Even with all these concerns, I want to lodge one objection to his basic "thesis". It's his definition of dynamic range for the purpose of this article from which everything else builds:
"For CD players, dynamic range is essentially the ratio of the loudest possible output signal to the quantization noise, i.e., the noise corresponding to the round-off error of the least significant bit. For the 16-bit standard CD format, this is 96dB dynamic range. 
To parallel the previous CD definition, I define the dynamic range of a phono cartridge as the ratio of the loudest sound to the background noise referenced to the preamp output terminals..."

As mentioned, the first paragraph is inaccurate in that it doesn't take into consideration the effect of dither on perception of a low level signal, allowing us to effectively hear below the 96dB quantization noise floor. With 16-bits, dithering will allow us to perceive down to about 110dB with "flat" dithering and even more with noise shaped dithering in the frequencies that humans are most sensitive to (see this page for an example using 8-bit audio to demonstrate the effect). Dithering has been standard practice since the dawn of digital media. Using the definition in the second paragraph, the author then produces calculations based on ideal values predicted by the electronics and mechanical velocities to estimate the capability of the phono cartridge-preamp system; coming up with up to 110dB for the systems analyzed. The problem obviously is that this level of performance is not what we experience when we put on a piece of vinyl for a spin!
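To make the dither point concrete, here's a toy simulation, offered as a sketch rather than a rigorous demonstration. It quantizes a 1kHz sine whose amplitude is only 0.3 of one LSB: without dither the output is pure digital silence, while with TPDF dither (assumed here to span +/-1 LSB) the tone survives in the noise and can be recovered by correlating against the reference. All names and parameters are illustrative.

```python
import math
import random

def quantize_sine(amplitude_lsb, n_samples, dither, seed=1):
    """Quantize a 1kHz sine (44.1kHz sampling) to whole LSB steps, with or without TPDF dither."""
    rng = random.Random(seed)
    reference, output = [], []
    for n in range(n_samples):
        s = math.sin(2 * math.pi * 1000 * n / 44100)
        x = amplitude_lsb * s
        if dither:
            x += rng.random() - rng.random()  # triangular PDF noise, +/-1 LSB
        reference.append(s)
        output.append(round(x))
    return reference, output

def recovered_amplitude(reference, output):
    """Least-squares estimate of how much of the reference sine is present in the output."""
    return (sum(r * o for r, o in zip(reference, output))
            / sum(r * r for r in reference))

if __name__ == "__main__":
    ref, plain = quantize_sine(0.3, 44100, dither=False)
    print("undithered output all zero:", all(v == 0 for v in plain))
    ref, dithered = quantize_sine(0.3, 44100, dither=True)
    print("amplitude recovered with dither: %.3f LSB" % recovered_amplitude(ref, dithered))
```

The undithered signal simply vanishes below the last bit, whereas the dithered version trades that hard cutoff for a steady noise floor in which the low-level signal is still encoded.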

Where in these calculations do we take into account the real world? Remember, vinyl doesn't have "infinite resolution". The LP itself and the playback system are imperfect physical objects. Surface noise? Pressing inaccuracies? How about geometry issues with the tonearm and cartridge setup? Groove wear over time? Stylus wear? Noise picked up in the low voltage phono wires? Imperfect damping? I'm sure there are other issues I've neglected.

Of course these issues will also result in other (IMO more bothersome) distortions and not just dynamic range limitations. For example, the temporal distortions from wow and flutter compared to the accuracy of digital playback (as I've said before, nobody should be concerned about DAC jitter if one can tolerate the inaccuracies from a turntable!).

In comparison, for digital, the data can be transmitted with 100% accuracy quite routinely. Already excellent with spinning polycarbonate CDs using laser light to read the information, and much more so with hard drives, or SSDs these days. The media itself (basically just the file with computer audio now) is "perfect" compared to the limitations of the physical analog vinyl. Furthermore, the conversion of the digital information to analogue electrical signal in the DAC is completely free from mechanical issues. Therefore the real world output characteristics like dynamic range are also much closer to the ideal (and more predictable compared to the variability of LPs from a store). These days, 16-bit dynamic range is quite easily surpassed with decent 24-bit DACs.

It's interesting that the author acknowledges noise in LP playback (p. 24) but he doesn't seem to estimate the (significant) contribution or incorporate this into his calculations. Needless to say if he did, the numbers would not be impressive. Probably just ending up as the "widespread notion that phono dynamic range is only 60-70dB" (p.17). Notice the sleight of hand here in that the author is only calculating the cartridge and preamp's theoretical dynamic range but the title of the article and implication from Fremer is that it refers to the whole phonograph playback system so as to have any meaningful comparison with typical CD playback.

In practice, realize that 60-70dB of dynamic range isn't that bad; that's about 11-bits of dynamic range in the digital world... In fact, take a typical 16-bit digital file and knock off the last 5 bits of resolution. It doesn't sound so bad still unless you're listening to stuff with lots of dynamic range and you need to up the volume - like classical music perhaps. It's odd that he seems to blame the RIAA; "lack of quality control is abetted by the RIAA record industry standard permitting surface noise of 55dB below a 1kHz sinusoid recorded at 7cm/s". Hmmm, again, if this is indeed the issue that limits the true dynamic range at playback, why isn't Mr. Bauman spending time to show us exactly how low surface noise can be, whether state-of-the-art material science in vinyl is able to embed a signal anywhere close to his idealized calculations of dynamic range, and how that would compare with digital!?
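If you want to try the "knock off the last 5 bits" experiment, the operation on each sample is just a bit mask. Here's a minimal sketch working on plain integer samples (reading and writing an actual WAV file is left out, and the function names are made up). Note that it truncates without dither, which is the crudest way to do it.

```python
import math

def truncate_to_bits(sample, source_bits=16, keep_bits=11):
    """Zero out the lowest (source_bits - keep_bits) bits of a signed integer sample."""
    drop = source_bits - keep_bits
    return (sample >> drop) << drop   # arithmetic shift, so negative samples work too

def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

if __name__ == "__main__":
    for s in (12345, -12345, 31, -1):
        print(s, "->", truncate_to_bits(s))
    print("11 bits: %.1f dB, 16 bits: %.1f dB"
          % (dynamic_range_db(11), dynamic_range_db(16)))
```

Eleven bits lands at about 66dB, right in the middle of the 60-70dB range usually quoted for LP playback.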

Which brings me to my last point. The classical music people I know are interested in CDs and SACDs (maybe DVD-A and Blu-Rays), not vinyl. Other than a few collectible classical albums usually on display, the used vinyl stores I peruse typically sell the rest at $3.00 or less. In fact, one of the stores has a massive rear portion with probably thousands of classical LPs that appear to go untouched; the stock doesn't seem to move and I rarely see people rummaging around back there compared to the rest of the rock/pop/jazz sections up front. Classical music demands a silent noise floor for the quiet portions and dynamic range is truly an asset. I have no problem with the higher noise floor from vinyl with most rock and pop since it can be masked by the higher average volume, but classical music demands higher quality vinyl; it certainly makes sense for classical music lovers to go digital primarily. And as far as I can tell, this is exactly what has happened given digital's superior fidelity. As you can see in the James Russell interview with L.A. Weekly, he developed optical media because he felt LPs were not good enough for the resolution demands of classical music. (Furthermore, beyond sound quality, maybe convenience is even more important in classical music to avoid flipping sides and changing disks while trying to enjoy a full symphony.)

To end off, like others, I do enjoy my collection of LPs. I'm up to >300 albums now (out of that only a handful of classical). They live side-by-side with my digital music harmoniously in the sound room. No problem switching between LPs, digital files, multichannel music in an evening of listening. They each have their pros and cons. LPs can sound fantastic when the vinyl is in good shape, and the artwork can't be beat. The cleaning, cataloging, playback ritual provides a physicality that blesses and satisfies the collector's soul (or conversely a curse to the obsessive-compulsive hoarder). Personally, they bring back old memories of a time in my youth (70's & 80's) when I could not afford to own this stuff (but much of the popular albums I like are now cheap!). But it is old technology, superseded in all kinds of ways by the potential fidelity* offered with CD and now digital files. Let us not kid ourselves about the inherent limitations of the vinyl format when it comes to reproduction accuracy/fidelity compared to the original "studio master". I hope we can be graceful with the acceptance of this fact rather than with denial and anger.

One final thing... Cassette tapes making a come back? Sure! I love the scenes in Guardians Of The Galaxy. Just don't call it "high fidelity" or better than MP3, OK? :-)

Exhibit A (from a few years back - for kicks, check out the YouTube comments on this!):

* Take note that I said "potential fidelity" here. There are many LPs that clearly sound superior to the CD counterpart. This happens all the time with better mastered (eg. more dynamic) LPs compared to terribly dynamically compressed CDs. The "vinyl rips" to digital files sound fantastic with these LPs and demonstrate just how well digital technology is capable of capturing what vinyl has to offer. But if we were doing a direct comparison with exactly the same mastering on both LP and CD (using the best LP cutting system and best ADC), I'd go with the science and say that the CD digital format would be able to reproduce the frequencies up to 22kHz with greater precision both in terms of dynamic range and time accuracy; it would sound more like the original master. LPs can reproduce >22kHz frequencies if the source contains ultrasonic material but evidence for needing ultrasonic frequencies to improve audio fidelity remains elusive (see the HRA post last week and the Oohashi stuff).


Alright folks... Lots of "real" work to get done now. Have a wonderful weekend and week ahead. See you in the next instalment (I might be away a couple of weeks). Enjoy the music - in whatever format you darn well choose!

Thursday, 29 January 2015

MUSINGS: What Is The Value of High Resolution Audio (HRA)?

I promised not to talk about Pono anymore, so I won't to a significant degree :-). This article is one about high-resolution audio in general.

It has been interesting seeing the audiophile press's responses to articles such as:
Gizmodo's "Don't Buy What Neil Young Is Selling"
Pitchfork's "The Myth and the Reality of the $43 Download"

with these:

Analog Planet's (Michael Fremer) "Gizmodo Won't Post My Comment So I'm Posting It Here"
AudioStream's (Michael Lavorgna) "Is High Resolution Audio Elitist?"

As I had laid out in this blog a number of moons ago (March 2014 to be exact) in "High Resolution Audio (HRA) Expectations (A Critical Review)...", there are many challenges to overcome in order to perceive an audible difference between a true high-resolution recording and the same track down-sampled to standard CD (16/44) resolution. This challenge is significant and technically difficult; culminating in the ultimate question of whether one's own hearing mechanism is even capable of the feat. (Remember folks, aging is good for wine, not so much for hearing acuity!)

I think there are a number of issues here and it's important to treat each separately without getting overly simplistic into a single declaration of whether HRA is "good/needed" or "bad/worthless". Furthermore gentlemen, there's no need for name-calling or ad hominem attacks.

For your consideration, here's how I see it:

1. Is the high resolution 24/96+ PCM format better (more accurate)?
Of course! Assuming we have a good quality recording, high-resolution formats afford better objective dynamic range for the music and accurately record more of the spectrum than a 44kHz sampling rate. We know that LPs retain more than 22kHz of sonic information (the resolution of course is limited in dynamic range and distortions are higher with LPs). High sample rates will get us away from any concerns around ringing due to filters functioning near the audio spectrum which Fremer refers to. But folks, let's not overplay this either because "issues" like pre-ringing from digital filters are of questionable audibility and papers like this one from the Meridian folks (see this thread by page 5 for details and criticisms) show that even if we ignore all the concerns raised, the aggregate rate of correct responses was only 56.25% (160 trials) - just above their statistical significance level.
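As an aside, anyone wanting to sanity-check a listening test result like "56.25% correct over 160 trials" against pure guessing can use a simple one-sided binomial tail. This is only a sketch of a generic check, assuming independent trials with a 50% guess rate; it is not the analysis used in the paper itself, which had its own statistical design.

```python
from math import comb

def binomial_tail(successes, trials, p_guess=0.5):
    """Probability of getting at least `successes` correct out of `trials` by guessing."""
    return sum(comb(trials, k) * p_guess ** k * (1 - p_guess) ** (trials - k)
               for k in range(successes, trials + 1))

if __name__ == "__main__":
    trials = 160
    correct = round(0.5625 * trials)  # 56.25% of 160 trials
    print("correct responses:", correct)
    print("chance of doing at least this well by guessing: %.4f"
          % binomial_tail(correct, trials))
```

The same function can be applied to your own blind ABX sessions; just plug in the number of correct answers and the number of trials.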

Higher 24-bit resolution is obviously beneficial in the studio to allow for more accurate digital processing. These days I'm doing digital room correction on playback with convolution filters so I think it's nice to have 24-bit files for that extra bit of accuracy during playback.

As a perfectionist audiophile, I don't think there's anything wrong with wanting the most accurate version of the album if available as a high-resolution file representing the "studio master". However, we have to realize that the perfectionist audiophile's desires are not the same as those of most mainstream music lovers. I don't know if anyone has done a demographic survey, but I'm sure it's a pretty small sliver of the music-buying public that would even care.

2. Is High Resolution Audio (HRA) audible?
For the vast majority of people, I believe the answer is clearly NO.

Differences are at best subtle. In my opinion, Neil Young's musician buddies (video) are clearly over-dramatic about what they heard, or he did something to the car audio to accentuate the difference between MP3/lossless/hi-res. CD vs. HRA is obviously not analogous to the visible difference between DVD 480P and Blu-Ray 1080P, otherwise we wouldn't be arguing about this. The 24-bit audio test performed here last year as well as the 44kHz vs. 88kHz sampling rate discrimination test from this study for example are consistent with this conclusion (for the 44 vs. 88kHz paper, the abstract was vague and I think misleading; look at the overall results and you see that only 3/16 listeners achieved significant results, and they did so by consistently selecting the wrong answer, while 13/16 trained listeners scored non-significantly). Other studies in the literature over the years have not been able to show significant effects at all (such as this 2005 study with 24/192).

Despite the above, who knows, maybe there are lucky (and more than likely young!) folks who have awesome auditory acuity. The DAC has to be good enough. Amps, speakers, headphones will need to be up to the task of reproducing the dynamic nuances and high frequency response in a reasonably flat fashion without exciting too much intermodulation distortion or high frequency ringing (like with the tweeter in the old Meridian DSP8000 active speaker measurements). Plus you need a quiet sound room. (Remember folks, research studies are generally conducted in controlled ideal environments using equipment with known objective capabilities. I have a sneaking suspicion that showing off hi-res sound quality in a car ain't gonna cut it!)

[For completeness, I've included Appendix A below for those who want to think about the Oohashi "Hypersonic Effect" referred to in Fremer's article.]

As demonstrated by the Fremer and Lavorgna articles, the audiophile press really likes to claim that the Meyer-Moran study from 2007 has been "debunked". One really should not expect large magnitude differences anyway based on existing literature, so the Meyer-Moran negative study is absolutely to be expected. For those who are unfamiliar, this study basically is a blinded ABX test to see if members (presumably audio enthusiasts) of the Boston Audio Society, around 60 of them, can tell the difference between "high resolution" SACDs and DVD-As played back directly versus going through a 16/44 A/D/A chain (basically "dumbing" the signal down to CD resolution). You can look at the equipment and music used here. The result was that respondents detected the 16/44 "loop" playback accurately 49.8% of the time - purely chance. No evidence the women or younger folks were any better either. To make a long story short, I think it's fair to criticize the study for using questionable high-resolution recordings (like old classical albums, or questionable quality analogue recordings, I've listed many SACDs already which are likely just upsampled PCM here). But remember these recordings were available and sold as high-resolution and would be typical audiophile fare back in 2007. There were at least six albums from Chesky, a couple from Telarc, Steely Dan's Two Against Nature DVD-A, the Dark Side Of The Moon remastered SACD from 2003, and Patricia Barber's Nightclub SACD from MFSL. Sure, we can discount the study as "flawed" because the music wasn't good enough or of high enough quality, but then we would have to contend with the following question...

3. Do we actually have many albums worthy of High Resolution Audio?
As I indicated last week, this is the BIG problem - the elephant in the room. I can understand why the industry may not want to talk about it, because changing it will require massive effort and criticism of what has been "business as usual" for decades in terms of digital production standards since the mid-1990's. And it also means taking a hard look at whether there is truly any benefit in reissuing old recordings from the analogue era in high resolution.

Other than new audiophile all-digital recordings (mainly classical, jazz, vocals from specialized sources like Channel Classics, 2L, AIX, etc.), or maybe remasters from high-quality sources from Audio Fidelity/MFSL/SHM-SACD, etc... the vast majority of music does not require the resolution of HRA whatsoever. Heck, most of the top-40 pop/rock tunes don't even challenge high bitrate MP3. Mark Waldrep (Dr. AIX) has been warning us about this for years in his blog - I believe he's right. Highly dynamically compressed music with high inherent noise floors and unnatural recordings that were done without intent to preserve the full frequency spectrum or decent dynamic range do not sound any better in 24/96+. In fact, to truly take advantage of the resolution available, one must ensure full resolution through the whole production chain from recording to mixing to ensuring digital processing is done with adequate precision, to the final mastering step. I think Waldrep is right that the high resolution era (that is, capable of utilizing both >16-bit resolution, and >44/48kHz sample rate) truly began after the availability of high-quality digital recording gear (mid to late-90's?). All the music before then in all likelihood may only benefit from higher sampling rate, but 16-bits is all that's needed. As an example, although we can use 96kHz+ sampling on our favourite analogue recordings to preserve as much of the extended frequency as reasonable, I think most of us realize that as much as one may "love" yet-another-remaster of Miles Davis' Kind Of Blue, Jazz At The Pawnshop, Dave Brubeck, Living Stereo classics or the Rudy Van Gelder discography, the dynamic range of these recordings is limited and 16-bits would be more than enough to capture everything down to the tape noise.

It's easier, isn't it, to pretend that all music can be digitized in high-resolution and sold in this "new and improved" format? The music industry of course would be loath not to be able to sell yet another re-issue (this time with the HRA sticker on the cover) and have us all buy another copy of something we already have...

4. How much 'should' this cost?
Ah, the billion dollar question! The other day, I was in BestBuy and noted that the high-resolution Blu-Ray copy of Gone Girl goes for $25 and the "standard resolution" DVD was $20 (enjoyable movie BTW, especially if you're a David Fincher fan). So, unless one is still stuck with a small (<30") TV or is otherwise incapable of viewing higher resolution video, having access to a high-resolution movie (which by the way also has a lossless surround soundtrack to boot) costs 25% more. Considering that HRA isn't as easy to differentiate from a CD as 1080P video is from 480P, should we even be charged a 25% premium? Considering that a CD can be bought these days for $10, how much do you think we should be charged for a digital music download when all it is is a data copy sent down a utility which I personally pay for (ie. my internet provider's monthly cost)? We do not even get a plastic case, printed cover/booklet, or a piece of polycarbonate in hand. Furthermore, the CD can be resold! Can I resell a 24/96 music download?

Ultimately the market will sniff out what the cost should be based on demand... Personally, I have purchased many SACDs and DVD-As over the years at significant premiums over the CD. Only a couple of music Blu-Rays so far. Truth be told, I mostly buy those with a 5.1 surround mix or maybe a 3.0 mix (like the Analogue Productions Nat King Cole SACDs), and there is a "collectability" component to the purchase which a download will never have. If I were to throw out a number, I think $13 for a high-resolution 24/96+ FLAC download is OK with me (assuming the CD costs $10), knowing that it's a more specialized item and there's a cost premium to that. Short of some kind of inflationary tailspin, I can't see spending $20+/album as reasonable for a standard digital download. I personally cannot see the jump from 24/96 to 24/192 as representing any value, so I would not likely pay more than maybe a dollar or two extra for it. I'd love to hear what others think should be the target price.
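To put the numbers in this section side by side, here is a small Python sketch of the premium arithmetic. The prices are the ones quoted above; the helper name `premium_pct` is just illustrative.

```python
def premium_pct(base_price: float, premium_price: float) -> float:
    """Percentage mark-up of premium_price over base_price."""
    return (premium_price - base_price) / base_price * 100

# Prices as quoted in the text above (US dollars).
print(f"Blu-Ray vs DVD ($25 vs $20): {premium_pct(20, 25):.0f}% premium")
print(f"24/96 download vs CD ($13 vs $10): {premium_pct(10, 13):.0f}% premium")
print(f"$20 download vs $10 CD: {premium_pct(10, 20):.0f}% premium")
```

Worth noting: $13 against a $10 CD works out to a 30% premium, slightly more than the Blu-Ray's 25% over the DVD, while a $20 download would be double the CD price.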

High Resolution Elitism!?
In the "Is High Resolution Audio Elitist" post by Michael Lavorgna, he suggests "you have to wonder if some of the stronger negative takes on what is a plea for better sound quality are rooted in an emotional response as opposed to a technical one". Indeed, there has been a strong negative emotional response, hasn't there? But really, is this surprising? Considering the unrealistic hype around HRA (like the 'artist coming out of car' video above) and the misleading diagrams (the underwater qualitative diagram, Sony's stair-step digital waveform), it's no wonder folks who know a thing or two about digital audio cannot help but feel perturbed by the claims. Mr. Lavorgna, it's not "envy" that is the primary emotion; it's actually disgust.

I find it fascinating that Mr. Lavorgna would even bring up the price issue or "elitism" with HRA. The truth is that even relatively inexpensive devices these days like, say, a $200 24/192 DAC off eBay from China can easily achieve high-resolution playback (even this oldie). USB-stick DACs like the AudioEngine D3 are also fine (up to 24/96 in this case). What's the big deal?! From a hardware perspective, $$$ should really be spent on quality speakers, a decent sound room, and room treatments for the best "bang for the buck" in general and especially for high resolution music reproduction. In fact, I think it's important to scrutinize expensive audiophile gear and ask for objective evaluation - even the highly touted PS Audio PerfectWave DirectStream DAC has a measured noise floor that is not low enough to benefit from more than about 17 bits of resolution (it sounds good at the local dealer BTW, but the accuracy is measurably limited). Definitely avoid weird expensive tube DACs like the Allnic and Lector Strumenti. Objective reviews with measurements become even more important when we get into high-resolution audio. Furthermore, just because some piece of equipment is expensive doesn't mean it's desirable. And just because some people criticize the cost of the hardware and software doesn't mean there's an underlying unfulfilled desire because they're lacking the financial resources available to the "elite" of this world! Sometimes the asking price is just obviously ridiculous for what one gets. (And "sometimes a cigar is just a cigar.")

In Summary...
Ultimately, I hope folks don't get too sidetracked by things like the hardware, whether any particular company/spokesperson is worth backing (ie. Pono/Young, HDTracks/Chesky, Sony Walkman, Astell&Kern, etc...), or even which encoding technique is used (DSD, PCM-FLAC, PCM-MQA/Meridian/Stuart, etc...). HRA has been around for more than a decade with SACD and DVD-A; it's not really that new or sexy for many of us who have been listening to this stuff for a while. Better recordings are really what the world needs, not bigger file sizes: recordings truly worthy of 24 bits and >44kHz because care and judicious processing were applied to maintain nuances and realism. Whether we can hear the difference with HRA or not, I think we have to leave to each audiophile to decide through experience, intellectual consideration, or likely some combination of the two. In time, we will see just what "value" digital high resolution recordings hold in the marketplace from a cost perspective and whether lossless/high-resolution store fronts are able to succeed in the face of competition like the streaming services and traditional physical media. If companies, consumers, reviewers, and the press unite to advocate for and get us better sounding albums that can actually benefit from high resolution instead of the crappy, loud, typical mastering "quality" we've been subjected to in the last 2 decades, we all win. This, in my opinion, is the evangelical "mission" which audiophiles and music lovers should be pursuing.

Isn't it ironic that over time, for the most part, hardware like DACs objectively improves and becomes cheaper, yet it seems like the mainstream music software side just gets further away from high fidelity and realistic sounding recordings?


Appendix A: The Oohashi "Hypersonic Effect"

I'm sure some folks will raise the spectre of the Oohashi (J Neurophysiol, 2000) study as evidence that ultrasonic frequencies make a difference - the "hypersonic effect" (a presentation was made by the same group in 2002). Michael Fremer already has in his spiel. (I see he just posted up another piece using this paper as the main point.)

In a nutshell, this study showed that there was enhanced alpha-frequency EEG occipital-parietal power when test subjects were played music with extended frequency response (up to 50kHz; the 20-50kHz ultrasonic contribution peaking at -30dB compared to the audible spectrum normalized at 0dB). PET scanning also demonstrated increased deep brain activity (increased rCBF in the midbrain and lateral left thalamus) with full frequency music, which was not noted when a low-pass filtered version of the sound was presented. I actually like this study and it's one of those nuggets in the audiophile psyche that stands out as a fascinating talking/thinking point! Remember though that functional neuroimaging is a hot topic these days, and there are many reports out there of questionable significance. Even if we fully accept the methodology, there still remains the question of what it all means... For example, the recording equipment is a "high-speed one-bit coding signal processor operating at 1.92 MHz" - I'm not clear how good this is compared to modern ADC/DACs. (It's supposed to have flat frequency response over 100kHz, so did they use noise shaping? If so, doesn't that introduce ultrasonic noise? The sample rate is obviously lower than SACD/DSD64 at 2.8MHz, so the noise shaping could be more intense!) The music chosen was the "Gambang Kuta" from Bali (have a listen here) - lots of high frequency content; can the results be generalized to Western music? We don't know if the test subjects even liked this music, so it's worth considering whether subjective pleasure would change these results (what if most of the test subjects thought the "music" sounded like fingernails on chalkboard!?). Oohashi also designed the Pioneer-made "super-tweeter" that's supposedly flat to 100kHz - practically, how many reviewers/writers/listeners have speakers/headphones capable of this?
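For a sense of scale on that -30dB figure, decibels convert to linear ratios as 10^(dB/20) for amplitude and 10^(dB/10) for power. A minimal Python sketch follows; the function names are illustrative, and the conversion by itself says nothing about audibility or neurological effect.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level in dB to a linear amplitude (voltage/pressure) ratio."""
    return 10 ** (db / 20)

def db_to_power_ratio(db: float) -> float:
    """Convert a level in dB to a linear power ratio."""
    return 10 ** (db / 10)

# Peak ultrasonic level relative to the audible band, as described above.
level_db = -30.0
print(f"amplitude ratio: {db_to_amplitude_ratio(level_db):.4f}")
print(f"power ratio:     {db_to_power_ratio(level_db):.4f}")
```

In other words, at its peak the ultrasonic content sits at roughly 3% of the amplitude (0.1% of the power) of the normalized audible spectrum.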

As far as I am aware, nobody has replicated the PET results. The same Oohashi group has now published again in 2014 (Fukushima, PLOS, 2014) using DSD128 (1-bit, 5.6MHz) and the TAD PT-R9 ribbon super-tweeter. They used the same Bali music. This time they feel there is both a positive hypersonic effect (with high frequencies >32kHz added to the audible component) and a negative hypersonic effect (for lower ultrasonic frequencies up to 32kHz), determined by whether the alpha-EEG power increases or decreases! There's even a vague reference to whether there are safety issues with these high frequencies. Ultimately this appears to be even more confusing. Since they didn't ask about perceived subjective sound quality, we don't know which (positive or negative) hypersonic effect sounded better in this study! If the negative hypersonic effect is "bad" for perceived quality, does that mean it's better to low-pass down to 20kHz than to retain all the frequencies up to 32kHz? But if the music recording has lots of frequencies >32kHz, is it then better to retain them since the brain experiences a positive hypersonic effect? Even if these neurophysiological effects were real and replicable, should we even care if there's no conscious awareness? (It's worth mentioning again that DSD generally uses noise shaping and adds to the ultrasonic noise, so in the 2014 study, what happened to the noise that usually accompanies DSD128 playback starting around 50kHz?! In 2014, 24/192 PCM would have been better with less quantization noise, I think.)
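Since several sample rates are being thrown around here, the following Python sketch lines them up. DSD rates are defined as multiples of the 44.1kHz CD rate, and the Nyquist limit is half the sample rate; the 1.92 MHz figure is the one quoted from the 2000 paper. Keep in mind this is arithmetic only: for 1-bit systems the usable bandwidth is set by the noise shaping rather than by the Nyquist limit, so none of this tells us how noisy the ultrasonic band actually was in either study. The function names are hypothetical.

```python
CD_RATE_HZ = 44_100

def dsd_rate_hz(multiple: int) -> int:
    """DSD64, DSD128, ... run at `multiple` times the CD sample rate."""
    return multiple * CD_RATE_HZ

def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at a given sample rate."""
    return sample_rate_hz / 2

rates = {
    "Oohashi 2000 1-bit coder": 1_920_000,  # as quoted from the paper
    "DSD64 (SACD)": dsd_rate_hz(64),
    "DSD128": dsd_rate_hz(128),
    "PCM 24/96": 96_000,
    "PCM 24/192": 192_000,
}

for name, rate in rates.items():
    print(f"{name}: {rate / 1e6:.4f} MHz, Nyquist {nyquist_hz(rate) / 1e3:.1f} kHz")

print(f"1.92 MHz as a fraction of DSD64: {1_920_000 / dsd_rate_hz(64):.2f}")
```

The "2.8MHz" and "5.6MHz" shorthand used above corresponds to 2.8224 MHz and 5.6448 MHz, and the 1.92 MHz coder runs at roughly two-thirds of the DSD64 rate.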

Bottom line: Before we accept theories around the importance of high frequencies affecting central nervous system functioning, realize that the data is limited and its significance unclear. I think it's highly speculative to claim that these studies argue for high-resolution audio.

One more thing while we're speculating here... I wonder why there's no discussion about beta EEG frequencies? Cortical processing and alertness are correlated with the higher frequency beta activity. Studies of temporal processing, for example, will use beta-band activity for auditory guidance (like this one). Furthermore, alpha tends to be concentrated occipitally, which is of course the major visual processing cortical area (usually with eyes closed, at rest)... Since we're thinking about music as perhaps being emotionally and cognitively engaging, I suspect we really should be looking for frontal and temporal activity, although sensory modalities can cross-affect processing (example).

[Of interest, in the Oohashi 2000 paper, they also claimed that the 26 young (18-31, 42% female!) Japanese subjects who took the "psychological experiment" portion gave scores that differed significantly, presumably in favour of the full-frequency test sample over what was presumably a 22kHz low-pass version (I'm unclear because they also used a 26kHz cut-off version for one of the EEG tests). I would have loved to see a more rigorous examination of the "psychological experiment" portion since they claim that the respondents were able to hear the full-spectrum sample as "softer, more reverberant, with a better balance of instruments, more comfortable to the ears, and richer in nuance". Considering all the negative or barely significant studies in the literature over the years, even this alone would be worth discussing irrespective of EEG or PET data!]