|I guess an LP can still be considered "high fidelity" by some in 2014... But doubtful it should ever be called "high-resolution"!|
Like DSD / SACD, high-resolution PCM (24-bit, 88kHz+) in the form of DVD-V (up to 24/96) and DVD-A has been widely available for more than 10 years already ("rebadged" recently as HRA for "High Resolution Audio"). DVD-V's with 24/96 audio tracks could be easily ripped back in the early 2000's, and by early 2007, the DVD-A copy protection was overcome allowing easy DVD-A ripping and evaluation of the sonic data up to 24/192 2.0 and 24/96 5.1.
In 2010, out of curiosity, I ran a little foobar ABX trial using the equipment I had back then to see if I could tell the difference between 24/96 and 16/44 and posted this on Audio Asylum. The conclusion... I didn't think I could. (I've included that post below as Appendix A for completeness.)
As you know from previous posts, I'm not a DSD fanboy. It does have some limitations compared to PCM: high frequency noise due to noise shaping, a non-uniform noise floor as a result, little opportunity to run DSP algorithms, and current file format implementations that make it cumbersome. In business, the opportunity to differentiate oneself from another provides the opportunity to sell the item as "new", "different" and "better" and indeed DSD provides many talking points and sales opportunities. As I wrote in the PCM-to-DSD article, there are certainly some audible differences DSD processing imparts and this change can be perceived as euphonic even though the actual underlying resolution is no better. It's a bit of a philosophical point: is it better to listen to something played back accurately? Or should we aim for euphonia even though it clearly adds something (distortion) to the signal that was not there originally? In the case of the PCM-to-DSD algorithm, clearly ultrasonic frequencies are being added to the signal as demonstrated by the measurements. As I have opined a number of times in the past, my preference is for accuracy; this is my definition of high fidelity and I generally feel that PCM retains accuracy better than DSD64, with DSD128 obviously capable of better accuracy than DSD64. (Of course, if a recording began life as DSD and wasn't tampered with, that's as good as it gets and there's no point losing precision going to PCM...)
Let's consider in this post what it would take to overcome the "limits" of 16/44 in a home component system. Remember that headphones are a much simpler case and for a lot less money, one should be able to better speaker systems - but of course the presentation of soundstage would be much different. (In a way, this post will be similar to a previous one I linked to in What Hi-Fi? but I hope to be more thorough and more realistically critical of the whole endeavour.)
I. The High-Resolution Hardware Needed.
A. Good enough DAC?
Let's start with the DAC since unless it can reproduce analogue waveforms of high dynamic range and the full frequency spectrum, there's no point proceeding. I have already posted measurements throughout this site and demonstrated these qualities in many DACs already. I do not believe it is difficult to pick out DACs which measure well. A DAC capable of >16-bit dynamic range is not hard to find; almost any decent unit today should do (even the electronics-packed Squeezebox Touch from years back can do this). Furthermore, it doesn't have to be expensive; that old AUNE X1 DAC (<$200) from at least a couple years back can easily convey the measurable benefits of hi-resolution PCM up to 24/192. If you have a look at the Stereophile measurements page for modern DACs, it's easy to see that the state-of-the-art DACs all essentially measure in an ideal fashion these days for 16-bits with the best achieving a noise floor around 21-bits with high-resolution material. Talk about 32/48/64-bits and such is nonsense in terms of DAC output resolution. High bit-depth would be beneficial for accuracy of complex DSP calculations like in the studio environment with numerous Pro Tools effects. Although I haven't heard all the "best" DACs, I have heard a number of good ones and feel that differences are minor and most likely inaudible in volume-controlled blind testing despite reviewers' comments. Measurements can show slight differences with impulse response, the occasional jitter difference. No biggie.
B. Good enough pre-amp, amp, speakers?
If you look at the measurements in Stereophile under the pre-amp section, it's not unexpected that vacuum tube preamps generally measure worse in terms of unweighted audioband SNR; typically around 75-85dB with a 1V signal, including some very expensive models. In comparison, good solid state preamps should be able to achieve around 100+dB with a similar test. In general, you'll find this difference with any vacuum tube vs. solid state comparison... As per the philosophical question posed above, vacuum tubes are less accurate (older obsolete technology) but can be euphonic (and nostalgic) for some people.
In my system, I've already shown that the Emotiva XSP-1 pre-amp and Onkyo TX-NR1009 receiver are capable of passing through an analogue signal of easily >100dB dynamic range from the TEAC UD-501 DAC. Unfortunately the old Denon AVR-3802 was incapable of this, therefore, it would not be a device I'd use in the high resolution audio chain. Measurements are needed to know whether the gear is good enough for high-resolution audio.
The amplifier and speakers are even harder to quantify... For high resolution dynamics, we need the amplifier to be able to produce enough power to drive the speaker to create >96dB dynamics in the room! Likewise the speaker will need to be able to handle the power to recreate the full dynamic range without distorting to a significant degree in order to benefit from the resolution afforded by >16-bits! This is tough.
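For a rough sense of where the 96dB figure comes from, the theoretical dynamic range of N-bit linear PCM can be sketched as 20*log10(2^N), or about 6dB per bit. This is only a back-of-envelope illustration: it ignores dither and noise shaping, the function name is my own, and it says nothing about what a particular DAC actually achieves.

```python
import math

def pcm_dynamic_range_db(bits):
    """Theoretical range of linear PCM: full scale relative to one quantization step."""
    return 20 * math.log10(2 ** bits)

# 16-bit CD, the ~21-bit noise floor of the best DACs, and 24-bit hi-res
for bits in (16, 21, 24):
    print(f"{bits}-bit: {pcm_dynamic_range_db(bits):.1f} dB")
```

By this estimate, 16 bits gives roughly 96dB and 24 bits roughly 144dB, which is the gap the amplifier, speakers and room would have to resolve.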
That's just the dynamic range. What about the frequency response from a speaker? Assuming that ultrasonics have some kind of beneficial effect (dubious), most excellent speakers can reproduce frequencies beyond 20kHz. Few speakers have a "super tweeter" to reproduce those frequencies up to 40+kHz flat however. In fact, many listeners would prefer a slightly rolled off response by 20kHz as per the Brüel and Kjær "house curve". I know I can't hear much above 16kHz so would find it hard to feel any need to spend money on accurate ultrasonic frequency reproduction.
Success in conveying the high resolution audio signal also depends on...
C. Good enough room?
It goes without saying that to achieve a dynamic range over 96dB, we need a quiet room; the main point of my previous post about the importance of "silence". Assuming you have amps and speakers capable of it, to better the dynamic range afforded by the 16-bit CD format in a very quiet 20dB(A) listening room, we need to be able to achieve dynamic transients >116dB SPL. In my stereo system consisting of 250W (continuous power) monoblocks feeding quite sensitive 92dB/W/m speakers, placed near a wall, the theoretical maximum SPL at my sweet spot 11-feet away is "only" 111.5dB with the amp already contributing 0.05% distortion (check out the Collins' Cinema SPL calculator to do your own calculations). Accounting for some headroom, I can maybe get up to 115dB for dynamic transients in music. Remember, due to logarithmic properties, even if I doubled the amplifier power to 500W, the SPL would only increase by 3dB to 114.5dB (maybe getting close to 118dB transient peaks). I suspect the speakers would be significantly distorting the sound at these loud volumes, not to mention we're well into hearing damage levels with repeated exposure.
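The arithmetic above can be sketched in a few lines. The assumptions here are mine, chosen because they reproduce the numbers quoted: free-field inverse-square loss with distance, two speakers summing for +3dB, and a flat +3dB for near-wall placement. The actual calculator may model things differently, and real rooms certainly do.

```python
import math

def estimated_max_spl(sensitivity_db, amp_watts, distance_m,
                      num_speakers=2, near_wall=True):
    """Rough SPL estimate at the listening position (assumption-heavy sketch)."""
    spl = sensitivity_db + 10 * math.log10(amp_watts)  # gain over the 1 W reference
    spl -= 20 * math.log10(distance_m)                 # inverse-square loss from 1 m
    spl += 10 * math.log10(num_speakers)               # assumed: +3 dB for a stereo pair
    if near_wall:
        spl += 3.0                                     # assumed boundary reinforcement
    return spl

FEET_TO_M = 0.3048
seat = 11 * FEET_TO_M

print(round(estimated_max_spl(92, 250, seat), 1))  # 111.5
print(round(estimated_max_spl(92, 500, seat), 1))  # 114.5
print(20 + 96)                                     # 116: target for a 20 dB(A) room
```

Doubling the amplifier power only buys about 3dB, which is why closing the last few dB to 116dB SPL is so expensive.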
Furthermore, in-room frequency response, reverberation time, speaker placement effects are also significant... I believe all of this would be of greater effect than what is afforded by the extra resolution.
As you can see, we've got a problem here already from the hardware setup perspective in arguing for high resolution audio of >16-bit dynamic range and >22kHz frequency response.
[Note that I haven't even mentioned dithering and noise-shaping increases the perceived 16-bit signal's dynamic range even further at reasonable volumes, so in fact, it's not as "easy" as the 96dB quoted above. Read this iZotope Ozone guide for more details.]
D. Good enough ears/brain?
Serious golden ears (platinum ears?) are mandatory I believe :-).
II. The High-Resolution Software Needed.
When we buy our music, how do we know that all the complex bits and pieces were done well enough to ensure quality? Furthermore, when we go on a web site like HDTracks, Qobuz, 2L, Channel Classics, etc... how do we know it's worth downloading 24/48, 24/88, 24/96, 24/176, 24/192, DXD, DSD64, DSD128?
Sadly, I think the state of affairs right now for many audiophiles (promoted by music web sites, distributors and those with financial/advertising revenue interests) is akin to the megapixel race in digital cameras. Thankfully, that megapixel race seems to have died down significantly as consumers have gotten wiser to the fact that quality of each pixel is important. The JPEG image from a cellphone's 12 megapixels is way inferior to that from an SLR with nice lens at 12 megapixels (even though JPEG itself is lossy)! But in audio, there's still the impression that bigger numbers are somehow supposed to be better without qualifying where the source comes from. 192kHz is better than 96kHz. DSD, the "super audio" format beginning at 2.8MHz sampling rate is believed to be even better (by some). I have yet to see a professional reviewer prefer the sound of a 24/96 over 24/192 when both versions are available - is it because he has hearing higher than 48kHz? Do all DACs sound better running at 192kHz? Or is it just bias from the numbers game? (I suggest it's just the latter.)
Of course, in real life it's never as simple as bigger numbers being better... As consumers we usually have no insight into the hardware used like the microphones or how the ADC was accomplished (remember the technical limits of the studio equipment!). Complex studio "processing" for multi-tracked projects, and the myriad of DSP plug-ins require expert sound engineers to ensure quality with each step. I would be remiss not to bring up concerns around decisions to use dynamic range compression (eg. Loudness War) which could have artistic merit but which also lowers the ultimate fidelity of the original recorded signal and diminishes the number of bits of dynamic range required to fully encode the sound. As a result, many audiophiles including myself would regard some of the first CD releases back in the 1980's to about mid-1990's as superior to subsequent remasters using heavy-handed processing despite the fact that ADCs back in those days would have been inferior. For example, just look at the mess UMG made in 2010 with the Rolling Stones remasters, among many recent cases.
To some extent we could make decisions based on the reputation of certain companies. Audiophile remastered releases by Analogue Productions, Audio Fidelity, Mobile Fidelity can generally be accepted as the best remasterings using better equipment and maintaining higher standards of audio quality whether it's standard CD, SACD, or potentially high-resolution PCM. Some mastering engineers like Barry Diament, Kevin Gray, and Steve Hoffman are among those who have made a name for themselves as mastering to a higher audiophile standard (as opposed to this guy and his ruinous work on RHCP's Californication and I'm With You). Realize of course that remastering demands that the original source material be of high resolution quality if one is to show the benefits of the high-resolution format over CD - this is debatable for even the best analogue tape. Likewise, certain record labels like 2L, AIX/iTrax, Channel Classics, Naim, Blue Coast, Reference Recordings produce superb sounding original recordings catered to the audiophile segment with a strong reputation to maintain.
However, we are witnessing the rise of the high-resolution internet store fronts. Places like HDTracks, Qobuz, Acoustic Sounds, Native DSD Music, Linn seem to be resellers of high-resolution format files (PCM and/or DSD) from some of the major/larger record labels. Quality control issues have been discussed over the years as obvious "upsampling" (taking a standard 44kHz and resampling it to 96kHz to sell for example) has shown up on places like HDTracks, which really does a disservice to the music buying public. Like the list of SACD's appearing to be nothing more than upsampled PCM, we do not know the origin of many of these "high resolution" albums. I've certainly run into my share of questionable releases that appear to be blatant standard resolution upsampled music or look like standard 44/48kHz PCM data run through an analogue board and re-recorded with low-level high-frequency noise on a 24/192 release. The former is easy to spot, but the latter is hard to prove!
As audiophiles and audio enthusiasts, we can of course look beyond just the hyped numbers and acronyms like 24-bits, 96kHz, DSD... I suggest doing a search on places like the new Computer Audiophile Music Review segment (good job Chris on the new feature) and see if folks have "measured" the release for the following characteristics to look for a "true" high resolution master as best we can as consumers:
A. Absence of steep low pass 'filtering' suggesting upsampling.
Good examples are all over my list of suspected upsampled SACDs. Note that in the case of DSD, since there's rising high frequency noise, the discontinuity exists as a steep cliff around 22kHz. You can also see a good example of this in the HDTracks release of Graceland 24/96 as per this thread on Computer Audiophile. Because upsampling of PCM is done within the digital domain, it results in the "brick wall" abrupt transition like this:
|Blatant example of 24/96 upsampled to 24/192 using "Blue Train", John Coltrane. No frequencies above 48kHz at all...|
|Upsampled track with -100dB spatialized white noise mixed in to look like there's something >48kHz.|
Here's what that track in true 24/192 looks like, but as shown above, it's really easy to conceal a 24/96 upsample:
|The real 24/192 "Blue Train" off the 2001 Classic Records HDAD release.|
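The "brick wall" check can be roughed out in code by comparing spectral energy above a cutoff to energy below it. What follows is a minimal sketch using synthetic test tones rather than a real track; the function names, the Hann window and the 48kHz cutoff are my own choices for illustration. As the noise-padded example above shows, this only catches the blatant case and is easy to fool.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ultrasonic_ratio_db(samples, sample_rate, cutoff_hz):
    """Energy above cutoff_hz relative to energy at or below it, in dB."""
    n = len(samples)
    # Hann window to limit spectral leakage between bins
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(samples)]
    spectrum = fft(windowed)
    low = high = 0.0
    for k in range(1, n // 2):  # skip DC and the Nyquist bin
        power = abs(spectrum[k]) ** 2
        if k * sample_rate / n > cutoff_hz:
            high += power
        else:
            low += power
    tiny = 1e-30  # avoids log(0) on digital silence
    return 10 * math.log10((high + tiny) / (low + tiny))

def tone(freq, amp, n, rate):
    """Synthetic sine test tone (a stand-in for real music data)."""
    return [amp * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

RATE, N = 192000, 1024
upsampled_like = tone(9000, 1.0, N, RATE)  # nothing above 48 kHz
genuine_like = [a + b for a, b in
                zip(upsampled_like, tone(60000, 0.01, N, RATE))]

print(ultrasonic_ratio_db(upsampled_like, RATE, 48000))  # far below -100 dB
print(ultrasonic_ratio_db(genuine_like, RATE, 48000))    # about -40 dB
```

On real material one would average many windows across the track and look at the shape of the roll-off, not just a single ratio.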
B. High average dynamic range.
What is the point of going above 16-bits if the mastering is compressed to hell and back? The opportunity to easily measure the DR "Dynamic Range" value using the foobar plugin (use the newer version found in the DR Database) has been extremely useful. This value is calculated by looking at the difference between peak volume vs. RMS "average" volume over chunks of time throughout the music. A higher number correlates to the presence of loudness peaks which of course corresponds to a more dynamic sound. Real life sounds dynamic! You can also check the Dynamic Range Database before purchasing to see if the title is in there.
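To illustrate the peak-versus-RMS idea, here is a deliberately simplified sketch. It is not the DR meter's actual algorithm (whose block selection and weighting I have not reproduced); it just averages the peak-to-RMS ratio over fixed-size chunks, and the function name and block size are my own assumptions.

```python
import math

def average_crest_factor_db(samples, block_size):
    """Mean peak-to-RMS ratio (dB) over consecutive blocks; silent blocks are skipped."""
    ratios = []
    for start in range(0, len(samples) - block_size + 1, block_size):
        block = samples[start:start + block_size]
        peak = max(abs(s) for s in block)
        rms = math.sqrt(sum(s * s for s in block) / block_size)
        if rms > 0:
            ratios.append(20 * math.log10(peak / rms))
    if not ratios:
        raise ValueError("no non-silent blocks to measure")
    return sum(ratios) / len(ratios)

RATE = 48000
sine = [math.sin(2 * math.pi * 1000 * i / RATE) for i in range(RATE)]
# Crude stand-in for heavy limiting: clip everything at half scale
squashed = [max(-0.5, min(0.5, s)) for s in sine]

print(round(average_crest_factor_db(sine, 4800), 2))      # about 3.01 for a pure sine
print(round(average_crest_factor_db(squashed, 4800), 2))  # lower: the peaks were flattened
```

The clipped signal scores lower because its peaks sit closer to its average level, which is the same effect heavy limiting has on a real master.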
Although there is no simple rule about this, the DR value can give you insight into whether the high resolution version of an album is the same mastering as the CD release and what difference exists. Consider the example of Lorde's Pure Heroine...
Here's HDTracks 24/48 "Studio Master":
In this example, the HDTracks release looks (thankfully) a little bit better than the CD - this is actually somewhat rare with modern pop/rock recordings since I find the majority are the exact same master with the same DR value. Generally we're seeing a 1dB improvement in average dynamic range with the high-res version. The problem is, we're still at a DR of 7dB only! Furthermore, there are no nuances to the peaks with every track pushed up and limited to 0dB (not really an issue here since the music is clearly synthetic but this would be very bad to see with acoustic music or orchestral music like some soundtracks such as the recent DR7 The Dark Knight Rises).
For perspective, back in the late 1980's and early 1990's, the average pop and rock album had average dynamic range around 10-13dB. A well recorded classical or jazz album with all the dynamic transients that come with natural recordings should average around 13-16dB.
Personally, my audio "New Year's resolution" for 2014 is to avoid purchasing *any* high-resolution album without at least a value of DR10; and I feel this is very conservative already (I probably should advance this to DR12). The rationale is simple... Recordings with low dynamic range and peaks pushed to ~0dB are loud and do not require us to turn up the volume to listen at reasonable levels - typically I play DR7 tracks at something like -35dB through my system. This type of recording does not utilize anywhere near the limits of a 16-bit medium like the CD nor that provided by my home audio system, much less demand the 24-bit dynamic range of high-resolution downloads.
As an aside, be very cautious in using the DR Meter for vinyl rips! Just because the values tend to be higher doesn't mean that it's due to the LP having better mastering. Have a look at Ian Shepherd's video here (essential viewing IMO):
In that example, with pops and clicks removed, even though the DR score is higher with vinyl, the original source is the same and we can speculate in a later post as to why this might be the case. Remember that vinyl rips remain plagued with the limitations of the LP source including surface noise, subtle wow/flutter, inner groove distortion, idiosyncrasies of the playback gear, etc. They can and do still sound great especially with some judicious post-processing, but not true high resolution like a good quality pure digital recording.
Nonetheless we see on DR Database someone's vinyl rip of Pure Heroine:
III. Conclusion - Expectations for High-Resolution Audio?
So, let's try to answer the question posed above of what to expect from high resolution downloads in conclusion... I think the answer is simple: Not much if anything compared to a technically good 16/44 version or CD of the same mastering.
From a hardware perspective, digital audio technology has advanced to the point of easily overcoming the 16/44 "limitations" within the DACs. The problem revolves around the ability of the analogue side (especially the speakers and room) and ultimately, the perceptual limitation of being human. Apart from listening in unreasonable ways (eg. listening to fade outs at extreme volume levels comparing undithered 16-bits vs. 24-bits), I don't know of any controlled study suggesting an audible difference between standard and high-resolution music.
From the software perspective, we are faced with just how well the recording was done and mastered in the first place. I would rather listen to an MP3 192kbps of a well recorded/mastered album than a poor DSD128 or 24/192. For folks to place an emphasis on the value of 24-bits, 88+kHz, and DSD appears to be ultimately a fallacy. The most egregious examples of these are the pseudo-high-res albums like the upsampled SACDs or upsampled PCMs sold under the high resolution moniker. But as Mark Waldrep ("Dr. AIX") has been talking about for years, true high-resolution demands that the source be high resolution... The hardware reproduction challenges are difficult enough, but if the source is anything other than a pristine (digital) production, the potential resolution of 24/96 and above would never be utilized (assuming anyone can even hear it!).
I often criticize Neil Young and his comments on these pages (and I see many others do as well) simply because he hypes up the 24/192 number in a way personifying the "megapixel race" of the audio world (I'm sure he's a decent fellow and all, and I certainly respect his artistry... But just watch this cringe-worthy "Dive Into Media" video from 2012 to see what I mean.). I think it's quite clear that unless his Pono brings something truly new like getting the record labels to provide good quality, dynamic, remasters, there is essentially no hope of Pono hi-res sounding any different than what we have so far with HDTracks or Qobuz. The 16-bit, 44kHz resolution "container" was never a significant "problem".
In the years since my post in 2010 (Appendix A), my opinion really (perhaps to my surprise) has not changed, even with better equipment like newer DACs, better headphones (Sennheiser HD800), a much improved room, separate components, and better speakers. Taken together - the challenges of hardware and software - I suspect that high-resolution PCM (like DSD) doesn't have much future in "making money" for the companies other than as it is today... Niche products with a small target audience. Unless the software companies somehow decide that the default digital resolution is greater than 16/44 and provide it as such, I don't see why the average person would pay more for what essentially is of little (if any) benefit in the consumer environment. The thought of making high-resolution lossless downloads default is unlikely given that the majority still just buys music from iTunes, which isn't even lossless to begin with!
Now on a personal note, I am mindful that hobbies in general are not meant to be rational pursuits; they're emotional endeavors for the sake of pleasure and fulfilment of some form. We (audiophiles) probably spend more time than needed obsessing over all manners of "trivial" things from what to buy to whether a cable is "good enough". Most people in this world would look upon the perseverative audiophile as a form of obsessive-compulsive neurotic but so be it. So are car collectors, stamp collectors, wine connoisseurs, watch aficionados, etc. In this context as an emotional pursuit, I'm actually OK with still collecting my favourite music in 24/96 or DSD if that's "as good as it gets". It is about perfectionism. There still could be something wrong with brickwalling at 22kHz and 16-bits do not challenge the objective capabilities of my DAC - perhaps some day I'll hear the difference assuming there are some merits to the recording being "high-resolution". Nonetheless, to buy a recording which is just upsampled or has severely squashed dynamics offering no potential benefits at all to the collector is just plain foolish. As a "more objective" audiophile, there are limits to perfectionism which I would be unwilling to cross based on the scientific data.
For those who insist that they can hear high-resolution audio, I would love to see comments below... I would highly encourage doing a controlled test like the ABX below for yourself with your own high-resolution music. Let me know if you ABX successfully!
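If you do run an ABX, it helps to know how likely your score is by pure guessing. A minimal sketch, assuming each trial is an independent 50/50 guess (a one-sided binomial test; the function name is mine, and the usual 0.05 threshold is a convention rather than anything magic):

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(f"{abx_p_value(12, 16):.3f}")  # 0.038
print(f"{abx_p_value(9, 16):.3f}")   # 0.402
```

So 12 out of 16 would be reasonably convincing, while 9 out of 16 is indistinguishable from coin flipping.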
Recently, I've been listening to some Larry Carlton. Excellent stuff especially some of the older recordings from the early/mid 1980's - have a listen to Discovery from 1987. No argument from me that 16/44 sounds fantastic when done well!
Audio low point of the week? A friend notified me that Beck's recent album Morning Phase scores DR6 from the HDTracks 24/96 "Studio Master" (I see the Computer Audiophile just put up an article on this). Same as CD. As per my New Year's resolution above, this is one album I'll just pick up at the local CD vendor for $10.
As usual... Enjoy the music. I've promised my wife I won't be buying another copy of "Kind Of Blue" - even in High-Resolution Audio 32/384 :-).
APPENDIX A: Old post from Audio Asylum in 2010
Test with 24/96 vs. 16/44 LITTLE TO NO DIFFERENCE.
Recently got the very versatile E-Mu 0404 USB (AKM AK4396 DAC) to play around with on my computer (quad core 2.8GHz, 8G RAM, low DPC latency, Win7, through USB of course). With some recent high definition downloads / DVD-A source:
Rebecca Pidgeon - The Raven: "Spanish Harlem" (24/88 Bob Katz 15th Anniversary Ed)
Carol Kidd - Dreamsville: "When I Dream (2008)" (24/96 Linn Studio Master)
Laurence Juber - Guitar Noir: "Guitar Noir" (24/96 AIX DVD-A rip)
Took these FLAC/WAV files, downsampled in Adobe Audition to 16/44 (no dither, no noise shaping) then resampled back up to 24/96. Verified that frequencies all truncated to 22kHz. Then listened to them with Foobar 2000 ABX comparator using the E-Mu ASIO output plugin. This allows me to A-B on-the-fly and do some "blind" ABX'ing.
Listened with headphones: Audio Technica ATH-M50, Etymotics ER-4B.
With this setup, I figure I've removed all variables except for sample rate change - same mastering, same DAC running at same sample rate.
Results: Essentially NO DIFFERENCE between the native 24/96(88) and 16/44. Blind ABX results NO SIGNIFICANT DIFFERENCE. When I do the rapid A-B switch in the middle of a song, I thought there MAY have been slightly more smoothness/openness in the high-def version but this could just be placebo and the improvement was MAYBE 5%.
At 38 years old, very few loud concert experiences, I don't think I have 'tin ears' (hey my wife thinks I have better ability to pick out music in noisy environments so I guess it's at least as good as some females :-).
1. Either my equipment sucks or these samples suck and there's a lot more but I need to fork up more $$$$.
2. Or high-def cannot be well appreciated with headphones.
3. Or the upsampling back from 16/44 --> 24/96 somehow reconstitutes the sound.
4. Or, there's really not much difference.
5. At this point I'd probably spend a few more dollars to buy a high-def download (maybe at most $5-10 more if it's something I like) when given the option but not expect significantly more revelation in the sound.
I've listened to good SACD as well and like them but there's no way to do tests like this. I didn't bother with 24/192 material since I figured most improvement should come from this first step up 44 --> 96. Anyone else done such tests for themselves?