Saturday, 11 February 2017

MUSINGS: Discussion on the MQA filter (and filters in general)... [Update: Including a look at the classic "Apodizing" filter.]



Here's an interesting comment from the post last week...

Excellent article but I have one query. On Sound on Sound they say "MQA claim that the total impulse-response duration is reduced to about 50µs (from around 500µs for a standard 24/192 system), and that the leading-edge uncertainty of transients comes down to just 4µs (from roughly 250µs in a 24/192 system)." In that case wouldn't you need an ADC with higher resolution than the RME Fireface 802 in order to see any real differences between the Reference and Hardware MQA decode?
As I said... Dammit CBee! Now you've made me post another blog entry on MQA :-).

So, for those who have not seen it, here's the interesting and very slick article in Sound on Sound (August 2016):
http://www.soundonsound.com/techniques/mqa-time-domain-accuracy-digital-audio-quality

Back in mid-2016 there were a number of articles with this time-domain claim, weren't there? I recall The Absolute Sound had something similar as far back as mid-2015. Notice if you compare the articles that although the graphics are updated and more "slick" in 2016, the general content is the same. The information appears to be industry material rather than independent research. Furthermore, these graphs are nice but they're not actual measurements - for example, the impulse-response-like waveform graph in Figure 12 of the TAS article, and Figure 13 in SoS are not actually representative of what we see of MQA these days.

CBee, since the MQA decode for the 2L track came out as a 352.8kHz DXD, sure, it would be nice for the Fireface to capture at DXD or above in the event that there is something at those higher frequencies. However, notice the lack of anything but noise from 60kHz on up so I don't see the test as missing anything of significance; certainly nothing that human ears should hear unless intermodulation distortion were so severe that the effect folded down into the audible frequencies!

What about "time domain" you say? Well, this seems "controversial" but I don't believe it really is... You see, as I noted in a previous blog entry on the limits of human hearing, I suspect the data being used by MQA comes from papers from Kunchur (linked in that article) referring to thresholds <10μs using various paradigms which may or may not be relevant to music reproduction. But a trap that is easy to fall into, and one which my original post touched on but I was thankfully corrected on, is to assume that timing accuracy is limited by the sample interval; for a bandwidth limited signal, it is determined by bit-depth. We can actually demonstrate this ourselves... But why bother; look at Monty's "Digital Show & Tell" around 17:20 when he talks about "Bandlimitation & Timing":



This is extremely important... Sampled bandwidth limited material placed in time between any two samples will still be reproduced accurately as demonstrated in that last part of the video segment. Again, this timing accuracy is determined primarily by bit-depth, not the idea that we need to sample at 192kHz in order that time domain accuracy of 5.2μs is achieved because that is the intersample duration. Sampling at 192kHz only gives us an ability to capture frequencies accurately up to 96kHz, not that it makes timing much better for a 15kHz audible tone. As you can see, the DAC Monty's using is simply of a standard linear phase upsampling antialiasing filter variety. There is a segment in the Sound On Sound article in the "Sampling Evolution" section making this claim:
Of particular interest are new techniques which essentially discard brick-wall anti-alias filtering as we currently know it, and employ new forms of sampling and reconstruction kernels that can resolve transient signal timing with extraordinary resolution, even conveying positional time differences that are shorter than the periods between successive samples! (emphasis mine)
This claim is confusing and clearly wrong as you can see in the demonstration by Monty. "Positional time differences" can already be represented between successive samples very well, thank you. The only "magic" in MQA is that they can reconstruct ultrasonic frequencies embedded in the signal higher than what is assumed when we look at a 44.1 or 48kHz PCM file because it knows how to decode that information hidden in the lower bits of the signal (typically under the noise floor).
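For those who would rather poke at this numerically than watch the video, here is a minimal sketch of the same idea. It is my own illustration, not anything from Monty's demo or from MQA: the Gaussian-windowed 3kHz burst, the 5µs delay and the phase-slope estimator are all assumptions chosen for convenience. The burst is sampled at 44.1kHz, quantized to 16 bits (no dither, to keep things short), and then the same burst delayed by a fraction of a sample period is sampled again. The delay is then estimated from the two sample sets alone.

```python
import numpy as np

FS = 44_100      # sample rate in Hz; sample period is about 22.7 us
DELAY = 5e-6     # hypothetical delay to recover, well under one sample period

def burst(t, f0=3_000.0, sigma=1e-3):
    """Gaussian-windowed tone; its spectrum is negligible far below 22.05 kHz."""
    return 0.9 * np.exp(-0.5 * (t / sigma) ** 2) * np.cos(2 * np.pi * f0 * t)

def quantize(x, bits=16):
    """Round to the nearest step of a signed integer grid (no dither)."""
    scale = 2 ** (bits - 1) - 1
    return np.round(x * scale) / scale

def estimate_delay(ref, sig, fs):
    """Estimate the delay of sig relative to ref from the cross-spectrum phase.

    A pure delay tau gives a phase of -2*pi*f*tau, so tau is found with a
    magnitude-weighted least-squares fit of a line through the origin.
    """
    cross = np.fft.rfft(sig) * np.conj(np.fft.rfft(ref))
    f = np.fft.rfftfreq(len(ref), d=1.0 / fs)
    w = np.abs(cross)
    phase = np.angle(cross)
    return -np.sum(w * f * phase) / (2 * np.pi * np.sum(w * f ** 2))

n = np.arange(4096)
t = (n - len(n) // 2) / FS              # time axis centred on the burst
ref = quantize(burst(t))                # reference capture
sig = quantize(burst(t - DELAY))        # same burst, arriving 5 us later

est = estimate_delay(ref, sig, FS)
print(f"sample period : {1e6 / FS:.2f} us")
print(f"true delay    : {DELAY * 1e6:.3f} us")
print(f"estimated     : {est * 1e6:.3f} us")
```

The estimate should land very close to the true delay even though the delay is much shorter than the spacing between samples, which is the whole point: the samples of a band-limited signal carry its position in time, and the remaining error comes from quantization rather than from the sample interval.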

An idea for MQA: Why not have Bob Stuart do a similar show-and-tell video using a 30kHz wave for the Goldenears; demonstrating how MQA can decode this but not a flat PCM file? I'd also be very curious about how accurate this encoded ultrasonic reproduction can be in the time domain. Since the >22.05/24kHz tone will have to be represented by multiple samples, the time domain accuracy for the unfolded frequencies might not be as granular.

Now as for the claims of total impulse-response reduced to 50µs, well, I would say "so what?!" Is there actually good research showing that a shorter impulse response (which also increases the risk of aliasing distortion) is audibly "better"? Surely there must be a balance in terms of duration of the filter, phase shift with minimum phase settings (as used in MQA), and aliasing distortion depending on empirical findings with listening tests. Is there evidence that using a minimum phase filter, which I presume is all they're referring to when they say "leading-edge uncertainty of transients comes down to just 4µs (from roughly 250µs in a 24/192 system)", is noticeable to listeners in blind tests? (Remember, the iPhone uses a minimum phase upsampler and has a low "leading-edge uncertainty" as well. Yet nobody hyped that up, right?)

Remember, in the audiophile world, there are others with essentially an opposite stance compared to MQA about impulse response duration! For example, look at the measurements for Chord's DACs like the Hugo TT (US$4800 back in late 2015). Notice that Chord is proud about their FIR filters with a large number of taps - 26,368 in the TT and even higher at 164,000 with the new DAVE DAC! I'd certainly be curious to see how the DAVE measures. In any event, these DACs are well liked and well reviewed by the audiophile press despite the very long duration of pre-ringing as measured for the TT (and I presume similarly with the DAVE).

Remember, an "impulse" wave as used to show the effect of the filters is not sound; certainly not a sound I would like to hear on my CDs and digital downloads! The impulse response is a way for us to view how the DSP reacts to this sudden discontinuous change in the input signal. It tells us how complex the filtering is, and allows us to extrapolate how it would affect the frequency domain - the other half of the "transform pair". I've spoken more about this in a previous blog post.
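To make the "transform pair" idea concrete, here is a rough sketch comparing a short and a long low-pass filter. Everything in it is an assumption for illustration: a plain Hamming-windowed sinc (linear phase), a 4x oversampled rate of 176.4kHz, and tap counts of 33 and 513. It is not the filter used by MQA, iZotope or any particular DAC. It prints how long each impulse response lasts and how much each filter attenuates a 25kHz image.

```python
import numpy as np

FS_OUT = 176_400   # assumed 4x oversampled output rate for 44.1 kHz material
CUTOFF = 22_050    # -6 dB point placed at the source Nyquist frequency
IMAGE = 25_000     # where the image of a 19.1 kHz tone lands (44.1 - 19.1 kHz)

def windowed_sinc(num_taps):
    """Linear phase low-pass: ideal sinc response shaped by a Hamming window."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = np.sinc(2 * CUTOFF / FS_OUT * n) * np.hamming(num_taps)
    return h / h.sum()              # unity gain at DC

def gain_db(h, freq):
    """Magnitude of the filter's frequency response at one frequency, in dB."""
    n = np.arange(len(h))
    response = np.sum(h * np.exp(-2j * np.pi * freq / FS_OUT * n))
    return 20 * np.log10(np.abs(response))

results = {}
for taps in (33, 513):
    h = windowed_sinc(taps)
    # Store (impulse response length in microseconds, gain at the image in dB).
    results[taps] = (1e6 * taps / FS_OUT, gain_db(h, IMAGE))
    print(f"{taps:4d} taps: {results[taps][0]:7.1f} us long, "
          f"{results[taps][1]:6.1f} dB at {IMAGE / 1000:.0f} kHz")
```

The short filter has a compact impulse response but lets much more of the 25kHz image through; the long filter rings for far longer but suppresses it much more strongly. That is the trade-off being argued about: a "nicer" looking impulse response is paid for in the frequency domain.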

Ultimately, remember again the importance of speakers and room acoustics for these time domain discussions... Have a peek at the speaker measurements of the step response in Stereophile. What time scale do you see on the x-axis? About a millisecond for a good speaker to go from the start of the impulse to a low level trail (this can be improved significantly with DSP correction). Likewise, when we examine room acoustics, who listens in an anechoic chamber such that there aren't reflections affecting the sound past 50µs or whatever minuscule time period is being reported?

The bottom line is that from what we know currently looking at the Bluesound Node 2's firmware and the results from Stereophile's measurement of the Mytek Brooklyn, MQA uses a minimum phase filter which has a slow roll-off and moderate amount of aliasing beyond Nyquist, thus a relatively short impulse length. If you look at Figure 6 in that Stereophile measurement page, for a 44.1kHz signal, the Brooklyn is -6dB at 22.05kHz and I estimate about -40dB by around 30kHz... Okay, we can approximate this kind of curve in iZotope RX 5:

So if I run a 16/44 white noise signal through that, it looks like this:

Not exactly the shape of the measured Brooklyn but good enough for us to have a peek at the impulse response:

So, pending further information, that's likely about what MQA's impulse response would look like if we had the ability to encode a test signal and run it through a decoding DAC like the Mytek Brooklyn or Meridian Explorer 2... It's similar to the image from Mans R's examination of the Bluesound Node 2 (see post 67) +/- 1 or 2 more beats of notable post-ringing.

There is one major caveat though if you wanted to try running some music through iZotope to have a listen to these filter parameters. iZotope's resampler is clearly technically superior to what was shown coming out of the Brooklyn's DAC. iZotope's 64-bit DSP pipeline protects against intersample overloads so it's a lot less noisy than MQA decoding (remember, the Brooklyn has other cleaner filter settings - see Figures 4 & 5 in the Stereophile page). For example, here's a 0dBFS 19.1kHz tone through iZotope:

We can easily see the 25kHz alias "image" due to the "leaky" filter. But because it doesn't overload at high amplitude, we don't see all that other noise in Figure 6 using the same sort of test tone. As you can see in Figure 6A, it took a -1dBFS tone to remove the overloading condition. True, as per Bob Stuart, "real musical spectra does not have full-scale content at the top of the audio band", but that doesn't mean a more robust filter that can handle all signals without error should not have been used, right? Especially these days when so much of the material out there is highly compressed and approaching 0dBFS!
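For anyone unfamiliar with intersample overloads, here is a small sketch of how the reconstructed waveform can exceed 0dBFS even when no individual sample does. Note the assumptions: this uses the textbook case of a tone at exactly a quarter of the sample rate, sampled 45 degrees away from its peaks, rather than the 19.1kHz test tone above, and it uses ideal FFT-based interpolation rather than a model of any DAC's filter.

```python
import numpy as np

def upsample_fft(x, factor):
    """Band-limited interpolation of a periodic signal by zero-padding its spectrum."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    padded = np.zeros(n * factor // 2 + 1, dtype=complex)
    padded[: len(spectrum)] = spectrum
    # Rescale because irfft divides by the new, longer length.
    return np.fft.irfft(padded, n=n * factor) * factor

n = np.arange(64)
x = np.sin(2 * np.pi * n / 4 + np.pi / 4)   # fs/4 tone, sampled 45 degrees off its peaks
x /= np.max(np.abs(x))                      # normalise so the largest sample sits at 0 dBFS

y = upsample_fft(x, 8)                      # approximate the reconstructed waveform

sample_peak = np.max(np.abs(x))
true_peak = np.max(np.abs(y))
print(f"largest sample      : {20 * np.log10(sample_peak):+.2f} dBFS")
print(f"reconstructed peak  : {20 * np.log10(true_peak):+.2f} dBFS")
```

In this case the waveform between the samples peaks about 3dB above full scale. A filter computed with enough headroom handles that without trouble, while one that clips or wraps internally will add distortion; leaving a dB or so of headroom in the test tone, as in Figure 6A, avoids the condition.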

By the way, we can compare this to the emulated Ayre/PonoPlayer filter in the previous digital filter blog post. Clearly the PonoPlayer uses an even more gentle filter with shorter total impulse response duration.

I've stated before that I believe digital upsampling filters these days are really quite subtle in effect if even detectable the vast majority of the time. Furthermore, they're only relevant for 44.1/48kHz samplerate; at 88.2/96+kHz, any reasonable form of upsampling would not be audible. Remember, the data from the blind test a couple years ago between sharp minimum phase versus linear phase filters did not show significant difference among respondents despite very obvious pre-ringing with the linear phase filter. Having said this, I have experimented over the years and I've settled on a setting that works for me both intellectually and sounds great for upsampling:


A filter steepness of 8 is not that strong. Using these settings, for a 44.1kHz signal, there's only marginal high frequency roll-off below 20kHz; about -1.75dB by 20kHz which would not be audible given the ear's reduced sensitivity (compare with the simulated MQA filter with around -2.75dB). Notice the "Cutoff shift" parameter which implements a passband reduction to 98%; this in effect shifts the curve towards the left, reducing the amount of potential aliasing at the expense of earlier high-frequency roll-off (when I think of the term "apodizing"**, it is in regard to this effect of lowering the passband - I don't see MQA doing this). With the "Pre-ringing" set at 1.0, my preference is a linear phase:

Not bad I think; about 50-70µs of low-amplitude pre-ringing which would only show up in poorly low-passed "music" (typically not audiophile approved masterings!). Remember that it's not only duration that determines audibility - amplitude needs to be examined too. In a linear phase filter, because the ringing is split between pre- and post-ringing, the intensity of the pre-ringing is less than the magnitude of the post-ringing in a minimum phase filter, where all the energy is concentrated after the impulse. Furthermore, high frequencies will not suffer from phase shifts with a linear phase filter unlike the minimum phase type used in MQA. Finally, the filter setting is strong enough to not result in much potentially audible aliasing. For example, here's the 16/44 19kHz sine wave at 0dBFS upsampled to 24/192:

Yes, the 25kHz alias tone is there, but significantly more attenuated compared to the simulated MQA filter (-35dB suppression versus around -15dB for the MQA-like setting above). The more gentle roll-off should satisfy those who feel a very steep "brick-wall" filter sounds artificial. I think I'd be very happy with a DAC if a filter like this were implemented with high precision and is free of intersample overloads in hardware.
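If it's not obvious where that 25kHz tone comes from, here is a minimal sketch, assuming the 19.1kHz test tone used earlier and a 4x upsample with no filter at all (zero-stuffing). With the filter removed entirely, every image of the tone shows up at full strength, and the digital filter's job is to push everything above the original Nyquist frequency back down.

```python
import numpy as np

FS = 44_100      # source sample rate in Hz
TONE = 19_100    # test tone in Hz
FACTOR = 4       # upsample to 176.4 kHz

n = np.arange(FS)                        # one second, so FFT bins are 1 Hz apart
x = np.sin(2 * np.pi * TONE * n / FS)

stuffed = np.zeros(len(x) * FACTOR)
stuffed[::FACTOR] = x                    # zero-stuffing: upsampling with no filtering

spectrum = np.abs(np.fft.rfft(stuffed))
freqs = np.fft.rfftfreq(len(stuffed), d=1.0 / (FS * FACTOR))

# Keep every bin within 6 dB of the strongest one.
peaks = [int(round(f)) for f in freqs[spectrum > 0.5 * spectrum.max()]]
print(peaks)   # [19100, 25000, 63200, 69100]
```

The first image sits at 44.1kHz minus 19.1kHz, which is the 25kHz tone in the plots. How far down it ends up depends entirely on how steep the filter is just above 22.05kHz, hence the difference between the two settings compared here.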

Have fun listening if you want to experiment and let me know whether you hear any differences and have clear preferences...

------------------------

Yet again, not as much of a "quickie" post as I had hoped when I started writing this... I'll try to make sure we talk about something else not MQA next time :-).

This week, I ran across this lawsuit filed from the Spinal Tap folks. Media business sounds like dirty business... Got me to pull out my This Is Spinal Tap soundtrack, not a bad DR13 remaster from 2000, at a time when massive dynamic compression was all the rage. Still holds up pretty well.

A friend also gave me the recent Drive-By Truckers album American Band which received good reviews last year. Southern rock with a political bite. No matter where one stands, I think it's a good thing as the world (especially the Western nations) confronts questions of freedom, truth, morality, honesty, rights, acceptable tolerance, privilege and identity. While admittedly I would not want to listen to a steady diet of albums with political content one way or another, I do appreciate social commentary in art to stimulate thought and debate. Whether one leans right or left, I think we can all agree that there is no excuse for the CD to be dynamically compressed to DR6 with audible distortion :-(.

Have a wonderful week enjoying the music, everyone!

** ADDENDUM: Speaking of the "apodizing" filter, remember that this term, at least in the way we commonly use it in audio these days, began with Meridian and the release of their 808i.2 Reference Signature CD player about a decade ago (2008-2009 timeframe). Measurements for this device can be found from the HifiCritic and in Stereophile. There's also independent measurement and discussion at Mr. Apodizer's Blog back in 2011 (including frequency response and phase shift of the minimum phase filter).

Of course back in those days, everyone was curious about the sound which started the trend for DACs to include various filter options like what we have these days.

If you're wondering, we can indeed model that "Classic Meridian Apodizing Filter" with iZotope RX as well:



By reducing the passband to 96%, we can still maintain flat response beyond 20kHz while using a not-too-steep filter with post-ringing similar to the measurements in the articles. There is practically complete aliasing suppression by Nyquist at 22.05kHz. Of course, the impulse response is a minimum-phase one.

It is interesting how compared to the simulated MQA setting above, it looks like Meridian has abandoned their "apodizing" filter of yesteryear which did an excellent job with suppressing aliasing distortion in favour of the new MQA filter which is much inferior in filtering out aliasing but shorter in duration. This idea of a shorter duration filter (likely determined by a "nicer" looking, less ringing, impulse response) of course has become their mantra of late in emphasizing "time domain performance".

As usual, feel free to experiment and listen... Just be prepared to be underwhelmed by the likely lack of audible difference.

13 comments:

  1. Hey man,

    Not quite sure if you meant it literally or not, but your comment about the only “…‘magic’ in MQA is that they can reconstruct ultrasonic frequencies embedded in the signal higher than what is assumed when we look at a 44.1 or 48kHz PCM file because it knows how to decode that information hidden in the lower bits of the signal (typically under the noise floor).” is not complete. To borrow from an audioXpress guest editorial, “Think of MQA decoding as a three-legged stool. The first leg is detecting that an unsullied MQA data stream is present…The second leg is "unfolding," or unpacking the hidden HRA data…(while) The third leg occurs in an MQA-enabled DAC, where the MQA software provides compensation or deconvolving of temporal damage done to the audio by the DAC itself.” You cover the 1st and 2nd “legs,” but that 3rd leg methinks you missed… ;)

    P.S. - Since you clearly like your Dynamic Range meter, there are plans afoot by my friend, the founder of the Pleasurize Music Foundation and inventor of the DR meter, to create a modern version.

    1. Hello OMas,
      I said 'magic' in jest of course. Because there really is no magic here... I see your editorial:
      http://myemail.constantcontact.com/The-Audio-Voice-116--MQA-Streaming-on-Tidal--Fresh-News-from-NAMM-.html?soid=1104292817535&aid=WW2ZuRdl-zk

      Sure, I can accept the 2 legs of that stool - detection of signal, and unfolding. But show me that 3rd leg of "compensation or deconvolving of temporal damage". I don't see it so far as per my look at hardware decoding with the Mytek Brooklyn last week. Hopefully MQA can shine a light on this...

      Great to hear a new DR version is coming!

  2. I don't think Monty's video addresses the MQA temporal claim completely. The MQA paper has a diagram and accompanying text describing triangular overlapping sampling compared to discrete sample periods (sorry, I don't have a link), and it illustrates the time "blurring" that can occur in discrete sampling, whereas the triangular sampling can potentially pinpoint at what time within the sample period a pulse occurs. Whether this more accurate timing is audible or not is another matter.

    1. Hi CBee,
      Seriously, don't worry about the triangular sample point diagram in the Sound on Sound article. My suspicion is that all they're saying is that instead of a single sample representing the usual frequencies governed by Nyquist, they can expand the frequencies represented with the MQA decoding. This is not in itself time-domain accuracy and I suspect that for the frequencies >22.05/24kHz, the accuracy is lowered (I interpret the fact that the triangles overlap in that diagram as an indication of loss of time domain performance at high frequencies beyond Nyquist being decoded). Remember, we're basically trying to decipher an illustration in an "infomercial" type ad. The author is clearly wrong when he implies positional time differences somehow cannot be shorter than successive samples.

      Remember, at 16/44, for frequencies below Nyquist, time domain performance is on the order of 60 picoseconds which is what Monty's video is showing for that ability to place the signal between sampling periods. (See the old post: http://archimago.blogspot.ca/2015/10/musings-meditations-on-limitations-of.html). This time resolution is already beyond the 5-10µs threshold the Kunchur papers speak of.

      And since I find no evidence of time domain "correction" otherwise as per the post last week... Well then, I think it's going to be up to MQA to justify their claims.

    2. I can’t let go of this yet, just this final post! The triangular sampling shape can be thought of as a scaling factor so that anywhere where the 2 triangular samples overlap the sum of the 2 is equal in value to the peak of the triangle. So if the pulse occurs on the overlapping slope of sample 0 where its value is 2/3rds of the peak value and sample 1 is 1/3rd, this enables the time where the pulse occurred to be more accurately ascertained.

      That is what they imply in their Audio Engineering Society Convention Paper 9278.

      I’m not supporting MQA, just pointing out that this seems to be the basic idea behind this part of their deblurring process. No more on MQA from me!

    3. In a properly band-limited signal, no such pulses can occur in the first place, so attempting to record them is pointless.

    4. The pulse is just used to explain the method.

    5. Well, interesting theory about the triangular shape and all...

      Hope MQA can clarify the situation in any event.

      All I see is a standard PCM signal with some high-frequency recovery. And the filter is just as described above.

      Mans: Yup. The only time of course is with those unfortunate poor recordings with square waves these days :-(.

  3. I think that if Meridian were a little (a lot?) more transparent about the workings of MQA, then perhaps some of us would be less inclined to cynicism.

    However, so long as the inner workings are shrouded in secrecy, and the audio community has to reverse engineer the process to establish its credibility, we can never be sure of the true effectiveness of MQA because we have to rely on being told how good it is.

    It is my experience however, that when you have very little to sell, the more you can hype your product and surround it in a veil of secrecy, the more likely is it to succeed, particularly in a world fool of audiophools.

    1. Hey Tony,
      We'll see how the market accepts this one. I think the problem with MQA is that they overstated things and threw in too much hype around claims of time-domain accuracy and just in general claims of "revolutionary" technology.

      I used to think that maybe a sophisticated DSP time-domain correction algo could be interesting. But without evidence of anything special, what's left seems to be the "origami" compression technique of putting data into the lower bits and then unfolding. Not unreasonable for streaming audio I suppose.

      Otherwise, there does not appear to be anything all that interesting from the sound quality perspective. Even the quality of the filter as per the Mytek Brooklyn measurements isn't that interesting from a sound quality perspective IMO.

  4. JohnW, over at pinkfish, and I quote "I read some of "archimago" postings and as a result dont give him much time as he blankly discounts certain aspects of Digital Audio that I do know have a significant impact on audio quality.". Can't help wondering exactly how he 'knows'?

  5. I would say all the typical things like hearing quality in higher bit depths, higher sampling rates, different coding schemes (EG DSD), ultrasonics, impulses etc.

    That's usually what it comes down to if they are an "everything makes a difference" kind of person.

