Saturday, 6 April 2019

MEASUREMENTS: Roon 1.6 Upsampling Digital Filter Options & A Discussion on "Signal Path" Quality...

As discussed last month, I've started using Roon as my main music player for the sound room recently. Back in the days of Roon 1.2, many users performed upsampling using HQPlayer. While HQPlayer integration is still available (go to Settings --> Setup to access the installation option), since version 1.3, Roon has incorporated its own DSP samplerate conversion which I suspect would be completely adequate for the majority of users.

I was curious about the upsampling digital filter options available in Roon. If you look at the "Sample Rate Conversion" control panel, we see the four main "Sample Rate Conversion Filter" settings:

On the left panel, notice that Roon allows you to select the different DSP options and add various filters to the "chain" (left lower panel). "Headroom Management" is always available if needed; it lets you set the amount of attenuation applied to prevent clipping during DSP processing. The default setting is a very reasonable -3dB.

On the right panel with "Sample Rate Conversion" checked, it's nice to see straightforward descriptions of the four filter options. "Precise" implies that the filter is "clean", that it can do the job of removing "aliasing"/imaging in the reconstruction. In contrast, one would expect the "Smooth" setting (we'll talk about this adjective a little later) to have a shorter impulse response, which perhaps looks "prettier" in magazines.

Based on the impulse responses, it certainly looks like the filters are as described. We can already predict that the "Precise" versions will be steep filters and should provide very good suppression of signals above Nyquist. The "Smooth" options are linear and minimum phase short tap-length filters that will allow quite a bit of ultrasonic leakage.

Let's have a look with the Digital Filter Composite (DFC) graphs to check on the filter performance... For consistency with recent articles, I'll use the good Oppo UDP-205 "Linear Phase Fast" as the comparison DFC:

Nice, as we've seen before, this is a non-overloading filter option built into the Oppo player. If I were to choose one filter to use based on the standard options for this device, this would be it.

As mentioned above, Roon does provide the "Headroom Management" feature. I like the options of turning it on/off as well as selecting the amount of attenuation to apply. Sometimes, one knows that it's unnecessary to provide extra headroom. For example, when I use volume normalization/leveling (such as ReplayGain), we can be quite sure that those nasty DR6 and lower dynamically compressed albums will have been attenuated (say, to an average loudness of -18 LUFS), and there's typically no need to use this feature.
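To put rough numbers on that reasoning, here is a minimal sketch of the arithmetic. It is not Roon's actual leveling code: the function names, the example loudness figure, and the assumed +3dB worst-case intersample overshoot are all illustrative assumptions.

```python
def leveling_gain_db(measured_lufs, target_lufs=-18.0):
    """Gain in dB that a loudness-leveling scheme would apply to reach the target."""
    return target_lufs - measured_lufs


def peak_after_leveling_db(sample_peak_dbfs, gain_db, overshoot_db=3.0):
    """Estimated reconstructed peak after leveling.

    overshoot_db is a hypothetical allowance for intersample overs on top of
    the highest sample value; it is an assumption, not a measured figure.
    """
    return sample_peak_dbfs + overshoot_db + gain_db


# Hypothetical loud, heavily compressed album: about -8 LUFS, sample peaks at 0 dBFS.
gain = leveling_gain_db(-8.0)
peak = peak_after_leveling_db(0.0, gain)
print(f"leveling gain {gain:+.1f} dB, estimated peak {peak:+.1f} dBFS")
```

In this simple model the loud album ends up with several dB of spare headroom from leveling alone. A quiet album measuring below the target would get positive gain instead, which is the less common case where extra headroom could still matter.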

Presumably Roon knows that their filters will overload without the use of "Headroom Management", hence the feature. Indeed, this is what the Roon "Precise, Linear Phase" DFC looks like without a little attenuation.

We can see the intersample overloading with the wideband white noise containing 0dBFS digital peaks. Notice that I've set Roon to upsample to 176.4kHz (integer 4x upsampling). This is why you see the second roll-off of the noise around 88.2kHz (while the RME ADC is capturing at 192kHz for frequencies up to 96kHz).
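For anyone wondering how samples that never exceed 0dBFS can still overload an upsampling filter, here is a small sketch using the textbook worst-case tone: a sine at one quarter of the sample rate, sampled 45 degrees away from its crests. The windowed-sinc interpolator below is a generic illustration (Hann window, 4x factor, and tap span are my assumptions) and has nothing to do with Roon's actual filters.

```python
import math


def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def upsample(samples, factor=4, half_width=32):
    """Hann-windowed sinc interpolation onto a grid 'factor' times finer."""
    out = []
    for k in range(len(samples) * factor):
        t = k / factor                       # position in input-sample units
        base = int(math.floor(t))
        lo = max(0, base - half_width + 1)
        hi = min(len(samples), base + half_width + 1)
        acc = 0.0
        for n in range(lo, hi):
            d = t - n
            window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
            acc += samples[n] * sinc(d) * window
        out.append(acc)
    return out


# fs/4 sine sampled between its crests, scaled so the samples peak at 0 dBFS.
raw = [math.sin(math.pi * n / 2 + math.pi / 4) for n in range(256)]
scale = max(abs(v) for v in raw)
samples = [v / scale for v in raw]

upsampled = upsample(samples)
middle = upsampled[256:768]                  # skip edges where the kernel is truncated

sample_peak_db = 20 * math.log10(max(abs(v) for v in samples))
true_peak_db = 20 * math.log10(max(abs(v) for v in middle))
print(f"sample peak {sample_peak_db:+.2f} dBFS, reconstructed peak {true_peak_db:+.2f} dBFS")
```

The reconstructed waveform peaks roughly 3dB above the highest sample value, which is why a few dB of attenuation ahead of the filter avoids the overload for a signal like this. Real music and other test signals can overshoot by different amounts.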

Knowing the potential for intersample overload, let's just turn Headroom Management on with -3dB attenuation, and have a look at each of the Roon filter setting DFC graphs to compare. Since we know 3dB of attenuation has been applied, there's no need to include that -4dB wideband white noise signal overlay.

As expected, the "Precise" filters provide very good filtering with essentially no ultrasonic leakage. They are in fact steeper/faster than the Oppo's "Linear Phase Fast" setting.

The difference between linear and minimum phase can be demonstrated by the difference in phase response:

Quick and dirty using REW to check phase...
Ideally, the two "Linear" filters should be perfectly flat. Remember that this is an actual measurement: a 48kHz sweep signal went through the upsampling DSP and the Oppo DAC, and the analogue output was then captured with the RME ADC @ 48kHz, so please forgive the imperfection.

What's clear from the graph are the two minimum phase plots showing negative degree phase shifts as frequencies increase. While the effect isn't massive, remember that this change is a given with minimum phase filtering. The phase shift will be more significant with a steeper filter like the "Precise" than the "Smooth" setting.
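As a toy illustration of the difference (not a model of Roon's filters), the sketch below builds two 3-tap FIR filters with identical magnitude response: one with symmetric taps, hence linear phase, and one with both zeros moved inside the unit circle, hence minimum phase. The tap values and the `r = 0.5` zero radius are arbitrary choices for the example.

```python
import cmath
import math

r = 0.5                                   # arbitrary zero radius for the toy example
h_linear = [1.0, r + 1.0 / r, 1.0]        # symmetric taps: zeros at -r and -1/r
h_minimum = [1.0 / r, 2.0, r]             # same magnitude, both zeros at -r


def response(h, w):
    """Frequency response of FIR taps h at w radians per sample."""
    return sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(h))


def phase_delay(h, w):
    """Phase delay in samples at w radians per sample."""
    return -cmath.phase(response(h, w)) / w


for frac in (0.25, 0.5, 0.75):            # fractions of the Nyquist frequency
    w = frac * math.pi
    print(
        f"{frac:.2f} x Nyquist: |H| {abs(response(h_linear, w)):.3f} vs "
        f"{abs(response(h_minimum, w)):.3f}, phase delay "
        f"{phase_delay(h_linear, w):.3f} vs {phase_delay(h_minimum, w):.3f} samples"
    )
```

The symmetric filter delays every frequency by exactly one sample, so once that constant delay is removed its phase is flat. The minimum phase version delays different frequencies by different amounts, which is what shows up as a frequency-dependent phase shift in a measurement.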

As I discussed about a year back, IMO, the whole premise with minimum phase being somehow preferable by showing off impulse responses without "pre-ringing" is another one of these questionable claims sold to audiophiles when the Industry needed something "new" to use for marketing purposes somewhere around 2009. The Industry unfortunately neglected to tell audiophiles that with well mastered music, there is no "pre-ringing" during typical DAC playback!

Remember that I'm not saying that pre-ringing isn't ever present or completely inaudible. There are special instances when pre-ringing could affect the sound - like applying EQ (as per this example). As discussed here and here, for filter tweakers, feel free to try "intermediate phase" filtering if you still want some pre-ring suppression using steep filters without as much phase shift. To me this is a reasonable balance and sounds great :-).

The "Smooth" filters will not remove all the imaging artifacts. Realize that the word "smooth" does not imply that the music is somehow "less grainy" or somehow more subjectively pleasant! It just means the suppression is more gentle rather than a "brick wall" or "cliff" as you can see with the curvature of the wideband white noise in the DFC graphs above.

During music playback, the imaging doesn't necessarily look "smooth" at all. The ultrasonic image presents itself as "mirror frequencies" particularly of the high frequency content. Therefore, big dips or fluctuations will show themselves in certain situations - for example:

This is similar to what was shown with MQA processing awhile back using Beyoncé's music off Lemonade. Notice the presence of image frequencies beyond 22.05kHz with the weaker so-called "Smooth" filter. As you can see, there's nothing particularly smooth about that deep valley at 22kHz.
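The "mirror" arithmetic is easy to sketch. Assuming a 44.1kHz source upsampled to 176.4kHz, content at frequency f has images at k x 44.1kHz ± f, and a weak filter lets some of that through. The helper name below is made up for illustration.

```python
def image_frequencies(f_hz, source_rate_hz=44100, output_rate_hz=176400):
    """Image frequencies of f_hz that fall at or below the output Nyquist."""
    nyquist_out = output_rate_hz / 2
    images = set()
    k = 1
    while k * source_rate_hz - f_hz <= nyquist_out:
        for candidate in (k * source_rate_hz - f_hz, k * source_rate_hz + f_hz):
            if candidate <= nyquist_out:
                images.add(candidate)
        k += 1
    return sorted(images)


for tone in (1000, 10000, 20000):
    print(tone, "Hz ->", image_frequencies(tone), "Hz")
```

Since music usually carries little energy right at the top of the audio band, the region just above 22.05kHz, which mirrors that top band, is correspondingly empty. That is consistent with the valley seen around 22kHz.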

The "Smooth" setting might cater to folks who don't mind or even prefer some ultrasonic distortion. NOS DAC lovers, Ayre "Listen" filter enthusiasts, Pono-filter admirers, and of course MQA aficionados, the "Smooth, Minimum Phase" setting will get you some of that "magic" :-)!

As you can see, it's all very civilized. This is one thing I like about Roon. While it packs quite a number of features under the hood, the software is doing it with clearly quite a bit of consideration to the advanced audiophile. I like that they're not overloading the user with too many options. And I hope they stay away from the computer audiophile voodoo some preach (keep an eye on ex-AudioQuest employees... Just sayin'.).

Bottom line: I'm happy to listen to my 44.1/48kHz music with 176.4/192+kHz upsampling using "Precise, Linear Phase". It has steeper "brick wall" filtering than Oppo's default and provides better anti-imaging ability. When I use "volume leveling" in my day-to-day listening with a target average volume of -18LUFS (usually through ReplayGain tag, but Roon's own scanning during playback would be just fine), I usually will keep "Headroom Management" off.

Of course, if Roon wanted to go the extra mile, they could add a "Precise, Intermediate Phase" option and I think it'd be cool :-).


While I have praised Roon for much of what the software is doing... I do take a small issue with this:

As you can see, Roon is calling this a "Low Quality" signal path because the source is a 320kbps MP3 file decoded as 24/44.1. As audiophiles, I agree that we should be targeting lossless audio for our music libraries, and while I would not consider MP3 "high quality", I think it would be more useful to differentiate "low quality" 128kbps from something like 320kbps MP3, which is clearly qualitatively much better. Remember, MP3 at 320kbps is essentially indistinguishable from CD-quality FLAC for the vast majority of people - "perceptually lossless" you might say :-).

Remember the blind test here years ago? The overall result is basically the same with every blind test of high-bitrate MP3/AAC vs. lossless. The words we use are important and I think as audiophiles we should remain realistic in how we express qualitative statements. This will improve our credibility rather than have us seen as hysterical or perpetuating FUD. This is like Neil Young declaring streaming audio offers "the worst audio in history" without bothering to specify bitrates or lossy vs. lossless.

I think it would be more reasonable for Roon to show some finesse. Given the choice, I would prefer a well encoded 320kbps MP3 using a recent version of LAME over many audio sources like FM radio, MiniDiscs, or even (gasp!) the average vinyl playback assuming the same master source. [Don't worry analogue audiophiles, I'm not referring to excellent sounding meticulous rigs with clean, pristine vinyl - just remember the objective limits of vinyl of course.]

So as an exercise, here's how I see things if I were to categorize the technical abilities of various digital source formats. While subjective quality is difficult to grade and depends on many factors, I think these technical descriptions can be understood by the educated audiophile at a glance:
1. Lossy - any MP3/AAC/Vorbis CBR/VBR 192kbps or less
2. High Quality Lossy - MP3/AAC/Vorbis >192kbps
3. Lossless - any 16-bit lossless PCM, 24/44.1, 24/48, and DSD64. Some might disagree and prefer to call 24/44.1 "High Resolution". I personally prefer both bit depth and samplerate to be higher to make the hi-res designation unequivocal. While DSD64 can provide better-than-16-bit performance through much of the audible spectrum, it's variable and the rapid rise in noise just over 20kHz keeps the system a bit short of unequivocal "hi-res" IMO (see here for recent DSD measurements).
4. Lossless Plus - Basically you have a "core" lossless component with some extensions added that could be lossy. This is basically what decoded 24-bit MQA is (16-bit MQA / MQA-CD is simply ridiculous). Clearly, it's not true "High Resolution Audio". HDCD with decoding might be another example of this. Who knows, we might have more of these kinds of schemes in the future.
Depending on whether one thinks the "extension" is of value, the sound could be altered and one could certainly prefer the standard "Lossless" variant of an album of course! 
5. High Resolution - PCM with 24 bits and 88.2+kHz samplerate, or DSD128+. These are source formats capable of encoding, without encumbrance, content clearly beyond human auditory ability. Remember that with DSD128, the noise level will still escalate above 40kHz, but this should be easily filtered out and is far from the 20kHz empirical limit of human hearing.
6. Multichannel Lossy - DTS, AC3, AAC more than 2.0

7. Multichannel Lossy Plus - E-AC3 with Atmos or similar lossy format with extensions

8. Multichannel Lossless - multichannel 16-bit PCM, 24/44.1, 24/48, DSD64. These days, typically music would be encoded in Dolby TrueHD, DTS-HD Master Audio, perhaps multichannel FLAC formats. Many multichannel DSD64 SACDs as well.

9. Multichannel Lossless Plus - multichannel lossless with extensions like TrueHD Atmos and DTS:X metadata for object-oriented sound rendering
10. Multichannel High Resolution - multichannel 24-bit, 88.2+kHz PCM, DSD128+, etc... These days we can see stuff like 24/96 multichannel FLAC, Dolby TrueHD, DTS-HD MA which would qualify.

I think a categorical system as above would be a realistic breakdown of the types of audio formats available out there and will remind folks that high bitrate AAC/MP3/Vorbis can sound great even if not of bona fide "Lossless" quality.
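For illustration only, the ten categories could be expressed as a small lookup function along these lines. The function name, parameters, and rule ordering are my own assumptions (for example, checking "extensions" before resolution so that decoded MQA or TrueHD Atmos lands in a "Plus" category); this is not something Roon implements.

```python
def classify(channels=2, lossy=False, kbps=None, bits=16, rate_khz=44.1,
             dsd_multiple=None, extensions=False):
    """Map a source format onto the categories listed above (hypothetical helper)."""
    prefix = "Multichannel " if channels > 2 else ""
    if lossy:
        if channels > 2:
            # The list above has no "High Quality" tier for multichannel lossy.
            return prefix + ("Lossy Plus" if extensions else "Lossy")
        return "High Quality Lossy" if kbps is not None and kbps > 192 else "Lossy"
    if extensions:
        # e.g. decoded 24-bit MQA, HDCD, TrueHD Atmos, DTS:X
        return prefix + "Lossless Plus"
    if dsd_multiple is not None:
        hi_res = dsd_multiple >= 128          # DSD128 and up
    else:
        hi_res = bits >= 24 and rate_khz >= 88.2
    return prefix + ("High Resolution" if hi_res else "Lossless")


print(classify(lossy=True, kbps=320))         # 320kbps MP3
print(classify(bits=24, rate_khz=96))         # 24/96 stereo FLAC
print(classify(channels=6, dsd_multiple=64))  # multichannel SACD rip
```

With a scheme like this, a 320kbps MP3 would be labelled "High Quality Lossy" rather than being lumped in with 128kbps files.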

Furthermore, notice that I've included surround formats as well for completeness. I believe, as audiophiles, all of these formats, including multichannel should be of interest even if one is currently not able to implement the extra channels. The "divide" between audiophilia and "home theater" is simply artificial IMO. Great to see that Roon is able to embrace multichannel playback (PCM and DSD only as far as I am aware, not the Dolby or DTS formats).

That's all for now. Have a great week and remember to enjoy the music. Listening to some Christoph Beck Quartet Reflections (contemporary jazz) tonight as I finish this post... :-)

PS: Remember ladies & gents, this is the last month to get your results for the "Do digital audio players sound different? (Playing 16/44.1 music.)" blind test in.


  1. Totally agree with the "low quality" thing in Roon.

    1. Yeah man...

      It bugs me when "quality" as a subjective characteristic is being judged just on the basis of the encoding format. It's one thing to call MP3 128kbps "low quality" (even though at times with certain types of music it's still not too bad), but 320kbps is another matter!

  2. You have a Mytek Brooklyn, if I remember correctly? Any thoughts (or measurements, better still) on the Mytek filters, particularly APDZ? Many thanks in advance

    1. Hi George,
      No, I do not have a Mytek presently. A friend a couple years back helped with using the Brooklyn for MQA decoding testing but I personally have not had the opportunity to measure the device...

  3. The REW phase responses will be very dependent on the timing reference used and any delays relative to the reference. If measured without a reference, Estimate IR Delay will make a good guess of any time delay in the measurement by cross correlation with a minimum phase estimate, provided the measurement bandwidth is sufficient.

    1. Thanks John,
      Yes, that's the issue with my phase measurement above. I did my best to coordinate the Roon playback of the sweep signal through the Oppo with the computer running REW capturing the signal back through the RME...

  4. Coordinating is a little easier with the latest beta releases:

    There is still the issue of identifying and removing the time delay, but at least with a set of measurements using a common timing reference they can all be given the same adjustment.

  5. Hi Archi.

    Again a great post, one that also saves me the time of doing it myself ;-).

    As mentioned relatively often, a “serious” mastering (referring to signal level) will not create inter-sample overs (as has to be the case with all MfiT masters) as it takes this into account in the final limiter,

    and as mentioned not so often, a “serious” mastering (referring to the bandwidth) will not trigger any ringing of a DAC's digital filter, as the final dithering / lowpass filter in the mastering process will attenuate anything close to ½ Nyquist.


  6. @archimago - Not sure if you look at comments on older posts. Posting this question here since it fits the context of this blog post. I noted in a few of your posts that you use volume leveling and, sometimes, headroom management in Roon. If volume leveling is used then is headroom management also necessary?

    Roon seems to apply both, so that if you have headroom management and volume leveling on then you'll get more volume reduction than might be intended. For example, I have convolution running for room correction that suggests -7dB headroom reduction but I also have volume leveling running at -14 LUFS. If I have both on then I'll get volume leveling to -14 LUFS AND an additional 7dB of attenuation. In your experience, is volume leveling, at a sufficiently low level, adequate to protect from clipping?

    1. Hi Doug,
      Great question which is important as well.

      Basically, if you're using volume levelling, then NO, there is no need to apply headroom management as well. I suspect there could be situations where headroom issues still arise, but these would be extreme and unusual!