Thursday 11 April 2019

COMPARISON: Roon DSP speed - Intel i5-6500 vs. Intel i7-7700K... (and the value of Intel Speed Shift!)


As I mentioned a little while back when I wrote about Roon, I was about to receive a "drop in" Intel i7-7700K CPU for my Server machine which runs Roon Core. I was able to find the i7-7700K used for a decent price, and I didn't feel like dismantling the machine to replace the Z170 motherboard, which I would have had to do for anything newer since the latest CPUs need a Z3XX series board. Furthermore, for me, one of the least interesting "jobs" one has to manage as a technophile is reinstalling the operating system and software again... I try my best to avoid this mundane task :-(.

Note that if I were to rebuild my Server these days, I'd probably consider something like the very affordable Core i5-9600K with 6 cores. In fact, for most applications, this CPU will beat out the i7-7700K, and I suspect the same would apply when using Roon for DSP.

I. A quick look at the Intel i5-6500 and Roon DSP...

For today's comparison, to keep the test relatively controlled, I took a 24/88.2 SACD rip of Zubin Mehta & LA Philharmonic's Holst: The Planets as the source audio file. I then used Roon Core's DSP system to upsample to either PCM768 or DSD512, playing to my little SMSL iDEA DAC on my Workstation computer across the home ethernet (running Roon as an endpoint of course). For upsampling, I used the "precise, linear phase" setting.
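To put those target formats in perspective, here is a rough sketch of the raw audio payload each one implies on the network. It assumes two channels, ignores any protocol overhead, and takes DSD512 to mean 512 x 44.1kHz at 1 bit per sample. The function names are mine for illustration and have nothing to do with Roon's internals.

```python
def pcm_bitrate(sample_rate_hz, bit_depth, channels=2):
    """Raw PCM payload in bits per second (no container or network overhead)."""
    return sample_rate_hz * bit_depth * channels


def dsd_bitrate(multiple, base_rate_hz=44_100, channels=2):
    """Raw DSD payload in bits per second; DSD is 1 bit per sample per channel."""
    return multiple * base_rate_hz * channels


if __name__ == "__main__":
    rates = {
        "Source 24/88.2": pcm_bitrate(88_200, 24),
        "PCM 32/768": pcm_bitrate(768_000, 32),
        "DSD512": dsd_bitrate(512),
    }
    for name, bits_per_second in rates.items():
        print(f"{name}: {bits_per_second / 1e6:.2f} Mbit/s")
```

Under these assumptions the source file is about 4.2 Mbit/s, while both targets land in the 45-49 Mbit/s range, so the server has to generate and stream more than ten times the original data in realtime.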

Here are the results with the i5-6500 processor showing the Roon "Signal Path" and processing speed for 32/768 final output:


Remember that a processing speed of "1x" indicates that the machine can operate in realtime. For reasonable trouble-free headroom, I think a value of "1.5x" would be a good minimum to avoid playback issues.

As you can see, the little i5 was able to convert the 24/88.2 --> 32/768kHz at "1.2x". Sadly, it dropped down to "0.9x" when I activated the convolution filter for room correction. This is obviously not good enough to maintain stable playback. After a number of seconds, once the buffer has slowly whittled down, playback stutters and Roon eventually warns of an error.
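The "buffer whittling down" behaviour can be sketched with some simple arithmetic. If the DSP only produces 0.9 seconds of audio for every second played, the buffer drains at 0.1 seconds per second. The 10-second buffer below is purely a hypothetical figure for illustration; I don't know how much audio Roon actually buffers.

```python
import math


def seconds_until_underrun(buffer_seconds, processing_speed):
    """Estimate how long playback lasts before the buffer runs dry.

    buffer_seconds   -- audio already buffered when playback starts (hypothetical)
    processing_speed -- Roon's "processing speed" figure, where 1.0 is realtime
    """
    if processing_speed >= 1.0:
        # The DSP keeps up with playback, so the buffer never drains.
        return math.inf
    drain_rate = 1.0 - processing_speed  # seconds of audio lost per second played
    return buffer_seconds / drain_rate


if __name__ == "__main__":
    for speed in (1.2, 0.9, 0.7, 0.6):
        remaining = seconds_until_underrun(10.0, speed)
        print(f"{speed}x: {remaining:.0f} s before stutter")
```

The closer the speed is to "1x", the longer it takes for trouble to show up, which is why a marginal setup can seem fine at first and then start stuttering well into a track.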

Things get even worse with the i5 when doing realtime PCM --> DSD playback:


Notice that it can manage 24/88.2 --> DSD256 at "1.3x". However, DSD512 was clearly too much for the i5 to handle, with only "0.7x" for straight conversion and "0.6x" once we throw in the convolution filter.

Let's now see what sticking an i7-7700K in the machine can do!

II. Intel i7-7700K as Roon DSP CPU...

So I put the i7 CPU in the motherboard; took about 20 minutes to open the machine, remove the heatsink, drop in the CPU, reapply fresh thermal paste, and reattach the cooler. No change to BIOS (running CPU at default speed). No change to the Windows Server 2016 OS. Fired it up and had a look at a simple DSP task; conversion from 24/88 to DSD64. Here's a side-by-side screenshot with the i5 on the left and i7 on the right (click image to zoom):


"2.8x" for the i5-6500 and "8.2x" for the i7-7700K! I smiled. :-)

Clearly, DSD64 isn't much of a challenge for the i7. How about when we look at PCM768 and DSD512?




As you can see, the PCM768 results are reasonable. "2.5x" with just straight upsampling, and "1.8x" with convolution applied. But I was rather disappointed by the DSD256/512 results... "1.7x" for DSD256 with convolution is clearly better than the i5's DSD256 without convolution. But the DSD512 conversion was only "1.1x" and it only squeezed by at "1x" when I turned on the convolution DSP!!! Sure, it's an improvement over the i5 but not good enough for trouble-free operation... Something seemed odd.

III. i7-7700K with "High Performance" power plan...

Something I noticed when watching the Server machine's CPU performance while Roon performed the DSP was that using the usual "Balanced" power plan, the computer's clockspeed remained unusually low:


Interesting that while the PCM-to-DSD512 conversion is barely achieving "1x" performance, we see that the CPU is hardly being taxed! In fact, it's only running at 1.09GHz when maximum non-turbo speed is actually 4.2GHz. Furthermore, none of the CPU graphs show any of the cores are working hard at all!

Clearly, the i7 can do better than this...

Remember that modern CPUs throttle down for energy saving which is the basis of that default "Balanced" power plan. The logical step then was to force the machine to run at full clockspeed all the time. We can easily do this by changing the power plan to "High Performance":


Notice that now the CPU speed is running at a full 4.2GHz (up to 4.5GHz turbo). And look what happened to the upsampling performance:




24/88.2 --> PCM768 now at "4.6x", PCM768 with convolution "2.7x".
24/88.2 --> DSD256 with convolution "3.6x".
24/88.2 --> DSD512 "2.1x", with convolution "2x".

Now that's more like it!
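For those who would rather script the change than click through Control Panel, here is a minimal sketch using the Windows powercfg tool and its built-in plan aliases (SCHEME_MIN is the alias for "High performance", i.e. minimum power savings). The helper names are hypothetical, and this is not how I made the change for the tests above; I used the GUI.

```python
import platform
import subprocess

# Built-in powercfg aliases for the three stock Windows power plans.
POWER_PLAN_ALIASES = {
    "balanced": "SCHEME_BALANCED",
    "high_performance": "SCHEME_MIN",  # minimum power savings
    "power_saver": "SCHEME_MAX",       # maximum power savings
}


def build_powercfg_command(plan):
    """Return the powercfg command line that activates the given plan."""
    return ["powercfg", "/setactive", POWER_PLAN_ALIASES[plan]]


def set_power_plan(plan):
    """Activate a power plan; only meaningful on a Windows machine."""
    if platform.system() != "Windows":
        raise RuntimeError("powercfg is only available on Windows")
    subprocess.run(build_powercfg_command(plan), check=True)


if __name__ == "__main__":
    # Print the command rather than running it, so nothing changes by accident.
    print(" ".join(build_powercfg_command("high_performance")))
```

Calling set_power_plan("balanced") afterwards puts the machine back on the default plan.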

IV. What's going on? And how can I save energy while keeping speed up?

Clearly, there's something funny with the Intel "SpeedStep" system (these days known as EIST, "Enhanced Intel SpeedStep Technology"), which varies the CPU frequency and voltage when under reduced load. It's as if the OS/CPU is not responding well to the DSP demands, taking too long to rev up the engine to meet the realtime need.

I then noticed something interesting that has been available on Intel processors since the 6th Gen Skylake processors and was improved in the 7th Gen. Something called "Intel Speed Shift Technology". You can read about it here (and the 7th Gen improvements). Basically, it's a much quicker system that can respond to CPU demands by increasing clockspeed with significantly lower latency (speed ramps up within 10-15ms). It's meant to manage these "bursts" of activity; you can imagine that for Roon DSP, what it has to do is process maybe a few hundred milliseconds of PCM-to-DSD data at a time to fill the buffer, and do so at a steady pace while streaming the data across the ethernet. Presumably the standard SpeedStep system and OS are not able to respond quickly enough or detect that the CPU should maintain the higher speed in order to fulfill the realtime streaming needs.

By default, my Gigabyte motherboard had Speed Shift off. Simply go into the BIOS to enable:


Since late 2015/early 2016, Windows 10 and Windows Server 2016 already have the drivers to support this feature.

So what did this do?




All's good in the world of Roon DSP again.

24/88.2 --> PCM768 at "2.9x". With convolution "2.5x".
24/88.2 --> DSD256 at "3.1x". With convolution "3x".
24/88.2 --> DSD512 at "1.9x". With convolution "1.9x".
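To see all the numbers from this post in one place, here is a small sketch that tabulates the processing speeds reported above and flags anything under the "1.5x" headroom I suggested earlier. The configuration labels and function name are just my own shorthand, and I'm only including results that were actually measured.

```python
HEADROOM = 1.5  # suggested minimum processing speed for trouble-free playback

# Processing speeds as read off the Roon "Signal Path" in each section.
RESULTS = {
    "i5-6500": {
        "PCM768": 1.2, "PCM768+conv": 0.9,
        "DSD256": 1.3,
        "DSD512": 0.7, "DSD512+conv": 0.6,
    },
    "i7-7700K, Balanced": {
        "PCM768": 2.5, "PCM768+conv": 1.8,
        "DSD256+conv": 1.7,
        "DSD512": 1.1, "DSD512+conv": 1.0,
    },
    "i7-7700K, High Performance": {
        "PCM768": 4.6, "PCM768+conv": 2.7,
        "DSD256+conv": 3.6,
        "DSD512": 2.1, "DSD512+conv": 2.0,
    },
    "i7-7700K, Speed Shift on": {
        "PCM768": 2.9, "PCM768+conv": 2.5,
        "DSD256": 3.1, "DSD256+conv": 3.0,
        "DSD512": 1.9, "DSD512+conv": 1.9,
    },
}


def below_headroom(config, threshold=HEADROOM):
    """List the tasks whose processing speed falls under the threshold."""
    return [task for task, speed in RESULTS[config].items() if speed < threshold]


if __name__ == "__main__":
    for config in RESULTS:
        flagged = below_headroom(config)
        status = ", ".join(flagged) if flagged else "all tasks have headroom"
        print(f"{config}: {status}")
```

Only the "High Performance" and Speed Shift configurations clear the threshold on every task, with Speed Shift giving up some speed in exchange for letting the CPU idle down.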

Here's what's happening to the CPU when idle vs. Roon DSP active:


Nice... When idle, the CPU is able to throttle down below 1GHz and it's quick enough to detect the need when Roon DSP needs some oomph, in this snapshot revving up to 3.86GHz to satisfy the demand for upsampling to DSD512.

V. The take home message...

1. It looks like the Intel i7-7700K worked out well as a CPU for Roon DSP. No problem upsampling to PCM768 and DSD512 including adding convolution processing for room correction. As discussed in my previous Roon overview, if you're going to use these higher samplerates, you're going to need a processor with decent speed.

2. It's interesting to see that Speed Shift made a significant difference! While most of the time I imagine Speed Shift would only make a small difference, here is an example where the low latency and rapid response really pay off. Make sure this is turned on if you're running a recent Intel machine for Roon.

By the way, I haven't had the chance to see if turning Speed Shift on with the older i5-6500 would have improved the DSP processing speed much. Possible, but given that most benchmarks suggest that the i7-7700K operates at about twice the speed of the i5-6500, I believe at best the i5 would still be only marginal for DSD512 upsampling with convolution turned on.

Anyone know what the latency is like for AMD's processors in comparison?

3. I don't think I would need more DSP power in my system, but to check on the overhead I have, I decided to turn on essentially everything offered by Roon and see how the i7-7700K handles the load. Here's a signal path originating at 24/88 with headroom adjust, samplerate increased to 352.8kHz, crossfeed, an insane 10-band parametric EQ, some Audeze headphone setting, convolution filter and convert to DSD512:


All this and still maintaining a processing speed of "1.9x". I think the i7-7700K is adequate to satisfy my needs for a while. :-)

------------------------------

Big events on the audiophile calendar over the next month starting with AXPONA 2019 this weekend. Coming soon in early May is Munich High End. Will be interesting to read of any new innovations coming our way.

Interesting that the MQA people will be presenting at AXPONA. Let's see...

MQA-CD is done (a rather silly concept). MQA Live last year went after the broadcast market (I don't get it - if customers are paying, why skimp? Just stream lossless 24/96!). Then later last year and early 2019 they spent some energy going after the in-car market (Yeah... Pseudo-hi-res in vehicles sounds innovative!). And we know that Tidal Mobile has MQA-decode for iOS and Android apps now. Yippie.

What's next??? In any event, could trigger interesting discussions.

Heading off to Toronto for the weekend so posting this early. I'll be bringing a copy of the Bob James Trio's Espresso for the plane ride - sounds great, originating from a 24/96 recording and available as a hybrid SACD. Beware: a cheaper, reduced-resolution MQA-CD is also available.

Have a great weekend and week ahead! Enjoy the music...

PS: 80's synthpop sound lives:

6 comments:

  1. Very interesting. I owned a "golden" i7-7700K chip that hardly got hot at 5.0GHz. Kinda bummed out I sold it when I see how well the 7700 still holds up!!

    Replies
    1. Impressive. 5GHz stable, I presume on air would be truly a "golden" chip!

      I haven't tried overclocking at this point... Probably won't for the purpose I'm using it for as simply my file/web server. Roon DSP likely will be the most computationally intensive task for the machine!

  2. Great write-up, Arch! I love it when someone scratches the surface and figures out why something isn't working as well as it should, and then fixes it. Too many people accept mediocre results uncritically and figure the only solution is "more bigger."

    Replies
    1. Thanks Allan,
      Yeah, I think in a world of never-ending product cycles and (at times forced) obsolescence, too little time has been spent on optimization for the things we already have! Or in the world of audiophilia, perhaps recognizing that some things are already not just "good enough" but even beyond what we could really ever perceive.

      Hope all's well down in Oregon :-). I'm waiting for some sun here!

  3. Thanks for publishing these tests. The faster processor would also be advantageous in a multi-room system. At trade shows, we (Benchmark) use a single i5 Roon server to drive 4 or 5 headphone listening stations that are each playing independently. All of the DSP functions are turned off, but the 5 output streams add up to a significant load on the i5, especially when playing higher sample rates. The i7 would be a good choice for any system supporting more than one or two audio streams even if you don't anticipate using the DSP features.

    My recommendations:
    1) Select the i7 if you are using DSP
    2) Select the i7 if you are implementing a multi-room system

    John Siau
    Benchmark Media Systems, Inc.

  4. Thanks for the great review. Is this not an old analogue recording upsampled to whatever high-res format? "24/88.2 SACD rip of Zubin Mehta & LA Philharmonic's Holst: The Planets as the source audio file." I have the original LP, which was a real treat, but sold all my LP equipment. Like many, I use Tidal streaming, and I find the MQA treatment of old analogue recordings pleasing. Better than the original. I agree that if it was recorded at 24/96 and above, there is no need for MQA. Hopefully a full hi-res streaming option will become available on all services. For now, Tidal HiFi and Qobuz are good options.
