Sunday, 1 November 2015

MEASUREMENTS: Windows 10 "Audio Stack" / DirectSound Upsampling



I know, I know... Windows audio mixer sucks... (From the perspective of perfectionist audiophiles.)

But having just done some measurements with Linux and PulseAudio with some of the upsampling algorithms, I wondered just how well the default Windows 10 audio mixer performed as an upsampler...

As a refresher, here are some plots of the digital filter measurements I showed with Linux last time... The first one is the default hardware ALSA output with the Light Harmonic Geek Out V2 using the excellent built-in hardware digital filter. In Windows, this is the same as using the direct ASIO driver. Next is with the speex-float-1 upsampling, and finally the src-sinc-fastest algorithm.




So, the Windows mixer output was set to 24/96 as above, and I used foobar to play the test signals to the Geek Out V2 via DirectSound:

Here's Windows 10's digital filter overlay graph folks:


As you can see, whether the output to the Geek Out V2 DAC was 24-bit or 32-bit made no difference. Not surprising since presumably all Windows is doing is converting whatever internal format the calculations are being done in to the 24/32-bit integer value for the DAC. According to this page, the internal "audio stack" has operated in 32-bit floating point since at least Windows Vista and 7.

Well, what can I say, the upsampling quality is far from the "ideal" which should look more like the "ALSA direct" graph above. Remember this is upsampling of a 24-bit / 44kHz signal into 96kHz fed to the DAC... There's a ton of aliasing and distortion products in the 19 & 20kHz signal. Wideband white noise with peaks at 0dBFS also suggests significant "intersample overs" - the signal doesn't even get back close to the noise floor from 22.05kHz to 48kHz. I verified that this is not the result of clipping from the ADC recording side.
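As a rough guide to where those unsuppressed products should land: a tone at frequency f sampled at rate fs leaves images at k·fs − f and k·fs + f, and a good upsampling filter is supposed to remove any that fall below the new Nyquist frequency. The sketch below is only an illustration under ideal assumptions (exact 44100Hz and 96000Hz rates); the helper name `image_frequencies` is made up for this example.

```python
def image_frequencies(tone_hz, source_rate_hz, output_rate_hz):
    """List images of a sampled tone that fall at or below the output Nyquist.

    Images appear at k*fs - f and k*fs + f for k = 1, 2, ...
    An ideal upsampling filter would remove all of them.
    """
    nyquist = output_rate_hz / 2
    images = []
    k = 1
    # The lower image of each pair is the first to enter the band,
    # so stop once even that one is above the output Nyquist.
    while k * source_rate_hz - tone_hz <= nyquist:
        for candidate in (k * source_rate_hz - tone_hz,
                          k * source_rate_hz + tone_hz):
            if candidate <= nyquist:
                images.append(candidate)
        k += 1
    return images


for tone in (19000, 20000):
    print(tone, image_frequencies(tone, 44100, 96000))
```

For the 19 & 20kHz test tones this puts the first images at 25.1kHz and 24.1kHz, inside the 22.05kHz to 48kHz region where the overlay graph shows so much leftover energy.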

Why do the digital filter graphs look like this? Check out the impulse response with Windows upsampling:
Actual analogue output from the Light Harmonic Geek Out V2 DAC of a 44kHz "impulse".
It looks like all that's being done by the Windows upsampler is just linear interpolation to "fill in" the extra samples.
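To make the idea concrete, here is a minimal sketch of a linear-interpolation resampler in Python. It is not Windows' actual code (the post only infers linear interpolation from the measured impulse response), and the function name `linear_resample` is hypothetical. Each output sample is simply placed on the straight line between the two nearest input samples.

```python
def linear_resample(samples, source_rate_hz, output_rate_hz):
    """Resample by drawing straight lines between neighbouring input samples."""
    n_out = int((len(samples) - 1) * output_rate_hz / source_rate_hz) + 1
    out = []
    for n in range(n_out):
        # Position of this output sample, measured in input-sample units.
        position = n * source_rate_hz / output_rate_hz
        i = int(position)
        frac = position - i
        # At the very last input sample there is no right-hand neighbour.
        nxt = samples[i + 1] if i + 1 < len(samples) else samples[i]
        out.append((1 - frac) * samples[i] + frac * nxt)
    return out


# A single-sample "impulse" at 44.1kHz, upsampled to 96kHz.
impulse = [0.0] * 4 + [1.0] + [0.0] * 4
upsampled = linear_resample(impulse, 44100, 96000)
print([round(v, 3) for v in upsampled])
```

The result is a small triangle-shaped pulse: the values ramp up to a single peak and back down, never go negative, and show no ringing before or after, which is the "form" seen in the measured impulse response.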

But wow... Isn't that cool?! Imagine you saw that exact Windows 10 impulse response printed in the pages of a glossy audiophile magazine for a US$15,000 DAC. I suspect many would be impressed, right? After all, no pre- or post-ringing! (May I suggest someone have a look at the quality of anti-aliasing and frequency response with the emm Labs DAC2X :-)

Of course, as demonstrated by the digital filters overlay graph, despite the nice looking "form" of the impulse response, functionally it is very poor at actual anti-aliasing and limiting intermodulation distortions.

Finally, how well does it measure with the RightMark audio suite?

Remember, compared to the extremely challenging signals used in the digital filters overlay test above, RightMark signals are more typical of most standard audio tests. On the whole, like with PulseAudio, the numbers are certainly respectable. Comparing direct hardware (ASIO) with DirectSound Windows mixer set to upsample to 24/96, we see more intermodulation distortion with the 44kHz Windows upsampled signal.
IMD+N vs. frequency.

Also, we see a high frequency roll-off with Windows software upsampling:
Frequency Response
In summary... Software-based upsampling using the Windows 10 mixer is clearly a qualitatively compromised solution compared to just using the Geek Out V2's hardware directly or what was shown with Linux PulseAudio algorithms previously.

Despite the clear limitations, there is a big benefit to this kind of upsampling - it works fast. This is probably a good thing in the Windows world because the OS runs on so many types of machines ranging from lowly single-core netbooks/tablets/handhelds to full-function multicore desktop workstations. However, for the audiophile "power user", it would be nice if there existed an "Advanced" option where we could choose "high quality" samplerate conversion even if doing this might increase CPU utilization. I guess the fear would be that people with very slow machines might complain if they turned this on by mistake; hassles for the IT support guy maybe.

As an aside, in Linux PulseAudio you could also use the src-linear algorithm, which I suspect is the same as what Windows is doing. Probably faster still, one could try src-zero-order-hold, which results in "non-oversampling" squared-off waveforms (could be fun to experiment with for those wanting to hear a simulation of NOS).

Speaking of NOS... Something worth thinking about is that this simple interpolation upsampler in Windows has characteristics similar to "non-oversampling" DACs. The early roll-off in high frequencies is very similar and likewise the poor aliasing suppression is similar as well! It's not unusual on audio forums for audiophiles to claim that they "prefer" to listen to these NOS devices (usually based on old DAC chips). Yet given the similarities, one never hears of anyone saying they "prefer" to upsample using the Windows mixer and play their music through the DirectSound driver. Nor, as far as I have ever seen, is anyone raving about that beautifully "ringless", low "time smearing" Windows 10 impulse response. Go figure... :-)
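The family resemblance can be put into rough numbers with textbook models: an ideal zero-order hold (the NOS "stair-step") has a sinc(f/fs) magnitude response, and ideal linear interpolation has sinc²(f/fs). The sketch below uses those idealized continuous-time formulas only; it ignores the DAC's analogue filter and the further sampling at 96kHz, and the function names are made up for the example.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def rolloff_db(freq_hz, sample_rate_hz, order):
    """Gain in dB of an ideal hold/interpolator at freq_hz.

    order=1: zero-order hold (NOS-style stair-step), |sinc(f/fs)|
    order=2: linear interpolation, sinc(f/fs) squared
    """
    magnitude = abs(sinc(freq_hz / sample_rate_hz)) ** order
    return 20 * math.log10(magnitude)


for label, order in (("zero-order hold", 1), ("linear interpolation", 2)):
    tone = rolloff_db(20000, 44100, order)   # the wanted 20kHz tone
    image = rolloff_db(24100, 44100, order)  # its first image at 44.1k - 20k
    print(f"{label}: 20kHz {tone:.2f} dB, 24.1kHz image {image:.2f} dB")
```

In this model both approaches lose a few dB by 20kHz (roughly 3dB for the hold, roughly 6dB for linear interpolation), while the first image of a 20kHz tone is attenuated only a little more than the tone itself. That is the same combination of early roll-off and weak image suppression described above.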

[Oops. Spoke too soon... Here's someone with a preference for DirectSound and Windows 7 upsampling.]

------------------

Overseas now for a few weeks... Happy listening!

19 comments:

  1. Well done. And yes, the Windows upsampling looks a lot like NOS DAC behavior, with almost no suppression of aliases and therefore tons of alias distortion, but a nice-looking impulse graph. And the test signals above do give you a fuller picture of what is going on than the RMAA signals and analysis alone. Juergen

  2. This comment has been removed by the author.

  3. Hi Archimago, Thanks for the great article.

    Windows 7 or later has at least 3 different Sample Rate Converters.

    DirectSound SRC: IIRC, it was introduced for sound-effect playback in games in the Windows 95 era. I remember I was using a Pentium 75MHz at that time :) As you said, performance was the first priority back then. Also, this kind of slow roll-off filter provides lower latency, so it is potentially beneficial for games.

    Audio Resampler DSP (Resampler MFT): Introduced in Windows 7. This SRC provides higher-quality linear-phase, sharp roll-off filtering. It also provides an interface for adjusting conversion quality, for anyone who wants to trade conversion quality for lower CPU load. Ideal for use with MediaFoundation and WASAPI. It seems that Audio Resampler DSP and WASAPI are now used in games instead of DirectSound.

    IAudioClockAdjustment: Introduced in Windows 7. This is a non-integer-factor SRC used for a special purpose: when two audio clocks exist in the audio chain and the two clocks are not synchronized. For example, it can be used as a building block for an application that monitors the sound from a USB asynchronous microphone using a headphone connected to another USB asynchronous playback device, where the two clocks are not synchronized using word sync. If the recording device provides PCM data at 44100Hz and the playback device requests PCM data at 44099.9Hz, one sample must be discarded every 10 seconds. IAudioClockAdjustment provides non-integer-factor conversion (in this example, 44100 to 44099.9Hz).

    I previously measured the performance of DirectSound and another, more obscure SRC, the MME SRC:
    http://community.phileweb.com/mypage/entry/2721/20120630/31560/

    and also looked into Resampler MFT conversion quality:
    http://community.phileweb.com/mypage/entry/2721/20120804/32126/
    http://community.phileweb.com/mypage/entry/2721/20120811/32241/
    yamamoto2002

    Replies
    1. Thanks for the details yamamoto2002! Great stuff which I don't think the Google Translate really does justice to!

      Audio Resampler DSP looks pretty good... Since it's there already, I wonder if Microsoft could somehow allow us to use that algo some day for "high quality" resampling instead.

  4. Hi Archimago, It seems Groove Music (the default music player of Windows 10) and Windows Media Player 12 (the default music player of Windows 7) use Audio Resampler DSP with HalfFilterLength=30, so the resampling quality is better than DirectSound's. yamamoto2002

  5. Hello there,
    in case you read this (or for anybody else), ALSA (on Linux) is much more configurable than you might realize, it just doesn't have any UI for it. PulseAudio was created to fill the usability hole and for "corner cases" like Bluetooth headsets, which were historically problematic with ALSA (due to pairing, etc.).

    When using it "directly", make sure to use the "hw" or "plughw" plugin, to be sure no rate / bit depth / endianness conversion is happening - the default is to use the "dmix" plugin, which *does* convert everything to a predefined rate.

    This can be done with something like

    echo 'pcm.!default "plughw:CARD=mycardname,DEV=0"' > ~/.asoundrc

    where mycardname is the name in [ ] from /proc/asound/cards, e.g. "Intel" or "PCH" for the built-in card.

    Note that with this configuration, only 1 application can use the card at a time (obviously, as the parameters (rate, ..) need to be set specifically for that app). This means that you need to e.g. shut down PulseAudio or otherwise make it stop using the card.

    The point, however, is that this is just the beginning - ALSA supports a myriad of plugins, from channel mixing/remapping, resampling, format conversion, EQ, volume adjustments, running external LADSPA plugins, to e.g. copying the samples to a file. These plugins can even be cascaded. See http://www.alsa-project.org/alsa-doc/alsa-lib/pcm_plugins.html for some of them - note, specifically, that the "rate" plugin has a "converter" option, which allows you to specify the quality of conversion if you want to trade speed (CPU usage) for quality:

    samplerate_best
    samplerate_linear
    samplerate_medium
    samplerate_order

    (libasound2-plugins package on Debian/Ubuntu).

    If you're interested, just try Google for "alsa asoundrc" to see some examples. If you mess up and want to revert it to the defaults (defined somewhere globally in /etc), just remove/move your customized ~/.asoundrc file.

    For "normal PC use", I run rate+dmix (samplerate_best + multi-app sound), for audiophile use, I export a specific env variable, which makes the application use plughw directly.

    Could be interesting to test the samplerate converters. ;)

    Jiri

  6. Hi Archimago,

    Are you aware of any way to avoid DS resampling the audio without having to resort to ASIO/WASAPI? Some programs obviously don't like to use shared mode but I can't stand having all other audio cut off for exclusive mode and then having to restart applications that had their audio cut off to get it back again.

    Would, for example, using a resampler in foobar2000 to set all audio sample rates to 44.1KHz and setting the sample rate to my device at 44.1KHz eliminate this resampling? I assume most other audio sources will be at 44.1KHz as well.


    As a side note I noticed when I booted my PC that the wordlength display on my Benchmark DAC2 indicated a sample rate of 48KHz until I logged in and the 44.1KHz setting I have on the sound driver kicks in and changes it.

    Replies
    1. Hi chakku,

      I'd like to answer your question. The following is based on my own observation and could be wrong.

      > Would, for example, using a resampler in foobar2000 to set all audio sample rates
      > to 44.1KHz and setting the sample rate to my device at 44.1KHz eliminate
      > this resampling? I assume most other audio sources will be at 44.1KHz as well.

      If you resample the PCM to match the shared-mode sample rate before sending it to shared mode,
      the resampling artifacts are gone, but another sound-altering effect, the "Limiter APO", is still there. The sound difference caused by the Limiter APO is subtle, though.

      Also, if you set the shared sample format to 16-bit PCM, dithering is performed and the noise floor rises; with the shared sample format set to 24-bit PCM, dithering is not performed.

    2. Thank you for the response, yamamoto2002, and apologies for not replying earlier; I never saw a notification for it.

      I recently found that I can use the ASIO output in foobar2000 via the driver software for my Benchmark DAC2. When it is used in tandem with my SoX resampler plugin in foobar to resample all my audio to 44.1kHz, I can actually have other audio streams from Windows play at the same time without being muted. It is only if I disable the resampler and play audio at a different sample rate that the ASIO driver adjusts the sample rate to match and the other audio streams are disabled.

      I believe this ties in to the comment by Hifihedgehog below regarding the audio drivers bypassing Windows sample rate conversion.

  7. Is there a Windows 10 player that detects the right sample rate in the file and adjusts the active driver accordingly to avoid up- or downsampling?

    Replies
    1. This comment has been removed by the author.

    2. Some audio drivers will bypass Windows's built-in sample rate conversion engine and use their own sample rate conversion systems, in hardware or software, if that is what you are asking. That is why some people buy more expensive sound cards, like the Creative Sound Blaster X-Fi of yesteryear and ASUS Xonar Essence, which demonstrate reduced distortion in the sample rate conversion process. For example, note what Xonar Essence's manual describes here of its "double floating-point filter":


      "The sample rate determines the number of audio samples per second that the Digital-to-Analog Converters (DAC) and S/PDIF digital interface will output. The Xonar Essence STX card can support sample rates up to 192KHz (44.1K, 48K,96K, 192KHz). Usually audio CDs and MP3 files are 44.1KHz; DVD-Video uses 48KHz; DVD-Audio or other HD media may contain 96KHz or 192KHz high-definition audio content. Please select the corresponding sample rate for your playback sources to get the best audio fidelity. Even if your setting differs from the audio source's sample rate, the Xonar Essence STX engine will do super high fidelity sample-rate-conversion with a double floating-point filter, which can reduce total harmonic distortion (THD+N) by -140dB."

  8. This comment has been removed by the author.

  9. They already fixed it in Windows 7
    http://support.microsoft.com/kb/2653312

    But they committed the same "crime" again in Windows 10? I am so disappointed. I am not using Windows 10 so I cannot test it myself but did they fix it now?

    Replies
    1. Hotfix KB 2653312 fixes only the MME SRC problem; the DirectSound SRC (the API for gaming apps) is not changed at all by applying the hotfix. I guess the reason the DirectSound SRC was left unchanged is that better linear-phase lowpass filtering has its shortcomings: it introduces additional latency (filter delay), which makes games less playable.

  10. Hi Archimago,
    I have just plugged my USB cable on my laptop. Unexpectedly the audio driver for the Audiolab 8200CD had to be re-installed.
    Strange ! I said to myself …
    Then I realized that since I have upgraded Windows 10 to the “2016 Anniversary Edition” perhaps some driver had to be re-initialized.
    I started listening to a well-known track and I jumped in my seat!
    The acoustic image had widened both horizontally and vertically, and so had the depth. The bass is deeper and better controlled.
    I’m guessing that I’m not positively biased toward the “new” OS: when I started listening I had completely forgotten the OS upgrade.
    I changed track, again and again: same feeling.
    In some posts I have read “do not upgrade to the Windows 10 2016 Anniversary Edition”. Why ?
    Thanks

    Teodoro Marinucci

  11. Thanks for providing this informative information you may also refer.
    http://www.s4techno.com/blog/2016/07/24/restart-httpd-server/

  12. You do not synchronize the recording frequency with the playback frequency
    Look at graphs of this article: http://www.aimp.ru/blogs/?p=312

  13. Hi,

    Firstly thank you for an excellent blog, I have enjoyed reading your objective and scientific approach to audio - refreshing.

    A somewhat related request - have you considered doing tests of the Windows volume control, similar to your tests of the Microsoft Windows resampler? There are a lot of myths floating around about digital (in particular the Windows) volume control and its impact on audio quality, and for those of us using computers as the main source there are a lot of scenarios where it is the most convenient way to adjust volume.

    If you listen to 16 bit sources and have a 24 bit DAC, then by setting the Windows mixer to pad the output to 24 bits (but without rate resampling, as per this blog post) you get 24dB of attenuation before you start to lose resolution, assuming your source and amp are well matched and you don't need large amounts of attenuation for a reasonable volume level. Add to that the fact that the room noise floor is typically -70, maybe -80dB at a push, so the lower 16 bits will get lost in the ambient noise anyway, and a digital volume control, even the Windows one, is a doable proposition for an audiophile without losing any practical resolution, as 24 dB of attenuation is a workable range (at least in my environment).

    Also, with regard to this post & the Windows mixer, if you were to do these tests using an integer multiplier (e.g. 44.1kHz to 88.2kHz), I would assume that there would be far fewer resampling artifacts given the much easier resampling needed (basically linear interpolation should suffice).
