Saturday, 24 July 2021

Mega-taps upsampling: Remastero's Pan Galactic Gargle Blaster (PGGB) software. (Broadly on audiophile software & the noise boogeyman.)

This article came about after I received an E-mail from an audiophile friend who saw this Audiophile Style thread in praise of "math and magic". It links to a piece of software by a site called remastero, and the program itself is called "PGGB" (Pan Galactic Gargle Blaster), obviously referring to The Hitchhiker's Guide To The Galaxy with the main author named Zaphod Beeblebrox (who in the book is also the ex-president of the Galaxy). Cute, and of course the number "42" features prominently here and there.

In the past, we have talked about "audiophile" software that supposedly affects sound quality. Years ago, we talked about bit-perfect players (Windows, Mac) and really how "bit-perfect" is simply "bit-perfect" regardless of what software is used. We discussed questionable programs like JPLAY. Then there are the OS tweaks like Fidelizer. Neither JPLAY nor Fidelizer made any difference in my testing or listening.

That is not to say software doesn't make a difference at all. With the computing power we have these days, we can certainly perform highly precise filtering and DSD-PCM transcoding - like with HQPlayer.

The idea with PGGB is that this is software that will take (in batch) various tracks you have and convert these to upsampled versions like 24/384 or 32/705.6 or even higher, in the process applying very strong filtering (eg. on the order of a >200M-tap sinc filter for some of the tests we'll run here - very impressive big number, right?). Furthermore, the website states that the software can apply settings for various levels of "transparency", apply HF noise filtering, use noise shaping, adjust gain while monitoring for intersample overs, deal with convolution filters, and offer an apodizing setting. That's a fair bit of stuff, so I won't promise that we'll hit on all of these here. My intent is to at least have a good look at the foundation of the upsampling effect and the EQ function.

So with that E-mail, my friend told me he downloaded the software (he started with the "Whittaker-Shannon Edition", but more recently the "Equalizer Edition", v2.0.42 as shown above) and has been running it on a 1-month trial license converting some of his favourite tracks. He seemed to enjoy the software and noticed some differences in sound so wondered if I would have a look at this and/or suggest some testing.

Well, since we live a bit of a distance apart, I thought about this for a bit and sent him some of my test signals and music tracks to see if he could run them through his machine and upload the data for me to have a peek. We E-mailed and chatted back and forth over a week, sharing files and ideas on what to test. This article is a culmination of the test results and ideas.

I. Let's talk "audiophile" audio processing...

To start, I think it's important for us to consider as "audiophiles" just what kind of processing we can do that would be beneficial.

Are we talking about processing that would add certain "euphonic" benefits to the audio? For example, a vinyl DSP plugin like iZotope Vinyl might be enjoyed by some, but I don't think those of us interested in "high fidelity" would consider doing something like this (or even vinyl playback itself) as beneficial.

For certain situations like with headphones, we might want to process audio through a cross-feed DSP to get the sound "outside the head". A little while back, someone suggested I try 112dB Redline Monitor for example. (BTW, here's Linkwitz hardware if you want to build one.) Over the years, I've been an advocate of room correction DSP; the difference this makes is also obvious.

If we look at the website, PGGB purports to stay "true to Nyquist-Shannon sampling theorem". It claims to use long filters and states "the longer the filter, the higher the reconstruction accuracy and the more transparent the sound". Sure, that can be true. And linking this with subjective experience, it claims "what this means to you is better depth and layering, improved resolution, a cleaner leading edge, and more accurate timbre". Again, sure, accuracy and resolution of the sound correlate but at some point, that link breaks down as to how much this matters "to you". What I'm implying is the concept of diminishing returns. For example, the audible improvement in resolution going from 8-bits to 16-bits is obvious. But 16-bits to 24-bits is typically marginal at best (assuming you can even hear a difference). Yet both are +8-bit increments.
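To put a number on that diminishing-returns point, the standard rule of thumb is about 6dB of dynamic range per bit (SNR ≈ 6.02N + 1.76dB for an ideal quantizer with a full-scale sine). A quick Python sketch:

```python
# Theoretical SNR of an ideal N-bit quantizer driven by a full-scale sine:
# SNR ~= 6.02 * N + 1.76 dB -- each extra bit buys ~6 dB.
def quant_snr_db(bits: int) -> float:
    return 6.02 * bits + 1.76

for n in (8, 16, 24):
    print(f"{n:2d} bits: ~{quant_snr_db(n):.1f} dB")
```

Each +8-bit step buys the same ~48dB on paper; the difference is that the 8-to-16 step crosses clearly audible territory, while 16-to-24 lives almost entirely below the noise floor of real recordings and rooms.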

So then, if the goal of PGGB is to maintain "accuracy", this automatically locks us into an implied type of sound. Given the excellent resolution of today's DACs, logically, the effect of this algorithm is unlikely to change the sound too much if we're already starting with good resolution content (ie. good 'ol 16/44.1 is already great resolution). We should not experience significant frequency response changes for example. Likewise, we should not hear changes to the dynamic range of the music unless we actually think the software expands dynamic range somehow (this would not be faithful to the source, right?).

So, the bottom line is that unless we're saying that the DAC has very poor digital filtering to begin with, logically, it would be wise to caution against unrealistic expectations of massive improvements using PGGB. In other words, it's probably best to avoid the word "magic". Some might call this "closed minded", but I see it as simply being realistic before we examine anything, especially if we're thinking of spending money. Open minded enough to give it a chance with testing and listening, but not so open minded as to have our brains spill over to embrace fantasy (as per this).

Let's start our exploration rationally with examination of the facts then!

II. Objective Measurements: Digital

While I agree that ultimately, it's about "how something sounds", I have always felt that subjective descriptions are not that useful most of the time unless I'm sitting in the room with a friend and we can both "compare notes" having heard the same sound. This is even more important for evaluation of something like this in that the software touts all kinds of technical claims like mega-tap filtering, huge levels of precision, "accuracy", etc. but interestingly, I don't see any details on the website like even simple measurements or diagrams of the filters implemented.

I started by downloading the software but did not activate it just to see what it offered and how to proceed (screenshots are of activated trial from my friend's computer). We can see that it's a batch processing program where you select directories ("Input Folders") for it to process, and a target directory ("Output Folder") for the "upsampled" / "remastered" data like this:

Notice in that picture, the Yosi Horikawa track is being processed using 234M taps - that's a big number!

My friend is using a 12-core Intel i9-10920X with 32GB of RAM and claims the conversion did not take too long, something like 5 minutes to convert around 20 minutes of audio by his estimate. I'm sure this fluctuates depending on the complexity of the filters the program chooses to use. That's still likely a faster CPU than what most people have though!

The program provides some settings to supposedly fine-tune the sound. For our testing, we decided to try 2 settings which we'll call "DEFAULT", and "ALT" to see what kinds of differences this makes:

DEFAULT is basically what the program started with and ALT is with changes made as you can see. I'm not sure how "Natural" differs from "Front Row", and I can't say what "Transparent" is or what a "Dense" sound means if both are aiming for "accurate". The guide doesn't provide clear technical explanations for these vague subjective terms either.

As you can see, the input audio signal is being upsampled to "705.6/768"kHz, 32-bits, and automatic gain has been applied for both settings. You can see the amount of physical memory, logical cores and max threads (not sure why the limited number of threads) available on the machine.

Since this is about upsampling and filtering, let's start with the impulse response shall we?

As described on the website, indeed very long tap-length sinc filtering has been applied. The converted impulse response file says 11M-taps specifically for the above signal. Notice morphologically that it's a typical linear phase filter for both settings with the expected long symmetrical pre- and post-impulse ringing.
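For those curious, the "shape" of a filter like this is easy to reproduce at toy scale. Here's a minimal windowed-sinc linear phase low-pass in Python (numpy) - obviously ~1000 taps rather than 11M, and the Blackman window is just my choice for illustration, not necessarily what PGGB uses:

```python
import numpy as np

# A tiny windowed-sinc low-pass of the same family as the linear phase
# impulse response seen from PGGB (at ~1001 taps instead of 11M).
taps = 1001                          # odd length -> symmetric, exact linear phase
fc = 21000 / 705600                  # normalized cutoff (21 kHz at 705.6 kHz)
n = np.arange(taps) - (taps - 1) / 2
h = 2 * fc * np.sinc(2 * fc * n)     # ideal "brick wall" sinc response
h *= np.blackman(taps)               # window to control ripple / truncation
h /= h.sum()                         # unity gain at DC

# Linear phase <=> impulse response is symmetric about its center,
# which is exactly the pre-/post-ringing symmetry seen in the plots:
assert np.allclose(h, h[::-1])
```

More taps simply means the sinc is truncated later, buying a steeper transition band and deeper stopband - useful up to a point, after which the "improvement" is far below anything a DAC (let alone an ear) can resolve.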

So how well do these filters function in the frequency domain? Let's bring out the "Digital Filter Composite" (DFC) graph I've been using for years. Note that the difference here is that I will derive these in Adobe Audition using a 65k-point FFT and there's no "Digital Silence" plot which I normally show in my measurements since this is done digitally and silence would be "infinitely" quiet:

We see the result of that very high tap-length digital filter. I tried to keep the scaling the same for both graphs. Notice that the "ALT" setting shows lower dynamic range from signal peak to noise floor. I'm not sure which of the settings caused this and didn't try to specifically chase this down ("Dense" presentation maybe?).

I noticed a couple of other interesting findings on the DFC. Let's zoom into the "brick wall" sharp corner up around 20kHz:


Apologies for the axes not exactly lining up. Basically, what we see is that the filter cuts off right around 21kHz. Despite the different options used, I was a little surprised that the "corner frequency" did not change at all; I would have thought that maybe words like "Transparent", "Dense", changing "apodizing" might correlate in some way. Given that this is originally a 44.1kHz signal, we're actually losing the content from 21-22.05kHz. I agree that this is not audible, but just the same, we are missing something in this "maximal transparency" upsampling process.

The other thing I noticed was that the difference between the 0dBFS and -4dBFS noise signals is certainly not 4dB as intended. The program has obviously applied automatic gain to the signal; in this case the -4dBFS signal has been boosted by +3dB relative to the 0dBFS one. To show you what the level difference should look like unprocessed, here's an overlay with the original wideband white noise and 19 & 20kHz sine:

As you can see, the PGGB "remastered" version is missing the remainder of the audio signal >21kHz. Also, the levels of the wideband noise signals have been automatically shifted so they're closer than the intended original 4dB delta.
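Checking this sort of level shift yourself is straightforward if you have the audio as float data. A sketch with hypothetical arrays (not the actual PGGB output files):

```python
import numpy as np

def rms_dbfs(x: np.ndarray) -> float:
    """RMS level of a float signal (full scale = 1.0) in dBFS."""
    return 20 * np.log10(np.sqrt(np.mean(x ** 2)))

# Hypothetical stand-ins for the two test signals:
rng = np.random.default_rng(42)
noise = rng.uniform(-1.0, 1.0, 100_000)       # ~full-scale white noise
quieter = noise * 10 ** (-4 / 20)             # same noise at -4 dB

# An honest processor should preserve this 4 dB delta:
delta = rms_dbfs(noise) - rms_dbfs(quieter)
print(f"level difference: {delta:.2f} dB")
```

Run the same subtraction on the original vs. converted files and any auto-gain the converter applied shows up immediately as a deviation from the expected delta.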

BTW, don't worry about the 19 & 20kHz peaks being narrower for the "Original" signal. This is simply because the FFT remained at 65k-points for both 44.1kHz (original) and 705.6kHz (PGGB processed).
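This FFT-width effect is simple arithmetic - frequency resolution (bin width) is just samplerate divided by FFT size, so the same 65,536-point FFT is much coarser at 705.6kHz:

```python
# With a fixed FFT size, bin width = fs / N. The same 65,536-point FFT
# therefore has ~16x coarser resolution on the 705.6 kHz file, which is
# why the 19 & 20 kHz peaks look wider despite identical tones.
N = 65536
for fs in (44100, 705600):
    print(f"{fs} Hz / {N} points -> {fs / N:.2f} Hz per bin")
```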

Okay, so the PGGB filter does have some effect, but does this matter to the sound quality of music?

There is one way to find out that's quite easy to do objectively... Let's use Paul K's DeltaWave Null Comparator currently at version 1.0.70b!

Years ago, I created a short test file consisting of a few seconds of Pink Floyd, Rebecca Pidgeon, The Prodigy, and Rachel Podger for use with the old Audio Diffmaker program which I called the DiffMaker Audio Composite (DMAC). Since some of the material originated as 24-bits, I created the file as 24/44.1, in total about 30 seconds of music.

So let's take that 24/44.1 file and upsample it with something common like Adobe Audition 2021 (version 14), and compare this output with the same file processed by PGGB. Does PGGB upsampling with all its mega-taps and other "magic" processing differentiate much from Adobe Audition?

For completeness, here are the settings - notice that we used the "DEFAULT" PGGB preferences but left Gain set to 0 as there's no point fooling around with volume control since I know the test track will not clip. As for Adobe Audition, let's use the highest 100% "Quality" upsampling to an integer 32-bit, 192kHz WAV file. Speedwise, Adobe Audition upsampling to 192kHz took <2 seconds on my AMD Ryzen 9 3900X, my friend reported 30 seconds using PGGB upsampling to 705.6kHz with his Intel i9-10920X:
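For reference, the core idea of a null test is simple, even though DeltaWave adds sub-sample alignment, drift correction, and much more on top. A toy sketch in Python (scipy), comparing two different upsampling algorithms on a synthetic tone - this is only the principle, not DeltaWave's algorithm:

```python
import numpy as np
from scipy.signal import resample, resample_poly

# Toy null test: upsample the same signal two different ways, subtract,
# and express the residual ("delta") relative to the signal in dB.
fs = 44100
t = np.arange(fs) / fs
signal = 0.5 * np.sin(2 * np.pi * 1000 * t)   # 1 kHz test tone, 1 second

a = resample_poly(signal, 4, 1)               # polyphase FIR upsampling
b = resample(signal, 4 * len(signal))         # FFT-based upsampling

delta = a - b
rms_sig = np.sqrt(np.mean(a ** 2))
rms_delta = np.sqrt(np.mean(delta ** 2))
null_db = 20 * np.log10(rms_delta / rms_sig)  # deeper (more negative) = closer match
print(f"null depth: {null_db:.1f} dB")
```

A deeply negative null (like the sub -140dB audible-band delta measured here) simply means the two files are, for practical purposes, the same signal.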

What was the result of running the upsampled files through DeltaWave Null Comparator? Here's the difference (delta):


Basically there was no difference! Within the audible spectrum up to 20kHz, the delta across the frequencies is below -140dB. The only notable area of variation is that chunk from 21kHz to 22kHz where PGGB hits the brick wall at 21kHz while Adobe Audition keeps the high-frequency content intact to 22kHz. Notice that the music overall doesn't contain much content above 21kHz which is why the delta was still below -80dB on average.

Since Adobe Audition upsampling is done with linear phase, and likewise PGGB is linear phase, the "Delta Phase" plot does not show any difference (nor is there variation in group delay):


Finally, for completeness, let's show the narrow, and low-level error distribution between the 2 files, and also Paul's "PK Metric":


Notice that on the PK Metric graph, we see lower correlation at 30-34 seconds (ie. higher delta). This segment of the DMAC file contains the high-pitched chimes and ringing from the start of Pink Floyd's "Time" which has more 21+kHz content compared to the other music segments. Since the 21-22kHz content is filtered out by PGGB but retained by Adobe Audition, that portion looks "more different" when compared. Seeing this pattern I think is a nice reminder of just how sensitive a tool like DeltaWave is! Obviously human hearing would not be able to perceive >21kHz content (unless one claims to have super-human Golden Ears of course ;-).

With a PK Metric score as low as -119dBFS, I believe we can say with certainty that the PGGB version sounds identical to an Adobe Audition upsample. Imagine that friends, 26M-taps processing used for this file containing actual music, and no meaningful difference in sonic quality - make sure to let that sink in when you might be tempted and impressed by big filter tap numbers!

[For those curious about how the PK Metric is calculated, here's what Paul K told me in a discussion recently: "PK Metric starts with this raw error value and then computes a value that's similar to partial loudness by applying equal loudness curves, ERB masking and evaluating this on a 400ms intervals along the whole track." So the value is a "perceptually weighted" error signal. I'll certainly be looking in the days ahead to use the PK Metric more in situations like this.]

III. Objective Measurements: Analogue from DAC

I trust we're painting a clear picture already of what's going on with the PGGB upsampler. To be more complete, let's also run some measurements with an actual DAC. Since I finished measuring the Topping D10s with the ES9038Q2M chip published just last week, the set-up was still hooked up so let's run some RightMark tests using direct playback compared with PGGB upsampling using either "DEFAULT" or "ALT" settings at 24-bit 352.4/384kHz (maximum samplerate for the DAC). I'll be using the little Raspberry Pi 3 B+ "Touch" with Volumio as bit-perfect playback software.

16/44.1 RightMark Test:


Starting with the 16/44.1 test, as you can see, there's little difference whether I sent the signal directly to the DAC or upsampled it with PGGB first. The only difference worth commenting on (not that it's audible) is the difference in corner frequency, as we have already learned looking at the digital data. As I mentioned previously in my Topping D10s review, it implements a sharp linear filter that cuts off around 20kHz, which is even earlier than PGGB's 21kHz corner (as can be seen zoomed in; note that I see no difference in the lower frequencies down to 10Hz).

24/96 RightMark Test:


As for 24/96, here's where we see something more interesting which we have not seen up to this point. Using a higher 96kHz sampling rate, the DEFAULT vs. ALT settings will change the corner frequency with the ALT setting actually pushing the filter higher.

Frequency response, linear scale. With DEFAULT setting, corner at 30kHz, ALT setting up at ~45.7kHz.

Exactly which setting caused the change in corner frequency was unclear to us initially, but with some testing, we realized that it was the "HF Noise Filter". This doesn't seem intuitive, and I find it again speaks to the lack of clarity as to what these setting descriptions are supposed to mean without technical definitions. Anyhow, here's what happens when we changed "HF Noise Filter" through the 3 options from Full to Moderate to Minimal:


Looks like this setting changes the amount of noise shaping: the "Full" setting results in the lowest ultrasonic noise overall, but suppression right after the "brick wall" is not as deep. It also changes the cut-off frequency used - Minimal = 40kHz, Moderate = 30kHz, Full = ~45.75kHz. By the way, does this sequence make sense? Also notice the slight changes in amplitude level with Gain set to "Auto".
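For readers unfamiliar with noise shaping, here's the textbook first-order error-feedback form in Python. To be clear, PGGB's actual shaper is not documented, so this only illustrates the basic principle of pushing quantization noise up in frequency:

```python
import numpy as np

# First-order error-feedback noise shaping: each sample's quantization
# error is fed back and subtracted from the next sample. The total error
# power is unchanged, but its spectrum is tilted toward high frequencies,
# where (at 705.6 kHz rates) nothing audible lives.
def noise_shape_quantize(x: np.ndarray, bits: int = 16) -> np.ndarray:
    q = 2.0 ** -(bits - 1)            # quantization step for full scale = 1.0
    out = np.empty_like(x)
    err = 0.0
    for i, s in enumerate(x):
        v = s - err                   # subtract the previous sample's error
        out[i] = np.round(v / q) * q  # quantize to the bit grid
        err = out[i] - v              # remember this sample's error
    return out
```

Higher-order shapers push the noise harder into the ultrasonics at the cost of a higher total noise floor up there - consistent with the trade-off visible between the "Full" and "Minimal" curves.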

Interestingly the PGGB filter appears to add some noise, disrupts crosstalk and changes IMD+N at the corner frequency. Specifically regarding that noise spike at the "brick wall", let's double check that in the digital domain:


Using a higher resolution WaveSpectra FFT, we can indeed see that there's a sharp spike right at ~21kHz when upsampling this 44.1kHz signal, just before the very deep attenuation; noise shaping then causes the noise floor to rise again beyond 40kHz. I noticed that the amplitude of that noise spike varies depending on the signal being upsampled, at times reaching up to -80dBFS or so. This should not be a problem given that it's out at 21kHz and low enough to be obscured by a recording's noise floor; nonetheless, ideally it should not be there. Good to see that what RightMark found in the DAC output was indeed present in the digital signal itself, showing the sensitivity of the test and the accuracy of the Topping DAC playing what was being fed into it.

One last thing. Notice that the summary numbers for noise level and dynamic range tend to be lower with the upsampled PGGB playback. I believe this is a result of the DAC running at 384kHz. Over the years, I've noticed that most DACs perform best at 24/96 and pushing them to 24/192 or higher will result in very slightly noisier performance. Obviously, differences vary by device and you can only know by measuring the behaviour of your DAC. Be aware of this and don't just think that bigger samplerate numbers like 384 or 768kHz somehow also results in better performance! The noise floor is often actually higher.

IV. Subjective Listening

Before talking about my own impressions, I asked my friend to give me his thoughts not knowing any of the data above:
"Hey Arch, fun gettin' this stuff to ya. 
Okay, here are my thoughts on this program. 
Like we discussed, it takes a lot of computer power and memory to run. When running PGGB, I see all cores being used at different times in the process and the machine dips into the VM [virtual memory], requiring more than 32GB RAM especially with longer tracks. So far, I've resampled about 5 albums. The install also downloaded the MatLab runtime which took up >3GB disk space. I don't love the UI which can be a bit slow even on a fast machine.
It sounds good to me. I notice in A/B listening that the PGGB files do seem to be louder than the original sometimes and this will affect what I think of the sound. I have seen some of the comments on the AS thread but I'm not noticing as much as some of those guys say. Using my Chord Hugo TT2 with upsampled music to 32/768, it's good but I really don't think I hear a huge difference in dynamics or transient resolution. Chord already talks about using a long '98,304-taps' filter for the TT2. I'll keep listening but I'm just not sure these huge files are worth keeping even if they sound a little better!"
As you can see, he's kind of sitting on the fence, equivocal about the benefits. He noticed that a change in gain has been applied (when set to "Auto"), hence the converted tracks could sound different simply due to relative amplitude.

For listening, I asked my friend to upsample 4 familiar tracks for me. To give you an idea of the size difference he speaks of, here are the tracks, with the PGGB converted files highlighted in yellow - notice the tap-lengths used by the program varied from 150-234M-taps:


The Rhiannon Giddens track is 16/48, the Eiji Oue is 24/88.2. The rest are 16/44.1. Since I use FLAC, I would not be able to compress those 32/700+kHz files. I suppose one could just use PGGB to convert up to 24/384 which FLAC can handle. I believe there is the perception that higher samplerates result in "better sound", which IMO is a bit silly. I think for most people, >1GB/song would be excessively large and unwieldy; more likely than not, a waste of storage.
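The size math is easy to check - uncompressed PCM is simply samplerate × channels × bytes-per-sample per second:

```python
# Uncompressed PCM size: fs * channels * (bits/8) bytes per second.
# Assuming a hypothetical 4-minute stereo track for illustration:
def wav_size_mb(fs: int, bits: int, channels: int = 2, seconds: float = 240) -> float:
    return fs * channels * (bits // 8) * seconds / 1e6

for fs, bits in [(44100, 16), (192000, 32), (705600, 32)]:
    print(f"{bits}-bit / {fs / 1000:g} kHz -> {wav_size_mb(fs, bits):.0f} MB")
```

At 32/705.6 a 4-minute track lands at roughly 1.35GB - about 32 times the size of the 16/44.1 original, before any compression.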

So how do the processed files sound? Well, they sound good (as expected) to be honest with you. I listened to them with my main system using the RME ADI-2 Pro FS R Black Edition for 705.6/768kHz playback. The Yosi Horikawa track (off Wandering) was best appreciated with headphones since it's a binaural recording; I had a listen with my Sennheiser HD800. Nice and precise sound. Excellent attack on the "pops" and bounces of what sounds like ping pong balls off hard surfaces. 3D soundstage excellent.

I've really enjoyed Rhiannon Giddens' albums lately and this track comes from Freedom Highway. A good country-blues song with clean vocals, instrumental accompaniment, and a clean-sounding male rap segment in the middle. Great rhythm, no problem with the layering of various instruments and their place in the soundstage.

Dua Lipa's "Love Again" (off Future Nostalgia) is one of those contemporary pop songs we might hear on the radio. Low dynamic range but fun. The upsampled version sounded clean as one would expect with software that compensates for intersample overs on these loud tracks.

Finally, Eiji Oue & Minnesota Orchestra's 1996 recording of Stravinsky on the Reference Recordings label is great, particularly "Infernal Dance..." which is powerfully dynamic with plenty of low-level nuance. Again, it sounds excellent upsampled here with PGGB through my RME DAC.

I guess the question folks might be wondering then is - "Do you hear a difference compared with standard 16/44.1, 16/48, and 24/88.2 playback? Would you use PGGB based on what you hear?"

To be honest, I don't hear much of a difference doing A/B comparisons. The RME's filters (particularly "SD Sharp") sound great already. Admittedly, unlike most of my reviews where I listen first before measuring, this time I already was aware of the objective data above so I might not be as sensitive to any "magic" ;-).

As someone who routinely downsamples hi-res material, there's just no way I'm going to be keeping 32/768 WAV files in my music library other than maybe a few as test tracks! (WavPack can handle 32-bits and these high samplerates with 40-50% compression, but player compatibility remains an issue, for example Roon does not accept WavPack.) 

One last thing to note, we can see the change in volume levels after upsampling:


Notice the difference between low and high DR files. After conversion, Tracks 1-3 have their peaks pinned to 0dBFS with reduction in the RMS level to account for the intersample overs that are almost bound to happen with these dynamically compressed, low-DR, "loud" tracks.

However, on a relatively soft track like Track 4, the program has allowed a small gain in amplitude by +0.13dB. So my friend's impression that PGGB increased the amplitude of his files suggests that he enjoys less compressed music like classical and other acoustic genres (indeed, he told me he listens to mostly classical).
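Since intersample overs have come up a few times, here's the classic demonstration in Python (scipy): a full-scale sine at fs/4 with a 45° phase offset puts every sample at exactly 0dBFS while the reconstructed waveform peaks at +3dBFS between samples. Oversampling reveals the true peak - the kind of check an upsampler must perform before deciding how much gain reduction to apply:

```python
import numpy as np
from scipy.signal import resample_poly

# "Intersample over" demo: samples at 0 dBFS, true waveform above it.
fs = 44100
t = np.arange(fs // 10) / fs
# fs/4 tone with 45-degree phase: samples land at exactly +/-1.0 (0 dBFS)
# but the underlying sine peaks at 1/sin(45 deg) ~= +3 dBFS:
x = np.sin(2 * np.pi * (fs / 4) * t + np.pi / 4) / np.sin(np.pi / 4)

sample_peak_db = 20 * np.log10(np.max(np.abs(x)))            # peak of the samples
true_peak_db = 20 * np.log10(np.max(np.abs(resample_poly(x, 8, 1))))  # 8x oversampled
print(f"sample peak: {sample_peak_db:.2f} dBFS, true peak: {true_peak_db:.2f} dBFS")
```

This is why "loudness war" material pinned near 0dBFS routinely reconstructs a few dB over full scale, and why a converter that pre-emptively lowers the RMS level of such tracks is behaving sensibly.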

V. PGGB-EQ Handling of Room Correction Filters

In the latest "Equalizer Edition" of the PGGB software, there's now the "PGGB-EQ" feature which is a convolution module that will process room correction filters. By default, room correction filters are imported and "optimized" by PGGB although you can change this to "Use as-is" to prevent PGGB from altering the FIR filter.

One of my room filters, "optimized" for PGGB, "imported and on".

I noticed that there has been controversy around how EQ optimization is being implemented as in this long discussion with Audiolense designer Bernt Rønningsbakk of Juice HiFi. I appreciate how Bernt has taken the time to present his perspective with data.

Although my friend with his Chord Hugo TT 2 did not try the EQ function himself, another reader provided me with some data for comparing the results of PGGB-EQ "Optimization" using an original room correction filter created in Acourate. Here is a graph showing the frequency response overlay between Acourate and after PGGB-EQ processing:


As you can see, indeed PGGB-EQ replicated the frequency response of the original filter. There are a few minimal differences below 200Hz reflective of the much higher tap-count with PGGB; nothing of concern and in an actual room measurement with a microphone, I suspect these differences might not even be measurable.

What has changed significantly is the filter's phase / time-domain characteristic:


PGGB has stripped out the time-domain correction component (red flat line!). This is no surprise (as in the Juice HiFi discussion) and the PGGB-EQ guide does state that the "optimized" recalculated versions of the filters change tonality / frequency response only.
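One quick way to confirm that a FIR filter is frequency-response-only (ie. linear phase, no time-domain correction) is to check whether its coefficients are symmetric; an asymmetric impulse response implies excess phase. A sketch using hypothetical filters, not the actual PGGB-EQ output:

```python
import numpy as np

# A linear-phase FIR (symmetric coefficients) changes only magnitude plus
# a constant delay -- no time-domain / excess-phase correction possible.
def is_linear_phase(h: np.ndarray, tol: float = 1e-9) -> bool:
    return np.allclose(h, h[::-1], atol=tol)

# Hypothetical examples: a symmetric EQ-only filter vs. one with an
# asymmetric tail, as a room-correction filter with phase correction has:
eq_only = np.blackman(101) * np.sinc(np.linspace(-5, 5, 101))
with_phase = eq_only.copy()
with_phase[70:] *= 0.9          # break the symmetry -> excess phase

print(is_linear_phase(eq_only), is_linear_phase(with_phase))
```

Running a check like this on the "optimized" filter coefficients would directly confirm that the time-domain correction component has been discarded.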

I wanted to double check if PGGB applied EQ to both channels independently using one of my filters in the "Equalizer Edition":

My 2-channel room correction filter applied to a REW sweep. As we can see, PGGB-EQ does apply frequency equalization to the channels independently. 10Hz start and 20kHz end frequencies as per settings. And confirmation that phase correction is off.

I'll leave you to decide if this removal of phase correction is a "good" thing. Personally, I'm with Bernt and I know Uli Brueggemann of Acourate also believes this is a bad idea. In my experience, time-domain correction improves the sense of expansiveness in the sound stage as well as more precise positioning of voices and instruments. So by doing this, PGGB-EQ seems to be discounting these potential benefits, making the room correction a purely frequency-domain affair.

Be mindful that since the complex filters created by AudioLense and Acourate are usually meant to include time correction, removing this could result in anomalies like pre-ringing and the sum of the two channels with EQ applied could result in unintentional cancellation effects, making things worse! I wonder what kind of testing was used by the PGGB author/testers to confirm that the filter "optimization" resulted in an improvement (and not in fact deterioration).

VI. In summary...

Well, as far as I can tell, PGGB looks like an accurate upsampler which performs as promised in the objective sense. The impulse response shows us that the program is using a long tap-length linear phase filter; the "brick wall" steepness of the filter is confirmed on the DFC graphs.

With "DEFAULT" settings, 44.1kHz material is cut-off at around 21kHz, 48kHz around 23kHz, and 88.2/96kHz material around 30kHz (as discussed above, dependent on "HF Noise Filter" setting and can be up to >45kHz). I can see that the software respects and manages "intersample overs" and will not clip which is good. The auto-gain function does change the average output level with highly compressed audio particularly pushing average amplitude lower to prevent intersample peaks (which could easily reach +3dBFS and uncommonly +6 to +8dBFS even).

The question is whether one is subjectively able to hear a difference (apart from gain differences) which is the value proposition if one were to buy a license. Alas, I see/hear no evidence of any "magic" here and whatever special processing is being done seems to be low-level and inaudible. In fact, the DeltaWave results suggest that the differences between PGGB and upsampling with Adobe Audition to 32/192 using music data is indistinguishable. Likewise with my own listening, I failed to hear a difference in my sound room.

As for PGGB-EQ "optimization", it literally functions like a pure "EQ" rather than offering the full capabilities of modern DSP room correction with time-domain alignment of the frequencies at the listening position. The workaround is to select "Use as-is" when importing your FIR filter, but make sure you supply filters for each sample rate you'll be using as there will be no automatic filter resampling (I have not tested how well this works; I see there is an option to fall back to the "Optimized" filter, without time correction, if a matching filter is not found).

I accept that as audiophiles we can be passionate and obsessive people. However, as I wrote a few years back, there is a point of "Good Enough". There is always a point of diminishing returns especially with mature technologies like today's hi-res DACs. Over the years, I've seen the evolution of digital filtering such that the results (as you can see on the "DFC" graphs in many DAC reviews) are already excellent with many devices.

I think it's rather ironic that as a "more objective" guy, I believe that 16-bit's 96dB dynamic range and 24-bit's 144dB range at best may sound subtly different with actual music. Yet on forums, seemingly with more "subjective"-leaning opinions, we can find expressions of the belief that calculating to 300dB dynamic range (50-bits) precision can be audibly meaningful! This is a convenient example of co-opting technical/"objective" numbers to claim subjective correlation that I believe would not survive any type of controlled listening. In the same way, some will be impressed by numbers like Chord's "1,015,808"-taps WTA upsampling or PGGB's hundreds-of-millions of taps filter even though there's simply no reason to attribute direct sound improvements to just these numbers.

This is the extreme opposite of the usual subjectivist "measurements don't matter" stance! Yeah, I know this is supposed to be a hobby and meant to be "fun"; however, these polar opposite perspectives create a messy hodgepodge of undisciplined, dissonant beliefs which IMO can be used to make a case for almost anything (especially products selling for hundreds or thousands of dollars)!

I remember back in the days when I was reading primarily subjective-oriented viewpoints, I would be lost in never knowing whom to believe thanks to these kinds of disparate ideas. I propose that this is unhealthy for the mind and soul of the hobbyist and hobby itself. Subjective claims that borrow from specs and use big numbers are just as crazy as those who feel measurements make no difference and also just as inappropriate as portraying a product as "best" using any single measurement. Find balance folks!

In general, audiophiles, IMO be careful of unusual or esoteric software claims. Simple, straightforward bit-perfect playback to a high-resolution DAC with a decent built-in filter is basically all we need for high fidelity sound these days. So long as the filter looks good enough on the DFC graph - good post-Nyquist frequency attenuation, minimal imaging artifacts, minimal overload with the 0dBFS wideband signal, overall low noise floor - and sounds good to you, then the functional outcome has been achieved without need for a "mega-taps" war among products. [BTW if I do want to obsess over filters, I might try something like Ms. Goldilocks. ;-]

If your DAC implements poor filtering (or even no filtering like NOS DACs), upsampling software like HQPlayer (previously discussed) is flexible if you want to get fancy and try the various algorithms and DSD with realtime playback. If you're using something like piCorePlayer, feel free to play with the settings I talked about years ago. Also, the Roon filters are very good. Comparatively, PGGB is less feature-filled and it's meant more as a batch converter with extreme levels of ("ludicrously long") processing that I believe are simply not beneficial and wasteful of energy and storage space.

For completeness, I see there's also a cloud-based option for those who may not have the hardware to process the audio themselves (see PGGB.IO for the fee they charge). Although, to be honest, I personally fail to see the benefit of this software, Remastero is offering a free trial, so by all means give it a shot if you're curious how this works out on your own system.

Thanks again to my friends (who wanted to remain anonymous) for the nice conversations and sending me the results of the PGGB conversions tested here!

Over the years, readers have E-mailed me about other unusual software such as Bughead Emperor from a fellow in Japan (here's a thread) that seems to be no longer in development. Seriously folks, until there's some basic evidence that a program like this makes a difference (the term "remastering" has also been used by Bughead), many comments appear to be unsubstantiated testimony at best, or possibly even delusional. The author likewise seemed to have some strange ideas beyond audio-related matters back when he was running his blog. Thanks but no thanks.

Speaking of remastering audio, I was E-mailing a pro-audio friend the other week about music "remastering" tools. If you really want to try making changes to the sound, consider software like Soundtheory's Gullfoss (check out the YouTube video), or iZotope Neutron 3 with their "Sculptor". Both of these packages are <US$250 (PGGB is asking $500 for a personal license, BTW). This is what true "remastering" looks (and sounds) like: there will be clear frequency and intensity changes to the audio, not the basically flat upsampling with a little bit of level variation we've seen with PGGB. Lots of math, plus creativity for sure, when remastering - but no magic needed. In the context of actually, substantially changing the sound, I don't think the term "remastering" is appropriate for what PGGB does.

--------------------

Recently, I was online and one of the vocal audiophiles who keeps talking about "noise" and distortion in audio systems admitted that he hasn't even used SSD drives for music storage. First of all, whenever you hear about "noise" in audio forums, make sure to ask: what noise is the person talking about? Is it the acoustic noise of fans, hard drives spinning, and head seeks? Is it radiated noise like RF interference when the DAC or amp is placed too close to the computer? Is it conducted electrical noise somehow permeating the system and finding its way into one's DAC unless galvanically isolated?

Like "jitter", "noise" is another of the Core Audiophile Boogeymen, often talked about but poorly defined, even though it should be easier to measure if it's actually polluting the audio signal. Unless one understands the specifics of what is being spoken of - and better yet, can identify the source of the noise being referred to (eg. mains 50/60Hz hum, ground loops, USB 8kHz PHY noise, CPU/GPU load noise...) - it's often hard to discuss solutions without getting stuck in vague speculation or people recommending audiophile snake oil (like this perhaps). Notice the obfuscation sometimes used when snake oil companies claim that a product reduces noise, yet there's no clear demonstration of what noise is being suppressed. We often see this in expensive cable company literature. I take this example from John Swenson as a classic; notice the word "noise" appears 56 times on that page, yet not a single graph shows what he's talking about or how this can be fixed with the UpTone REGEN / Sonore Rendu products (not a single diagram in the EtherREGEN white paper either - "noise" used 50 times!).

Anyhow, for audiophiles who fear noise, do give SSD drives a try. A modern, reliable, high-quality hard drive like, say, the 2TB Western Digital Gold Enterprise uses about 6-8W of power even when idle. In comparison, SSDs typically idle at <50mW, and at most might need 3W when heavily reading/writing. They're acoustically silent, draw less power, and produce less heat - in turn, less potential electrical noise if one is worried about such things. As 8TB SSDs drop in price and hopefully even larger capacities become available, I continue to look forward to the day when my server computer can transition away from spinning hard drives!
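
As a quick back-of-envelope comparison, here's a simple Python sketch using the idle figures above (illustrative numbers only; actual draw varies by model and workload):

```python
# Back-of-envelope energy comparison. The ~7W HDD idle draw and ~0.05W
# SSD idle draw are the illustrative figures from the text; actual
# numbers vary by model.
HOURS_PER_YEAR = 24 * 365

def yearly_kwh(watts):
    """kWh consumed per year at a constant power draw."""
    return watts * HOURS_PER_YEAR / 1000

hdd_kwh = yearly_kwh(7.0)    # spinning drive idling all year
ssd_kwh = yearly_kwh(0.05)   # SSD idling all year
print(f"HDD: {hdd_kwh:.1f} kWh/yr vs. SSD: {ssd_kwh:.2f} kWh/yr")
```

Not a huge dollar amount per drive, but over a multi-drive server running 24/7, it adds up.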

A few months ago, I mentioned the inexpensive Silicon Power 2TB TLC drive which I use with my gaming computer. This week I grabbed a 2TB Samsung 870 QVO as well for backups and music storage on my work machine, going for <US$200.

Not the fastest SSD out there (QLC memory tends to be slower once the cache is used up), but for backups and storage over SATA (much slower than NVMe anyway), this should be good with Samsung's reliability, and it's still much faster than any spinning hard drive. The endurance rating for the 2TB drive is 720TBW, which is just fine for many years if you're not doing heavy daily data-writing!
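
For perspective on that endurance rating, a quick sketch (the 50GB/day write load is just an assumed example, heavier than typical backup/music-storage use):

```python
# Rough endurance estimate for the 720TBW rating. The 50GB/day write
# load is just an assumed (heavy) example, not a measured figure.
TBW = 720               # rated terabytes written
daily_writes_gb = 50    # hypothetical daily write volume

years = (TBW * 1000) / daily_writes_gb / 365
print(f"~{years:.0f} years of life at {daily_writes_gb}GB written per day")
```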


I've seen discussions about RAM (total amount, ECC vs. non-ECC, etc...) apparently affecting sound quality (like this) - even parameters like CAS latency having an effect. Of course, you'll find articles on SSDs sounding awesome. I suspect there is a natural correlation between the folks obsessing over expensive computers as audio players/streamers, worried about all this stuff (without actual evidence), and the proponents of software like PGGB. I would suggest being mindful of the "culture" around all of this and being careful to discern fact from fiction.

Until demonstrated otherwise (measurements, controlled listening results), when it comes to audible sound quality differences, "Bits Are Bits" should be one's default position these days given the level of audio resolution even with inexpensive products.

Stay sane, dear audiophiles.

--------------------

I hope you're all doing well and enjoying the music! I do have a few other discussion items planned for later this summer. For now, I'm off on vacation... ;-)

So long, and thanks for all the fish.


21 comments:

  1. Regarding filter lengths: https://www.audiosciencereview.com/forum/index.php?threads/chord-quest-vs-rme-adi-2-dac-fs-tap-count.22124/post-734918

    1. Nice one Mans,
      Thanks for the heads up and a good reference for Rob Watts and others. LOL, WTA 1M-taps is child's play considering what PGGB is claiming here.

      You really should add articles like that to the Troll Audio site which would be an excellent place for accessing the info... Way too easy to get lost in message threads!

      Just to reiterate an important point in your post:
      A 1000-tap low-pass filter with a cut-off frequency of 21kHz is adequate for -300dB stopband rejection.

      -300dB is obviously a huge number already. I think that's a pretty nice "landmark" tap number for audiophiles to keep in mind so as not to go ga-ga over MEGA. (Of course, even that 400-tap setting is more than enough.)
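
To put a number on that claim, here's a minimal Python sketch using the textbook Kaiser window design estimate (a rule of thumb only; exact tap counts depend on the design method):

```python
import math

def kaiser_taps(atten_db, transition_hz, fs_hz):
    """Kaiser window design estimate: FIR taps needed for a given
    stopband attenuation (dB) and transition width (Hz) at sample
    rate fs_hz. Rule of thumb: N ~ (A - 7.95) / (2.285 * d_omega)."""
    d_omega = 2 * math.pi * transition_hz / fs_hz
    return math.ceil((atten_db - 7.95) / (2.285 * d_omega))

# -300dB stopband with a 21kHz cutoff transitioning to Nyquist
# (22.05kHz) at a 44.1kHz base rate:
taps = kaiser_taps(300, 22050 - 21000, 44100)
print(taps)  # on the order of 850-900 taps - so ~1000 is indeed plenty
```

Even at an absurd -300dB - far below any real DAC's noise floor - the estimate stays under 1000 taps; hundreds of millions of taps buy nothing audible.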

  2. To be an audiophile one must be concerned about reducing mainly non-existent noise, thus software as above, audiophile racks, power conditioners, Wiccan cables & power cords, and spurious jitter-eliminators. However, this must be balanced by introducing actual audible noise: vinyl playback artifacts, tube-noise especially via the ever-desirable SET amplifier, ported speaker chuffing, and the coming audiophile Renaissance of eight-track tape. Some noises are more equal than others. best, TerryNYC

    1. LOL Terry, good one... That's a very astute observation!

      Actually, I would suggest that the software above probably increases noise as well, by virtue of the amount of USB data transfer and running the DAC at those 8fs and 16fs type speeds (as the RightMark results show). That's something the proponents don't talk about while impressing us with those big numbers!
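
Just to quantify that extra USB transfer, a simple sketch (assuming stereo, uncompressed PCM):

```python
# Data-rate comparison: CD-resolution PCM vs. a 32/705.6 (16fs)
# upsampled stream, assuming 2-channel stereo.
def pcm_mbps(rate_hz, bits, channels=2):
    """Raw PCM bitrate in megabits per second."""
    return rate_hz * bits * channels / 1e6

cd   = pcm_mbps(44_100, 16)    # 16/44.1 source
mega = pcm_mbps(705_600, 32)   # 32/705.6 upsampled output
print(f"{cd:.2f} Mbps -> {mega:.1f} Mbps ({mega / cd:.0f}x the USB traffic)")
```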

  3. Thanks again for discarding more nonsense. As for the reduction of real (acoustic) noise, I have recently replaced the ageing and noisy desktop computer in my home office that is also used for streaming music while I am at work. The new one is based on a 10th generation i7 Intel NUC with plenty of memory and a large SSD drive, in a fanless Akasa Turing FX case. It saves me quite a bit on my electricity bill, and the complete silence is so wonderful. It took a long time for fanless computers to become powerful enough for almost anything, but that time has now come.

    1. Absolutely Willem,
      I remember the days of brutally noisy machines that sounded like hair dryers, when one had to go out of one's way to get massive heatsinks and such just to achieve a little bit of tranquility.

      Things are much, much better these days - a welcome improvement - when even the supplied stock CPU fans don't sound too bad in a decent case!

      Enjoy!!!

  4. Forgive another naive question from a recovering subjectivist, but is there any reason why software implemented taps (PGGB) would be any better, or worse, or different, from hardware implemented taps (Chord)?

    1. Hey Don,
      I suppose there could be differences in the precision of the mathematics doing the job in the various implementations.

      32 to 64-bits internally would be plenty...

      PGGB does its calculations independent of playback, so it has all the time in the world to get the job done with precision. Something like the hardware M-Scaler needs to get its 1M-tap work done in realtime.

      HQPlayer also does its processing during playback, so you'll need to balance your computer's speed against the settings used. Quite often folks need hefty machines and even offload processing to the GPU.
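
To get a sense of why realtime mega-tap filtering demands so much hardware, here's a rough order-of-magnitude sketch in Python (illustrative arithmetic only - neither HQPlayer's nor the M-Scaler's actual algorithm): direct convolution of a 1M-tap filter at 705.6kHz versus an FFT-based (overlap-save) approach.

```python
import math

# Rough cost of a 1M-tap FIR at 705.6kHz: direct convolution vs.
# FFT-based (overlap-save) convolution. Order-of-magnitude only.
taps = 1_000_000
fs   = 705_600

direct_macs = taps * fs                      # multiply-accumulates per second
fft_block   = 2 * taps                       # typical overlap-save block size
fft_macs    = fs * 3 * math.log2(fft_block)  # crude per-sample FFT cost estimate

print(f"direct: {direct_macs / 1e9:.0f} GMAC/s, "
      f"FFT-based: {fft_macs / 1e9:.3f} GMAC/s")
```

Direct convolution at those rates is hopeless on a CPU; FFT-based convolution is what makes it feasible at all, but it still adds latency and memory pressure.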

      Obviously, I think this is in general audiophile esoterica... Fun to play with I suppose as "icing on the cake" to "perfect" the audio system ;-).

    2. Thanks! I'm looking forward to your thoughts on the Mojo.

  5. Stereo Lab is another program that does upsampling, though that's the least of its interesting capabilities. I'd be interested to know what you think of it.

    http://www.phaedrus-audio.com/phaedrus_stereo_lab.htm

    1. Cool looking software Jonathan!

      Time permitting, I'll have a look... I notice that it's Mac software. I'm typically a Windows guy, so I'll need to make a special effort to get this running ;-).

  6. This is scientific level, it is beyond me why you don't charge for reading this.

    1. Greetings El O,
      Nah, no need to take anyone's money! Happy to just skim some advertising dollars off AdSense or if folks buy something off Amazon, get a few pennies ;-).

      More interesting, I think, to engage with the audiophile hobby and explore the beliefs and philosophies behind the claims.

  7. ‘I'll leave you to decide if this removal of phase correction is a "good" thing.’
    It looks like he’s converting the filter to use linear rather than minimum phase. Since the theory is that room correction is seeking to invert minimum phase errors in the room/speaker interface, it needs to use minimum phase filters. So this one’s easy: it’s a bad thing, as he’s lost sight of the fundamental purpose of the filtering. Linear phase is _not_ always better. Whether you’d hear the difference is another question, of course.

    1. Yup, well put Charles.

      To say it another way, a single speaker driver is a minimum phase system. If we are to "fix" temporal errors in our typical multi-speaker crossover systems / rooms at the listening position, those phase corrections are necessary.

      True, whether we hear the difference may vary for each listener. I know that many folks have a strong belief that time-domain performance is just as important as - or sometimes even more important than - frequency-domain performance (I don't generally believe this is true). Surely those folks must then question PGGB's handling of room correction filters!

    2. I measured my speakers myself using REW and made a filter in rePhase to correct both the phase of the speakers and the frequency response of the speaker/room interaction. Just for fun, I made 2 filters in rePhase: the first corrects the frequency response only, and the second corrects both frequency and phase. Listening to music, I compared these two filters. I can definitely say that the phase correction, at least to my trained ears, is very noticeable spatially. Without the phase correction, the musicians sound more forward (they step up closer to me) and not as layered (echeloned in depth) as with the correction turned on.

  8. Greetings March Audio,

    Indeed this is an important topic to discuss further. Maybe let's take this on in another post ahead this summer ;-).

  9. I know you have measured piCorePlayer many times. Have you ever measured Pi-based mpd players such as MoOde? I was shocked that mpd sounded different from piCorePlayer on the same Pi into the same DAC. I generally take the position that anyone claiming these differences is deluded, so a test would be most welcome.

  10. Hi Unknown,
    I have not specifically measured the difference between different Pi players. Admittedly, I have not checked out mpd; is there any indication that it's not bit-perfect like the ones I have used over the years - Volumio, piCorePlayer, RoPieee?

    I have not noticed any difference with those 3.

  11. As far as I know, Volumio is also based on mpd unless the squeezelite plugin is installed. They should all be bit-perfect, but there are so many claims of better sound from mpd despite this. I did try running some tests myself using RMAA and couldn't see any difference in frequency response, so I suppose that says it all. After all, people also claim USB cables sound different 😀

  12. Hi, you linked to WaveSpectra in your article; however, the site has long since gone offline. Having spotted this, I mirrored the entire site, most of which I recovered from the 'Wayback Machine' - the link is here: https://gtkc.net/wavespectra-and-wavegene-mirror
    Cheers, and keep up the good work!
