Today's post is a bit of a continuation from last time's look at the different types of upsampling anti-imaging playback filters using my Raspberry Pi 3 "Touch", piCorePlayer and SoX. As you can see from that discussion, across the audiophile equipment spectrum, manufacturers utilize all kinds of digital filter settings in their gear. Each company ends up choosing compromises between how much frequency roll-off, how much imaging, how much temporal/phasic anomaly each would accept. And of course no matter what a company chooses, there are ways of advertising the decision as "good"; whether it be on the basis of frequency spectral accuracy, temporal accuracy, or just claims from pure subjectivity - "it just sounds better"!
The end goal of audiophilia is a bit like the modern interpretation of Goldilocks (and the Three Bears)... We're all trying to figure out for ourselves what is "just right" as we wade through the commercial and mainstream audiophile literature, unofficial blogs and forums, mix-and-match speakers with amplifiers, try out different accessories perhaps, and the like. So too it seems with digital filters and all the variants attached to the DACs we buy.
Remember that the only reason we're even talking about this is because of that 44.1kHz (and to a lesser degree 48kHz) samplerate such that the Nyquist frequency is at 22.05kHz; relatively close to the usual 20kHz upper limit of hearing acuity that the younger ones among us might be able to perceive. This is literally the only reason for all the hand wringing and millions of spilt keystrokes over the years around filtering by audiophiles (the few who still obsess over this...)
These days, we essentially have 2 major options for filter "types" among the DACs out there... Linear phase (the default for most mainstream DACs, Chord) or Minimum phase (Apple, MQA, Pono) - pick your "poison" :-). Of course within each phasic variety we have different levels of steepness and allowance for ultrasonic imaging. We intuitively know that due to the biological phenomenon of auditory masking, maximum phase (where the group delay is pushed forward so "pre-ringing" is accentuated) is not desirable. But is there another choice?
Yes, there is of course... We can try to figure out a "just right" state with intermediate phase settings. Accepting that maybe there's some value to ensuring that pre-ringing isn't an issue even with some of the worst audio recordings out there, while maintaining awesome frequency and temporal accuracy - let me show you my choice for the filter that I listen to daily with the Pi 3 streamer...
As I mentioned previously, I sold the excellent Oppo Sonica DAC. As a result, I'm back to my "tried and true" TEAC UD-501 DAC in use since 2013. This is just fine because the TEAC can upsample to 352.8/384kHz which will defeat the built-in filter of the dual internal TI PCM1795 DAC chips.
Archimago's suggested "Goldilocks" filter settings...
After some listening, tweaking, and analysis, here's the upsampling setting I've been using over the last couple of months for "reference" listening:
Max sample rate:
As you can see, I have no fear of steep filtering. I want technically accurate, flat frequency response all the way to 20kHz at least. This is why in the piCorePlayer setting above, I've set "passband_end" to 95% (20.95kHz with 44.1kHz sampling) and "stopband_start" to 105% (23.15kHz with 44.1kHz sampling). A little bit of imaging to about 23kHz isn't a problem; it's attenuated to a certain degree with the sharp filter slope by 22.05kHz as we'll see later (not to mention further attenuated by one's tweeters and ears).
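As a quick check of the arithmetic above, here is a minimal sketch that converts the two percent-of-Nyquist settings into absolute frequencies. The function name `band_edges_hz` is made up for illustration; it is not part of piCorePlayer or SoX.

```python
def band_edges_hz(sample_rate_hz, passband_end_pct, stopband_start_pct):
    """Convert percent-of-Nyquist filter settings into frequencies in Hz."""
    nyquist_hz = sample_rate_hz / 2.0
    passband_end_hz = nyquist_hz * passband_end_pct / 100.0
    stopband_start_hz = nyquist_hz * stopband_start_pct / 100.0
    return passband_end_hz, stopband_start_hz


# The 95% / 105% settings from the post, for both common base rates.
for rate in (44100, 48000):
    pb, sb = band_edges_hz(rate, 95, 105)
    print(f"{rate} Hz: passband ends {pb:.1f} Hz, stopband starts {sb:.1f} Hz")
```

For 44.1kHz material this gives 20947.5 Hz and 23152.5 Hz, which round to the 20.95kHz and 23.15kHz quoted above.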
As usual, I've put in a -4dB attenuation, and 28-bits precision is more than enough. Notice however that the "phase_response" is 45; an "intermediate" setting between 0 minimum and 50 linear.
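To put the -4dB figure in perspective, the sketch below converts a dB attenuation into a linear gain and into the equivalent number of bits given up. The helper names are hypothetical, and the only assumption is the usual rule that one bit corresponds to 20·log10(2), roughly 6.02 dB.

```python
import math


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def bits_equivalent(gain_db):
    """Express an attenuation in dB as a fraction of a bit (1 bit ~ 6.02 dB)."""
    return abs(gain_db) / (20.0 * math.log10(2.0))


attenuation_db = -4.0
print(f"linear gain: {db_to_linear(attenuation_db):.3f}")
print(f"resolution given up: {bits_equivalent(attenuation_db):.2f} bits")
```

A -4dB attenuation scales samples to about 0.63 of full scale and costs roughly two thirds of a bit, which is small next to 24-bit output and 28-bit internal precision.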
Here is the Digital Filter Composite (DFC) graph with these settings recorded directly from the TEAC UD-501:
Nice, right? A relatively steep filter extending beyond 20kHz. Flat looking frequency response up to 21kHz. No significant imaging (a few noise peaks and minor irregularities in the ultrasonic noise floor using "real life" hardware). Remember, for my DFC graphs I include wideband noise of 0dBFS that will trigger any evidence of intersample overload; much more demanding than the typical -4dBFS wideband measurement you see in reviews like in Stereophile.
Let's zoom into the transition band with the 0dBFS signal:
As you can see, with these settings, we have a very flat frequency response effectively to 21.5kHz. By Nyquist at 22.05kHz, there's ~6dB attenuation. The settings allow a small amount of imaging beyond the Nyquist frequency which is well contained - close to -30dB suppression by 22.5kHz. I doubt my dog/cat/bat is going to mind. My rationale for allowing a little bit of imaging is that it reduces the filter length (shorter impulse ring duration) and the computational power needed, while effectively extending the frequency response beyond 20kHz. In other words, if you are human, "golden ears" or otherwise, this filter is guaranteed to be transparent within the frequencies of interest and extracts essentially everything a 44.1kHz signal has to offer.
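The filter-length argument can be illustrated with fred harris's rule of thumb for FIR length, N ≈ (fs / Δf) · (A / 22), where Δf is the transition width and A the stopband attenuation in dB. This is only a rough sketch under stated assumptions: a single-stage FIR running at 352.8kHz and a hypothetical 170dB attenuation target. SoX's real implementation is multi-stage, so the absolute tap counts will differ and only the ratio is meaningful.

```python
def estimate_fir_taps(filter_rate_hz, transition_width_hz, stopband_atten_db):
    """fred harris rule of thumb: taps ~ (fs / transition width) * (atten_dB / 22)."""
    return filter_rate_hz / transition_width_hz * stopband_atten_db / 22.0


NYQUIST_HZ = 22050.0      # Nyquist of the 44.1 kHz source
OUTPUT_RATE_HZ = 352800   # assumed rate at which the filter runs
ATTEN_DB = 170.0          # hypothetical stopband attenuation target

# Transition from 95% to 105% of Nyquist (some imaging allowed).
wide_hz = NYQUIST_HZ * (105 - 95) / 100.0
# Transition from 95% to 100% of Nyquist (no imaging allowed).
narrow_hz = NYQUIST_HZ * (100 - 95) / 100.0

taps_wide = estimate_fir_taps(OUTPUT_RATE_HZ, wide_hz, ATTEN_DB)
taps_narrow = estimate_fir_taps(OUTPUT_RATE_HZ, narrow_hz, ATTEN_DB)
print(f"95-105%: ~{taps_wide:.0f} taps, 95-100%: ~{taps_narrow:.0f} taps")
print(f"ratio: {taps_narrow / taps_wide:.1f}x")
```

Under these assumptions, letting the stopband start at 105% instead of 100% doubles the transition width and so roughly halves the estimated filter length.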
Bottom line, humans will not hear the imaging artifact. Heck, even my worst DR0-2 recording - Iggy Pop's 1997 remaster of Raw Power - doesn't contain higher than -60dB content from 20kHz up...
So, what does the impulse response look like?
This is what happens when we go for an intermediate phase setting; even a relatively small change in the "phase response" parameter from 50 to 45 with SoX. The impulse becomes asymmetrical with pre-ringing still present but significantly reduced compared to an equivalent linear phase setting. Concomitant to this change would be that the post-ringing is also slightly stronger and extended in duration than linear phase... But remember, post ringing if present isn't typically an issue due to auditory masking.
Remember that a strongly anti-imaging minimum phase reconstruction filter introduces phase shifts in the audio spectrum; something we typically do not want to see because that is indicative of a temporal anomaly (even though humans generally are not sensitive to this as discussed last time). So, how does an intermediate phase setting biased towards linear phase as suggested here look in terms of phase shift?
As you can see, not bad at all! We're basically dancing around the 0° of linear phase, with relatively little change all the way to 20kHz compared to a minimum phase filter of the same steepness.
For completeness, I can demonstrate that there are no issues at all with typical DAC measurements of frequency response, noise, distortion...
And of course these days with asynchronous USB transmission, we need not worry about jitter even when upsampling using an inexpensive Raspberry Pi 3 connected to the TEAC UD-501 DAC, transmitting 24-bit/352.8kHz audio data over a standard generic 6' USB cable.
In case you're wondering, I'm even using "CRAAP" undervolt and underclock settings with the Pi 3 "Touch" for these RightMark 6.4.2 Pro and jitter measurements :-). So even if the speed/voltage tweak doesn't affect sound, at least you know you're saving energy and the Pi is producing less heat.
Speaking of underclocking, notice that it still doesn't take much processing power:
10.5% of the Pi 3's processing resources used which includes running the touchscreen GUI (jivelite). That's pretty well a peak number as most of the time it's streaming the playback with 7-10% processor utilization.
Summary...
There you go, my take on selecting preferred settings, taking into account some of the parameters that we can play with in piCorePlayer/SoX. This is what I'm listening with these days as I aim to be a rational audiophile wading through the literature out there and the claims in audiophile-land.
What I've shown here is my take on "Goldilocks" digital filtering. Settings which will:
1. Not compromise on frequency response - flat to 20kHz and a little beyond (with 44.1kHz sampling rate). IMO, it's better for the digital filter to be accurate and not have it roll-off early as some kind of "tone control" which we see in some very short impulse response filters (like PonoPlayer). I would rather be deliberate and either use an EQ for this purpose or DSP room correction with a target curve that rolls off the highs if desired.
2. Achieve excellent anti-imaging properties. What's the point of using an anti-imaging reconstruction filter if it's leaky and spills all kinds of ultrasonic noise which can in turn result in audible intermodulation distortions? I refuse to compromise on this (no thanks, MQA - as per the example from Beyoncé).
3. Allow plenty of overhead to prevent intersample overloading during playback (simply -4dB attenuation in SoX does the job - this is not a significant loss of resolution for a high resolution DAC these days). If there is one factor I would love to see in future generations of DAC chips, it would be this being accounted for by default. (Over the years, Juergen Reis has commented about this as well.)
4. Minimize pre-ringing potential whenever we run into poorly engineered albums containing "illegal", poorly bandwidth limited samples. For example, "modern", synthetic, highly dynamically compressed and clipped music containing square waves. Intermediate phase setting achieves this.
5. In achieving (4), the filter will not cause significant timing/phase shifts. This is achieved by biasing the intermediate phase setting towards linear phase.
Want to "see" this filter in "action" if one ran into some poorly engineered albums containing nasty 750Hz square waves?
"Goldilocks" to my eyes (and ears) :-). Notice the precise leading edges ("attack") with the intermediate phase and linear phase settings. Notice the more rounded, less acute slope of minimum phase upsampling. All the while, notice a clear reduction of pre-ringing by going intermediate phase and pushing a little more of the energy into the post-ring. But by not completely pushing all the energy into the post-ring timeframe as with minimum phase, we also see that the ringing settles down quickly through the square wave "plateau".
In a similar way, we can check out the filter's handling of something a little more challenging - some rogue 3kHz square waves:
This time, notice that I underlaid an image of the very simple, essentially non-ringing "NOS-like" cubic interpolation filter from SoX (see last week for more information on this "ultralax" NOS-like filter). Remember that the cubic interpolator is non-bandwidth limited and in this example can be used as a comparator to examine the time-domain accuracy of the waveforms. We can clearly see the temporal effect of minimum phase in this example. Notice how the leading edges of the square waves are shifted forward in time! Again, my intermediate phase suggestion takes the middle ground - less pre-ringing while staying much more like the temporally accurate linear phase, with good, clean rising/falling edges, although there is a very small temporal shift forward.
Something I've noticed when playing with filters is that for minimum phase settings, because there's no pre-ring "release" of energy, the post-ringing amplitude tends to be more intense. In the 3kHz waveforms above, I only used -2dB attenuation with SoX and notice that there were a few clipped samples with the minimum phase filter setting, whereas this was not the case with linear and intermediate phase. I think this is a good reminder that DACs that use minimum phase filtering need to be a bit more careful to provide overhead for intersample overload (like MQA where the filters indeed overload!).
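Intersample overload in general is easy to reproduce numerically. The sketch below is a textbook illustration, not one of the measurements above: a sine at one quarter of the sample rate, offset by 45° so that no sample lands on a crest, is scaled until the stored samples just touch 0dBFS. Because the tone is band-limited, an ideal reconstruction filter would recreate the underlying sine exactly, so evaluating that sine between the samples stands in for the oversampled output.

```python
import math

PHASE_OFFSET = math.pi / 4.0  # 45 degrees, so no sample lands on a crest


def tone(t_in_samples, scale):
    """A sine at one quarter of the sample rate, evaluated at any time."""
    return scale * math.sin(math.pi / 2.0 * t_in_samples + PHASE_OFFSET)


def peak_overshoot_db(num_samples=64, oversample=8):
    """Compare the peak of the stored samples with the peak between them."""
    # Scale so the stored samples just touch full scale (0 dBFS).
    scale = 1.0 / math.sin(PHASE_OFFSET)
    sample_peak = max(abs(tone(n, scale)) for n in range(num_samples))
    # Evaluate the same band-limited waveform on a finer time grid.
    true_peak = max(abs(tone(k / oversample, scale))
                    for k in range(num_samples * oversample))
    return sample_peak, true_peak, 20.0 * math.log10(true_peak / sample_peak)


sample_peak, true_peak, overshoot_db = peak_overshoot_db()
print(f"sample peak {sample_peak:.3f}, true peak {true_peak:.3f}, "
      f"overshoot {overshoot_db:+.2f} dB")
```

Even this simple tone reconstructs about 3dB above the highest stored sample, which is why a few dB of attenuation before upsampling is cheap insurance.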
Finally, here's what 10kHz "square waves" look like with a 44.1kHz samplerate (-3dB overhead attenuation):
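The quotation marks around "square waves" are deserved. An ideal square wave consists of its fundamental plus odd harmonics only, and a properly band-limited 44.1kHz signal can only keep the ones below 22.05kHz. A minimal sketch (the function name is made up for illustration) that lists the surviving harmonics for the three test frequencies used in this post:

```python
def square_wave_harmonics_hz(fundamental_hz, sample_rate_hz=44100):
    """List the odd harmonics of a square wave that fall below Nyquist."""
    nyquist_hz = sample_rate_hz / 2.0
    harmonics = []
    k = 1
    while fundamental_hz * k < nyquist_hz:
        harmonics.append(fundamental_hz * k)
        k += 2  # square waves contain odd harmonics only
    return harmonics


for f0 in (750, 3000, 10000):
    kept = square_wave_harmonics_hz(f0)
    print(f"{f0} Hz: {len(kept)} harmonic(s) below Nyquist, highest {kept[-1]} Hz")
```

At 750Hz there are 15 components to shape the edges, at 3kHz only four, and at 10kHz just the fundamental, so a correctly band-limited 10kHz "square wave" is a sine.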
Well guys and gals, give this filter suggestion a try and tell me what you hear/think! Like I said, this is my attempt at a filter setting that is "just right" - suppressing the "detestable" pre-ringing with bad recordings, while maintaining excellent frequency and temporal domain accuracy with good recordings.
If you have a favourite setting, feel free to share your thoughts and settings.
Next time, let's look back at some audiophile history, look at some "real" music samples and think a little more about digital filters and contextualize the "detestable" ringing.
Hope you're all enjoying the music. Wishing you all a joyful, healthy, prosperous, euphonic and wisdom-filled 2018...
A friend's son is doing a grade 9 school project looking at perceptions around "healthy foods". If you have 5 minutes, he'd love to have your submission for a survey to be used in his science fair project. I'm sure he'll appreciate input from around the world! Always good to promote critical thinking, analysis, and scientific inquiry for the next generation:
Only audiophilia could have dreamed up such a non-issue to worry about! (and make money from the consequent FUD).
None of us will ever hear the difference between filters that are halfway to meeting the requirements of being actual reconstruction filters. On that basis I will never consciously deviate from the one, true, mathematically-correct filter that the originators of digital audio created: linear phase. Everything else smacks of a failure to understand how digital audio works; turning it into just another branch of seeking our preferred 'euphonic distortion'.
Yes, ultimately you are correct TRA. In fact, I've written most of what I need to express for next week and we are in synchrony on this :-).
In an ideal world, there would be no variation except for slight changes in steepness / rejection amounts for a linear phase filter!
Hi Archimago. I am the poster of this post:
There is an attachment (TestFiles.7z) at the end of the post. Could you do me a favor by testing the highest non-clipping volume of SquareSweep.wav in the archive, with some of your equipment, like your Teac, Oppo, SMSL... by using their built-in default filter without any external resampler? No need for DFC graphs, numbers (dBFS) are okay. Thank you very much and happy new year!
Sure, I'll run it through the TEAC at least and let you know...
Clearly an insanely intense test of intersample overloading!
Thanks. Don't know if you can read Chinese or not (since you posted a blog about a sighted listening test in a Hong Kong film) but I made a similar SquareSweep file 11 years ago in a Taiwan forum:
The post talked about avoiding clipping when using a resampler. At that time there were still a lot of AC97 soundcards that only supported 48k-based rates and not 44.1k-based rates. Resamplers were crucial to bypass those subpar OS/driver/soundcard DSP resamplers. Now in 2018, people use resamplers for what? For bypassing or simulating those deliberately crippled DAC filters? To shill some "revolutionary" audio formats? To "redefine" Nyquist? How depressing.
In fact, those "relaxed" filters are pretty common on ancient game consoles and emulators to conserve processing power, for example the Super Nintendo:
...and Playstation 2:
How funny that people want their $$$ DAC to use filters like that.
Very cool work Dtmer!
I had a look at the forum post from 2006 as well... Alas, having spent most of my life in North America, let's just say my ability to read Chinese requires the intellect of Google Translate these days :-).
Yes, very good point about the tragic use of various poor resampling algorithms! It's sadly taking the audiophile world backwards with objectively poorer fidelity using these "relaxed" poorly antialiasing filters.
Thanks for the link on the old video game systems. Hilarious and again tragic that the "high end" is actually finding inspiration from the retro and repackaging what amounts to the type of quality one used to work with because of technological limitations!
I downloaded and tried your 16/44 square sweep with a few of my DACs. This will cover the 3 main "families" of the better DAC chips out there... Basically how much attenuation to see absence of overloading:
SMSL iDEA (ESS ES9018Q2C DAC) = -4.75dB
AudioEngine D3 (AKM AK4396) = -5dB
TEAC UD-501 (TI PCM1795) = -6dB
Boy, that's a nasty signal that really brings out the limits of the DAC's internal filtering! Each of the DACs was connected to my Surface Pro, which sent bit-perfect audio through WASAPI with different amounts of attenuation while I monitored the FFT. The attenuation values listed are what was needed to see no evidence *at all* of overload, even though, for example, the TEAC UD-501 was already very close at -5.5dB.
That SquareSweep file needs around -5.75dB attenuation with SoX -b 95 using linear phase to prevent any clipping when upsampled to 352.8kHz.
One heck of a test!
Thanks for the test. So unless explicitly specified (e.g. 3.5dB for Benchmark DAC2/3), no one should expect their DACs to have added headroom. It seems that DACs equipped with reasonable filters do have similar highest intersample peaks since the margin of error is not too big (~1dB) even for a pathological test file. In real music I guess the difference could be reduced to about 0.5dB only.
I usually reduce 3dB in foobar when listening to loudness war songs, as suggested by TC Electronics and Alexey Lukin from iZotope.
Quick question: What is the purpose of including all sample rates in the Max Sample Rate field? Does this prevent resampling?
Right, you don't need to include all sample rates there. It just lets piCorePlayer know what's available. If I just wanted to upsample to the highest integer value, I'd just include "352800,384000". That will do the job.
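The "highest integer value" idea can be sketched in a few lines: given the rates the DAC accepts, pick the largest one that is an exact multiple of the source rate. This only illustrates the integer-ratio logic; it is not piCorePlayer's or squeezelite's actual rate-selection code, and the function name is hypothetical.

```python
def highest_integer_multiple(source_rate_hz, supported_rates_hz):
    """Pick the highest supported rate that is an exact multiple of the source."""
    candidates = [r for r in supported_rates_hz if r % source_rate_hz == 0]
    return max(candidates) if candidates else None


# The two rates suggested above for the Max Sample Rate field.
supported = [352800, 384000]

for source in (44100, 48000, 88200, 96000):
    target = highest_integer_multiple(source, supported)
    print(f"{source} Hz -> {target} Hz (x{target // source})")
```

With just those two entries, the 44.1kHz family lands on 352.8kHz and the 48kHz family on 384kHz.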
Thank you very much for detailed description and suggestions, Archimago!
Slightly changing the phase is definitely an interesting idea. But Linear Phase is the safe standard.
Recently I have been in doubt whether allowing aliasing is good. Yes, it makes the sound of resampled files at 44.1 a tiny bit "more attractive", and filter passbands can be longer, but in my listening tests I finally realised that in many cases this is not part of the original sound. I know that it is > 21 kHz etc., but on the other hand these aliases are new, post-mastering and completely unnatural elements of the sound. I suspect they can even harm the timing of the original signal. Therefore I now use resampling without any aliasing.
It is very subjective but I found out that it is better for my playback chain to have a slightly higher passband than 95, e.g. 95.4 in SoX or 95.2 in Ferocious Resampler. Ultrasteep passbands like 99 or 99.5 also sound nice on many albums (tested last week) but I do not use them now, since their effect is not so predictable for me and during offline conversions they require a lot more time.
This is not SoX's case with online upsampling, but when doing offline upsampling to 24 bit 88.2/96 kHz, I found out that it can be beneficial to apply e.g. violet noise TPDF dither (bigger noise shaping is not necessary). Since the conversion is done in 32/64 bit float (regardless of input bit depth), when we save upsampled files, we reduce the bit depth from 32/64 float to 24 integer. I know that we are hovering around -140 dB, but since the files are then upsampled further by the DAC, the quantization errors can multiply a little during the subsequent DAC upsampling. And 24 bit dither does no harm.
I welcome your approach to audio that you do on this blog and hope the fruitful discussions and experiments here shall continue.
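For readers who have not met the dithering step mentioned in the comment above, here is a minimal sketch of plain TPDF dither applied while reducing float samples to 24-bit integers. It is a simplification: the dither is flat (white) TPDF, not the violet-noise-shaped variant the commenter describes, and the function name is hypothetical.

```python
import random


def quantize_24bit_tpdf(samples, rng):
    """Quantize floats in [-1.0, 1.0) to 24-bit integers with TPDF dither."""
    full_scale = 2 ** 23
    out = []
    for x in samples:
        # Difference of two uniform values: triangular PDF spanning +/-1 LSB.
        dither = rng.random() - rng.random()
        value = round(x * full_scale + dither)
        # Clamp to the signed 24-bit range.
        value = max(-full_scale, min(full_scale - 1, value))
        out.append(value)
    return out


rng = random.Random(0)  # fixed seed so the run is repeatable
# A constant signal sitting at one quarter of a 24-bit LSB.
quarter_lsb = 0.25 / 2 ** 23
dithered = quantize_24bit_tpdf([quarter_lsb] * 20000, rng)
print(f"mean of dithered output: {sum(dithered) / len(dithered):.3f} LSB")
print(f"plain rounding gives:    {round(quarter_lsb * 2 ** 23)} LSB")
```

Without dither the quarter-LSB signal rounds to zero and vanishes; with dither its average survives as low-level noise, which is the point of dithering before truncating to 24 bits.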
Just to add to my previous post: from some discussions on the internet and also SoX's manual, I was under the impression that ringing occurs mainly within the filter's "working range" - in this case 95-105, in my case 95.2(95.4)-100. But you are showing examples where it influences even lower frequencies. Do (can) the signals you use occur in normal music?
Thanks for the feedback Honza.
Yeah... The way I see it is this, just like in life we likely have many jobs we could be good at, or there not being the "perfect" soulmate to marry :-). So too these filter settings.
It really depends on what we're listening to. If I listened to harsh metal/rock/pop all the time, then maybe there's something to be said about an early roll-off filter like say the PonoPlayer. But if we want to be rigorous and achieve the best frequency and temporal resolution, then there really is no option but to stick with linear phase and go steep.
I'll go into more detail next time. But let's resolve in 2018 as audiophiles to remember that:
1. The "ringing" from a typical linear phase filter is at Nyquist! We are generally not going to hear it when it happens since this is up at 22kHz! So even if it's present, it's really not a big deal.
2. Good recordings do NOT ring! There should be no square waves like I showed above for the filter to trip over when we're listening to music *meant* to be reproduced in high fidelity.
We'll talk a lot more about this next time and I will show some more waveforms and how they look when processed through filtering... IMO, I do not mind mild compromises like the intermediate phase setting above as an acknowledgement that ringing does occur and we can do small things to limit it. My issue is more with companies over the years who have done crazy things with filtering that actually make temporal inaccuracies worse, mess up frequency response, or ignore aliasing!
Thank you for your reaction.
When ringing is primarily above the passband (or near Nyquist), it is not a problem for frequency response/hearing and can only slightly harm transients from a timing point of view. And I think that we are used to it; all 44.1 kHz CD recordings have it. To minimize it, the best approach would be to go to 48/88.2/96 kHz.
From what you write, it seems that your approach to filtering is reasonable, with interesting options to try.
But as an alternative I would suggest, as I wrote, not introducing aliasing at all. In the past I used aliasing; it very slightly and nicely colorized the sound, making it brighter. I even used combination 92/9 in Resampler, which sounds really great with very slight aliasing. But recently I stopped avoiding steep filters, and thus there are not many reasons left to keep aliasing on, and I do not like the fact that it introduces completely new frequencies into an already mastered recording.
I have been a fan of earlier roll-off filters but in the light of this knowledge I do not prefer them anymore. The only benefit we can get from them is lower ringing, but many more frequencies are "touched" by them and from the theoretical point of view they are far from an ideal brickwall.
Yes, I agree that this is a potential issue only with 44.1 kHz. Even at 48 kHz, when we introduce ringing e.g. above 23 kHz, it is completely out of range. We can remember that DSD massively noise shapes above circa 25 kHz and no harm is perceived.
One more controversial idea: while SoX upsampling to the highest possible frequency is perfectly OK, one could also consider upsampling only to 88.2 just to bypass the DAC's filtering at 44.1 kHz (which we usually cannot adjust and which can be of lower quality); a good quality DAC will take care of the rest reasonably well.
I emphasize that those discussions are primarily "academic", actually the differences are pretty subtle and other readers can try what will fit best for their playback chain.
On the other hand, I also agree that if we want to find a setting different from the "standard" one, what you write in this post is suitable, because you are dealing with both "problematic" elements very reasonably:
1) you allow aliasing but only to the extent that should be completely inaudible (given the frequency and volume)
2) you adjust the phase the slightest possible way to change the impulse response on test signals in a positive way
While generally I am now not convinced that those adjustments are necessary, I know that even my opinions in this area have been changing, and especially on some playback chains it may be valuable to have multiple "good" options available to try.
BTW, two nice "pictures of ringing energy" are here
Digital Filter and Intersample Overload
Hi Archimago: Very nice post. Clear and Clean thinking. Nice.
Hi Dtmer: Could you offer your square wave sweep files as a download in a format other than 7z? I would also be interested in testing these files, but when I try to extract them, my Mac reports that the checksum for the wave files is missing (though I was able to extract the PNG files). Thank you.
Happy New Year man!
Here is a link to the flac encoded SquareSweep.wav
Notice the compression of flac is much poorer than 7z since it is not a typical audio file :P
***Be careful not to accidentally play the file while downloading***
2 great articles on digital filtering (as always). Now I switch between Chord and others on the cheap. If only we could simulate the dCS ring DAC using this technique.
Have you taken a look at the foobar SoX plugin? Its GUI is somewhat different from the parameters piCorePlayer uses. Could you elaborate on how your Goldilocks parameters translate into the foobar/SoX GUI so the Windows crowd can join the party?
Thanks a lot for sharing your ideas. They are really a breath of fresh air.
I wish you and everyone around here all the best for 2018!
For foobar sox Archimago's settings mean passband 95, allow aliasing and Phase 45.
Alternatively I suggest from my experience passband 95.4, no aliasing and Linear Phase, or the same with passband 99 (for believers in ultrasteep filtering - debatable).
The differences are very subtle, if perceivable and the effects also depend on whole playback chain.
369473 Intersample Overs with 5.24 dBr True Peak to Peak Ratio
Hi Dtmer HK
Wow, what a test file. Thank you for that. Besides my main work, I also do some recording and mastering, and I am an Apple certified mastering engineer (and so have measurement tools). I can report that your file does have the above-mentioned number of intersample overs (and this with a file length of just 10 seconds), the above-mentioned ratio of True Peak (intersample peaks) to Native Peaks, and no, absolutely no, Native Peak clipping. So everything is intersample overload clipping. Wow.
Glad to know that the dBTP your software reported is pretty close to real world hardware. Which plugin/tool did you use to obtain that value?
I realize there are some debates about "true peak" since the "truth" depends on the filters being used. Usually the steeper the filter, the higher the dBTP. ITU BS.1770 has a reference upsampling filter to estimate real world hardware performance but different software plugins still report dBTP differently.
Again, thanks Archimago for carefully picking converters from different manufacturers to make a more diversified comparison.
Hi Dtmer Hk
Yes, different ITU BS.1770 tools do report different True Peak values, as they use different oversampling filters to create the waveform for analysis. MusicScope (from XiVero) gave the lowest value for your file and LCAST (from MeterPlugs) gave the highest. Audition (from Adobe) landed in the “golden” middle, and this is where I have the True Peak to Native Peak ratio from. So: -1 dBFS Sample Peaks (meaning no sample clips and every sample is at least 1 dB below the maximum level) but +4.24 dBFS True Peak (giving the ratio of 5.24 dBr).
The total number of intersample overs comes from the Mastered for iTunes app package from Apple. It is a very nice tool, as it analyzes every sample for sample peaks (native peaks), analyzes the intersample overs, and creates an output wave showing the waveform of the source in one channel and the occurrences of the intersample overs in the other.
There are some mastering tools that I use where you can adjust which oversampling filter ratio you are simulating to test for intersample overs. Beyond that, you can simulate different lossy codecs, so that even after converting to OggVorbis, AAC, or MP3, you have no native and no intersample overs.
The Limiter from iZotope Ozone is very good, very transparent and reliable. With that you can be sure that you have no sample overs and no intersample overs. And believe me, even with high quality mastered recordings, you do need Limiters (and not only for highly compressed “music”).
PS: Are you living in Hong Kong? As I am visiting HK every year for the HK AV Show and do some vacations afterwards.
Yes I live in Hong Kong, an Asian who speaks Cantonese. I visited the annual AV show several times, especially in the 90s when there was no internet access and my knowledge was skewed by printed media. The last time I visited the AV show was already in 2008. I don't like the idea of forcing visitors to buy their CD as part of the admission fee since the selected songs are either not my cup of tea or I already have the same song.
While my working experiences are audio related they are not the really "audiophile grade": mobile games and toy music composer/sound designer, audio restoration technician for obsoleted media like open reel tape, 78 rpm vinyl, DAT and so on.
Hello Dtmer Hk
I am from Germany. I can only speak a little Mandarin. ;-)
Perfect, thanks for covering this topic too! I'm always using linear phase for every EQ and resampling.
BTW, are there any plans for FB/Discord group to have more open discussion about these topics?
Minimum phase is not completely evil for EQ, synthesis and related stuff, since they are supposed to affect audible frequencies, where ringing could be audible. In such cases the filter type should be based on different usage or subjective preference.
DACs and resamplers on the other hand are supposed to filter near Nyquist; minimum phase is useless for such applications.
Exactly Dtmer. The fears about ringing at such high frequencies are clearly overblown!
SUBIT: Discus and FB sound interesting... Alas, my sense is that there are many venues for discussions these days. Getting Discus into the comments section here should be easy.
Do you guys want that?
Archimago: I meant Discord channel :)
So here's what I don't understand.
By introducing a frequency-dependent phase-delay, you (and, to a greater extent, the minimum-phase guys) are screwing up the time-domain behaviour. All in the name of making the impulse response (and other "illegal" (non-band-limited) signals) "look better".
But you're not just screwing with the reconstruction of "illegal" signals. Your phase-delay applies to perfectly fine band-limited signals as well. So you're screwing up the time-domain for them as well.
Why is this a good trade-off?
Yup, you're right about introducing some phase delay that would affect well produced material... :-)
Why, you ask? Because I thought it'd be a nice way to open up discussion in a practical way. Basically to show that in 2017/2018, we can easily address the issues all around and that there's *nothing* to fear if we are rational and accept that human ears / minds can tolerate some "imperfection". To remind us that there is no great mystery that only the "high priests" of audiophilia should dictate, and that it's not clear the audiophile press even knows what they're talking about or, after all these years, has been able to advance the level of education apart from toeing the Industry's line.
We can get the job done for "free" easily whether we just do a standard linear filter or one incorporating the little compromises I make - no need to hype up FPGA's, proprietary algorithms, expensive tweaks, or claims of "de-blurring" based on "neuroscience" with no real evidence to back things up. Thinking about this and playing with it is within grasp for any of us. An inexpensive device like a Pi 3 has enough horsepower running open source software provided by folks who truly deserve praise for donating their time and efforts.
1. We can easily and accurately achieve flat response beyond 20kHz and a little aliasing isn't going to hurt if we don't want to strain computing resources.
2. The ringing is no big deal but if in some people's opinion the pre-ringing deserves to be suppressed, here's a way to do it to a significant degree such that it'd be ridiculous IMO for folks to still complain (more to come on this with the next blog post).
3. Phase anomalies are likewise not a big deal either! Otherwise people would be screaming about all those steep minimum phase products from Apple all these years and Meridian's "apodizing" filter from 2009 should have resulted in outrage from all the Golden Ears out there. I believe if the audiophile press were doing their job and honestly trying to educate audiophiles, at least *somebody* in the Golden Ear hierarchy should have voiced stronger opposition regarding phase shift/time delay.
Notice the craziness here. Meridian/MQA is now claiming that time domain is of supreme importance: "This all suggests that the time-domain acuity of the human auditory system has been more important than frequency-domain acuity" (JAS article 2015). But they were the ones promoting a relatively steep minimum phase filter in 2009 that actually made time domain worse!
Like I said, I'll have more this coming week...
First of all, let me say that I think it is incredibly cool that we can implement these filters (cheaply!) in software on a Raspberry Pi. So thank you for that.
"Meridian/MQA is now claiming that time domain is of supreme importance ... But they were the ones promoting a relatively steep minimum phase filter in 2009 that actually made time domain worse!"
There seems to be a lot of confusion about what constitutes good time-domain behaviour. Pre-ringing is supposed to be evidence of bad time-domain behaviour (it isn't), and minimum-phase is supposed to be an improvement. It isn't; in fact, it's objectively worse time-domain behaviour.
So, while intermediate-phase is better than minimum phase, I am loath to let anyone get the mistaken impression that it "improves" the ("bad") time-domain behaviour of linear-phase.
"I believe if the audiophile press were doing their job ... *somebody* in the Golden Ear hierarchy should have voiced stronger opposition regarding phase shift/time delay."
It does seem to be true that many people find the sound of minimum-phase filters more euphonious. This would hardly be the most egregious example of people finding objectively poorer performance more pleasing to the ear.
In this case, at least, I think I can hazard a guess as to why. A lot of people like the "sweeter" sound of tubes which roll off at high frequencies. Here, the high frequencies don't get rolled-off; they get (partially) "masked" by the decay of the musical notes. But the subjective effect is similar: a sweeter sound that many people will prefer.
Indeed you're right, I *don't* believe the intermediate phase setting improves time domain, it just suppresses a bit of the pre-ringing when it happens in some tracks. That's the compromise.
Yes, I suspect the "sweetness" of the roll-off is all they hear if/when some folks feel minimum phase filtering is better. But this is ONLY if they're listening to typically poorly antialiasing slow roll-off filters. A steep minimum phase response like what I showed above would not cause the frequency roll-off. Therefore it's only the Ayre/Ponoplayer "Listen" type filtering that would achieve this.
"But this is ONLY if they're listening to typically poorly antialiasing slow roll-off filters. A steep minimum phase response like what I showed above would not cause the frequency roll-off."
With a steep minimum-phase filter, the high frequencies aren't rolled off, but they are time-shifted in a way that makes them (usually) less audible (overlapping with the natural decay of the musical notes).
Indeed, that's the usually-stated rationale for the minimum-phase filter: it doesn't get rid of the "ringing"; it just makes it less audible.
What the minimum-phase filter actually delivers is a "sweeter" sound, with audibility of the high frequencies diminished (even more so for the slow roll-off minimum phase filters).
I am also not a fan of phase/time changes and personally use linear phase only. But for those who want to modify pre-ringing, Archimago's option of phase 45 is probably the best alternative. Minimum phase is overkill; the benefit of shifting pre-ringing to post-ringing does not justify the other changes it causes.
Thanks for this great blog! This is great fun, playing with these digital filters on piCorePlayer -> usb -> Oppo Sonica DAC -> XLR -> amp: Hypex NC400 -> bi-wire -> speakers: B&W CM8. PiCorePlayer is upsampling to either 705.6 kHz, or 768 kHz.
First impression of intermediate phase filter (Goldilocks) after listening just one day is that it lets the music sound real and beautiful. A real joy to listen to.
Nice gear man!
Glad to see you're making full use of the 700+kHz sampling rate that the Sonica DAC affords!
I am sorry to disturb these thoughts, but I think that a reasonable quality DAC should handle 88.2/96 kHz rates in the same or a very similar way to 705.6/768 kHz. It is clear that filtering at 44.1 (and to some degree 48) kHz has compromises and it can be better to adjust it in software rather than rely on one chosen "hard wired" implementation in the DAC, but subsequent upsampling should be done in hardware the same way as in software.
On the other hand if DAC accepts higher rate on input, why not send them - it can be also OK, if the USB pathway is free of jitter and other artifacts.
Hi Honza, I agree with you. The DAC plays hi-res material (24/88, 24/96) just fine, no need to change (or "upgrade" or "improve") that. Here I'm listening to 16/44 material to test if the different upsampling algorithms (using SoX, Raspberry Pi, USB) are better than the algorithm in the DAC. The Sonica doesn't have a filter selection, but using the Raspberry one can try e.g. minimum phase, linear phase, etc.
One thing I found is that the "Chord-like settings" (from this blog), when upsampling to 700+ kHz, are computationally intensive. The CPU utilisation of the Raspberry goes to ~30% with higher peaks. Not a number to worry about, BUT it sounds worse. Using the same filter at 300 kHz solves that. Then the next question is: should I upsample to the highest sample rate, or what is the lowest sample rate that is good enough? In my test the upsampling is done to bypass the DAC's digital filter. Another thought: would it be possible for SoX to use the Raspberry Pi's GPU for filtering?
My personal opinion is that there's no need to sw upsample original 88.2/96 recordings.
For 44.1/48 kHz I think that it is enough to upsample to 88.2/96 or 176.4/192 kHz, if the DAC supports 24 bit or 32 bit input (which most do). Upsampling to higher rates can be done but is not necessary; if it sounds worse, I'd for sure upsample to 88.2 (or 176.4) only. Some USB transports work worse with higher bitrates.
Chord-like settings (99+) may work well on some playback chains and especially on some recordings which have little content above 21.5 kHz. Personally I use 95.4 cutoff for SoX, 95.2 cutoff for Resampler.
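For readers wondering where those cutoff percentages land in Hz, here is a small Python sketch. It assumes the percentage is read against the Nyquist frequency of the 44.1 kHz source (22.05 kHz); the function name is made up for illustration and is not part of SoX or any resampler.

```python
def cutoff_hz(bandwidth_percent, sample_rate_hz=44100):
    """Convert a bandwidth percentage into Hz.

    Assumption: the percentage is relative to the Nyquist frequency
    (half the sample rate) of the source material.
    """
    nyquist_hz = sample_rate_hz / 2
    return nyquist_hz * bandwidth_percent / 100

# Settings mentioned in the comments above.
for percent in (95.2, 95.4, 99.0):
    print(f"{percent}% of Nyquist at 44.1 kHz is about {cutoff_hz(percent):.1f} Hz")
```

Under that assumption, a 99% setting leaves only about 220 Hz between the cutoff and Nyquist, which is why such filters have to be so steep.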
I do not know of a SoX build that could use the GPU.
If the Raspberry's performance is not enough, use a higher model or some Atom/AMD Sempron PC.
Agree guys, hi-res 88+kHz material does not need resampling at all but typically in a DAC still goes through the internal algorithm to whatever times upsampling the chip does.
If you're running into issues with that extremely steep filter, indeed back off to 352.8/384kHz or just relax things to 97% maybe. I remember pushing upsampling to 768kHz a few months back with the Pi 3 and the Sonica DAC while running Jivelite with the GUI to the point of causing distortions and sound disruptions.
A more powerful SBC like the ODROID C2 running squeezelite should have more processing power as well for those 99+% filters running >700kHz! IMO overkill :-).
Yes, it does go through the internal algorithm, but I think that with a good quality DAC that algorithm's results from an 88.2/96 kHz (original or upsampled from 44.1/48) sample rate are not significantly worse than those that are directly software upsampled by SoX. E.g. I think that two-stage upsampling from 44.1/48 does not make things worse, provided that the DAC supports 24 bit or 32 bit input.
If somebody is in doubt with 88.2/96, he can upsample to 176.4/192 kHz, at that rate I think we can be sure that the results will be nearly identical with higher rates. But some DACs perform better at 88.2/96 at input than 176.4/192, depends on implementation.
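The rate pairs that keep coming up in this thread (88.2/96, 176.4/192, 352.8/384, 705.6/768) are power-of-two multiples of the two base rates. A short Python sketch that lists them, with a hypothetical helper name and a maximum rate that you would set to whatever your own DAC accepts:

```python
def upsample_targets(source_rate_hz, max_rate_hz):
    """List power-of-two multiples of the source rate, up to max_rate_hz.

    Assumption: only 2x, 4x, 8x... targets are of interest, which matches
    the rates discussed in this thread.
    """
    targets = []
    rate = source_rate_hz * 2
    while rate <= max_rate_hz:
        targets.append(rate)
        rate *= 2
    return targets

print(upsample_targets(44100, 768000))
print(upsample_targets(48000, 768000))
```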
Hi, Arch. Happy New Year!
Have read pretty much all of your posts on digital filters including the DIGITAL FILTER TEST and this most recent post on the “Goldilocks” filter.
Currently using a Manhattan II DAC which unlike the Man I has 7 filter options:
FRMP (FAST ROLLOFF, MINIMUM PHASE);
SRMP (SLOW ROLLOFF, MINIMUM PHASE);
SRLP (SLOW ROLLOFF, LINEAR PHASE);
FRLP (FAST ROLLOFF, LINEAR PHASE);
APDZ (APODIZING, FAST ROLLOFF, LINEAR PHASE);
HBRD (HYBRID, FAST ROLLOFF, MINIMUM PHASE);
I have started off with the SRMP filter based on reviewers comments re filter character…
…but based on your latest post I should consider the FRLP or APDZ options.
Based on the graphs presented in the aforementioned review, can I impose on you to offer some thoughts on these filters?
Lacking a Goldilocks option, is there some way to achieve something close to this performance using the software you use with a “No Raspberry Pi” DAC? I am currently using a PC-based music server running JRiver Music Center.
I would use BRCK or FRLP and sw upsample to 88.2 or 176.4 kHz beforehand.
Hi Frank and Honza,
Agree with Honza on this, I would try mainly BRCK and FRLP as the most accurate of the options that would not intrude on the frequency response. The APDZ frequency response (Fig 12) "eats" into the audio spectrum unnecessarily.
I wonder why they didn't measure BRCK to show the impulse response and FFT. Seems discriminatory. JA probably has some kind of neurotic aversion to the term "brickwall" :-). Presumably it would look very much like Chord's impulse response.
I know that JRiver has implemented SoX upsampling with the newer versions... Maybe you can advocate for them to allow for custom parameters so tweaking like the intermediate phase setting and variable attenuation can be incorporated!
Thanks for the recommendations, Guys, and the heads-up re the issues with APDZ.
This musing is one of the first times I have seen mention of the SoX app. I will check out JRiver for whatever info they have but would be interested in a good source for me to learn about SoX?
Looking forward to reading the just-posted 2nd musing of 2018! I have to say that I check daily for new musings!
Let's not forget this filter:
Yes, absolutely AJ :-).
I hate it when filters are IMPOSED upon us! "Rage, rage against the dying of the high frequencies"...
Hello Arch. Great blog. I've followed some of your posts on ComputerAudiophile as well (my username is buonassi).
If one were to try and model out your "goldilocks" filter in iZotope instead of SoX, do you know what those settings would be? I have to think that the phase setting would equate to .90, since iZotope's parameters are 0.0 for Min phase and 1.0 for Lin phase. 0.95 Nyquist is a pretty easy match. Steepness (for the rolloff) I have estimated at 14 to 16. Doing this seems to approximate the rolloff curve and rejection of aliased images down to -80dB, very similar to that of your white noise graphs shown in this article.
Anyway, I'm not sure if you have iZotope and would like to comment on how close this is to your SoX settings. If not, don't go out of your way as I'm confident I've come close enough. Not that it really seems to matter. My ears can't pick out the differences in many of these settings anyway (excepting the phase settings).
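The phase guess above can be written out as a tiny Python sketch. It assumes the two scales line up linearly: SoX's phase value running from 0 (minimum) to 50 (linear) on one side, and the 0.0 (minimum) to 1.0 (linear) scale described above on the other. That linear mapping is only an assumption for illustration, and the function name is hypothetical.

```python
def sox_phase_to_unit_scale(sox_phase):
    """Map a SoX-style phase value (0 = minimum, 50 = linear) onto 0.0..1.0.

    Assumption: the two scales are linearly related; this has not been
    checked against either resampler's internals.
    """
    if not 0 <= sox_phase <= 50:
        raise ValueError("expected a value between 0 (minimum) and 50 (linear)")
    return sox_phase / 50

print(sox_phase_to_unit_scale(45))  # the phase 45 setting mentioned in this thread
print(sox_phase_to_unit_scale(25))  # the intermediate setting
```

If the assumption holds, phase 45 corresponds to the .90 figure estimated above.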
For some reason, using headphones, I can readily hear differences in Min/Lin phase. I used to attribute my preference of MP filters to the dreaded "pre-ringing" of LP filters, but now believe that what I'm hearing is the effects of phase distortion (and I like it for whatever reason). It changes the timing of the highs relative to the lows as I've learned. If my understanding is correct, the high frequencies will arrive shortly before the lows, but I haven't received a clear answer on this as of yet. Can you perhaps validate this for me? Is it the highs that are shifted forward in time, or are they delayed (relative to lows)?
I up-sample on-the-fly using SoX and have recently been playing with minimum versus linear phase - so this post is particularly interesting. With 44.1kHz files, I find linear phase slightly "digital" and "cold", but with minimum phase sound-stage suffers slightly. Both types of filter sound better at higher rate. There's no such thing as a free lunch, but your suggestion might be most of a lunch for pennies, a clever optimum. I'll try it over some time and report back! Thanks for backing up with measurements as usual - very helpful!
From all we know about our hearing and music.
1) Ringing at close to Nyquist should be irrelevant - even pre-ringing! It simply isn’t audible so why does anyone care about this?
2) Phase relationships in music and musical instruments are critical. This is timbre! It changes pitch if you monkey with phase. The human ear is ultra sensitive to changes in pitch. Linear phase is the only rigorous choice for an audio filter!
Why have we abandoned well-understood science for some pretty pictures? The manufacturers and magazines who show the picture of ringing are lying to the uneducated audience when they talk about this stuff - it simply isn’t audible and it isn’t relevant. However the deleterious effect of phase shift is extremely well understood! The intellectual sloppiness or outright dishonesty in the audio industry is frightening!
Is it possible that industry experts are befuddled and conflating this with the well known effect of a high Q filter?
In this example the filter is a high pass Q filter used to clean up a snare drum (very typical).
This filter is basically a brick wall to LOW FREQUENCIES. In this case the effect of pre-ringing is clearly audible because the "pre-ringing" occurs at highly audible low mid range frequencies.
So minimum phase is an important tool for mix engineers that use multi-track audio and quite literally are crafting the sound the artist ends up conveying through the recording process.
Jeremy, I share your sceptical stance overall but there is a small doubt in my mind. Why bother with anti-imaging filters at all when these images are all above audible frequencies?
So I have some sympathy for this type of argument: can we freely argue images at much higher frequency are harmful (for reasons of tweeter/amp IMD, or some other reasons, you choose) whilst arguing that all ringing at near-audible frequencies is harmless?
Note according to this distortion argument, having line equipment, amps and tweeters with zero distortion would mean that nothing above audible frequencies would be relevant - not images, not ringing - nothing. But do we live in that world?
Archimago's suggestion is great because we're only incurring 10% of the phase distortion of a minimum-phase filter, which is hopefully not disturbing in the pass band, yet we're getting surprisingly less disturbing-looking behaviour in the transition band (disproportionate "bang-for-buck"). That's why I'm running with it for now.
I'm happy with it so far.
After a couple of years running with it, and musings of my own, I settled on intermediate phase (-p 25). Half the phase distortion of minimum phase but hardly any visible pre-ringing.
And -v of course. At 44kHz anyway, I also use -b 89 which is basically flat to 19kHz (-b value being the -3dB point).
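For anyone who wants to try those settings from a script, here is a minimal Python sketch that assembles the corresponding SoX `rate` command line without running it. The file names and the 176.4 kHz target are placeholders I made up, and the helper name is hypothetical; the range checks reflect my reading of the SoX manual for `-p` (0 to 100) and `-b` (74 to 99.7), so treat them as assumptions.

```python
def sox_rate_args(infile, outfile, target_rate_hz, phase=25, bandwidth=89):
    """Build the argument list for SoX's `rate` effect with very high quality (-v).

    phase and bandwidth default to the -p 25 / -b 89 settings quoted above.
    """
    if not 0 <= phase <= 100:
        raise ValueError("phase should be between 0 and 100")
    if not 74 <= bandwidth <= 99.7:
        raise ValueError("bandwidth should be between 74 and 99.7")
    return [
        "sox", infile, outfile,
        "rate", "-v",
        "-p", str(phase),
        "-b", str(bandwidth),
        str(target_rate_hz),
    ]

# Placeholder file names; substitute your own.
args = sox_rate_args("input.flac", "output.flac", 176400)
print(" ".join(args))
```

The list can be handed to `subprocess.run(args, check=True)` if SoX is installed; printing it first makes it easy to check the flags.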
How does this "Intermediate filter" compare with the "Hybrid fast roll-off filter"?
Hello, I have created a docker image for squeezelite and recently I have added references to your page, along with examples and instructions on how to apply your "Goldilocks" settings.
I would be happy if you'd want to have a look (GitHub and Docker Hub):
Thank you for your attention.