Archimago's Musings: The "Measurement Of Amplifiers Rig" (MOAR): Standard Tests, Loopback results, and the AMOAR Score.

Saturday, 16 November 2019

The "Measurement Of Amplifiers Rig" (MOAR): Standard Tests, Loopback results, and the AMOAR Score.

While I'm not a huge fan of acronyms, sometimes one just needs to make something up for ease of reference; hence the "Measurement Of Amplifiers Rig" or MOAR ;-) for short to refer to this system which I'm going to attempt to characterize today:

Running loopback tests.

As you can see, it's a conglomeration of hardware components which will allow me to collect amplifier measurements relatively quickly and, I believe, with good quality results based on my own testing over the last number of weeks (in fact, I've been mulling over much of this for months, even before the Linear Audio Autoranger was fully up and running). A hobbyist project to be sure, and one that audiophiles and DIY folks can build their own variants of and try out.

Of course, a measurement system isn't just about bits of hardware but also the software that can be used, the thought behind various tests we can run, and most importantly, a standard procedure that can be followed to ensure that devices are measured consistently and can be reliably replicated here on my test bench and elsewhere with similar equipment. As with most testing, I wanted to find a way to quantify performance, consider what is reasonably "true to life", borrow from tradition for comparison purposes, as well as emphasize aspects I find important. Hence this post will hopefully provide adequate details around a "standard battery" for what will be posted here in the days ahead.

I. Fundamentals

Before starting to discuss the core measurements I'm planning to perform, I think it's important to highlight a few foundational points. The ultimate goal is to assess performance, but since this is not meant to be an exercise in taking a shotgun approach to measure everything under the sun, here are some key items that bias what I'll be doing:

1. Focus on the "First Watt". Loudspeakers are generally efficient and 1W of power into most 4-8Ω impedance speakers these days played back in normal listening rooms at a normal listening distance should already be of good volume. There's of course nothing wrong with measuring distortion at 10, 50, or 100+W, but that's not the output level the amplifier will be pushing for the vast majority of the time for most of us.

2. Include how the amp handles impedance variations, and for resistive load testing, let's target 4Ω. As discussed a number of months back, speakers are reactive loads and impedance varies quite a bit (as we saw last week, there is at times huge variation between speakers). While we don't need to get extreme about it, an amplifier's damping factor is worth measuring and it would be nice to examine the resulting frequency response with an actual reactive load connected to the amplifier. As for a power resistor load, these days, except for those speakers advertised as being highly efficient and easy to drive, a large proportion of hi-fi speakers will dip down to 4Ω or less at points in the audio spectrum. While I do have some 8Ω power resistors available, I'm mostly interested in 4Ω performance across the audio band.

3. As per this article from back in 2017, remember the naturalistic data suggesting that most people (~70%) do not need more than about 25W into 8Ω, or about 50W into 4Ω, of power to play the music they listen to with their speakers in their rooms at preferred levels. Unless one has a huge room or very inefficient speakers, it's likely good enough to measure up to 150W into 4Ω or so. Sure, I'll push the system to check if an amp reaches its rated power but my main intent is not to blow up my amps even if capable of extraordinary wattage (first rule of amp testing - search for the limits of the device but never let the magic smoke out! :-).

4. Try to stick with inexpensive / "free" software. As a hobbyist, for the most part, I'll be using free or donation-ware for this. Many of the test signals can be constructed using any audio editor including Audacity. As you'll see, Room EQ Wizard is phenomenal in so many ways for the task at hand! (I do have access to SpectraPLUS-SC which is commercial software and can be quite convenient. I might use that at times if needed.)

5. Try to keep parameters consistent for simplicity and choose settings that provide high resolution results. While the RME ADC is capable of up to 768kHz samplerate, unless specified, in my measurements I'll stick with 96kHz for a bandwidth of 48kHz and 24-bits for most test signals. I'll use the Blackman-Harris 7 window for consistency where available, and a 128k-point FFT, which is more than adequate for this bandwidth and provides excellent resolution.

6. Use dB instead of % for distortion results. I agree with this article from Benchmark advocating that it makes sense to talk about THD(+N) and IMD as measured on the logarithmic dB scale instead of a linear % scale for interpretation of results. This makes intuitive sense and elsewhere such as on Audio Science Review, folks are using SINAD (SIgnal-to-Noise And Distortion ratio), the reciprocal of THD+N expressed as dB as a summary comparison value.

There will also be times where I'll just talk about "volts" (RMS) instead of conversion to "watts". When doing so, I'll let the reader know the load characteristic like whether it's a 4Ω resistive load which is my standard as per point (2) above. Remember that audio amplifiers are generally voltage amplifiers - what we see going out of an amplifier into a certain (loudspeaker-like) load should ideally be a "high fidelity" scaled version of the input voltage; the proverbial "wire with gain".

7. No frequency weighting to be found here. Yeah, I know that human ears do not perceive loudness equally at all frequencies and that weighted measurements like A-weighting can be used. We're audiophiles here! No need to add complexity or mess with expectation of anything but a flat response.
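Since point 6 above means flipping between % and dB figures, here is a minimal Python sketch of the conversion. The function names are my own for illustration, and it assumes the usual amplitude-ratio convention (20·log10), with SINAD taken simply as THD+N in dB with the sign flipped:

```python
import math

def percent_to_db(percent: float) -> float:
    """Convert a distortion figure in % to dB relative to the signal."""
    return 20.0 * math.log10(percent / 100.0)

def db_to_percent(db: float) -> float:
    """Convert a distortion figure in dB back to %."""
    return 100.0 * 10.0 ** (db / 20.0)

def sinad_db(thd_n_db: float) -> float:
    """SINAD is THD+N expressed in dB with the sign flipped."""
    return -thd_n_db

if __name__ == "__main__":
    for pct in (0.1, 0.2, 0.001):
        print(f"{pct}% THD+N = {percent_to_db(pct):.1f} dB")
```

So 0.1% lands at -60dB and 0.001% at -100dB, which is why the dB scale is easier to compare at a glance once numbers get small.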

Okay, so with those 7 items in mind, let's talk about a "standard test methodology" including the hardware set-up and what can/will be measured...

II. Test Hardware Components & Connections

Loopback - the RCA cable running from under the Autoranger is being fed from the passive volume control which itself is fed from the RME ADI-2 Pro FS. (Showing a 21 multitone test signal on the computer screen. You can see the two green 300W 4Ω non-inductive power resistors behind computer.)

Apart from the laptop machine, as you can see in the image above, the system consists of 3 separate boxes - RME ADI-2 Pro FS DAC/ADC, Linear Audio Autoranger MK II (10kΩ), and Douk/Nobsound passive volume attenuator. To keep noise as low as possible, I'm using XLR/TRS balanced connections as much as possible with short cables (typically 3'). The connections to the boxes look like this:

As shown in the block diagram above, there is the "XLR / RCA Loopback testing" red arrow which is the hookup in the photo (I'm using the RCA pre-amp out to Autoranger). Because the Autoranger has both single ended and balanced inputs, with BNC-to-RCA and XLR-to-RCA adaptors, I can use them as the left and right inputs, switching between them one at a time since the Autoranger is inherently mono.

III. Range of Tests

Let's talk about the various tests that can be performed, parameters used, and show what the loopback results look like for baseline performance of the system. I've broken the tests down to 4 major groups identified as (A) to (D):

A. General Amplifier Characteristics & Frequency Response
To start off with, let's get some general characteristics about the amplifier being tested.

1. Determine amplifier voltage gain:

This is useful to know if you're mixing and matching various amplifiers since different devices will amplify the same input voltage by different amounts. I've read that the THX standard reference is +29dB for unbalanced inputs and +23dB for balanced; useful numbers to keep in mind as points of comparison.

The test is easily done by feeding in a fixed input like a 500Hz sine at 0.01Vrms and measuring voltage that comes out of the pure power amp or at 100% volume for an integrated amp.

Amplifier voltage gain Av = Output V / Input V, expressed in dB.
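As a quick worked version of the formula above, here is a small Python sketch. The function name and the 0.282V example output reading are hypothetical values chosen for illustration, not measurements:

```python
import math

def voltage_gain_db(v_out_rms: float, v_in_rms: float) -> float:
    """Voltage gain Av = Vout / Vin, expressed in dB (20*log10 for voltages)."""
    if v_out_rms <= 0 or v_in_rms <= 0:
        raise ValueError("voltages must be positive RMS values")
    return 20.0 * math.log10(v_out_rms / v_in_rms)

if __name__ == "__main__":
    # Hypothetical reading: 0.01 Vrms in, 0.282 Vrms out at the speaker terminals.
    print(f"Av = {voltage_gain_db(0.282, 0.01):.1f} dB")
```

With those made-up numbers the amp would sit right around the +29dB figure mentioned above for unbalanced inputs.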

2. Determine amplifier damping factor (DF) into 4Ω:

Remember, unless specified, I'll be aiming at a 4Ω load in these measurements for consistency. Damping factor helps overcome back EMF from the load (speakers) and in that way "control" drivers better whether it's in stopping residual movement or overshoot. Theoretically, the higher DF is, the better although there will be differences of opinion and there's no need to go extreme. Typically, high DF is also related to large amounts of feedback used in the amplifier which is another controversial topic among audiophiles (like this guy's opinion). There is a paper from Floyd Toole back in the day (1975) called "Damping, Damping Factor, and Damn Nonsense" you can check out. In it, he advocates a DF of >20.

DF = V(loaded) / [V(unloaded) - V(loaded)]

V(loaded) is the voltage at the amplifier terminal when connected to my 4Ω dummy load, and V(unloaded) is without the load attached. Put another way, damping factor is also the ratio of the load impedance / output impedance of the amplifier.
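To make the two equivalent definitions concrete, here is a rough Python sketch. The function names and the 1.000V / 0.980V readings are hypothetical; the output impedance simply follows from treating the amp as a voltage source in series with Zout driving the 4Ω load:

```python
import math

def damping_factor(v_unloaded: float, v_loaded: float) -> float:
    """DF = V(loaded) / [V(unloaded) - V(loaded)]."""
    drop = v_unloaded - v_loaded
    if drop <= 0:
        return math.inf  # no measurable drop: DF is beyond what the meter resolves
    return v_loaded / drop

def output_impedance_ohm(v_unloaded: float, v_loaded: float,
                         load_ohm: float = 4.0) -> float:
    """Zout from the voltage divider formed by Zout and the load."""
    return load_ohm * (v_unloaded - v_loaded) / v_loaded

if __name__ == "__main__":
    # Hypothetical readings at one test frequency.
    v_open, v_load = 1.000, 0.980
    df = damping_factor(v_open, v_load)
    z_out = output_impedance_ohm(v_open, v_load)
    print(f"DF = {df:.1f}, Zout = {z_out:.4f} ohm, load/Zout = {4.0 / z_out:.1f}")
```

Both routes give the same number, which is a handy sanity check on the measurements.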

The amplifier's output impedance will vary with frequency so I'm going to measure the damping factor at the output terminals at a few frequencies - 20, 100, 300, 1k, 5k, 10k, 20kHz - at ~1V output from the amp using my 4Ω load. This should be more than adequate for characterization.

Since passive speakers are low impedance devices that can dip sub-4Ω and the output impedance of the amplifier can be minuscule, speaker wire and poor connections add resistance that can have a significant impact on DF. For my tests, I will use short 2' lengths of 14G OFC copper wire (<0.005Ω/m) with good quality banana plugs which should add minimum resistance. Measurements will be at the amplifier end just past the banana plug terminals. Here's a calculator online you can try out to account for speaker wire effects on DF.
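To give a feel for why wire resistance matters, here is a small Python sketch of the DF seen at the speaker end. It assumes the quoted Ω/m figure is per conductor, that current travels out and back (two conductor lengths), and it uses a hypothetical amplifier with a DF of 100 at its terminals; none of these are measurements:

```python
FEET_TO_M = 0.3048

def wire_loop_resistance_ohm(length_ft: float, ohm_per_m: float) -> float:
    """Round-trip resistance: current flows out and back through two conductors."""
    return 2.0 * length_ft * FEET_TO_M * ohm_per_m

def effective_damping_factor(amp_df: float, wire_ohm: float,
                             load_ohm: float = 4.0) -> float:
    """DF at the speaker end once wire resistance is added to the amp's Zout."""
    amp_z_out = load_ohm / amp_df
    return load_ohm / (amp_z_out + wire_ohm)

if __name__ == "__main__":
    wire_ohm = wire_loop_resistance_ohm(2.0, 0.005)   # 2' run, 0.005 ohm/m assumed
    df_at_speaker = effective_damping_factor(100.0, wire_ohm)
    print(f"wire = {wire_ohm * 1000:.2f} mohm, DF at speaker = {df_at_speaker:.1f}")
```

Even a few milliohms takes a noticeable bite out of a high DF figure, which is why I measure at the amplifier end and keep the wire short.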

3. Frequency Response into 4Ω and speaker loads

Regardless of what one might think about the importance of time-domain performance, there is no doubt that flat frequency response is an essential component of good fidelity. It's now time to attach the amplifier to the Autoranger to capture some data.

Using REW, we can now set the DAC/ADC to 24/96, and run a measurement of the frequency response which will also give us some other data about the amplifier like phase characteristics. Here's what this looks like with my loopback (20Hz-30kHz) - DAC output level of 1Vrms into 4Ω load:

Good to see essentially flat frequency and phase responses (minor -0.25dB from 50 down to 20Hz). I believe one of the criticisms of Class D amps is a phase shift; we'll see how these and other amplifiers measure in the days ahead.

While it's all well and good to have flat frequency response with the 4Ω power resistor, in reality speakers have variable impedance across the audible frequency range, and sometimes the variation can be quite substantial. Instead of using a simulated load, I'm going to measure frequency response off an actual speaker load. Here's a pair of old Sony SS-H1600 bookshelf speakers with their impedance and phase characteristics shown that I can use as a standard for my measurements:

We can use these 25+-year-old speakers (which I got back in my college days!) as an actual load for the amplifier being tested to see the effect of the impedance on frequency response. Notice the swings in impedance between 6Ω and 56Ω within the audible band along with some difficult phase angles near 45° though nothing terrible. We also see a typical gradual increase of impedance at higher frequencies due to driver inductance. Notice the double bass peaks as typically seen in ported speakers. We should see with high damping factor amplifiers that frequency response will be "reined in" with better control resulting in flatter power vs. frequency graphs. When measuring this, a reasonable level like 0.1Vrms output will be used since the sweep will be audible through the speakers; no need going deaf or blowing speakers up with too much power :-).

We see that both the right and left speakers are equivalent in impedance in the event we need both channels driven for stereo amp testing:

R + L speakers almost identical impedance.

B. Single-Tone Tests of Harmonic Distortion & Noise
Using the Autoranger to handle voltage switching, we can now measure the amplifier at various power levels. First, we do this with single-tone tests that will give us information like the harmonic distortion amount and noise.

1. THD(+N) vs. Frequency graphs at various "First Watt" levels (into 4Ω).

Based on the "First Watt" and "Free Software" principles, I'm going to standardize on measurements using 64-bit Room EQ Wizard's (REW 5.20 currently beta 28) excellent "Stepped Sine" feature at 4 output power levels - 0.25V (16mW), 0.5V (63mW), 1V (250mW) and 2V (1W); not necessarily every one of these levels for every device of course. These measurements are performed with the Realtime Analyzer (RTA) module and here is a sample output of what results look like for the RME loopback going through the volume attenuator (and Autoranger at 1Vrms nominal output):

FFT for THD+N and Stepped Sine settings.

A look at 1kHz, 1Vrms, THD+N measurement - RME loopback through the MOAR.

The RME is set to 96kHz hence 48kHz bandwidth. RTA smoothing off, and FFT 128k provides good resolution without being excessive. Input level to the ADC in the example above is -7.6dBFS but plotted as dBc ("dB relative to carrier signal") to keep the 1kHz peak at 0dB. Since the system is fully battery operated, you see that there is no mains hum (which would be 60Hz here in Canada). Notice the labeling of harmonics all the way to the 9th order. High and low pass filters are activated so that the calculations are limited to the audible band from 20Hz to 20kHz.
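For a sense of what "good resolution" means with these settings, here is a tiny Python sketch of the FFT bin spacing. It assumes "128k" means 131,072 points (128 × 1024), which is my reading rather than something stated in the settings screenshot:

```python
def fft_bin_width_hz(sample_rate_hz: float, fft_points: int) -> float:
    """Frequency spacing between adjacent FFT bins."""
    return sample_rate_hz / fft_points

SAMPLE_RATE_HZ = 96_000
FFT_POINTS = 128 * 1024          # assumed meaning of "128k"

bandwidth_hz = SAMPLE_RATE_HZ / 2                         # Nyquist limit
bin_width_hz = fft_bin_width_hz(SAMPLE_RATE_HZ, FFT_POINTS)

if __name__ == "__main__":
    print(f"bandwidth = {bandwidth_hz:.0f} Hz, bin width = {bin_width_hz:.3f} Hz")
```

Under that assumption each bin is well under 1Hz wide, plenty to separate harmonics and closely spaced sidebands.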

Here's what the loopback "Stepped Sine" graph looks like at 1V output from the DAC:

Lots of data in there and probably the summary graph on the right with only THD+N, THD, and noise floor across the audio frequency band would be good enough for most situations. Nice to see that THD+N is not frequency dependent and remains flat across the audible band. As you can see from the RTA screenshot above and implied in the "Stepped Sine" graph, with the RME DAC (AKM AK4490 inside), there is a relatively predictable "cascade" of harmonics with 2nd order highest followed by 3rd and 4th and the higher harmonics are further down in amplitude. This predictability is good to see coming from this DAC as signal generator for testing purposes. The detailed graph on the left gives us an idea of how the different harmonics compare. If we see amplifier measurements where there's a ton of high order stuff, characterized by higher order harmonics rising above the 4th, 3rd, and 2nd harmonic levels, this is possibly a sign that the amplifier is not as "euphonic" sounding on account of harmonic distortion.

[As an aside, this relatively predictable "harmonic cascade" seen with AKM chips is different from ESS DACs. Even though ESS chips overall have lower THD results, the harmonics are often unpredictable in amplitude depending on the DAC output level. I was made aware of this in my discussions with Matthias Carstens of RME and have observed this with ESS Sabre DAC measurements also (see the "THD+N Matrix" graph here from the Oppo UDP-205 with ES9038Pro).]

2. THD(+N) vs. Frequency graphs at constant "High Power" levels (into 4Ω).

Similar to the graphs and results above using the "Stepped Sine" function, we can push the amplifier beyond the first watt and look at the quality at higher constant output power. Depending on the capabilities of the amp, we could look at RMS voltages of 4.47V (5W), 6.32V (10W), 10V (25W), 14.14V (50W), 20V (100W), 24.5V (150W), 28.3V (200W), 31.6V (250W) and 34.6V (300W) +/- both channels driven as needed (see section 4 below for 2-channel amps).
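The voltage/power pairs above all come from P = V²/R with R = 4Ω. A minimal Python sketch for regenerating them (function names are mine, purely for illustration):

```python
import math

LOAD_OHM = 4.0  # standard resistive load used throughout

def power_w(v_rms: float, load_ohm: float = LOAD_OHM) -> float:
    """Power dissipated in a resistive load at a given RMS voltage."""
    return v_rms ** 2 / load_ohm

def volts_rms(power: float, load_ohm: float = LOAD_OHM) -> float:
    """RMS voltage needed to deliver a given power into a resistive load."""
    return math.sqrt(power * load_ohm)

if __name__ == "__main__":
    for watts in (1, 5, 10, 25, 50, 100, 150, 200, 250, 300):
        print(f"{watts:>4} W -> {volts_rms(watts):.2f} Vrms")
```

The same two functions cover the "First Watt" levels as well, e.g. 0.25V works out to roughly 16mW into 4Ω.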

3. THD(+N) vs. Output Power

You've all seen these graphs showing the harmonic distortion levels spiking as the power limit of an amp is reached. With my system, data capture won't be automated like an Audio Precision but it would not take long to grab some points and graph out the shape of the distortion-power curve.

Using the loopback, keeping the volume attenuator to 0 (unity gain) while varying amplitude of the RME DAC output for either the unbalanced or balanced signal, and allowing the Autoranger to switch and keep output voltage at ~1V nominal, here's a graph of the THD(+N) vs. DAC output voltage using either the BNC single-ended input or the balanced XLR:

I'm using the RME's volume control to adjust the DAC voltage output in 0.5dB steps. The RME has a setting for automatically switching the analogue output levels when using the volume control. This is an excellent feature that keeps the dynamic range high through a large range. Nonetheless, you can imagine that at the lower levels with the single-ended output, some resolution will be lost.

While I'm showing data for both the balanced and single-ended inputs, it's the single-ended input that's going to be used with amplifier testing. As you can see, THD+N with the single-ended input is able to achieve better than -95dB from 0.25V on, and it's better than -100dB from 0.5V onward. While there will be some variability to how low the System can measure THD+N depending on the DAC output level to the amp, the Nobsound attenuation amount, and volume control of the amplifier to be tested, the bottom line is that the System is able to achieve a THD+N of around -105dB at 2Vrms into the Autoranger or 1W into 4Ω.

4. For 2-channel amplifiers: THD+N both channels driven & Crosstalk

For those times that I'm testing a two channel amplifier, we can check to make sure distortion and noise are not severely affected when both channels are driven. Easily done by hooking up both channels to 4Ω loads and remeasuring the THD+N for each channel. As a standard, I'll make sure to at least check at 1W and likewise go up to 10W, 50W, 100W, etc... We can then compare these results with the numbers above of single-channel performance. Inadequate power supplies likely would be the cause of increased distortion and noise if we see issues when both channels are driven.

Another 2-channel parameter to check would be crosstalk. I'll just do a simple test where I've created a 2-channel signal with a 0dBFS 4kHz sine on the left, and 0dBFS 300Hz signal on the right. We can examine each channel and see how much of the opposite channel's signal is seeping through. For example, with the RME loopback:

We can take an average of the two crosstalk values; in this case R-seeping-into-L is -82.8dB, and L-to-R is -84.1dB, so it's an average -83.5dB for the loopback test. There is little difference between the two sides and the two frequencies; if the difference is quite large, it may be worthwhile exploring crosstalk frequency dependence. I believe that many amps should be able to perform beyond the -83dB value above and I'll just report "better than -80dB" when I see evidence of this.
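The averaging and reporting rule above can be sketched in a few lines of Python. Note that this averages the dB figures directly, as done in the text, rather than the underlying linear ratios; the function names and the -80dB threshold parameter are just my way of writing it down:

```python
def average_crosstalk_db(r_into_l_db: float, l_into_r_db: float) -> float:
    """Plain arithmetic mean of the two crosstalk readings (in dB)."""
    return (r_into_l_db + l_into_r_db) / 2.0

def crosstalk_report(avg_db: float, floor_db: float = -80.0) -> str:
    """Report the value, or just 'better than -80dB' once past the rig's limit."""
    if avg_db <= floor_db:
        return f"better than {floor_db:.0f}dB"
    return f"{avg_db:.1f}dB"

if __name__ == "__main__":
    avg = average_crosstalk_db(-82.8, -84.1)   # loopback values quoted above
    print(avg, "->", crosstalk_report(avg))
```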

By the way, with 24-bit signals, the RME DAC is capable of better than -110dB crosstalk as previously measured. The reason we're seeing -83dB here is because of the passive Nobsound preamp. We typically do not need to emphasize the crosstalk value too much once it's quite low like say -50dB or less. In the real world, natural sounds are never an "all or nothing" split between left and right ear. For perspective, remember that a good turntable cartridge can achieve something like -30dB crosstalk between channels, and reel-to-reel is somewhere around -40 to -50dB between tracks.

C. Multi-Tone Tests - including Intermodulation Distortion & Distortion+Noise

Next, we can examine the amplifier using multi-tone tests. These are important because harmonic distortions can be perceived as "natural" since sounds in the real world generally contain varying amounts of harmonic content, or even "euphonic" to some people since added harmonics can also make sounds seem "fuller". On the other hand, multi-tone derived distortions generally sound "bad". Intermodulation distortions are not harmonically related products of the fundamental input frequency and exist as sidebands that when not masked, can sound out of place or create a sense of loss of resolution (potentially like what severe jitter does).

There are various Intermodulation Distortion (IMD) tests we can use depending on which standard we want to follow that can probe different frequencies. I'm very impressed that REW has incorporated so much of this in the Signal Generator module and also the RTA module is able to identify the main intermodulation signals for analysis. Here then are a few loopback results to show the resolution limits of the MOAR system.

1. SMPTE Intermodulation Tones

This is the classic 60Hz, 7kHz SMPTE IMD test with 4:1 amplitude ratio. The 60Hz tone is at 0dBFS. We focus at the 7kHz signal and calculate the amplitude of those 60Hz intermodulation sidebands:

60Hz fundamental not shown in order to focus on IM content around 7kHz.

Not bad, -101.4dB IMD at 1Vrms loopback.

2. ITU-R (CCIF) Intermodulation Tones

19 & 20kHz 1:1 tones we typically see in Stereophile's testing among other places:

Notice that this -100.4dB CCIF/ITU-R IMD result includes the 2nd order intermodulation products at 1kHz and 39kHz, even though the latter is significantly outside the audible spectrum, along with the lower level 3rd, 4th and 5th order products.

3. Linkwitz Intermodulation Tones

Here's one I've seen over the years used by Siegfried Linkwitz with 1kHz and 5.5kHz tones in 1:1 amplitude ratio. We can see this test throughout his amplifier measurement page and it does make sense to use these frequencies at the heart of the audible spectrum around where the human ear/brain is the most sensitive.

As you can see, 2nd, 3rd, 4th and 5th order intermodulation products are spread through the audible spectrum and it could be nasty if amplifiers produced large amounts of these distortions. The loopback measures well at almost -107dB.
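To see where the intermodulation products are expected to land for any two-tone test, here is a small Python sketch. It lists the frequencies |m·f1 ± n·f2| with m + n equal to the order; this is the textbook definition and not necessarily the exact set REW sums into its IMD figure:

```python
def imd_products(f1_hz: int, f2_hz: int, max_order: int = 5) -> dict:
    """Map each order (2..max_order) to the sorted IMD product frequencies."""
    products = {}
    for order in range(2, max_order + 1):
        freqs = set()
        for m in range(1, order):          # m + n = order, both at least 1
            n = order - m
            freqs.add(abs(m * f1_hz - n * f2_hz))   # difference product
            freqs.add(m * f1_hz + n * f2_hz)        # sum product
        freqs.discard(0)                   # a DC term is not a tone
        products[order] = sorted(freqs)
    return products

if __name__ == "__main__":
    for name, (f1, f2) in {"CCIF 19k+20k": (19_000, 20_000),
                           "Linkwitz 1k+5.5k": (1_000, 5_500)}.items():
        print(name)
        for order, freqs in imd_products(f1, f2).items():
            print(f"  order {order}: {freqs}")
```

For the 19 & 20kHz pair the 2nd order products come out at 1kHz and 39kHz, while for the 1k/5.5k pair the products scatter right through the midrange, which is the appeal of that test.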

4. Transient Intermodulation Distortion (TIM/TID/SID - Slew Induced Distortion)

[Updated November 21, 2019 - related to discussion below with Miska to increase the bandwidth of this test to push the amplifier test signal further to just below 100kHz bandwidth.]

Okay, this next one is controversial and one of those tests which even if performed poorly by an amp doesn't necessarily imply that the device isn't "high fidelity". A good, "low TIM" amplifier has both high bandwidth and high slew rate. I'll use an old signal recipe based on The Tektronix Cookbook of Standard Audio Tests (1975) description and based on the work of Otala but accelerated to 2x to increase the demand on the amplifiers. This accelerated signal uses a combination of a 1kHz square wave with a high frequency 12kHz sine wave in the ratio of 5:1 at 24/192 (96kHz bandwidth). As shown in the book, a device with high TIM will show elevated sidebands which in this accelerated version will be around the 12kHz signal. Here's the DAC loopback result:

I've put red arrows on each side of the 12kHz tone where we would expect to see a series of 2kHz spaced sidebands in the event of significant TIM. As you can see, the DAC loopback is totally clean. For a simple objective measure, with the peak 1kHz tone at 0dB, we can put a marker at the peaks of the 2kHz sidebands around 12kHz. In the case of this loopback test, we don't see any sidebands present and the noise level is below -130dB (192kHz sampling, 128k FFT).

Looking at the time domain, we're basically asking the amplifier to reproduce this signal cleanly:

Specifically important is the rapid transition between the 1kHz square and continuation of the 12kHz sine waveform at higher voltage output.
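For anyone wanting to experiment, here is a rough Python sketch of how a signal of this general shape could be built. It is not the actual test file: it assumes "5:1" refers to the peak amplitudes of the square and sine, uses a naive (not band-limited) square wave, and the `peak` scaling parameter is my own addition to keep the sum away from clipping:

```python
import math

def tim_test_signal(duration_s: float = 0.001, sample_rate_hz: int = 192_000,
                    square_hz: int = 1_000, sine_hz: int = 12_000,
                    square_to_sine: float = 5.0, peak: float = 0.5) -> list:
    """Sum of a square wave and a sine, scaled so the combined peak equals `peak`."""
    n_samples = round(duration_s * sample_rate_hz)
    samples_per_square = sample_rate_hz // square_hz
    scale = peak / (square_to_sine + 1.0)   # worst case: both parts peak together
    out = []
    for n in range(n_samples):
        # Naive square: first half of each period high, second half low.
        square = 1.0 if (n % samples_per_square) < samples_per_square // 2 else -1.0
        sine = math.sin(2.0 * math.pi * sine_hz * n / sample_rate_hz)
        out.append(scale * (square_to_sine * square + sine))
    return out

if __name__ == "__main__":
    one_period = tim_test_signal()
    print(len(one_period), "samples, peak", max(abs(s) for s in one_period))
```

A proper test file would want a band-limited square to avoid aliasing; the sketch only shows the structure of the sine riding on top of the square wave edges.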

When testing amplifiers, the recommendation is to test this out at 50% rated power. To be honest, I'm not sure it's worth straining an amp too much for something like this which doesn't represent real music, so maybe up to 100W (20V into 4Ω) would be "good enough" to detect whether there may be any practical concerns.

5. Triple-Tone Total Distortion + Noise

Music obviously isn't just one or two tones but we do have to be practical about test signals to keep them simple enough to measure and interpret.

How about a slightly more complex triple-tone signal then? By doing this, we can take a look at the total distortion plus noise result which would contain harmonic and intermodulation products, plus whatever abnormal spuriae produced by the amplifier.

This test is performed as usual at 96kHz samplerate, bandwidth 20-20kHz hi/low-pass. Here are the Signal Generator parameters in REW with a snapshot of the RTA settings:

Notice that I have highlighted a few key parameters including the custom triple-tone setting with frequencies at 48, 960, and 5472 Hz. The reason I'm using these numbers is to keep the frequencies at an integer multiple of the base 48 sampling rate to reduce low-level rounding errors (likely just being pedantic since I don't think there will be any issue if I were to use 50/1000/5500 instead). The signal is at -6dBFS amplitude (no clipping, average RMS signal amplitude -9dB), analysis will be done with 128k FFT length and Blackman-Harris 7 windowing as usual. Compared to a sine wave at the same peak voltage, the total RMS amplitude of this tone is actually -4.66dB, which means that 2V using this signal through a 4Ω load is not 1W, but more like 340mW.
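The RMS-versus-peak point can be checked numerically. The Python sketch below sums three equal-amplitude sines at the frequencies above over one full 1/48s period and compares the RMS with that of a sine of the same peak. It assumes all tones start at zero phase, which may not match REW's generator, so the result lands close to, but not necessarily exactly at, the -4.66dB quoted; the second function simply converts a level offset in dB into watts:

```python
import math

def multitone_rms_vs_sine_db(freqs_hz, sample_rate_hz: int = 96_000,
                             base_hz: int = 48) -> float:
    """RMS of summed equal-level sines relative to a sine with the same peak."""
    n = sample_rate_hz // base_hz            # samples in one full composite period
    samples = [sum(math.sin(2.0 * math.pi * f * i / sample_rate_hz)
                   for f in freqs_hz) for i in range(n)]
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    sine_rms_same_peak = peak / math.sqrt(2.0)
    return 20.0 * math.log10(rms / sine_rms_same_peak)

def scaled_power_w(reference_power_w: float, level_db: float) -> float:
    """Power after applying a level change in dB (10*log10 for power)."""
    return reference_power_w * 10.0 ** (level_db / 10.0)

if __name__ == "__main__":
    rel_db = multitone_rms_vs_sine_db((48, 960, 5472))
    print(f"triple-tone RMS vs same-peak sine: {rel_db:.2f} dB")
    print(f"1 W reduced by 4.66 dB: {scaled_power_w(1.0, -4.66) * 1000:.0f} mW")
```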

Here's what the waveform looks like. Notice plenty of zero-crossing which would exacerbate effects of crossover distortion in poor Class AB designs:

Here are the results from the left and right channels of the RME DAC loopback through the System:

As you can see, the RME DAC output is very clean as expected even after going through the passive pre-amp and Autoranger. The 3 tones provide a number of opportunities for distortions to show up. Arguably, we can say that perhaps we should throw more tones into the mix; why not 5 tones or 7 tones? Remember though that the more tones we throw into the signal, the lower level each tone will be within the 0dBFS digital limit so there are compromises here in order to maintain high signal level to noise floor. Despite all the combinations and permutations of distortions possible, notice that all of the distortion peaks are more than 110dB below the primary signals with a noise level below -130dB - an amplifier that can do this would be remarkably impressive! Both channels from the RME DAC achieve better than -103dB in Total Distortion + Noise between 20Hz and 20kHz.

For your consideration, here is an example of what a poor amplifier would look like showing all the possible distortion products one might find from a triple-tone test like this across the audible spectrum (yes, this is an actual amplifier's output!), 2Vrms into 4Ω:

Tons of harmonic and intermodulation distortions and a rather unfortunate -18.2dB TD+N! As you'll see below, I will use this triple-tone test as a basis for my "Distortion Factor" amplifier grading score at 2V into my standard 4Ω load.

Depending on the amplifier, we can run each of the IMD/TIM/multitone tests at various output voltage levels like 0.5V, 1V, 2V, 5V, 10V, 20V, etc... across the 4Ω load based on what kind of power rating we're expecting from the amplifier. Obviously, we can also pick and choose which of the IMD tests we prefer to use.

D. Square Waveform Morphology and Very High Frequency Noise

Finally, let's have a peek at square waves through the digital oscilloscope. The Autoranger obviously is not used for this. Here's what a 2V square wave output looks like from the RME DAC to passive attenuator to RCA output into an inexpensive Hantek oscilloscope:

Notice tiny amount of overshoot. Otherwise clean looking "non-aliasing" (non-ringing) square waves from 24/384 file.

Using a 24/384 "non-aliasing" square wave, we can also get a sense of the balance from the right and left channels which overlap precisely here. The more squarish the waveforms look, the better the bandwidth of the device (loss of bandwidth will result in rounded corners). Loss of low-frequencies will result in a downward tilt (check this out for reference). With a 384kHz samplerate, this means there are frequency components up to around 190kHz for the amplifier to reproduce although the amplitude of components above 150kHz drops off significantly. Here's the FFT:

Finally, especially for Class D amplifiers where one could expect significant amounts of high frequency switching noise, we can use the Rigol DS1104Z oscilloscope to have a look at the amount of content up to 1.2-1.5MHz while playing a 24/192 signal with 5kHz and 93kHz tones for reference, 2V output:

As you can see, the Rigol is limited with its FFT resolution (shown above using Blackman windowing for best amplitude resolution). But it'll give us an idea of the presence of significant amounts of strong ultrasonic noise. In this case, there isn't any major high frequency noise coming out of the RME DAC up to 1.2MHz that we need be concerned about.

Summary:

I suspect this post might be a bit dry for some of you :-). Nonetheless, the discussion is necessary before I embark on some amplifier measurements using the MOAR. It gives me an opportunity to show you what I'm up to and also provides a reference for myself in the days ahead to maintain consistency.

I'm thinking that it would be good to create a summary score to help rate the amplifiers. For example, I think one can appreciate quite a bit about an amplifier by showing the frequency response along with something like this 3-factor "AMOAR Score" for each device:

Archimago's MOAR (AMOAR) Score (4Ω) = Average Damping Factor /
Triple-Tone Distortion Factor (at 2V in dB) /
Power Factor as Volts @ 0.1%/-60dB THD+N

(Avg Damping Factor = average DF at 20, 100, 300, 1k, 5k, 10k, 20k Hz)

Having 3 factors in the score I think would be useful in reminding us of the multiple dimensions we must think of when determining whether an amplifier is adequate for our purposes.

The first number (Average Damping Factor) reminds us to consider our speakers, the impedance, and overall how hard they are to drive. While generally we can say that bass frequencies demand more "control" and typically woofers crossover below 300Hz where often some of the lowest impedance and difficult phase angles reside, I think DF across the audio spectrum is important because there are situations where loss of treble control can lead to harshness. If we're going to be pairing the amp with difficult-to-drive low impedance devices dipping below 4Ω, I think it's a good idea to aim for DF >50 into 4Ω to make the amp more "load invariant".

The second number, the "Triple-Tone Distortion Factor" tells us how clean the amplification is at 2V into 4Ω ("First Watt" quality) using the multitone signal. Most of the time with reasonably efficient speakers like ~85-90dB/W/m devices, the amp will be spending its time delivering <2V to your speakers at normal listening levels in a typical domestic sound room. When companies want to hype up good distortion numbers like "down to 0.001% distortion", be mindful of what power level they're using and into what kind of load. A 75W into 4Ω amp might be able to produce 0.001% THD+N (-100dB) at 50W (14V) which becomes the advertised headline number, but so what if in reality at 1W, the distortion is relatively poor at 0.2% (-54dB)? I'd rather focus on the more important 1W result than the rarely-used 50W performance when making comparisons.

For the Distortion Factor, I think it's also important to go beyond harmonic distortion and emphasize that high fidelity involves freedom from all distortions plus noise which can only be examined with more than a single tone. This is why I believe using the triple-tone TD+N result is better as a measure of fidelity. With the DAC loopback of the signal at 1V, the system's Distortion Factor is measured at better than -103dB; this is about the limit of what I can do with the current setup (the System can actually do a bit better than this with some tweaking, but we'll cross that bridge if needed). Realistically, for home hi-fi use, I suspect amplifiers achieving a score of -75dB would already be excellent when we consider also the ambient noise levels in our homes (probably something like 25-50dB SPL depending on where you live, time of day, and amount of sound insulation!). Of course, let's not forget the distortion added by speakers themselves. A score close to -100dB would be amazing of course.

Finally, the last number, the Power Factor, indicates the number of volts an amplifier can deliver at the top end before it hits 0.1% THD+N (-60dB) into 4Ω. This will give us an idea of how much power is available for the needs of the high fidelity enthusiast since 0.1% should still sound quite good even if a little "colored".

So, in practice, I could end up with amplifier AMOAR Scores that look like this:

UltraAudio Linear Amp = 120x / -90dB / 21.5V

SuperFi HyperTube = 15x / -45dB / Insufficient

FleaWatt Amp = 25x / -74dB / 3.2V

ShoutOut MegaPower PA = 80x / -53dB / 33.5V

GuruMeditation Ideal Wire 250W Amp = 300x / -100dB / 31.5V
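For anyone wanting to keep such scores in a small database, the three numbers could be held in a simple record like the Python sketch below. The class name, field names and the use of None for "Insufficient" are my own assumptions, not an established format; the entries are the hypothetical amps listed above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AmoarScore:
    name: str
    damping_factor: float         # average damping factor into 4 ohms
    distortion_db: float          # triple-tone TD+N at 2V into 4 ohms, in dB
    power_volts: Optional[float]  # volts at 0.1% THD+N into 4 ohms; None means "Insufficient"

    def watts(self, load_ohms: float = 4.0) -> Optional[float]:
        """Power implied by the Power Factor voltage into a resistive load."""
        if self.power_volts is None:
            return None
        return self.power_volts ** 2 / load_ohms

    def __str__(self) -> str:
        power = "Insufficient" if self.power_volts is None else f"{self.power_volts:g}V"
        return f"{self.damping_factor:g}x / {self.distortion_db:g}dB / {power}"

scores = [
    AmoarScore("UltraAudio Linear Amp", 120, -90, 21.5),
    AmoarScore("SuperFi HyperTube", 15, -45, None),
    AmoarScore("FleaWatt Amp", 25, -74, 3.2),
    AmoarScore("ShoutOut MegaPower PA", 80, -53, 33.5),
    AmoarScore("GuruMeditation Ideal Wire 250W Amp", 300, -100, 31.5),
]

for s in scores:
    w = s.watts()
    extra = "" if w is None else f" (~{w:.0f}W into 4 ohms)"
    print(f"{s.name} = {s}{extra}")
```

Converting the Power Factor voltage to watts makes it easy to compare against a manufacturer's rated power.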

Notice that the "SuperFi HyperTube" amp is incapable of staying below 0.1% THD+N through any substantial portion of its usable range, hence the "Insufficient" designation; this, along with the poor Distortion Factor of -45dB at 2V, should be taken as a firm warning about the sound quality of such a device. While some might consider an amp like this "audiophile", "high-end" if it's expensive, or a "classic" or otherwise desirable (eg. made by some esoteric guru), I would not expect it to be "high fidelity" or "transparent"!

Remember that while measurements allow us to compare devices, they point to "ideal" engineered outcomes. It doesn't mean that all of us must seek after these idealized results because the threshold of perceptual "transparency" often can be achieved for one's ears / brain / speakers / room even at modest objective performance levels (remember the limitations of hearing and perceiving). Just like pure subjectivist audiophiles really should expunge from their minds the worries of mythical claims that "Japanese amps in the 80's with feedback measured great but sounded bad" (see Bruno Putzeys' take on this in the "Storyline 3" section to this article), so too, remember that objectivists should not need to declare a -98dB THD+N amp as significantly better than one that measured -88dB. The point of diminishing returns I suspect begins many dB's before this (as per this "Good Enough" discussion).

Having said this, I threw up the results of the hypothetical "GuruMeditation" amp above as a device that one might aspire to. It has a great damping factor for highly variable and low impedance speakers. The -100dB triple-tone total distortion and noise at 2V into 4Ω should satisfy any audiophile who desires purity of reproduction with a "blacker than black" background from which the music arises, assuming the amp has flat frequency response. Finally, the GuruMeditation device can stay under 0.1% THD+N all the way up to 31.5V (~250W into 4Ω); this almost certainly would satisfy any audiophile except those with remarkably power-hungry speakers and huge rooms!

Just like my DAC test bench results have expanded over the years to include tests like jitter, the "Digital Filter Composite" graph and impulse response waveforms, this collection of amplifier measurements could expand as software and hardware tools become available.

Obviously, I don't necessarily need to measure everything I've listed here with every device that I come across. I suspect most of the time a handful of tests will already tell us quite a lot. But the core parameters as per the "AMOAR Score" I think would be good to know for comparison as I build a little database of tested amplifiers.

-----------------------------

A look at the Nobsound NS-05P passive attenuator...

Notice that with the loopback tests above, the passive Nobsound NS-05P attenuator was set to 100% (ie. no attenuation). I will use this device for power amps with no volume control and to fine tune for the sensitivity of the amplifier (a general rule of thumb is that pro amps typically have a sensitivity of +4dBu / 1.23Vrms at which the amp will drive the signal to full output voltage).
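As a quick check of that rule of thumb, here is a small Python sketch converting dBu to Vrms (0 dBu is about 0.775Vrms, the voltage that dissipates 1mW in 600Ω) and estimating how much attenuation would bring a source down to a given amp sensitivity. The function names are illustrative, and the 3Vrms source level is only an example value.

```python
import math

DBU_REFERENCE_VRMS = math.sqrt(0.6)  # 0 dBu, about 0.7746 Vrms (1 mW into 600 ohms)

def dbu_to_vrms(dbu):
    """Convert a level in dBu to RMS volts."""
    return DBU_REFERENCE_VRMS * 10 ** (dbu / 20.0)

def attenuation_needed_db(source_vrms, target_vrms):
    """Gain in dB (negative means attenuation) to bring source_vrms to target_vrms."""
    return 20 * math.log10(target_vrms / source_vrms)

sensitivity = dbu_to_vrms(4.0)
print(f"+4dBu = {sensitivity:.2f} Vrms")
print(f"Attenuation from a 3 Vrms source: {attenuation_needed_db(3.0, sensitivity):.1f} dB")
```

In practice the attenuator would be set by measurement rather than calculation, but this gives a starting point for where the knob should end up.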

For completeness, let's have a look at using this control to make sure it doesn't introduce excessive distortions and unexpected amounts of noise beyond the amount attenuated.

In the pictures above, we have a better look at the attenuator. It has 2 channels in/out and can switch from XLR to RCA and vice versa. For the loopback tests I performed in this post, the passive attenuator is taking XLR input from the RME DAC and outputting to RCA en route to the Autoranger which then converts the signal back to balanced output to the RME ADC.

Let's take 3Vrms output from the RME DAC, send it through the attenuator, and tell the Autoranger to "hold" from doing any switching as we turn down the Nobsound preamp volume. Below, you'll see what happens as we attenuate the signal by 10, 20 and 30dB:

Not bad. Overall, with each 10dB step, I'm not seeing any new noise showing up in the noise floor as I increase the amount of attenuation. At -10dB I see there's an exacerbation of the 3rd order harmonic, but otherwise nothing too objectionable. If we look at the THD+N measurements, they're increasing by the expected amount of around 10dB with each step in the attenuation as the fundamental signal is dropped and the relative noise level concomitantly creeps up.

When put to use, there is also the interaction with the Autoranger which, when we attenuate down, will compensate to try to keep the voltage fed to the ADC at 1Vrms nominal. So if I allow the Autoranger to switch, it will immediately give a +18dB boost to the -30dB signal resulting in this FFT:

Looks good. Compared to the -30dB tracing with the Autoranger defeated, we can peer into the harmonics whereas previously they were embedded in noise. So, even with a 3Vrms signal attenuated down to 0.095Vrms (-30dB) using the Nobsound NS-05P, we can still achieve better than -90dB THD+N with the Autoranger boosting the signal.

For the Autoranger to successfully achieve this indicates that it must be very quiet itself.
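The voltage bookkeeping in this section can be sketched in a few lines of Python. This is only an illustration: the function names are mine, and the 10µV residual noise figure is a made-up placeholder (not a measured value) used to show why a fixed noise floor makes THD+N rise by about 10dB per 10dB of attenuation.

```python
import math

def apply_gain_db(vrms, gain_db):
    """Scale an RMS voltage by a gain in dB (negative values attenuate)."""
    return vrms * 10 ** (gain_db / 20.0)

def noise_limited_thdn_db(signal_vrms, residual_vrms):
    """THD+N in dB when a fixed residual (noise plus distortion) dominates."""
    return 20 * math.log10(residual_vrms / signal_vrms)

SOURCE_VRMS = 3.0
RESIDUAL_VRMS = 10e-6  # hypothetical fixed residual at the ADC input

for step_db in (-10, -20, -30):
    v = apply_gain_db(SOURCE_VRMS, step_db)
    thdn = noise_limited_thdn_db(v, RESIDUAL_VRMS)
    print(f"{step_db} dB: {v:.3f} Vrms, noise-limited THD+N = {thdn:.1f} dB")

# Autoranger adding +18dB of gain to the -30dB signal
boosted = apply_gain_db(apply_gain_db(SOURCE_VRMS, -30), 18)
print(f"-30 dB then +18 dB: {boosted:.2f} Vrms")
```

With the boost applied ahead of the ADC, the signal sits much closer to the nominal 1Vrms, which is why the harmonics become visible above the ADC's own noise.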

-----------------------------

Whew... That's a long one with quite a bit of experimentation using the equipment while preparing the write-up.

As you can see from the text and graphics above, I need to give a big shout out to John Mulcahy and contributors to the Room EQ Wizard software! What an impressive tool for the home audiophile that can be used to measure a host of parameters especially for amplifiers and speakers. Bravo!

One REW bug I need to mention is that the triple-tone signal generator often will not activate the custom frequency settings until I manually change the frequencies individually. I'll try to leave a note on this in the REW beta feedback. [Issue fixed with reinstall as per comments.] I am looking forward to what else the developers have up their sleeves.

Hope everyone's enjoying the music :-).

39 comments:

Techland, 17 November 2019 at 01:40
Yes, a bit dry but very interesting! BTW, Dr. Toole seemed to have missed the Augspurger paper, The Damping Factor Debate, from 1966. It explained why damping factors above 20 are meaningless: the formula used to calculate the damping factor ignores the voice coil or speaker resistance. The effective damping factor is really only around 1.3 for a DF of 16 and up. Read here:

http://www.lansingheritage.org/html/jbl/reference/technical/damping-factor.htm

This explanation seems to be borne out exactly by Toole's measurement results.

Chuck, 17 November 2019 at 05:28
You might want to look at my "measuring amplifiers" thread on AVS: https://www.avsforum.com/forum/155-diy-speakers-subs/855865-measuring-amplifiers.html
As you can see it is from 2007 but my findings still hold true. It is nice to see somebody that is actually measuring amplifiers for numbers and not based on ears. All you need to do is increase the wattage capability of your load and you could then accurately measure the big amps to see if the manufacturers are telling the truth!

Dipolaudio, 17 November 2019 at 12:54
While I agree that on average 1W is sufficient (average RMS power), peak levels are about a factor of 10 higher (voltage peaks), i.e. power peaks are a factor of 100 higher, i.e. 100W. A crest factor of 10 corresponds to 20dB. EBU R128 recommends 14-20dB.

A measurement of voltage and current (peak, real-time) on a Linkwitz LX521.4 gives rather high values on real music.
The power peaks are indicated for each of the 4 ways.

Fleetwood Mac, CD Rumours, "Second Hand News", 112dB SPL peak at 2.5m listening distance. This is very loud!

SL LX521.4            Current (A peak)   Voltage (V peak)   Power (VA peak)
Woofer (2x parallel)  9.0                27.9               251
LowerMid              5.7                38.9               223
UpperMid              6.1                30.0               183
Tweeter (2x series)   3.6                29.0               105

U and I may not be in phase (VA = volt-ampere).

Note: the upper-mid was thermally overloaded and got too hot after this piece (ca. 3-4min)

Mikhail, 17 November 2019 at 14:01
Archimago, thanks for starting this effort! I'm pretty sure with your meticulous approach we will see very interesting results.

Distortion measurements and their correlation with psychoacoustics are indeed a complicated topic. Bob Cordell said recently that the purpose of achieving low THD and IMD in a power amp across its working frequency range is to ensure that the amp design is right and that it behaves as the author intended, that is, there are no parasitic oscillations, overloaded components, etc.

Actually, Dr. Earl Geddes has published some interesting results on the audibility of distortion based on psychoacoustic research and studies; you can check them at http://www.gedlee.com/Papers/papers.aspx. In the paper "Auditory Perception of Nonlinear Distortion" he proposes his own metric for it.

Brewster, 18 November 2019 at 11:21
Hi Archimago, looking forward to seeing what you come up with. I expect the debate will be not about the measurements, but about whether the differences are important (in a double blind test sort of way).

I have two comments:
1) Watch out for non-linear power resistors. I learned from others that Parts Express sells a popular non-inductive power resistor as a load but apparently changed vendors, as the new ones produce odd results compared to the old ones. That caused me to retest my load and I ended up switching to a different design. Details here: https://clk.works/2018/05/power-resistors-for-amplifier-testing-the-solution/ (this is part 2 which shows what I did, the original discussion is https://clk.works///2018/02/power-resistors-for-amp-testing/ )

2) I'm not sure how your setup filters out the harmonics found with class D amplifiers. They tend to throw off ADC inputs in a variety of ways. AP makes a box just for that purpose, but I figured I would try my hand at making my own https://clk.works/2018/05/class-d-amplifier-measurement-filter/ I can't suggest what I hacked together as helpful. It needs to be built properly on a PCB and fully shielded. (A friend did a design that is closer to the AP one and properly shielded, it works great).

Dipolaudio, 18 November 2019 at 13:12
what you ever wanted to know about:

Fleetwood Mac, Rumours: Second Hand News. 2CD Edition, 2004
CD1, track1, Second Hand News
JRiver MC25:
Volume Level R128=-13.2LU
Peak Level Left=-0.1dB, Right=-0.1dB
Dynamic Range R128= 4.4LU
Dynamic Range DR=8
BPM=116
Duration=2:56

Fleetwood Mac, Rumours: Second Hand News. 2CD Edition, 2004
CD1, track1, Second Hand News
Soundforge Pro11
Measurement                                         Left Channel   Right Channel
Cursor position (Time)                              00:00:00.000   00:00:00.000
Sample value at cursor (dB)                         -Inf.          -Inf.
Minimum sample position (Time)                      00:01:14.155   00:00:43.586
Minimum sample value (dB)                           -0.101         -0.101
Maximum sample position (Time)                      00:00:43.065   00:02:03.656
Maximum sample value (dB)                           -0.101         -0.102
RMS level (dB)                                      -13.156        -13.343
Average value (dB)                                  -90.309        -90.309
Zero crossings (Hz)                                 1'480.13       1'616.32
Maximum true peak sample position (Time)            00:01:59.538   00:00:06.723
Maximum true peak sample value (dB)                 -0.000         -0.000
Maximum filtered true peak sample position (Time)   00:02:08.795   00:02:08.777
Maximum filtered true peak sample value (dB)        -0.000         -0.000

Integrated Loudness (LUFS) -9.80
Loudness Range (LU) 4.50
Maximum True Peak Loudness (dBTP) 0.00
Maximum Short-Term Loudness (LUFS) -6.38
Maximum Momentary Loudness (LUFS) -4.83

Now, what I posted yesterday results in a power sum of 762VA, and the values below:
762.1 VA Peak
112.5 dB SPL Peak
-13.2 dB Voltage Peak to Voltage RMS
99.4 dB SPL RMS

So with a peak SPL of 112dB we have an RMS SPL of 99.4dB.

SPL peak to RMS is 13.2dB, or as a voltage factor 4.6 and as a power factor 20.9.

Okay, that's too loud :)

Now we target 85dB SPL RMS, which is 14.4 dB less (voltage or SPL):
Now the total power sum is ca. 24VA (all four channels).

The BPM gives the rhythm of the peaks. The relative duration of the peaks may be estimated at 1:10. So the peaks are short but high.

In power amp measurement this must be considered.
Maximum peak voltage and maximum current should each be characterized independently of the other.

A nice example of amp specs is the Hypex amps. Look at their datasheets.


Dipolaudio, 19 November 2019 at 06:33
for completeness: I said:
Now we target 85dB SPL RMS, which is 14.4 dB less (voltage or SPL):
Now the total power sum is ca. 24VA (all four channels).

This is the peak power. The RMS power is then 24VA/20.9= 1.2VA

Here we are at around 1W (a watt is not equal to a VA, but for simplicity...).

Summary:
1W RMS is okay for moderate listening levels.
Peak power should be 20x to 100x more, assuming a crest factor of 4.6 to 10.
The duty cycle should be in the range of 1:5 up to 1:10, with a cycle duration of 80-200ms.

For the short peaks the power amp should deliver the voltage peaks and the current peaks (independent of the angle, real part, imaginary part, i.e. independent of whether the load is ohmic, inductive or capacitive).

How to test? The test signal is not a big problem, but a load with a defined angle is...
Solution: an active programmable load (professionals have such a thing for various applications and tests; HP, now Keysight, and others make such loads, but don't ask the price).

Here is one Hypex datasheet:

https://www.hypex.nl/img/upload/doc/ncore_mp/nc502mp/Documentation/NC502MP_02xx_03xx.pdf

Dipolaudio, 19 November 2019 at 06:44
What I did to estimate the power capability:
1) Took the amp datasheet
2) Took the rated impulse power at 1, 2, 4, 6, 8 ohms if it existed in the datasheet
3) From the impulse power I calculated Ipeak and Upeak (assuming an ohmic load of 1, 2, 4, 6, 8 ohms, whatever was available in the datasheet) such that Pimpulse = Upeak x Ipeak
4) Got an estimate of what to expect from the amp
5) Tested the amp to these limits

Peak power: test with a short-duration test signal, otherwise the amp will break down (thermal, overload).

Miska, 19 November 2019 at 14:38
For TIM measurements, I'm using the values from the Leinonen & Otala & Curl original measurement specification paper. 3.18 kHz square and 15 kHz sine, 1:5.66 amplitude ratio, filtered by first order low pass with fc=100 kHz.

My test signal is sampled at 768 kHz.

500 Hz + 6 kHz sounds way too relaxed, and the 48 kHz bandwidth doesn't really push the gear. Harmonics of the square wave reach out quite high with the specified low-pass filter. Remember that the specification was originally for analog function generators.

Prep 74, 19 November 2019 at 20:55
Hi Arch

Great to see these amp tests. While on this subject, you might be interested in this YouTube video by Ethan Winer where he challenges Paul McGowan from PS Audio to a debate. Paul claims (in another YouTube video) that we cannot measure everything that can be heard coming out of amplifiers and that null tests are invalid. That was back in January and Paul still hasn't accepted the challenge...
https://www.youtube.com/watch?v=6rB2W0umdq0

Dipolaudio, 20 November 2019 at 09:05
The claim of Paul McGowan is a common myth applied to every audio device.
It's just nonsense.
Actually, much more is measurable than can be heard.
It is true that the relation between a measurement and how it is perceived is not so clear, and is statistical. Some perceive it differently than others.

Have you tried the experiment on base-tone versus overtone perception? Interesting. It's about a chord which some people hear as going from low frequency to high, while others hear the same chord as going from high to low.

Here (in German):
https://www.oberton.org/test-grund-oder-obertonhoerer/

JohnM, 20 November 2019 at 09:16
Regarding "One REW bug I need to mention is that the triple-tone signal generator often will not activate the custom frequency settings until I manually change the frequencies individually. I'll try to leave a note on this in the REW beta feedback. I am looking forward to what else the developers have up their sleeves."

I'm a development team of one, but I'll be happy to investigate if you provide some detail on what you were experiencing/how to reproduce it.

gregdunn, 20 November 2019 at 12:37
I have been using REW for my measurements as well, and it's excellent for speaker measurement as well as for amps and preamps. My needs are modest, but I've found the Scarlett 2i2 is quite decent for an inexpensive test rig. When all audio enthusiasts have something so simple and easy to use for testing gear, perhaps we will get more measurements and less speculation. I can hope.

It's too bad there isn't an easy to use ABX hardware tester, or we could dispose of more of the nonsense claims by deceptive audio manufacturers and dilettantes. A friend and I did a preamp comparison test back in the 70s using a custom box. It wasn't truly double-blind, but we went to great pains to hide the identities of the devices from each other while conducting the test. Very small level mismatches manifested themselves as apparent audible differences; once they were eliminated, two vastly different preamps (one tube, the other op-amp based) were indistinguishable. They actually measured rather differently, but the ear only cared about level and frequency response variations that were large enough.

I've done a lot of ABX testing recently with audio codecs (whose transparency is hard to measure) and found that the ear's threshold for noticing differences is pretty high. With a 128k bit rate on a good codec, over 90% of the music is discarded, yet the ear struggles to hear any difference. And not just me, but many trained listeners.

So keep up the effort to make good consistent measurements and involve the community in a discussion.

gregdunn, 20 November 2019 at 23:54
Good point about defending insecurities! I think we can make ourselves believe many reasonable-sounding things as long as they're not disproven by objective facts. And when your livelihood depends on supporting the belief that certain differences are clearly audible (though perhaps only to the discerning listener!) lots of questionable assertions become unchallenged memes.

Just last week I bought a cheap optical->analog DAC for my office system and thought it sounded quite good - but it seemed to have no bass. So I pulled the device and tested it - to find that the bass rolled off about 20 dB from 1 kHz down to 20 Hz! So that was clearly audible. But the shock was that the distortion reached more than 3% in the critical voice and music bands. That was not evident on musical signals, even on good headphones when I was aware of the defect.

Don't worry, I intend to keep testing and listening - if I find anything interesting, I'll surely try to share it.

Dipolaudio, 21 November 2019 at 13:12
Quote: How do you measure this?

"Finally, the last number, the Power Factor, indicates the number of volts an amplifier can deliver at the top end before it hits 0.1% THD+N (-60dB) into 4Ω. This will give us an idea of how much power is available for the needs of the high fidelity enthusiast since 0.1% should still sound quite good even if a little "colored"."

1) With what software? REW?
2) What test signal, so as not to burn the amp (burst, a few cycles, duty cycle)?
For example, a 10sec long signal with the multitone signal as a burst (a few cycles).
Then Fourier analysis?

How are the peak voltage and peak current measured (oscilloscope)? Audio interface: RME has Digicheck for peaks, for example.
-The test should push the amp to the max but not overload it (0.1% dist+N)
-The test should not damage the amp, hence short bursts with a duty cycle similar to music and not continuous (seconds or tens of seconds, but never a long time).

How?

Dipolaudio, 21 November 2019 at 13:15
I know the software ARTA and Acourate (AudioVero).
Both measure with a log-sweep signal, ARTA also with a PN noise signal.
Neither of them does so with a cycled multitone burst, however.
I do not know about REW.

Dipolaudio, 23 November 2019 at 14:13
I did a test with a Linkwitz signal: sine 1kHz and 5.1kHz, amplitude 1:1,
200ms long in a period of 375ms (=160 BPM).
I then windowed it with Acourate (AudioVero), using the Blackman Optimal window.

I put periods together so as to have a 10sec signal.
I then fed it to the amp and observed the distortion in ARTA (real-time spectrum analyzer).
I observed and measured the waveform with a scope (Siglent SDS 1202X-E): max values and peak to peak.

Load: just 4x 1 ohm 50W resistors in series so as to have a 4 ohm load.

It looks to work as intended.

Adjust the voltage up to the point of 0.1% distortion.
