Sunday, 24 February 2013

MEASUREMENTS: Logitech Touch TT3 Mod.

Here's some data on my Touch with Soundcheck's Touch Toolbox 3.0 (http://soundcheck-audio.blogspot.ca/...oolbox-30.html) software mod applied.

Before I start, I just want to say that over the years I have not been involved in any of the discussions around this mod or the merits of it. Even though I do not necessarily agree with many of Soundcheck's comments on his site, I do appreciate his work in putting it together and creating the install script which was easy to run; it's always good to have hobbyists experimenting with stuff and it's interesting to see the feedback from users.

I followed his instructions to turn off the plug-ins, change to server-side decoding of FLAC in the "File Types" tab, "No Volume Adjustment", etc... As per the instructions, I used WinSCP to transfer the script and PuTTY to log into the Touch.

I downloaded the script with 'wireless LAN deactivated' but noticed in the status screen that WiFi was labeled as "enabled" still (not sure if this means it's on or off!), so I used the 'tt -w' to turn off (or on) the WiFi status for good measure during a few of the measurements. Here's the screenshot of the "status" output for the "NoWiFi" condition in my charts (in retrospect, I think the modification turned off the WiFi, so when I 'disabled' the modification, it means I must have turned it on... not that it makes any difference as results show):


I also wanted to test the analogue output with "tt -o 1" but I could not get analogue output to work even though the status screen said it was enabled. If anything, one would expect that turning off the screen, disabling WiFi, and using server-side decoding might drop the analogue noise floor by maybe a few dB... Oh well, I guess anyone who would go through this amount of modding would not be using the internal DAC anyways.

Procedure same as tests in previous posts, here's the result from the Touch+TT3 --> ASUS Essence 1 --> XLR cable --> E-Mu 0404USB:

16/44:
[Image: E1-TT3_16-44.jpg]

16/44 THD graph:


24/96:
[Image: E1-TT3_24-96.jpg]

24/96 THD graph:


The first column of each table is the stock Touch using the coaxial output (proper shielded coaxial cable of course).

Conclusion:
I see no difference with the Touch as digital transport on these objective measurements despite the screen being off, no WiFi, and server-side decoding. Subjectively, likewise, the Touch sounds great played through the Essence 1... Loving the sound of some Depeche Mode as I'm typing here with the mod still installed. I can see how not having the screen on and the ability to "touch the Touch" to make track selections would annoy me after a while :-).

At least in my case with this specific DAC, the Touch Toolbox 3.0 mod made no difference to measurements even down to below -110dB noise floor.

MEASUREMENTS: Logitech Touch as transport!


The following was a post I made on the Squeezebox Forum a few weeks back to hopefully answer the question: How well does the Touch function as a digital transport?

Setup:
All the results here are made with the Touch with Triode's EDO firmware installed. For the sake of clarity, when I indicate a test was run in "EDO mode", this means it's the "digital only" mode, otherwise I'm referring to the default "digital + analogue" mode. For those wondering, there was no difference in measurements between stock firmware and with the EDO firmware installed.

The Touch was connected by ethernet to the basement music server --> either TosLink or coax out --> ASUS Xonar Essence 1 DAC --> XLR cables --> E-MU 0404 USB for measurement.

As previously measured, the balanced XLR output from the Essence One has so far given me the best electrical noise suppression and lowest noise floor, so the Touch was used as the digital transport for this system. I then played the RightMark calibration and test tones off the Touch and measured the analogue out from the Essence 1.

About the digital cables:
TosLink - cheap "VITonet" labelled plastic fibre cable - no idea where I got this, if I bought it, would likely be <$10 at local supplies store. Pretty thin and flimsy looking but gives a good tight connection with the ends. You can see this cable in the picture above just to the right of the Touch.

Coaxial - this gave me an opportunity to try a simple unshielded 3' stereo audio RCA cable (zip cord supplied with an old DVD player that I never used! Forget any impedance matching or electrical shielding) vs. an actual shielded 6' coaxial cable, the "Acoustic Research Pro Series" I bought 10 years ago for ~$20.

I unplugged the Essence 1's USB cable from the computer during these tests to avoid potential noise pollution. Also, when testing the TosLink, the coaxial SPDIF was not connected.

First off, 16/44:
[Image: 16-44_Audio.png]

Nothing to see here in terms of the Touch as transport! Essentially perfect measurements... The 1st column is just the audio played to the Essence 1 through USB 2 (note there's something wrong with the IM value here - I suspect it was a spurious error from this run). Whether I used TosLink, cheap RCA, actual coaxial to connect to the Essence 1 from the Touch did not matter.

Now, 24/96 (standard "digital + analogue" mode):
[Image: 24-96_Audio.png]

Hmmm, not unexpectedly, the "RCA as coaxial" is the standout here: slightly worse noise level (~1 dB), reduced dynamic range (~2 dB), and mildly worse stereo crosstalk.
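
To put decibel differences of this size in perspective, here is a minimal sketch that converts a level difference in dB to a linear amplitude ratio using the standard 20*log10 relationship. The function name is just illustrative and is not part of RightMark or any other tool used here.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude (voltage) ratio."""
    return 10 ** (db / 20)


# The ~1 dB and ~2 dB differences noted for the unshielded RCA cable
for delta_db in (1, 2):
    print(f"{delta_db} dB -> x{db_to_amplitude_ratio(delta_db):.3f} in amplitude")

# A -110 dB noise floor expressed as a fraction of full scale
print(f"-110 dB -> {db_to_amplitude_ratio(-110):.2e} of full scale")
```

In other words, 1 dB is roughly a 12% change in amplitude and 2 dB roughly 26%, applied to a noise level that is already a few millionths of full scale.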

How about 24/96 in "EDO mode" with the analogue output turned off?
[Image: 24-96_EDO_Audio.png]

Essentially the same as the standard mode. It appears that turning off the analogue output circuitry (or at least silencing it) does not affect the test results in any meaningful way. Again, the RCA cable performance is inferior to a proper shielded coaxial cable. However one has to realize that at this level of performance, the difference is really so minor that it's unlikely anyone would be able to tell a difference from listening! To give you an idea, here's the THD plot; notice how little difference there is, with the RCA cable (cyan) rising just above the others in roughly the 2-10kHz range.

[Image: 24-96_THD.jpg]

Conclusion so far:
1. 16/44 performance is beyond reproach according to these tests.
2. 24/96 performance likewise is excellent. Starting to see the limitations of an unshielded cable connected to the coaxial SPDIF but even in such an extreme situation, the rise in noise floor is likely inaudible.
3. Turning off the analogue output does not appear to improve the quality of the digital transport.

Now it's time to talk 24/192 with the Touch with EDO plugin. Thank you Triode - amazing plugin/kernel!

Results:
[Image: 24-192_EDO_Audio.png]

What I find impressive here is the fact that TosLink to the Essence 1 worked at 24/192!!! In fact, that picture of the Touch in the previous post playing the 24/192 John Coltrane's "Blue Train" (Classic Records HDAD release from 2001) was through the TosLink (you can see the 192kHz LED lit up on the Essence 1). This is why I'm very impressed by the components used in the Touch; kudos to Logitech and ASUS! Over the years, this is the first time I've been able to play 24/192 for hours without obvious clicks/pops/disruptions even with an inexpensive plastic cable. To show the TosLink result above wasn't a "one-off", here's a series of 4 runs with the TosLink done over about 2 hours - notice the inter-test reliability.

[Image: 24-192_TosLink_consistency.jpg]

Like with the 24/96 tests, the cheap RCA cable is showing its limits even more with measurably increased noise floor (10 dB worse!) from the lack of shielding and possibly data errors as the speed of data transfer doubles. Another interesting phenomenon is that even the shielded coaxial cable has a measurably higher noise floor compared to the TosLink by about 2dB. Would an expensive coaxial cable improve this? Possible I guess if the shielding is excellent, but even so, would anyone ever notice at around -110dB!?
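
For a sense of what "the speed of data transfer doubles" means on the wire, here is a rough sketch of the S/PDIF line rate at each sample rate. It assumes the standard IEC 60958 framing (one frame per sample period, two 32-bit subframes per frame, biphase-mark coded with two line cells per bit); the function names are illustrative only.

```python
BITS_PER_SUBFRAME = 32    # IEC 60958: 32 time slots per subframe
SUBFRAMES_PER_FRAME = 2   # one subframe per channel (left, right)
CELLS_PER_BIT = 2         # biphase-mark coding uses two line cells per data bit


def spdif_bit_rate(sample_rate_hz: int) -> int:
    """Raw S/PDIF bit rate in bits per second for a stereo stream."""
    return sample_rate_hz * BITS_PER_SUBFRAME * SUBFRAMES_PER_FRAME


def spdif_cell_rate(sample_rate_hz: int) -> int:
    """Line cell rate in cells per second after biphase-mark coding."""
    return CELLS_PER_BIT * spdif_bit_rate(sample_rate_hz)


for fs in (44_100, 96_000, 192_000):
    print(f"{fs} Hz: {spdif_bit_rate(fs) / 1e6:.4f} Mbit/s, "
          f"{spdif_cell_rate(fs) / 1e6:.4f} Mcells/s on the line")
```

Note that the bit rate depends only on the sample rate, not on whether 16 or 24 of the audio bits are used, which is why the jump from 96kHz to 192kHz is what stresses the cable and receivers.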

Another observation with this test is that the noise level, dynamic range, and stereo crosstalk are all WORSE than at 24/96. Every piece of equipment here, from the E-Mu to the Essence 1 to the Touch, is running at its maximum specification, so the system is running at its limit.

Here's the THD graph showing the increase in noise with the coaxial interface - el cheapo RCA (cyan) is looking "bad" here:
[Image: 24-192_THD.jpg]

Conclusion:
1. Thanks again to Triode for the EDO plugin/kernel. It works beautifully and in my system with TosLink working, the Touch measures within 1-2 dB in terms of noise level, dynamic range, and stereo crosstalk compared to a direct USB connection to the DAC for 24/192 playback!

2. If you can get the TosLink to work, you've set yourself free from electrical noise with "galvanic isolation" of the Touch and DAC. Again, the fact that I could get TosLink 24/192 to work reliably between the Touch and Essence One really is impressive and speaks well of the equipment.

3. A recurring theme in these tests is that of ELECTRICAL NOISE. Coaxial SPDIF cables need good shielding at 24/192!

4. 24/192 does not measure as well in my system as 24/96. Writers like Lavry and xiph.org ("24/192 Music Downloads Make No Sense") have already eloquently documented their opinions against 24/192 and I guess I can echo their concerns with the gear I'm using for these tests... Firstly, between 24/96 and 24/192, the difference is ultrasonic; do we demand high-end SLR digital cameras to also capture ultraviolet light? (Sure, you might want to do this for specific scientific reasons.) Secondly, other than a handful of albums usually from smaller labels like 2L, Reference and Linn, I have rarely come across truly native 24/192 (or 24/176) recordings. IMO, it also makes no sense to buy stuff like DSD64 converted to 24/176 such as many of the HDTracks offers.

Addendum: Feb 27, 2013
Thanks to slimdevices forum member "tpaxadpom" who measured the digital output with the AP2722 unit:
SPDIF RCA 377.3 - 324.5 ps
Toslink 1.604 ns

RCA much better in terms of jitter measurements.
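
Since the two figures are quoted in different units, a tiny sketch to put them on the same scale may help. It only uses the numbers reported above; the helper name is hypothetical.

```python
PS_PER_NS = 1000.0

rca_jitter_ps = (324.5, 377.3)   # range reported for the coaxial (RCA) output
toslink_jitter_ns = 1.604        # value reported for the TosLink output


def toslink_to_rca_ratio(toslink_ns: float, rca_ps: float) -> float:
    """How many times larger the TosLink jitter figure is than the RCA one."""
    return (toslink_ns * PS_PER_NS) / rca_ps


for rca in rca_jitter_ps:
    ratio = toslink_to_rca_ratio(toslink_jitter_ns, rca)
    print(f"TosLink {toslink_jitter_ns * PS_PER_NS:.0f} ps vs RCA {rca} ps: {ratio:.2f}x")
```

So the TosLink figure works out to roughly four to five times the coaxial one.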

MEASUREMENTS: Logitech Squeezebox Touch. [Updated 2013-06-22]


I got this unit late last year lightly used when Logitech announced the Touch's discontinuation. I notice it is quite hard to get one of these now and they're commanding quite an elevated price on eBay!

- Internal DAC chip: AKM4420

Setup: i7 computer system - same as Essence One. Analogue output from Touch going into the E-MU 0404USB for measurements.

Oscilloscope measurement of 1kHz square wave, 0dBFS off the analogue RCA output:
Nice clean waveform. 2.92V peak voltage. No significant channel imbalance noted.

Standard linear phase oversampling digital filter impulse response (16/44). Absolute polarity maintained.

RightMark Results:
[Image: Touch_Summary.png]

Mostly better results all around compared to the SB3. Frequency response is within a tighter range over the 20Hz-20kHz spectrum, noise levels are 1 dB lower in the 16-bit domain (as if it could go any lower!?), and about 4dB lower with 24-bit data, giving the Touch DAC about 17.5 bits of dynamic range. Interestingly, the stereo crosstalk is a bit higher in the Touch vs. SB3 by about 6-8 dB (remember, it's still down at -90dB).
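
For readers wondering how a dynamic range in dB maps to "bits", one common rule of thumb is the ideal-quantizer formula DR = 6.02*N + 1.76 dB. The sketch below applies it in both directions; it is an assumption that this matches how the figure above was estimated, and the 107.1 dB input is purely illustrative rather than a value read off the table.

```python
def ideal_dynamic_range_db(bits: float) -> float:
    """Theoretical dynamic range of an ideal N-bit quantizer (full-scale sine)."""
    return 6.02 * bits + 1.76


def dynamic_range_to_bits(dynamic_range_db: float) -> float:
    """Equivalent number of bits for a given dynamic range in dB."""
    return (dynamic_range_db - 1.76) / 6.02


print(f"16-bit ideal: {ideal_dynamic_range_db(16):.2f} dB")
print(f"24-bit ideal: {ideal_dynamic_range_db(24):.2f} dB")

# Illustrative input only, not a measurement from this post
example_dr_db = 107.1
print(f"{example_dr_db} dB is about {dynamic_range_to_bits(example_dr_db):.1f} bits")
```

Each additional bit is worth about 6 dB, so a DAC whose noise floor drops a few dB when fed 24-bit data is resolving a bit or so beyond 16.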

Notice no significant difference between WiFi and wired through the ethernet.

Frequency Response:






16-bit audio vs. 24-bit audio. Looks good. Less bass drop-off than the SB3 with my equipment.


Noise Floor:




THD Graphs:




Jitter (Dunn J-Test, WiFi):
16-bit:

24-bit:


Analogue outputs look very clean.

Summary:
I guess the only surprise is that the stereo crosstalk is higher in the Touch than the SB3. Otherwise, audio quality seems superior - it can obviously handle 96kHz natively (up to 192kHz with the EDO plug-in as a digital transport), and flatter frequency response especially in the low bass could be audible.

Again, assuming the WiFi strength is reliable and you're not constantly rebuffering, I see no indication that sound quality is negatively impacted by going wireless.

Subjectively, I like the Touch's sound. These days, it powers my bedroom system with SONY amp and Tannoy mX2 bookshelf speakers I got about 10 years ago. For what it is, I can't complain about these analogue output results - very competent! Although you can get better measuring noise floor, dynamic range, etc. with an outboard DAC, the Touch in its stock form is already very impressive and it would be wise to do some A-B testing before thinking an expensive DAC will improve the sound much!

For what you get, the bang-for-the-buck from this little device is fantastic and it's a shame really that it has been discontinued.

MEASUREMENTS: "Slim Devices" Squeezebox 3 [Updated June 25, 2013]

Next up - my classic "Slim Devices" Squeezebox 3 (I believe this is one of the "first run" units; I was on a wait list at introduction):
- Internal DAC chip: TI/BB PCM1748E


Setup: i7 computer system with the analogue outputs of SB3 --> E-MU 0404USB. Details of this setup are the same as for the Essence One tests.

Here's the oscilloscope plot of a 1kHz square wave at 0dBFS off the analogue outputs. The square wave has a slight downward slant. Peak voltage 2.64V. Nice channel balance (yellow = right, blue = left).

Standard linear phase digital reconstruction filter.

RESULTS:

First 2 columns are the stock SB3+stock wallwart connected to my basement music server by WiFi. Notice that 24-bit data does result in dropping of the noise floor by ~6dB. It looks like the good ol' SB3 internal DAC is capable of about 17-bit resolution when fed with 24-bits. Note that the old Stereophile review from 2006 did not measure 24-bit performance.

Second 2 columns are the same setup but with the ethernet (hooked up to my DLink gigabit switch 6 feet away). Essentially no difference compared to the WiFi.

Final column is with the SB3 over WiFi but with the *SB Touch wallwart*. I see folks here talking about the crappy wallwart (true, the UNIFIVE wallwart looks and feels nasty compared to the one that came with the Touch!). I fed the 24-bit data and the result is essentially the same as the UNIFIVE. Based on these measurements, I don't see any evidence that the Touch wallwart would improve the stock SB3 performance.

Frequency Response:


Decent - obviously not as flat as Essence One from 20Hz-20kHz...

Noise Floor:


16-bit data obviously not as good as 24-bits. No difference between the WiFi vs. Ethernet groups.

THD Graph:


Dunn J-Test:
16-bit (16/44) -



24-bit (24/48) - 

CONCLUSION:

1. The SB3 can benefit from 24-bit "hi-res" audio. Whether you can hear the extra 5-6dB is your problem :-)
2. I see no evidence that a Touch wallwart would improve the performance over the cheap stock power supply. Who knows if the multi-hundred $$$ or linear power supplies make a difference...
3. No evidence that running in WiFi mode will add any noise to the SB3 output.
4. The Dunn J-Test is demonstrating minimal jitter.

Saturday, 23 February 2013

MEASUREMENTS: E-MU 0404USB.

Since late 2012, I upgraded my computer workstation's DAC to the ASUS XONAR Essence One and this left my E-MU 0404USB free to be used as a dedicated ADC and measurement device.

For ease of use, I decided to "standardize" on the RightMark Audio Analyzer suite of measurements for these 0404USB measurements. At this time, the newest version is 6.2.5, available free from the website. In time, I might upgrade to the PRO version. What I can say so far is that RMAA is remarkably consistent with good inter-test reliability so long as one has the technique figured out with one's equipment. It's very easy to get things wrong during calibration, for example. As usual, there are caveats to keep in mind and the results I get here may not be comparable with others (good writeup here: http://nwavguy.blogspot.ca/2011/02/r...yzer-rmaa.html).

One nice feature about external DAC/ADC's like this one is the fact that it has LED's to indicate if signal is clipping. I've found this invaluable in measurements since it allows me to make sure I'm maximizing the measurable dynamic range.


To get a sense of what the E-MU 0404USB is capable of, here are a few numbers in RCA loopback mode - it looks like the 0404USB is not bad as a DAC itself! Internal DAC chip AKM AK4396 (same as Transporter).

Setup: AMD Phenom X4 Win 8 laptop on battery <---> USB2 <---> E-MU 0404USB (RCA loopback)
0404USB Driver: 1.40.00 Beta, L6.1.30.07 firmware.
USB cable used: generic high quality (<$20)
RCA cable: simple Radio Shack headphone phono to stereo RCA cable. As you can see in the image above, I have some generic RCA-to-XLR converters as well to plug into the analogue inputs up front.

Summary compared with Logitech Transporter (more detailed measurements of this device in a future blog post):
[Image: 0404USB_Summary.png]

Notice that the Transporter measures better here but it is using XLR cables. If I had some balanced TRS to XLR cables for the E-MU, I suspect it would be even better.

Nice and flat frequency response at 24/96:

Here's the THD graph (24/96 again):

The jitter is really good for the E-MU also (it uses an asynchronous USB2 interface). This was done with the Dunn J-Test played at 24/48 and analyzed with WaveSpectra 1.40E running a 131,072 point Blackman-Harris FFT. Note that I have seen various comments made in the forums that the 0404USB has bad jitter issues - well, I don't see it; in fact, if I were to estimate, we're looking at <250ps using both the DAC and ADC in this case:
[Image: Jitter_24-48.jpg]
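
For reference, the numbers behind that J-Test setup can be worked out directly. The sketch below assumes the usual description of the Dunn J-Test (a main tone at fs/4 plus a low-level square wave at fs/192) and computes the bin spacing of a 131,072-point FFT at 48kHz. The function names are illustrative and nothing here is specific to WaveSpectra.

```python
def fft_bin_width_hz(sample_rate_hz: float, n_points: int) -> float:
    """Frequency spacing between adjacent FFT bins."""
    return sample_rate_hz / n_points


def jtest_frequencies_hz(sample_rate_hz: float) -> tuple[float, float]:
    """Main tone (fs/4) and low-level modulation (fs/192) of the J-Test signal."""
    return sample_rate_hz / 4, sample_rate_hz / 192


fs = 48_000
n_fft = 131_072

tone_hz, modulation_hz = jtest_frequencies_hz(fs)
print(f"J-Test main tone: {tone_hz:.0f} Hz, modulation: {modulation_hz:.0f} Hz")
print(f"FFT bin width: {fft_bin_width_hz(fs, n_fft):.3f} Hz")
```

With bins well under 1 Hz wide, jitter sidebands spaced at multiples of the modulation frequency around the main tone are easy to separate, even allowing for the fact that a Blackman-Harris window spreads each tone over several bins.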

With digital silence playing, here's what the noise floor looks like:
[Image: Silence_(Plugged_into_loopback_RCA).jpg]
A few spikes noted, the tallest about -135dB down.

Noise floor zoomed into 0-100Hz range - notice essentially no 60Hz pollution:
[Image: Silence_0-100_Hz.jpg]

For such an inexpensive unit, I'm impressed! In the days ahead, I'll update this blog with the other DAC's and devices tested... I certainly do not claim that the 0404USB is near as good as dedicated measurement gear like the AP devices used by Stereophile. However, for the at home hobbyist, I suspect these kinds of tests will do nicely to help differentiate what's reasonably good from the bad.

From a subjective perspective, I've enjoyed using the 0404USB as my main DAC for about 3 years. The sound is precise and clean. Some would call it "analytical" but I honestly have no idea what that's supposed to mean since that sounds like a good thing to me :-). Basically, an honest portrayal of the digital audio with a good flat frequency response.

I do have a few gripes:
1. It's discontinued and the drivers were never great.  As noted above, I'm using the latest "beta" driver which is at least a couple years old. At least it's still compatible with Windows 8! On occasion, if I used the ASIO driver, it would mess up Windows mixer and I'd need to reset the sampling playback rate.

2. The headphone out is weak. It's fine with efficient headphones but my AKG Q701's were too much for the headphone amp.

3. Volume pot's got a bit noisy over the years.

Musings & Gear Measurements Coming...

With the MP3 test complete and results posted, I figure it's time to move on to other matters which I think could be interesting to the audiophile hobbyist.

I'm going to start with a rather simple, personal discussion around this audio hobby today.

As suggested by the blind MP3 test conducted, my personal philosophy towards "audiophilia" is one biased towards objectivism/empiricism. In my world view, even though ultimate joy in music is a subjective experience, the technologies employed to convey this beauty come from engineering and science, which have clear goals and empirical methodology. I ascribe the beauty of music to the artist, not the engineer or manufacturer, and certainly not the equipment.

Unlike some opinions and editorials I've read in the past, from what I can tell the only real "gold" reference/standard in recorded music is IMO NOT the live event. Rather, the first step in transducing sound into the electrical domain (i.e. the mics or synthesizers themselves) remains the key. You can only reproduce the audio as well as what was initially fed in, and the standard that I want to hear is that "live" mic feed because an accurate reproduction of that is the best anyone can ask for. In a "direct to disk" recording, the recording chain is straightforward and that mic feed potentially can be heard at home; but in reality, most recorded music has gone through many steps in the studio and as a result, the "gold standard" becomes even more murky, dependent on the artist, mixer, recording engineer, and how they're hearing the final result with the studio gear.

My simple criteria for good gear:

Criterion 1. Are the objective measures good enough based on what we understand about hearing to show that competent engineers and manufacturers have done a good job in designing and producing this piece of equipment? In my way of thinking, accuracy is all that matters. There is no 'good' or 'bad' gear. Although there can be new discoveries in the audio sciences, I suspect there's nothing "earth shattering" left to find after all these years that cannot be adequately measured.

Criterion 2. Subjective listening tests with familiar music in a familiar room to verify that indeed it sounds good with music I enjoy.

The measurements I'll be posting are some attempts at Criterion 1... Many of the measurements first took shape on the Squeezebox Forum here:
http://forums.slimdevices.com/showthread.php?97950-MEASUREMENTS-Some-Squeezebox-numbers-to-consider


Tuesday, 5 February 2013

High Bitrate MP3 Internet Blind Test: Part 4 - SUBJECTIVE DESCRIPTIONS

Previous - Part III: Discussion

I know some of you have been wondering if the dataset contained descriptions of the subjective experience of respondents between the two sets. Compiled below are the respondents who commented with a subjective description - I have removed any identifying information in the spirit of maintaining anonymity and only included the responses where there was a suggestion of what was heard/not heard that may have allowed the respondent to choose. There were understandably few comments from those that did not hear an audible difference.

IMO, whether the commenter was correct in identifying the lossy compressed sample or not is not as important as the thoughtful consideration and comments in participating in what I believe is a difficult task.

As a reminder, the question was which 'Set' was felt to be 'inferior'. Set B was the actual lossy compressed option.

Those who chose Set A:


"Although I could easy hear the differences. It's very hard too discribe the difference. They are very small. There is some glare, a little bit harsness, a liitle less ambiance, ect. Most easy was it too recognize the differences on voices on the church track."

"Set B sounded more "there" to me, for each track. In particular, bass sounded more present, and treble sounded sweeter. Interested to see the final results"

"I could hear the most difference at the top end. Symbols/High hat and "s" seemed subdued in set A. However I am not sure if the conversion process to MP3 didn't enhance/reveil the top end. I struggled to detect any difference on the Megaherz tracks. I'll try again on a good HIFI later as I am sure my old scud of a laptop's sound card isn't up to much."

"I sort of tried to deduce. In term of preference, I have no preference over either (not my kind of tunes). Set A just sounds a tiny bit louder and given they seem to have been treated to be at equal loudness, I just thought a bit louder = lower dynamic range = must be compressed (the same way my AVR makes sound seem louder by reducing the dynamic range). :D"

"Much more difficult to tell a difference than I thought. Shorter song selections might have made it easier (hearing memory isn't very long). I hope my choice was correct! I just thought the B selections sounded better. Can't wait to see the results."

"Well, took me a while to tell the differences but indeed there are differences especially in terms of details, attack and decay. Easily identifiable on large loudspeakers rather than on headphones. I am quite surprised that they are very close. A better comparison than this would have been a high res studio master/higher bit rate vs MP3 320kbps. In the first place, the music information should have been there rather than just an upsampled CD/MP3."

"It comes with experience. Harmonics isn't easy to listen out for in the first try."

"Bass drum @ ~2 minutes of Church Distortion guitar and cymbals at start of Keine Zeit"

"Very difficult to tell the difference - in fact, it seems that it's just an impression that set B sounds better, slightly clearer, maybe. But listening to he samples more and more seems to even them out. Can't tell if there is more bass or treble here or there. It's just an overall impression. Thanks for putting this survey together. I'm curious to see the results! All the best!"

"I noticed a difference within a minute or so on my computer which has a small Hi-Fi system with bookshelf speakers. I confirmed it in my listening room and my wife concurs. By the way, you did not ask age, but for reference, I have 65 year old ears."

"Listened for the deeper base and the higher sparks. Pink Floyd's Time and Lyle Lovett's Church made it easier to find those missing bits that MP3's remove due to general midrange equipment cannot reproduce those extreme lows and highs. Those details are retained in the FLAC file. The bass went deeper with the subwoofer for set B and the piano sounded more full bodied with on the headphones."

"Set B sounded sharper if you listen carefully a few times, but the third song were hard due to the drum"

"I can't pinpoint exactly what's different, but set B sounds more open and sparkles more than set A to my ear."

"Extension of frequency by comparison, like if lossy, u won't hear very low or high frequency."
 
"'Time' set A actually sounded superior to me. But overall, B superior."

"I made the test two times. At the very beginning of the survey in good and silent conditions and 2nd time today January 27 with windows open and neighbors drilling behind the wall and electricity polluted. Both time the results are the same and are manifested with MUCH more air, natural details of female voices, clapping less harsh cymbals and generally more separated sources (voices and instruments). The DSOTM was the hardest, because my home version sounds slightly different. The metal band was easiest, because mp3 turns the cymbals into a constant dirty sheen."

"B just sounds more "real" especially in the percussion on Megahertz - listen to the hi-hat @ 1:14 (i'm a drummer); but I could be just be guessing ... it is VERY difficult to tell the difference! thanks for putting this together, looking forward to "hearing" the results!"

"obvious only on Keine Zeit (i.e. need complex sounds)"

"I wouldn't use the terms "inferior" or "superior" or "better" or "worse" - I'd just ask "Which one did you like better?" "Time" is one of my all-time favorite songs. I heard a difference immediately between "A" and "B" - but the funny thing is, I couldn't tell you what the difference is. Tried listening with the "...is it the bass? The high end? Better depth? Better tone?..." kind of mindset, but that was hopeless. What it comes down to is "B" sound more "solid" and "3 dimensional" and "real" to me than "A" does... but I can't tell you why. It just does. I don't know if it's because of the difference digital datastreams, or because of the processing you did to *GENERATE* the different data streams. For all I know, the "B" track might be the MP3, but whatever you did to transcode the "A" to the "B" made me like "B" better. Who knows? Thanks for putting this together, was quite interesting."

 

Those who chose Set B:


"With the good DAC this was REALLY easy to discern. I could tell within a second which was which. It's interesting it did not take a megabuck rest of the system to hear the difference. With the SBT analog outs the difference was much closer, but still there."

"I wonder if the you should have used tracks with "more going on" to help the listener? Tracks with more prominent percussion for longer periods of time, for example, since that's where it was fairly easy to detect mp3's in the past. Interesting test... it's my first listening test. Thanks!" (Selected "no difference")

"Although picked B as mp3, I'm not really that sure. Only a few things made me *think* this is true (e.g., piano). But I wouldn't be surprised to hear that I was wrong. mp3 is at worst very transparent here."

"the music was flat vs open i also tried with (cheap) headphones on a laptop and was almost impossible to hear a difference"

"Presence."

"There is a common sound to PCM processed with floating point math, also dither (especially with noise shaping) and MP3 (worst of all). All these processes smear the precision of the sounds in time and stereo position, likely introducing quantisation errors that the designers of AD and DA converters tried hard to avoid :)"

"I'm 67 and sadly, my high end rolls off @13khz. I found the highs on the bells & chimes in "Time" to be harsher and more strident on the B version"


"I used ABXer for the test. I scored 100% on Church, 60% on Keine Zeit, and a big zero on Time." (Ed: Thank you for spending the time!)

 "I listened to the cymbals. HF content, easy clue for spotting bad compression. Cymbals seemed more mellow/relaxed/natural in A, but if there was a volume difference, that probably fooled me. I heard very little difference."

"I'ts a very difficult test. I've recognized difference only on the first Time track. Others are indistinguible for me."

"great test this has helped me realize that i don't have those golden ears and i should just simply enjoy the music"

"not sure if i'm correct as my audio set-up is not exactly "hi-fi", but the most audible difference i heard was the hi-hat in the ending part of church, in B, it sounded more synthetic compared to A. I guess MP3s are still awesome when space is a constraint, i don't strain my ears so hard when listening normally and so do most people in order to enjoy music! (right?) very glad to help you with the survey, hope i have been of some help!"

"hard edge at top end"

"Not much difference really. Maybe headphones thru a laptop isn't much but noticed in Lyle Lovett's Set B, his vocal's seem more 'pronounced'"

"For me, the one thing I noticed is the the sound stage was completely different with the clocks on Time. One set had a real definied space for each clock, where the other was a lot tougher to pinpoint the clocks actual location. In fact, I would have honestly said it was two different masters moreso than one being compressed or another. I've not experienced anything like that with my LAME Mp3 settings... my mp3s match my FLAC files to the point no one has ever been able to guess which is which. This has been a very interesting an fun thing! Next time let me compress the files and give them to ya! ;)"

"There is a distinct difference between the sharpness of both encodings but whether the sharpness is considered better quality or inferior quality is up to our tastes. I chose the sharper quality as the better recording."

Saturday, 2 February 2013

High Bitrate MP3 Internet Blind Test: Part 3 - DISCUSSION

Previous - Part II: Results

DISCUSSION:
So, what does this all mean?

Firstly, it's important to keep in mind the limitations of this survey. As an attempt to gather testers around the world, there are numerous uncontrolled variables including the varying degrees of technical savvy among users and competence in terms of maximizing the sound quality of their gear. Having said this, looking at the responses I got, I believe most respondents did give the test a fair trial and looking at the responses where equipment was listed, it's clear that the cohort doing this test is beyond the average consumer of audio electronics in terms of quality of hardware. For the most part, even those describing equipment used as <$100, the models chosen are generally highly regarded within the price bracket.

Despite the lack of control of equipment or listening methodology, this test is 'naturalistic' and captures the preference of the "audiophile" in his/her own room, with his/her own equipment. Even if unfamiliar with the music, there's a familiarity with the sound of the gear and the room which one would expect should help with sound quality evaluation. Furthermore, plenty of time was afforded so there should have been no stress since this was not a time-limited task nor were the respondents forced to choose one or the other (as I said in the instructions, I was also interested in those who did not think they could hear a difference).

As I noted in the PROCEDURE page, the MP3 encoding is somewhat unorthodox in that the parameters used were chosen to mask certain anomalies easily detected in MP3 files sourced with standard settings. Nonetheless, I believe the resulting quality still reflects approximately the same lossy characteristic as a direct 320kbps encode. In fact, one might even suspect that these test files could actually be worse (from an accuracy perspective in comparison to the lossless source) because the audio was run through the psychoacoustic process twice (once at 400kbps, second time 350kbps), and in retaining the full 16/44 audio spectrum, significant portions of the bitrate were devoted to encode inaudible frequencies rather than more accurately represent the audible.

Reading the comments on the various message boards, I believe that I have been successful in maintaining the anonymity of the MP3 files. There was one board where someone commented on how the frequency spectrum appears unusual but was not able to identify which was MP3 sourced.

As for the test itself (a blind AB comparison) and the survey question "which Set sounds inferior", the respondent has to make 2 choices:
1. Is there a difference between the two Sets of audio? If not, the respondent can vote "no difference".
2. If there were a perceived difference, which is "inferior"?

For question 2 above, intellectually we can imagine that "lossy" compression implies the music has been altered such that the loss is somehow bad or a degradation in quality. Likewise, the general consensus in media (as per my links in Part 0) suggests MP3 should be "bad sounding". But isn't it also possible that running music through a psychoacoustic model may "clean up" the sound by retaining a focus on the most relevant signals? One might imagine that this might come across as a less noisy background or reduced ultrasonic intermodulation distortion since high frequencies are often filtered out. An alternate model like the ABX paradigm would have resolved these two concurrent decisions but ensuring the integrity of a blind test would be impossible.

Even based on the result from this admittedly small survey of 151 respondents, there was a significant preference for the sound of the MP3 Set (ie. most thought the lossless Set sounded "inferior"). The fact that a significant result was achieved suggests that high bitrate MP3 is NOT strictly "transparent" since this would imply exactly the same sound and presumably a random insignificant result. The fascinating suggestion from this dataset therefore is that in a blind test, most listeners would actually consider the MP3 tracks as sounding better! This pattern of preference surprisingly appeared EVEN STRONGER in those using more expensive equipment to evaluate. Furthermore, respondents who thought there was a greater difference in the more "noisy" and distorted track 'Keine Zeit' also showed an even stronger preference for the MP3 encoded version (some were very vocal in noting how "obvious" this was) even though from an objective perspective, this was the most difficult track for MP3 encoding.

As with any survey / study based on group results, even though the consensus points to one conclusion, this does not necessarily apply to everyone. To be clear, there were a few respondents who appeared very sure of their perception in the survey and proved to have been correct.

Going into this endeavor, I expressed that my reason to do this test was to find out whether MP3 encoding resulted in significant deterioration in sound quality. From what I can tell with 151 responses from around the world, a majority did not find a significant deterioration, and surprisingly most thought it sounded superior! Let me know if you've seen any other tests show such a bias.

Thanks again to all the respondents in contributing their time! :-)

Continue to - Part IV: Subjective Descriptions

High Bitrate MP3 Internet Blind Test: Part 2 - RESULTS

Previous - Part I: Procedure

RESULTS:
The final tally for respondents is 151. Here's the updated map of where the responses came from:
As I mentioned in "Part 0", the majority of responses were from North America (64), followed by Europe (47), Asia (33), Australia & New Zealand (4), finally South America (3). It looks like freeonlinesurveys.com may actually not be completely accurate since at least one person indicated they were in Russia which was not highlighted on the map!


First, let's have a look at the demographic that responded to this test in terms of equipment used:


As you can see, the price range (asked to specify in $USD) of the audio gear and system setup varied greatly. A large proportion of respondents used headphones for the test (23%) which I suspect is reasonable especially given the computer-audio nature. I suspect many of us consider the headphones plugged into the computer/DAC to be superior to whatever speakers may be on the desk. 25% responded to the optional field and actually listed the gear used (thanks!). Depending on the price range, scanning the responses I see a huge range of headphones tested (Beyerdynamic DT990, 880 & DT770 seem popular, a few AudioTechnica M50's & AD700, Sony V6, Creative Aurvana, Senn HD800/650/600/570, Bose QuietComforts, AKG K701, Shure SRH440, Hifiman Re-0, Ultimate Ears Triple-Fi 10, Superlux 668B, Fostex). Likewise a full range of speakers (Martin Logans, PBN Montana, Decware MG944, a couple Magnepans, B&W 802D's, KEF iQ1, Sapphire ST2). It's notable that some folks used a combination of headphones and speakers. Some DIY guys also got involved with their own DAC's - one respondent specified a homemade Sabre DAC. Network streamers were mainly Squeezebox Touch models sent to outboard DAC's, one person listed the Naim NDX. As for DAC's, I see everything from a dCS setup to DragonFly to Mytek to Xonar Essence ST / D2 / One's to Meridians... Looking at the detailed responses, I think I can honestly say that respondents took the test seriously, some describing their test procedure and running foobar2000 ABX tester themselves.

As for which song was felt to be easiest to differentiate between MP3 and lossless:
"Time" was the winner followed by "Church". Interesting given that from Part 1, we can say with some objectivity that it's actually "Keine Zeit" which shows the greatest variance in comparison to the original lossless audio. For most of us, familiarity is important and I think for the demographic, "Time" and "Church" would likely be most accessible (like I said, I had complaints about putting the metal track in the test!). Some respondents would have preferred a classical track as well. I agree this also would have been revealing but it's always a compromise trying to keep the test simple and download size reasonable.

Perhaps not unexpectedly, most respondents had to work hard or felt it was impossible to tell the difference between the Sets (total 50.7% for these 2 groups). Interesting that about 1/5 (21%) thought the test was "easy" - it'll be interesting to see later if this confidence leads to accurate identification!

Finally, what you've all been waiting for:

WOW! Remember that Set B was the MP3, yet for those who picked A or B, most thought A sounded inferior! Looking at just the ones who selected A or B, and assuming a 50% chance of success in a "guess", the fact that only 45 respondents out of 123 got the answer correct is statistically significant (probability <1%).
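To make the significance claim concrete, here is a minimal sketch of an exact binomial test on the two counts quoted above (45 correct out of 123). I'm assuming a plain binomial model with a 50% guessing rate; this is not necessarily the exact calculation used for the original write-up, and the function name is just for illustration.

```python
from math import comb

def binomial_tail_le(k, n, p=0.5):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Counts taken from the text: 123 respondents picked A or B,
# and 45 of them correctly identified Set B (the MP3) as inferior.
n_choices = 123
n_correct = 45

# Chance of seeing 45 or fewer correct answers if everyone were guessing.
one_sided = binomial_tail_le(n_correct, n_choices)
# With p = 0.5 the distribution is symmetric, so doubling gives a two-sided value.
two_sided = min(1.0, 2 * one_sided)

print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
```

Both the one-sided and two-sided values come out under 1% under these assumptions, in line with the statement above.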

Let's have a look at those who were confident and said this test was easy:

As you can see, despite the confidence, most of the respondents thought that Set A (the original lossless audio) sounded worse than Set B (MP3).

How about those with more expensive equipment vs. less expensive?


For those who used equipment $6000 and above, we see a distribution similar to the overall result, with most choosing Set A as inferior, but look at what happened to the proportion for those using less expensive equipment. It appears that those using <$500 actually showed a more balanced split between A and B - it seems like the participants with more expensive equipment preferred the lossy tracks.


Looking at the larger groups, it was interesting to see that those who used speakers (either floorstanders or bookshelves) seem to prefer Set A more than headphone users (likely not significant but interesting observation):



As for the songs themselves, the song "Keine Zeit" where the lossy file measured with the most variance compared to the original lossless file (ie. the song most difficult to encode resulting in the most error), was the one where most preferred the sound of the MP3!
In contrast, the other 2 songs were slightly more balanced. Note though that since songs were grouped as "sets", these results are obviously not independent of each other.


Surprised by the results? I sure was!

Continue to - Part III: Discussion

Friday, 1 February 2013

High Bitrate MP3 Internet Blind Test: Part 1 - PROCEDURE (Set B = MP3)

The survey has closed today (February 1, 2013). Over the next few days, I will have write ups on the Procedure (released today), followed by Results, and finally a Discussion section.

As you'll see in the description below, "Set B" was the MP3 encoded collection of music.

For the survey participants - now that you know, consider how you voted.  Do you believe MP3 ~320kbps causes significant or serious sonic degradation, or in a significant way impaired your ability to enjoy the music through your system?

--------------------------------------

Procedure:
Over the course of approximately two months (December 10, 2012 - February 1, 2013), an anonymous survey was activated on freeonlinesurveys.com to gather feedback on the audibility of 2 "Sets" of FLAC-encoded audio files. One set of files contained segments of music ripped directly from audio CD (PCM 16/44) whereas the other set had the audio converted to MP3 then decoded back to 16/44 format where it was converted to FLAC. The specific details of this MP3 conversion will be discussed below.

Song selection:
The 3 musical segments selected for the test were:

1. "Time" (2:29) from Pink Floyd off the 2011 re-master of "Dark Side Of The Moon" - a 2.5 minute excerpt with all the ruckus of chimes, bells and clocks in wonderful detail and space. A classic audiophile test track. This segment has a score of DR11 using the Foobar2000 dynamic range meter.

2. "Church" (2:31) from Lyle Lovett off his 1992 record "Joshua Judges Ruth". An acoustic country track with layered vocals, hand clapping, and a choir to evaluate sound quality with. The DR16 measurement for this segment represents a highly dynamic and natural-sounding track.

3. "Keine Zeit" (1:20) from Megaherz off the recent 2012 album "Götterdämmerung". For those who have used VBR algorithms for lossy encoding, "loud" music tends to demand higher bitrates to encode. This track at DR6 is not particularly dynamic but is representative of modern mastering for music in the hard rock / metal genres.


Participant Invitation for Test:
During the 2 months that this test was conducted, "subjects" were recruited from a number of "audiophile" and music related message forums. The hope was to achieve an adequate number of serious audiophiles and music lovers representing a cohort who would be able to seriously assess sound quality and would own higher quality equipment for audio playback. In principle, this would be the group most likely to be critical of sonic degradation. Invitations were posted on the following forums:
- audioasylum.com "PC Audio"
- forums.slimdevices.com "Audiophile"
- www.head-fi.org "Computer Audio"
- www.computeraudiophile.com "General Forum"
- stereophile.com "MP3 vs AAC vs FLAC vs CD" article comments
- www.hydrogenaudio.org "Listening Tests"
- www.audiocircle.com "The Discless Circle"
- www.stevehoffman.tv "Audio Hardware"
- www.wiredstate.com "Equipment Reviews, Listening Impressions"
- www.xtremeplace.com "Planet Audio" (Singapore)
- vr-zone.com "Audiophile's & HTPC Corner" (Singapore)
- www.lowyat.net "Home Entertainment / Audiophiles" (Malaysia)

Reminder messages were posted on the forums approximately every 2-3 weeks to increase visibility of the invitation with the last reminder approximately 1 week before the closure of the survey. There should have been plenty of time for all the respondents to listen and make judgments on perceived quality of the samples.

How the MP3 test tracks were produced:
For those with some experience with digital audio editing, it is relatively trivial to detect if a WAV/FLAC file were sourced through a standard MP3 process. Lossy encoders like MP3 will "throw out" frequencies the psychoacoustic model deems inaudible. For example, running an FFT frequency analysis on many MP3's quickly reveals that most encoders will remove frequencies at 18kHz and above. Characteristics like this allow programs like Tau Analyzer to estimate the probability of an audio file to have been modified by lossy encoding. Therefore, for the purpose of this test where the samples are freely available to many likely technologically savvy participants, it was necessary that the MP3-encoded samples be processed in some way which results in equivalent sound quality to a direct MP3 encode around 320kbps, yet masks the file from easy detection.
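As a rough illustration of the kind of frequency check described above, the sketch below measures what fraction of a signal's spectral energy sits at or above 18kHz. It uses a naive DFT on synthetic sine tones so it runs with only the standard library; the function name, the 10 ms window and the test tones are my own assumptions, and this is certainly not how Tau Analyzer works internally.

```python
import cmath
import math

def high_band_energy_fraction(samples, sample_rate, cutoff_hz):
    """Fraction of spectral energy at or above cutoff_hz, via a naive DFT."""
    n = len(samples)
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):  # skip DC, stop at the Nyquist bin
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples))
        energy = abs(coeff) ** 2
        total += energy
        if k * sample_rate / n >= cutoff_hz:
            high += energy
    return high / total if total else 0.0

RATE = 44100
N = 441  # 10 ms window, so DFT bins fall on exact multiples of 100 Hz

def tone(freq_hz):
    """Synthetic sine tone of N samples (illustrative test signal)."""
    return [math.sin(2 * math.pi * freq_hz * t / RATE) for t in range(N)]

# "Full band" signal: a 1 kHz tone plus a 19 kHz tone of equal amplitude.
full_band = [a + b for a, b in zip(tone(1000), tone(19000))]
# Stand-in for audio whose content above 18 kHz has been removed.
low_passed = tone(1000)

full_fraction = high_band_energy_fraction(full_band, RATE, 18000)
low_fraction = high_band_energy_fraction(low_passed, RATE, 18000)
print(f"full band: {full_fraction:.3f}, low-passed: {low_fraction:.3f}")
```

On real music one would run this over many windows with a proper FFT library; a file with essentially nothing above 18kHz would be a hint (not proof) of a standard lossy encode.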

A 2 stage technique was used to create the MP3 test files using LAME 3.99.5 (current version at this time):
Stage 1 - convert to 400kbps
lame.exe --freeformat --lowpass -1 -b400 <file.wav> <file400.mp3>
lame.exe --decode <file400.mp3>

Stage 2 - convert to 350kbps
lame.exe --freeformat --lowpass -1 -b350 <file400.wav> <file350.mp3>
lame.exe --decode <file350.mp3>

Use dBPowerAmp to convert the <file350.wav> to FLAC

This utilizes LAME's "free format" to create initially a 400kbps MP3 without the usual lowpass filter in place, then runs the resulting file through the MP3 encoder again but at a lower 350kbps bitrate (again with low-pass turned off) which more closely approximates the 320kbps target bitrate for the test. By doing this, even though the resulting MP3 size is slightly larger by 30kbps, the degradation in sound quality by objective measures is in fact approximately the same or slightly worse than if the audio were processed directly through 320kbps but without the tell-tale sign of the strong low-pass filter.

Since this was not a direct conversion to 320 kbps, to confirm the amount of sonic degradation of this process used to create the test MP3, WavDiff was employed to calculate the variance from the original lossless file vs. the MP3 processed test files and also the variance of the original lossless file vs. MP3 encodes at CBR 320 kbps and 256 kbps (for the sake of brevity, I will just report the RMS Error [RMSE]):
Time - Test file: 105.879  /  MP3 (320): 110.403  /  MP3 (256): 176.337
Church -  Test file: 46.591  /  MP3 (320): 46.915  /  MP3 (256): 76.357
KeineZeit - Test file: 393.914  /  MP3 (320): 372.550  /  MP3 (256): 607.282

From the values above, one can see that the three musical selections objectively are similar in variance to a 320 kbps MP3 and that variance is significantly larger with 256 kbps (as expected). Note that I have also used the above technique to encode test tones to ensure that the distortion characteristics from the encoding method closely represent CBR 320kbps MP3. Something to keep in mind is that because the low-pass filter was turned off, bits are now being used by the MP3 encoder for frequencies beyond the usual threshold for hearing for most people (is there ever a need to allocate any bits for 20-22kHz for example?), taking away from the ability of the encoder to better represent audible frequencies. Theoretically this should worsen the sound quality by worsening the distortion for the frequencies we are more sensitive to.
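For anyone wanting to reproduce this sort of comparison, here's a minimal sketch of an RMS error calculation between two equal-length sample sequences, followed by a ratio check on the RMSE figures reported above. I'm assuming the standard root-mean-square definition; WavDiff's internals (sample scaling, channel handling) are not documented here, so treat the `rmse` function as illustrative only. The dictionary values are copied from the numbers listed above.

```python
import math

def rmse(reference, processed):
    """Root-mean-square error between two equal-length sample sequences."""
    if len(reference) != len(processed):
        raise ValueError("sequences must be the same length")
    if not reference:
        raise ValueError("sequences must not be empty")
    squared = sum((a - b) ** 2 for a, b in zip(reference, processed))
    return math.sqrt(squared / len(reference))

# RMSE figures as reported in the post (test file, CBR 320, CBR 256).
reported = {
    "Time":      {"test": 105.879, "mp3_320": 110.403, "mp3_256": 176.337},
    "Church":    {"test": 46.591,  "mp3_320": 46.915,  "mp3_256": 76.357},
    "KeineZeit": {"test": 393.914, "mp3_320": 372.550, "mp3_256": 607.282},
}

# Error of each encode relative to a direct 320 kbps encode of the same song.
test_vs_320 = {name: v["test"] / v["mp3_320"] for name, v in reported.items()}
mp3_256_vs_320 = {name: v["mp3_256"] / v["mp3_320"] for name, v in reported.items()}

for name in reported:
    print(f"{name}: test/320 = {test_vs_320[name]:.3f}, "
          f"256/320 = {mp3_256_vs_320[name]:.3f}")
```

The test-file ratios all land within about 6% of the 320 kbps figure, while the 256 kbps encodes show roughly 60% more error.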

On a side note, KeineZeit seemed to be the most difficult to encode resulting in the highest amount of error going through the lossy encoding process.

The files were checked with the "Foobar2000 Dynamic Range Monitor" and confirmed to be equal in volume after the MP3 compression (<0.01 dB difference).
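To give a sense of how small a 0.01 dB difference is, here's a short sketch that compares the RMS level of two signals in decibels. This is my own illustration using synthetic tones, assuming a simple full-signal RMS comparison; it is not what the Foobar2000 meter does internally.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_difference_db(reference, processed):
    """Level of `processed` relative to `reference`, in dB."""
    return 20.0 * math.log10(rms(processed) / rms(reference))

# Synthetic 440 Hz tone, 0.1 s at 44.1 kHz (illustrative only).
reference = [math.sin(2 * math.pi * 440 * t / 44100) for t in range(4410)]
slightly_louder = [1.001 * s for s in reference]  # 0.1% amplitude increase
twice_as_loud = [2.0 * s for s in reference]      # doubled amplitude

small_diff = level_difference_db(reference, slightly_louder)
large_diff = level_difference_db(reference, twice_as_loud)
print(f"0.1% louder: {small_diff:.4f} dB, doubled: {large_diff:.4f} dB")
```

In other words, staying under 0.01 dB means the two sets differ in amplitude by only about 0.1%.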

The original lossless files were labelled Time_A, Church_A, and KeineZeit_A; whereas the lossy MP3 encoded/decoded files were labelled Time_B, Church_B, and KeineZeit_B. The files were encoded to FLAC for lossless compression and delivered for download as a ZIP file with instructions totaling approximately 75MB.

As noted above, freeonlinesurveys.com was utilized to collect the survey results. The primary question asked was for the respondents to choose whether "Set A" or "Set B" sounded INFERIOR; with the implication probably being that the MP3 encoding would deteriorate the sound quality (it is by definition "lossy"). In order to not force the respondents to guess if in their opinion the samples are equivalent, an option was provided to select "no difference". The other questions in the survey pertained to which of the 3 songs was thought to be most revealing of differences, confidence of the respondent ("easy" to "impossible to tell"), approximate cost of the audio system used, and a description of the type of equipment used. The hope is that these other variables can be used to analyze the data to determine if the respondent's level of confidence and cost of the system (presumably the more expensive systems are more revealing) predicted accuracy.

Continue to - Part 2: Results