Friday, 28 June 2013

MEASUREMENTS: Do bit-perfect digital S/PDIF transports sound the same?

Using suggestions from this page, the Touch can be used to transport DSD to the TEAC as DoP wrapped around a 24/176 FLAC file through Triode's USB kernel. Neat hack - proof of bit perfect transmission. Unfortunately the files are huge, so I likely will await an efficient solution. Note that this is NOT the test setup described below, just something cool. :-)

Let's talk about digital transports for a bit.

Most of us, I'm sure, remember the days when the only way to get digital data to an outboard DAC was through a CD transport. Although we can still resort to a CD reader, I suspect that many of us here have gone mostly into the computer audio realm with data on hard drives or flash/SSD devices. Thankfully gone are the days when data read off the CD could be inaccurate and need interpolation in realtime, or when playback was at the mercy of mechanical failures in the CD drive mechanism (though hard drive failures and the need for backups present another challenge).

I have shown that bit-perfect data can be transferred and played back without any concern off a USB asynchronous interface (eg. the various Mac and Windows software players and laptops). While I have the Squeezebox Receiver still on hand, let us have a look at the effect of using different transport devices with S/PDIF interfaces objectively.

Keep in mind that the S/PDIF interface, unlike packetized asynchronous USB (or ethernet), conducts its data transfer in a unidirectional serial fashion formalized around 1985 when the AES3 (AES/EBU) standard was also laid down. As I think many of us have read, S/PDIF combines the data and clock signals using "biphase mark code" and it is this "feature" of the interface that has resulted in many an audiophile nightmare regarding timing issues - jitter concerns being especially well publicized. This is in large part the basis for the well known paper by Dunn and Hawksford in 1992 asking "Is The AES/EBU/SPDIF Digital Audio Interface Flawed?". (Of course the topic of jitter can be very complex, as described in the paper, going far beyond what we need to concern ourselves with here.)

With that said, let's be practical and see what the results look like with the different devices using TosLink and coaxial interfaces (with comparison to asynchronous USB)...


For this round of testing, I decided to take a break from the TEAC UD-501 DAC and go back to the ASUS XONAR Essence One for a bit. Although I have been listening to the TEAC a lot in the last couple months, on a daily basis, I still use the ASUS Essence One at my computer workstation and it was just more practical to run these tests there. Despite my concerns around the upsampling feature of the ASUS, it measures well and sounds excellent.

Here's the hook-up:
* Transport device * -> Coaxial/Toslink/USB cable -> ASUS Essence One -> Shielded 3' RCA -> E-MU 0404USB -> 6' Belkin Gold USB -> Win8 laptop

Coaxial cable = 6' Acoustic Research
TosLink cable = 6' Acoustic Research
USB cable =  6' Belkin Gold

Transport devices tested:
1. Squeezebox 3 -> Coax / TosLink
2. Transporter -> Coax / TosLink
3. Receiver -> Coax / TosLink
4. Touch -> Coax / TosLink
5. Laptop -> CM6631A Async USB to S/PDIF -> Coax / TosLink
6. Laptop -> Async USB direct to ASUS Essence One

As you can see I've got the host of Squeezebox devices on the test bench along with the usual two ways to connect the computer's USB port to DACs (direct or through USB-S/PDIF converter).

I. RightMark Analysis:

Since there are so many devices/combinations, I'll show the results a few at a time to demonstrate what was found. Let us start with the results of the four Squeezebox devices. I decided to "max out" the capabilities of the SB3 and Receiver by using 24/48 sampling rate:

Numerically, not much difference...  From my subjective listening during the tests, I would agree with these numbers in saying that "it sounds like the Essence One DAC"; there are more similarities in the sound than subjective differences.

Let's have a look at the frequency response graph because there does appear to be some difference - here's coaxial interface only:

Hmmm, interesting! Small differences at the top end. Let's zoom into that top end and have a good look:

Notice the shape of the curves suggests slightly earlier roll-off with some devices. The flattest, most extended frequency response (and possibly most "accurate") is the Transporter, followed by SB3, then Touch, and Receiver. Remember, we are talking only about 0.15dB difference between the Transporter and Receiver at 20kHz; not perceptible IMO but since we're looking for evidence of a difference, useful to note.

Let's now include the TosLink measurements:

Even though we ran out of colors, it doesn't matter because there are still only 4 curves. There is no difference between coaxial and TosLink; they overlay on top of each other essentially perfectly.

Let us now add the computer-USB interfaces (take away the TosLink since no difference):

As you can see, the flattest response curves come from USB direct and Transporter. Here's the ranking: Transporter, ASUS USB direct, SB3, Touch, CM6631A, and Receiver. We'll talk more about this later, just keep in mind then that frequency responses are *slightly* different between transports despite bit-perfect settings...

The rest of the RightMark graphs - no significant difference:

II. Jitter Analysis:

Dunn J-Test stimulation of jitter. To keep it more manageable, I'll group them into 16-bit and 24-bit side-by-side first; let's just look at the coaxial interface here:

A. Squeezebox 3 (16-bit / 24-bit):

B. Transporter (16-bit / 24-bit):

C. Receiver (16-bit / 24-bit):

D. Touch (16-bit / 24-bit):

E. Laptop -> CM6631A USB to coaxial (16-bit / 24-bit):

F. Laptop -> USB direct (16-bit / 24-bit):

Objectively, it looks like the CM6631A USB-to-S/PDIF and USB direct 24-bit graphs are cleaner, and of the Squeezebox devices, the Transporter on the whole seems to have the least data-correlated jitter. Even at their worst, the sidebands for the SB3 around the primary signal are down around -120dB. Is this a problem? I doubt it since auditory masking will easily make this inaudible (assuming one could even hear down that low around 12kHz pitch). Furthermore, in theory, the J-Test should create a "worst case scenario" for jitter which is unrealistic in real music.
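To put a sideband level like -120dB into time units, here is a rough sketch. It assumes a single sinusoidal jitter component and the usual small-angle phase modulation approximation, where each sideband sits at (pi x tone frequency x peak jitter) relative to the tone. Real J-Test plots contain many components, so treat this as an order-of-magnitude estimate only; the function names are mine, and 12kHz is simply the "around 12kHz" figure mentioned above.

```python
import math

def sideband_level_db(tone_hz, jitter_peak_s):
    """Level of each jitter sideband relative to the tone, in dB.

    Assumes ONE sinusoidal jitter component and a small modulation index,
    so each sideband amplitude is (pi * f * J_peak) relative to the tone.
    """
    return 20.0 * math.log10(math.pi * tone_hz * jitter_peak_s)

def jitter_peak_for_sideband(tone_hz, sideband_db):
    """Inverse: peak sinusoidal jitter (seconds) giving that sideband level."""
    return 10.0 ** (sideband_db / 20.0) / (math.pi * tone_hz)

# A -120 dB sideband next to a 12 kHz test tone
j = jitter_peak_for_sideband(12_000, -120.0)
print(f"{j * 1e12:.1f} ps peak")  # prints "26.5 ps peak"
```

Under those assumptions, a -120dB sideband corresponds to a few tens of picoseconds of peak jitter at that one modulation frequency.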

Here's the difference between coaxial vs. TosLink:

A. Squeezebox 3 24-bit, Coaxial vs. TosLink:

B. Transporter 24-bit, Coaxial vs. TosLink:

C. Receiver 24-bit, Coaxial vs. TosLink:

D. Touch 24-bit, Coaxial vs. TosLink:

E. CM6631A USB to S/PDIF 24-bit, Coaxial vs. TosLink:

In general, we can say that indeed TosLink is worse (remember however TosLink is immune to electrical noise with galvanic isolation so there are some positives in this regard). Interestingly this is very clear with the Transporter! However, increased jitter with TosLink is not a given because the SB3 and Receiver seem to behave in the opposite fashion and show fewer jitter artifacts with the TosLink interface.

Remember that all of these jitter graphs are indicative of the interface between the transport device connected to the Essence One DAC. The graphs could be different with another DAC since much of the result will depend on the accuracy of the DAC in extracting the clock information and what other steps it might take (eg. data buffering, reclocking) to further stabilize the timing - it's not just about the transport device.

III. DMAC Protocol

So, up to now we can see differences between bit-perfect devices with RightMark, and obvious differences with the J-Test. How do they sound? Let's see what the computer "hears". For this test, I am using the Transporter playback as reference against which all the others are being compared.

First, I must admit that I'm not as confident about these numbers as I am of the graphs and plots above simply because it was really tough getting this done properly! From previous experience with the Audio DiffMaker program, results can vary depending on environmental factors like temperature of the equipment and subtle "sample rate drift" over time. With each transport measured, cables needed to be reconnected, settings needed to be changed, and for each condition, I ran the test 3 times to get a sense of the "range" of results. Admittedly, I made an error with the 'USB direct' measurements and did not realize this until after the fact so did not include the results here (foobar was accidentally set to output 16-bit instead of 24-bit).

The bottom line is that the results suggest that each device "sounded" different according to the computer. Instead of the usual high "correlated null depth" like in my previous tests with player software around 80-90dB (similar to the Transporter tested against itself above), we're seeing numbers in the 60-80dB range between transports. The computer thought the Squeezebox 3 sounded the most different from the Transporter. Good to see that it was able to detect the Receiver playing 320kbps MP3 as "most different" (ie. lowest correlation) to provide a point of reference. A reminder, this measurement is logarithmic so the actual mathematical difference between the MP3 sample compared to the others is larger than what it might look like on the graph.
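Since the scale is logarithmic, a small helper makes the gaps easier to picture. This sketch takes the null depth as an energy (power) ratio, which is how the DiffMaker description quoted in the comments below words it; the amplitude conversion is included for comparison. The function names are my own.

```python
def db_to_power_ratio(db):
    """Convert a dB gap to a linear energy (power) ratio."""
    return 10.0 ** (db / 10.0)

def db_to_amplitude_ratio(db):
    """Convert a dB gap to a linear amplitude (voltage) ratio."""
    return 10.0 ** (db / 20.0)

# How much larger is the residual for every 10 or 20 dB of lost null depth?
for gap_db in (10, 20):
    print(gap_db, "dB ->",
          round(db_to_power_ratio(gap_db), 2), "x energy,",
          round(db_to_amplitude_ratio(gap_db), 2), "x amplitude")
```

So a result that sits 20dB lower on the graph has a difference signal with about 100 times the energy (10 times the amplitude), not merely "a bit more".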

Remember that this is a measurement of the difference between each device and the Transporter connected to the Essence One. There is no implication here of whether one sounds "better" than another since that would of course be the listener's subjective judgment call.

IV. Summary

Let me see if I can summarize this based on the results here along with what I know/believe over the months of testing as applicable... Q&A format:

Q: Do all bit-perfect transports sound the same?
A: Based on the results, not exactly. Even though bit-perfect (I have verified this with the Touch, Transporter, SB3, CM6631A, ASUS USB direct with ASIO), small differences in frequency response can be measured. Furthermore, jitter analysis clearly looks different between devices and this also varies between coaxial and TosLink interfaces (with TosLink generally worse than coaxial for jitter). Likewise, the DMAC test also suggests the level of audio correlation when playing musical passages is not as high as previous tests with bit-perfect software or decoding lossless compression. Within the Squeezebox family, not surprisingly the Transporter performed the most accurately with flattest frequency response and lowest coaxial S/PDIF jitter, although I was quite surprised by the stronger TosLink jitter.

Q: Why do you think the frequency response varies?
A: My belief is that this is not a jitter issue. The reason I say this is that there appears to be no difference between coaxial and TosLink even though jitter varies between the two interfaces as demonstrated by the Dunn J-Test. I believe that this is the result of mild clock speed / data rate differences of the transport devices. Since the word clock has to be recovered from the S/PDIF signal, clock accuracy is dependent on the transport's internal clock - some transports may be timed a little quicker, some a little slower and the DAC has to adjust to this (of course the E-MU 0404USB ADC measuring the audio has a part to play in setting where it believes the roll-off should be). This frequency roll-off variability is not seen with laptops connected to an asynchronous USB device for example (that's of course the point of being asynchronous; not time-coupled to the data sender by having the recipient working off its own clock and telling the sender to speed up or slow down if necessary).

Q: But surely different/better/more expensive digital S/PDIF cables can help?
A: No. I don't think so. As I have measured and discussed before, digital cables make no substantial difference to timing/jitter as far as I can tell. Even though very long or poorly constructed cables may add to the jitter, the difference IMO is much less than what I'm showing here and as far as I can tell is irrelevant for a reasonable length of decently constructed coaxial/TosLink.

Q: Those jitter plots look nasty... I bet I can easily tell the transports apart!
A: Of course, anyone can claim anything over the Internet or in print since there is rarely if ever any actual "double checking" with sound methodology or formal peer review in the case of print magazines (obviously these are not scientific journals). Although I have shown these measurable differences, as a (currently) 41 year old male who works in an office environment, has generally avoided very loud concerts, and has a hearing frequency threshold around 16kHz, I do not believe I would be able to differentiate any of these bit-perfect transports in controlled testing with the same ASUS Essence One DAC.

Q: Surely you just need better gear to hear it!
A: The data correlated jitter with any of these devices would be >100dB below the primary signal. The frequency response difference is less than 0.15dB at 16kHz (my frequency threshold). Unless there's some significant interaction that causes anomalies in the output significantly beyond what I measure here, these differences would be inaudible to me irrespective of the quality of the sound system. Of course if you have younger ears and better hearing, this could be different. I believe speakers and headphones would introduce much more distortion and change to the frequency response than what I'm measuring here with a good modern DAC.

Q: Well, if that's the case, then I might as well go for the cheapest digital transport/streamer I can find, right?
A: Well, maybe, maybe not. When it comes to sound quality, I think a digital transport would have to be quite incompetent to sound poor (eg. non-bit-perfect, horrific jitter or imagine if the frequency response rolled off way too early because of severe S/PDIF timing inaccuracies). Therefore, spending more on a digital transport is IMO not primarily about sound quality but rather features and the aesthetic "look and feel" you're after (eg. better remote, can handle higher sampling rates, more reliable, fits into the decor...). Sound quality IMO is better served by putting the money into good speakers/room treatments/amp/DAC. Back in the "old days" of CD spinners, better mechanics with higher reliability and accuracy just cost more money. Even then it's not a given; I remember spending five times more on a higher model Harman/Kardon CD player 20 years ago and that failed within three years whereas an inexpensive JVC from Costco with digital out still runs fine today. I have not had occasion to try the "low end" devices like the <$100 media streamers (eg. WD TV) to see how those compare to the Squeezebox dedicated audio units.

BTW: If you're not aware, the Squeezebox devices by nature are asynchronous since they receive the data through WiFi or ethernet from Logitech Media Server and buffer it with a decent amount of internal memory. You can see that the TosLink and coaxial connections have worse jitter than what's measured directly off the analogue outputs (eg. look at SB3, Touch, Transporter, Receiver jitter measurements).

As usual, feel free to comment or link to any good data you may have come across regarding this topic especially if conclusions are different from what I've presented.

Musical selection this evening:
Philippe Jaroussky - Carestini (The Story of a Castrato) (Virgin Classics, 2007) - amazing vocals and fascinating musical history. ("It's a man, baby!" -- Austin Powers)

Happy listening! ;-)


  1. I have difficulties with the explanation about the change in FR and do not see how the FR can change because of timing issues as the jitter test doesn't show alarming differences in jitter (which is timing errors). Of course we are mostly looking at the essence one and perhaps repeating the test with the UD-501 can help draw a more 'certain' conclusion. If the FR can really change with just a different transport (assuming data out of the SPDIF IS exactly the same) and is quite different as well (up to 20 dB !) with the DMAC test on the Asus, how different could they be on a better DAC or on a worse DAC for that matter that doesn't deal as well with jitter ?

    My gut feeling is that the data that is sent to the DAC might have been 'processed' (filtered ?) by the squeezeboxes before the data is sent out via the SPDIF. This seems logical as a correlation between DMAC test and FR does not seem to be there. Meaning the ever so slight differences in FR at the highest frequencies alone are possibly not the only differences that could warrant the relatively big difference between the tests. This could mean a filter might have 'changed' something more than just the FR to create the differences in the DMAC test.

    You say you confirmed they are bit perfect but I would like to know if you actually compared the data that came out of the SPDIF with each other at digital (byte content) level ? If you did, the differences in FR can not be easily explained and that is a strong case for the subjectivists to say... I told you so, music is sounding different (DMAC test)

    Now, digital isn't my middle name and I could have drawn the wrong conclusions. It's intriguing nonetheless that the same bits and bytes, that earlier could not show quite measurable differences, suddenly do show differences with the only difference being playback equipment (transports).

    1. I've looked at the FR and indeed it looks different as well with the TEAC. Again, very slight like the ASUS.

      I know that Logitech Media Server is sending bit-perfect data because I am able to transfer DTS error-free through the SB3, Touch, and Transporter into a DTS decoder. Also, no issues with decoding DoP with the Touch to the TEAC.

      Since the transports are bit perfect, the factor that I must assume is creeping in has to do with timing anomalies... My belief is that because we are dealing with a serial stream, the Essence One syncs with the transport at the clock rate being sent to it; and in this case the *accuracy* of the transport clock (irrespective of clock *stability* causing jitter). For example, if we take the Receiver where the roll off is the earliest, my assumption is that it's sending data slightly slower. Instead of being capable of a full 24/48kHz, let's say it's actually sending at something like 47.8kHz; this would mean at best the analogue output will "only" get up to 23.9kHz and therefore looks like it's rolling off slightly earlier comparatively. That's my theory at least. Maybe I can have a look at the SPDIF clock signal with the oscilloscope to verify after my holidays.
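The arithmetic behind that example can be sketched as below. The 47.8kHz figure is only the illustrative number from the paragraph above, not a measurement, and the helper names are my own.

```python
def nyquist_khz(sample_rate_khz):
    """Highest representable frequency for a given sample rate."""
    return sample_rate_khz / 2.0

def rate_error_ppm(actual_khz, nominal_khz):
    """Clock rate error relative to nominal, in parts per million."""
    return (actual_khz - nominal_khz) / nominal_khz * 1e6

nominal_khz = 48.0
hypothetical_khz = 47.8  # illustrative value from the text, NOT measured

print("Nyquist at nominal rate:     ", nyquist_khz(nominal_khz), "kHz")
print("Nyquist at hypothetical rate:", nyquist_khz(hypothetical_khz), "kHz")
print("Rate error:", round(rate_error_ppm(hypothetical_khz, nominal_khz)), "ppm")
```

Expressing the offset in ppm makes it easy to compare against whatever the oscilloscope check eventually shows for each transport's actual S/PDIF clock.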

      As for the DMAC. It's *very* sensitive from about 70dB up. I find that it's great to use for "essentially identical" output when there's null depth from 80+dB, but below that, there's a "gray zone".

      Mentally, my experience with the DMAC:
      80+ dB - pretty sure it's identical and difference likely just DAC analogue noise
      70-80 dB - likely some true difference in the underlying signal but I suspect inaudible.
      55-70 dB - high quality lossy MP3/AAC region - 256kbps and 320kbps usually measure here.
      45-55 dB - decent quality MP3 like 160 or 192kbps.
      <45 dB - territory of 128kbps and lower audible bitrates.
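As a small sketch, that mental table can be written as a lookup. The thresholds are the ones listed above; how to treat a value landing exactly on a boundary is my own assumption (it goes to the higher zone), and the function name is hypothetical.

```python
def dmac_zone(null_depth_db):
    """Map a correlated null depth (dB) to the rough zones listed above.

    Boundary values are assigned to the higher zone (an assumption).
    """
    if null_depth_db >= 80:
        return "essentially identical (difference likely DAC analogue noise)"
    if null_depth_db >= 70:
        return "likely a true difference, probably inaudible"
    if null_depth_db >= 55:
        return "high quality lossy region (256-320kbps MP3/AAC)"
    if null_depth_db >= 45:
        return "decent quality MP3 region (160-192kbps)"
    return "128kbps and lower territory"

for depth in (85, 72, 60, 50, 40):
    print(depth, "dB:", dmac_zone(depth))
```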

      I often have to remind folks (especially the golden ears who claim to easily hear differences below 90dB) that what's being measured here by the DMAC is averaged over 45 seconds of music. Even very low level analogue noise will add up and eat into the null depth. The result is also dependent on the ADC used of course.

  2. Hello,
    first of all compliments for the great tests you've published. I would like to ask your opinion about high resolution audio files like 96/24 or 192/24 (I know it's off topic but I didn't know where to post). I think there isn't a really audible improvement over standard 44/16, taking into account also your mp3 test. Thanks for your attention.

    1. Hey Fabio... Thanks for the note.

      Here's my reply:

  3. I am curious about the different high end response too, and not sure about your hypotheses, though I don't have an alternate one.

    I mentioned before using a two tone test signal for AD/DA with the two tones an octave apart. It will show if the residual null is timing or a level difference. It also will let you calculate how much difference in timing there is. If the level of the residual is an even 6 db higher for the higher tone it is almost wholly due to timing. From the AES paper on Diffmaker you will find the following formula:

    The achievable depth (drop in Difference track energy, relative to the Reference track energy), at any frequency will be limited by the phase error of "theta" degrees existing between the Reference track and the Compared track at that frequency, and will be no better than

    10*log (2-2*cos(theta)) [dB]

    So with two tones you can determine if the residual is from timing shift or not by looking at levels in the FFT. If timing you can use the null of either tone and this formula to determine the amount of timing shift. As a for instance, timing shifts of 600 picoseconds would give a null of -99db on a 3 khz tone. It would give -93 db on a 6 khz tone. If the residual levels are an even 6 db apart for the two tones you can be sure any level differences are more than 20 db below timing residuals.
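The formula and the 600 picosecond example can be checked with a few lines. This is a sketch under the assumption of a pure timing shift between otherwise identical tones; theta is computed in radians as 2 x pi x frequency x shift, and the expression is evaluated as 20*log10(2*|sin(theta/2)|), which is algebraically the same as 10*log10(2-2*cos(theta)) but behaves better numerically for very small angles. The function name is mine.

```python
import math

def null_depth_db(freq_hz, timing_shift_s):
    """Best achievable null (dB) for a pure timing shift at one frequency.

    Equivalent to 10*log10(2 - 2*cos(theta)) with
    theta = 2*pi*f*dt in radians, since 2 - 2*cos(theta) = 4*sin(theta/2)**2.
    """
    theta = 2.0 * math.pi * freq_hz * timing_shift_s
    return 20.0 * math.log10(2.0 * abs(math.sin(theta / 2.0)))

shift = 600e-12  # 600 picoseconds
for f in (3_000, 6_000):
    print(f, "Hz:", round(null_depth_db(f, shift), 1), "dB")
# prints about -98.9 dB at 3 kHz and -92.9 dB at 6 kHz
```

The two results sit about 6dB apart, which is the octave signature of a timing-dominated residual described above.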

    You also can difference sawtooth waves and the result will point to which of the two files is slower. If you have file A, and file B, invert B and null them you will get periodic spikes (if it is a timing difference primarily causing the residuals). If the spikes point to the upper half of the waveform A is slower. If the spikes are on the lower half then B is slower. It is pretty obvious looking at the waveform in audio editing software in which direction the wave is asymmetric. Even a mixture of level difference and timing difference will show it. You would see a reduced sawtooth with a periodic spike in the result.

    So if you incorporated a few seconds of the two tone an octave apart followed by a few seconds of sawtooth waves you could determine at least for those few seconds how much timing shift there was and which wave is slower. Intersperse these with the rest of your test tone, maybe at beginning, middle and end then you could learn a bit more about what is making these different software players null out differently.

    I have found Diffmaker to take care of this for you sometimes, and sometimes it gets confused giving wrong null depth.

    I would welcome emailing you about the topic if you are interested. I know you are on computer audiophile forums some. I will send a message there with my email address.

  4. One additional comment in regards to Diffmaker.

    Another benefit of the sawtooth waves is that I have found this to be the easiest type of waveform for Diffmaker to time align. So even a half second of sawtooth waves at the start of any compared tracks is beneficial in the use of Diffmaker.

    1. Thanks for the comments Dennis!

      Been busy with the summer :-).

      I'll look for you out at CA!

    2. One more thing... My suspicion is that these FR differences are so small, it would be beyond the capability of DiffMaker since they only would show up with extreme high frequency content which would push us into synthetic test scenario again as you noted above with sawtooth waveforms.

  5. Isn't it entirely possible that those minuscule differences can be attributed to chance? No measuring equipment has 100% reliability. You would need repeated measurements to ascertain the standard error of the mean and then a bootstrap simulation in order to accurately represent individual differences between those sources for a high alpha level of statistical significance
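A minimal sketch of what such an analysis could look like, using only the Python standard library. The two lists are PLACEHOLDER values invented solely to exercise the code; they are not measurements from this post, and the function names and the percentile-style interval are my own choices.

```python
import random
import statistics

def sem(values):
    """Standard error of the mean for repeated measurements."""
    return statistics.stdev(values) / len(values) ** 0.5

def bootstrap_diff_ci(a, b, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        resample_a = [rng.choice(a) for _ in a]  # resample with replacement
        resample_b = [rng.choice(b) for _ in b]
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_resamples)]
    hi = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# PLACEHOLDER dB readings at 20 kHz from repeated runs (invented, not real data)
device_a_db = [-0.10, -0.11, -0.09, -0.10, -0.12]
device_b_db = [-0.25, -0.24, -0.26, -0.25, -0.23]

print("SEM A:", sem(device_a_db), "SEM B:", sem(device_b_db))
lo, hi = bootstrap_diff_ci(device_a_db, device_b_db)
print("95% interval for A - B:", lo, "to", hi)
```

If the interval for the difference excludes zero across repeated runs, chance alone becomes a less likely explanation; with only a handful of runs per device the interval will be crude.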

    1. True. The possibility of chance is certainly there and I agree, more time needs to be spent to have a better look. Alas, unless I have more time to delve into this, I think it's OK to leave it since the differences are so minuscule that there's really no meaningful difference.