Saturday, 25 April 2026

Audiophile small room acoustics: complex and essential.


In the post "What are the most important audio components?" a few years ago, I said that we need to understand the various audiophile "domains": audio hardware, room acoustics, quality of your music, and of course personal mental engagement. Each of these has a part to play in achieving, and personally judging, sound quality.

Of those domains, it's hard to speak about personal mental engagement as this is a highly individual exercise (and also relates to one's auditory acuity). The quality of the music we listen to is, again, related not just to personal preferences (like genre) but also to the mastering quality, the variants (hi-res, multichannel), and the formats (SACD, vinyl, etc.) in which the record label makes our favorite albums available.

It's always tempting of course to get excited about audio hardware - the latest toys. For example, recently, I heard that Andrew Jones has another ported box speaker design with paper/fabric drivers, this time with a coaxial tweeter+midrange based on field coils / electromagnets, paired with a permanent-magnet woofer, at something like USD$34k. While I'm sure this could be a compelling design for many, I think we can also recognize that these days, as equipment fidelity has improved significantly to the point where a simplistic "high price = high quality sound" equation no longer works, it's maybe wiser to direct our attention to the other domains.

This then leaves the very important issue of sound room acoustics, an area that we can all to some degree optimize (see also previous discussion on the sound room). I have heard it said that the sound room is the "foundation" from which we build our hi-fi sound system. As with any project, without an adequate foundation, no matter how great the rest of the system may be, you can only achieve a certain ceiling of performance dictated by the limits of that foundation.

In this post, let's dive a bit deeper into the importance of typical domestic small room acoustics and general principles around improving sound quality.


First, let's define a "small room". From an architectural perspective, a standard small bedroom in a modern home here in North America is roughly 10' x 12' x 8' (3m x 3.7m x 2.4m) or around 1000 ft³ (28m³) in volume. A small living room would be about 15' x 15' x 8' (4.6m x 4.6m x 2.4m) or around 2000 ft³ (57m³).

So for rooms we might use in a home, we can probably say something like:
1000-2000 ft³ (28-57m³) = "SMALL" room
2000-5000 ft³ (57-140m³) = "MEDIUM" room 
>5000 ft³ (>140m³) = "LARGE" room in most homes
While that kind of description might be fine as a rule of thumb as we enter a room and "size it up", there are more scientific ways to define a "small" space. Instead of defining rooms based on dimensions or volume, we need to consider frequency-dependent behavior.

As you will recall from high school physics, when waves are created in an enclosed space, at low frequencies where wavelengths fit within the dimensions of the room, we end up with resonant standing waves that can be additive (peaks) or subtractive (nulls); these we call room modes. The number of modes increases rapidly (roughly with the cube of frequency) beyond the fundamental into higher frequency harmonics, and they are a result of axial (between opposite walls, 2 boundaries, strongest), tangential (4 boundaries), and oblique (6 boundaries, weakest) combinations and permutations in any given enclosed space (see this Mixdown article for diagrams of the modal types). Thankfully, the vast majority of potential modes end up being acoustically irrelevant as we'll see next.

In a "small" room, sparse/low density modal behavior (that is, a handful of individual modes can be seen as isolated peaks and valleys) extends higher into the audible spectrum. As we go even further up in frequency, like 1kHz and above, the modes eventually become so numerous that they overlap, and when this happens they self-average into the equivalent of a smooth "diffuse field". Statistically, the approximate frequency where a small room's modal behavior transitions to that smoother diffuse field is related to the decay time of the room and was calculated in the 1960s from work at Bell Labs by Manfred R. Schroeder - the eponymous Schroeder Frequency (fs = 2000 × √(T₆₀ / V), with T₆₀ in seconds and V in m³). The equation's transition point was defined as a threshold of 3 or more overlapping modes' influence at any specific frequency; the start of where they practically average out and resemble a Gaussian random field.

Let's calculate the "modal density" for every 1/3-octave of rectangular rooms. We can use 3 room sizes as examples to demonstrate what this looks like:

Note that both the modal density Y-axis and the frequency X-axis are logarithmic.
RT₆₀ for these model rooms has been set to ~0.45s.
BTW, 1/3-octave modal density graphs, typically up to 200Hz, are called Bonello Graphs; these could be useful diagnostically to determine room behavior.

What we see is that as room size increases, the Schroeder frequency drops from ~225Hz in the small (10x12x8') room, to ~170Hz in the larger (15x18x8') room, to ~133Hz in the largest (25x16x10') space. As room size increases, those sparse (low density) modes that cause strong peaks and nulls are confined to the lower frequencies away from coloring the important lower-mid and midrange, about 200Hz and up. So, despite the huge number of possible room modes across the audible frequencies, it's really the first few hundred below about 300Hz that are likely to cause audible problems with their peaks and valleys.
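
For readers who want to tally modes themselves, here is a minimal Python sketch of a 1/3-octave mode count for an idealized rectangular room. It assumes perfectly rigid, parallel walls and a speed of sound of 343m/s; the function names are my own for illustration and this is not the tool used to make the graphs in this post.

```python
import math
from itertools import product

C = 343.0    # assumed speed of sound in m/s (about 20°C)
FT = 0.3048  # metres per foot

def mode_frequencies(lx, ly, lz, f_max):
    """Sorted mode frequencies (Hz) of an ideal rigid rectangular room up to f_max."""
    # Highest mode index worth checking along each dimension.
    limits = [int(2 * dim * f_max / C) + 1 for dim in (lx, ly, lz)]
    freqs = []
    for nx, ny, nz in product(*(range(n + 1) for n in limits)):
        if nx == ny == nz == 0:
            continue  # (0, 0, 0) is not a mode
        f = (C / 2) * math.sqrt((nx / lx) ** 2 + (ny / ly) ** 2 + (nz / lz) ** 2)
        if f <= f_max:
            freqs.append(f)
    return sorted(freqs)

def third_octave_counts(freqs, first_band=-17, last_band=-6):
    """Count modes per 1/3-octave band; band centres are 1000 * 2**(i/3) Hz."""
    counts = {}
    for i in range(first_band, last_band + 1):
        centre = 1000 * 2 ** (i / 3)
        lo, hi = centre * 2 ** (-1 / 6), centre * 2 ** (1 / 6)
        counts[round(centre, 1)] = sum(lo <= f < hi for f in freqs)
    return counts

# The 15' x 18' x 8' room, counted in bands centred from ~20Hz to 250Hz.
room = (15 * FT, 18 * FT, 8 * FT)
modes = mode_frequencies(*room, f_max=300.0)
for centre, n in third_octave_counts(modes).items():
    print(f"{centre:6.1f} Hz band: {n} modes")
```

The lowest mode for this room is the axial mode along the 18' dimension at about 31Hz, and the count per band climbs quickly from there.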

Notice that the equation for the Schroeder frequency is dependent on T₆₀ or reverberation time in seconds for 60dB decay. How much impact is there if we were to vary the amount of absorption throughout the room to reduce (R)T₆₀?




Yes, there is clearly some impact. Notice that even if we dropped the average RT₆₀ from a reflective 0.6s room with windows and hardwood floors, to a well-damped 0.3s room with multiple wall absorption panels, the Schroeder frequency drops "only" from 200Hz to 140Hz. That drop is meaningful, but in the big picture, absorption itself is not going to push the Schroeder frequency down too much and an audiophile will need to address these modes in our small rooms regardless.
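
As a quick check of that formula, here is a small Python sketch. It assumes the 15' x 18' x 8' model room (the function name is mine), with volume converted to m³ since the 2000 constant expects metric units.

```python
import math

FT3_TO_M3 = 0.3048 ** 3  # cubic feet to cubic metres

def schroeder_hz(t60_s, volume_m3):
    """Schroeder frequency fs = 2000 * sqrt(T60 / V), T60 in seconds, V in m^3."""
    return 2000.0 * math.sqrt(t60_s / volume_m3)

# Assumed room: 15' x 18' x 8', about 61 m^3.
volume = 15 * 18 * 8 * FT3_TO_M3
for t60 in (0.6, 0.45, 0.3):
    print(f"T60 = {t60:.2f} s -> fs ≈ {schroeder_hz(t60, volume):.0f} Hz")
```

For this room the sketch gives roughly 198Hz, 172Hz and 140Hz. Because T₆₀ sits under a square root, halving the decay time only lowers the Schroeder frequency by a factor of about 1.4.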

Modal behavior is important to address but it's not the only thing. Let's model the audible frequency range of a "lightly treated" room with some furniture inside of 15'W x 18'D x 8'H dimension, RT₆₀ of ~0.5s. Let's place the speakers about 4' from the front wall and 3' from the sides, listener seated in the middle between those speakers, and 12' from the front wall / 6' from the rear along that 18' depth. Listener height set to 3.5'.

Here's a look at the hypothetical set-up and modal distribution to 300Hz using REW's Room Simulation (for even more sophisticated room mode calculations including non-rectangular spaces, have a look at AMROC Pro):


Here's an illustrative frequency response of what such a room might measure like from 20Hz to 20kHz:


The graph above is a simulation at the listening position if we put an idealized anechoically flat-measuring floor-standing loudspeaker measured at 1m with good bass extension down to 40Hz (blue) into that "lightly-treated" domestic 15' x 18' x 8' room (red).

As you can see, despite the basically perfect performance of the loudspeaker at 1m, without significant room treatments or DSP/EQ to address irregularities, by the time the sound hits our listening position, the room has greatly affected the sound! Given the magnitude of change, one might even be tempted to think that what we actually hear in most hi-fi systems is mainly determined by the room - again, the idea that the room is what actually sets the foundation for optimal possible sound quality.

[As an audiophile, if you have never personally measured the frequency response in your room, I would highly recommend doing so! Once you have a good look at the imperfections created by the space, I challenge you to still worry about comparatively meaningless tweaks and things like audio cables or whatever benefits stuff like audiophile-grade ethernet switches are supposed to provide.]

Let's examine that frequency response graph to see its main characteristics and anomalies.

1. Small room acoustics can be broadly categorized into 3 "zones" with practical implications. (Do not think of it as just two, as has been suggested.)

Anomalies in the lowest bass up to the Schroeder Frequency are dominated by excitation of spaced room modes that can be calculated from room dimensions as discussed above, with a contribution from RT₆₀. This we call the "Modal Zone".

Then there is a "Transition Zone" approximately between the Schroeder Frequency and 2 octaves up (4x Schroeder frequency). This is a challenging region where the physics is in flux with a mixture of increasingly concentrated room modes plus reflected sounds.

Finally, there is the "Quasi-Diffuse Zone" where frequency response is typically more controlled with decay times predictably similar across adjacent frequencies, and free from discrete resonances. (See here for further reading about these zones.)

[Some have argued that in a small room there is no true "diffuse field" since the space has to be large enough for the absence of any dominant reflections. This is hard in a small space, thus the label "Quasi-Diffuse" because even if not strictly dominant-reflection free, the frequency response is at least statistically uniform enough to look/sound diffuse.]

2. Notice that peak amplitude dips as we go from lower bass to upper treble. Air itself will attenuate amplitude by 0.1dB/m at 8kHz and 0.5dB/m at 20kHz in moderate 50% humidity (see calculator here). Furthermore, high frequency response attenuates further with tweeter beaming and off-axis drop-off. This is where speaker dispersion characteristics, seating position, and toe-in will make a significant difference.

When speakers are placed close to walls, such as what I modeled above with them 3-4' from the front and side walls, we will see bass reinforcement from the boundary loading. If we place speakers very close to rigid walls, reflections increase the sound pressure by up to +6dB/wall. This mostly increases bass frequencies because even if it's a sealed woofer pointing away from the wall, low frequencies radiate omnidirectionally due to the long wavelengths whereas tweeters beam directionally forward. This, along with the modal peaks, will make systems sound "boomy" if not properly addressed.

3. Speaker Boundary Interference Responses (SBIR) are acoustic distortions caused by speaker placement, not just room size (i.e. the room modes); you'll also see the term "Allison Effect" used to describe this, named after Roy Allison's work.

SBIR specifically refers to soundwaves emanating from behind or to the side of the speaker, bouncing off the front or side walls, causing comb filtering (nulls and peaks) as it interacts with the direct forward waves. 

SBIR is fundamentally a radiation pattern issue and depends on how much energy the speaker directs towards each wall/boundary. While beyond our discussion here, dipole/open baffle and omnidirectional speakers have different radiation patterns, so be mindful of how these types of speakers need to be placed and the changes in peaks and nulls. Here's a comparison table of speaker types:

[While it's much easier with closed box speakers, another complexity to be aware of involves speaker ports, especially rear ports and front wall SBIR around the tuning frequency typically 40-60Hz (with significant output ~30-150Hz). Since port position and phase of the port output are different from the direct woofer, some of the SBIR nulls may actually be partially filled or shifted.

As for dipole and omnidirectional speakers, many are actually hybrid designs where bass frequencies are still typically handled by cone woofers +/- ports with commensurate radiation patterns. Be mindful of manufacturer recommendations for speaker proximity to walls.]

SBIR problems in small rooms typically can be seen starting around 70-100Hz with speakers placed 3-4' from walls.

Beyond SBIR, another important cause of distortion due to boundary interference is floor and ceiling bounce. Play around with this SBIR and floor/ceiling bounce calculator to estimate what you will hear/see in your own room. Of the two, the floor is typically the closer surface, so floor bounce often dominates, especially with larger speakers and woofers placed lower down (the lower the driver, the higher in frequency the acoustic bounce distortion). For example, if we're seated at about 3.5' ear height, 8' from the speaker, and the woofer is 2' off the floor, the first null would be predicted at around 350Hz. Thick carpets are always a good idea.
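
The SBIR and floor-bounce estimates above both come down to one idea: the first cancellation occurs where the reflected path is half a wavelength longer than the direct path. Here is a rough Python sketch under simplifying assumptions: a point source, a perfectly reflecting boundary, c = 343m/s, and the 8' taken as horizontal distance. The function names are mine, and the linked calculator may use slightly different geometry.

```python
import math

C = 343.0    # assumed speed of sound in m/s
FT = 0.3048  # metres per foot

def first_null_hz(path_difference_m):
    """First cancellation: reflected path is half a wavelength longer than direct."""
    return C / (2 * path_difference_m)

def wall_behind_speaker_null_hz(distance_ft):
    """SBIR estimate: sound travels to the wall and back, so the extra path is 2 * d."""
    return first_null_hz(2 * distance_ft * FT)

def floor_bounce_null_hz(driver_height_ft, ear_height_ft, horizontal_ft):
    """Floor bounce via the mirror-image source below the floor."""
    direct = math.hypot(horizontal_ft, ear_height_ft - driver_height_ft)
    reflected = math.hypot(horizontal_ft, ear_height_ft + driver_height_ft)
    return first_null_hz((reflected - direct) * FT)

print(f"Speaker 4' from wall: first null ≈ {wall_behind_speaker_null_hz(4):.0f} Hz")
print(f"Speaker 3' from wall: first null ≈ {wall_behind_speaker_null_hz(3):.0f} Hz")
print(f"Floor bounce example: first null ≈ {floor_bounce_null_hz(2, 3.5, 8):.0f} Hz")
```

With these assumptions the wall estimates land at about 70Hz and 94Hz, and the floor-bounce example at about 359Hz, in the same neighborhood as the figures quoted above. Small changes in how the listening distance is measured shift the result by a few Hz.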

4. Make sure not to sit too close to the rear wall! This one is important because space is at a premium in small rooms and we're often tempted to push the listening seat as far back as possible. The problem with this is two-fold:

a. Bass intensity/pressure is greater near the boundaries; more bloated bass, sacrificed "tightness".

b. Comb filtering can color the midrange and higher frequencies. This is because the waves that travel behind the listener quickly bounce off the back wall since it's so close, reflect forward and interfere with the direct sound coming to our ears (sometimes called Listener-Boundary Interference Response - LBIR - like SBIR). 

For example, sitting 1' from the rear wall will create a first null at 281Hz (due to 1.78ms delay for the 1' travel behind head plus 1' forward towards ear mixing with direct sound) which is in the lower midrange, important for vocals, cello, piano (middle C is 261.63Hz), guitar "body" and even some of the kick drum "punch". A null here will color the timbre by "hollowing" out the sound. Comb filtering can further affect the 1-4kHz region which can cause issues with vocal intelligibility and because it represents a temporal phase distortion, will mess up soundstage stability and localization.

What does sitting 1' from the back wall instead of 6' do to the frequency response of the room model? Something like this:

Some audiophiles will add absorption behind the head to reduce the intensity of the reflections but realize that to even take ~40% off the amplitude of a reflected wave that would create the 281Hz null, you'd need something like 4" rockwool. The real solution of course is to make sure we move our sitting position away from that back wall; 3' (1m) would be a nice start - the first null will now be at 94Hz and within reach of bass room treatments and EQ.

Here's a table of the first null (usually the strongest) and spacing based on distance seated from the rear wall:
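
A minimal sketch of how such values can be computed, assuming a single specular reflection off a rigid rear wall, ears at the given distance from it, and c = 343m/s (the function name is mine). Nulls fall at odd multiples of the first one, so the spacing between adjacent nulls is twice the first-null frequency.

```python
C = 343.0    # assumed speed of sound in m/s
FT = 0.3048  # metres per foot

def rear_wall_comb(distance_ft, n_nulls=3):
    """Delay (ms), first few null frequencies (Hz) and null spacing (Hz)."""
    d = distance_ft * FT
    delay_ms = 1000 * 2 * d / C          # extra travel: to the wall and back
    first = C / (4 * d)                  # quarter-wavelength cancellation
    nulls = [first * (2 * n + 1) for n in range(n_nulls)]
    spacing = C / (2 * d)                # gap between adjacent nulls
    return delay_ms, nulls, spacing

print("dist   delay    first null   spacing")
for feet in (1, 2, 3, 4, 6):
    delay_ms, nulls, spacing = rear_wall_comb(feet)
    print(f"{feet:>3}'  {delay_ms:5.2f} ms  {nulls[0]:7.0f} Hz  {spacing:7.0f} Hz")
```

At 1' this gives the 1.78ms delay and 281Hz first null discussed above, and at 3' the first null drops to about 94Hz.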


So, what to do?

As I noted earlier, in the pursuit of high-fidelity audio, enthusiasts often focus their attention - and their budgets - on those "visible" components: the vacuum tube amp, DAC, or the sculptural elegance of loudspeakers. There seems to be a prevailing, yet flawed, notion that if we can select a perfectly linear transducer and components feeding it, we might achieve complete "musical truth" (some kind of "absolute sound"?!). In reality, true high-end audio is more about the holistic interaction between good recordings, accurate source components (DAC, turntable, tape player), the loudspeaker, your acoustic environment, and hopefully your own receptive mental state. I hope the discussion above highlighted some of the reasons why the room is probably the most critical component, and the most difficult to get right.

For this reason, many seasoned audiophiles will naturally comment on the "look" of the sound room when we see a system and notice glaring issues like reflective windows, hard floors, strange speaker positioning, and seats shoved against the back wall. These room deficiencies will stick out like the proverbial "sore thumb" and signal a more meaningful impact on sound quality than what is to be gained with an expensive D'Agostino amplifier, or fancy dCS DAC.

Let's talk practically then about what we might want to do...

In a small room, from 20Hz up to about 350Hz (through Schroeder Frequency), we need to pay attention to those room modes (Modal Zone). For those low frequencies, we'll need to aggressively deal with modal peaks, typically with large porous bass traps, especially in the corners, to tame the "bass boom" in these high-pressure regions through absorption that converts the acoustic energy to heat. Find out more about bass traps at places like GIK Acoustics, Primacoustic, Vicoustic, RealTraps, among others. DIY is an option for the handy audiophile. Room-correction DSP is another powerful tool; one I've talked about extensively here, here, and here.

Between 250Hz and 2kHz, we want to reduce midrange distortion through the Transition Zone, which is obviously a very important part of the audible spectrum for vocals and instruments as discussed above. Assuming you've done what you could for speaker placement and made sure you're not sitting right against the back wall, use a combination of absorption behind the speakers, at the first reflection points, a thick rug for floor bounce, and if possible ceiling absorption. Doing this can reduce those boundary distortions to the frequency response. Beyond conceptualizing this as just a frequency effect (time and frequency are always intricately connected), we can appreciate the temporal benefits of reducing reflections, especially those with 5-20ms delays. These spatially destructive reflections are close enough to be perceived as a single event rather than an echo (Haas effect), but late enough to carry directional information that conflicts with the direct wavefront, hence damaging soundstage.

Since we're dealing with the physics of soundwaves, when we want to affect lower frequencies, be prepared to get thicker absorption panels using materials with better absorption coefficients. Here's a graph of some commonly-used absorption material and how thick they have to be to achieve a good amount of absorption (α ≥ 0.75) across 25 to 300Hz.

Modeled with Delany-Bazley-Miki (1990) using estimated air flow resistivity I could find online for the Rockwool (Comfortbatt and RWA45) and Owens Corning 703 (OC 703) fiberglass.
Notice the diminishing returns as panels get thicker. Also, denser material like RWA45 can be counterproductive if too thick due to impedance match issues. Mixed density panels with lower density material deeper towards the wall can work better.

As you can see, in order to absorb down into the Schroeder region for small rooms, we need about 3-4" (75-100mm) of Rockwool/mineral wool or OC 703. Consumer acoustic foam is typically less dense and you'll need a thicker panel like 5" to achieve about the same effect as the Rockwool.

For serious audio physics geeks, check out the "Porous Absorber Calculator" to model multiple layers. For example, what does the absorption vs. frequency look like if we have 3" (75mm) of RWA45 Rockwool absorber + 1" (25mm) air gap vs. just the rockwool without air space?

The air layer clearly improves absorption, especially around 200Hz.
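
To get a feel for how thickness and an air gap change things, here is a rough Python sketch of Miki's (1990) form of the Delany-Bazley model for a porous layer in front of a rigid wall. It is only a sketch under loud assumptions: normal-incidence sound, an e^(jωt) time convention, and a placeholder flow resistivity of 10,000 Pa·s/m² that is not a measured value for any product named here. The function names are mine, and the numbers will not match the graphs in this post or the linked calculator exactly.

```python
import cmath
import math

RHO0 = 1.204  # assumed air density in kg/m^3
C0 = 343.0    # assumed speed of sound in m/s
Z0 = RHO0 * C0

def miki(f_hz, sigma):
    """Characteristic impedance and wavenumber of the porous material (Miki, SI units)."""
    x = f_hz / sigma
    zc = Z0 * (1 + 0.070 * x ** -0.632 - 1j * 0.107 * x ** -0.632)
    k = (2 * math.pi * f_hz / C0) * (1 + 0.109 * x ** -0.618 - 1j * 0.160 * x ** -0.618)
    return zc, k

def absorption(f_hz, thickness_m, sigma, air_gap_m=0.0):
    """Normal-incidence absorption coefficient of layer (+ optional air gap) on a rigid wall."""
    zc, k = miki(f_hz, sigma)
    kd = k * thickness_m
    if air_gap_m > 0:
        # Impedance of the rigid-backed air gap seen at the back of the absorber.
        z_back = -1j * Z0 / math.tan(2 * math.pi * f_hz / C0 * air_gap_m)
        zs = zc * (z_back * cmath.cos(kd) + 1j * zc * cmath.sin(kd)) / (
            zc * cmath.cos(kd) + 1j * z_back * cmath.sin(kd))
    else:
        zs = -1j * zc * cmath.cos(kd) / cmath.sin(kd)
    reflection = (zs - Z0) / (zs + Z0)
    return 1 - abs(reflection) ** 2

SIGMA = 10_000.0  # hypothetical flow resistivity in Pa·s/m^2
for f in (100, 200, 300):
    flush = absorption(f, 0.075, SIGMA)
    gapped = absorption(f, 0.075, SIGMA, air_gap_m=0.025)
    print(f"{f} Hz: 75mm flush α ≈ {flush:.2f}, with 25mm air gap α ≈ {gapped:.2f}")
```

Even in this simplified model, the spaced panel absorbs more at 200Hz than the same panel mounted flush, and a thicker panel absorbs more in the low bass than a thinner one.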

Another suggestion to improve 250Hz-2kHz midrange quality is to add diffusion to the rear wall (again, don't sit too close to the wall!). A diffuser scatters the rear reflections in multiple directions. By doing this, you reduce the intensity of strong specular reflections that cause comb filtering. The classic mathematically defined diffuser is the "quadratic residue diffuser" (QRD) of various designs (not too expensive these days). How far down in frequency a QRD is able to effectively scatter is dependent on the maximum well depth with a 4"/10cm-deep diffuser capable of scattering down to about 850Hz as a lower limit but most effective for 1kHz and above.
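
For the curious, the "mathematically defined" part of a QRD is simple: well n gets a relative depth of n² mod N, where N is a prime number of wells per period. Here is a small Python sketch for a hypothetical 7-well panel with a 10cm maximum depth. The panel size is my assumption for illustration, the function names are mine, and the design-frequency formula is the textbook one rather than any manufacturer's spec.

```python
C = 343.0  # assumed speed of sound in m/s

def qrd_sequence(n_wells):
    """Quadratic residue sequence: n^2 mod N for one period of N wells (N prime)."""
    return [(n * n) % n_wells for n in range(n_wells)]

def qrd_well_depths_m(n_wells, max_depth_m):
    """Scale the sequence so the deepest well equals max_depth_m."""
    seq = qrd_sequence(n_wells)
    scale = max_depth_m / max(seq)
    return [s * scale for s in seq]

def qrd_design_frequency_hz(n_wells, max_depth_m):
    """Design frequency f0, from well depth d_n = s_n * wavelength0 / (2 * N)."""
    s_max = max(qrd_sequence(n_wells))
    return s_max * C / (2 * n_wells * max_depth_m)

print(qrd_sequence(7))
print([round(d * 100, 1) for d in qrd_well_depths_m(7, 0.10)], "cm")
print(f"design frequency ≈ {qrd_design_frequency_hz(7, 0.10):.0f} Hz")
```

For this hypothetical panel the design frequency comes out near 980Hz, which sits in the same range as the roughly 850Hz-1kHz figures mentioned above. Useful scattering does not switch on abruptly at one frequency, so treat any single number as a guide.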

At home in the sound room, a loosely packed LP rack behind the listening position can act as a diffuser - the key is that the LPs need to be arranged with some random depth variation.

c = speed of sound = 343m/s at sea level, 20°C.

A rack of LPs can achieve random scatter but not in the same way as the mathematically optimized and uniform QRD diffuser. Although I don't think LPs are the peak of audiophile fidelity, this could be another reason why they're still a desirable physical format. 😉

My loosely packed rack of LPs of random variable depth
behind main listening position for some diffusion.

In the higher frequencies from 1-8kHz, consider using just diffusers to reduce specular reflections especially in the range where human hearing is most sensitive (1-5kHz). At these frequencies, modest amounts of soft furnishings and the carpet already attenuate the sound passively. Excessive absorption can result in a "dead" or overly "dry" sound if too much is taken out of the "presence" and "brilliance" portions of the audible spectrum above 4kHz.

Beyond 8kHz, it really doesn't practically matter. Our hearing acuity drops off, wavelengths become smaller (4.3cm at 8kHz), and nulls become spatially even narrower, effectively self-averaging with music listening.

As discussed before, beyond room treatments and basic EQ, there are sophisticated room-correction DSPs we could use to shape the frequency response and improve temporal accuracy at the listening position(s). We can also define for ourselves a steady state frequency target curve such as the Dolby Atmos Music Target. We can have a more thorough look at room target curves and preferences another time.

Every room is unique. In my opinion, no serious audiophile should go through life without at least having measured the frequency response in their sound room! Sure, one can argue that "ears are all I need" when it comes to enjoying music. However, it's highly unlikely that just using ears would accurately and thoroughly identify issues we can often fix. No serious acoustician would treat a room without examining measurements to verify their results. Likewise, I don't see why serious audiophiles should not aim for verifiable results.

If you like reading about sound room acoustics in general and want more (a lot more!), make sure to check out books like:

F. Alton Everest & Ken Pohlmann - Master Handbook of Acoustics (7th Ed., 2022)

Floyd Toole, Sean Olive & Todd Welti - Sound Reproduction: The Acoustics and Psychoacoustics of Loudspeakers, Rooms and Headphones (4th Ed., 2026)

Vincent Verdult - Optimal Audio and Video Reproduction at Home: Improving the Listening and Viewing Experience (1st Ed., 2019) is practical with good discussions about room set-up (book discussed here). A resource for both audio and video playback.

--------------------

To end, let's spend time with some music!

I mentioned the new Andrew Jones-designed Jones & Cerreta Troubadour electromagnet (field coil) speakers at the start of this post. I see that it received positives from Stereophile the other day in their AXPONA 2026 coverage.

I found it interesting the music they used to demo the speakers listed in the report. There's stuff like:

Francine Thirteen's "Queen Mary". I've heard this used in audio shows in the last few years including at the 2024 Pacific Audio Fest playing on the Songer Audio field coil speakers.

Then there's Hedegaard's "Ratchets" - intense stuff:

And Lalo Schifrin's "Blues in the Bassment" - highly dynamic (DR15):

It's interesting to see the change in music being used over time at audio shows. In the last few years, I think we're appropriately starting to see more electronica and newer genres; hopefully this will be interesting to new and younger audiences. That Hedegaard track has some intense bass and needs to be heard with subwoofers capable of <30Hz response. I can imagine how some audiophiles might not be used to or even comfortable with this kind of content! 😮

Ghost Rider's psy-trance "Make Us Stronger" has also been popular in the last few years. GR's track "Speed Of Soul" also sounds cool:

While it's nice to have variation in the music, I don't think there's anything wrong with demo'ing audiophile favorites like Diana Krall once in a while! At least one can apply the classic definition of the "absolute sound" when evaluating Diana's voice reproduced in an acoustic space, right? In contrast, I wouldn't know how to evaluate the absoluteness of that demon-robot voice in "Ratchets" (check out "INFERNO"). 😉

Here's "Let's Fall In Love" - I've always enjoyed the instrumental bridge in the middle:

Finally, here's a picture of Diana in concert in Vancouver at the Orpheum Theater the other night (April 22, 2026):

This was the last evening of her 29-city 2026 tour. She did a great job and the 3 encore selections they played were a joy.

Okay friends, finally I've cleared up much of the work I needed to attend to; off to Asia for an extended vacation. 🙂 Hoping with all the geopolitical tensions, the world doesn't run out of jet fuel over the next month so I can get back home!

Hope you're enjoying the music in a nice sound room, dear audiophiles.
