Ok, what’s the most relevant thing in integrating the system again? Ah yes .. it’s the delay. So while we worked out some important methodological aspects last time, there were still some things that were not as good as they could be.
So what would those be? Did we not tweak the system perfectly already? Well ..
1.
_________________________________________________________________
Binaural imbalance specifically for lower frequencies at around 90–103 Hz.
This is somewhat of a subtle thing, but it still affects the stage appearance and the sweet spot position. In the aforementioned range, there is a bass bubble somewhat near the right speaker. It is induced by the room geometry and the resulting impulse response, so there is no way around it. The frequencies simply sum up the way the room structure dictates, and they do so slightly to the right of the listening position. Moving the components (desk, subwoofer, monitors etc.) will NOT change this! So there were some compensating measures, like moving the sub back below the desktop, away from the left wall, and then slightly off-center towards a spot below the right speaker. This results in a better summation specifically in the given range, but as mentioned, it does not remove the bass bubble.
2.
_________________________________________________________________
Microattack signature of bass components.
This is a topic that is best solved by modelling the bass response and texture using a double bass blueprint, meaning e.g. jazz tracks with lots of double bass content, such as Nardis from Bill Evans.
So what is this, and why would you care? Lower frequencies evolve much more slowly, so to speak, due to their longer wavelengths (and thus longer periods). They also carry more energy, which in turn has to be transported by the cables to the speakers. If you think a bit about this: it makes the timing relevant in very specific ways, namely that you hear misalignments quite clearly. With the higher frequencies, it may not be so obvious, and one rather hears it in finer-grained measures.
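To get a feeling for the scales involved, here is a minimal Python sketch that converts a frequency into its wavelength and period. The speed of sound of 343 m/s (air at roughly 20 °C) and the example frequencies are assumptions for illustration, not measurements from the setup described here.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 °C


def wavelength_m(frequency_hz):
    """Wavelength in metres of a sound wave at the given frequency."""
    return SPEED_OF_SOUND_M_S / frequency_hz


def period_ms(frequency_hz):
    """Duration of one full cycle in milliseconds."""
    return 1000.0 / frequency_hz


if __name__ == "__main__":
    # Example frequencies are arbitrary picks from bass to treble.
    for f_hz in (40.0, 100.0, 1000.0, 10000.0):
        print(f"{f_hz:7.0f} Hz: wavelength {wavelength_m(f_hz):6.3f} m, "
              f"period {period_ms(f_hz):7.3f} ms")
```

A single cycle at 40 Hz takes 25 ms, so anything happening on a sub-millisecond scale is a tiny fraction of one bass period.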
One would want to start with the fact that if you hear any boominess at a certain frequency, it is for real and your measure will not necessarily make it go away.
The following components influence the transient representation here:
A. The cable
B. The phase alignment
C. The timing
So while some people think cables are hocus-pocus, i am beyond the point of listening to what other people say when i know better for sure.
Cables do make a subtle difference. We mentioned this before: the human ear is sensitive at orders of magnitude that are hard or impossible to measure reasonably. One of those aspects relates to the micro envelope (hull curve) of a signal. There are some cables that have a quicker attack than others on a micro scale, which is worth knowing:
Note that those diagrams do not necessarily show the physical time domain, but rather a psychoacoustic scale: amplitude is normalized, and the time axis (x axis) shows the perceived stimulation as a total estimated acoustic appearance. If you only looked at the physically correct runtime differences (i.e. only at the signal evolution), you would see really tiny differences in the μs ballpark. But this results in a different stimulation, which is better modelled as shown in the above charts.
Note that this is quite a subtle thing to understand: in essence, the Mogamis are the faster cables here, which means the signal arrives quicker and with steeper attack slopes, which in turn means that e.g. the room modes are stimulated more on a macro level. Note that we are talking about a difference of a few tenths of a millisecond here. But this is what ultimately makes the difference between the cables, if you understand.
So what does that mean now? Depending on your setup, you need to know this and how to work with it. In essence, it means that the attack peak is delayed by about 0.1–0.3 ms with the EFF ISL Supra cables, and at the same time the distribution of energy is more even, but the damping factor is lower as well. This results in less dynamics, but possibly in a more textured bass representation. There is no wrong or right here. But in my experience it is a bit easier to handle if you have a ‘flatter but wider microattack energy spike’: room modes are less excited.
A sharper, more concentrated first impulse is more likely to mask all the following impulses on a macro scale, specifically if you are trying to get the room response under control.
You may want to use double bass tracks like Nardis by Bill Evans, preferably studio recordings with high resolution and quality. What i ended up doing to really nail it was to compare Life of Pai from Marc Johnson on both the Beyerdynamic DT 880 and my system, looking only at the bass reverberations. This was a final heuristic to really nail it down. One could also do it with other tracks, and in the end that was obviously only one of many different test tracks. But it is hard to tell ‘what is correct’ etc.
Also note that this depends somewhat on the correct balancing of the upper structure, i.e. lower mids and mids, if you want to judge the woodiness factor for example. I recommend concentrating on one isolated aspect at a time, then zooming out again to the macro level. The reason is that your system will have to integrate across the whole range of time and space scales in the end: all the way from the microattack level to the overall stage appearance and signature, the transitions between the different scales have to be transparent and invisible, just like your whole system. This is why you can consider the microattack the smallest time unit, so to say. Then you have the instrument attacks as a whole (not the same thing, but related), and then, when it comes to audio systems, the overall summation over the whole spectrum etc.
Note that this is an interesting way to name the gist of it: invisible transitions between the different scales of space and time (not spacetime), where different scales of change over time play a role, also in the phase or group delay domain etc.
It helps to know what is relevant on which scale: the phase domain is not really relevant on the microattack level, nor so much on the instrument attack or envelope (hull curve) level, but it is on the frequency domain level etc.
To put this into a formula, one gets:
This helps to really be aware of what can be modelled using which measures on which time scale. E.g. it does not make much sense to optimize the phase on a microattack level, and so on.
The tricky part is that they form a gradient together and this makes it a complex optimization problem.
3.
_________________________________________________________________
Overly costly flattening of the amplitude curve through inappropriate use of IIR filters
Somewhat related to the aforementioned aspect of microattack dynamics and cables: fine-tuning the microattack only helps so much if the resonances are not balanced out on a macro level. And that has, in the first instance, not so much to do with the cable, but with aligning the delay on a millisecond level (an order of magnitude larger, x10) and with the overall position of the components (speakers, subwoofer).
So i made the mistake of forcing my main modes down to an acceptable level, but i bought this at the cost of phase twists induced by high-Q IIR filters (IIR filters being the only ones i can use in my minimalistic setup).
By correcting that with flatter and lower filters, the bass representation, and thus the whole staging, became much more calm and even again. Imagine bass runs with a perfectly even progression, no resonances where they do not belong.
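As a rough illustration of why narrow, deep filters twist the phase more, the following sketch compares two peaking filters by their peak group delay. It uses the widely published Audio EQ Cookbook biquad form as a stand-in; the filters in the actual device may be implemented differently. The 42 Hz center frequency, the gains, the Q values and the 48 kHz sample rate are hypothetical values, not the settings used here, and reading ‘flatter and lower’ as lower Q and less gain is an assumption.

```python
import cmath
import math

SAMPLE_RATE_HZ = 48000.0  # assumption: typical interface sample rate


def peaking_biquad(f0_hz, gain_db, q, fs=SAMPLE_RATE_HZ):
    """Return (b, a) coefficients of a peaking EQ (Audio EQ Cookbook form)."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp)
    a = (1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp)
    return b, a


def response(b, a, f_hz, fs=SAMPLE_RATE_HZ):
    """Complex frequency response of the biquad at f_hz."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f_hz / fs)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return num / den


def magnitude_db(b, a, f_hz):
    """Gain of the filter at f_hz in dB."""
    return 20.0 * math.log10(abs(response(b, a, f_hz)))


def group_delay_ms(b, a, f_hz, df_hz=0.01):
    """Group delay (negative phase slope) in ms, via a central difference."""
    phase_lo = cmath.phase(response(b, a, f_hz - df_hz))
    phase_hi = cmath.phase(response(b, a, f_hz + df_hz))
    d_omega = 2.0 * math.pi * 2.0 * df_hz
    return -(phase_hi - phase_lo) / d_omega * 1000.0


def peak_abs_group_delay_ms(b, a, f_lo_hz=20.0, f_hi_hz=80.0, steps=600):
    """Largest absolute group delay found on a linear frequency grid."""
    grid = (f_lo_hz + i * (f_hi_hz - f_lo_hz) / steps for i in range(steps + 1))
    return max(abs(group_delay_ms(b, a, f)) for f in grid)


if __name__ == "__main__":
    narrow = peaking_biquad(42.0, -9.0, 8.0)  # hypothetical deep, high-Q cut
    wide = peaking_biquad(42.0, -4.0, 2.0)    # hypothetical shallow, low-Q cut
    for name, (b, a) in (("high-Q", narrow), ("low-Q", wide)):
        print(f"{name}: {magnitude_db(b, a, 42.0):.1f} dB at 42 Hz, "
              f"peak |group delay| {peak_abs_group_delay_ms(b, a):.1f} ms")
```

With these assumed values, the narrow deep cut should show a clearly larger peak group delay than the wide shallow one, which is one way to read the ‘phase twist’ in numbers.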
Note that for 2. and 3., the limiting factor will be the main room modes: you cannot get rid of them, and for bass frequencies this will always be a matter of optimization, not a binary one. This may clarify why 2. and 3. are so important, specifically in untreated rooms: you want to stimulate the room modes as little as possible, while you can never fully get rid of them. Imagine a bass run with longer-standing deep bass waves: 2. and 3. cannot help you there, but in general, music is composed of other sounds 95% of the time.
4.
_________________________________________________________________
Group Delay in the lower regions around the 2nd main room mode at around 40Hz.
This is a bit annoying and again, due to the room geometry / physics. Here is a picture of the setup using REW:
The picture above shows the main room modes (colored lines) for a certain impulse response in a certain setup. It is a model of the room that was measured, created with the REW Room Simulator feature: one places the subwoofer and monitors in a room of a defined size and then gets the calculated room response as shown above.
This comes in handy if you just want to know upfront what you can tackle and what you need to ‘work with’, or how this changes when you move things around. It can literally save you quite some time, and i wish i had known this in the beginning. What i have learned from it is that a 2nd subwoofer would not solve my dip at 34 Hz, unless i position it somewhere far away from the RME device.
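The simulator takes speaker and listener positions into account, which the following does not, but the textbook formula for the axial modes of a rectangular room, f = n·c/(2L), already gives a first idea of where to expect trouble. The sketch below applies it to hypothetical room dimensions (4.0 x 3.2 x 2.5 m) that are not the dimensions of the room described here; the speed of sound of 343 m/s is also an assumption.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at about 20 °C


def axial_modes_hz(dimension_m, max_hz):
    """Axial mode frequencies n * c / (2 * L) up to max_hz for one dimension."""
    modes = []
    n = 1
    while n * SPEED_OF_SOUND_M_S / (2.0 * dimension_m) <= max_hz:
        modes.append(n * SPEED_OF_SOUND_M_S / (2.0 * dimension_m))
        n += 1
    return modes


if __name__ == "__main__":
    # Hypothetical room, for illustration only.
    room_m = {"length": 4.0, "width": 3.2, "height": 2.5}
    for name, size_m in room_m.items():
        freqs = ", ".join(f"{f:.1f}" for f in axial_modes_hz(size_m, 120.0))
        print(f"{name} ({size_m} m): {freqs} Hz")
```

Tangential and oblique modes are left out on purpose; for a quick look at the lowest problem frequencies, the axial ones are usually the first thing to check.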
In general, if one places a 2nd subwoofer at the opposite side, the dip at 35 Hz would be somewhat smoothed out, along with a more even frequency distribution in other areas:
and this would be its position (lower left side box, obviously not to-the-wall-firing then):
The issue with that would be the cabling: unless i use over-the-ceiling cabling of about 5 m, this is not possible in my room.
Ok, but back to point 4.: the group delay peak at about 39 Hz has to do with the main room mode at about 42 Hz and the energy dip at about 35 Hz. In between, at around 39 Hz, there is a slight peak in the group delay, which is obvious, but it is a limiting factor to be optimized for. I can still get it to below 50 ms at reasonable levels.
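To put the 50 ms into perspective, it helps to express a group delay in periods of the frequency in question. A minimal sketch; the comparison frequencies of 100 Hz and 1 kHz are arbitrary picks for illustration.

```python
def group_delay_in_cycles(group_delay_ms, frequency_hz):
    """Express a group delay as a number of periods of the given frequency."""
    return group_delay_ms / 1000.0 * frequency_hz


if __name__ == "__main__":
    # 39 Hz is the peak mentioned in the text; the others are for comparison.
    for f_hz in (39.0, 100.0, 1000.0):
        cycles = group_delay_in_cycles(50.0, f_hz)
        print(f"50 ms at {f_hz:6.0f} Hz = {cycles:5.2f} cycles")
```

At 39 Hz, 50 ms is just under two periods, while the same 50 ms at 1 kHz would be fifty periods.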
5.
_________________________________________________________________
Fine tuning the delay (in a window of ±0.5ms).
It has been said before, but this is something you may not find reasonable in general. In terms of feasibility, it is mostly related to 2. Or to put it another way: if you have not solved the phase alignment and the general timing and delay on a ms scale, please do not touch the fine tuning.
This only makes sense if you are looking for the microattack to snap in.
At this level of detail, and at the same scale, one may also want to compensate for a difference in amplitude between the left and right speaker, correcting it in a similarly small range.
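To get a feeling for how small this window is, the following sketch converts a delay offset into samples, acoustic path length and phase at a given frequency, and a level difference in dB into a linear gain factor. The 48 kHz sample rate, the 80 Hz reference frequency, the 0.5 dB level difference and the speed of sound of 343 m/s are assumptions for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at about 20 °C


def delay_to_samples(delay_ms, sample_rate_hz):
    """Delay expressed in samples at the given sample rate."""
    return delay_ms / 1000.0 * sample_rate_hz


def delay_to_path_cm(delay_ms):
    """Distance sound travels in air during the delay, in centimetres."""
    return delay_ms / 1000.0 * SPEED_OF_SOUND_M_S * 100.0


def delay_to_phase_deg(delay_ms, frequency_hz):
    """Phase shift in degrees that the delay causes at one frequency."""
    return 360.0 * frequency_hz * delay_ms / 1000.0


def db_to_gain(level_db):
    """Linear amplitude factor for a level change in dB."""
    return 10.0 ** (level_db / 20.0)


if __name__ == "__main__":
    delay_ms = 0.5  # upper edge of the fine-tuning window from the text
    print(f"{delay_ms} ms = {delay_to_samples(delay_ms, 48000.0):.1f} samples at 48 kHz")
    print(f"{delay_ms} ms = {delay_to_path_cm(delay_ms):.2f} cm of path length")
    print(f"{delay_ms} ms = {delay_to_phase_deg(delay_ms, 80.0):.1f} degrees at 80 Hz")
    print(f"-0.5 dB = gain factor {db_to_gain(-0.5):.4f}")  # hypothetical L/R trim
```

Under these assumptions, 0.5 ms corresponds to 24 samples or roughly 17 cm of path length, which shows why this only pays off once the coarse alignment is already right.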
So what is the gain from those last-percent measures? One would describe it as a silky signature, a stable phantom centre, better staging and wider stereo imaging. In essence, a different sound dimension opens up. Would you be able to get this by using Dirac or even Trinnov Nova? Yes. But at a much higher cost, both runtime- and money-wise.