[R][D] A Quiet Bias in DL’s Building Blocks with Big Consequences

TL;DR: Deep learning's fundamental building blocks — activation functions, normalisers, optimisers, etc. — appear to be quietly shaping how networks represent and reason. Recent papers offer a perspective shift: these built-in biases drive phenomena like superposition, suggesting a new symmetry-based design axis for models. By rethinking our default choices, which carry unintended consequences, a whole-stack reformulation is undertaken to unlock new directions for interpretability, robustness, and design.

Symmetries in primitives act like lenses: they don't just pass signals through, they warp how structure appears – a kind of 'neural refraction' – until even the very notion of a neuron is lost.

[Figure: the activation-function reformulations alone; standard (anisotropic) activations on the left, the new isotropic tanh on the right.]
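To make the distinction concrete, here is a minimal sketch of the two activation styles, assuming a radial 'norm-squashing' form for the isotropic variant; `isotropic_tanh` is an illustrative name, and the exact functional forms in the papers may differ:

```python
import numpy as np

def elementwise_tanh(x):
    # Standard (anisotropic) activation: applied per coordinate,
    # so the coordinate axes become privileged directions.
    return np.tanh(x)

def isotropic_tanh(x, eps=1e-12):
    # Illustrative isotropic (radial) variant: squash the vector's
    # norm with tanh while preserving its direction, so no axis
    # is treated differently from any other.
    r = np.linalg.norm(x)
    return (np.tanh(r) / (r + eps)) * x
```

The elementwise form commutes only with permutations and sign flips of the axes, whereas the radial form commutes with all rotations; that symmetry gap is the anisotropic/isotropic distinction the figure is gesturing at.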

This reframes several interpretability phenomena as function-driven rather than fundamental to DL, whilst producing a new ontology for deep learning's foundations.

Swapping the building blocks can wholly alter representations, from discrete clusters (like "grandmother neurons" and superposition) to smooth distributions – showing that this foundational bias is strong and can be leveraged for improved model design.

The 'Foundational Bias' Papers:

Position (2nd) Paper: Isotropic Deep Learning (IDL) [link]:

TL;DR: Intended as a provocative position paper exploring the ramifications of redefining the building-block primitives of DL. It explores several research directions stemming from this symmetry redefinition and makes numerous falsifiable predictions. It motivates this new line of enquiry, indicating its implications from model design to theorems contingent on current formulations. When contextualising this, a taxonomic system emerged, providing a generalised, unifying symmetry framework.

Primarily showcases a new symmetry-led design axis across all primitives, introducing a programme to learn about and leverage the consequences of building blocks as a new form of control on our models. The consequences are argued to be significant and an underexplored facet of DL.

Predicts how our default choice of primitives may be quietly biasing networks, causing a range of unintended and interesting phenomena across various applications. New building blocks mean new network behaviours to unlock, and hidden harmful 'pathologies' to avoid.

This paper directly challenges the assumption that primitive functional forms are neutral choices, providing several predictions that frame interpretability phenomena as side effects of current primitive choices (now empirically confirmed, see below) and raising questions in optimisation, AI safety, and potentially adversarial robustness.

There's also a handy blog that runs through these topics in a hopefully more approachable way.

Empirical (3rd) Paper: Quantised Representations (PPP) [link]:

TL;DR: By altering primitives, it is shown that current ones cause representations to clump into clusters — likely undesirable — whilst symmetric alternatives keep them smooth.

Probes the consequences of altering the foundational building blocks, assessing their effects on representations. Demonstrates how foundational biases emerge from various symmetry-defined choices, including new activation functions.

Confirms an IDL prediction: anisotropic primitives induce discrete representations, while isotropic primitives yield smoother representations that may support better interpolation and organisation. It disposes of the 'absolute frame' discussed in the SRM paper below.

This offers a new perspective on several interpretability phenomena: instead of being fundamental to deep learning systems, the paper shows our choices induce them — they are not fundamentals of DL!

'Anisotropic primitives' are sufficient to induce discrete linear features, grandmother neurons and potentially superposition.

  • Could this eventually affect how we pick activations/normalisers in practice, leveraging symmetry just as ReLU once displaced sigmoids? (See the sketch below.)
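As a hedged illustration of how small that swap could be in practice, here is a sketch of an isotropic activation as a drop-in module; `IsotropicTanh` is a hypothetical name, and the form the papers actually propose may differ:

```python
import torch
import torch.nn as nn

class IsotropicTanh(nn.Module):
    # Hypothetical drop-in replacement for an elementwise activation:
    # squashes each feature vector's norm with tanh, preserving direction.
    def forward(self, x):
        r = x.norm(dim=-1, keepdim=True)         # norm over the feature axis
        return torch.tanh(r) / (r + 1e-12) * x   # rescale, keep direction

# Swapping the primitive is then a one-line change to a standard MLP:
mlp = nn.Sequential(
    nn.Linear(64, 64),
    IsotropicTanh(),   # instead of nn.ReLU() or nn.Tanh()
    nn.Linear(64, 10),
)
```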

Empirical (1st) Paper: Spotlight Resonance Method (SRM) [link]:

TL;DR: A new tool shows primitives force activations to align with hidden axes, explaining why individual neurons often seem to represent specific concepts.

This work shows there must be an "absolute frame" created by primitives in representation space: neurons and features align with special coordinates imposed by the primitives themselves. Rotate the basis, and the representations rotate too — revealing that phenomena like "grandmother neurons" or superposition may be induced by our functional choices rather than fundamental properties of networks.
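A quick way to see that 'absolute frame' numerically is to check rotation equivariance: an isotropic activation commutes with rotations of representation space, while an elementwise one does not. A minimal sketch, reusing the illustrative radial tanh from above:

```python
import numpy as np

def isotropic_tanh(x, eps=1e-12):
    r = np.linalg.norm(x)
    return (np.tanh(r) / (r + eps)) * x

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal matrix
x = rng.standard_normal(3)

# Isotropic: f(Qx) == Q f(x), so no basis is special.
print(np.allclose(isotropic_tanh(Q @ x), Q @ isotropic_tanh(x)))  # True

# Elementwise: f(Qx) != Q f(x) in general; the coordinate axes form
# the privileged "absolute frame" that SRM probes for.
print(np.allclose(np.tanh(Q @ x), Q @ np.tanh(x)))  # False (generically)
```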

This paper motivated the initial reformulation of the building blocks.

Overall:

Hopefully, an exciting research agenda, with a line of enquiry on symmetry tangential to existing GDL and parameter-symmetries approaches.

Curious to hear what others think of this research arc so far:

  • What reformulations or consequences (positive or negative) interest you most? Any implications I've missed?
  • If symmetry in our primitives is shaping how networks think, should we treat it as a core design axis?

I hope this research direction may catch your interest for future collaborations: discovering more undocumented effects of our functional-form choices, designing new building blocks, and leveraging them for better performance.
