Homunculi

A heavily modified Mixture-of-Recursions architecture with brain-region-inspired processing layers. Small models trained. Hints of Theory of Mind emerged.

incubating · ai · python · pytorch · research · architecture

Source on GitLab.

Where this came from

The starting point was the Mixture-of-Recursions research paper — the idea that a single transformer block, applied recursively with different routing decisions at each pass, can do useful things that a stack of distinct layers struggles with. I started modifying the architecture based on my own ideas, and the direction those ideas took came from two places: cognitive science, and watching LLMs fail.
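The core mechanic can be sketched in a few lines. This is a minimal, hypothetical PyTorch illustration of the idea (not the paper's implementation or mine): one shared transformer block applied repeatedly, with a learned per-token router softly deciding at each pass how much a token gets processed again versus passed through.

```python
import torch
import torch.nn as nn

class RecursiveBlock(nn.Module):
    """One shared block, applied recursively with per-token soft routing."""

    def __init__(self, d_model: int = 64, n_heads: int = 4, max_recursions: int = 4):
        super().__init__()
        # A single transformer layer reused at every recursion step,
        # instead of a stack of distinct layers.
        self.block = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.router = nn.Linear(d_model, 1)  # per-token "process again?" score
        self.max_recursions = max_recursions

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for _ in range(self.max_recursions):
            gate = torch.sigmoid(self.router(x))        # (B, T, 1), in [0, 1]
            # Soft routing: blend the re-processed token with the skipped one.
            x = gate * self.block(x) + (1 - gate) * x
        return x

model = RecursiveBlock()
out = model(torch.randn(2, 16, 64))  # (batch, tokens, d_model)
```

Real MoR routing is more involved (hard top-k decisions, load balancing), but the skeleton — one block, many passes, routing per pass — is the part that matters here.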

LLM failure modes look a lot like psychological dysfunction when you stare at them long enough. Not metaphorically — structurally. The attention patterns, the failure to maintain context, the way models confabulate under uncertainty — these map surprisingly cleanly onto things we understand about how human cognition breaks down. That observation made me wonder whether the solutions that evolution landed on for those problems might be worth taking seriously architecturally.

Not cargo-culting biology. Not “insert emotions here.” More: these brain regions exist because they solve real computational problems, there must be a mechanical reason we evolved them this way, what happens if we build functional analogs of those mechanics into the processing pipeline?

What I actually built

Functional analogs — emphasis on functional — of four regions: the prefrontal cortex, temporal lobe, amygdala, and neocortex. Functional in terms of how they participate in processing tokens at each recursive pass, not as a literal simulation of neuroscience.

The prefrontal component handles planning and inhibition across recurrences. The temporal handles contextual binding over longer ranges. The amygdala equivalent acts as a kind of salience gate — flagging what matters enough to route differently. The neocortex components are basically just experts.

They interact with the recursion routing, so the model has some capacity to decide how deeply to process something and through which pathways.
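To make the salience-gate idea concrete, here is a hypothetical sketch (names and thresholds are mine, not from the project) of an amygdala-style component scoring tokens and flagging which ones should be routed through a deeper pathway:

```python
import torch
import torch.nn as nn

class SalienceGate(nn.Module):
    """Scores each token's salience; downstream routing can use the score
    to pick recursion depth or pathway. Purely illustrative."""

    def __init__(self, d_model: int = 64):
        super().__init__()
        self.score = nn.Linear(d_model, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Per-token salience in [0, 1].
        return torch.sigmoid(self.score(x)).squeeze(-1)

gate = SalienceGate()
x = torch.randn(2, 16, 64)               # (batch, tokens, d_model)
salience = gate(x)                       # (2, 16)
deep_mask = salience > 0.5               # tokens flagged for deeper processing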

I also added a neuromodulation packet: a signal that influences, rather than forces, the effective hyperparameter values considered during the forward pass, which in turn feeds into how the layers train themselves. It’s all very meta.
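One way to read “influences rather than forces”: the packet carries scalar signals that nudge effective hyperparameters around their base values instead of overwriting them. This is a guess at the shape of the mechanism, with entirely hypothetical names:

```python
from dataclasses import dataclass

@dataclass
class ModulationPacket:
    """Hypothetical neuromodulation signals, each in [-1, 1]."""
    arousal: float = 0.0     # shifts e.g. a routing temperature
    inhibition: float = 0.0  # shifts e.g. a gating threshold

def effective_temperature(base: float, packet: ModulationPacket,
                          strength: float = 0.2) -> float:
    # Nudge, don't overwrite: the packet moves the base value by at most
    # strength * 100%, so the layer's own learned behaviour still dominates.
    return base * (1.0 + strength * packet.arousal)
```

With `strength=0.2`, a fully-aroused packet turns a base temperature of 1.0 into 1.2 — an influence, never a replacement.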

What happened

It worked. Small models converged. They were dumb — these are genuinely small models — but the behaviour was clearly pointing at something. Hints of Theory of Mind came out unprompted, which is what made me sit up. That’s not confirmed, mind you; it was just consistent enough to make me go “huh, something weird and unexpected is happening here.” The models seemed to consistently reason about how the other person might be thinking.

I also built out a fairly extensive metrics system to get snapshots of what the internal state looks like at different stages of training. Partly for debugging, partly because if you’re going to make claims about what a model is doing internally, you should actually be able to look at it.
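The standard PyTorch mechanism for this kind of snapshotting is forward hooks; here is a minimal sketch of recording per-module activation statistics (the hook approach is standard, the metric choices and names are illustrative, not the project's actual system):

```python
import torch
import torch.nn as nn

def attach_snapshot_hooks(model: nn.Module, store: dict) -> list:
    """Record mean/std of each Linear layer's output into `store`."""
    handles = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            def hook(mod, inputs, output, name=name):
                store[name] = {
                    "mean": output.detach().mean().item(),
                    "std": output.detach().std().item(),
                }
            handles.append(module.register_forward_hook(hook))
    return handles  # call .remove() on each when done

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
snapshot = {}
handles = attach_snapshot_hooks(model, snapshot)
model(torch.randn(32, 8))   # one forward pass fills the snapshot
for h in handles:
    h.remove()
```

Run at chosen training steps, this gives you exactly the kind of internal-state time series you need before making claims about what a model is doing inside.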

Current state

The model architecture trains. The blocker is the training application I built alongside it — a larger system intended to make the whole thing reproducible and properly instrumented. That’s currently a complication rather than a help. Training continues to work, but the script is in a state that makes it hard to iterate on and collect metrics reliably.

Partially on hold while I untangle that and figure out the next phase of scaling. The architecture isn’t going anywhere.