Intelligent selection
I've always been fascinated by emergence: the process by which complex, sophisticated new "things" arise from simpler building blocks. It was the reason I picked Computer Engineering as my major in college -- I wanted to understand why computers worked, not just how.
It might seem like a subtle distinction, but I view the how as relatively arbitrary details. Those details can be interesting, but for me the exciting bit is how the different subsystems fit together to create completely new capabilities at the next order of magnitude up. For example, it's amazing that bits and logic gates are the fundamental units of computation that allow you to read this blog post I published on the internet (fun fact: in his master's thesis, Claude Shannon proposed using Boolean algebra to analyze and simplify relay circuits, laying the foundation for all of digital computing!).
I've been thinking about this again recently because we've historically designed software architecture top-down: we gather the requirements for the system today, make predictions about where it will need to be in the medium-term future, make tradeoffs based on those requirements, and design the best system possible given constraints like time, cost, and staffing. These systems don't emerge; they're intelligently designed.
Contrast that with life, the quintessential example of emergence. For centuries, scientists and philosophers have been perplexed by what makes something alive versus not. What allows life to emerge under one set of conditions and not another?
Imagine a rock with its rigid, highly consistent internal molecular structure, and compare that against the molecular structures in your body that support all of your cells, tissues, and organs. How can two objects at the same scale have such dramatically different levels of internal complexity?
So it seems that life, in all of its variety, sophistication, and robustness, can evolve blindly, yet when we build software systems blindly we end up with spaghetti code. That's strange: we are intelligent beings writing this software, so we should be capable of intelligently selecting patterns and doing at least as well as natural selection, right?
It's not quite that simple. Natural selection, while blind, is a highly intelligent process, though we don't always frame it that way. We take for granted that good adaptations propagate through a population while maladaptive mutations die out. Once again, we could dig into the how of genetic inheritance and expression, but why is this happening?
Well, what makes something intelligent, anyways? I would argue that a system is intelligent if it has a way to store information, which is reflected in the system's behavior. The system should also be able to update the structure of its stored information (aka memory) when presented with meaningful new information. The determination of what is and is not meaningful information is an inherent property of the system, a result of the system's geometry.
For example, base pairs in DNA only have meaning in the context of a cell: they instruct the cell on how to build proteins, are preserved through replication, and can be updated in different environments or by random mutations. DNA is causal -- it affects the function of the system -- and it is selected for via natural selection. Less effective DNA dies off in a population.
So natural selection is intelligent because it stores, tests, and refines information across generations. The mere existence of the cell already tells us that its system has been selected for. The cell's memory, in the form of its DNA and other protein structures, encodes the lineage the cell took to exist at this point in time.
Back to the original question: what is a reproducing cell doing that a human writing spaghetti code isn't? Spaghetti code may be temporarily selected for since it can be faster to implement, but it generally will not stand the test of time. Either future maintainers will mature the codebase, or the tech debt will outpace the codebase's ability to evolve and extend, perhaps resulting in deprecation.
Well-designed software systems are typically designed top-down because the timescale of codebases is much smaller than the timescale of natural selection. Rather than blindly searching for an optimal implementation, we can reason forward within the problem space as we understand it, whereas natural selection explores by trial and error across generations.
We can thus view emergence as a search problem: given the set of possible configurations for a set of building blocks, what are the viable objects that can be produced by these building blocks? Which objects are good designs, and which objects are completely defective? Life converges on at least a local maximum of good design because it is functional and has successfully evolved more and more intelligent lifeforms over time. But is the current design a global maximum?
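Under this framing, blind search can be sketched as a toy evolutionary loop. Everything below -- the bitstring genome, the fitness function, and the parameters -- is a hypothetical illustration of variation-plus-selection, not anything from the post:

```python
import random

random.seed(0)

def fitness(genome):
    # Toy fitness landscape: count of 1-bits ("OneMax").
    # The global maximum is the all-ones genome.
    return sum(genome)

def mutate(genome, rate=0.05):
    # Blind variation: each bit flips independently with probability `rate`.
    return [b ^ 1 if random.random() < rate else b for b in genome]

def evolve(pop_size=50, length=20, generations=200):
    # Variation plus selection: keep the fitter half of the population,
    # then refill it with mutated copies of the survivors.
    population = [[random.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        offspring = [mutate(random.choice(survivors))
                     for _ in range(pop_size - len(survivors))]
        population = survivors + offspring
    return max(population, key=fitness)

best = evolve()
print(fitness(best), "of", len(best), "bits set")
```

Because survivors are carried over unchanged, the best fitness never decreases. On this trivial landscape the search climbs straight to the top; on richer landscapes the same loop can stall at a local maximum, which is exactly the local-versus-global question.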
As the cost of implementation drops with LLM agents writing the majority of code, we may end up seeing "emergent" software design become a thing. Perhaps agents will implement many versions of a feature and choose the best one, "killing" off the other options. Once a system reaches a certain level of complexity, perhaps Claude will prompt itself to understand its codebase, write up a summary.md, and then prompt another agent to re-implement the codebase from scratch based on the feature set alone.
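That selection step could look something like the following sketch, where several candidate implementations of the same function compete and a test suite plays the role of the environment. The candidates, test cases, and names here are all hypothetical stand-ins for agent-written code:

```python
def sort_v1(xs):
    # Candidate A: a correct implementation.
    return sorted(xs)

def sort_v2(xs):
    # Candidate B: a buggy implementation that drops duplicates.
    return sorted(set(xs))

def score(impl, cases):
    # "Fitness" of a candidate: the fraction of test cases it passes.
    return sum(impl(inp) == expected for inp, expected in cases) / len(cases)

cases = [
    ([3, 1, 2], [1, 2, 3]),
    ([2, 2, 1], [1, 2, 2]),  # duplicates expose sort_v2's bug
]

# Selection: the highest-scoring candidate survives, the rest are "killed" off.
best = max([sort_v1, sort_v2], key=lambda impl: score(impl, cases))
```

Here `sort_v1` survives with a perfect score while `sort_v2` fails the duplicate case; the quality of the outcome depends entirely on how well the test cases capture the feature set.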
And if that happens, we may find that the most sophisticated software systems aren't the ones we designed: they're the ones that emerged.