Whenever an optimization process declares its intention to pursue some goal, an adversary can conceive of degenerate cases to halt it in its tracks. This is true no matter what approach the optimization process takes, as any assumption it makes can be subverted to veer it off course. For a search to be directed, it must be driven by assumptions. It is a pointless exercise to search when you already have the answers, so the assumptions used must be an imperfect reflection of those answers. This separation, between the premises used and the goal’s complete description, gives the adversary all the ingredients it needs to craft a worst-case scenario.
Once an adversarial scenario is produced, the only choice the optimization process has is to bang its head against the wall repeatedly, unaware of its blind spots. The only way to restart the process is for someone to actively step in and adapt its assumptions to circumvent the obstacle.
In contrast, a learning strategy that constantly undermines, overturns, and negates its own prior beliefs and premises can avoid this fate. An adversary cannot trap it forever with a single unrelenting task. It must keep saddling it with more and more challenges, again and again. No single specific challenge can end its hope. It can always keep chugging away at anything thrown at it. There is no guarantee that such a learning strategy will always prevail, but, as long as the rules of the game allow for self-reproduction and self-representation in a universal way, it will always persevere.
Random search has no bias. It tries out everything, with no preference for where it lands. Evolution by random mutations on uncompressed program representations does have a bias: a bias toward local changes, as the cloud of mutations forms a roughly normal distribution around the current state.
Blind evolution, like any biased approach, can be trapped and exploited by negating the premise of its bias: what if the correct solutions are far away, beyond a ridge impassable to incremental change? Random search, meanwhile, is impervious to this attack. You cannot negate the premise of its bias because it has none; it has an equal amount of belief in everything, or, equivalently, it assumes nothing at all. Still, its purposeless net is cast too wide and too shallow, making it impractical and infeasible in most cases.
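The attack on local bias can be sketched in a few lines. This is a toy illustration with an invented one-dimensional landscape, not a claim about any particular system: a greedy hill climber with small Gaussian mutations never leaves the modest nearby peak, while unbiased random sampling eventually lands on the taller peak beyond the valley.

```python
import random

def fitness(x):
    """Deceptive 1-D landscape (hypothetical): a small peak at x = 0,
    a taller one at x = 10, and a deep valley in between that
    incremental change cannot profitably cross."""
    return max(1.0 - abs(x), 2.0 - abs(x - 10))

def hill_climb(steps=1000, sigma=0.5, seed=0):
    """Local search: accept a small Gaussian mutation only if it improves."""
    rng = random.Random(seed)
    x = 0.0  # starts on the small peak
    for _ in range(steps):
        candidate = x + rng.gauss(0.0, sigma)
        if fitness(candidate) > fitness(x):  # greedy local acceptance
            x = candidate
    return x

def random_search(steps=1000, seed=0):
    """Unbiased search: sample the whole interval uniformly, keep the best."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(steps):
        candidate = rng.uniform(-20.0, 20.0)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best
```

The hill climber stays pinned to the small peak: every improving move toward the global peak would require a jump of many standard deviations at once. Random search, having no locality bias, samples the far basin with a few percent probability per draw and finds it comfortably within the budget.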
Random search avoids bias by brute force, allowing for all counterfactuals. A local search bias is more efficient in some cases, but can be fooled when the negation of its assumptions holds true. Is the trade-off unavoidable? Can there be something in between?
We’re looking for a strategy in which we do not have to resort to a fixed locality or an overambitious globality, but can instead use temporary, fleeting biases. Some way to have “strong opinions, loosely held”: dynamic foundations that can be rearranged when required; a higher-level representation that pushes us in certain directions and forces us to follow through, yet is always malleable enough not to be actively hoodwinked, that allows us to escape, that is always alert to the breakdown of its assumptions and to the possibility of the negation of its premises.
Once a separation exists between the true objective and the heuristic used to navigate towards it, no matter how small, an adversary can exploit it to deceive the agent into a blind alley.
This is not a theoretical, hypothetical argument made only to show the inherent limits of directed learning in the abstract. It immediately offers us a clear recipe for constructing a channel through which we can speak in a way inaccessible to our optimization methods, making it impossible for them to follow us, like parents using a non-native language.
A synthetic reward is a heuristic approximation of the true goal, and as such it can be exploited by an adversary to misdirect any process that follows it away from its true objective. A natural reward, where models receive feedback directly from the world itself rather than by proxy, cannot be manipulated in this way, as an adversary cannot change the rules of the universe.
Unfortunately, simply following outcomes, even natural ones, means learning can only be done in retrospect and after the fact, an imitation of what has already happened. There are many stories we can tell about the future direction of the world; it is unlikely that those who only anticipate what has already been seen will always be the right ones. Our models must take a stand, they must form an opinion, they must assume things never before realized, to have any chance of seeing true where it is that we are going.
There is merit, in some cases, to learning from scratch and disposing of the old. The learner, starting from a blank slate, sees a very idiosyncratic pattern of phenomena and history of ideas. Consequently, the theories, explanations, and representations they come up with will be uniquely their own, offering them a chance to form an independent view of the world that might shed light on areas never before traveled, offer an alternative to the common knowledge, or make accurate predictions of the yet-unseen.
In contrast, a learned model cannot avoid closing the doors on many opportunities for discovery, and by the same process: understanding the world through a lens predisposes one to miss anything outside its scope. Only by discarding preconceptions and backtracking from existing efforts, by putting hard-earned methods in temporary limbo and unlearning rules of thumb, can learned models open themselves to further discovery.
During the twentieth century one could scarcely avoid hitting limiting results in almost any field one looked at, with metamathematics and computing leading the charge. But these limits were not only sobering: just as a border intent on prohibiting cannot resist defining what is allowed, these results on the limits of knowledge arrived hand-in-hand with universal systems that could in principle express anything imaginable. Yet each had its own Achilles’ heel, one that felt almost intentionally, frustratingly handpicked.
Many of our known unknowns are reflections of the unknown unknowns: they give us systems that let us universally express everything we can think about, yet each points directly to the existence of things beyond its reach.
Current machine learning best practice assumes that all local minima are roughly equivalent in quality. In fact, results that vary wildly with the initial seed are treated as a code smell, hinting at malfunction and, in other people’s code, at a whiff of p-hacking. This is backed empirically, but the explanation is sobering: model capacity and data quantity seem to trump everything else, so why even bother finding a global minimum?
Further, once we consider model size as part of our criterion, it is clear from the pigeonhole principle and the definition of Kolmogorov complexity that there can be only a small number of minimal minima, programs of the least size that truly capture a phenomenon. On the same grounds, the more idiosyncratic a situation, the fewer minimal minima we can find to describe it; the more generalized, the more ways it can be compressed, expressed, and explained.
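The counting behind this can be made explicit. A minimal sketch, assuming binary program encodings; the function name is illustrative, not from any formal treatment:

```python
# Toy counting sketch: how many distinct binary programs exist
# below a given length? The pigeonhole principle then bounds how
# many phenomena can have a description that short.

def programs_shorter_than(k_bits):
    # Programs of length 0, 1, ..., k-1 bits:
    # 2^0 + 2^1 + ... + 2^(k-1) = 2^k - 1
    return 2 ** k_bits - 1

# Fewer than 2**10 phenomena can be captured by a program under
# 10 bits, so almost none of the 2**20 strings of length 20 can
# be compressed that far: short descriptions are scarce.
```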
Things are different in problem spaces with shifting landscapes. Some minima with equivalent current performance are more robust to change than others, or allow for quick escape hatches in case of a calamity.
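A hypothetical one-dimensional sketch of this difference: two basins whose minima have identical loss today, but whose curvature determines how much a shift in the objective costs a solution that stays put. All names and constants here are invented for illustration.

```python
# Two minima with identical current loss, differing in curvature.
# When the landscape shifts, the solution sitting in the flat basin
# degrades far less than the one in the sharp basin.

def sharp_loss(x, shift=0.0):
    # Narrow basin: high curvature, steep walls.
    return 100.0 * (x - shift) ** 2

def flat_loss(x, shift=0.0):
    # Wide basin: low curvature, gentle walls.
    return 0.01 * (x - shift) ** 2

# Both minimizers sit at x = 0.0 with loss 0.0 under today's landscape.
# After the objective shifts by 0.5, the stay-put solution in the sharp
# basin pays 100 * 0.25 = 25.0, while the flat basin's pays only 0.0025.
```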
Generalization measured as performance on unseen observations is only a weak proxy for a far more vivid behavior: adaptability to other, unforeseen fitness functions, ones only indirectly similar to environments already encountered.