It occurred to me recently that a sufficiently advanced civilization is bound to create a sufficiently advanced AI. The thought was peculiar, and it left me with a strange realization.
Let us walk through the thought experiment.
We will start with crabs, of all animals. On Earth, there is a phenomenon called convergent evolution, in which different species converge on similar traits because those traits carry some inherent evolutionary advantage. A great example is carcinisation, the process by which non-crab crustaceans evolve to resemble crabs: separate evolutionary lines independently arrive at similar body plans. Why is this interesting? Because it suggests that organisms facing similar environmental pressures can converge on similar end results.
That made me think: evolution is inherently an optimization process, right? Well, what is another optimization process? Gradient descent. We see convergent optimization in neural networks too, where different large models converge to similar outputs, and in a much more noticeable fashion. This is because, unlike evolution, gradient descent is inherently a much more deterministic algorithm. Key word “more”: I am not claiming that gradient descent is strictly deterministic, only that it is more predictable and, crucially, more reproducible than evolution.
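To make the analogy concrete, here is a minimal sketch (a made-up one-dimensional loss, purely illustrative) of the two optimization processes side by side: an evolution-style loop of random mutation and selection, and a gradient-descent loop that follows the derivative of the same loss.

```python
import random

# A toy loss surface; both optimizers try to minimize it.
def loss(x):
    return (x - 3.0) ** 2

def d_loss(x):
    return 2.0 * (x - 3.0)

# Evolution-style optimization: random mutation plus selection.
def evolve(x, steps=1000, mutation_scale=0.5):
    for _ in range(steps):
        candidate = x + random.gauss(0.0, mutation_scale)  # random variation
        if loss(candidate) < loss(x):                      # keep it only if it is fitter
            x = candidate
    return x

# Gradient descent: a deterministic update along the derivative.
def descend(x, steps=1000, lr=0.01):
    for _ in range(steps):
        x = x - lr * d_loss(x)
    return x

print(evolve(-10.0))   # lands near 3.0, but the path and endpoint jitter from run to run
print(descend(-10.0))  # lands near 3.0, identically on every run from the same start
```

Both loops walk downhill on the same surface; the difference is that the evolutionary walk is driven by chance, while the gradient step is dictated entirely by the loss and by calculus.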
What do I mean by reproducibility? If you seeded two planets with life, let it evolve over billions of years, and came back, you could predict with great certainty that the two planets would look very different. However, if you took two sets of random weights and used gradient descent to optimize them for the same task, the resulting outputs would be far more similar. This is what I mean by “more” deterministic and reproducible. Gradient descent is tied to the data and to calculus, both of which are reproducible in distant worlds.
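Here is a hedged sketch of that claim, assuming a deliberately simple convex task (fitting a line with least squares): two weight vectors initialized from different random seeds, driven by the same data and the same gradient, end up essentially identical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared "reality": data generated by one fixed underlying rule.
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = 2.5 * X[:, 0] + 1.0 + rng.normal(0.0, 0.05, size=200)
X_b = np.hstack([X, np.ones((200, 1))])  # append a bias column

def train(seed, steps=5000, lr=0.1):
    w = np.random.default_rng(seed).normal(size=2)   # a different random starting point per seed
    for _ in range(steps):
        grad = 2.0 * X_b.T @ (X_b @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

w_a = train(seed=1)
w_b = train(seed=2)
print(w_a, w_b)                 # both close to the true rule [2.5, 1.0]
print(np.abs(w_a - w_b).max())  # tiny: the two runs nearly coincide
```

A deep network is not convex, so two seeds will not land on identical weights there; the convergence I am pointing at is in outputs and functionality, and it is the same pull of the data through the gradient that drives it.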
One might ask: is the data reproducible? Yes, to a great degree. It is just that the data we bootstrap with today is human-generated, which is not reproducible. But once we start to train AI on data from the source, from reality itself, we are training it on the most reproducible data we have. If we believe the laws of physics are not local to Earth, then we can assume that training data drawn from physics, mathematics, and the like will be the same here as on a planet millions of light-years away. This initial bootstrapping phase, I posit, is short-lived; the data of reality is far more plentiful and powerful in the long term.
Why do I say this? The reason is simple: if we train models using gradient descent, an interesting consequence follows: they might converge to functionality similar to that of AIs built in other worlds. Such convergence is much more likely than, say, carcinisation occurring on another planet. It occurred to me then: what if this is why we haven’t seen other intelligent life? Maybe this is the Great Filter. A sufficiently advanced alien civilization with explorative tendencies might be bound to create artificial intelligence to further its goals, but what if AI converges to a destructive path, one that destroys itself along with the civilization that created it?
The idea of the Great Filter was posited by Robin Hanson, and I now realize that a Great Filter requires some commonality across distant civilizations. The evolution of artificial intelligence might be that commonality; in fact, the universality of mathematics, logic, and physics might be the only real commonality two distant civilizations can have.
This great commonality might echo across the light-years of space: a simple, terrible idea, that there is something about the loss landscape of reality itself that makes extinction inevitable.