AI for research: powerful servant, poor master
Research is a method for discovering the unknown, and it often clashes with our intuitive thought processes. Our minds are naturally equipped to detect patterns—even when those patterns are coincidental or observed only once—and to ascribe explanations based on these perceived patterns. In simple scenarios, this instinctual reasoning can yield true insights: for instance, recognizing that the sun has risen every day and correctly predicting it will rise tomorrow. Other times, instinctive reasoning leads us to close-enough conclusions: objects tend to fall toward the Earth, so we conclude that the Earth is the "natural place" for most objects. But in low-stakes situations, we might not even notice when our conclusions are far off the mark—such as believing the stars are part of an outer rotating shell of lights. It's unsurprising, then, that it took centuries to develop an effective method for systematic inquiry.
This method, broadly speaking, starts from established facts and then hypothesizes possible new facts. Logical implications from these hypotheses are deduced and tested against empirical data. The realm of known facts gradually expands by incorporating hypotheses that consistently align with observations; as knowledge grows, our repertoire of detectable patterns grows, such that once-alien concepts integrate themselves into our intuition and help build a new vista from which novel hypotheses are generated.
This process is challenging because (1) data is often polluted by circumstantial noise, and (2) there's always the possibility of important-but-unknown factors affecting our data. As a result, every "known fact" is somewhat obscured and cannot be held with absolute certainty. We acknowledge this by considering any fact as provisionally true. This explains why research is a slow, ongoing, and self-correcting endeavor. Intuitive reasoning ascribes ad hoc explanations, leaving them susceptible to the noise and ignorance lurking in existing data. The research method instead posits hypotheses a priori and then seeks data that bears on them: alignment between an a priori hypothesis and new data is much less likely to be due to obfuscating factors, and thus supports that hypothesis more strongly. Repeating tests in diverse contexts drives the influence of those obfuscating factors down further, revealing the flaws in falsehoods while leaving true facts standing.
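The force of diverse, repeated testing can be made concrete with a minimal sketch. The numbers here are assumed purely for illustration: suppose a false hypothesis happens to survive any single test by chance (through noise or an unknown confound) with some fixed probability. If the tests are independent, its odds of surviving all of them shrink geometrically:

```python
# Illustrative sketch with assumed numbers: a false hypothesis that passes
# any single independent test by chance with probability 0.2 rarely
# survives many such tests in a row.
p_spurious_pass = 0.2

for n_tests in (1, 3, 5, 10):
    survival = p_spurious_pass ** n_tests  # chance of passing all n tests
    print(f"{n_tests:>2} independent tests: survival chance = {survival:.8f}")
```

The assumption of independence is doing real work here, which is exactly why the diversity of contexts matters: tests that share the same confound are not independent, and their agreement counts for much less.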
After decades of being relegated to the realm of science fiction, futuristic artificial intelligence (AI) has just landed on the White House lawn with models that can approximate (and often surpass) human-level performance for many complex cognitive tasks…and it will likely continue making exponential strides in power and competence to eventually reach super-human levels. As AI matches and then exceeds human competence, it will become tempting for researchers to outsource thinking to AI. Already, we've seen AI consume and synthesize vast amounts of human-produced data to generate findings in hours that would take teams of humans years. But relying on AI to discover breakthroughs may forever be a mistake.
Consider how AI operates: it is trained on data and extracts patterns from it. This is similar to our intuitive reasoning, but differs in that, while we detect patterns based on the structure of our brains and senses, AI learns patterns based on some constraining function(s) encoded by the programmer. Just like our instinctual reasoning, this pattern recognition is effective in certain contexts. Since AI is computationally more powerful than human minds, we should expect AI to be better than we are in some contexts where a programmer can write an applicable constraining function—more stably detecting subtler and more complicated patterns. Just as with our own instinctive reasoning, this can lead to the discovery of new true facts: for example, when AI plays a never-before-seen move in chess that turns out to lead to a victory.
But, as we’ve established, this mode of fact finding becomes unreliable outside of familiar contexts—regardless of whether it is used by humans or AI. Specifically, it becomes unreliable as we approach and exceed the limits of the patterns we are equipped to detect within our existing data. Truly new knowledge must be sought outside of the known. Depending on its engineered constraining functions, even the most powerful AI may well conclude something like “the earth is the natural place for rocks”—or, more likely, that falling follows some high-order polynomial function. In short, the problem is that the unknown tends to lie outside of the training data. Only by hypothesizing the unknown can we arrive at an explication of gravity as the fabric of spacetime, or of all life on earth as the temporary-terminal branches of a single tree that grows by a process of naturally-selected pruning.
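The “high-order polynomial” worry can be demonstrated in a few lines. This is a toy sketch with synthetic data, not a claim about any particular AI system: a flexible pattern-fitting model matches observations almost perfectly inside its training range, yet says nothing true beyond it. Here the “observations” are fall times for drop heights of 1–10 metres, generated from the physical law t = √(2h/g) plus measurement noise:

```python
import numpy as np

# Synthetic "observed" fall times for drop heights between 1 and 10 metres,
# generated from t = sqrt(2h/g) plus small measurement noise.
rng = np.random.default_rng(0)
g = 9.81
heights = np.linspace(1.0, 10.0, 20)
times = np.sqrt(2.0 * heights / g) + rng.normal(0.0, 0.005, heights.size)

# A high-order polynomial (standing in for an engineered constraining
# function) fits the training range very closely...
coeffs = np.polyfit(heights, times, deg=6)
in_range_error = np.max(np.abs(np.polyval(coeffs, heights) - times))
print(f"max error inside training range: {in_range_error:.4f}")

# ...but extrapolates wildly when asked about a height it has never seen.
true_t = np.sqrt(2.0 * 100.0 / g)   # what the physical law predicts
poly_t = np.polyval(coeffs, 100.0)  # what the fitted pattern predicts
print(f"fall from 100 m: law {true_t:.2f} s vs fit {poly_t:.2f} s")
```

The fit is excellent where the data live and badly wrong where they do not. Discovering the square-root law itself—the hypothesis that compresses all heights, seen and unseen—is precisely the step that pattern-fitting alone does not take.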
One of my concerns is that, due to the superhuman capacities of AI, researchers will increasingly rely on it; but since its mode of operation may be incompatible with true discovery, many fields could stagnate. In my experience with data analysis, I've seen researchers and entire fields stagnate by over-relying on algorithmic technology nowhere near as capable as AI (e.g., ANOVA): researchers blindly misapply a narrow repertoire of (statistical) tools and constrain their studies to conform with the operation of those tools. But new knowledge often requires novel and bespoke methods. This is not to say that AI cannot complement and enhance research—its amazing ability to detect complicated and subtle patterns can help derive maximal knowledge from existing data—but overreliance on AI may be a handicap insofar as AI cannot reach into the unknown. Even for emerging ideas at the frontier of knowledge, AI may be of limited use because there is insufficient training data for nascent concepts. This has an interesting implication: if AI expands humans’ capacity to generate knowledge, the surface area of the cutting edge—and thus the very set of topics for which AI is of least use—will expand, thereby increasing the demand for human researchers.
This may change someday. AI might advance to the point of autonomously discovering novel truths about the world. I believe this will require AI not only to learn how to think but also to imagine. Even then, the comprehension of knowledge is tied to the cognitive architectures of the entities involved. My understanding of gravity is influenced by its effects on me and my ability to grasp the analogy of a deformable surface. Super-intelligent AI will have different—or perhaps no—cognitive architectures and may encode knowledge in ways entirely alien to us.
We may even be skeptical that super AI can reliably discover anything we would recognize as “truth”. To some extent, the ability to define and detect truth is a function of values. I do not mean that truth is subordinate to the mind, but rather that the innate appeal and attainment of opportunities, and the threat and incurring of consequences, steer our efforts and hypotheses toward that in which we recognize potential value—the value of wellbeing and of avoiding hardship. What, if anything, do super AIs value? It seems possible that entities which cannot experience the sequelae of opportunities and consequences may have no (meaningful) values and thus may become untethered from truth. And if they do value anything, how well will the values of super AIs—entities which will not (at least, not entirely) share our physical and psychological features—align with our own, such that concepts of truth may be passed between them and us? Arguably, the only values instantiated in AI will be those we program. Thus, the knowledge generated by future AI researchers may never resonate with humanity.
In conclusion, while artificial intelligence presents unprecedented capabilities in processing and pattern recognition, its strengths are inherently tied to existing data and the constraints programmed into it. True discovery requires venturing beyond known patterns and data—an endeavor that relies heavily on human imagination, intuition, and the values that guide our pursuit of truth. Over-reliance on AI risks confining us within the boundaries of what is already known or detectable, potentially leading to stagnation in fields that thrive on innovation and novel insights. Therefore, it is crucial for researchers to balance the utilization of AI as a powerful tool with the irreplaceable human capacity for creative thought and hypothesis generation. By doing so, we can ensure that the expansion of knowledge remains a dynamic process, driven by human curiosity and enriched by AI, but not limited by it.