Google’s AI beats humans in decades-old math problem

On December 14th, 2023, Google published a paper in Nature introducing FunSearch, an AI system that searches for new solutions in maths and computer science. FunSearch works by pairing a pre-trained LLM (they use Google’s PaLM 2, an LLM similar to the ones behind ChatGPT) with an automated “evaluator”, which checks the LLM’s responses and filters out hallucinations and incorrect ideas. By “conversing” back-and-forth between the LLM and the evaluator, the initial responses evolve into better ones over many iterations.
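The back-and-forth described above can be pictured with a toy loop. This is only a minimal sketch of the general propose-and-evaluate idea, not Google’s implementation: the `propose` function is a hypothetical stand-in for the LLM (it just makes a random edit), and `evaluate`, `TARGET` and `search` are made-up names with a made-up objective chosen purely for illustration.

```python
import random
from typing import List, Tuple

Candidate = Tuple[int, ...]

# Hypothetical toy objective: the evaluator rewards matching this hidden tuple.
TARGET: Candidate = (3, 1, 4, 1, 5, 9, 2, 6)


def evaluate(candidate: Candidate) -> int:
    """Automated evaluator: deterministic score, higher is better."""
    return sum(1 for a, b in zip(candidate, TARGET) if a == b)


def propose(parent: Candidate, rng: random.Random) -> Candidate:
    """Stand-in for the LLM: return a slightly modified copy of the parent."""
    child = list(parent)
    position = rng.randrange(len(child))
    child[position] = rng.randrange(10)
    return tuple(child)


def search(iterations: int = 2000, pool_size: int = 5, seed: int = 0) -> Tuple[int, Candidate]:
    """Repeatedly propose candidates, score them, and keep only the best few."""
    rng = random.Random(seed)
    start: Candidate = (0,) * len(TARGET)
    pool: List[Tuple[int, Candidate]] = [(evaluate(start), start)]
    for _ in range(iterations):
        _, parent = rng.choice(pool)           # pick a surviving candidate
        child = propose(parent, rng)           # the "LLM" suggests a variation
        pool.append((evaluate(child), child))  # the evaluator scores it
        pool.sort(reverse=True)                # best candidates first
        del pool[pool_size:]                   # weak candidates are discarded
    return pool[0]


if __name__ == "__main__":
    best_score, best = search()
    print(best_score, best)
```

The point of the sketch is the division of labour: the proposer is allowed to be wrong, because nothing survives unless the evaluator scores it well.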

In FunSearch, the evaluator is not another LLM but an automated procedure that runs and scores each proposed solution. The related concept of having one LLM’s response evaluated by another LLM is not a new one, however. There is a community of engineers who have coined the term GPT Agents: each agent is delegated a certain role, and multiple agents are linked together to communicate with each other. The more specific the role an agent is assigned, the more accurately it tends to perform it. Through FunSearch, Google has demonstrated the immense potential of chaining an LLM with an evaluator.

What sets FunSearch apart is its ability to show how its solutions were produced, rather than just giving an answer. FunSearch outputs its answers as computer programs, which means the logic behind each solution can be inspected line by line.

Changing the field of mathematics

FunSearch discovered new solutions for a longstanding open problem in mathematics, called the “cap set” problem. The problem consists of finding the largest set of points (cap set) in a high-dimensional grid, where no three points lie on a line. The number of possibilities that mathematicians need to consider is greater than the number of atoms in the universe.
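To make the problem concrete, here is a small sketch of how a candidate cap set could be checked. It assumes the standard formulation, in which the “grid” consists of points whose n coordinates each take the value 0, 1 or 2, and three distinct points lie on a line exactly when their coordinates sum to a multiple of 3 in every position. The function names are illustrative, and the brute-force search is only feasible for very small n.

```python
from itertools import combinations, product
from typing import Iterable, Tuple

Point = Tuple[int, ...]


def is_cap_set(points: Iterable[Point]) -> bool:
    """Return True if no three distinct points lie on a common line."""
    pts = set(points)
    for a, b in combinations(pts, 2):
        # The only point completing a line through a and b is -(a + b) mod 3.
        c = tuple((-(x + y)) % 3 for x, y in zip(a, b))
        if c in pts:
            return False
    return True


def largest_cap_set_size(n: int) -> int:
    """Brute-force the largest cap set in dimension n (tiny n only)."""
    grid = list(product(range(3), repeat=n))
    for size in range(len(grid), 0, -1):
        if any(is_cap_set(subset) for subset in combinations(grid, size)):
            return size
    return 0


if __name__ == "__main__":
    print(is_cap_set([(0, 0), (0, 1), (1, 0), (1, 1)]))  # a valid cap set
    print(is_cap_set([(0, 0), (1, 1), (2, 2)]))          # these three form a line
    print(largest_cap_set_size(2))
```

For n = 2 the grid has only 9 points and the brute-force search finds a largest cap set of size 4. The number of subsets to try doubles with every extra point, and the number of points triples with every extra dimension, which is why exhaustive search stops being an option almost immediately.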

Mathematics and computer science are complex fields, and new solutions to longstanding problems are hard to come by. Nevertheless, FunSearch has discovered new, more effective solutions to problems in both fields.

“This is the first time anyone has shown that an LLM-based system can go beyond what was known by mathematicians and computer scientists”
Pushmeet Kohli (Google DeepMind, London)

Applications in medicine

FunSearch is one of the first examples showing that once we put measures in place to guard against hallucinations, the potential for new scientific discoveries is truly immense.

In the field of medicine, we cannot rely on information that has no factual grounding. Hallucination is one of the biggest challenges preventing the adoption of LLMs in practice. What if we could have ChatGPT refer only to factual sources when responding to questions? What if ChatGPT could be designed from the ground up by physicians who put trust at the center of its values?

Ever since ChatGPT was released in November 2022, the whole world has jumped into building AI solutions. Several healthcare companies are already using the latest innovations in LLMs to solve problems in healthcare. You may have heard of some of them already, such as AvoMD or Abridge AI.

We don’t need to wait for the next decade. Year by year, there will be more innovations and advancements in the application of LLMs in healthcare. These technological leaps promise to revolutionize our approach to patient care and research. The future of AI in medicine looks bright, and I’m excited for what’s to come in 2024.