
What is the learning capacity of a machine?

Image: Freepik

Reproduced from the IMPA Science & Mathematics blog in O Globo, coordinated by Claudio Landim.

André Carlos Ponce de Leon Ferreira de Carvalho – Vice-Director of the Institute of Mathematical and Computer Sciences of the University of São Paulo (ICMC-USP)

As I have already mentioned on this blog, one of the main areas of Artificial Intelligence (AI) is Machine Learning (ML), which studies how a machine can learn a task from a set of data. The machine can be any physical device capable of receiving, manipulating, storing, and sending data, such as your cell phone. But what can the machine learn? What is its learning capacity? Recent studies by a group of mathematicians working on ML indicate that this learning capacity is related to a logical paradox, known as the continuum hypothesis, studied by the Austrian mathematician Kurt Gödel almost 100 years ago [1].
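To make "learning a task from a set of data" concrete, here is a minimal sketch in Python, with made-up numbers of my own (an illustration only, not an example from the studies cited): the machine receives labeled readings and chooses the cut-off value that best separates the two classes, which it can then apply to readings it has never seen.

    # Hypothetical labeled data: sensor readings marked 0 (normal) or 1 (high).
    examples = [(0.9, 0), (1.4, 0), (2.1, 0), (3.8, 1), (4.2, 1), (5.0, 1)]

    def learn_threshold(data):
        """Choose the cut-off that misclassifies the fewest examples."""
        best_t, best_errors = None, len(data) + 1
        for t in sorted(x for x, _ in data):
            errors = sum((x > t) != bool(y) for x, y in data)
            if errors < best_errors:
                best_t, best_errors = t, errors
        return best_t

    threshold = learn_threshold(examples)
    print(f"learned threshold: {threshold}")
    print("prediction for unseen reading 3.5:", int(3.5 > threshold))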

A paradox is a logical statement that seemingly contradicts itself. Examples include the phrases "every rule has an exception" and "I must be cruel in order to be kind." Two Irish playwrights are famous for creating paradoxes: George Bernard Shaw and Oscar Wilde. We Brazilians also like to create paradoxes, such as this one, attributed to the writer Millor Fernandes: "The paradoxes between technique and reality are blatant and indestructible: we remain young in old photos and old in new photos."

Well, let's go back to mathematics and AI. Gödel showed that, using standard mathematical language, certain statements can be proven neither true nor false, and the recent studies indicate that the ability to learn is one such statement.

In the 1870s, one of the fathers of set theory, the Russian-born mathematician Georg Cantor, who lived in Germany for much of his life, showed that not all infinite sets are the same size: the set of all integers (countable) is smaller than the set of all real numbers (the continuum). Cantor also conjectured that there is no set larger than the set of integers and smaller than the set of real numbers; this conjecture is the continuum hypothesis.
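In standard set-theoretic notation (a restatement of the paragraph above, where \aleph_0 denotes the cardinality of the integers and 2^{\aleph_0} the cardinality of the real numbers), Cantor's two claims read:

    \aleph_0 < 2^{\aleph_0}
        % Cantor's theorem: the continuum is strictly larger than the integers
    \neg\exists\, S \ \text{such that}\ \aleph_0 < |S| < 2^{\aleph_0}
        % the continuum hypothesis: no infinity lies strictly between them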

In 1940, Gödel demonstrated that, using the axioms of set theory, the continuum hypothesis cannot be proven false; in 1963, the American mathematician Paul Cohen demonstrated that it cannot be proven true either. An axiom is a statement that we assume to be true, used as a starting point for an argument.

Gödel and Cohen's studies on the continuum hypothesis imply the existence of parallel mathematical universes. In one of them, the continuum hypothesis is consistent with the axioms of set theory, and can therefore be taken as true. In the other, its negation is consistent with the axioms, and the hypothesis can therefore be taken as false. But what does this have to do with ML?

In [2], the authors define the ability to learn as the ability to make predictions about a large dataset (the test subset) from a small one (the training subset). The selection of this training subset, which the authors call compression, should summarize, that is, represent, the most relevant characteristics of the original dataset.
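As a toy illustration of this compression idea, under my own assumptions (made-up data and a deliberately simple predictor, not the construction used in [2]), the sketch below draws a training subset of 10 points from a dataset of 1,000, summarizes it by one representative value per label, and checks how well that summary predicts the unseen remainder:

    import random

    random.seed(0)

    # Hypothetical large dataset: 1,000 points labeled by the cluster they came from.
    big_dataset = [(random.gauss(0, 1), "A") for _ in range(500)] + \
                  [(random.gauss(6, 1), "B") for _ in range(500)]

    # "Compression": keep only a tiny training subset of the data.
    training_subset = random.sample(big_dataset, 10)

    # Summarize the subset by one representative value (the mean) per label.
    def summarize(subset):
        sums, counts = {}, {}
        for x, label in subset:
            sums[label] = sums.get(label, 0.0) + x
            counts[label] = counts.get(label, 0) + 1
        return {label: sums[label] / counts[label] for label in sums}

    means = summarize(training_subset)

    # Predict by the nearest representative; measure accuracy on the unseen rest.
    test_subset = [pair for pair in big_dataset if pair not in training_subset]
    correct = sum(min(means, key=lambda l: abs(x - means[l])) == label
                  for x, label in test_subset)
    print(f"accuracy on unseen data: {correct / len(test_subset):.2%}")

If the 10 training points summarize the two clusters well, the predictor built from them generalizes to the other 990 points; that is the sense of "learning" at stake here.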

The connection to Cantor's continuum hypothesis is that there are infinitely many ways to choose the training set. If the continuum hypothesis is true, it is possible to find the correct predictive model, capable of exhibiting good predictive performance for unseen data, that is, good generalization capacity. If the hypothesis is false, no finite training set will be sufficient.

To read the full text, visit the newspaper's website.