Pi Center develops AI project in partnership with Globo.

Members of the Pi Center presented the project to Globo's technical staff.

Aware of the prospects in the entertainment sector, the Pi Center (IMPA's Center for Projects and Innovation) has collaborated with Globo on a project focused on the automatic extraction and enrichment of metadata using artificial intelligence models. This work has the potential to support the company in its business challenges, enabling improvements to its recommendation systems.

This month, researchers and students working at the center visited the company to present the project to some of its technical teams.

Along with the data provided by Globo, the group aggregated multiple metadata databases to build a system containing more than 1.5 million films and series. From this vast amount of information, the researchers and students developed machine learning algorithms to extract keywords that characterize each title accurately and in detail. Using these descriptions, the group developed methods for recommending titles based on keywords and other available technical information.
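As a rough illustration of keyword-based recommendation, the sketch below ranks titles by how many keywords they share with a query title, using Jaccard similarity. This is an assumption made for illustration only, not the method developed by the Pi Center; the function names, titles, and keywords are all hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Fraction of keywords shared by two titles (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(query: str, catalog: dict[str, set[str]], k: int = 2) -> list[str]:
    """Return the k titles whose keyword sets are most similar to the query title."""
    scores = {
        title: jaccard(catalog[query], keywords)
        for title, keywords in catalog.items()
        if title != query
    }
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical catalog: titles and keywords are invented for illustration.
catalog = {
    "Title A": {"crime", "rio de janeiro", "police", "drama"},
    "Title B": {"crime", "police", "thriller"},
    "Title C": {"romance", "comedy", "rio de janeiro"},
    "Title D": {"documentary", "football"},
}

print(recommend("Title A", catalog))  # ['Title B', 'Title C']
```

A ranking of this kind is easy to interpret, because each recommendation can be explained by the keywords the two titles have in common.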

IMPA researcher Paulo Orenstein presented an executive summary of the project. “Our primary objective was to extract a deep understanding from a vast amount of information and, from that, create a consistent and interpretable recommendation system that will generate value for Globo's consumer audience.”

Besides Orenstein, the meeting was attended by researcher Roberto Imbuzeiro, technologist Roberto Beauclair, postdoctoral researcher Lucas Nissenbaum, and fellows Alex Akira, Lucas Resende, Lucas Schwengber, Thiago Ramos, and Rodrigo Schuller.

Pi Center fellows Lucas Resende, Lucas Schwengber, Thiago Ramos, and Alex Akira.

The first stage consisted of enriching the data provided by Globo: collaborators from the Pi Center compiled and aggregated multiple metadata databases. Alex Akira, one of the project members, emphasized that "aggregation errors have a very high cost in all subsequent stages." The group therefore worked to eliminate recurring problems in this step, such as telling apart different films that have much data in common.
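The aggregation problem described here can be sketched in a few lines. The example below is a minimal illustration, not the Pi Center's model: it links two records only when their normalized titles are nearly identical and their release years match, so that two distinct films sharing a title are kept apart. The record fields, the threshold, and the sample titles are hypothetical.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase and strip accents and punctuation so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    kept = "".join(c if c.isalnum() or c.isspace() else " " for c in no_accents)
    return " ".join(kept.lower().split())


def same_title(a: dict, b: dict, threshold: float = 0.9) -> bool:
    """Link two records only if titles are near-identical AND release years agree."""
    similarity = SequenceMatcher(
        None, normalize(a["title"]), normalize(b["title"])
    ).ratio()
    return similarity >= threshold and a["year"] == b["year"]


# Hypothetical records from two sources; the titles are invented.
source_1 = {"title": "Título de Exemplo", "year": 1998}
source_2 = {"title": "titulo de exemplo!", "year": 1998}
remake = {"title": "Título de Exemplo", "year": 2015}

print(same_title(source_1, source_2))  # True: same film, different spelling
print(same_title(source_1, remake))    # False: same title, different film
```

Requiring agreement on a second field such as the year is one simple guard against merging different films that share much of their data; a production system would likely weigh many more fields.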

Keyword extraction is another fundamental pillar of the system developed by the Pi Center. In addition to developing its own models to accurately link content, the group distributed keywords into different categories, such as genre, themes, technical structure, people involved, places depicted, and much more.

“Having a broad database of content metadata is fundamental to our business, and being able to enhance this database through automatic extraction is a major advancement. Furthermore, establishing partnerships with relevant research centers like IMPA accelerates development and brings significant value to our data-driven solutions,” says Carlos Octávio Queiroz, Director of Corporate Strategy and Architecture at Globo.

In addition to its contribution to the market, the project inspired new ideas for academic research, Lucas Nissenbaum noted. “Working on the system, we started thinking about new approaches to the record linkage problem, a classic problem in data science that seeks fast and accurate ways to aggregate databases about the same objects, but with different information in each one. It's a very mathematically rich field, where we can make a concrete contribution based on the work we developed this semester,” he explained.