35th CBM: Project uses machine learning to combat sexual abuse
In the thematic session “Vision and Computer Graphics”, held this Monday (28) at the 35th Brazilian Mathematics Colloquium (CBM), Professor Leo Ribeiro of the Institute of Mathematical and Computer Sciences of the University of São Paulo (ICMC/USP) presented a project that can help detect sexual abuse without access to sensitive data. Based on machine learning and neural networks, the work proposes training models without using child sexual abuse imagery (CSAI).
To protect sensitive information, the team trained the models on features rather than on the images themselves. “The idea is to extract statistical data, mostly automated, from the Federal Police database. How many people are usually in these photos? What is the perceived gender of the people who appear in them? What is the average age of the people? Do we usually have children and adults? Do we usually have only children? Do we usually have only adults in these images? This type of general information that doesn't describe any individual image is something we can obtain,” Leo explained.
Without access to confidential data, the idea is to use a test database to train models that can then be applied to real-world tasks. “We will have limited hardware, so these networks can't be very large. They can't be LMMs, for example, unless they are heavily quantized; that's the maximum the police can run locally. We can't use the cloud at all. And we also have general problems with the appearance of the images. So, with low lighting, for instance, the distribution of real images is very different from the distribution of the public data that we have to train our models.”
The research aims to address several problems, among them the glorification of crime and pedophilia through the dissemination of these images and the psychological harm to law enforcement officers who have to work with this database and with the storage of these images. The team seeks to mitigate these problems through automatic recognition, screening, and data analysis.
IMPA doctoral student Daniel Perazzo attended the lecture and drew attention to the strategies for obtaining data without resorting to sensitive materials. “I found the lecture very interesting because of the importance of the problem and also because of how the team overcame the fact that they couldn't access the data: how they cooperated with the Federal Police and how the professor's team developed a very interesting technique for the problem they had.”