Applied Data Science

This course begins by introducing data manipulation and cleaning techniques, along with the fundamental abstractions and data structures for data analysis. Next, advanced data visualization techniques are presented. Applied machine learning is then discussed in terms of its techniques and methods, including how it differs from descriptive statistics. Data dimensionality, data clustering, and cluster evaluation are also covered. Examples of predictive modeling methods are presented to illustrate issues related to generalization (e.g., cross-validation and overfitting), followed by advanced techniques such as building ensembles and the practical limitations of predictive modeling. The basics of text mining, including regular expression manipulation, text cleaning, and text preparation for use in machine learning processes, as well as natural language processing methods and text classification, are covered through exercises and examples. Finally, the course presents network analysis techniques and the concepts of connectivity versus robustness, centrality, and betweenness.

VANDERPLAS, J. – Python Data Science Handbook: Essential Tools for Working with Data (1st ed.). O’Reilly Media, Inc., 2016.

Note: This course is offered at the master’s level. At the PhD level, it has additional requirements.


* Standard program. The instructor has the autonomy to make changes.