
Machine Learning - A.Y. 2019/2020

Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from example inputs in order to make data-driven predictions or decisions, rather than following strictly static program instructions. Machine learning algorithms can be applied to virtually any scientific and non-scientific field (health, security and cyber-security, management, finance, automation, robotics, marketing, ...).

Instructor                        Telephone     Office hours   Office
Paola Velardi                     06-49918356   send e-mail    Via Salaria 113, 3rd floor, room 3412
Stefano Faralli (Lab Assistant)                 send e-mail    Unitelma Sapienza

Course schedule

FIRST semester:

When   Where
Monday    14:00-16:30   Aula 1, Castro Laurenziano
Thursday  14:00-16:30   Aula 1, Castro Laurenziano

Important Notes

The course is taught in English. Attending classes is HIGHLY recommended (homework assignments, mid-term, laboratory).

Homework assignments and self-assessment tests are distributed via the Google group; you MUST register.

Summary of Course Topics

The course introduces motivations, paradigms and applications of machine learning. This is to be considered an introductory course. An advanced course is offered during the second semester: Deep Learning and Applied Artificial Intelligence.

Topics. Supervised learning: decision trees, instance-based learning, naïve Bayes, support vector machines, neural networks, introduction to deep learning, ensemble methods. Unsupervised learning: clustering, association rules. Semi-supervised learning: reinforcement learning. Genetic algorithms and genetic programming. Building machine learning systems: feature engineering, model selection, hyperparameter tuning, error analysis.


(to be updated)

In-class labs (bring your computer on lab days!) are dedicated to learning to design practical machine learning systems: feature engineering, model selection, error analysis. We will mostly use the scikit-learn library and TensorFlow.
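As a taste of the lab workflow (feature scaling, model selection, hyperparameter tuning with scikit-learn), here is a minimal sketch. The iris dataset, the SVC classifier, and the parameter grid are illustrative choices, not the official lab material.

```python
# A minimal sketch of a practical ML workflow in scikit-learn:
# feature scaling + classifier in a Pipeline, hyperparameter tuning
# with cross-validated grid search, evaluation on a held-out test set.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

pipe = Pipeline([("scale", StandardScaler()), ("clf", SVC())])
grid = GridSearchCV(pipe, {"clf__C": [0.1, 1, 10]}, cv=5)
grid.fit(X_train, y_train)

print("best C:", grid.best_params_["clf__C"])
print("test accuracy:", round(grid.score(X_test, y_test), 3))
```

The same pattern (Pipeline + GridSearchCV) scales to the larger feature-engineering and model-selection exercises done in the labs.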

After a couple of introductory labs, labs will be organized in challenges.

Lab material (slides, datasets for challenges) will be provided before lab days via the Google group. Lab assistant is Dr. Stefano Faralli.

Pre-requisites (IMPORTANT!)

Students must be familiar with the Python programming language.
Familiarity with Python arrays is a fundamental prerequisite!
As a quick reference for the expected language prerequisites, please refer to the following "cheat sheet": https://perso.limsi.fr/pointal/_media/python:cours:mementopython3-english.pdf
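As a quick self-check: if the following snippet (lists, slicing, comprehensions, dictionaries) is not immediately clear, please review the cheat sheet above. The variable names and values are invented for illustration.

```python
# Quick self-check of the assumed Python skills:
# lists, slicing, list comprehensions, and dictionaries.
scores = [0.91, 0.85, 0.78, 0.96]

first_two = scores[:2]                        # slicing
mean = sum(scores) / len(scores)              # arithmetic mean
above_avg = [s for s in scores if s > mean]   # list comprehension
counts = {"train": 120, "test": 30}           # dictionary

print(first_two)                              # [0.91, 0.85]
print(above_avg)                              # [0.91, 0.96]
print(counts["train"] + counts["test"])       # 150
```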

If you wish to learn Python from scratch, or just to refine your programming skills, we suggest reading the following documentation: https://docs.python.org/3/

Please be aware that we will NOT spend time during the course covering background knowledge that master's students in Computer Science are expected to already have!


There are plenty of on-line books and resources on Machine Learning. We list here some of the most widely used textbooks:

Additional useful texts:


A dataset search engine: https://toolbox.google.com/datasetsearch

Exam rules (read carefully)

  • Written exam on course material (50% of final grade)
  • Scikit-learn/Keras project (or other tools/platforms; however, labs will use scikit-learn and Keras) (25% of final grade)
  • Two in-class lab challenges, end of October, end of November (25% of final grade).
  • Self-assessment questions are distributed after each lesson to members of the Google group. The written exam will include closed questions and open questions similar to those in Self-assessments.

IMPORTANT: the exam questionnaire will include a set of (relatively simple) closed questions and 2-4 open questions (depending on their complexity), both on practical and theoretical issues. Closed questions are usually simple but act as a FILTER: students who do not answer at least 75% of the closed questions correctly will NOT pass the exam.

  • IMPORTANT: To assess the number of participants in each written exam, a Google form will be sent via the Google group about two weeks BEFORE the exam date. Please check your @studenti mail on a regular basis. Please note that registering for a test date via the Google form does not exempt you from registering on INFOSTUD. I cannot register your final grade in a given exam session IF YOU DID NOT REGISTER on INFOSTUD for that session. Furthermore, to register a grade I need both the result of the written test AND the project (and they must both be >=18). However, you do not need to deliver both simultaneously. You can, e.g., pass the test in January and deliver the project in June; I will then register the grade in June.
  • IMPORTANT: during the test you cannot use ANY material. You need to bring pen, paper, and a calculator (a cellular phone is OK, but it must be visible on the desk).
  • IMPORTANT: INFOSTUD sessions have a start date and an end date. This is because I cannot register a grade until you pass the test, deliver the project, and pass the challenge, so there is no single date I can establish. Usually, you can only see the start date of an exam session. THIS IS NOT the date of the test! Usually, there are two test dates within any exam session. You can register for a test through the Google form I circulate before each test date. Please remember to also register on INFOSTUD IF you believe that during the session (winter or summer) you will be able to obtain a final grade, based on the result of a test, the project, and the challenge.


The spring 2016 project was a competition among student teams (max 3 students per team). The task was to predict the winner of a Role Playing Game (RPG) with direct clashes. Students were given a large dataset with detailed information on thousands of games, including the IDs of the two competitors, the date of the match, and the winner ID. Students delivered their predictors by the end of June (according to precise project specifications). Instructors then fed the systems with the details of additional games (not in the learning set) and computed the precision of each system at predicting the winner ID.

The project description is found here. The learning dataset can be downloaded here.
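For intuition only, here is a hypothetical sketch of how such a winner predictor might look: train a classifier on features derived from past matches (here, the difference of the two competitors' empirical win rates). The synthetic data, the feature choice, and the model are illustrative assumptions, not the official project specification.

```python
# Hypothetical sketch of the RPG-winner task: predict, from past matches,
# whether competitor a beats competitor b. Data is synthetic: each player
# has a hidden skill, and the stronger player always wins.
import random
from sklearn.linear_model import LogisticRegression

random.seed(0)
strength = {p: random.random() for p in range(20)}   # hidden skill per player

def make_match():
    a, b = random.sample(range(20), 2)
    return a, b, int(strength[a] > strength[b])      # 1 if competitor a wins

matches = [make_match() for _ in range(2000)]

# Feature engineering: each player's empirical win rate in the training matches.
wins, played = {}, {}
for a, b, a_wins in matches:
    played[a] = played.get(a, 0) + 1
    played[b] = played.get(b, 0) + 1
    wins[a] = wins.get(a, 0) + a_wins
    wins[b] = wins.get(b, 0) + (1 - a_wins)

def winrate(p):
    return wins.get(p, 0) / played.get(p, 1)

X = [[winrate(a) - winrate(b)] for a, b, _ in matches]
y = [w for _, _, w in matches]

clf = LogisticRegression().fit(X, y)
print("training accuracy:", round(clf.score(X, y), 3))
```

A real submission would of course evaluate on held-out matches, not on the training set, and would engineer richer features from the full game records.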

Projects 2017-18, 2018-19 and 2019-20


How a project is evaluated:

  • Simple problem, easy-to-model and easy-to-describe instances, small dataset, standard ML learning algorithms: 20-24
  • Simple problem, feature engineering needed, medium-large dataset, use of algorithms on available platforms, use of scikit-learn or a more efficient implementation of an existing algorithm (e.g., some ad-hoc software developed), performance evaluation: up to 25-28
  • Original problem, complex dataset with non-trivial feature engineering, thorough data analysis and feature/hyperparameter fitting, non-straightforward use of algorithms or a new algorithm or an ad-hoc implementation, performance evaluation and insight on results: up to 30L
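The "performance evaluation" the criteria above ask for can be as simple as k-fold cross-validation reported with mean and standard deviation. The dataset and classifier below are illustrative placeholders, not a required setup.

```python
# Reporting performance with 5-fold cross-validation:
# mean accuracy plus/minus its standard deviation across folds.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the spread across folds, not just a single number, is part of what distinguishes a 25-28 project from a 20-24 one.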

Three very good projects: Deep-Reinforcement-Learning-Proyect-Documentation-Alfonso-Oriola.pdf, A Framework for Genetic Algorithms, RainForestML2016Pantea.pdf, ProjectAlessi-Colace-Facial-expressions.zip

NOTE: Please read carefully how a project is evaluated, and read the project examples above (they were rated 30L). Once a project is delivered and evaluated, students cannot complain that the grade is too low. We are providing here clear indications of what is expected to get the maximum grade. We also expect original work: plagiarism will be punished.

Google Group


Please subscribe to the group Machine Learning 2019-20 on Google Groups.

Slides and course materials (download only those with date=2019)

Timetable (topic, slides, and suggested readings; pointers are also in the slides):

  • 2019: Introduction to ML. Course syllabus and course organization. Slides: 1.ML2019Introductionlight.pptx
  • 2019: Building ML systems. Suggested reading: https://ai.stanford.edu/~nilsson/MLBOOK.pdf (Chapter 1)
  • Classifiers: Decision Trees. Suggested readings: Decision Trees: http://www.cs.princeton.edu/courses/archive/spr07/cos424/papers/mitchell-dectrees.pdf; Random Forests: http://www.math.mcgill.ca/yyang/resources/doc/randomforest.pdf
  • 2019: Practical ML: feature engineering. Slides: 2b.FeatureEngineering.pptx. See also the "Google AutoML" project for hyperparameter tuning with structured data: https://cloud.google.com/automl-tables/
  • 2019: Performance Evaluation: error estimates, confidence intervals, one/two-tail tests. Slides: 4.evaluation.pptx. Suggested reading: chapter5-ml-EVALUATION.pdf
  • 2019: Neural Networks
  • 2019: Deep Learning (convolutional NNs and denoising autoencoders). Suggested reading: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/; see also this nice video: https://www.youtube.com/watch?v=aircAruvnKk
  • 2019: Ensemble methods (bagging, boosting). Slides: 8.ensembles.pptx
  • 2019: Support Vector Machines. Slides: 9.svm.pptx. Suggested reading: http://cs229.stanford.edu/notes/cs229-notes3.pdf
  • 2019: Probabilistic learning: Maximum Likelihood Learning, Maximum A Posteriori Estimation, Naive Bayes. Slides: 10.naivebayes.pdf
  • 2019: Unsupervised learning: Clustering. Slides: 11.clustering.pptx. Suggested reading: https://www.researchgate.net/publication/282523039_A_Comprehensive_Survey_of_Clustering_Algorithms
  • Unsupervised learning: Association Rules
  • 2019: Unsupervised Learning: Reinforcement Learning and Q-Learning
  • Unsupervised Learning: Genetic Algorithms
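As a worked example of the error-estimate and confidence-interval material from the Performance Evaluation lecture, here is the usual 95% interval for a classifier's error rate under the normal approximation to the binomial. The test-set size and error count are made-up numbers.

```python
# 95% confidence interval for an observed error rate p on n test examples:
# p +/- z * sqrt(p * (1 - p) / n), with z = 1.96 for a two-tail 95% interval.
import math

n = 200          # test-set size
errors = 30      # misclassified test examples
p = errors / n   # observed error rate
z = 1.96         # z-value for a 95% two-tail interval

margin = z * math.sqrt(p * (1 - p) / n)
low, high = p - margin, p + margin
print(f"error = {p:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
# prints: error = 0.150, 95% CI = [0.101, 0.199]
```

The same formula underlies comparing two classifiers: if their intervals do not overlap, the difference is significant at that confidence level (the lecture's hypothesis tests make this precise).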

Syllabus (2018-19)

  • What is machine learning. Types of learning.
  • Workflow of ML systems.
  • Classifiers. Decision Tree Learning. Random Forest
  • Feature engineering
  • Evaluation: performance measures, confidence intervals and hypothesis testing
  • Ensemble methods
  • Artificial Neural Networks
  • Deep learning (Convolutional networks, Denoising Autoencoders)
  • Support Vector Machines
  • Maximum Likelihood Learning (MLE, MAP) and Naive Bayes
  • Unsupervised Rule learning: Apriori algorithm and frequent itemset mining
  • Reinforcement learning and Q-Learning, Deep Q
  • Tools: scikit-learn, TensorFlow, Keras
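To illustrate the tabular Q-learning update listed in the syllabus, here is a toy sketch: an agent in a 1-D corridor learns to walk right toward a reward. The environment, reward, and parameters are invented for illustration.

```python
# Toy tabular Q-learning: states 0..4 on a line, reward on entering the
# terminal state 4. Actions move left (-1) or right (+1); walls reflect.
import random

N_STATES = 5          # reward is obtained on entering state 4 (terminal)
ACTIONS = [-1, +1]    # move left / move right
ALPHA, GAMMA = 0.5, 0.9

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
random.seed(1)

for _ in range(500):                    # training episodes
    s = 0
    while s != N_STATES - 1:
        a = random.choice(ACTIONS)      # random exploration (Q-learning is off-policy)
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update: Q(s,a) <- Q(s,a) + alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, x)] for x in ACTIONS) - Q[(s, a)])
        s = s2

# The greedy policy should now choose "right" (+1) in every non-terminal state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

Note that the behavior policy here is purely random; Q-learning still converges to the optimal greedy policy because the update bootstraps from max over next-state actions, which is what makes it off-policy.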
Topic attachments
Attachment                                    Size      Date               Who
IntuitionNN.pptx                              7685.9 K  2019-11-12 09:20   PaolaVelardi
ProjectAlessi-Colace-Facial-expressions.zip   2340.2 K  2019-07-15 16:20   PaolaVelardi
Topic revision: r281 - 2020-04-01 - PaolaVelardi
