Degree | Type | Year |
---|---|---|
2502441 Computer Engineering | OB | 3 |
2502441 Computer Engineering | OT | 4 |
The teaching groups and their languages are listed in the table at the end of this document.
Before taking this course, students are recommended to have achieved basic competences in Algebra, Calculus, Discrete Mathematics, Fundamentals of Computers and Programming Methodology (first year), as well as in Artificial Intelligence, Statistics and Programming Lab (second year).
The Course on Machine Learning belongs to the "Computing" mention, along with other subjects such as "Knowledge, Reasoning and Uncertainty", "Computer Vision" and "Robotics, Language and Planning". Because of its contents, the subject is intended not only for students who follow the "Computing" mention but for any student of the Computer Engineering degree, since it is closely related to the second-year subject "Artificial Intelligence". Given the strong mathematical content of this Course, students are also strongly advised to understand and be comfortable with the mathematical concepts explained in "Calculus", "Algebra" and "Discrete Mathematics" (first year) and "Statistics" (second year).
The course aims both to expand some of the topics developed in "Artificial Intelligence" and to introduce new problems associated with AI, mainly the learning of concepts and trends from data. It trains students to become data engineers/scientists, one of the occupations with the brightest prospects and in growing demand at an increasing number of companies, including Facebook, Google, Microsoft and Amazon, to cite but a few. The international demand for data engineering/science professionals is expected to grow exponentially, especially because of the growth in the generation of massive data. The main objective of the Course is therefore to teach how to find a good solution (finding the best one is sometimes impossible) to data analysis problems in different contexts, by identifying the best knowledge representation and applying the most appropriate technique to automatically generate mathematical models that explain the observed data with an acceptable deviation.
The contents taught in this Course are also taught at Stanford, Toronto, Imperial College London, MIT, Carnegie Mellon and Berkeley, to name just the most representative universities. On the one hand, the student therefore has the opportunity to acquire knowledge and skills comparable to those taught at the best universities. On the other hand, the student must be aware that this knowledge has an inherent mathematical difficulty, which requires considerable study and dedication. The Course not only teaches the most important contents needed to become a data engineer, but also forms a curriculum line that widens the range of jobs available after the degree and provides the methodological bases needed to pursue a Master's degree in data engineering/science or artificial intelligence.
If you are looking for a Course that opens up an international labor market and teaches the machine learning algorithms most used not only in the large technology companies mentioned above but also in many data analysis SMEs and spin-offs in our country, this Course will not disappoint you, provided you bring both attitude and aptitude.
The objectives of the Course can be summarized as follows:
Knowledge:
- Describe the basic techniques of machine learning.
- List the essential steps of different machine learning algorithms.
- Identify the advantages and disadvantages of the learning algorithms.
- Solve problems by applying different machine learning techniques to find the optimal solution.
- Understand the results and limitations of each learning technique in different case studies.
- Know how to choose the most appropriate learning algorithm to solve contextualized problems.
Skills:
- Recognize situations in which the application of machine learning algorithms may be adequate.
- Analyze the problem to solve and design the optimal solution applying the learned techniques.
- Write technical documents related to the analysis and solution of a problem.
- Program the basic algorithms to solve the proposed problems.
- Evaluate the results of the implemented solution and propose possible improvements.
- Defend and argue the decisions taken in the solution of proposed problems.
UNIT 1: INTRODUCTION
1.1 Basic concepts
1.2 History of machine learning
UNIT 2: DATA REGRESSION
2.1 Linear regression and gradient descent
2.2 Regularization and polynomial regression
UNIT 3: DATA CLASSIFICATION
3.1 Logistic regression
3.2 Support vector machines
UNIT 4: BIOINSPIRED REGRESSION AND CLASSIFICATION
4.1 Multilayer Perceptron
4.2 Backpropagation
UNIT 5: GROUPING DATA
5.1 Data memorization: lazy learning
5.2 Data clustering: k-means and Expectation-Maximization
Title | Hours | ECTS | Learning Outcomes |
---|---|---|---|
Type: Directed | |||
MD0: Theoretical contents and seminars | 12 | 0.48 | 5, 6, 8, 3, 7 |
MD1: Problem solving | 10 | 0.4 | 5, 2, 8, 7 |
MD2: Solution of Practical projects | 6 | 0.24 | 1, 5, 6, 2, 8, 4, 7 |
MD3: Kaggle Case | 12 | 0.48 | 5, 2, 8, 3, 7 |
Type: Supervised | |||
MD2: Project Programming | 6 | 0.24 | 1, 5, 6, 2, 8, 4 |
MD3: Kaggle Case | 12 | 0.48 | 1, 5, 6, 2, 8, 3, 4, 7 |
Type: Autonomous | |||
MD0: Individual study | 12 | 0.48 | 5, 6, 2, 8, 3, 7 |
MD1: Resolving problems (individually) | 24 | 0.96 | 5, 6, 2, 8, 3, 7 |
MD2: Solving practical cases (in group) | 16 | 0.64 | 1, 5, 6, 2, 8, 4, 7 |
MD3: Practical description in Python of a machine learning case | 32 | 1.28 | 5, 2, 8, 7 |
All the subject information and related documents that students need can be found on the Caronte page (https://caronte.uab.cat/course/view.php?id=95), in the subject menu Machine Learning (102787). The page is used to access the materials, manage the practice groups, make the corresponding deliveries, check grades, communicate with the teachers, etc. To be able to use it, you must do the following steps:
In the development of the subject, seven types of teaching activities can be distinguished:
MD0 Presentation of theory content: Presentation of the theoretical content to be worked on in the subject. Students must prepare these contents before class by reading texts, searching for information, etc. The contents presented are directly related to the problems, projects and seminars proposed in the other teaching activities, and are the basis on which those activities are developed. The contents are available on the Caronte page (presentations and videos) and consist of two parts. The first is a presentation of the main theoretical and mathematical concepts related to specific machine learning tasks; this syllabus is the basis of the theory exam of the subject (see the evaluation section of this teaching guide). The second is Python code in Jupyter notebooks that illustrates the coding details and the libraries needed to implement, in a practical case, the main concepts seen in the previous hour. Students can then watch the videos of the classes, download the presentations and the Python notebooks, and run all the code on their own computers, experimenting with the various parameters to understand why particular configurations of the algorithms explained in the subject achieve different performance and accuracy on a specific database.
MD1 Computational problem solving: Submission of up to 3 problems implemented in a Jupyter notebook. Each theory topic is accompanied by a list of notebooks that students work on during the problem sessions and may optionally hand in. These activities allow students to deepen and personalize their theoretical knowledge through a specific numerical case. The sessions consider example data sets that require designing a solution using the methods seen in the theory classes. It is not possible to follow the problem classes without following the contents of the theory classes. The goal of these sessions is to acquire the skills needed to solve the problems, which must be submitted through the delivery mechanism indicated on the subject's website (Caronte area).
MD2 Implementation of a short guided group project: Completion of 1 guided practice to deepen the applied aspects of the theory. The practical part of the subject is completed with practical sessions in which students solve specific problems of a certain complexity, implemented in Python. These projects are solved in small groups of 2-4 people, in which each member does one part and shares it with the rest to reach the final solution. The working groups must be maintained until the middle of the course and must be self-managed: distribution of roles, work planning, assignment of tasks, management of available resources, conflicts, etc. Although the teacher guides the learning process, the teacher's intervention in the management of the groups is minimal. The groups develop the project independently, and the teacher dedicates the practice sessions mainly to monitoring the status of the project, pointing out errors to be corrected, proposing improvements, etc. Doubts about the implementation of the practicals should be raised in the Caronte forum, where other students can answer them.
MD3 Solution of a Kaggle practical case: Each group of 1-2 people creates a Jupyter notebook explaining the steps taken to solve a machine learning problem. The projects are applied to selected databases from the Kaggle platform (https://www.kaggle.com/search?q=machine+learning) and consist of three parts: an explanation of the most important attributes of the database and of the attribute to predict/classify; a brief description of the machine learning method applied, along with the chosen parameters; and a presentation of the results obtained. Examples of Jupyter notebooks can be found in the following repository: https://datauab.github.io/
MD4 Consultations and doubts: Free hours for students to consult the teaching staff and receive tutoring on aspects in which they need additional help. All inquiries are made online, for example through the subject's forum or by email to the teachers. Students are encouraged to answer their colleagues' doubts and to include in their answers information that helps in understanding the content of the teaching activities.
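As an illustration of the three-part structure described for the MD3 notebook, the following is a minimal sketch and not a template required by the course. It assumes scikit-learn is installed and uses the small Iris dataset bundled with scikit-learn as a stand-in for a Kaggle database; the helper names (`describe_data`, `fit_model`, `evaluate`), the choice of logistic regression and its parameters are all hypothetical choices made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def describe_data(X, y, feature_names):
    """Part 1: summarise the attributes and the attribute to predict."""
    return {
        "n_samples": X.shape[0],
        "n_features": X.shape[1],
        "features": list(feature_names),
        "classes": sorted(set(y.tolist())),
    }


def fit_model(X_train, y_train):
    """Part 2: the method applied and its parameters (illustrative values)."""
    model = LogisticRegression(C=1.0, max_iter=1000)
    return model.fit(X_train, y_train)


def evaluate(model, X_test, y_test):
    """Part 3: results measured on data held out from training."""
    return accuracy_score(y_test, model.predict(X_test))


# Stand-in dataset (assumption): Iris, shipped with scikit-learn.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0, stratify=iris.target
)

summary = describe_data(iris.data, iris.target, iris.feature_names)
model = fit_model(X_train, y_train)
accuracy = evaluate(model, X_test, y_test)

print(summary)
print(f"Held-out accuracy: {accuracy:.3f}")
```

In a real notebook each part would be accompanied by explanatory text and plots; keeping a held-out test split is one common way to report results that are not inflated by the training data.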
MD5 Evaluation activities: for each of the activities described above. See the assessment section of this teaching guide.
In the case of students retaking the course, the grades for the teaching activities they passed the previous year will be validated if they request this from the teacher in charge. Students retaking the course must retake the individual theoretical tests (MD0).
This year there is a special itinerary for international students. These students should contact the responsible professors at the start of the semester; the professors will explain the methodology followed in the English itinerary described in this section.
Transversal Competences
- T01 Habits of thought (T01.02 Developing the capacity for analysis, prospective synthesis): in autonomous and supervised activities (study of the MD0 theory, completion of the MD2 practices, completion of the MD1 problems, and description of an MD3 practical case).
- T03 Teamwork (T03.02 Assume and respect the role of the various team members, as well as the different levels of dependence on it; T03.03 Identify, manage and resolve conflicts): in MD2 practices, as an autonomous activity in its preparation and delivery, and as a supervised activity in its preparation and presentation in a seminar.
- T06 Personal attitude (T06.03 Generate innovative and competitive proposals in the professional activity): in autonomous activities (study of MD0 theory, participation in the subject forum in Caronte MD4), directed (resolution of practical MD2 projects) and supervised (analysis of an MD3 case study).
Annotation: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.
Title | Weighting | Hours | ECTS | Learning Outcomes |
---|---|---|---|---|
Delivery of problems | 25% | 0 | 0 | 5, 6, 2, 8, 3, 7 |
Individual theory tests | 20% | 4 | 0.16 | 5, 6, 2, 8, 3, 7 |
Written documentation, implementation and presentation of the Kaggle case | 35% | 2 | 0.08 | 1, 5, 6, 2, 8, 4, 7 |
Written documentation, implementation and presentation of the practical project | 20% | 2 | 0.08 | 1, 5, 6, 2, 8, 4, 7 |
h) Single evaluation
The single evaluation of the subject consists of the following evaluation activities, which must be handed in on the day of the second part of the theory exam:
- MD0, evaluation of the theory, 30% of the final grade, recoverable.
- MD1, delivery of problems, 30% of the final grade, recoverable.
- MD3, Kaggle case submission (GitHub repository), 40% of the final grade, non-recoverable.
For MD0 and MD1, the same recovery system will be applied as for the continuous evaluation: the theory recovery exam can be taken, and problems can be handed in through Caronte on the same day as the theory recovery.
In addition, the review of the final grade follows the same procedure as for the continuous assessment.
Web links
- Caronte: http://caronte.uab.cat
- Artificial Intelligence: A Modern Approach. http://aima.cs.berkeley.edu/
- Web of the UAB Library Catalogue: https://bit.ly/3xdcdFB
Basic bibliography:
- S. Russell, P. Norvig. Artificial Intelligence: A Modern Approach. Ed. Prentice Hall, Second Edition, 2003.
Complementary bibliography
- L. Igual, S. Seguí. Introduction to Data Science. Ed. Springer, 2017
- Bishop, Pattern Recognition and Machine Learning, 2007.
- Duda, Hart, and Stork, Pattern Classification, 2nd Ed., 2002.
- Marsland, Machine Learning: An Algorithmic Perspective, 2009.
- Mitchell, Machine Learning, 1997
- Ripley, Pattern Recognition and Neural Networks, 1996.
Related bibliography
- Eberhart, Shi, Computational Intelligence: Concepts to Implementations, 2007
- Hastie, Tibshirani, Friedman, The Elements of Statistical Learning, 2009.
- Gilder, Kurzweil, Richards, Are we spiritual machines? Ray Kurzweil vs. the Critics of Strong AI, 2011
- Kurzweil, The Singularity Is Near: When Humans Transcend Biology, 2006
- Rosen, Life Itself: A Comprehensive Inquiry into the Nature, Origin, and Fabrication of Life (Complexity in Ecological Systems), 2005
- Witten, Frank, Hall, Data Mining: Practical Machine Learning Tools and Techniques, 2011
The software required will be the Python programming language, a programming environment (such as Spyder, PyCharm or Visual Studio Code), the Jupyter Notebook web application, and the libraries needed for data analysis: the SciPy ecosystem (SciPy, NumPy, matplotlib, pandas), scikit-learn (imported as sklearn) and Seaborn.
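A quick way to check that the libraries listed above are installed is sketched below. It relies only on the Python standard library; the `REQUIRED` list and the `missing_packages` helper are hypothetical names introduced for this example, and the list uses the import names of the packages (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names of the data analysis libraries mentioned above (assumed list).
REQUIRED = ["numpy", "scipy", "matplotlib", "pandas", "sklearn", "seaborn"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Running this in the same environment that Jupyter uses helps detect the common case where a library was installed for a different Python interpreter.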
Name | Group | Language | Semester | Shift |
---|---|---|---|---|
(PAUL) Classroom practices | 441 | Catalan | first semester | morning-mixed |
(PAUL) Classroom practices | 442 | Catalan | first semester | morning-mixed |
(PLAB) Practical laboratories | 441 | Catalan | first semester | morning-mixed |
(PLAB) Practical laboratories | 442 | Catalan | first semester | morning-mixed |
(PLAB) Practical laboratories | 443 | Catalan | first semester | morning-mixed |
(PLAB) Practical laboratories | 444 | Catalan | first semester | afternoon |
(TE) Theory | 440 | Catalan | first semester | morning-mixed |