Degree | Type | Year | Semester |
---|---|---|---|
2503758 Data Engineering | OB | 3 | 1 |
A solid mathematical background and a good level of programming, mainly in Python, are essential.
The course introduces the branch of artificial intelligence that is based on extracting knowledge, concepts and trends from data. It trains the student to be a “data engineer,” one of the professions with the best prospects and the highest demand today among large companies and technology start-ups. In fact, demand for data engineering professionals is expected to grow exponentially at the European level, mainly because of the growth in mass data generation. Thus, the main objective of the subject is for the student to know how to find a good solution (the best one is sometimes unattainable) to problems in contexts different from those covered in class: identifying the knowledge representation needs and, accordingly, applying the most appropriate technique(s) to automatically generate good mathematical models that explain the data with an acceptable error.
The contents chosen for this subject are techniques and concepts that are used extensively in industry, understood in its broadest sense. The algorithmic basis will be fundamental throughout the subject, which takes an eminently engineering approach: it focuses on using the methods without neglecting the mathematical foundations that support them. The algorithms and techniques presented are the fundamental basis of ‘traditional’ computational learning, without which the techniques developed in later courses cannot be understood. Being basic does not make them obsolete; on the contrary, they cover a wide range of applications and problems where they are fundamental. The student must be aware that this knowledge, which is at the forefront of the state of the art, has an inherent difficulty and requires considerable study and dedication, quantified in hours in the formative activities section of this guide. This is because the subject not only teaches some of the most important machine learning content for becoming a data engineer, but also develops a curricular line that broadens the range of jobs accessible after the degree and lays the methodological foundations needed for a Master's degree in data engineering or artificial intelligence.
The objectives of the subject can be summarized as follows:
Knowledge:
- Describe basic computational learning techniques.
- List the essential steps of the different learning algorithms.
- Identify the advantages and disadvantages of the learning algorithms that are explained.
- Solve computational problems applying different learning techniques to find the optimal solution.
- Understand the result and limitations of learning techniques in different case studies.
- Know how to choose the most appropriate learning algorithm to solve contextualized problems.
Abilities:
- Recognize situations in which the application of computational learning algorithms may be appropriate to solve a problem
- Analyze the problem to be solved and design the optimal solution applying the techniques learned
- Write technical documents related to the analysis and solution of a problem
- Program the basic algorithms to solve the proposed problems
- Evaluate the results of the implemented solution and assess possible improvements
- Defend and argue the decisions taken in solving the proposed problems
UNIT 1: INTRODUCTION
1.1 Basic concepts and bioinspired paradigms
1.2 History of computational learning
UNIT 2: REGRESSION AND CLASSIFICATION
2.1 Regression of numerical data: gradient descent
2.2 Regularization and logistic regression
2.3 Classification of numerical data: support vector machines
2.4 Decision trees, random forest
2.5 Bayesian Classification
UNIT 3: CLUSTERING AND SEARCH
3.1 Memorization: lazy learning
3.2 Recommender systems: Content-based vs. Collaborative filtering
3.3 Clustering: k-means and Expectation-Maximization
All the information on the subject and the related documents that students need can be found on the Virtual Campus page (http://cv.uab.cat/).
The different activities that will be carried out in the subject are organized as follows:
Lectures
The main concepts and algorithms of each theory topic will be presented. These topics are the starting point for the work in the subject.
Problem seminars
These will be classes with small groups of students, which facilitates interaction, or individual sessions, depending on the case. In these classes, practical cases will be posed that require designing a solution using the methods seen in the theory classes. It is impossible to follow the problem classes without keeping up with the contents of the theory classes. The outcome of these sessions is the solution of the problems, which must be submitted weekly. The specific mechanism for submission, and the evaluation process, will be indicated on the web page of the subject (Charon space).
Annotation: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.
Title | Hours | ECTS | Learning Outcomes |
---|---|---|---|
Type: Directed | |||
Theoretical content | 22 | 0.88 | 2, 1, 3 |
Type: Supervised | |||
Lab practicums | 16 | 0.64 | 2, 1, 3, 4 |
Seminars | 10 | 0.4 | 2, 1 |
Type: Autonomous | |||
Setup and development of practical projects | 52 | 2.08 | 2, 1, 3, 4 |
Study | 28 | 1.12 | 2, 1, 3 |
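The hours and ECTS columns in the table above appear to be related by a fixed factor. The following is a minimal sketch that checks this, assuming 25 hours per ECTS credit; that factor is inferred from the rows themselves (for example, 22 h / 0.88 ECTS) and is not stated in this guide. The names `HOURS_PER_ECTS`, `ACTIVITIES` and `hours_to_ects` are illustrative.

```python
import math

# Assumption: 25 hours per ECTS credit, inferred from the table rows.
HOURS_PER_ECTS = 25

# (hours, ECTS) per activity, copied from the table above.
ACTIVITIES = {
    "Theoretical content": (22, 0.88),
    "Lab practicums": (16, 0.64),
    "Seminars": (10, 0.4),
    "Setup and development of practical projects": (52, 2.08),
    "Study": (28, 1.12),
}


def hours_to_ects(hours: float) -> float:
    """Convert a number of hours to ECTS credits under the assumed factor."""
    return hours / HOURS_PER_ECTS


# Every row should agree with the assumed conversion factor.
for name, (hours, ects) in ACTIVITIES.items():
    assert math.isclose(hours_to_ects(hours), ects), name

total_hours = sum(hours for hours, _ in ACTIVITIES.values())
print(f"{total_hours} h = {hours_to_ects(total_hours):.2f} ECTS")
```

Under this assumption, the activities listed in the table add up to 128 hours, or 5.12 ECTS.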
Assessment activities and instruments:
To evaluate the achievement of the knowledge and skills associated with the subject, an evaluation mechanism is established that combines the assimilation of knowledge, the ability to solve problems, and significantly, the ability to generate computational solutions to complex problems, both group and individual.
To this end, the evaluation is divided into three parts:
- Content evaluation
The final grade of contents will be calculated from several partial exams:
Contents grade = (1 / N) * (Test_1 + ... + Test_N)
The number of tests may vary and will be at least 2. In order to have a content grade, the grades of each of the tests must be higher than 4.
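The averaging rule and its conditions can be sketched in Python as follows. This is only an illustration: the function name `contents_grade` and the choice to return `None` when no content grade can be awarded are assumptions, not part of the official procedure.

```python
from statistics import mean
from typing import Optional

MIN_TESTS = 2          # the guide states there will be at least 2 tests
MIN_TEST_GRADE = 4.0   # each test must be strictly higher than this


def contents_grade(test_grades: list[float]) -> Optional[float]:
    """Average the partial tests, or return None if any test is 4 or lower."""
    if len(test_grades) < MIN_TESTS:
        raise ValueError("at least two partial tests are expected")
    if any(grade <= MIN_TEST_GRADE for grade in test_grades):
        return None  # no content grade; see the recovery tests
    return mean(test_grades)


print(contents_grade([6.0, 8.0]))  # 7.0
print(contents_grade([3.5, 9.0]))  # None
```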
The partial tests will take place during the course and will be mainly conceptual, with questions about the content developed in the ‘theoretical’ sessions.
These tests are intended as an individual assessment of the student's ability to understand the techniques explained in class and of how well the student has conceptualized them.
Recovery tests. If the grade in any of the tests does not reach the required level, students can sit the final exam of the subject and retake an exam on the contents of the part(s) not passed, in order to obtain a final grade sufficient to demonstrate the required knowledge. If a student sits the exam to raise a grade, the higher grade prevails.
Grades are not carried over from previous years, even if the theoretical part was passed then.
- Evaluation of the work in the problem seminars
The problems are intended to engage the student with the contents of the subject continuously, through small problems that build direct familiarity with applying the theory. As evidence of this work, students must submit a portfolio containing the problems they have solved. The portfolio will be submitted digitally every week. Students will be able to self-assess continuously, since the solutions to each problem set will be available once its submission period has ended. Together with tutoring hours for resolving doubts, this should be enough for each student to identify their weaknesses.
- Evaluation of the defense of solutions to problems
At least twice during the course, each student must defend, either orally or in writing, the solutions they have submitted in the problem seminars. This assessment will be individual and may focus on a subset of problems.
Occasionally, more in-depth problems may be posed in an unguided way, for which a computational solution will have to be presented.
The final grade of the course is obtained by combining the assessment of these 3 activities as follows:
Final grade = (0.3 * Contents) + (0.5 * Defense of computational solutions) + (0.2 * Portfolio)
Conditions to pass the subject:
The final grade of contents must be greater than or equal to 4 in order to pass the subject.
The grade of the portfolio and the defense of solutions (separately) must be greater than or equal to 6 to be able to pass the subject.
If the grade obtained by applying the formula above is higher than 5 but the minimum required in any of the parts has not been met, the final grade in the record will be 4.5.
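The weighting formula and the passing conditions above can be combined in a short Python sketch. The names `WEIGHTS`, `MINIMUMS` and `final_grade` are illustrative. The guide does not say what is recorded when the weighted grade is 5 or lower and a minimum is missed, so returning the weighted grade unchanged in that case is an assumption.

```python
# Weights from the final grade formula.
WEIGHTS = {"contents": 0.3, "defense": 0.5, "portfolio": 0.2}

# Minimum grade required in each part to pass the subject.
MINIMUMS = {"contents": 4.0, "defense": 6.0, "portfolio": 6.0}

CAPPED_GRADE = 4.5  # recorded when the formula gives > 5 but a minimum is missed


def final_grade(grades: dict) -> float:
    """Combine the three parts and apply the minimum-grade rule."""
    weighted = sum(WEIGHTS[part] * grades[part] for part in WEIGHTS)
    meets_minimums = all(grades[part] >= MINIMUMS[part] for part in MINIMUMS)
    if weighted > 5.0 and not meets_minimums:
        return CAPPED_GRADE
    # Assumption: otherwise the weighted grade is recorded unchanged.
    return weighted


ok = final_grade({"contents": 7.0, "defense": 8.0, "portfolio": 6.0})
capped = final_grade({"contents": 9.0, "defense": 9.0, "portfolio": 5.0})
print(f"{ok:.2f}")      # 7.30
print(f"{capped:.2f}")  # 4.50
```

In the second call the weighted grade would be 8.2, but the portfolio grade is below 6, so 4.5 is recorded instead.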
As many honors distinctions (Matrícula d'Honor, Md'H) will be awarded as the current regulations allow, provided the mark is higher than 9.0. They will be assigned in order of grades. If several candidates eligible for an Md'H have the same grade, additional activities will be proposed to determine the best candidate(s).
The student will be graded as "Not Evaluable" if he / she has no evaluated part of either the theoretical or practical contents.
Important notices:
The following, among others, are considered irregularities in an assessment activity:
- letting others copy;
- presenting group work that was not done entirely by the members of the group (applies to all members, not only those who did not work);
- presenting as one's own materials prepared by a third party, even if they are translations or adaptations, and in general any work containing elements that are not original and exclusive to the student;
- having digital and/or communication devices (such as mobile phones, smart watches, camera pens, etc.) accessible during individual theoretical-practical assessment tests (exams);
- talking with classmates during individual theoretical-practical assessment tests (exams);
- looking at other classmates' theoretical-practical assessment tests (exams) while they are being taken, even if no copying has taken place;
- looking at writing related to the subject on the table, sheets, wall, etc. during theoretical-practical assessment tests (exams), even if no copying has taken place.
If the student has committed irregularities in an assessment activity, the numerical mark on the transcript will be the lower of 3.0 and the weighted average of the marks (and therefore the subject cannot be passed by compensation). In short: copying, letting others copy, or plagiarizing (or attempting to) in any of the assessment activities is equivalent to a FAIL with a grade below 3.5.
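The cap applied when an irregularity has been committed amounts to taking a minimum, as in the sketch below. The function name `transcript_mark` and the boolean flag `irregularity` are illustrative assumptions.

```python
IRREGULARITY_CAP = 3.0  # upper bound on the mark when an irregularity occurred


def transcript_mark(weighted_average: float, irregularity: bool) -> float:
    """Return the transcript mark, capped at 3.0 if an irregularity occurred."""
    if irregularity:
        return min(IRREGULARITY_CAP, weighted_average)
    return weighted_average


print(transcript_mark(7.3, irregularity=True))   # 3.0
print(transcript_mark(2.1, irregularity=True))   # 2.1
print(transcript_mark(7.3, irregularity=False))  # 7.3
```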
Title | Weighting | Hours | ECTS | Learning Outcomes |
---|---|---|---|---|
Individual test | 30% | 7 | 0.28 | 2, 1, 3 |
Problem portfolio | 20% | 5 | 0.2 | 2, 1, 3 |
Problem solutions defence (code + presentation + follow-up) | 50% | 10 | 0.4 | 1, 3, 4 |
Web links
Basic Bibliography
Additional Bibliography