Learning and Natural Language Processing

Code: 106585 ECTS Credits: 6
2024/2025
Degree Type Year
2504392 Artificial Intelligence OT 3
2504392 Artificial Intelligence OT 4

Contact

Name:
Joaquin Cerdà Company
Email:
joaquin.cerda@uab.cat

Teaching groups languages

You can view this information at the end of this document.


Prerequisites

There are no official prerequisites, but students are advised to have completed Fundamentals of Programming I and II, Fundamentals of Mathematics I and II, Probability and Statistics, Data Engineering, Fundamentals of Machine Learning, and Fundamentals of Natural Language.


Objectives and Contextualisation

This course provides an overview of Natural Language Processing (NLP) applications, from classical approaches to text processing to advanced methods for human-computer interaction. It covers both machine learning and deep learning techniques for NLP, and considers both text and speech processing.

By the end of this course, students will be able to:

  • Understand the fundamental concepts and techniques used in NLP.
  • Implement and evaluate various NLP techniques using Python and popular NLP libraries.
  • Apply NLP methods to real-world problems and interpret the results.

Competences

    Artificial Intelligence
  • Analyse and solve problems effectively, generating innovative and creative proposals to achieve objectives.
  • Design, implement, analyse and validate efficient and robust algorithmic solutions to computational problems derived from the design of intelligent systems.
  • Develop strategies to formulate and solve different learning problems in a scientific, creative, critical and systematic way, knowing the capabilities and limitations of the different existing methods and tools.
  • Identify, analyse and evaluate the ethical and social impact, the human and cultural context, and the legal implications of the development of artificial intelligence and data manipulation applications in different fields.
  • Identify, understand and apply the fundamental concepts and techniques of knowledge representation, reasoning and computational learning for the solution of artificial intelligence problems.
  • Introduce changes to methods and processes in the field of knowledge in order to provide innovative responses to society's needs and demands.
  • Know and apply the techniques of natural language processing for the exploitation of data of linguistic nature and the creation and evaluation of the components of language-based AI systems.
  • Know and efficiently use techniques and tools for representation, manipulation, analysis and management of large-scale data.
  • Work cooperatively to achieve common objectives, assuming own responsibility and respecting the role of the different members of the team.

Learning Outcomes

  1. Analyse and solve problems effectively, generating innovative and creative proposals to achieve objectives.
  2. Design and apply semantic reasoning techniques.
  3. Design and evaluate learning-based solutions for detecting and extracting linguistic and semantic patterns in large text collections.
  4. Design and implement efficient techniques to search for linguistic and semantic patterns in large text collections.
  5. Design learning models for natural language processing systems based on computational learning theory and techniques.
  6. Design, implement and analyse algorithmic solutions for text mining with linguistic annotations.
  7. Identify, analyse and evaluate the bias of predictive models of natural language processing.
  8. Propose new methods or informed alternative solutions.
  9. Understand current solutions to natural language processing tasks for information extraction, machine translation and summarisation, as well as dialogue systems.
  10. Understand, apply and design computational learning techniques for natural language processing problems.
  11. Weigh up the risks and opportunities of both your own and others' proposals for improvement.
  12. Work cooperatively to achieve common objectives, assuming own responsibility and respecting the role of the different members of the team.

Content

  1. Fundamentals of NLP
  2. Semantic Analysis
  3. Pragmatic Analysis
  4. Recurrent Neural Networks for NLP Applications
  5. Transformers for NLP Applications
  6. Foundation Models for NLP Applications

Activities and Methodology

Title                  Hours   ECTS   Learning Outcomes
Type: Directed
Exercise sessions      25      1      1, 2, 3, 4, 5, 6, 7, 10, 12
Project sessions       6       0.24   1, 5, 7, 8, 9, 10, 11, 12
Theory classes         15      0.6    7, 9, 10
Type: Supervised
Work on the project    50      2      1, 2, 5, 7, 8, 9, 10, 11, 12
Type: Autonomous
Exercise solving       25      1      1, 3, 4, 5, 6, 9, 10, 11, 12
Individual studying    25      1      1, 5, 9, 10

Sessions will combine three types of teaching activities: theory classes, project-based exercises, and the development of a project. Students will work in small groups of two or three people on both the exercises and the project.

1.     Theory lectures will use presentations to deliver the theoretical background needed to solve the exercises and the project. These presentations will cover theoretical concepts and their mathematical formulation, as well as the corresponding algorithmic solutions.

2.     Project-based exercises will be developed in Jupyter notebooks so that the coding solutions can be commented point by point. These exercises will be submitted regularly through Campus Virtual, with an explanation of the proposed solution and the results obtained. The submitted reports may be presented to the whole class in the following session; the group or groups responsible for presenting will be chosen at random in class. Together, the exercise submissions make up the portfolio.

3.     A project will be carried out during the semester, where students will have to solve a real-world problem. The project will be solved in small groups of two or three students, where each member of the group must contribute a part and put it together with the rest to obtain the final solution. These working groups must be maintained until the end of the semester and must be self-managed in terms of distribution of roles, work planning, assignment of tasks, management of available resources, conflicts, etc. To develop the project, the groups will work autonomously, while the practical sessions will be used (1) for the teacher to present the project theme and discuss possible approaches, (2) for monitoring the status of the project and (3) for the teams to present their final results.

The above activities will be complemented by a system of tutoring and consultations outside class hours.

All the information about the subject and the related documents that students need will be available on the virtual campus (cv.uab.cat).

Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.


Assessment

Continuous Assessment Activities

Title       Weighting   Hours   ECTS   Learning Outcomes
Portfolio   49%         0       0      1, 2, 3, 4, 5, 6, 9, 10, 11, 12
Project     30%         2       0.08   1, 5, 7, 8, 9, 10, 11, 12
Test        21%         2       0.08   1, 3, 4, 5, 6, 7, 9, 10

To assess the level of student learning, a formula is established that combines knowledge acquisition, the ability to solve problems and the ability to work as a team, as well as the presentation of the results obtained. 

Final grade

The final grade is calculated from the different assessment activities as follows:

Final grade = 0.7 * Exercises Grade + 0.3 * Project Grade

This formula will be applied as long as the Exercises and Project grades are both higher than 4. If the formula yields 5 or more but the student does not reach the minimum required in one of the evaluation activities, a final grade of 4.5 will be assigned.
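As an illustration only, the rule above can be sketched in Python. The function name `final_grade` and the `minimum` parameter are hypothetical. The sketch reads "higher than 4" as a strict comparison, and it assumes that a weighted value below 5 is kept unchanged when a minimum is missed, since the text does not cover that case.

```python
def final_grade(exercises: float, project: float, minimum: float = 4.0) -> float:
    """Combine the two component grades (0-10 scale) into the final grade."""
    weighted = 0.7 * exercises + 0.3 * project

    # Assumption: "higher than 4" is interpreted as a strict comparison.
    meets_minimum = exercises > minimum and project > minimum

    if not meets_minimum and weighted >= 5.0:
        # A passing weighted value is replaced by 4.5 when a component
        # grade does not reach the required minimum.
        return 4.5

    # Assumption: in every other case the weighted value is the final grade.
    return weighted
```

For example, `final_grade(9.0, 3.0)` gives a weighted value of 7.2, but the project grade is below the minimum, so the sketch returns 4.5.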

Exercises Grade

The aim of the exercises is to become familiar with the use of the theoretical concepts, and to apply them to a real-world problem. The regular submission of problem solutions will be used as evidence of this work.

To obtain a grade for the exercises, more than 50% of the exercises must be submitted during the semester. Otherwise, the portfolio grade will be 0.

A test about the exercises will be taken individually at the end of the semester. The final exercises grade will combine the exercise portfolio and this test.

Exercises Grade = 0.7 * Portfolio evaluation + 0.3 * Test

The formula will be applied as long as the Test grade is higher than 4.
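A minimal Python sketch of how the grade for the exercises could be computed from the rules in this section. The function names and parameters are hypothetical. The guide does not state which grade results when the Test minimum is not met, so the sketch returns `None` in that case instead of guessing.

```python
from typing import Optional


def portfolio_grade(portfolio_evaluation: float, submitted: int, total: int) -> float:
    """Return the portfolio grade, or 0 if too few exercises were submitted."""
    # More than 50% of the exercises must be submitted; exactly half is not enough.
    if submitted * 2 <= total:
        return 0.0
    return portfolio_evaluation


def exercises_grade(portfolio: float, test: float, minimum: float = 4.0) -> Optional[float]:
    """Combine portfolio and test grades (0-10 scale) with weights 0.7 and 0.3."""
    # Assumption: "higher than 4" is interpreted as a strict comparison.
    if test <= minimum:
        # The formula does not apply; the resulting grade is not specified in the guide.
        return None
    return 0.7 * portfolio + 0.3 * test
```

For instance, a group that submitted 3 of 6 exercises has not submitted more than 50%, so `portfolio_grade(8.0, 3, 6)` returns 0.0.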

At the end of the semester, students will have the opportunity to resubmit two different deliveries as a retake, and they will also be able to retake the Test. After the retakes, the maximum grade that can be obtained is 8.

Project Grade

The project carries an essential weight in the overall mark of the subject. Developing the project requires that the students work in groups and design an integral solution to the defined challenge. In addition, the students must demonstrate their teamwork skills and present the results to the class.

The project is evaluated through its report, an oral presentation given in class, and an individual evaluation process. Students must participate in all three activities (preparing the report, presentation and individual evaluation) in order to obtain a project grade. The grade is calculated as follows:

Project Grade = 0.6 * Report + 0.3 * Presentation + 0.1 * Individual evaluation

If the above calculation yields 5 or more but the student did not participate in one of the activities (report, presentation, individual evaluation), a grade of 4.5 will be given for the corresponding project.

The project can be retaken if the final project grade does not reach the minimum of 4. In case of copying, there will be no retake and the subject will be considered failed. The maximum project grade that can be obtained in a retake is 7.
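The project rules above, including the retake cap, can be sketched as follows. This is an illustrative reading, not an official calculator. The function name, the use of `None` to mark an activity the student did not take part in, and the `retake` flag are all hypothetical, and the sketch assumes that a skipped activity contributes 0 to the weighted sum.

```python
from typing import Optional


def project_grade(
    report: Optional[float],
    presentation: Optional[float],
    individual: Optional[float],
    retake: bool = False,
) -> float:
    """Weighted project grade (0-10 scale); None marks a skipped activity."""
    parts = (report, presentation, individual)
    participated_in_all = all(part is not None for part in parts)

    # Assumption: a skipped activity counts as 0 in the weighted sum.
    r, p, i = (0.0 if part is None else part for part in parts)
    grade = 0.6 * r + 0.3 * p + 0.1 * i

    if not participated_in_all and grade >= 5.0:
        # A passing value is replaced by 4.5 when an activity was skipped.
        grade = 4.5

    if retake:
        # A retaken project cannot be graded above 7.
        grade = min(grade, 7.0)

    return grade
```

For example, `project_grade(9.0, 9.0, None)` has a weighted value of 8.1, but one activity was skipped, so the sketch returns 4.5.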

Important notes

Notwithstanding other disciplinary measures deemed appropriate, and in accordance with the academic regulations in force, evaluation activities will be graded with a zero (0) whenever a student commits any academic irregularity that may alter the evaluation (for example, plagiarizing, copying, or letting others copy). Evaluation activities graded in this way cannot be retaken. If passing any of these assessment activities is required to pass the subject, the subject will be failed directly, with no opportunity to retake it in the same year.

If a student does not deliver any exercise and does not attend any project presentation session, the corresponding grade will be "non-evaluable". Otherwise, "no shows" count as a 0 in the calculation of the weighted average.

To pass the course with honours, the final grade must be equal to or higher than 9 points. Because the number of students with this distinction cannot exceed 5% of the total number of students enrolled in the course, it is awarded to the students with the highest final marks. In case of a tie, the results of the exercises test will be considered.


Bibliography

  • D. Jurafsky, J.H. Martin. Speech and Language Processing. Third Edition. 2021 <https://web.stanford.edu/~jurafsky/slp3/>
  • R.S.T. Lee. Natural Language Processing. 2024. Springer
  • G. Paaß, S. Giesselbach. Foundation Models for Natural Language Processing. 2023. Springer
  • J. Eisenstein. Natural Language Processing. 2018. MIT Press
  • H. Lane, C. Howard, H. M. Hapke. Natural Language Processing in Action. 2019. Manning Publications
  • D. Kenny (ed.). Machine Translation for Everyone. 2022. Language Science Press <https://langsci-press.org/catalog/book/342>
  • B.M. Rowe, D.P. Levine. A Concise Introduction to Linguistics. 2018. Routledge

Software

For the problems and projects of the course we will use Python, along with some Python libraries for NLP that will be specified during the course.


Language list

Name                            Group   Language   Semester          Turn
(PAUL) Classroom practices      1       English    second semester   afternoon
(PLAB) Practical laboratories   1       English    second semester   afternoon
(TE) Theory                     1       English    second semester   afternoon