Vision and Learning

Code: 106582 ECTS Credits: 6
2024/2025
Degree Type Year
2504392 Artificial Intelligence OT 3
2504392 Artificial Intelligence OT 4

Contact

Name:
Jordi Gonzalez Sabate
Email:
jordi.gonzalez@uab.cat

Teachers

Debora Gil Resina

Teaching groups languages

You can view this information at the end of this document.


Prerequisites

Students should have taken the subjects Fundamentals of Machine Learning, Fundamentals of Programming, Fundamentals of Computer Vision, Probability and Statistics, and Neural Networks and Deep Learning.

Students are also recommended to have knowledge of and skills in:

  • Programming in the Python programming language
  • Signal, Image and Video Processing
  • Statistical validation
  • Computational Learning and Deep Learning

 


Objectives and Contextualisation

Roughly every decade a technological tsunami transforms multiple industries. Artificial Intelligence (AI) is the wave currently sweeping the technological world. If you have ever wondered:

  • How do computers perform face detection in crowds?
  • How do video calling apps blur the background or replace it with other images?
  • How do autonomous cars move safely in an urban environment?
  • How is the ball tracked with such precision in televised sporting events like tennis, soccer, and basketball?
  • Can we determine the most effective cancer treatment from multimodal patient data?
  • Can we infer a person's emotions from a video?
  • How do machines learn?


If we have aroused your curiosity, this course is what you need. In this course we will learn about topics in Computer Vision such as Object Tracking, Image Classification, Personalized Medicine, Face Detection, Optical Flow, Human Pose estimation and many more.

Unlike other computer vision courses, this course approaches computer vision in a more practical, experiential and intuitive way. Its main component is a set of projects that must be developed by students divided into teams. All you need is a working knowledge of the Python programming language.

We will use Python, which allows us to incorporate different computer vision libraries. It is used by thousands of companies, products, and devices and is tested every day for scalability and performance. We will also learn to design and adapt specific networks and to choose the most appropriate processing method according to the requirements and restrictions of each application.

In summary, Vision and Learning is an eminently practical and interdisciplinary subject that stands on the bridge between artificial intelligence and the real world and that aims to cross this bridge in both directions.


Competences

    Artificial Intelligence
  • Analyse and solve problems effectively, generating innovative and creative proposals to achieve objectives.
  • Conceptualize and model alternatives of complex solutions to problems of application of artificial intelligence in different fields and create prototypes that demonstrate the validity of the proposed system.
  • Develop critical thinking to analyse alternatives and proposals, both one's own and those of others, in a well-founded and argued manner.
  • Develop strategies to formulate and solve different learning problems in a scientific, creative, critical and systematic way, knowing the capabilities and limitations of the different existing methods and tools.
  • Introduce changes to methods and processes in the field of knowledge in order to provide innovative responses to society's needs and demands.
  • Students can apply their knowledge to their own work or vocation in a professional manner and have the competences typically demonstrated by preparing and defending arguments and solving problems within their area of study.
  • Work cooperatively to achieve common objectives, assuming one's own responsibility and respecting the role of the different members of the team.

Learning Outcomes

  1. Analyse a situation and identify areas for improvement.
  2. Analyse and solve problems effectively, generating innovative and creative proposals to achieve objectives.
  3. Design the best convolutional network architectures for solving image sequence problems.
  4. Develop critical thinking to analyse alternatives and proposals, both one's own and those of others, in a well-founded and argued manner.
  5. Identify the basic concepts of computational learning and adequately apply its techniques to image recognition.
  6. Plan, develop, evaluate and implement a solution to a particular visual recognition problem.
  7. Propose new methods or informed alternative solutions.
  8. Select and design the best data sets for training networks.
  9. Select and design the best methods for training neural networks.
  10. Select and design the best techniques for evaluating the results of training methods or networks.
  11. Students can apply their knowledge to their own work or vocation in a professional manner and have the competences typically demonstrated by preparing and defending arguments and solving problems within their area of study.
  12. Use optimization techniques to plan, develop, evaluate, and implement a solution to a particular problem.
  13. Work cooperatively to achieve common objectives, assuming one's own responsibility and respecting the role of the different members of the team.

Content

1. Introduction to Computational Learning in Computer Vision

2. Classification of Images

3. Object Detection

4. Segmentation of Regions

5. Indexing and Retrieval

6. Image Generation

7. Multimodal Learning

 


Activities and Methodology

Title Hours ECTS Learning Outcomes
Type: Directed      
Theory lectures 10 0.4 5, 6, 8, 9, 10, 12
Type: Supervised      
Working seminars 20 0.8 1, 2, 3, 8, 9, 10, 11, 12
Type: Autonomous      
Personal work 115 4.6 1, 2, 3, 4, 6, 7, 11, 12, 13

The course will be managed through the Caronte document manager (http://caronte.uab.cat/), which will be used to manage the work teams, submit deliverables, view grades, communicate with teachers, etc. To use it, take the following steps:

  1. Register as a user by giving your name, NIU, and a passport photo in JPG format. If you have already registered for another subject, you do not need to do so again and can go to the next step.
  2. Enroll in the type of teaching "VISION AND LEARNING", giving as subject code the one provided on the first day of class.


The course will follow a teaching and learning methodology called Project Based Learning (PBL). The PBL methodology aims to empower and motivate students in their learning. Students will be organized into groups of 5 or 6, and each group will carry out a set of medium-sized projects throughout the semester. There will be a weekly follow-up and both group and individual tutoring of the students.

The projects are set by the teaching staff in such a way that they meet the following conditions: be as real as possible; be treatable by elementary tools; not have an associated standard solution algorithm.

On the other hand, it is essential to understand that it is not a question of finding an algorithm that works in 100% of cases (often there is no such thing) but simply of “giving a reasonable solution proposal”.

Projects should be developed by each team with the maximum possible autonomy. Each team will be assigned a tutor who will follow its progress but, in principle, will refrain from imposing their own ideas. The student must also be clear that the aim is not to look for the solution to the problem elsewhere, but to make an original contribution. This does not mean renouncing the information that may exist in the bibliography or on the Internet; but when such information is used, the teacher must be informed and its use explained in the report.

Each project must end in a program and a final report. In addition to being delivered in written form, the results of this report will be the subject of an oral presentation. Both the written report and the oral presentation must be addressed mainly to the entity, most likely hypothetical, that would have proposed the problem. As a general rule, technicalities will be relegated to specific sections of the written report.

The whole class is expected to attend the oral presentations of the projects and to take part with questions and observations.

Note: 15 minutes of a class will be reserved, within the calendar established by the center/degree, for the completion by the students of the surveys to evaluate the performance of the teaching staff and the evaluation of the subject/module.


Assessment

Continuous Assessment Activities

Title Weighting Hours ECTS Learning Outcomes
Class Coevaluation Note 10% 0 0 4, 11
Group Note 50% 5 0.2 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13
Individual Note 30% 0 0 3, 5, 6, 8, 9, 10, 11, 12
Peer Coevaluation Note 10% 0 0 4, 11

Given that most of the work revolves around a set of projects developed throughout the course, the evaluation is continuous, and the final grade cannot be recovered through a resit.

 

Evaluation Methodology

The minutes that each group writes after every tutored session, describing its discussions and agreements, will be evaluated, as will the self-evaluation surveys in which each student assesses their classmates and themselves. At the end of each project, the students will give an oral presentation of the project and deliver a report of the work carried out. Both will be evaluated by the teachers of the subject, whether or not they are tutors. Students will not take any written exams.

For the evaluation, the following INSTRUMENTS and ACTIVITIES will be used:

  • An evaluation made by the teachers based on the presentation of the projects carried out by the group (quality of the work, presentation, report delivered). Group grade (0 to 10), based on:

    • STUDENT PORTFOLIO: Document explaining the development of the work done: project approach, meeting minutes, information sought, explanation of the implemented application with a small user manual, and tests performed.

    • PRESENTATION: Oral presentation in 5-7 slides on the project developed and results obtained.

    • APPLICATION: developed program.

    • MINUTES AND CONTROLS: Presentation of the documentation delivered.

  • An individual evaluation based on the observations made by the tutors in the tutored sessions, where the attitude, initiative, participation, attendance and punctuality of the student in the group sessions will be taken into account. Individual mark (0 to 10).

  • Co-evaluation and self-evaluation surveys among group members at the end of each project. Peer Coevaluation Note (0 to 10).

  • Oral presentations are given in front of the class, and the groups assess the work of their classmates by ranking it. The group in 1st position will receive 10 points, the 2nd 8 points, and so on. Class Coevaluation Note (0 to 10).

 

Grades

Each project will have a grade that will be calculated as follows: 

Project Grade = 0.5 * Group Grade + 0.3 * Individual Grade + 0.1 * Peer Coevaluation Grade + 0.1 * Class Coevaluation Grade

 

Single Assessment

This subject does not provide for the single assessment system.

 

The final grade will be the weighted average of the grades of the projects carried out. The weighting will be the same for all projects.
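
As an illustration only, the grade calculation described in this section can be sketched in Python. The weights come from the Project Grade formula above; the function and variable names are hypothetical, and the check that every input lies on the 0 to 10 scale is an assumption added for the example, not an official rule.

```python
from statistics import mean

# Weights taken from the Project Grade formula in the Grades section.
WEIGHTS = {
    "group": 0.5,
    "individual": 0.3,
    "peer": 0.1,
    "class": 0.1,
}


def project_grade(group, individual, peer, class_coeval):
    """Return the grade of one project from its four components (each 0 to 10)."""
    scores = {
        "group": group,
        "individual": individual,
        "peer": peer,
        "class": class_coeval,
    }
    # Assumption for this sketch: reject values outside the 0-10 scale.
    for name, value in scores.items():
        if not 0 <= value <= 10:
            raise ValueError(f"{name} grade must be between 0 and 10, got {value}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def final_grade(project_grades):
    """Return the final grade: the average of all projects, equally weighted."""
    if not project_grades:
        raise ValueError("at least one project grade is required")
    return mean(project_grades)


if __name__ == "__main__":
    # Hypothetical component grades for two projects.
    first = project_grade(group=8, individual=7, peer=9, class_coeval=6)
    second = project_grade(group=9, individual=8, peer=8, class_coeval=7)
    print(round(first, 2), round(second, 2), round(final_grade([first, second]), 2))
```

Because all projects carry the same weight, the final grade reduces to a plain arithmetic mean of the project grades.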

 

To distinguish between 'failed' and 'no-show', a deadline is set for students to unsubscribe from the evaluation, in which case they will appear as 'no-show'. To unsubscribe, you must notify the teacher, in writing or by email, and obtain an acknowledgment of receipt.


Bibliography

- Richard Szeliski, Computer Vision: Algorithms and Applications, 2nd Edition. Springer (Texts in Computer Science), 2021. (http://szeliski.org/Book/)

- Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2016. (http://www.deeplearningbook.org)

- Adrian Kaehler, Gary Bradski, Learning OpenCV 3: Computer Vision in C++ with the OpenCV Library, O'Reilly, 2016.

- Aurélien Géron, Hands-On Machine Learning with Scikit-Learn & TensorFlow, O'Reilly, 2017.

- Eli Stevens, Luca Antiga, Thomas Viehmann, Deep Learning with PyTorch, Manning Publications, 2020. (https://pytorch.org/assets/deep-learning/Deep-Learning-with-PyTorch.pdf)

- François Chollet, Deep Learning with Python, Manning Publications, 2021. (https://github.com/fchollet/deep-learning-with-python-notebooks)


Software

To develop the different computer vision systems, both in practical sessions and in problem-solving sessions, the Python programming language will be used, working with Jupyter Notebooks.


Language list

Name Group Language Semester Turn
(PAUL) Classroom practices 1 English first semester afternoon
(PLAB) Practical laboratories 1 English first semester afternoon
(TE) Theory 1 English first semester afternoon