2023/2024

Video Analysis

Code: 44778 ECTS Credits: 9
Degree: 4318299 Computer Vision
Type: OB
Year: 0
Semester: 2

Contact

Name: Maria Isabel Vanrell Martorell
Email: maria.vanrell@uab.cat

Teaching groups languages

You can check them through this link. To consult the language of each group, you will need to enter the CODE of the subject. Please note that this information is provisional until 30 November 2023.

Teachers

Javier Ruiz Hidalgo
Ramon Morros Rubio
Montse Pardàs Feliu
Federico Sukno
Sergio Escalera Guerrero

External teachers

Albert Clapés

Prerequisites

Degree in Engineering, Maths, Physics or similar.
Course C3: Machine Learning for Computer Vision
Programming Skills in Python.

Objectives and Contextualisation

Module Coordinator: Dr. Javier Ruiz

The objective of this module is to present the main concepts and technologies necessary for video analysis. First, we will present the applications of image sequence analysis and the different kinds of data to which these techniques are applied, together with a general overview of the signal processing techniques and the deep learning architectures on which video analysis is based. Examples will be given for mono-camera, multi-camera and depth-camera sequences. Both theoretical bases and algorithms will be studied. For each subject, classical state-of-the-art techniques will be presented alongside the deep learning techniques that lead to different approaches. The main subjects will be video segmentation, background subtraction, motion estimation, tracking algorithms and model-based analysis. Higher-level techniques such as gesture or action recognition, deep video generation and cross-modal deep learning will also be studied. Students will work on a project on road traffic monitoring applied to ADAS (Advanced Driver Assistance Systems), where they will apply the concepts learned in the course. The project will focus on video object detection and segmentation, optical flow estimation, and multi-target / multi-camera tracking of vehicles.
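To make one of the classical techniques named above concrete, the following is a minimal sketch of running-average background subtraction on greyscale frames. It is an illustrative toy (the function name, parameters and thresholding rule are our own choices, not material from the course), kept to plain numpy rather than a full library such as OpenCV:

```python
import numpy as np

def running_average_bg(frames, alpha=0.05, threshold=30):
    """Classical running-average background subtraction.

    frames: iterable of greyscale frames (2-D numpy arrays, same shape).
    alpha: learning rate for the background-model update.
    threshold: absolute-difference threshold for the foreground mask.
    Returns one binary foreground mask per input frame.
    """
    background = None
    masks = []
    for frame in frames:
        frame = frame.astype(np.float32)
        if background is None:
            # Initialise the background model with the first frame
            background = frame.copy()
        # Pixels far from the background model are labelled foreground
        masks.append(np.abs(frame - background) > threshold)
        # Slowly adapt the background to gradual scene changes
        background = (1 - alpha) * background + alpha * frame
    return masks
```

Deep learning approaches studied in the module replace this hand-tuned per-pixel model with learned representations, but the problem formulation (classify each pixel as background or foreground over time) is the same.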


Learning Outcomes

  • CA03 (Competence) Define all the components that cooperate in a complete image sequence analysis system.
  • CA06 (Competence) Achieve the objectives of a vision project carried out in a team.
  • KA06 (Knowledge) Identify the basic problems to be solved in an image sequence analysis problem in scenes.
  • KA14 (Knowledge) Provide the best modelling to solve problems of video segmentation, motion estimation and object tracking.
  • SA05 (Skill) Solve a visual recognition problem by training a deep neural network architecture and evaluating the results.
  • SA11 (Skill) Define the best datasets for training a visual recognition architecture.
  • SA15 (Skill) Prepare a report that describes, justifies and illustrates the development of a vision project.
  • SA17 (Skill) Prepare oral presentations that allow debate of the results of a vision project.

Content

  1. Video Segmentation
  2. Motion Estimation
  3. Object Tracking
  4. Recurrent Neural Networks
  5. Attention and Transformers for video
  6. Neural architectures for video
  7. Action Recognition
  8. Self-supervised learning for video
  9. Multi-modal learning for video
  10. Humans on video
  11. Video domain adaptation
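Topic 2, motion estimation, can be illustrated with the simplest classical algorithm: exhaustive block matching between two frames. The sketch below is a toy for intuition only (the function name and parameters are our own, and real systems use pyramidal or learned optical flow instead of brute-force search):

```python
import numpy as np

def block_match(prev, curr, block=8, search=4):
    """Estimate per-block motion vectors by exhaustive block matching.

    For each (block x block) patch of `curr`, search a window of
    +/- `search` pixels in `prev` and keep the displacement that
    minimises the sum of absolute differences (SAD).
    Returns an array of shape (H//block, W//block, 2) holding (dy, dx).
    """
    H, W = curr.shape
    vectors = np.zeros((H // block, W // block, 2), dtype=int)
    for by in range(H // block):
        for bx in range(W // block):
            y, x = by * block, bx * block
            patch = curr[y:y + block, x:x + block].astype(np.int32)
            best, best_v = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    py, px = y + dy, x + dx
                    if py < 0 or px < 0 or py + block > H or px + block > W:
                        continue  # candidate falls outside the previous frame
                    cand = prev[py:py + block, px:px + block].astype(np.int32)
                    sad = np.abs(patch - cand).sum()
                    if best is None or sad < best:
                        best, best_v = sad, (dy, dx)
            vectors[by, bx] = best_v
    return vectors
```

This block-wise formulation is the conceptual ancestor of the dense optical flow estimation used in the course project.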

Methodology

Supervised sessions: (Some of these sessions could be synchronous on-line sessions)

  • Lecture sessions, where the lecturers will explain the general contents of the topics. Some of these sessions will be used to solve problems.

Directed sessions: 

  • Project sessions, where the problems and goals of the projects will be presented and discussed; students will interact with the project coordinator about problems and ideas for solving the project (approx. 1 hour/week).
  • Presentation session, where the students give an oral presentation on how they solved the project and a demo of the results.
  • Exam session, where the students are evaluated individually on their knowledge and problem-solving skills.

Autonomous work:

  • Students will autonomously study and work with the materials derived from the lectures.
  • Students will work in groups to solve the problems of the projects, with deliverables:
    • Code
    • Reports
    • Oral presentations

 

Annotation: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.


Activities

Title Hours ECTS Learning Outcomes
Type: Directed
Lecture sessions 35 1.4 KA06, KA14
Type: Supervised
Project follow-up sessions 10 0.4 CA03, CA06, SA05, SA11, SA15, SA17
Type: Autonomous
Homework 171 6.84 CA03, CA06, SA05, SA11, SA15, SA17

Assessment

The final marks for this module will be computed with the following formula:

Final Mark = 0.4 x Exam + 0.55 x Project + 0.05 x Attendance

where,

Exam: the mark obtained in the Module Exam (must be >= 3).

Attendance: the mark derived from the control of attendance at lectures (minimum 70%).

Project: the mark provided by the project coordinator, based on the weekly follow-up of the project and the deliverables (must be >= 5), all according to specific criteria such as:

    • Participation in discussion sessions and in teamwork (inter-member evaluations).
    • Delivery of mandatory and optional exercises.
    • Code development (style, comments, etc.).
    • Report (justification of the decisions in your project development).
    • Presentation (talk and demonstrations of your project).

Only students who fail (Final Mark < 5.0) can take a retake exam.
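The grading formula above can be expressed as a short function. Note that two details are assumptions on our part, since the guide states only the thresholds: how a missed threshold maps onto the numeric mark (here it is reported only via a boolean), and that attendance below the 70% minimum scores 0 on its 5% component:

```python
def final_mark(exam, project, attendance_rate):
    """Weighted module mark per the course formula.

    exam and project are on a 0-10 scale; attendance_rate is a fraction in [0, 1].
    Returns (mark, passed): the exam mark must be >= 3 and the project
    mark >= 5, otherwise the module is failed regardless of the sum.
    """
    # Assumption: below the 70% attendance minimum the 5% component scores 0
    attendance = 10.0 if attendance_rate >= 0.7 else 0.0
    mark = round(0.4 * exam + 0.55 * project + 0.05 * attendance, 2)
    passed = exam >= 3 and project >= 5 and mark >= 5.0
    return mark, passed
```

For example, an exam mark of 6, a project mark of 7 and 80% attendance give 0.4 x 6 + 0.55 x 7 + 0.05 x 10 = 6.75, a pass; an exam mark of 2 fails regardless of the weighted sum.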


Assessment Activities

Title Weighting Hours ECTS Learning Outcomes
Exam 0.4 2.5 0.1 KA06, KA14
Project 0.55 6 0.24 CA03, CA06, SA05, SA11, SA15, SA17
Session attendance 0.05 0.5 0.02 CA06, KA06, KA14

Bibliography

Journal articles:

  1. M. Piccardi, "Background subtraction techniques: a review", Proc. IEEE Int. Conf. on Systems, Man and Cybernetics, vol. 4, pp. 3099-3104, 2004.
  2. A. Sobral, A. Vacavant, "A comprehensive review of background subtraction algorithms evaluated with synthetic and real videos", Computer Vision and Image Understanding, vol. 122, pp. 4-21, May 2014.
  3. S. Baker, D. Scharstein, J. P. Lewis, S. Roth, M. J. Black, R. Szeliski, "A database and evaluation methodology for optical flow", International Journal of Computer Vision, vol. 92, no. 1, pp. 1-31, 2011.
  4. T. Cootes, G. Edwards, C. Taylor, "Active appearance models", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 6, pp. 681-685, 2001.
  5. R. Poppe, "Vision-based human motion analysis: an overview", Computer Vision and Image Understanding, vol. 108, no. 1-2, pp. 4-18, 2007.

Books:

  1. “Sequential Monte Carlo methods in practice”, A. Doucet, N. de Freitas and N.Gordon (Eds.), Springer, 2001.

Software

Tools for Python programming, with special attention to Computer Vision and PyTorch libraries.