This version of the course guide is provisional until the period for editing the new course guides ends.


Video Analysis

Code: 44778 ECTS Credits: 9
2024/2025
Degree Type Year
4318299 Computer Vision OB 0

Contact

Name:
Maria Isabel Vanrell Martorell
Email:
maria.vanrell@uab.cat

Teachers

Javier Ruiz Hidalgo
Ramon Morros Rubio
Gloria Haro Ortega
Montse Pardàs Feliu
Federico Sukno
Albert Clapés Sintes

Teaching groups languages

You can view this information at the end of this document.


Prerequisites

Degree in Engineering, Maths, Physics or similar.
Course C3: Machine Learning for Computer Vision
Programming Skills in Python.

Objectives and Contextualisation

Module Coordinator: Dr. Javier Ruiz

The objective of this module is to present the main concepts and technologies needed for video analysis. First, we will present the applications of image sequence analysis and the different kinds of data to which these techniques are applied, together with a general overview of the signal processing techniques and the deep learning architectures on which video analysis is based. Examples will be given for mono-camera, multi-camera and depth-camera video sequences. Both theoretical foundations and algorithms will be studied. For each topic, classical state-of-the-art techniques will be presented alongside the deep learning techniques that lead to different approaches. The main topics will be video segmentation, background subtraction, motion estimation, tracking algorithms and model-based analysis. Higher-level techniques such as gesture and action recognition, deep video generation and cross-modal deep learning will also be studied.
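As a minimal illustration of the classical background-subtraction techniques covered in the module, the following sketch maintains a running-average background model and thresholds the absolute difference with each incoming frame. The function name, parameters and toy data are illustrative assumptions, not course material.

```python
import numpy as np

def background_subtraction(frames, alpha=0.05, threshold=30):
    """Running-average background subtraction (illustrative sketch).

    frames: iterable of greyscale frames as 2-D uint8 arrays.
    alpha: learning rate of the background model.
    threshold: absolute-difference threshold for the foreground mask.
    Yields one boolean foreground mask per frame.
    """
    background = None
    for frame in frames:
        f = frame.astype(np.float32)
        if background is None:
            background = f.copy()  # initialise the model with the first frame
        mask = np.abs(f - background) > threshold
        # Update the model with an exponential running average.
        background = (1 - alpha) * background + alpha * f
        yield mask

# Toy usage: a static scene, with a bright "object" appearing in the third frame.
scene = np.zeros((4, 4), dtype=np.uint8)
moving = scene.copy()
moving[1:3, 1:3] = 200
masks = list(background_subtraction([scene, scene, moving]))
```

Deep-learning approaches studied in the module replace this hand-crafted model with learned representations, but the evaluation logic (per-pixel foreground masks) is the same.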

Students will work on a project analysing video sequences. In the first part, they will develop a road traffic monitoring system applied to ADAS (Advanced Driver Assistance Systems), building algorithms and models for video object detection, segmentation, tracking and optical-flow estimation. In the second part, the main focus will be action detection and recognition in videos.
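Evaluating the detection and tracking parts of such a project typically relies on metrics like intersection-over-union (IoU) between predicted and ground-truth bounding boxes. The helper below is an illustrative sketch of that metric, not the official project code; the box format (x1, y1, x2, y2) is an assumption.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two partially overlapping 2x2 boxes share a 1x1 intersection: IoU = 1/7.
score = iou((0, 0, 2, 2), (1, 1, 3, 3))
```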

 


Learning Outcomes

  1. CA03 (Competence) Define all the components that cooperate in a complete image sequence analysis system.
  2. CA06 (Competence) Achieve the objectives of a vision project carried out in a team.
  3. KA06 (Knowledge) Identify the basic problems to be solved in an image sequence analysis problem in scenes.
  4. KA14 (Knowledge) Provide the best modelling to solve problems of video segmentation, motion estimation and object tracking.
  5. SA05 (Skill) Solve a visual recognition problem by training a deep neural network architecture and evaluating the results.
  6. SA11 (Skill) Define the best data sets for training a visual recognition architecture.
  7. SA15 (Skill) Prepare a report that describes, justifies and illustrates the development of a vision project.
  8. SA17 (Skill) Prepare oral presentations that allow debate of the results of a vision project.

Content

  1. Video Segmentation
  2. Motion Estimation
  3. Object Tracking
  4. Recurrent Neural Networks
  5. Attention and Transformers for video
  6. Neural architectures for video
  7. Action Detection and Recognition
  8. Self-supervised and multi-modal learning for video
  9. Video domain adaptation
  10. Anomaly detection
  11. Video generation

Activities and Methodology

Title Hours ECTS Learning Outcomes
Type: Directed      
Lecture sessions 35 1.4 CA03, CA06, KA06, KA14, SA05, SA11, SA15, SA17
Type: Supervised      
Project follow-up sessions 10 0.4 CA03, CA06, KA06, KA14, SA05, SA11, SA15, SA17
Type: Autonomous      
Homework 171 6.84 CA03, CA06, KA06, KA14, SA05, SA11, SA15, SA17

Directed sessions: (Some of these sessions could be synchronous on-line sessions)

  • Lecture sessions, where the lecturers explain the general contents of each topic. Some of these sessions will be used to solve problems.

Supervised sessions:

  • Project follow-up sessions, where the problems and goals of the projects are presented and discussed, and students interact with the project coordinator about problems and ideas for solving the project (approx. 1 hour/week).
  • Presentation session, where the students give an oral presentation on how they solved the project, together with a demo of the results.
  • Exam session, where the students are evaluated individually on knowledge achievements and problem-solving skills.

Autonomous work:

  • Students will autonomously study and work with the materials derived from the lectures.
  • Students will work in groups to solve the project problems, with deliverables:
    • Code
    • Reports
    • Oral presentations

 

Annotation: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.


Assessment

Continuous Assessment Activities

Title Weighting Hours ECTS Learning Outcomes
Exam 0.4 2.5 0.1 CA03, CA06, KA06, KA14, SA05, SA11, SA15, SA17
Project 0.55 6 0.24 CA03, CA06, KA06, KA14, SA05, SA11, SA15, SA17
Session attendance 0.05 0.5 0.02 CA03, CA06, KA06, KA14, SA05, SA11, SA15, SA17

The final marks for this module will be computed with the following formula:

Final Mark = 0.4 x Exam + 0.55 x Project + 0.05 x Attendance

where,

Exam: is the mark obtained in the Module Exam (must be >= 3).

Attendance: is the mark derived from the control of attendance at lectures (minimum 70%).

Project: is the mark provided by the project coordinator, based on the weekly follow-up of the project and its deliverables (must be >= 5), according to specific criteria such as:

    • Participation in discussion sessions and in team work (inter-member evaluations)
    • Delivery of mandatory and optional exercises.
    • Code development (style, comments, etc.)
    • Report (justification of the decisions in your project development)
    • Presentation (Talk and demonstrations on your project)

Only those students who fail (Final Mark < 5.0) may take a retake exam.
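The grading formula and minimum-mark conditions above can be sketched as a small helper; the function name is hypothetical and the interpretation that a failed minimum blocks the weighted mark is an assumption based on the text.

```python
def final_mark(exam, project, attendance):
    """Final mark per the module formula.

    Returns None when a minimum-mark condition is not met
    (Exam must be >= 3, Project must be >= 5).
    """
    if exam < 3 or project < 5:
        return None  # minimum not reached: the weighted mark does not apply
    return 0.4 * exam + 0.55 * project + 0.05 * attendance

# Example: exam 6, project 7, attendance 10
# -> 0.4*6 + 0.55*7 + 0.05*10 = 2.4 + 3.85 + 0.5 = 6.75
mark = final_mark(6, 7, 10)
```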


Bibliography

Journal articles:

  1. M. Piccardi. "Background subtraction techniques: a review". Proc. IEEE Int. Conf. on Systems, Man and Cybernetics, vol. 4, pp. 3099-3104, 2004.
  2. A. Sobral, A. Vacavant. "A comprehensive review of background subtraction algorithms evaluated with synthetic and real videos". Computer Vision and Image Understanding, vol. 122, pp. 4-21, May 2014.
  3. S. Baker, D. Scharstein, J.P. Lewis, S. Roth, M. Black, R. Szeliski. "A database and evaluation methodology for optical flow". International Journal of Computer Vision, vol. 92(1), pp. 1-31, 2011.
  4. T. Cootes, G. Edwards, C. Taylor. "Active appearance models". IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23(6), pp. 681-685, 2001.
  5. R. Poppe. "Vision-based human motion analysis: an overview". Computer Vision and Image Understanding, vol. 108(1-2), pp. 4-18, 2007.

Books:

  1. “Sequential Monte Carlo methods in practice”, A. Doucet, N. de Freitas and N.Gordon (Eds.), Springer, 2001.

Software

Tools for Python programming, with special attention to computer vision and PyTorch libraries.


Language list

Name Group Language Semester Turn
(PLABm) Practical laboratories (master) 1 English second semester morning-mixed
(PLABm) Practical laboratories (master) 2 English second semester morning-mixed
(TEm) Theory (master) 1 English second semester morning-mixed