2022/2023

Video Analysis

Code: 43082 ECTS Credits: 6
Degree                    Type   Year   Semester
4314099 Computer Vision   OB     0      2

Contact

Name:
Maria Vanrell Martorell
Email:
maria.vanrell@uab.cat

Use of Languages

Principal working language:
English (eng)

External teachers

Federico Sukno
Gloria Haro
Javier Ruiz
Montse Pardàs
Ramon Morros

Prerequisites

  • Degree in Engineering, Maths, Physics or similar.
  • Programming Skills in Python.

Objectives and Contextualisation

Module Coordinator: Dr. Javier Ruiz

The objective of this module is to present the main concepts and technologies that are necessary for video analysis. First, we will present the applications of image sequence analysis and the different kinds of data to which these techniques are applied, together with a general overview of the signal processing techniques and the deep learning architectures on which video analysis is based. Examples will be given for mono-camera, multi-camera and depth-camera sequences. Both theoretical bases and algorithms will be studied. For each topic, classical state-of-the-art techniques will be presented alongside the deep learning techniques that lead to different approaches. The main topics will be video segmentation, background subtraction, motion estimation, tracking algorithms and model-based analysis. Higher-level techniques such as gesture or action recognition, deep video generation and cross-modal deep learning will also be studied. Students will work on a project on road traffic monitoring applied to ADAS (Advanced Driver Assistance Systems), where they will apply the concepts learned in the course. The project will focus on video object detection and segmentation, optical flow estimation and multi-target / multi-camera tracking of vehicles.

Competences

  • Accept responsibilities for information and knowledge management.
  • Choose the most suitable software tools and training sets for developing solutions to problems in computer vision.
  • Conceptualise alternatives to complex solutions for vision problems and create prototypes to show the validity of the system proposed.
  • Continue the learning process, to a large extent autonomously.
  • Identify concepts and apply the most appropriate fundamental techniques for solving basic problems in computer vision.
  • Plan, develop, evaluate and manage solutions for projects in the different areas of computer vision.
  • Solve problems in new or little-known situations within broader (or multidisciplinary) contexts related to the field of study.
  • Understand, analyse and synthesise advanced knowledge in the area, and put forward innovative ideas.
  • Use acquired knowledge as a basis for originality in the application of ideas, often in a research context.
  • Work in multidisciplinary teams.

Learning Outcomes

  1. Accept responsibilities for information and knowledge management.
  2. Choose among the learnt techniques and train them to solve a particular image sequence analysis project.
  3. Continue the learning process, to a large extent autonomously.
  4. Identify the basic problems to be solved in image sequence analysis, along with the specific algorithms.
  5. Identify the best representations that can be defined for solving problems of image sequence analysis.
  6. Plan, develop, evaluate and manage a solution to a particular image sequence analysis problem.
  7. Solve problems in new or little-known situations within broader (or multidisciplinary) contexts related to the field of study.
  8. Understand, analyse and synthesise advanced knowledge in the area, and put forward innovative ideas.
  9. Use acquired knowledge as a basis for originality in the application of ideas, often in a research context.
  10. Work in multidisciplinary teams.

Content

  1. Introduction to video analysis
    • Approaches to signal processing and applications
    • Deep learning architectures for video
  2. Video segmentation
    • Shot segmentation
    • Background modeling and shadow removal
    • Spatio-temporal segmentation of regions
    • Semantic segmentation.
  3. Motion estimation
    • Classical and deep learning techniques
  4. Tracking
    • Bayesian filtering: introduction to Kalman filters and particle filters (see the sketch after this list)
    • Multi-target and contour tracking
    • Model based tracking
    • Tracking and segmentation of objects with deep learning
  5. Applications:
    • Deep video generation
    • Recognition: Activity, Pose and Gestures.
    • Learning from videos. Cross-modal deep learning
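As a quick illustration of the Bayesian tracking introduction in topic 4, a constant-velocity Kalman filter can be written in a few lines of NumPy. This is a minimal sketch, not course-provided code; the time step, noise covariances and synthetic trajectory below are illustrative assumptions only.

    import numpy as np

    # Minimal constant-velocity Kalman filter in 2D (toy example).
    # State x = [px, py, vx, vy]; measurements z = [px, py].
    dt = 1.0
    F = np.array([[1, 0, dt, 0],      # state transition (constant velocity)
                  [0, 1, 0, dt],
                  [0, 0, 1,  0],
                  [0, 0, 0,  1]], dtype=float)
    H = np.array([[1, 0, 0, 0],       # measurement model: observe position only
                  [0, 1, 0, 0]], dtype=float)
    Q = 0.01 * np.eye(4)              # process noise covariance (assumed)
    R = 0.5 * np.eye(2)               # measurement noise covariance (assumed)

    x = np.zeros(4)                   # initial state estimate
    P = np.eye(4)                     # initial state covariance

    def kalman_step(x, P, z):
        """One predict/update cycle given a measurement z = [px, py]."""
        # Predict
        x_pred = F @ x
        P_pred = F @ P @ F.T + Q
        # Update
        y = z - H @ x_pred                       # innovation
        S = H @ P_pred @ H.T + R                 # innovation covariance
        K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
        x_new = x_pred + K @ y
        P_new = (np.eye(4) - K @ H) @ P_pred
        return x_new, P_new

    # Track a synthetic target moving with constant velocity plus noise.
    rng = np.random.default_rng(0)
    true_pos = np.array([0.0, 0.0])
    true_vel = np.array([1.0, 0.5])
    for t in range(20):
        true_pos = true_pos + true_vel * dt
        z = true_pos + rng.normal(scale=0.7, size=2)   # noisy detection
        x, P = kalman_step(x, P, z)

    print("final estimate:", x[:2], "true position:", true_pos)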

Methodology

Supervised sessions: (Some of these sessions could be synchronous on-line sessions)

  • Lecture Sessions, where the lecturers explain the general contents of each topic. Some of these sessions will be devoted to problem solving.

Directed sessions: 

  • Project Sessions, where the problems and goals of the project are presented and discussed; students will interact with the project coordinator about problems and ideas for solving the project (approx. 1 hour/week).
  • Presentation Session, where the students give an oral presentation on how they have solved the project, together with a demo of the results.
  • Exam Session, where the students are evaluated individually on knowledge achievements and problem-solving skills.

Autonomous work:

  • Students will autonomously study and work with the materials derived from the lectures.
  • Students will work in groups to solve the project problems, with the following deliverables:
    • Code
    • Reports
    • Oral presentations

 

Annotation: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.

Activities

Title                        Hours   ECTS   Learning Outcomes
Type: Directed
Lecture sessions             20      0.8    4, 5, 9
Type: Supervised
Project follow-up sessions   8       0.32   1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Type: Autonomous
Homework                     113     4.52   1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Assessment

The final marks for this module will be computed with the following formula:

Final Mark = 0.4 x Exam + 0.55 x Project + 0.05 x Attendance

where,

Exam is the mark obtained in the Module Exam (must be >= 3).

Attendance is the mark derived from the control of attendance at lectures (minimum 70%).

Project is the mark provided by the project coordinator, based on the weekly follow-up of the project and its deliverables (must be >= 5), according to specific criteria such as:

    • Participation in discussion sessions and in team work (inter-member evaluations)
    • Delivery of mandatory and optional exercises.
    • Code development (style, comments, etc.)
    • Report (justification of the decisions in your project development)
    • Presentation (Talk and demonstrations on your project)

Only those students who fail (Final Mark < 5.0) can do a retake exam.
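For illustration only (the marks below are hypothetical): a student with Exam = 6.0, Project = 7.0 and Attendance = 10 would obtain Final Mark = 0.4 x 6.0 + 0.55 x 7.0 + 0.05 x 10 = 2.4 + 3.85 + 0.5 = 6.75, which meets the Exam >= 3 and Project >= 5 conditions and is above 5.0, so no retake exam would be needed.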

Assessment Activities

Title                Weighting   Hours   ECTS   Learning Outcomes
Exam                 0.4         2.5     0.1    1, 2, 3, 7, 8, 9, 10
Project              0.55        6       0.24   1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Session attendance   0.05        0.5     0.02   1, 4, 5, 7, 9

Bibliography

Journal articles:

  1. M. Piccardi, “Background subtraction techniques: a review”, Proc. IEEE Int. Conf. on Systems, Man and Cybernetics, vol. 4, pp. 3099-3104, 2004.
  2. A. Sobral, A. Vacavant, “A comprehensive review of background subtraction algorithms evaluated with synthetic and real videos”, Computer Vision and Image Understanding, vol. 122, pp. 4-21, May 2014.
  3. S. Baker, D. Scharstein, J.P. Lewis, S. Roth, M. Black, R. Szeliski, “A database and evaluation methodology for optical flow”, International Journal of Computer Vision, vol. 92, no. 1, pp. 1-31, 2011.
  4. T. Cootes, G. Edwards, C. Taylor, “Active appearance models”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 6, pp. 681-685, 2001.
  5. R. Poppe, “Vision-based human motion analysis: an overview”, Computer Vision and Image Understanding, vol. 108, no. 1-2, pp. 4-18, 2007.

Books:

  1. A. Doucet, N. de Freitas and N. Gordon (Eds.), “Sequential Monte Carlo Methods in Practice”, Springer, 2001.

Software

Tools for Python programming, with special attention to computer vision and PyTorch libraries.
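As an orientation, a minimal sketch (not a course-provided example) of the kind of PyTorch video tooling the module builds on could look as follows; the torchvision 3D-CNN backbone and the clip shape are illustrative assumptions about a typical setup.

    import torch
    from torchvision.models.video import r3d_18

    # Minimal sketch: run a torchvision 3D-CNN video backbone on a dummy clip
    # to check the environment (weights are randomly initialised here).
    model = r3d_18()
    model.eval()

    # A video clip tensor: (batch, channels, frames, height, width)
    clip = torch.randn(1, 3, 16, 112, 112)

    with torch.no_grad():
        logits = model(clip)   # (1, 400) scores over the Kinetics-400 label set

    print(torch.__version__, logits.shape)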