2021/2022

3D Vision

Code: 43090 ECTS Credits: 6

Degree	Type	Year	Semester
4314099 Computer Vision	OB	0	1

The proposed teaching and assessment methodology that appears in the guide may be subject to change as a result of restrictions on face-to-face class attendance imposed by the health authorities.

Contact

Name: Maria Vanrell Martorell
Email: Maria.Vanrell@uab.cat

Use of Languages

Principal working language: English (eng)

Teachers

Josep Ramon Casas Pla
Javier Ruiz Hidalgo
Gloria Haro Ortega

External teachers

Antonio Agudo
Federico Sukno
Pedro Cavestany

Prerequisites

Degree in Engineering, Maths, Physics or similar

Objectives and Contextualisation

Module Coordinator: Dr. Gloria Haro

The goal of this module is to learn the principles of the 3D reconstruction of an object or a scene from multiple images or stereoscopic videos. To that end, the basic concepts of projective geometry and 3D space are introduced first; the remaining theoretical aspects and applications build upon these basic tools. The mapping from the 3D world to the image plane will be studied by introducing different camera models, their parameters and how to estimate them (camera calibration and auto-calibration). The geometry that relates a pair of views will be analyzed. All these concepts will be applied to obtain a 3D reconstruction in the two main settings: calibrated and uncalibrated cameras. In particular, we will learn how to: estimate the depth of image points, extract the underlying 3D points given a set of point correspondences in the images, generate novel views, estimate the 3D object given a set of calibrated color images or binary images, and estimate a sparse set of 3D points given a set of uncalibrated images. The 3D representation in voxels and meshes will be studied. We will explain reconstruction and modeling from Kinect data, as a particular type of sensor that provides an image of the scene together with its depth. Finally, we will see some techniques for processing 3D point clouds. The concepts and techniques learnt in this module are used in real applications such as augmented reality, object scanning, motion capture, new view synthesis, the bullet-time effect and robotics.
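
As a rough illustration of the mapping from the 3D world to the image plane mentioned above, the following minimal NumPy sketch projects 3D points with a pinhole camera model. The intrinsic parameters (focal length of 800 pixels, principal point at (320, 240)), the identity camera pose and the sample points are made-up values chosen for illustration; they are not taken from the course material.

```python
import numpy as np

# Hypothetical intrinsic matrix K: focal length 800 px, principal point (320, 240).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical pose: camera at the world origin, looking along +Z.
R = np.eye(3)
t = np.zeros((3, 1))

# 3x4 camera projection matrix P = K [R | t].
P = K @ np.hstack([R, t])


def project(P, X):
    """Project an (N, 3) array of world points to (N, 2) pixel coordinates."""
    X_h = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous 3D points
    x_h = (P @ X_h.T).T                             # homogeneous image points
    return x_h[:, :2] / x_h[:, 2:3]                 # divide by the third coordinate


points = np.array([[0.0, 0.0, 4.0],
                   [1.0, -0.5, 4.0]])
pixels = project(P, points)
print(pixels)
```

A point on the optical axis lands on the principal point, and points off the axis are shifted in proportion to the focal length divided by their depth.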

Competences

Accept responsibilities for information and knowledge management.
Choose the most suitable software tools and training sets for developing solutions to problems in computer vision.
Conceptualise alternatives to complex solutions for vision problems and create prototypes to show the validity of the system proposed.
Continue the learning process, to a large extent autonomously.
Identify concepts and apply the most appropriate fundamental techniques for solving basic problems in computer vision.
Plan, develop, evaluate and manage solutions for projects in the different areas of computer vision.
Solve problems in new or little-known situations within broader (or multidisciplinary) contexts related to the field of study.
Understand, analyse and synthesise advanced knowledge in the area, and put forward innovative ideas.
Use acquired knowledge as a basis for originality in the application of ideas, often in a research context.
Work in multidisciplinary teams.

Learning Outcomes

Accept responsibilities for information and knowledge management.
Choose the learnt techniques and train them to resolve a specific project of 3D reconstruction of scenes.
Continue the learning process, to a large extent autonomously.
Identify the basic problems to be solved in the recovery of 3D information from scenes, along with the specific algorithms.
Identify the best representations that can be defined for solving problems of 3D information recovery.
Plan, develop, evaluate and manage a solution to a specific problem of 3D reconstruction of scenes.
Solve problems in new or little-known situations within broader (or multidisciplinary) contexts related to the field of study.
Understand, analyse and synthesise advanced knowledge in the area, and put forward innovative ideas.
Use acquired knowledge as a basis for originality in the application of ideas, often in a research context.
Work in multidisciplinary teams.

Content

Introduction and applications.
2D projective geometry. Planar transformations.
Homography estimation. Affine and metric rectification.
3D projective geometry and transformations. Camera models.
Camera calibration. Pose estimation.
Epipolar geometry. Fundamental matrix. Essential matrix. Extraction of camera matrices.
Computation of the fundamental matrix. Image rectification.
Triangulation methods. Depth computation. New view synthesis.
Multi-view stereo. Structure from motion.
Auto-calibration. Bundle adjustment.
3D sensors (Kinect).
Point cloud processing.

Methodology

Supervised sessions: (Some of these sessions could be synchronous on-line sessions)

Lecture Sessions, where the lecturers explain the general content of each topic. Some of these sessions will be used to solve problems.

Directed sessions:

Project Sessions, where the problems and goals of the projects will be presented and discussed. Students will interact with the project coordinator about problems and ideas for solving the project (approx. 1 hour/week).
Presentation Session, where the students give an oral presentation on how they have solved the project, together with a demo of the results.
Exam Session, where the students are evaluated individually on their knowledge achievements and problem-solving skills.

Autonomous work:

Students will autonomously study and work with the materials derived from the lectures.
Students will work in groups to solve the problems of the projects, with the following deliverables:
- Code
- Reports
- Oral presentations

Annotation: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.

Activities

Title	Hours	ECTS	Learning Outcomes
Type: Directed
Lecture sessions	20	0.8	4, 5, 9
Type: Supervised
Project follow-up sessions	8	0.32	1, 8, 4, 5, 6, 7, 3, 2, 9, 10
Type: Autonomous
Homework	113	4.52	1, 8, 4, 5, 6, 7, 3, 2, 9, 10

Assessment

The final mark for this module will be computed with the following formula:

Final Mark = 0.4 x Exam + 0.55 x Project + 0.05 x Attendance

where,

Exam: the mark obtained in the Module Exam (must be >= 3).

Attendance: the mark derived from the control of attendance at lectures (minimum 70%).

Project: the mark provided by the project coordinator based on the weekly follow-up of the project and its deliverables (must be >= 5), according to specific criteria such as:

Participation in discussion sessions and in teamwork (inter-member evaluations).
Delivery of mandatory and optional exercises.
Code development (style, comments, etc.).
Report (justification of the decisions taken in your project development).
Presentation (talk and demonstration of your project).

Only students who fail (Final Mark < 5.0) may take a retake exam.
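
The weighting above can be sketched in a few lines of Python. This is only an illustration: it assumes that all three marks are on a 0-10 scale, and the function names and sample marks are hypothetical. The guide does not state how the attendance mark is derived or what happens when a minimum is not met, so the sketch only reports whether the stated minimums on the exam and the project hold.

```python
def final_mark(exam: float, project: float, attendance: float) -> float:
    """Weighted final mark; all inputs are assumed to be on a 0-10 scale."""
    return 0.4 * exam + 0.55 * project + 0.05 * attendance


def meets_minimums(exam: float, project: float) -> bool:
    """Check the stated minimum marks: Exam >= 3 and Project >= 5."""
    return exam >= 3 and project >= 5


# Hypothetical marks, for illustration only.
mark = final_mark(exam=6.0, project=8.0, attendance=10.0)
print(round(mark, 2), meets_minimums(exam=6.0, project=8.0))
```

With these sample marks the weighted sum is 0.4 x 6 + 0.55 x 8 + 0.05 x 10 = 7.3.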

Assessment Activities

Title	Weighting	Hours	ECTS	Learning Outcomes
Exam	0.4	2.5	0.1	1, 8, 5, 6, 7, 2, 9
Project	0.55	6	0.24	1, 8, 4, 5, 6, 7, 3, 2, 9, 10
Session attendance	0.05	0.5	0.02	1, 8, 4, 5, 3

Bibliography

Books:

O. Faugeras, Three-dimensional computer vision: a geometric viewpoint, MIT Press, 1993.
O. Faugeras, Q.-T. Luong, The geometry of multiple images, MIT Press, 2001.
D. A. Forsyth, J. Ponce, Computer vision: a modern approach, Prentice Hall, 2003.
R. I. Hartley, A. Zisserman, Multiple view geometry in computer vision, Cambridge University Press, 2000.
R. Szeliski, Computer Vision: Algorithms and Applications, Springer, 2011.

Tutorials:

Y. Furukawa, C. Hernández, Multi-View Stereo: A Tutorial, Foundations and Trends® in Computer Graphics and Vision, vol. 9, no. 1-2, pp. 1-148, 2013.
T. Moons, L. Van Gool, M. Vergauwen, 3D Reconstruction from Multiple Images Part 1: Principles, Foundations and Trends® in Computer Graphics and Vision, vol. 4, no. 4, pp. 287-404, 2010.

Software

Python programming tools, with special attention to computer vision and image processing libraries.