2022/2023

Degree | Type | Year | Semester |
---|---|---|---|
4314099 Computer Vision | OB | 0 | 1 |

- Name: Maria Vanrell Martorell
- Email: maria.vanrell@uab.cat

- Principal working language: English (eng)

- Antonio Agudo
- Daniel Ordoñez
- Federico Sukno
- Gloria Haro
- Javier Ruiz
- Josep R. Casas

Degree in Engineering, Maths, Physics or similar

**Module Coordinator:** Dr. Gloria Haro

The goal of this module is to learn the principles of 3D reconstruction of an object or a scene from multiple images or stereoscopic videos. To that end, the basic concepts of projective geometry and 3D space are introduced first; the remaining theory and applications build upon these basic tools. We will study the mapping from the 3D world to the image plane, introducing different camera models, their parameters, and how to estimate them (camera calibration and auto-calibration). The geometry that relates a pair of views will be analyzed. All these concepts will be applied to obtain a 3D reconstruction in the two main settings: calibrated and uncalibrated cameras. In particular, we will learn how to: estimate the depth of image points, recover the underlying 3D points given a set of point correspondences in the images, generate novel views, estimate the 3D object given a set of calibrated color images or binary images, and estimate a sparse set of 3D points given a set of uncalibrated images. The 3D representation in voxels and meshes will be studied. We will explain reconstruction and modeling from Kinect data, as a particular type of sensor that provides an image of the scene together with its depth. Finally, we will see some techniques for processing 3D point clouds. The concepts and techniques learnt in this module are used in real applications such as augmented reality, object scanning, motion capture, new view synthesis, the bullet-time effect, and robotics.

- Accept responsibilities for information and knowledge management.
- Choose the most suitable software tools and training sets for developing solutions to problems in computer vision.
- Conceptualise alternatives to complex solutions for vision problems and create prototypes to show the validity of the system proposed.
- Continue the learning process, to a large extent autonomously.
- Identify concepts and apply the most appropriate fundamental techniques for solving basic problems in computer vision.
- Plan, develop, evaluate and manage solutions for projects in the different areas of computer vision.
- Solve problems in new or little-known situations within broader (or multidisciplinary) contexts related to the field of study.
- Understand, analyse and synthesise advanced knowledge in the area, and put forward innovative ideas.
- Use acquired knowledge as a basis for originality in the application of ideas, often in a research context.
- Work in multidisciplinary teams.

- Accept responsibilities for information and knowledge management.
- Choose the learnt techniques and train them to resolve a specific project of 3D reconstruction of scenes.
- Continue the learning process, to a large extent autonomously.
- Identify the basic problems to be solved in the recovery of 3D information from scenes, along with the specific algorithms.
- Identify the best representations that can be defined for solving problems of 3D information recovery.
- Plan, develop, evaluate and manage a solution to a specific problem of 3D reconstruction of scenes.
- Solve problems in new or little-known situations within broader (or multidisciplinary) contexts related to the field of study.
- Understand, analyse and synthesise advanced knowledge in the area, and put forward innovative ideas.
- Use acquired knowledge as a basis for originality in the application of ideas, often in a research context.
- Work in multidisciplinary teams.

- Introduction and applications.
- 2D projective geometry. Planar transformations.
- Homography estimation. Affine and metric rectification.
- 3D projective geometry and transformations. Camera models.
- Camera calibration. Pose estimation.
- Epipolar geometry. Fundamental matrix. Essential matrix. Extraction of camera matrices.
- Computation of the fundamental matrix. Image rectification.
- Triangulation methods. Depth computation. New view synthesis.
- Multi-view stereo. Structure from motion.
- Auto-calibration. Bundle adjustment.
- 3D sensors (Kinect).
- Point cloud processing.

**Supervised sessions:** *(Some of these sessions could be synchronous on-line sessions)*

**Lecture Sessions**, where the lecturers will explain general contents about the topics. Some of them will be used to solve the problems.

**Directed sessions:**

- **Project Sessions**, where the problems and goals of the projects will be presented and discussed, and students will interact with the project coordinator about problems and ideas on solving the project (approx. 1 hour/week).
- **Presentation Session**, where the students give an oral presentation about how they have solved the project and a demo of the results.
- **Exam Session**, where the students are evaluated individually on knowledge achievements and problem-solving skills.

**Autonomous work:**

- Students will autonomously study and work with the materials derived from the lectures.
- Students will work in **groups** to solve the problems of the projects, with the following deliverables:
  - Code
  - Reports
  - Oral presentations

**Annotation**: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.

Title | Hours | ECTS | Learning Outcomes |
---|---|---|---|
Type: Directed | | | |
Lecture sessions | 20 | 0.8 | 4, 5, 9 |
Type: Supervised | | | |
Project follow-up sessions | 8 | 0.32 | 1, 8, 4, 5, 6, 7, 3, 2, 9, 10 |
Type: Autonomous | | | |
Homework | 113 | 4.52 | 1, 8, 4, 5, 6, 7, 3, 2, 9, 10 |

The **final mark** for this module will be computed with the **following formula**:

**Final Mark** = 0.4 x **Exam** + 0.55 x **Project** + 0.05 x **Attendance**

where,

**Exam:** is the mark obtained in the Module Exam **(must be >= 3)**.

**Attendance:** is the mark derived from the control of attendance at lectures **(minimum 70%)**.

**Project:** is the mark provided by the project coordinator based on the weekly follow-up of the project and deliverables **(must be >= 5)**, according to specific criteria such as:

- Participation in discussion sessions and in team work (inter-member evaluations)
- Delivery of mandatory and optional exercises.
- Code development (style, comments, etc.)
- Report (justification of the decisions in your project development)
- Presentation (Talk and demonstrations on your project)

Only students who fail **(Final Mark < 5.0)** can take a retake exam.
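The marking scheme above can be sketched in Python as a quick self-check. This is only an illustration under stated assumptions: all three marks are taken to be on a 0-10 scale, the attendance mark is passed in directly (the guide does not say how it is derived from the attendance percentage), and the guide does not state what happens when a minimum is not met, so the sketch simply reports that condition separately. The names `final_mark` and `ModuleResult` are hypothetical.

```python
from dataclasses import dataclass

# Weights and minimums copied from the formula and conditions above.
EXAM_WEIGHT = 0.4
PROJECT_WEIGHT = 0.55
ATTENDANCE_WEIGHT = 0.05
MIN_EXAM = 3.0
MIN_PROJECT = 5.0
PASS_MARK = 5.0


@dataclass
class ModuleResult:
    final_mark: float    # weighted mark, not rounded
    minimums_met: bool   # exam >= 3 and project >= 5
    passed: bool         # assumption: minimums met and final mark >= 5.0


def final_mark(exam: float, project: float, attendance: float) -> ModuleResult:
    """Combine the three marks, each assumed to be on a 0-10 scale."""
    marks = {"exam": exam, "project": project, "attendance": attendance}
    for name, mark in marks.items():
        if not 0.0 <= mark <= 10.0:
            raise ValueError(f"{name} mark must be between 0 and 10, got {mark}")

    weighted = (EXAM_WEIGHT * exam
                + PROJECT_WEIGHT * project
                + ATTENDANCE_WEIGHT * attendance)
    minimums_met = exam >= MIN_EXAM and project >= MIN_PROJECT
    # Assumption (not stated in the guide): a student passes only when the
    # minimums are met and the weighted mark reaches 5.0.
    passed = minimums_met and weighted >= PASS_MARK
    return ModuleResult(weighted, minimums_met, passed)


if __name__ == "__main__":
    result = final_mark(exam=6.0, project=8.0, attendance=10.0)
    print(f"Final mark: {result.final_mark:.2f}, passed: {result.passed}")
```

Because the project carries 55% of the weight, a strong project mark cannot by itself compensate for an exam mark below the minimum of 3; the sketch keeps the two checks separate for that reason.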

Title | Weighting | Hours | ECTS | Learning Outcomes |
---|---|---|---|---|
Exam | 0.4 | 2.5 | 0.1 | 1, 8, 5, 6, 7, 2, 9 |
Project | 0.55 | 6 | 0.24 | 1, 8, 4, 5, 6, 7, 3, 2, 9, 10 |
Session attendance | 0.05 | 0.5 | 0.02 | 1, 8, 4, 5, 3 |

Books:

- O. Faugeras, *Three-dimensional computer vision: a geometric viewpoint*, MIT Press, 1993.
- O. Faugeras, Q.-T. Luong, *The geometry of multiple images*, MIT Press, 2001.
- D. A. Forsyth, J. Ponce, *Computer vision: a modern approach*, Prentice Hall, 2003.
- R. I. Hartley, A. Zisserman, *Multiple view geometry in computer vision*, Cambridge University Press, 2000.
- R. Szeliski, *Computer Vision: Algorithms and Applications*, Springer, 2011.

Tutorials:

- Y. Furukawa, C. Hernández, *Multi-View Stereo: A Tutorial*, Foundations and Trends® in Computer Graphics and Vision, vol. 9, no. 1-2, pp. 1-148, 2013.
- T. Moons, L. Van Gool, M. Vergauwen, *3D Reconstruction from Multiple Images Part 1: Principles*, Foundations and Trends® in Computer Graphics and Vision, vol. 4, no. 4, pp. 287-404, 2010.

**Python Programming Tools** with special attention to computer vision and image processing libraries.
