
3D Vision and Motion Analysis

Code: 106583 ECTS Credits: 6
2025/2026
Degree | Type | Year
Artificial Intelligence | OT | 3
Artificial Intelligence | OT | 4

Contact

Name: Alexandra Gomez Villa
Email: alexandra.gomez@uab.cat

Teaching groups and languages

You can view this information at the end of this document.


Prerequisites

It is recommended to have previously taken "Fundamentals of Machine Learning", "Neural Networks and Deep Learning", and "Fundamentals of Computer Vision".


Objectives and Contextualisation

This subject aims to provide students with a comprehensive understanding of 3D vision and its intersection with modern learning-based methods. Building upon fundamental computer vision concepts, the course explores how deep learning techniques have revolutionized our ability to perceive, model, and understand three-dimensional representations from visual data. Students will learn to work with various 3D representations, understand image formation principles, develop skills in single and multi-view 3D inference, and explore advanced topics such as neural rendering and generative 3D models. By the end of this subject, students should be able to design and implement learning-based solutions for 3D vision problems, understand the challenges and limitations of current approaches, and apply these techniques in practical applications ranging from robotics and autonomous driving to virtual reality and computer graphics.


Competences

    Artificial Intelligence
  • Analyse and solve problems effectively, generating innovative and creative proposals to achieve objectives.
  • Conceive, design, analyse and implement intelligent systems capable of using vision as a mechanism to interact with the environment.
  • Conceptualize and model alternatives of complex solutions to problems of application of artificial intelligence in different fields and create prototypes that demonstrate the validity of the proposed system.
  • Develop critical thinking to analyse alternatives and proposals, both one's own and those of others, in a well-founded and argued manner.
  • Introduce changes to methods and processes in the field of knowledge in order to provide innovative responses to society's needs and demands.
  • Students can apply the knowledge to their own work or vocation in a professional manner and have the powers generally demonstrated by preparing and defending arguments and solving problems within their area of study.
  • Work cooperatively to achieve common objectives, assuming own responsibility and respecting the role of the different members of the team.

Learning Outcomes

  1. Analyse and solve problems effectively, generating innovative and creative proposals to achieve objectives.
  2. Design the best convolutional network architectures for solving visual object and scene recognition problems.
  3. Design the best convolutional network architectures for solving 3D scene retrieval problems.
  4. Develop critical thinking to analyse alternatives and proposals, both one's own and those of others, in a well-founded and argued manner.
  5. Identify the basic concepts of image sequence analysis and appropriately apply its techniques.
  6. Identify the basic concepts of 3D information retrieval from images and appropriately apply its techniques.
  7. Identify the best representations for solving problems related to image sequences.
  8. Identify the best representations for solving problems related to image-based 3D information retrieval.
  9. Plan, develop, evaluate and implement a solution to a particular image sequence problem.
  10. Plan, develop, evaluate and implement a solution to a particular three-dimensional information retrieval problem.
  11. Propose new methods or informed alternative solutions.
  12. Propose new ways of measuring the success or failure of implementing innovative proposals or ideas.
  13. Students can apply the knowledge to their own work or vocation in a professional manner and have the powers generally demonstrated by preparing and defending arguments and solving problems within their area of study.
  14. Work cooperatively to achieve common objectives, assuming own responsibility and respecting the role of the different members of the team.

Content

3D Representations

  • Depth maps
  • Classical 3D representations: point clouds, meshes, voxels
  • Transformations between 3D representations

 Point Clouds

  • Point cloud acquisition and preprocessing
  • Deep learning on point clouds
  • Applications in 3D object classification and segmentation

 Implicit Neural Representations

  • Signed distance functions (SDFs)
  • Neural implicit functions
  • Applications in shape representation and reconstruction

 Structure from Motion

  • Camera models and calibration
  • Feature matching and geometric verification
  • Triangulation
  • Bundle adjustment

 Neural Rendering

  • Ray casting/tracing fundamentals 
  • Neural Radiance Fields (NeRF)
  • View synthesis and novel view generation
  • Conditional generation 

 Gaussian Splatting

  • 3D Gaussian primitives
  • Differentiable rasterization
  • Optimization techniques
  • Applications in neural rendering

 Diffusion Models

  • Foundations of diffusion models
  • Conditioning strategies

Activities and Methodology

Title | Hours | ECTS | Learning Outcomes
Type: Directed
Course Project | 0 | 0 | 1, 3, 4, 5, 6, 8, 10, 11, 12, 13, 14
Laboratory sessions | 0 | 0 | 1, 3, 4, 6, 8, 10, 11, 12, 13
Type: Supervised
Theory classes | 0 | 0 | 3, 4

3D vision is driven by real-world perception challenges and applications. Throughout this subject, practical applications in robotics, autonomous driving, virtual reality, and computer graphics will motivate each topic and guide the organization of the contents.

There will be two types of sessions:

Theory classes: The objective of these sessions is for the teacher to explain the theoretical foundations of 3D vision and learning-based methods. For each topic studied, the underlying mathematical principles of 3D geometry and learning approaches will be explained, as well as the corresponding algorithmic implementations. Topics will range from basic 3D representations to advanced neural rendering techniques.

Laboratory sessions: Laboratory sessions aim to facilitate hands-on experience with 3D vision systems and reinforce the concepts covered in theory classes. During these sessions, students will work through practical cases that require implementing solutions using PyTorch3D and other relevant 3D vision libraries. Students will gain experience with tasks such as single-view 3D reconstruction, neural rendering, point cloud processing, and mesh manipulation. The sessions will emphasize collaborative work and provide students with direct experience applying theoretical concepts to 3D vision problems. Problem-solving will be initiated in the class and will be complemented by a weekly set of problems to work through at home.
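
For orientation only, the short sketch below illustrates the kind of building block used in these exercises; it assumes a working PyTorch3D installation and is not part of the official lab material. It samples point clouds from two simple meshes and compares them with the Chamfer distance, a common measure in point cloud processing.

```python
# Illustrative sketch only: sample point clouds from two meshes with PyTorch3D
# and compare them with the Chamfer distance.
import torch
from pytorch3d.utils import ico_sphere
from pytorch3d.ops import sample_points_from_meshes
from pytorch3d.loss import chamfer_distance

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A unit ico-sphere mesh and a slightly shrunken copy of it.
sphere = ico_sphere(level=3, device=device)
shrunken = sphere.scale_verts(0.8)

# Sample 2048 surface points from each mesh to obtain point clouds.
points_a = sample_points_from_meshes(sphere, num_samples=2048)
points_b = sample_points_from_meshes(shrunken, num_samples=2048)

# Chamfer distance measures how closely the two point clouds match.
loss, _ = chamfer_distance(points_a, points_b)
print(f"Chamfer distance: {loss.item():.4f}")
```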

Course Project: A course project will be carried out during the semester, in which students will tackle a challenging 3D vision problem. Examples of projects include developing neural rendering techniques, creating systems for 3D reconstruction from multiple views, or implementing state-of-the-art methods for processing point clouds and meshes. Students will work in small groups of 2-3, and each member must contribute equally to the final solution. These working groups will be maintained until the end of the semester. They must be self-managed in terms of distribution of roles, work planning, assignment of tasks, management of available resources, conflict resolution, etc. The groups will work autonomously on the project, while some of the laboratory sessions will be used (1) for the teacher to present the project themes and discuss possible approaches and (2) for monitoring the status of the project.

All the information on the subject and the related documents that the students need will be available at the virtual campus (cv.uab.cat).

Annotation: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.


Assessment

Continuous Assessment Activities

Title | Weighting | Hours | ECTS | Learning Outcomes
Delivery of exercises | 10% | 35 | 1.4 | 1, 3, 4, 5, 6, 8, 9, 10, 11, 13
Delivery of project | 50% | 72 | 2.88 | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
Written exams | 40% | 43 | 1.72 | 3, 4, 5, 6, 8, 9, 10, 11, 12, 13

The final grade assessment combines three components to evaluate theoretical knowledge, practical application, and problem-solving abilities.

Final Grade Calculation

The final grade is calculated using the following weighted formula:

 Final Grade = (40% × Theory Grade) + (10% × Problems Portfolio Grade) + (50% × Project Grade)

 To pass the course, students must achieve a minimum grade of 5.0 in both the Theory Grade and Project Grade components. While there is no minimum grade requirement for the Problems Portfolio, it's important to note a special condition: if a student's weighted calculation results in a grade of 5.0 or higher, but either their Theory Grade or Project Grade falls below 5.0, their final grade will be automatically capped at 4.5, regardless of their overall weighted average.
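
Read literally, the rule above can be sketched as follows (a purely illustrative Python snippet, assuming all grades are on a 0-10 scale):

```python
def final_grade(theory: float, problems: float, project: float) -> float:
    """Weighted final grade with the minimum-grade condition described above."""
    weighted = 0.40 * theory + 0.10 * problems + 0.50 * project
    # A passing weighted average is capped at 4.5 if either the Theory Grade
    # or the Project Grade is below 5.0.
    if weighted >= 5.0 and (theory < 5.0 or project < 5.0):
        return 4.5
    return weighted
```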

 Theory Grade Assessment

 The Theory Grade is designed to assess each student's individual mastery of course content through a continuous assessment model utilizing two examinations. The first is the Mid-term Examination (Exam 1), which takes place mid-semester and covers the first half of the course material. The second is the Final Examination (Exam 2), which is conducted at the end of the semester and focuses on the second half of the course materials. The Theory Grade is determined by calculating the average of these two examinations.

 Theory Grade = (Exam 1 + Exam 2) ÷ 2

 The examinations are structured to evaluate two critical components: students' problem-solving abilities using techniques covered in class, and their conceptual understanding of these techniques. There are specific requirements for the Theory Grade: students must score above 4.0 on both partial exams. If a student achieves an average of 5.0 or higher across both exams but scores below 4.0 on either exam, their theory grade will be adjusted down to 4.5 for the final grade calculation. For students who do not achieve a passing theory grade, there is an opportunity to take a recovery examination, which allows them to retake the failed portions (either Part 1, Part 2, or both) of the continuous evaluation process.
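
The corresponding rule for the theory component can be sketched in the same way (illustrative only, grades on a 0-10 scale; the recovery examination is not modelled here):

```python
def theory_grade(exam1: float, exam2: float) -> float:
    """Average of the two partial exams with the per-exam minimum of 4.0."""
    average = (exam1 + exam2) / 2
    # A passing average is adjusted down to 4.5 if either exam is below 4.0.
    if average >= 5.0 and (exam1 < 4.0 or exam2 < 4.0):
        return 4.5
    return average
```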

Problems Portfolio Assessment

The problems portfolio serves a dual purpose: to promote continuous engagement with course material and to provide opportunities for the practical application of theoretical concepts. To successfully complete the portfolio, students must submit at least 70% of all assigned problem sets. It's important to note that failing to meet this 70% submission threshold will automatically result in a Problems Portfolio Grade of zero. Students should compile and document all their completed exercises as part of the portfolio requirements.

 Project Assessment

The project stands as a core component of the course, designed to fulfill multiple educational objectives. Students are required to work collaboratively in teams, develop a comprehensive solution to a given challenge, demonstrate their teamwork capabilities, and present their findings to the class. The project grade is determined through a calculation that takes into account various evaluation components and their respective weightings.

Project Grade = (80% × Deliverables Grade) + (20% × Presentation Grade) 

The project has several mandatory requirements: students must participate in all three components (deliverables, presentation, and self-evaluation). If a student achieves a weighted calculation of 5.0 or higher but fails to participate in any component, their project grade will be adjusted down to 4.5. While students have the opportunity to resubmit failed projects for recovery, these grades will be capped at 7.0 out of 10. There are also several conditions that result in automatic failure of the project: these include non-submission of project deliverables, failure to present the project, use of copied content, or use of synthetically generated content.
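
Analogously, the project grade formula can be sketched as follows (illustrative only; the resubmission cap and the automatic-failure conditions are not modelled):

```python
def project_grade(deliverables: float, presentation: float,
                  participated_in_all_components: bool) -> float:
    """Project grade: 80% deliverables + 20% presentation, with the
    participation condition described above."""
    weighted = 0.80 * deliverables + 0.20 * presentation
    # A passing weighted grade is capped at 4.5 if any of the three
    # components (deliverables, presentation, self-evaluation) is missing.
    if weighted >= 5.0 and not participated_in_all_components:
        return 4.5
    return weighted
```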


Bibliography

  • Hartley, R., & Zisserman, A. (2003). Multiple view geometry in computer vision. Cambridge University Press.
  • Ma, Y., Soatto, S., Kosecka, J., & Sastry, S. (2003). An invitation to 3D vision: From images to models.
  • Ma, X., Hegde, V., & Yolyan, L. (2022). 3D Deep Learning with Python: Design and develop your computer vision model with 3D data using PyTorch3D and more. Packt Publishing Ltd.
  • Botsch, M. (2010). Polygon mesh processing. AK Peters.
  • Pharr, M., Jakob, W., & Humphreys, G. (2023). Physically based rendering: From theory to implementation. MIT Press.

Software

For the practical activities of the course we will use Python (NumPy, Matplotlib, scikit-learn) and PyTorch.


Groups and Languages

Please note that this information is provisional until 30 November 2025. You can check it through this link. To consult the language you will need to enter the CODE of the subject.

Name | Group | Language | Semester | Turn
(PAUL) Classroom practices | 711 | English | second semester | morning-mixed
(TE) Theory | 71 | English | second semester | morning-mixed