3D Vision and Motion Analysis

Code: 106583    ECTS credits: 6
Academic year: 2025/2026

Degree                                               Type   Year
Inteligencia Artificial / Artificial Intelligence    OT     3
Inteligencia Artificial / Artificial Intelligence    OT     4

Contact

Name:
Alexandra Gomez Villa
Email:
alexandra.gomez@uab.cat

Group languages

This information is available at the end of the document.


Prerequisites

Recommended prior courses: "Fundamentals of Machine Learning", "Neural Networks and Deep Learning", and "Fundamentals of Computer Vision".


Objectives and context

This subject aims to provide students with a comprehensive understanding of 3D vision and its intersection with modern learning-based methods. Building upon fundamental computer vision concepts, the course explores how deep learning techniques have revolutionized our ability to perceive, model, and understand three-dimensional representations from visual data. Students will learn to work with various 3D representations, understand image formation principles, develop skills in single and multi-view 3D inference, and explore advanced topics such as neural rendering and generative 3D models. By the end of this subject, students should be able to design and implement learning-based solutions for 3D vision problems, understand the challenges and limitations of current approaches, and apply these techniques in practical applications ranging from robotics and autonomous driving to virtual reality and computer graphics.


Competences

    Inteligencia Artificial / Artificial Intelligence
  • Analyze and solve problems effectively, generating innovative and creative proposals to achieve the objectives.
  • Conceive, design, analyze, and implement intelligent systems capable of using vision as a mechanism for interacting with the environment.
  • Conceptualize and model alternative complex solutions to problems of applying artificial intelligence in different fields, and plan and manage projects for the design and development of prototypes that demonstrate the validity of the proposed system.
  • Develop critical thinking to analyze, in a reasoned and well-argued way, alternatives and proposals, both one's own and those of others.
  • Introduce changes to the methods and processes of the field of knowledge in order to provide innovative responses to the needs and demands of society.
  • Students must be able to apply their knowledge to their work or vocation in a professional way and have the competences typically demonstrated through devising and defending arguments and solving problems within their field of study.
  • Work cooperatively to achieve common objectives, assuming one's own responsibility and respecting the role of the different team members.

Learning outcomes

  1. Analyze and solve problems effectively, generating innovative and creative proposals to achieve the objectives.
  2. Develop critical thinking to analyze, in a reasoned and well-argued way, alternatives and proposals, both one's own and those of others.
  3. Design the best convolutional network architectures for solving 3D scene recovery problems.
  4. Design the best convolutional network architectures for solving image sequence problems.
  5. Identify the best representations for solving problems of recovering 3D information from images.
  6. Identify the best representations for solving problems on image sequences.
  7. Identify the basic concepts and appropriately apply image sequence analysis techniques.
  8. Identify the basic concepts and appropriately apply techniques for recovering 3D information from images.
  9. Plan, develop, evaluate, and implement a solution to a particular problem of three-dimensional information recovery.
  10. Plan, develop, evaluate, and implement a solution to a particular image sequence problem.
  11. Propose new ways of measuring the success or failure of implementing innovative proposals or ideas.
  12. Propose new methods or well-founded alternative solutions.
  13. Students must be able to apply their knowledge to their work or vocation in a professional way and have the competences typically demonstrated through devising and defending arguments and solving problems within their field of study.
  14. Work cooperatively to achieve common objectives, assuming one's own responsibility and respecting the role of the different team members.

Content

3D Representations

  • Depth maps
  • Classical 3D representations: point clouds, meshes, voxels
  • Transformations between 3D representations

Point Clouds

  • Point cloud acquisition and preprocessing
  • Deep learning on point clouds
  • Applications in 3D object classification and segmentation

Implicit Neural Representations

  • Signed distance functions (SDFs)
  • Neural implicit functions
  • Applications in shape representation and reconstruction

Structure from Motion

  • Camera models and calibration
  • Feature matching and geometric verification
  • Triangulation
  • Bundle adjustment

Neural Rendering

  • Ray casting/tracing fundamentals 
  • Neural Radiance Fields (NeRF)
  • View synthesis and novel view generation
  • Conditional generation 

Gaussian Splatting

  • 3D Gaussian primitives
  • Differentiable rasterization
  • Optimization techniques
  • Applications in neural rendering

Diffusion Models

  • Foundations of diffusion models
  • Conditioning strategies

Learning activities and methodology

Title                  Hours  ECTS  Learning outcomes
Type: Directed
Course project         0      0     1, 2, 3, 7, 8, 5, 9, 12, 11, 13, 14
Laboratory sessions    0      0     1, 2, 3, 8, 5, 9, 12, 11, 13
Type: Supervised
Theory classes         0      0     2, 3

Real-world perception challenges and applications guide 3D vision. Throughout this subject, practical applications in robotics, autonomous driving, virtual reality, and computer graphics will motivate each section and direct the organization of the contents.

There will be two types of sessions:

Theory classes: The objective of these sessions is for the teacher to explain the theoretical foundations of 3D vision and learning-based methods. For each topic studied, the underlying mathematical principles of 3D geometry and learning approaches will be explained, as well as the corresponding algorithmic implementations. Topics will range from basic 3D representations to advanced neural rendering techniques.

Laboratory sessions: Laboratory sessions aim to facilitate hands-on experience with 3D vision systems and reinforce the concepts covered in theory classes. During these sessions, students will work through practical cases that require implementing solutions using PyTorch3D and other relevant 3D vision libraries. Students will gain experience with tasks such as single-view 3D reconstruction, neural rendering, point cloud processing, and mesh manipulation. The sessions will emphasize collaborative work and provide students with direct experience applying theoretical concepts to 3D vision problems. Problem-solving will be initiated in the class and will be complemented by a weekly set of problems to work through at home.

Course Project: A course project will be carried out during the semester, in which students will tackle a challenging problem in 3D vision. Examples of projects might include developing neural rendering techniques, creating systems for 3D reconstruction from multiple views, or implementing state-of-the-art methods for processing point clouds and meshes. Students will work in small groups of 2-3 members, and each member must contribute equally to the final solution. These working groups will be maintained until the end of the semester. They must be self-managed in terms of distribution of roles, work planning, assignment of tasks, management of available resources, conflicts, etc. The groups will develop the project autonomously, while some of the laboratory sessions will be used (1) for the teacher to present the project themes and discuss possible approaches and (2) for monitoring the status of the project.

All the information on the subject and the related documents that students need will be available on the virtual campus (cv.uab.cat).

Note: 15 minutes of one class, within the calendar established by the center or the degree program, will be reserved for students to complete the surveys evaluating teaching staff performance and the course or module.


Assessment

Continuous assessment activities

Title                     Weight  Hours  ECTS  Learning outcomes
Laboratory deliverables   10%     35     1.4   1, 2, 3, 7, 8, 5, 9, 10, 12, 13
Project deliverable       50%     72     2.88  1, 2, 3, 4, 7, 8, 5, 6, 9, 10, 12, 11, 13, 14
Written exams             40%     43     1.72  2, 3, 7, 8, 5, 9, 10, 12, 11, 13

The final grade assessment combines three components to evaluate theoretical knowledge, practical application, and problem-solving abilities.

Final Grade Calculation

The final grade is calculated using the following weighted formula:

 Final Grade = (40% × Theory Grade) + (10% × Problems Portfolio Grade) + (50% × Project Grade)

To pass the course, students must achieve a minimum grade of 5.0 in both the Theory Grade and the Project Grade. There is no minimum grade requirement for the Problems Portfolio. However, if a student's weighted calculation results in a grade of 5.0 or higher but either the Theory Grade or the Project Grade is below 5.0, the final grade is capped at 4.5, regardless of the weighted average.
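As a worked illustration of the rule above, the following minimal Python sketch computes the final grade from the three components. The function name `final_grade` and its parameter names are hypothetical and chosen only for this example. Grades are assumed to be on a 0 to 10 scale, and no rounding is applied because the text does not specify any.

```python
def final_grade(theory: float, portfolio: float, project: float) -> float:
    """Weighted final grade with the 4.5 cap described in the text.

    All names here are illustrative assumptions, not an official tool.
    """
    # Weighted formula: 40% theory, 10% problems portfolio, 50% project.
    weighted = 0.40 * theory + 0.10 * portfolio + 0.50 * project

    # Cap: a passing weighted average is reduced to 4.5 when either the
    # Theory Grade or the Project Grade is below the 5.0 minimum.
    if weighted >= 5.0 and (theory < 5.0 or project < 5.0):
        return 4.5
    return weighted


if __name__ == "__main__":
    # Theory below 5.0 triggers the cap even though the average is passing.
    print(final_grade(theory=4.0, portfolio=10.0, project=8.0))
    # Both minimums met: the weighted average is returned unchanged.
    print(final_grade(theory=6.0, portfolio=0.0, project=6.0))
```

For example, a student with a Theory Grade of 4.0, a Problems Portfolio Grade of 10.0, and a Project Grade of 8.0 has a weighted average of 6.6, but under this sketch the final grade would be 4.5 because the Theory Grade is below 5.0.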

Theory Grade Assessment

 The Theory Grade is designed to assess each student's individual mastery of course content through a continuous assessment model utilizing two examinations. The first is the Mid-term Examination (Exam 1), which takes place mid-semester and covers the first half of the course material. The second is the Final Examination (Exam 2), which is conducted at the end of the semester and focuses on the second half of the course materials. The Theory Grade is determined by calculating the average of these two examinations.

 Theory Grade = (Exam 1 + Exam 2) ÷ 2

 The examinations are structured to evaluate two critical components: students' problem-solving abilities using techniques covered in class, and their conceptual understanding of these techniques. There are specific requirements for the Theory Grade: students must score above 4.0 on both partial exams. If a student achieves an average of 5.0 or higher across both exams but scores below 4.0 on either exam, their theory grade will be adjusted down to 4.5 for the final grade calculation. For students who do not achieve a passing theory grade, there is an opportunity to take a recovery examination, which allows them to retake the failed portions (either Part 1, Part 2, or both) of the continuous evaluation process.
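The theory grade rule can be sketched in Python as follows. The function name `theory_grade` is hypothetical. The sketch assumes that a score of exactly 4.0 does not trigger the adjustment, since the text only states the adjustment for scores below 4.0, and it does not model the recovery examination.

```python
def theory_grade(exam1: float, exam2: float) -> float:
    """Average of the two exams with the 4.5 adjustment from the text.

    Assumption: only scores strictly below 4.0 trigger the adjustment.
    """
    # Theory Grade = (Exam 1 + Exam 2) / 2
    average = (exam1 + exam2) / 2

    # A passing average is adjusted down to 4.5 when either partial exam
    # is below the 4.0 minimum.
    if average >= 5.0 and min(exam1, exam2) < 4.0:
        return 4.5
    return average


if __name__ == "__main__":
    print(theory_grade(3.5, 8.0))  # average is 5.75, but Exam 1 is below 4.0
    print(theory_grade(6.0, 7.0))  # both exams meet the minimum
```

With exam scores of 3.5 and 8.0 the average is 5.75, yet the sketch returns 4.5 because one exam is below 4.0. With 6.0 and 7.0 it returns the plain average, 6.5.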

Problems Portfolio Assessment

The problems portfolio serves a dual purpose: to promote continuous engagement with course material and to provide opportunities for the practical application of theoretical concepts. To successfully complete the portfolio, students must submit at least 70% of all assigned problem sets. Failing to meet this 70% submission threshold will automatically result in a Problems Portfolio Grade of zero. Students should compile and document all their completed exercises as part of the portfolio requirements.

Project Assessment

The project stands as a core component of the course, designed to fulfill multiple educational objectives. Students are required to work collaboratively in teams, develop a comprehensive solution to a given challenge, demonstrate their teamwork capabilities, and present their findings to the class. The project grade is determined through a calculation that takes into account various evaluation components and their respective weightings.

Project Grade = (80% × Deliverables Grade) + (20% × Presentation Grade) 

The project has several mandatory requirements: students must participate in all three components (deliverables, presentation, and self-evaluation). If a student achieves a weighted calculation of 5.0 or higher but fails to participate in any component, their project grade will be adjusted down to 4.5. Students may resubmit failed projects for recovery, but these grades will be capped at 7.0 out of 10. Several conditions result in automatic failure of the project: non-submission of project deliverables, failure to present the project, use of copied content, or use of synthetically generated content.
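The project grade rules can be illustrated with the short Python sketch below. The function name `project_grade` and the flags `participated_in_all` and `is_recovery` are hypothetical names introduced for this example. The automatic failure conditions are deliberately not modeled, because the text does not state which numeric grade they produce.

```python
def project_grade(
    deliverables: float,
    presentation: float,
    participated_in_all: bool = True,
    is_recovery: bool = False,
) -> float:
    """Project grade following the weighted formula and caps in the text.

    Assumptions: grades are on a 0 to 10 scale, and automatic failure
    cases are handled elsewhere.
    """
    # Project Grade = 80% deliverables + 20% presentation.
    grade = 0.80 * deliverables + 0.20 * presentation

    # A passing grade is adjusted down to 4.5 when the student skipped any
    # of the three components (deliverables, presentation, self-evaluation).
    if grade >= 5.0 and not participated_in_all:
        grade = 4.5

    # Resubmitted (recovery) projects are capped at 7.0 out of 10.
    if is_recovery:
        grade = min(grade, 7.0)
    return grade


if __name__ == "__main__":
    print(project_grade(8.0, 6.0))
    print(project_grade(8.0, 6.0, participated_in_all=False))
    print(project_grade(9.0, 9.0, is_recovery=True))
```

Keeping participation and recovery as explicit flags makes each adjustment visible on its own line, which mirrors how the rules are listed in the text.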


Bibliography

  • Hartley, R., & Zisserman, A. (2003). Multiple view geometry in computer vision. Cambridge University Press.
  • Ma, Y., Soatto, S., Kosecka, J., & Sastry, S. (2003). An invitation to 3D vision: From images to models.
  • Ma, X., Hegde, V., & Yolyan, L. (2022). 3D Deep Learning with Python: Design and develop your computer vision model with 3D data using PyTorch3D and more. Packt Publishing Ltd.
  • Botsch, M. (2010). Polygon mesh processing. AK Peters.
  • Pharr, M., Jakob, W., & Humphreys, G. (2023). Physically based rendering: From theory to implementation. MIT Press.

Software

For the practical activities of the course we will use Python (NumPy, Matplotlib, scikit-learn) and PyTorch.


Course groups and languages

The information provided is provisional until 30 November 2025. After this date, the language of each group can be consulted through this link. To access the information, you will need to enter the course CODE.

Name                          Group  Language  Semester         Shift
(PAUL) Classroom practicals   711    English   Second semester  Morning-mixed
(TE) Theory                   71     English   Second semester  Morning-mixed