
3D Vision and Motion Analysis

Code: 106583 ECTS Credits: 6
2025/2026
Degree                    Type  Year
Artificial Intelligence   OT    3
Artificial Intelligence   OT    4

Contact lecturer

Name:
Alexandra Gomez Villa
Email:
alexandra.gomez@uab.cat

Languages of the groups

You can find this information at the end of this document.


Prerequisites

Recommended prior courses: "Fundamentals of Machine Learning", "Neural Networks and Deep Learning", and "Fundamentals of Computer Vision".


Objectives

This subject aims to provide students with a comprehensive understanding of 3D vision and its intersection with modern learning-based methods. Building upon fundamental computer vision concepts, the course explores how deep learning techniques have revolutionized our ability to perceive, model, and understand three-dimensional representations from visual data. Students will learn to work with various 3D representations, understand image formation principles, develop skills in single and multi-view 3D inference, and explore advanced topics such as neural rendering and generative 3D models. By the end of this subject, students should be able to design and implement learning-based solutions for 3D vision problems, understand the challenges and limitations of current approaches, and apply these techniques in practical applications ranging from robotics and autonomous driving to virtual reality and computer graphics.


Competences

    Artificial Intelligence
  • Analyse and solve problems effectively, generating innovative and creative proposals to achieve objectives.
  • Conceive, design, analyse and implement intelligent systems capable of using vision as a mechanism for interacting with the environment.
  • Conceptualise and model alternative complex solutions to problems of applying artificial intelligence in different fields, and plan and manage projects for the design and development of prototypes that demonstrate the validity of the proposed system.
  • Develop critical thinking to analyse alternatives and proposals, both one's own and those of others, in a well-founded and reasoned way.
  • Introduce changes to the methods and processes of the field of knowledge in order to provide innovative responses to society's needs and demands.
  • Students must be able to apply their knowledge to their work or vocation in a professional way, and must have the competences typically demonstrated by devising and defending arguments and solving problems within their field of study.
  • Work cooperatively to achieve common objectives, assuming one's own responsibility and respecting the role of the different team members.

Learning outcomes

  1. Analyse and solve problems effectively, generating innovative and creative proposals to achieve objectives.
  2. Develop critical thinking to analyse alternatives and proposals, both one's own and those of others, in a well-founded and reasoned way.
  3. Design the best convolutional network architectures for solving 3D scene recovery problems.
  4. Design the best convolutional network architectures for solving image sequence problems.
  5. Identify the basic concepts and appropriately apply image sequence analysis techniques.
  6. Identify the basic concepts and appropriately apply techniques for recovering 3D information from images.
  7. Identify the best representations for solving problems of recovering 3D information from images.
  8. Identify the best representations for solving problems on image sequences.
  9. Plan, develop, evaluate and implement a solution for a particular 3D information recovery problem.
  10. Plan, develop, evaluate and implement a solution for a particular image sequence problem.
  11. Propose new methods or well-founded alternative solutions.
  12. Propose new ways of measuring the success or failure of implementing innovative proposals or ideas.
  13. Students must be able to apply their knowledge to their work or vocation in a professional way, and must have the competences typically demonstrated by devising and defending arguments and solving problems within their field of study.
  14. Work cooperatively to achieve common objectives, assuming one's own responsibility and respecting the role of the different team members.

Contents

 3D Representations

  • Depth maps
  • Classical 3D representations: point clouds, meshes, voxels
  • Transformations between 3D representations

 Point Clouds

  • Point cloud acquisition and preprocessing
  • Deep learning on point clouds
  • Applications in 3D object classification and segmentation

 Implicit Neural Representations

  • Signed distance functions (SDFs)
  • Neural implicit functions
  • Applications in shape representation and reconstruction

 Structure from Motion

  • Camera models and calibration
  • Feature matching and geometric verification
  • Triangulation
  • Bundle adjustment

 Neural Rendering

  • Ray casting/tracing fundamentals 
  • Neural Radiance Fields (NeRF)
  • View synthesis and novel view generation
  • Conditional generation 

 Gaussian Splatting

  • 3D Gaussian primitives
  • Differentiable rasterization
  • Optimization techniques
  • Applications in neural rendering

 Diffusion Models

  • Foundations of diffusion models
  • Conditioning strategies

Learning activities and Methodology

Title                Hours  ECTS  Learning outcomes
Type: Directed
Course project       0      0     1, 2, 3, 5, 6, 7, 9, 11, 12, 13, 14
Laboratory sessions  0      0     1, 2, 3, 6, 7, 9, 11, 12, 13
Type: Supervised
Theory classes       0      0     2, 3

Real-world perception challenges and applications guide 3D vision. Throughout this subject, practical applications in robotics, autonomous driving, virtual reality, and computer graphics will motivate each section and direct the organization of the contents.

There will be two types of sessions:

Theory classes: The objective of these sessions is for the teacher to explain the theoretical foundations of 3D vision and learning-based methods. For each topic studied, the underlying mathematical principles of 3D geometry and learning approaches will be explained, as well as the corresponding algorithmic implementations. Topics will range from basic 3D representations to advanced neural rendering techniques.

Laboratory sessions: Laboratory sessions aim to facilitate hands-on experience with 3D vision systems and reinforce the concepts covered in theory classes. During these sessions, students will work through practical cases that require implementing solutions using PyTorch3D and other relevant 3D vision libraries. Students will gain experience with tasks such as single-view 3D reconstruction, neural rendering, point cloud processing, and mesh manipulation. The sessions will emphasize collaborative work and provide students with direct experience applying theoretical concepts to 3D vision problems. Problem-solving will be initiated in the class and will be complemented by a weekly set of problems to work through at home.

Course Project: A course project will be carried out during the semester, in which students will tackle a challenging problem in 3D vision. Examples of projects might include developing neural rendering techniques, creating systems for 3D reconstruction from multiple views, or implementing state-of-the-art methods for processing point clouds and meshes. Students will work in small groups of 2-3 members, and each member must contribute equally to the final solution. These working groups will be maintained until the end of the semester. They must be self-managed in terms of distribution of roles, work planning, assignment of tasks, management of available resources, conflicts, etc. The groups will develop the project autonomously, while some of the laboratory sessions will be used (1) for the teacher to present the project themes and discuss possible approaches and (2) for monitoring the status of the project.

All the information on the subject and the related documents that the students need will be available at the virtual campus (cv.uab.cat).

Note: 15 minutes of one class, within the calendar set by the centre/degree programme, will be reserved for students to complete the surveys evaluating the teaching staff's performance and the subject.


Assessment

Continuous assessment activities

Title                   Weight  Hours  ECTS  Learning outcomes
Written exams           40%     43     1.72  2, 3, 5, 6, 7, 9, 10, 11, 12, 13
Laboratory submissions  10%     35     1.40  1, 2, 3, 5, 6, 7, 9, 10, 11, 13
Project submission      50%     72     2.88  1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14

The final grade assessment combines three components to evaluate theoretical knowledge, practical application, and problem-solving abilities.

Final Grade Calculation

The final grade is calculated using the following weighted formula:

 Final Grade = (40% × Theory Grade) + (10% × Problems Portfolio Grade) + (50% × Project Grade)

 To pass the course, students must achieve a minimum grade of 5.0 in both the Theory Grade and Project Grade components. While there is no minimum grade requirement for the Problems Portfolio, it's important to note a special condition: if a student's weighted calculation results in a grade of 5.0 or higher, but either their Theory Grade or Project Grade falls below 5.0, their final grade will be automatically capped at 4.5, regardless of their overall weighted average.
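As an illustration only (not an official grading tool), the weighted formula and the 4.5 cap described above can be sketched in Python. The function name final_grade and the constant names are hypothetical, and all grades are assumed to be on a 0-10 scale.

```python
# Illustrative sketch of the final grade rule; all names are hypothetical.
PASS_MARK = 5.0      # minimum required in both Theory and Project
CAPPED_GRADE = 4.5   # final grade when the weighted result passes but a minimum is missed


def final_grade(theory: float, portfolio: float, project: float) -> float:
    """Return the final grade (0-10 scale) from the three component grades."""
    # Weighted formula: 40% theory, 10% problems portfolio, 50% project.
    weighted = 0.40 * theory + 0.10 * portfolio + 0.50 * project

    # Theory and Project each need at least 5.0; the portfolio has no minimum.
    minimums_met = theory >= PASS_MARK and project >= PASS_MARK

    # A passing weighted result with a missed minimum is capped at 4.5.
    if weighted >= PASS_MARK and not minimums_met:
        return CAPPED_GRADE
    return weighted
```

For example, final_grade(4.0, 10.0, 8.0) returns 4.5: the weighted result is above 5.0, but the Theory Grade is below the required minimum.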

 Theory Grade Assessment

 The Theory Grade is designed to assess each student's individual mastery of course content through a continuous assessment model utilizing two examinations. The first is the Mid-term Examination (Exam 1), which takes place mid-semester and covers the first half of the course material. The second is the Final Examination (Exam 2), which is conducted at the end of the semester and focuses on the second half of the course materials. The Theory Grade is determined by calculating the average of these two examinations.

 Theory Grade = (Exam 1 + Exam 2) ÷ 2

 The examinations evaluate two components: students' ability to solve problems using the techniques covered in class, and their conceptual understanding of these techniques. The Theory Grade has a specific requirement: students must score at least 4.0 on each of the two exams. If a student averages 5.0 or higher across both exams but scores below 4.0 on either one, the Theory Grade is lowered to 4.5 for the final grade calculation. Students who do not achieve a passing Theory Grade may take a recovery examination, in which they retake the failed parts of the continuous assessment (Exam 1, Exam 2, or both).
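The Theory Grade rule can be sketched in the same way. This is a minimal illustration under the assumption that both exams are graded on a 0-10 scale; the function name theory_grade and the constants are hypothetical, and the recovery examination is not modelled.

```python
# Illustrative sketch of the Theory Grade rule; all names are hypothetical.
MIN_EXAM_SCORE = 4.0   # a score below this on either exam triggers the adjustment
ADJUSTED_GRADE = 4.5   # theory grade used when the average passes but one exam is too low


def theory_grade(exam1: float, exam2: float) -> float:
    """Return the Theory Grade (0-10 scale) from the two exam scores."""
    # Theory Grade = (Exam 1 + Exam 2) / 2
    average = (exam1 + exam2) / 2

    # A passing average with one exam below 4.0 is lowered to 4.5.
    if average >= 5.0 and min(exam1, exam2) < MIN_EXAM_SCORE:
        return ADJUSTED_GRADE
    return average
```

For example, theory_grade(3.5, 8.0) returns 4.5 even though the plain average is 5.75, because Exam 1 is below 4.0.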

Problems Portfolio Assessment

The problems portfolio serves a dual purpose: to promote continuous engagement with course material and to provide opportunities for the practical application of theoretical concepts. To successfully complete the portfolio, students must submit at least 70% of all assigned problem sets. Failing to meet this 70% submission threshold will automatically result in a Problems Portfolio Grade of zero. Students should compile and document all their completed exercises as part of the portfolio requirements.

 Project Assessment

The project stands as a core component of the course, designed to fulfill multiple educational objectives. Students are required to work collaboratively in teams, develop a comprehensive solution to a given challenge, demonstrate their teamwork capabilities, and present their findings to the class. The project grade is determined through a calculation that takes into account various evaluation components and their respective weightings.

Project Grade = (80% × Deliverables Grade) + (20% × Presentation Grade) 

The project has several mandatory requirements: students must participate in all three components (deliverables, presentation, and self-evaluation). If a student achieves a weighted calculation of 5.0 or higher but fails to participate in any component, their project grade will be adjusted down to 4.5. Students may resubmit failed projects for recovery, but these grades will be capped at 7.0 out of 10. Several conditions result in automatic failure of the project: non-submission of project deliverables, failure to present the project, use of copied content, or use of synthetically generated content.
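A minimal Python sketch of the project grade arithmetic follows, for illustration only. The function name project_grade and its parameters are hypothetical, grades are assumed to be on a 0-10 scale, and the automatic failure conditions are not modelled because the text does not state which grade they produce.

```python
# Illustrative sketch of the project grade rule; all names are hypothetical.
def project_grade(deliverables: float, presentation: float,
                  participated_in_all: bool = True,
                  is_resubmission: bool = False) -> float:
    """Return the Project Grade (0-10 scale).

    participated_in_all: True if the student took part in the deliverables,
    the presentation, and the self-evaluation.
    is_resubmission: True if this is a recovery resubmission of a failed project.
    """
    # Project Grade = 80% deliverables + 20% presentation.
    grade = 0.80 * deliverables + 0.20 * presentation

    # A passing result with a missed component is adjusted down to 4.5.
    if grade >= 5.0 and not participated_in_all:
        grade = 4.5

    # Recovery resubmissions are capped at 7.0 out of 10.
    if is_resubmission:
        grade = min(grade, 7.0)
    return grade
```

For example, project_grade(9.0, 9.0, participated_in_all=False) returns 4.5, and project_grade(9.0, 9.0, is_resubmission=True) returns 7.0.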


Bibliography

  • Hartley, R., & Zisserman, A. (2003). Multiple View Geometry in Computer Vision. Cambridge University Press.
  • Ma, Y., Soatto, S., Kosecka, J., & Sastry, S. (2003). An Invitation to 3-D Vision: From Images to Geometric Models. Springer.
  • Ma, X., Hegde, V., & Yolyan, L. (2022). 3D Deep Learning with Python: Design and Develop Your Computer Vision Model with 3D Data Using PyTorch3D and More. Packt Publishing.
  • Botsch, M., Kobbelt, L., Pauly, M., Alliez, P., & Lévy, B. (2010). Polygon Mesh Processing. AK Peters.
  • Pharr, M., Jakob, W., & Humphreys, G. (2023). Physically Based Rendering: From Theory to Implementation. MIT Press.

Software

For the practical activities of the course we will use Python (NumPy, Matplotlib, scikit-learn) and PyTorch.


Groups and languages of the subject

The information provided is provisional until 30 November 2025. After that date, the language of each group can be checked via this link. To access the information, you will need to enter the subject CODE.

Name                      Group  Language  Semester         Shift
(PAUL) Classroom practice 711    English   second semester  morning-mixed
(TE) Theory               71     English   second semester  morning-mixed