Computing Acceleration in AI

Code: 106592
ECTS Credits: 6
Academic year: 2024/2025

Degree                           Type  Year
2504392 Artificial Intelligence  OT    3
2504392 Artificial Intelligence  OT    4

Contact

Name: Vanessa Moreno Font
Email: vanessa.moreno@uab.cat

Teachers

Jordi Carrabina Bordoll

Teaching groups languages

You can view this information at the end of this document.


Prerequisites

There are no formal prerequisites. However, part of this course describes the hardware of the AI accelerators found in chips for servers, mobile devices, embedded systems, etc., so students need the basic concepts of computer architecture and technology.


Objectives and Contextualisation

This course aims to analyze the platforms that allow the acceleration of AI computing.

This acceleration depends on several factors: (1) the type of operations that are executed (vector-matrix and matrix-matrix multiplication with accumulation, and complex transfer functions); (2) data management (in terms of both memory and input-output requirements); and (3) the requirements of the systems where AI must be embedded (real-time conditions, limits on energy consumption, etc.).
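As an illustration of factor (1), the following minimal sketch computes one fully connected layer as a matrix-vector multiplication with accumulation, followed by a transfer function, and counts the multiply-accumulate (MAC) operations involved. The function name `dense_layer`, the choice of a sigmoid as the transfer function and the numeric values are assumptions made for this example only; they are not taken from the course material.

```python
import math


def dense_layer(weights, bias, x):
    """Compute y = sigmoid(W x + b) with explicit multiply-accumulate steps.

    Hypothetical helper for illustration; returns (outputs, mac_count).
    """
    outputs = []
    mac_count = 0
    for row, b in zip(weights, bias):
        acc = b  # the accumulator starts at the bias value
        for w, xi in zip(row, x):
            acc += w * xi  # one multiply-accumulate (MAC) operation
            mac_count += 1
        # Transfer function (a sigmoid is assumed here)
        outputs.append(1.0 / (1.0 + math.exp(-acc)))
    return outputs, mac_count


# Example values chosen for illustration only
W = [[0.5, -1.0, 2.0],
     [1.0, 0.0, -0.5]]
b = [0.0, 0.5]
x = [1.0, 2.0, 3.0]

y, macs = dense_layer(W, b, x)
print(macs)  # a 2x3 weight matrix needs 2 * 3 = 6 MACs
```

The MAC count grows with the product of the layer dimensions, which is one reason why these operations are the usual target of hardware acceleration.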

As for the scope of this acceleration, both the learning and inference phases are accelerated. However, since learning is carried out on servers, we will focus mostly on platforms with limited resources (compared to servers), such as mobile or embedded platforms. The different general-purpose (CPU, GPU, FPGA) and application-specific (DPU/TPU/NPU, ML and NN processors, and bionic, neuromorphic, memristor-based and quantum chips) computational platforms will be analyzed, along with the deployment methodologies.

All of this is framed within the Internet of Things (IoT), which is made up of systems that include devices, the edge and the cloud.


Competences

    Artificial Intelligence
  • Act within the field of knowledge by evaluating the social, economic and environmental impact beforehand.
  • Analyse and solve problems effectively, generating innovative and creative proposals to achieve objectives.
  • Conceive, design, analyse and implement autonomous cyber-physical agents and systems capable of interacting with other agents and/or people in open environments, taking into account collective demands and needs.
  • Conceptualize and model alternatives of complex solutions to problems of application of artificial intelligence in different fields and create prototypes that demonstrate the validity of the proposed system.
  • Identify, analyse and evaluate the ethical and social impact, the human and cultural context, and the legal implications of the development of artificial intelligence and data manipulation applications in different fields.
  • Work cooperatively to achieve common objectives, assuming own responsibility and respecting the role of the different members of the team.

Learning Outcomes

  1. Adapt AI algorithms to implement inference in embedded platforms with limited resources and real-time and energy-efficient conditions.
  2. Analyse and solve problems effectively, generating innovative and creative proposals to achieve objectives.
  3. Analyse the sustainability indicators of academic and professional activities in the field by incorporating the social, economic and environmental factors at play.
  4. Design and validate the methodology for implementing learning and inference in general- and specific-purpose processors.
  5. Design, prototype and evaluate the performance of embedded systems in resource-constrained, real-time and energy-efficient conditions.
  6. Identify the best solutions for mapping an AI solution onto a distributed IoT system spanning device, edge and cloud.
  7. Identify the ethical and social impact and the legal and regulatory implications of AI systems that send training data to the cloud.
  8. Measure and optimise the performance of AI algorithm implementations on platforms.
  9. Propose viable projects and actions that enhance social, economic and environmental benefits.
  10. Use AI network learning acceleration technologies and services in the cloud and at the edge.
  11. Work cooperatively to achieve common objectives, assuming own responsibility and respecting the role of the different members of the team.

Content

CONTENTS

1. IoT Platforms for AI

  • Cloud
  • Edge (mobile, embedded)
  • Device (resource-constrained)

2. Deployment methodologies and application requirements

  • Training
  • Inference: real-time, memory, energy

3. Analysis of the Computational Complexity of AI computation

  • Types of operations
  • Arithmetic of operations
  • Data management
  • Techniques to reduce computational complexity

4. Acceleration techniques and technologies

  • General-purpose platforms: CPU, GPU, FPGA
  • Application-specific platforms for ML and NN processing: DPU/TPU/NPU
  • Advanced chips: neuromorphic, memristor-based, bionic and quantum

LABS

Deployment of one application to (1) a mobile device (the students' own) and (2) an embedded platform.


Activities and Methodology

Title                          Hours  ECTS  Learning Outcomes
Type: Directed
Master classes and seminars    26     1.04  1, 3, 4, 5, 6, 7, 8, 10
Type: Supervised
Laboratories & Design Project  24     0.96  1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Type: Autonomous
Study & Homework               98     3.92  1, 2, 3, 4, 5, 6, 7, 8, 9, 10

The learning methodology will combine master classes; activities in tutored sessions; project-based learning and use cases; debates and other collaborative activities; and laboratory sessions.

Attendance is mandatory for the IoT-AI design project and the laboratory sessions, which will be carried out in groups of 2 or 3 people.

The laboratory sessions will use a guided format.

This course will use UAB's virtual campus at https://cv.uab.cat.

Annotation: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.


Assessment

Continuous Assessment Activities

Title                                                                  Weighting  Hours  ECTS  Learning Outcomes
Evaluation of activities developed in tutored sessions (laboratories)  40%        0      0     1, 2, 4, 5, 6, 8, 10, 11
Individual activities (i.e. exercises)                                 20%        0      0     1, 2, 4, 5, 6, 8, 10
Report and defence of the design project                               40%        2      0.08  1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11

This course does not offer the single assessment system (there is no exam).

The course follows the rules of continuous evaluation, and the final grade is calculated as follows:
A - 20% from the mark obtained by the student in the individual activities (i.e. exercises). When an evaluation activity is scheduled, its evaluation indicators and its weight within this mark will be announced.
B - 40% from the mark obtained in the evaluation of the IoT-AI design project.
C - 40% from the mark obtained by the student for the laboratory work and reports. A mark above 5 (out of 10) in this item is required to pass the subject.

All activities require submitting a report through the virtual campus.

Type A activities will be proposed throughout the course, in connection with the lectures.

Type B activities require submitting partial reports every 2 weeks.

Type C activities require submitting two partial reports (one by mid-semester and a second one at the end).

To obtain an MH (Matrícula d'Honor, distinction with honours), a student must have an overall mark higher than 9, subject to the UAB limit of 1 MH per 20 students. As a reference criterion, MHs will be assigned in descending order of mark.

A final weighted average mark of at least 50% is sufficient to pass the course, provided that a score above one third of the range is attained in each of the marks for items B and C. If this condition is not met, the final mark will be 4.0.
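The calculation described above can be sketched as follows. This is only an illustrative reading of the rules, not the official procedure: the function name `final_mark` is hypothetical, all marks are assumed to be on a 0 to 10 scale, and the 4.0 mark is assumed to apply only when the weighted average would otherwise be a pass.

```python
def final_mark(a, b, c):
    """Illustrative final mark from items A, B and C (each on a 0-10 scale).

    Assumptions, not official rules: "one third of the range" is taken
    as 10/3, and the 4.0 mark replaces the weighted average only when
    that average is 5.0 or higher but a minimum is not met.
    """
    weighted = round(0.2 * a + 0.4 * b + 0.4 * c, 2)

    # Minimums stated in the text: B and C above one third of the range,
    # and C (laboratory work) above 5 out of 10.
    one_third = 10.0 / 3.0
    meets_minimums = b > one_third and c > one_third and c > 5.0

    if weighted >= 5.0 and not meets_minimums:
        return 4.0
    return weighted


# Example marks chosen for illustration only
print(final_mark(8.0, 7.0, 6.0))   # all minimums met: weighted average
print(final_mark(10.0, 9.0, 4.0))  # laboratory mark too low
```

In the second call the weighted average alone would be a pass, but the laboratory mark does not exceed 5, so under the stated assumptions the function returns 4.0.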

Plagiarism will not be tolerated. All students involved in plagiarism will fail automatically and will be assigned a final mark no higher than 30%.

Open-source code or available libraries may be used, but they must be cited in the corresponding reports.

A student who has not achieved a sufficient final weighted average mark may apply for remedial activities (individual work or an additional synthesis examination) under the following conditions:
- the student must have participated in the laboratory activities and design project, and
- the student must have a final weighted average higher than 30%, and
- the student must not have failed any activity due to plagiarism.

The student will receive a grade of "Not Evaluable" if:
- the student could not be evaluated in the laboratory activities because of non-attendance or failure to submit the corresponding reports without justified cause.
- the student has not carried out at least 50% of the proposed activities.
- the student has not done the design project.

For each assessment activity, the student or the group will receive the corresponding feedback. Students may contest the grade of an activity; such complaints will be reviewed by the teaching staff responsible for the subject.

Repeating students will be able to “save” their grade for the laboratory activities.


Bibliography

[1] Russell, S., & Norvig, P. (2016). Artificial Intelligence: A Modern Approach.
[2] Li Du and Yuan Du. Hardware Accelerator Design for Machine Learning. http://dx.doi.org/10.5772/intechopen.72845
[3] Huawei Technologies Co., Ltd. Artificial Intelligence Technology. ISBN 978-981-19-2879-6
[4] Xiaoqiang Ma et al. A Survey on Deep Learning Empowered IoT Applications. doi:10.1109/ACCESS.2019.2958962
[5] A Reconfigurable CNN-Based Accelerator Design for Fast and Energy-Efficient Object Detection System on Mobile FPGA
[6] C.-B. Wu, C.-S. Wang and Y.-K. Hsiao, "Reconfigurable Hardware Architecture Design and Implementation for AI Deep Learning Accelerator," 2020 IEEE 9th Global Conference on Consumer Electronics (GCCE), Kobe, Japan, 2020, pp. 154-155, doi:10.1109/GCCE50665.2020.9291854.
[7] Robert David et al. TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems. Proceedings of the 4th MLSys Conference, San Jose, CA, USA.
[8] Pete Warden and Daniel Situnayake. "TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers." https://tinymlbook.com/


Software

We plan to use the tools of the TinyML environment: TensorFlow Lite (adapting the design flow to the platform).

Optionally, Qualcomm's tools can be used for deployment to mobile NN accelerators.

Language list

Name                        Group  Language  Semester        Turn
(PAUL) Classroom practices  1      English   first semester  afternoon
(TE) Theory                 1      English   first semester  afternoon