Degree | Type | Year | Semester |
---|---|---|---|
4313136 Modelling for Science and Engineering | OT | 0 | 1 |
Basic knowledge of a programming language such as Python and basic familiarity with a Linux distribution are recommended.
The objectives of the module:
-Solve data analysis problems with open source tools
-Understand the data management limitations of each tool and learn criteria for selecting suitable tools for a specific problem
-Learn data query methodologies related to each technology
-Use Cloud Computing providers to solve data analysis problems
-Apply a data analysis methodology to solve practical problems
By the end of the lectures and practical labs, students should know enough to understand the requirements of typical large data analysis problems in industrial contexts. They should be able to choose a combination of tools and design a solution for a given large data analysis problem. This subject is oriented towards developing data problem-solving skills. Languages, tools and techniques are presented in a data analysis context, and students will solve a list of data problems by applying the technology described in each chapter.
T1: Introduction to Distributed Systems and large data processing systems (2 hours)
T2: Cloud computing (2 hours)
T3: Cluster and supercomputer infrastructures (12 hours)
T4: Cloud Networking and Virtual Private Clouds (9 hours)
T5: Fault tolerance systems (9 hours)
T6: Database Cloud project: relational and DynamoDB implementations (9 hours)
T7: Serverless services and Lambda (9 hours)
The methodology will combine classroom work and problem solving in laboratory sessions. The planned methodology and proposed assessment may be modified depending on restrictions on physical attendance at University classrooms due to health measures.
Virtual classes and labs will take place in a class Teams virtual space to which all students will be invited. Lab sessions will be scheduled at the beginning of the course and will use the same Teams space for all practical labs. Students will use a local Linux environment: a native installation, a VirtualBox virtual machine, or a Cloud Computing instance.
Annotation: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.
Title | Hours | ECTS | Learning Outcomes |
---|---|---|---|
Type: Directed | |||
Laboratory | 24 | 0.96 | 2, 1, 4, 8, 5, 9 |
Lectures | 38 | 1.52 | 2, 1, 7, 6, 8, 3, 9 |
Type: Autonomous | |||
Practical exercise development | 62 | 2.48 | 2, 1, 8, 5 |
The final grade will combine the work carried out in the lab sessions with a final exam.
Title | Weighting | Hours | ECTS | Learning Outcomes |
---|---|---|---|---|
ELB Lab | 20% | 6 | 0.24 | 2, 7, 6, 4, 8, 5 |
Infrastructure lab | 20% | 6 | 0.24 | 2, 7, 6, 4, 8 |
Lambda Lab | 20% | 4 | 0.16 | 2, 1, 7, 4, 8, 3, 9 |
RDS Lab | 20% | 6 | 0.24 | 2, 1, 7, 6, 8, 3, 9 |
VPC Lab | 20% | 4 | 0.16 | 2, 1, 6, 8, 5 |
Martin Kleppmann. "Designing Data-Intensive Applications". O'Reilly, 2017.
A. Wittig, M. Wittig. "Amazon Web Services in Action", Manning, 2nd Edition, 2018.
G. Coulouris, J. Dollimore and T. Kindberg. "Distributed Systems: Concepts and Design". Addison-Wesley, 5th edition, 2012.
Bell, Charles; Kindahl, Mats; Thalmann, Lars. "MySQL High Availability". O'Reilly, 2010.
Chang, Fay, et al. "Bigtable: A Distributed Storage System for Structured Data." OSDI, 2006
Dewitt, David, and Jim Gray. "Parallel Database Systems: The Future of High Performance Database Processing." Communications of the ACM 35, no. 6 (1992): 85-98
Schwartz, Baron; Zaitsev, Peter; Tkachenko, Vadim; Zawodny, Jeremy D.; Lentz, Arjen; Balling, Derek J. "High Performance MySQL", O'Reilly, 2008.
Seyed M. M. "Saied" Tahaghoghi and Hugh E. Williams. Learning MySQL. O’Reilly, 2006
Nathan Haines. “Beginning Ubuntu for Windows and Mac Users”. Apress 2015. Available as electronic resource at UAB library
William E. Shotts. “The Linux Command Line”. Second Internet Edition. 2013. http://linuxcommand.org/tlcl.php
Dan C. Marinescu. “Cloud Computing. Theory and Practice”. Morgan-Kaufmann. 2018.
R. Buyya, R. N. Calheiros, A. V. Dastjerdi. “Big data. Principles and paradigms”. Morgan-Kaufmann. 2016.
In this subject, we will use the latest version of the following software platforms and tools:
-Ubuntu Linux
-SLURM
-Linux development environment