2023/2024

Degree | Type | Year | Semester |
---|---|---|---|
2503740 Computational Mathematics and Data Analytics | OT | 4 | 1 |

- Name: Jose Barrera Gomez
- Email: jose.barrera@uab.cat

The teaching language can be checked through this link. To consult it you will need to enter the CODE of the subject. Please note that this information is provisional until 30 November 2023.

Students are expected to be familiar with the binomial and normal distributions, as well as with R.

The main aims of the course are:

- Learn about the main types of study designs in the field of Epidemiology.

- Learn about the potential impact of both missing data and measurement error on the results of a statistical analysis.

- Learn about the main indicators to measure the presence of a disease or an exposure.

- Learn about the main indicators to measure the association between exposure and disease, especially in the case where both exposure and outcome are binary.

- Be able to identify the appropriate statistical tools for the assessment of the association between a given exposure (potential risk or protective factor) and a given health outcome, according to the characteristics of the study design, in the context of epidemiological studies.

- Learn about the design and implementation of an exact test according to the study design.

- Learn about the design and implementation of simulation studies related to concepts such as empirical power or sample size calculation.

- Be able to search scientific papers using PubMed efficiently.

- Become familiar with reading scientific papers.

- Be able to apply the concepts studied in the subject to solve exercises based on real epidemiological data.

- Improve efficiency when programming in R to solve the practical tasks proposed during the course.

- Be able to write reproducible statistical reports using LaTeX and the R package knitr.

- Design, develop, maintain and evaluate software systems that allow large volumes of heterogeneous data to be represented, stored and handled in accordance with the established requirements.
- Formulate hypotheses and think up strategies to confirm or refute them.
- Make effective use of bibliographical resources and electronic resources to obtain information.
- Students must be capable of applying their knowledge to their work or vocation in a professional way and they should have skills in building arguments and solving problems within their area of study.
- Students must be capable of collecting and interpreting relevant data (usually within their area of study) in order to make statements that reflect relevant social, scientific or ethical issues.
- Students must be capable of communicating information, ideas, problems and solutions to both specialised and non-specialised audiences.
- Students must develop the necessary learning skills to undertake further training with a high degree of autonomy.
- Use computer applications for statistical analysis, numerical and symbolic computation, graphic visualisation, optimisation and others to experiment and solve problems.
- Using criteria of quality, critically evaluate the work carried out.
- Work cooperatively in a multidisciplinary context assuming and respecting the role of the different members of the team.

1. Analyze data corresponding to epidemiological studies or clinical trials.
2. Apply statistical methods to the analysis of gene-expression data.
3. Draft the technical report based on a statistical analysis.
4. Extract relevant conclusions from applied problems through the application of statistical methods.
5. Extract relevant conclusions from applied problems, through the application of advanced statistical methods.
6. Identify the special methodological characteristics of statistical analysis according to the distinct areas of application.
7. Identify the techniques of statistical inference most commonly used in epidemiology studies.
8. Identify the utility of statistical knowledge in bioinformatics and in health sciences.
9. Identify, use and interpret the criteria for evaluating the degree of fulfillment of the requirements needed to apply each advanced statistical procedure.
10. Interpret results with advanced methodologies, and extract conclusions.
11. Interpret statistical results in applied contexts.
12. Make effective use of bibliographical resources and electronic resources to obtain information.
13. Prepare technical reports that clearly express the results and conclusions of the study using terminology pertaining to the field of application.
14. Propose statistical models appropriate for epidemiological studies.
15. Recognise the most widely used databases in the field of health sciences.
16. Recognize the advantages and disadvantages of distinct statistical methodologies when applied to the various disciplines.
17. Students must be capable of applying their knowledge to their work or vocation in a professional way and they should have skills in building arguments and solving problems within their area of study.
18. Students must be capable of collecting and interpreting relevant data (usually within their area of study) in order to make statements that reflect relevant social, scientific or ethical issues.
19. Students must be capable of communicating information, ideas, problems and solutions to both specialised and non-specialised audiences.
20. Students must develop the necessary learning skills to undertake further training with a high degree of autonomy.
21. Understand statistical software for programming functions and advanced procedures.
22. Using criteria of quality, critically evaluate the work carried out.
23. Work cooperatively in a multidisciplinary context, taking on and respecting the role of the distinct members in the team.

*

1. Introduction to the contents. Introduction to reproducible research using the R package knitr.

2. PubMed: Searching scientific papers. Structure of a paper.

3. Classification of studies

(a) Topics in biostatistics

(b) Epidemiological studies

i. Notation

ii. Classification criteria

iii. Types of epidemiological study design: Randomised epidemiological trials, Cohort, Case-control, Case-crossover, Cross-sectional, Ecological

(c) Studies classification diagram

4. Classification of variables and related regression models

(a) According to the measure type

(b) According to the role in the study

(c) Types of explanatory variables

(d) Types of regression models according to the metric of the response variable

(e) Time-to-event response variables

5. Dealing with missing data

(a) Introduction

(b) Types of missing data

(c) Dealing with missing data

6. Example of statistical methods in Health Sciences: Integration of multiple imputation in cluster analysis

(a) Overview of cluster analysis

(b) Overview of multiple imputation

(c) Integration of multiple imputation in cluster analysis

(d) Software

7. Measures of disease presence

(a) Introduction

(b) Prevalence

i. Definition

ii. Estimation

iii. Comments

(c) Cumulative incidence

i. Definition

ii. Comments

(d) Incidence rate

i. Definition

ii. Comments

iii. Comparing two incidence rates

8. Measures of association between exposure and disease

(a) Introduction

(b) The relative risk

i. Definition

ii. Comments

(c) The odds ratio

i. The odds

ii. The odds ratio

iii. Comments

(d) Confidence intervals for OR and RR

(e) The attributable risk

i. Population attributable risk

ii. Exposure attributable risk

9. Causality, confounding and interaction

(a) Introduction

(b) Causality

(c) Confounding

(d) Interaction

10. Example of statistical methods in Health Sciences: Regression models with transformed variables. Interpretation and software

(a) Overview of the linear regression model

(b) Logarithm transformation in linear regression models. Why?

(c) Interpretation of results in the original scale of the variables

(d) Software

* Unless the requirements enforced by the health authorities demand a prioritization or reduction of these contents.

- Theory sessions: These sessions introduce the concepts of the subject together with illustrative examples. Some exercises are also proposed, most of which require R. The methodology is based on the presentation and discussion of slides, as well as the presentation of additional materials (mainly news published in online media and scientific papers found through PubMed).

- Practice sessions: In these sessions, several practical examples and exercises will be proposed. They involve using R, searching PubMed, reading papers and carrying out statistical analyses. Some of the proposed exercises will be mandatory.

- Seminar attendance: The Department of Mathematics and the UAB Statistical Service organize statistical seminars. The students and the teacher will attend some of them, depending on the topic and the schedule.

* The proposed teaching methodology may be modified depending on the restrictions on face-to-face activities enforced by health authorities.

**Annotation**: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.

Title | Hours | ECTS | Learning Outcomes |
---|---|---|---|
Type: Directed | | | |
Theory sessions | 28 | 1.12 | |
Type: Supervised | | | |
Practice sessions | 28 | 1.12 | |
Type: Autonomous | | | |
Personal work | 94 | 3.76 | |

- Group assignments during the course. The teacher may assess individual participation through oral questions.

- Exam (face-to-face).

- Optional compensatory exam (face-to-face). If the student takes the compensatory exam, its mark replaces the mark of the ordinary exam, regardless of the scores obtained in the two exams.

- The final score for the course, Q, out of 10, will be:

Q = min{T, E}, if T is less than 4 or E is less than 3.5,

Q = (T + E) / 2, if T is greater than or equal to 4 and E is greater than or equal to 3.5,

where T and E are the scores, out of 10, for the assignments and the exam, respectively.

- This subject does not offer the possibility of a single assessment (i.e. "evaluación única").
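
The rule for Q above can be sketched as a small function. This is only an illustrative sketch in Python: the name `final_score` and the range check are assumptions, not part of the course materials, and E is taken to be whichever exam mark counts (the compensatory exam mark if that exam was taken).

```python
def final_score(t: float, e: float) -> float:
    """Return the final course score Q, out of 10.

    t: score for the assignments (T), out of 10
    e: score for the exam that counts (E), out of 10
    """
    # Assumption: both scores are expected to lie in [0, 10].
    for name, value in (("T", t), ("E", e)):
        if not 0 <= value <= 10:
            raise ValueError(f"{name} must be between 0 and 10, got {value}")

    # If either score is below its threshold, Q is the lower of the two.
    if t < 4 or e < 3.5:
        return min(t, e)

    # Otherwise Q is the plain average of the two scores.
    return (t + e) / 2


if __name__ == "__main__":
    print(final_score(7.0, 5.0))  # both thresholds met: average
    print(final_score(3.0, 8.0))  # T below 4: minimum of T and E
```

For example, T = 7 and E = 5 give Q = 6, whereas T = 3 and E = 8 give Q = 3 because T is below the threshold of 4.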

* Student assessment may be modified depending on the restrictions on face-to-face activities enforced by health authorities.

Title | Weighting | Hours | ECTS | Learning Outcomes |
---|---|---|---|---|
Assignments in group | 30% | 0 | 0 | 1, 2, 22, 21, 13, 5, 4, 8, 6, 7, 9, 11, 10, 14, 20, 19, 17, 18, 16, 15, 3, 23, 12 |
Exam (or compensatory exam) | 50% | 0 | 0 | 5, 4, 8, 6, 7, 11, 10, 19, 16 |
Exercises in group | 20% | 0 | 0 | 1, 21, 13, 5, 4, 8, 6, 7, 9, 11, 10, 14, 19, 17, 18, 16, 3, 23, 12 |

Basic: All concepts developed in the class sessions will be published on Moodle, including the slides that will be discussed in the theory sessions.

Further readings: Students interested in going further can explore the following items.

- Agresti, Alan. Categorical Data Analysis. Wiley, 3rd Edition, 2013.

- Breslow, N., Day, N. Statistical methods in cancer research. International Agency for Research on Cancer, 1980.

- Clayton, D., Hills, M. Statistical models in epidemiology. Oxford University Press, 1993.

- Dalgaard, P. Introductory Statistics with R. Springer, 3rd Edition, 2002.

- dos Santos Silva, I. Cancer epidemiology: principles and methods. International Agency for Research on Cancer, 1999.

- Gordis, L. Epidemiology. W.B. Saunders, 2004.

- Lachin, J.M. Biostatistical Methods: The Assessment of Relative Risks. Wiley, 2000.

- Motulsky, H.J. Intuitive Biostatistics. Oxford University Press, 1995.

- Rothman, K., Greenland, S. Modern epidemiology. Lippincott Williams & Wilkins, 1998.

- Rothman, K. Epidemiology: an introduction. Oxford University Press, 2002.

- Wassertheil-Smoller, S. Biostatistics and epidemiology: a primer for health and biomedical professionals. Springer, 3rd Edition, 2004.

- R
- RStudio
- LaTeX