2020/2021

Modelling and Inference

Code: 104392 ECTS Credits: 6
Degree | Type | Year | Semester
2503740 Computational Mathematics and Data Analytics | OB | 2 | 1
The proposed teaching and assessment methodology that appears in the guide may be subject to change as a result of the restrictions on face-to-face class attendance imposed by the health authorities.

Contact

Name:
David Moriña Soler
Email:
David.Morina@uab.cat

Use of Languages

Principal working language:
Catalan (cat)
Some groups entirely in English:
No
Some groups entirely in Catalan:
Yes
Some groups entirely in Spanish:
No

Prerequisites

A good knowledge of the subjects studied during the first year, especially Probability and Calculus, is considered very important.

Objectives and Contextualisation

This is the first subject of the degree devoted to Statistical Inference, the part of Statistics that makes it possible to obtain, in a controlled way, information about a population from the data of a "representative" sample. The subject plays a central role in the degree, since it introduces concepts and techniques that will be used in many later subjects. Specifically, it begins with an introduction to Statistics and then covers parameter estimation, both point estimation and estimation by confidence intervals, as well as classical parametric hypothesis testing for one and two normal and dichotomous populations, and independence tests. Finally, the simple linear regression model is introduced.

The degree of in-person attendance for teaching and for assessed activities will be adapted following the recommendations of the health authorities, in order to guarantee everyone's safety.

Competences

  • Calculate and reproduce certain mathematical routines and processes with ease.
  • Formulate hypotheses and think up strategies to confirm or refute them.
  • Relate new mathematical objects with other known objects and deduce their properties.
  • Students must be capable of applying their knowledge to their work or vocation in a professional way, and they should be able to build arguments and solve problems within their area of study.
  • Students must be capable of collecting and interpreting relevant data (usually within their area of study) in order to make statements that reflect on relevant social, scientific or ethical issues.
  • Students must have and understand knowledge of an area of study that builds on general secondary education and that, while relying on advanced textbooks, also includes some aspects from the forefront of the field.
  • Use computer applications for statistical analysis, numerical and symbolic computation, graphic visualisation, optimisation and other purposes to experiment and solve problems.
  • Using criteria of quality, critically evaluate the work carried out.

Learning Outcomes

  1. Analyse data using inference techniques for one or two samples.
  2. Choose the appropriate statistical software to analyse the data through inference techniques.
  3. Describe the basic properties of point and interval estimators.
  4. Identify statistical distributions.
  5. Identify statistical inference as an instrument of prognosis and prediction.
  6. Interpret obtained results and provide conclusions that refer to the experimental hypothesis.
  7. Students must be capable of applying their knowledge to their work or vocation in a professional way, and they should be able to build arguments and solve problems within their area of study.
  8. Students must be capable of collecting and interpreting relevant data (usually within their area of study) in order to make statements that reflect on relevant social, scientific or ethical issues.
  9. Students must have and understand knowledge of an area of study that builds on general secondary education and that, while relying on advanced textbooks, also includes some aspects from the forefront of the field.
  10. Understand the distinct methods of data collection.
  11. Use statistical software to manage databases.
  12. Use statistical software to obtain the summary indexes of study variables.
  13. Use the properties of the distribution function.
  14. Use the properties of the density function.
  15. Using criteria of quality, critically evaluate the work carried out.
  16. Validate and manage information in order to process it statistically.

Content

Preliminaries of Probability (review): Probability and random variables. The law (distribution) of a random variable. Discrete distributions. Density and probability functions. Expectation and variance. Moment generating function. Examples.

Topic 1. Introduction to Statistics.

1. Descriptive statistics and inferential statistics.

1.1. Basic concepts in inference: statistical population and sample; parameters, statistics and estimators.

1.2. Statistical models: parametric and non-parametric.

2. Most common statistics: the sample moments. The order statistics.

3. Distribution of some statistics.

3.1. From a sample of a Normal population: Fisher's theorem.

3.2. The Central Limit Theorem: asymptotic normality of sample moments and proportion.

Topic 2. Estimation by confidence intervals.

1. Concept of confidence interval.

2. The "pivot" method for the construction of confidence intervals.

3. Confidence intervals for the parameters of a population.

3.1. For the mean of a Normal population with known standard deviation.

3.2. For the mean of a Normal population with unknown standard deviation.

3.3. For the variance of a Normal population with unknown mean.

3.4. For the variance of a Normal population with known mean.

3.5. Asymptotic confidence intervals.

4. Confidence intervals using Chebyshev's inequality.

5. Confidence intervals for the parameters of two populations.

5.1. Confidence intervals with independent samples.

5.2. Confidence intervals for the difference of means of two Normal populations with paired data.

Topic 3. Point estimation.

1. Point estimators: definition and "good" properties.

1.1. Bias.

1.2. Comparison of unbiased estimators. Relative efficiency.

1.3. The Cramér-Rao bound.

1.4. Comparison of biased estimators: the mean squared error.

1.5. Consistency of an estimator.

2. Methods to obtain estimators.

2.1. Method of moments.

2.2. Method of maximum likelihood.

Topic 4. Hypothesis tests.

1. Introduction.

2. Tests for the parameters of a population.

2.1. For the mean of a Normal population with known standard deviation.

2.2. For the mean of a Normal population with unknown standard deviation.

2.3. Asymptotic tests for the mean of a population when the sample is large.

2.4. For the variance of a Normal population.

3. Tests for the parameters of two populations.

3.1. Hypothesis tests with independent samples.

3.2. Hypothesis tests with paired data.

Topic 5. Simple linear regression model.

1. Purpose of the model.

2. Ordinary least squares (OLS) estimators.

3. Inference based on the linear regression model.

4. Forecasting.

 

IMPORTANT: In teaching, the gender perspective involves reviewing androcentric biases and questioning assumptions and hidden gender stereotypes. This review involves including in the contents of the subject the knowledge produced by women scientists, often forgotten, and seeking recognition of their contributions, as well as of their works in the bibliographical references. Efforts will also be made to introduce, in the most practical part of the subject, the analysis and comparison of statistical data by sex, discussing in class the causes and the social and cultural mechanisms that can sustain the observed inequalities.

Methodology

The subject is organised into theory classes, problem classes and practical sessions.

In theory classes we will introduce the concepts and techniques described in the course programme. Since the content is essentially that of a standard first course in statistical inference, the course can be followed using the recommended basic bibliography. The material corresponding to each topic explained in class will also be posted on the Virtual Campus.

The problem classes are intended for working on and understanding statistical concepts. The problem lists will be posted on the Virtual Campus, together with the solutions once the problems have been solved in class.

The objective of the practical sessions is to use the statistical software R to obtain and clarify the results of the procedures introduced in the theory and problem classes. The statement of each practical session will be posted on the Virtual Campus in advance.

IMPORTANT: To work more comfortably with R, the RStudio interface is recommended: it is free, open source and runs on Windows, Mac and Linux. https://www.rstudio.com/

OBSERVATION: The gender perspective in teaching goes beyond the contents of the subjects, since it also implies reviewing the teaching methodologies and the interactions between students and teaching staff, both inside and outside the classroom. Participatory teaching methodologies, which create an egalitarian, less hierarchical classroom environment and avoid gender-stereotyped examples and sexist vocabulary, aim to develop critical reasoning and respect for the diversity and plurality of ideas, people and situations. They tend to be more favourable to the integration and full participation of students in the classroom, and their effective implementation in this subject will therefore be sought.

Activities

Title | Hours | ECTS | Learning Outcomes
Type: Directed
Practical classes | 12 | 0.48 | 1, 15, 2, 5, 6, 9, 7, 8, 11, 12, 16
Problems class | 18 | 0.72 | 15, 10, 3, 4, 5, 6, 9, 7, 8, 14, 13
Theory classes | 30 | 1.2 | 1, 15, 10, 3, 4, 5, 6, 9, 7, 8, 14, 13
Type: Autonomous
Exams | 15 | 0.6 | 1, 15, 10, 3, 4, 5, 6, 9, 7, 8, 14, 13
Problems resolution | 25 | 1 | 1, 15, 10, 3, 4, 5, 6, 9, 7, 8, 14, 13
Workshop resolution | 20 | 0.8 | 1, 15, 10, 3, 2, 4, 5, 6, 9, 7, 8, 14, 13, 11, 12, 16

Assessment

The continuous assessment grade is obtained from an assessment of the problems, which gives a grade C, and an assessment of the practical sessions, which gives a grade P. Grade C has a weight of 20% and grade P a weight of 30%. The final exam grade E1 is worth 50% of the final grade. From the grades C, P and E1, the grade for the subject, G, is obtained as follows:

G = 0.50 × E1 + 0.20 × C + 0.30 × P

Resit and/or improvement of the exam grade:

The student passes the subject if G is greater than or equal to 5. Otherwise, or if the student wishes to improve the grade, the exam grade E1 can be improved by taking a resit exam, whose grade will be E2. With this resit grade, the final grade of the subject is obtained as:

FG = 0.50 × max (E1, E2) + 0.20 × C + 0.30 × P
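
As an illustration only, the two formulas above can be sketched in Python. The function name `course_grade` and the sample marks are hypothetical and not part of the guide; the sketch assumes all grades are on a 0 to 10 scale and that the resit grade E2 is optional.

```python
def course_grade(e1, c, p, e2=None):
    """Weighted grade using the weights stated in the guide.

    e1: final exam grade, c: problems grade, p: practical grade,
    e2: optional resit exam grade (only the better exam counts).
    """
    exam = e1 if e2 is None else max(e1, e2)
    return 0.50 * exam + 0.20 * c + 0.30 * p


# Hypothetical marks, for illustration only.
g = course_grade(e1=4.0, c=6.0, p=5.0)           # 0.5*4 + 0.2*6 + 0.3*5 = 4.7
fg = course_grade(e1=4.0, c=6.0, p=5.0, e2=6.0)  # 0.5*6 + 0.2*6 + 0.3*5 = 5.7

print(f"G  = {g:.2f}, passed: {g >= 5}")
print(f"FG = {fg:.2f}, passed: {fg >= 5}")
```

In this hypothetical case the student does not pass with G but does pass after the resit, because only the exam component is replaced while C and P stay fixed.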

Observation 1: The continuous assessment grades C and P cannot be retaken.

Observation 2: A student is considered to have sat the subject if they take either of the two exams that give rise to the grades E1 or E2. Otherwise, the student will be recorded as Not Presented, even if they have a continuous assessment grade (C and/or P).

Assessment Activities

Title | Weighting | Hours | ECTS | Learning Outcomes
Final exam / Reassessment (E) | 0.50 | 10 | 0.4 | 1, 15, 10, 3, 4, 5, 6, 9, 7, 8, 14, 13
Practical exam (P) | 0.30 | 12 | 0.48 | 1, 15, 10, 3, 2, 4, 5, 6, 9, 7, 8, 14, 13, 11, 12, 16
Problems delivery (C) | 0.20 | 8 | 0.32 | 1, 15, 10, 3, 4, 5, 6, 9, 7, 8, 14, 13

Bibliography

Berger, R.L., Casella, G.: Statistical Inference. Duxbury Advanced Series. 2002.

Dalgaard, P.: Introductory Statistics with R. Springer. 2008.

Daniel, W.W.: Biostatistics. Wiley. 1974.

DeGroot, M.H., Schervish, M.J.: Probability and Statistics. Pearson Academic. 2010.

Peña, D.: Estadística. Fundamentos de estadística. Alianza Universidad. 2001.

R Tutorial. An introduction to Statistics. https://cran.r-project.org/manuals.html. June 2019.

Silvey, S.D.: Statistical Inference. Chapman & Hall. 1975.