This version of the course guide is provisional until the period for editing the new course guides ends.

Statistical Inference 1

Code: 104855
ECTS Credits: 6
Academic year: 2024/2025

Degree                        Type   Year
2503852 Applied Statistics    FB     1

Contact

Name:
Anna Lopez Ratera
Email:
anna.lopez.ratera@uab.cat

Teachers

Queralt Miro Catalina

Teaching groups languages

You can view this information at the end of this document.


Prerequisites

Good knowledge of the first-semester subjects is considered very important, especially Introduction to Probability, Calculus 1 and Exploratory Data Analysis.


Objectives and Contextualisation

This is the first subject in the degree devoted to Statistical Inference, the branch of Statistics that makes it possible to obtain, in a controlled way, information about a population from the data of a "representative" sample. The subject plays a central role in the degree, since it introduces concepts and techniques that will be used in many later subjects. Specifically, it begins with an introduction to Statistics and then covers parameter estimation, both point estimation and estimation by confidence intervals, and classical parametric hypothesis testing for one and two normal and dichotomous populations, ending with chi-square tests.


Learning Outcomes

  1. CM08 (Competence) Determine the sample size and the sampling strategies required to conduct a specific study in the field of applications.
  2. KM09 (Knowledge) Discover the fundamental properties of estimators: invariance, sufficiency, efficiency, bias, mean square error and asymptotic properties, in the classical and Bayesian domains.
  3. KM11 (Knowledge) Identify exact and asymptotic sampling distributions of different statistics.
  4. SM09 (Skill) Analyse data through different inference techniques using statistical software.
  5. SM10 (Skill) Use different estimation methods depending on the context of application.

Content

1. Statistical inference: introduction and basic concepts

1.1. Introduction, objectives and syllabus of the subject

1.2. Population and sample.

1.3. Statistics.

1.4. Distribution of the proportion, mean and sample variance: normal distribution and Central Limit Theorem.

2. Point estimation

2.1. The problem of point estimation. Parameter and estimator

2.2. Properties of estimators

2.3. Estimator of a proportion

2.4. Estimators of a population mean and variance

2.5. How to find a good estimator: the method of moments and the method of maximum likelihood

3. Estimation by confidence intervals

3.1. Concept of confidence interval

3.2. Confidence interval for a proportion

3.3. Confidence intervals for the mean (with known or unknown population variance). Normal case and general case

3.4. Confidence interval for the variance. The normal case

3.5. Confidence interval for the difference of means (paired data, or independent samples with known variances, unknown and equal variances, or unknown and different variances). Normal case and general case

4. Hypothesis testing for one population. Basic concepts

4.1. The problem of hypothesis testing. Types of hypotheses. Type I and type II errors

4.2. Significance level and critical region. The p-value. The power function

4.3. Test for a proportion

4.4. Test for the population mean. The Z-test and Student's t-test

4.5. Determining the sample size to ensure a given level of confidence and accuracy

4.6. Test for the variance

4.7. Relationship between the acceptance region of a hypothesis test and the confidence interval

5. Hypothesis tests and confidence intervals to compare two populations

5.1. Comparison of the proportions of two independent populations

5.2. Comparison of the means of two populations from paired data

5.3. Comparison of the means of two independent populations

5.4. Comparison of the proportions of two independent populations

5.5. Comparison of the variances of two independent normal populations. The F-Test

6. Non-parametric tests based on the chi-square distribution

6.1. Pearson's chi-square goodness-of-fit test of a sample to a distribution

6.3. The chi-square test of independence for categorical data

6.4. The chi-square test of homogeneity for categorical data

7. Comparison of three or more means

7.1. One-way ANOVA test

7.2. Multiple comparison of means (Tukey test)

8. The Simple Linear Regression Model

8.1. Linear relationship between two numerical variables: Pearson's correlation coefficient

8.2. The Linear Regression Model

8.3. Estimation of parameters (least squares) and hypothesis tests

8.4. Goodness of fit of the model. Coefficient of determination and range of values

8.5. Point and interval prediction

8.6. The logarithmic transformation to improve the linear relationship

 

 

IMPORTANT: In teaching, the gender perspective involves reviewing androcentric biases and questioning hidden assumptions and gender stereotypes. This review involves including in the contents of the subject the knowledge produced by women scientists, which is often forgotten, and seeking recognition of their contributions and of their works in the bibliographical references. Efforts will also be made to introduce, in the more practical part of the subject, the analysis and comparison of statistical data by sex, discussing in class the causes and the social and cultural mechanisms that may sustain the observed inequalities.


Activities and Methodology

Title                  Hours   ECTS   Learning Outcomes
Type: Directed
Practical classes      12      0.48
Problem classes        18      0.72
Theory classes         30      1.2
Type: Autonomous
Exams                  15      0.6
Problem solving        25      1
Workshop resolution    20      0.8

The subject is structured around theory classes, problem classes and practical sessions.

In the theory classes we will introduce the concepts and techniques described in the course program. Since the content is essentially that of a standard first course in statistical inference, the course can be followed using the recommended basic bibliography. The material for each topic explained in class will also be posted on the Virtual Campus.

The problem classes are intended for working on and understanding statistical concepts. The problem lists will be posted on the Virtual Campus, together with the solutions once the problems have been solved in class.

The objective of the practical sessions is to use the statistical software R to obtain and clarify the results of the procedures introduced in the theory and problem classes. The statement of each practical session will be posted on the Virtual Campus in advance.

IMPORTANT: To work more comfortably with R, we recommend using the RStudio interface: it is free, open source and works on Windows, Mac and Linux. https://www.rstudio.com/

OBSERVATION: The gender perspective in teaching goes beyond the contents of the subjects, since it also implies a review of the teaching methodologies and of the interactions between students and teaching staff, both inside and outside the classroom. Participatory teaching methodologies create a more egalitarian, less hierarchical classroom environment and avoid gender-stereotyped examples and sexist vocabulary, with the aim of developing critical reasoning and respect for the diversity and plurality of ideas, people and situations. Such methodologies tend to favour the integration and full participation of students in the classroom, and their effective implementation in this subject will therefore be sought.

Annotation: Within the schedule set by the centre or degree programme, 15 minutes of one class will be reserved for students to evaluate their lecturers and their courses or modules through questionnaires.


Assessment

Continuous Assessment Activities

Title                           Weighting   Hours   ECTS   Learning Outcomes
Final exam / Reassessment (E)   0.50        10      0.4    KM09, KM11, SM10
Practical exam (P)              0.30        12      0.48   CM08, KM09, KM11, SM09, SM10
Problems delivery (C)           0.20        8       0.32   CM08, KM09, KM11, SM10

The continuous evaluation grade is obtained from a test on the problems, which gives a grade C, and a test on the practical sessions, which gives a grade P. Grade C has a weight of 20% and grade P a weight of 30%. The final exam grade, E1, is worth 50% of the final grade. From the grades C, P and E1, the grade of the subject, G, is obtained as follows:

G = 0.50 × E1 + 0.20 × C + 0.30 × P

Important: The evaluation can be either:

  • Continuous Evaluation, where the problems are handed in on three different days during the course, the practical exam is held on a date close to the end of the course, and the final exam is held on a different date from the practical exam, or
  • Single Evaluation, where the student takes all three tests (E1, C and P) on the same day as the final exam of the Continuous Evaluation. To opt for the Single Evaluation, the application must be submitted through the channel and within the deadlines established by the Faculty of Sciences.

Given the quantitative nature of the subject, the Continuous Evaluation is recommended by the teaching staff of the subject and by the degree programme.

Recovery and/or improvement of the exam grade:

The student passes the subject if G is greater than or equal to 5 and, at the same time, E1 is greater than 4. Otherwise, or if the student wants to improve the grade, the E1 exam grade can be improved through a recovery exam, whose grade will be E2. The final grade of the subject, FG, is then obtained as follows:

FG = 0.50 × max (E1, E2) + 0.20 × C + 0.30 × P
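
As an illustration only, the two formulas above can be sketched in Python. The function names `course_grade` and `passes` are hypothetical, grades are assumed to be on a 0 to 10 scale, and the pass rule is applied exactly as worded in this guide (course grade at least 5 and E1 strictly greater than 4); how the pass rule applies after the recovery exam is not modelled here.

```python
def course_grade(e1, c, p, e2=None):
    """Weighted course grade from the final exam (e1), problems (c) and
    practical exam (p). If a recovery exam grade e2 is given, the better
    of e1 and e2 is used, as in the FG formula."""
    exam = e1 if e2 is None else max(e1, e2)
    return 0.50 * exam + 0.20 * c + 0.30 * p


def passes(e1, c, p):
    """Pass rule before the recovery exam, as stated in the guide:
    course grade >= 5 and E1 > 4 (strict inequality assumed)."""
    return course_grade(e1, c, p) >= 5 and e1 > 4


# Hypothetical student: E1 = 3.5, C = 8, P = 7
print(round(course_grade(3.5, 8, 7), 2))        # grade before recovery
print(passes(3.5, 8, 7))                        # False, because E1 is not above 4
print(round(course_grade(3.5, 8, 7, e2=6), 2))  # grade with recovery exam E2 = 6
```

In this made-up example the weighted grade before recovery is above 5, but the student still does not pass because of the minimum required in E1; only the exam component changes when E2 is taken into account.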

Observation 1: C and P continuous assessment grades are not recoverable.

Observation 2: A student is considered to have been assessed in the subject if they sit either of the two exams that give rise to grades E1 or E2. Otherwise, the student is graded as Not Presented, even if they have continuous evaluation grades (C and/or P).


Bibliography

Novales, A.: Econometria. McGraw-Hill 2000

Peña, D.: Estadística. Fundamentos de estadística. Alianza Universidad. 2001.

Berger, R.L., Casella, G.: Statistical Inference. Duxbury Advanced Series. 2002.

Dalgaard, P.: Introductory Statistics with R. Springer. 2008.

Daniel, W.W.: Biostatistics. Wiley. 1974.

DeGroot, M.H., Schervish, M.J.: Probability and Statistics. Pearson Academic. 2010.

R Tutorial. An introduction to Statistics. https://cran.r-project.org/manuals.html. June 2019.

Silvey, S.D.: Statistical Inference. Chapman&Hall. 1975.


Software

The software to be used to work with the data will be Excel and the statistical program R.


Language list

Name                            Group   Language   Semester          Turn
(PAUL) Classroom practices      1       Catalan    second semester   afternoon
(PLAB) Practical laboratories   1       Catalan    second semester   afternoon
(PLAB) Practical laboratories   2       Catalan    second semester   afternoon
(TE) Theory                     1       Catalan    second semester   afternoon