Degree | Type | Year | Semester |
---|---|---|---|
2503852 Applied Statistics | OB | 2 | 1 |
Fundamentals of descriptive and inferential statistics and of probability, as well as a basic knowledge of programming in the R language.
The objective of the course is to study the modeling and analysis of data using the theory of linear models, together with applications in various fields (economics, health, engineering, and science in general). Methods and techniques are introduced through examples and developed by solving a set of proposed problems, together with computer work carried out in the R environment. The simple regression model is presented first, both because of its numerous applications and because it is a good introduction to the multiple model. Multiple regression, including some of its variants (polynomial, with interactions, with dummy variables, etc.), constitutes the second part of the course. In all modeling procedures, the goodness of fit, the correct specification of the model, the theoretical assumptions and the detection of "special" (anomalous and influential) data are analyzed, and possible remedies are proposed when a flagrant violation of the model hypotheses is found.
1. The simple linear regression model.
- Introduction to regression: Exploring data.
- Simple linear regression: Model, hypotheses and parameters.
- Point estimation: Least squares and maximum likelihood methods.
- Inference about the parameters under the Gauss-Markov hypothesis: Intervals and tests.
- New observations: The confidence interval for the mean response and the prediction intervals. Simultaneous inferences. Confidence and prediction bands.
- Analysis of variance (ANOVA) in simple regression.
- Model diagnostics: Graphical evaluation of the linearity and the model hypotheses through the analysis of the residuals. The lack of fit test.
- Anomalous and influential data.
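As an illustration of the least-squares item in the list above, the following is a minimal sketch of the closed-form estimates for the simple model y = b0 + b1·x + error. The practical work in this course is done in R, where `lm(y ~ x)` fits this model; the Python version below is only a language-neutral illustration, and the function names `simple_ols` and `r_squared` are hypothetical.

```python
from statistics import mean


def simple_ols(x, y):
    """Least-squares estimates (b0, b1) for the model y = b0 + b1 * x + error."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squares of x and cross-product sum of x and y, both centered
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x must not be constant")
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


def r_squared(x, y, b0, b1):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_bar = mean(y)
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up data lying exactly on the line y = 1 + 2x
    xs = [1, 2, 3, 4]
    ys = [3, 5, 7, 9]
    b0, b1 = simple_ols(xs, ys)
    print(b0, b1, r_squared(xs, ys, b0, b1))
```

The residuals `yi - (b0 + b1 * xi)` computed inside `r_squared` are the same quantities examined in the diagnostics items of the list.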
2. Multiple linear regression.
- Preliminary steps in multiple regression: Exploration of data with multidimensional visualization tools.
- Model and estimators of the coefficients by least squares. Interpretation of the coefficients in the multiple linear model.
- Distributions of the estimators of the coefficients, predictions and residuals: Application of the properties of idempotent matrices.
- Inference in the multiple linear model. The model ANOVA.
- Linear constraints on the coefficients: The incremental variability principle.
- Discussion on the model hypotheses: Analysis of the residuals. Box-Cox transformations.
- The multicollinearity problem: Detection and solutions.
- Dummy variables in regression.
- Variable selection: Mallows' Cp statistic, cross-validation and automatic stepwise selection procedures.
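The item above on the estimators of coefficients, predictions and residuals rests on a few matrix identities. As a compact reminder, under notation that this guide does not fix and that is assumed here (an n × p design matrix X of full column rank, a response vector y, and uncorrelated errors with common variance σ²), the standard results are:

```latex
\begin{aligned}
\hat{\beta} &= (X^{\top}X)^{-1}X^{\top}y, & \operatorname{Var}(\hat{\beta}) &= \sigma^{2}(X^{\top}X)^{-1},\\
H &= X(X^{\top}X)^{-1}X^{\top}, & H^{\top} &= H, \quad H^{2} = H, \quad \operatorname{tr}(H) = p,\\
\hat{y} &= Hy, & \operatorname{Var}(\hat{y}) &= \sigma^{2}H,\\
e &= (I-H)y, & \operatorname{Var}(e) &= \sigma^{2}(I-H).
\end{aligned}
```

Because H and I − H are symmetric and idempotent, the residual sum of squares divided by σ² follows, under normal errors, a chi-square distribution with n − p degrees of freedom.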
The subject has two weekly hours of theory and problems, in which linear methods and tools are introduced and analyzed. Problem lists will be handed out during the course, and their solutions must be submitted. Practical sessions will be carried out using the R programming language. Assignments to be submitted will be set on both the theoretical exercises and the practical computer work. Students will also carry out additional autonomous work consisting of bibliographical research and exam preparation.
The course material (theory notes, lists of problems and computer tasks) will be available in the Moodle classroom.
The gender perspective goes beyond the contents of courses, since it also implies a revision of teaching methodologies and of the interactions between students and lecturers, both inside and outside the classroom. In this sense, participative teaching methodologies that foster an egalitarian, less hierarchical classroom environment and that avoid gender-stereotyped examples and sexist vocabulary are usually more favorable to the full integration and participation of female students in the classroom. For this reason, their effective implementation will be attempted in this course.
Title | Hours | ECTS | Learning Outcomes |
---|---|---|---|
Type: Directed | |||
Supervised computer sessions | 26 | 1.04 | 1, 13, 15, 6, 14, 22 |
Theoretical classes | 26 | 1.04 | 3, 2, 5, 4, 18, 7, 10, 11, 12, 13, 6, 25 |
Type: Autonomous | |||
Computer work | 32 | 1.28 | 3, 2, 5, 4, 1, 8, 9, 24, 11, 16, 17, 19, 20, 6, 23, 22, 25 |
Personal work | 36 | 1.44 | 18, 7, 14 |
Problem solving | 18 | 0.72 | 4, 10, 11, 12, 13, 21, 19, 6, 25 |
PR: Submission of the theoretical and practical (with R) exercises. Maximum PR score: 2 points. This part cannot be retaken.
P1: Partial exam on simple regression (theory, exercises and computer practice). Maximum P1 score: 3 points.
P2: Partial exam on multiple regression (theory, exercises and computer practice). Maximum P2 score: 5 points.
The course grade is calculated as NC = PR + P1 + P2. To pass, NC must be greater than or equal to 5 and the grade of each partial exam must be greater than or equal to 3.5 (out of 10).
At the end of the semester there will be a recovery test in the form of a synthesis test, PS (theory, exercises and computer practice), covering the contents of the entire course, with a maximum score of 8 points. It is intended for students who have not passed the course through the partial exams or who want to improve their grade. Only students who have taken part in at least 2/3 of the evaluation activities may sit the synthesis test.
The final grade of students who sit the synthesis test is calculated as NF = PR + max(PS, P1 + P2).
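The two grade formulas above can be checked with a short calculation. The following is a sketch in Python (the course itself uses R). The function names are hypothetical, and it assumes that "out of 10" means each partial exam is rescaled from its own maximum (3 points for P1, 5 points for P2) to a 10-point scale before being compared with 3.5. This guide does not say whether the 3.5 minimum also applies to NF, so the sketch does not check it.

```python
P1_MAX = 3.0   # maximum score of partial exam 1
P2_MAX = 5.0   # maximum score of partial exam 2
MIN_PARTIAL_ON_10 = 3.5   # minimum grade per partial, on a 10-point scale
MIN_COURSE_GRADE = 5.0    # minimum value of NC needed to pass


def course_grade(pr, p1, p2):
    """Return (NC, passed) for continuous assessment, with NC = PR + P1 + P2."""
    nc = pr + p1 + p2
    # Assumption: each partial is rescaled to a 10-point scale for the threshold
    p1_on_10 = 10 * p1 / P1_MAX
    p2_on_10 = 10 * p2 / P2_MAX
    passed = (
        nc >= MIN_COURSE_GRADE
        and p1_on_10 >= MIN_PARTIAL_ON_10
        and p2_on_10 >= MIN_PARTIAL_ON_10
    )
    return nc, passed


def final_grade(pr, p1, p2, ps):
    """Return NF = PR + max(PS, P1 + P2) for a student who sits the synthesis test."""
    return pr + max(ps, p1 + p2)


if __name__ == "__main__":
    # Made-up scores for illustration only
    print(course_grade(1.5, 2.0, 3.0))
    print(final_grade(1.5, 1.0, 2.0, 5.0))
```

Note that the synthesis test can only replace the sum P1 + P2; the PR score is carried over unchanged in both formulas.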
Honor grades will be granted at the first complete evaluation. Once given, they will not be withdrawn even if another student obtains a higher grade after the PS exam is taken into account.
Attention: "Without prejudice to other disciplinary measures deemed appropriate, and in accordance with current academic regulations, any irregularity committed by a student that may alter the grade of an evaluation activity will be scored with a zero. Therefore, plagiarizing, copying or allowing others to copy a practical assignment or any other evaluation activity will result in a fail with a zero for that activity, which cannot be recovered in the same academic year. If the activity has a minimum associated score, the student will fail the subject."
Title | Weighting | Hours | ECTS | Learning Outcomes |
---|---|---|---|---|
Final test | 80% (recovery partial exams) | 4 | 0.16 | 4, 1, 10, 11, 12, 13, 15, 6, 22, 25 |
Partial exam 1 | 30% | 4 | 0.16 | 2, 4, 1, 10, 24, 13, 17, 22, 25 |
Partial exam 2 | 50% | 4 | 0.16 | 3, 2, 5, 4, 1, 8, 9, 10, 11, 13, 15, 16, 17, 6, 22, 25 |
Tasks delivery | 20% | 0 | 0 | 2, 5, 4, 1, 18, 7, 8, 9, 24, 15, 17, 21, 19, 20, 6, 14, 23, 22, 25 |
Montgomery, D., Peck, A., Vining, G.; Introduction to Linear Regression Analysis. Wiley, 2001.
Clarke, B. R.; Linear Models: The Theory and Applications of Analysis of Variance. Wiley, 2008.
Hay-Jahans, C.; An R Companion to Linear Statistical Models. Chapman and Hall, 2012.
Fox, J. and Weisberg, S.; An R Companion to Applied Regression. Sage Publications, 2nd edition, 2011.
Peña, D.; Regresión y diseño de Experimentos. Alianza Editorial (Manuales de Ciencias Sociales), 2002.
Complementary references:
Sen, A., Srivastava, M.; Regression Analysis: Theory, Methods and Applications. Springer, 1990.
Neter, J., Kutner, M. H., Nachtsheim, C. J., Wasserman, W.; Applied Linear Statistical Models. Irwin (4th edition), 1996.
Faraway, J.; Linear Models with R. Chapman & Hall/CRC (2nd ed.), 2014.
Rao, C. R., Toutenburg, H., Shalabh, Heumann, C.; Linear Models and Generalizations. Springer, 2008.