Applied Statistics – Level 2

Institution: ESPOL European School of Political and Social Sciences

Language: English

Period: S4

Prerequisite: completed Applied Statistics Level 1

Course description

Quantitative research methods are a central tool for finding scientific answers to research questions. Research design and applied statistics are core components of all scholarly work and guide the research process, from formulating concrete and testable research questions to the empirical examination of theoretical arguments and hypotheses. The PPE curriculum includes three compulsory courses in Applied Statistics to provide students with a solid basis in both descriptive statistics and the logic of causal inference. To this end, the Applied Statistics Level 2 course starts with a recap of research design basics and the logic of causal inference. It then turns to quantitative techniques, focusing on regression analysis. The course builds on the insights from Applied Statistics Level 1, with a stronger focus on multivariate regression analysis and causal inference. It also covers the presentation and interpretation of research output. Data visualisation and analysis skills are of high practical value for students, including beyond academia, so a large part of the course is dedicated to them. Students will learn the statistical software R using the online platform DataCamp.

Learning objectives

By the end of this course, students should…

… have acquired an advanced knowledge of quantitative research methods in social sciences,

… be able to conduct basic empirical quantitative research for their research papers and BA thesis,

… feel empowered to learn and apply new methods,

… be able to effectively communicate their analyses both verbally and in writing,

… have improved skills in the statistical package R.

1. Introduction (Session 1 – 15.01.2025)

We discuss the course contents, structure, and assessments. Students will be briefly introduced to the statistical package R and the graphical interface RStudio. In addition, students will be familiarised with DataCamp.

2. Causal explanations and models (Session 2 – 22.01.2025)

The first part of the session is devoted to refreshing students’ knowledge of social science methodology and research design. We start with the two pillars of social science: theory and observation. We then look at the main building blocks (or the cycle) of research design and at different types of research questions (e.g. descriptive, explanatory), which require different statistical inferences. After that, we underline the importance of descriptive statistics and data visualisation (recap from Applied Statistics Level 1) and the possibility of making causal inferences.

In the second part, now that we know the importance of description, we move on to explanation. We discuss the logic of causal inference, the main threats to causal inference, and the importance of theory-driven empirical research (the link between theory and hypotheses). We conclude by discussing which research questions we want to answer, drawing schematic representations of the models that we want to analyse during the next sessions, and collecting data.

3. The simple regression model: bivariate regressions (Session 3 – 29.01.2025)

Now that we have data, how do we know whether our data is consistent with our theoretical argument? This session provides a recap of ‘Applied Statistics Level 1’, discussing hypothesis testing, p-values, and the correlation coefficient, together with their meaning, implications, and potential limitations. We then discuss the underlying logic of regression analysis and work through several examples of linear and non-linear regressions. We also discuss how to present and interpret the regression output.
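As a rough illustration of the logic of a bivariate regression, the sketch below fits a least-squares line and computes the correlation coefficient by hand. It uses plain Python only to stay dependency-free (the course itself works in R). The function names `simple_ols` and `pearson_r`, and the small data set, are made up for this example and do not come from the course materials.

```python
from statistics import mean


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between x and y."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data, invented purely for illustration.
years_of_education = [8, 10, 12, 14, 16, 18]
political_interest = [2.1, 2.9, 3.2, 4.1, 4.8, 5.2]

intercept, slope = simple_ols(years_of_education, political_interest)
r = pearson_r(years_of_education, political_interest)

# The slope is the expected change in the outcome per one-unit change in x;
# r summarises the strength and direction of the linear association.
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, r = {r:.3f}")
```

The slope and the correlation coefficient always share the same sign, but only the slope is expressed in the units of the outcome, which is why regression output is usually the better basis for substantive interpretation.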

4. Multivariate regressions: estimation & interpretation (Session 4 – 05.02.2025)

Now that we have a basic understanding of the core ideas underlying linear regression, we can connect our prior knowledge of research design with specific functions of multiple regression. We will focus on omitted variable bias as a core threat to causal inference and discuss how it may be addressed through control variables. We will then discuss the difference between statistical significance (hypothesis testing, p-values, confidence intervals) and practical significance (magnitude, effect size). Finally, we will investigate the estimation and interpretation of linear multivariate regression analyses and the reporting of results.
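Omitted variable bias can be made concrete with a small simulation. The sketch below is written in plain Python rather than R, and every number in it is an assumption chosen for illustration. It generates a confounder `z` that drives both `x` and `y`, then compares the slope on `x` without and with a control for `z`. The controlled slope is obtained by regressing residuals on residuals (the Frisch-Waugh-Lovell approach), which gives the same coefficient on `x` as a multiple regression of `y` on `x` and `z`.

```python
import random
from statistics import mean


def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(x, y):
    """Return the part of y that is not linearly explained by x."""
    a, b = ols(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


random.seed(42)
n = 5000

# Assumed data-generating process: the true effect of x on y is 1.0,
# and the confounder z affects both x (0.8) and y (2.0).
z = [random.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + random.gauss(0, 1) for zi in z]
y = [1.0 * xi + 2.0 * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]

# Naive bivariate regression: z is omitted, so its effect loads onto x.
_, naive_slope = ols(x, y)

# Controlled estimate: remove z from both x and y, then regress the residuals.
_, controlled_slope = ols(residuals(z, x), residuals(z, y))

print(f"naive slope      = {naive_slope:.2f}")
print(f"controlled slope = {controlled_slope:.2f}")
```

Under these assumptions the naive slope should land well above the true effect of 1.0 (about 1 + 2 × 0.8 / 1.64 ≈ 1.98 in expectation), while the controlled slope should sit close to 1.0.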

5. Multivariate regressions: beyond linear models (Session 5 – 26.02.2025)

In this session we continue working on multivariate regression analyses. We will delve into the assumptions underlying regression analysis and how to diagnose violations of these assumptions. We will discuss non-numeric variables and nonlinear effects, and how to model them. We will also discuss what to do if the outcome variable is not continuous. We often encounter dichotomous data that we would like to analyse (such as employment status or election participation), as well as ordinal or nominal variables (such as life satisfaction or vote choice). In this session, students will hear about several potential models for approaching different types of variables. The focus is on understanding that different models exist and how to select among them. We will look at several examples and practise how to report regression results and how to interpret the estimates.

6. Mediation and moderation analyses: interaction effects (Session 6 – 05.03.2025)

We now move our focus from controlling for confounders to estimating medi