Learning Resources for Social Science Students: Introduction to Quantitative Methods
what is statistics? it’s about us
Social science researchers use qualitative and quantitative methods to approximate the truth and/or answer their research question(s), often questions about populations.
Statistics is a quantitative method. Statisticians have built tools, tests, and measures to manage uncertainty; gauge precision; determine whether data can be trusted (that is, how much confidence to place in it); capture variability in a dataset; describe how data is distributed; and estimate possible margins of error.
I created this post to serve as a resource for students in introductory quantitative courses in the social sciences. I combed the internet for resources that provide a solid conceptual understanding without getting too bogged down in the mathematical details.
All photos are linked to the sites where they were originally found. I will be continuously editing and updating this post with resources.
fyi: statistics at an introductory level:
Statisticians have created their own language and concepts to describe data and the types of patterns seen in it. A chart on its own shows only what has been plotted; it says nothing about the context the data came from. This is why good statisticians do not rely on charts alone to convey what the data means: they explain to their audience how to read the chart and what it means. All types of visualizations are just tools that researchers use to help explain or display data.
All researchers who use statistics are very cautious about making any claims of truth or causal relationships. This may be reflected in language that sounds redundant or confusing, e.g. “fail to reject the null hypothesis”.
Statistics is all about managing uncertainty and accounting for it. How statisticians evaluate the validity and reliability of their data/dataset contributes to how they talk about the limitations of their data, and thus the observations or meanings they infer from it.
It’s important to remember that in social science research, quantitative data is often decontextualized from the circumstances in which it was collected. For instance, people asked the same question may understand or perceive it differently, and may not answer it the way we intended it to be understood. Furthermore, in statistics, social scientists often have to categorize things, e.g. gender, ethnicity, race, etc. By putting things into categories, we are effectively delineating what something is and what it isn’t; while we do our best to be inclusive and accurate in how we define categories, our categories cannot be perfect because social life is messy. This is why, when we use secondary data (i.e. data that we have not collected ourselves), we must analyze the circumstances of how the data was collected: how questions were asked, how concepts were defined, etc. No secondary dataset will be perfect in helping us answer our research question. However, some datasets are better than others, depending on what we need in the context of our research question(s).
No researcher, and no interpretation of data, is free from bias. Bias can exist at many layers of the research process: from the very start, in how data collection instruments are created, through to how statisticians evaluate the data and make meaning of it, especially in relation to research questions. Numbers can also give a false sense of precision. To reduce bias, one must first be aware of it.
Researchers who use statistics are trained to ask whether a relationship or pattern they perceive in how the data is shown actually exists. We must always remember that correlation does not mean causation. This is especially important when applying statistics to social science research, as there is so much complexity to account for before we make any assertions of ‘truth’.
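To make "correlation does not mean causation" a bit more concrete, here is a minimal Python sketch using simulated (made-up) data. Everything in it is an assumption for illustration: the variable names, the numbers, and the scenario, in which a hypothetical third factor (temperature) drives both ice cream sales and sunburn cases. The two outcomes never influence each other, yet they end up strongly correlated. The partial correlation at the end is one common way to "account for" the third factor; it is a sketch, not the only or definitive approach.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def partial_r(xs, ys, zs):
    """Correlation between xs and ys after controlling for zs."""
    r_xy, r_xz, r_yz = pearson_r(xs, ys), pearson_r(xs, zs), pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


# Simulated data (hypothetical): a fixed seed keeps the run repeatable.
rng = random.Random(42)
n = 2000

# The confounder: daily temperature.
temperature = [rng.gauss(20, 8) for _ in range(n)]

# Both outcomes depend on temperature plus random noise,
# and neither one depends on the other.
ice_cream_sales = [2.0 * t + rng.gauss(0, 5) for t in temperature]
sunburn_cases = [1.5 * t + rng.gauss(0, 5) for t in temperature]

r_raw = pearson_r(ice_cream_sales, sunburn_cases)
r_controlled = partial_r(ice_cream_sales, sunburn_cases, temperature)

print(f"raw correlation:              {r_raw:.2f}")
print(f"controlling for temperature:  {r_controlled:.2f}")
```

The raw correlation comes out strong, but once temperature is controlled for it drops to roughly zero. In real social science data we rarely know every confounder in advance, which is part of why researchers are so careful with causal language.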
data collection & sampling
What is a distribution?
mean, median, mode, range
the normal distribution
measures of spread
z-scores // confidence intervals
univariate // bivariate analysis
null hypothesis // type-i, type-ii errors
p-value // z-score // alpha
misleading // bad data
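If it helps to see a few of the topics above as numbers, here is a small Python sketch on a made-up dataset (hours of sleep reported by eight hypothetical respondents). The data and variable names are assumptions for illustration only. It uses Python's built-in `statistics` module to compute the mean, median, mode, and range, plus the sample standard deviation (a measure of spread) and the z-score of one observation.

```python
import statistics

# Made-up data: hours of sleep reported by eight hypothetical respondents.
hours_of_sleep = [6, 7, 7, 8, 5, 7, 9, 6]

# Measures of central tendency.
mean = statistics.mean(hours_of_sleep)      # arithmetic average
median = statistics.median(hours_of_sleep)  # middle value once sorted
mode = statistics.mode(hours_of_sleep)      # most frequent value

# Measures of spread.
value_range = max(hours_of_sleep) - min(hours_of_sleep)
std_dev = statistics.stdev(hours_of_sleep)  # sample standard deviation


def z_score(value, mean, std_dev):
    """How many standard deviations a value sits above or below the mean."""
    return (value - mean) / std_dev


# z-score of the respondent who reported 9 hours.
z_for_nine = z_score(9, mean, std_dev)

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("range:", value_range)
print("standard deviation:", round(std_dev, 2))
print("z-score for 9 hours:", round(z_for_nine, 2))
```

A positive z-score means the value is above the mean and a negative one means it is below; the further from zero, the more unusual the value is relative to the rest of the data.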
There are a lot of statistics resources out there, so I suggest reading/looking at the different ways people have discussed or approached a particular concept/measure, because someone else's explanation may click better for you than how it's presented in the course.
Great online general statistics resources
The DeSTRESS project: promoting statistical literacy by sharing, adapting and creating resources to contextualise statistics for social sciences
Some supplementary SPSS resources. Note that these resources are probably based on an older version of SPSS, so the interface may look a bit different:
Kent State University has detailed resources on working with data in SPSS.
Look at Part 1; it is the most relevant to the data cleaning in this assignment.
Empire State College also has great resources on how to look at variables and data in SPSS.