Why does an evil teacher force students to learn statistics? - Chapter 1

In this chapter the importance of statistics is discussed, as well as its fundamental concepts. Data is required to answer various questions. Therefore, a teacher emphasizes the importance of working with numbers to students, because these numbers are a form of data and a part of the research process.

In addition to numbers, other forms of data exist. Studies based on numbers use quantitative methods, while studies based mainly on language use qualitative methods. Qualitative and quantitative methods are complementary: each can be used to enrich or reinforce the findings of the other.

What does the research process look like?

The research process consists of a number of steps. The first step is observation; something is observed that evokes curiosity. Consequently, a researcher has a question that he or she would like to answer. To determine if the observation is correct, data must be collected. A researcher needs variables to collect this data. A variable is something that is measured to answer the question of the researcher.

The research process is as follows: formulate a research question -> test a theory -> write a hypothesis -> make predictions -> collect data to test the predictions -> analyze the data.

How do people find something that needs explanation?

One can find something that needs to be explained in many different ways. For example, when watching the news on television, a research question may arise about something that is going on in the world. To answer this question, data must be collected. Before data can be collected, the variables to be measured must be chosen and defined.

How are hypotheses tested?

After formulating a research question, the next step is to test a theory and to write a hypothesis. A hypothesis is a proposed explanation for a certain phenomenon or a set of observations. Existing data can be explained by a theory, and from that theory a prediction can be derived. Such a theory-based prediction is called a hypothesis. A statement only counts as a hypothesis if it can be supported or rejected using scientific methods. If the collected data contradict the theory or the hypothesis, this is called falsification.

What is the difference between a dependent and an independent variable?

When collecting data, two questions are important: (1) what is measured and (2) how is it measured? To test hypotheses, variables must be measured. Variables are things that can vary between people, between situations or over time. Most hypotheses involve two variables: the cause and the outcome.

The variable that is seen as the cause of a certain effect is called the independent variable or the predictor. The term independent variable is mainly used in an experimental set-up, to emphasize that the researcher has manipulated this variable. The variable that changes due to changes in the independent variable is called the dependent variable or outcome variable.

What is meant by a measurement level?

Variables can be measured in various ways. The relationship between what is measured and the numbers that express what you are measuring is called the level of measurement.

Variables can be categorical or continuous and can have different measurement levels. A categorical variable consists of different categories. An example of a categorical variable is the division between men and women. In this case the variable has only two categories; a man or a woman. You can't be both. A variable with two categories is called a binary variable.

If a variable consists of more than two categories, it is called a nominal variable. An example of a nominal variable is religion (Judaism, Christianity, Islam, etc.). Although these categories can be represented with numbers, it is not meaningful to perform calculations with these numbers, and the numbers do not indicate a ranking. An example of a nominal variable that is represented by numbers is the shirt number of a player in a team sport: a higher shirt number does not mean that someone is a better player. Nominal data can only be used to look at frequencies, for example how often a certain player scores or how many people hold a certain belief.

An ordinal variable also consists of different categories, but these categories have a certain ranking, that is, a specific order. However, it is not specified how large the difference between the categories is. A top three in a competition indicates who did better than whom. There is an order, but it does not say how much better the winner was than numbers two and three.

At the next measurement level, variables are no longer categorical but continuous. A continuous variable gives a score that can take any value on the measurement scale. The interval variable is one type of continuous variable. With an interval variable, equal differences between numbers on the scale represent equal differences in what is measured. An example is rating how nice you think someone is on a five-point scale: the difference between 1 and 2 is the same as the difference between 4 and 5. This measurement level is most often used for statistical tests.

The next measurement level is the ratio variable. The ratio variable meets the same conditions as the interval variable, but it also has an absolute and meaningful zero point. This means that ratios between scores are meaningful. An example is reaction time: a millisecond always lasts the same length, so the differences between milliseconds are equal, but you can also say that 200 milliseconds is twice as long as 100 milliseconds.

A variable measured at interval or ratio level is not always truly continuous; it can also be discrete. A truly continuous variable can take on all possible values, whereas a discrete variable can only take certain values (usually whole numbers). If you indicate how nice you think someone is on a five-point scale, the underlying construct is a continuum on which 2.98 would be a meaningful value, but you can only choose the numbers 1, 2, 3, 4 and 5. You cannot actually enter 2.98.

What is a measurement error?

Researchers prefer a measurement that gives the same result over time and in different situations: an accurate measurement that is not influenced by who takes it or where it is taken. There is often a difference between the measured value and the actual value. This difference is called the measurement error. If you have a good instrument, the measurement error is small. Questionnaires about sensitive topics often produce larger measurement errors, because the answers of the participants are influenced not only by the actual situation but also by other factors, such as social desirability.

What is meant by the concepts of validity and reliability?

One way to keep the measurement error small is to establish properties of the measuring instrument that indicate how well it performs. One such property is validity: whether the instrument actually measures what it is intended to measure. Criterion validity means that you determine whether the instrument measures what it is intended to measure by comparing it with objective criteria.

This can be done in two ways. If data are collected with the new instrument and with existing criteria at the same time, you assess concurrent validity. If you use the data from your new instrument to predict later observations, you assess predictive validity. The problem with criterion validity is that it cannot always be used, because there are often no objective criteria for what you want to measure, such as how much someone is liked by others. Another form of validity is content validity. This concerns the extent to which the items on a questionnaire match the construct and whether the questions fully cover the phenomenon.

An instrument must be valid, but that is not enough. An instrument must also be reliable. Reliability means that the instrument gives the same result under the same conditions. A reliable scale, for example, always shows the same weight if the actual weight is the same. If an instrument is not reliable, it cannot be valid either, because an instrument that produces different outcomes under the same circumstances does not, by definition, measure what it should measure. The easiest way to test reliability is to repeat the test (test-retest reliability). A reliable instrument should give the same results on both tests.

What is meant by a correlational research method?

Correlational research observes what is happening in the world without manipulating it. This is good for the ecological validity because the natural situation is observed. Some studies can only be performed in this correlational way because it is impossible or unethical to manipulate certain variables. However, the disadvantage of this method is that it is not possible to make a statement about causality.

What is meant by an experimental research method?

In experimental research, a variable is manipulated to see if it influences the other variables. Many studies look at whether one variable (the predictor/independent variable) is the cause of the other variable (the dependent variable/outcome). According to Hume, one can only speak of a causal connection if:

  • Cause and effect closely follow each other in time.

  • The cause precedes the consequence.

  • The effect never occurs without the cause occurring.

In many studies, the variables are measured simultaneously, so it is often not known which variable is the cause and which variable is the effect. It is also possible that a third variable (tertium quid) is the cause of both other variables. This is called a confounding variable. An example is the association between breast implants and suicide: low self-esteem may lead both to getting breast implants and to attempting suicide. In that case, low self-esteem is the confounding variable.

John Stuart Mill (1865) added another criterion to Hume's criteria, namely that all other explanations of the cause-effect relationship must be ruled out. If the cause is absent, the effect should be absent as well. The purpose of experimental research is to establish the cause-effect relationship between variables in as much detail as possible. Experiments compare situations (conditions or treatments) in which the alleged cause is absent with situations in which the cause is present. Participants can take part in an experiment in two different ways: in a within-group design and in a between-group design.

Which two methods of data collection exist?

As mentioned above, participants can participate in an experiment in two different ways. In a within-groups design (also called: within-subject or repeated measures design) the same participants do the experiment a number of times in different conditions. In a between-groups design (also called: between-subjects or independent design) you have different participants in different conditions.

Which two types of variation are there?

Non-systematic variation is the small difference in performance between two conditions that cannot be explained by known factors. Even if all variables are kept the same, there is usually still a slight difference in scores between conditions or between moments of measurement. This is due, for example, to differences in skill between people or to the time of day. Systematic variation is the difference between two conditions that results from the experimental manipulation. For example, in one condition chimpanzees receive a reward for their behavior, and in the other condition they do not. The resulting difference in behavior is caused by the manipulation and is therefore systematic variation. The role of statistics is to determine how much variation there is in performance, and which part of it is systematic and which part is non-systematic. There is less non-systematic variation in a within-group design than in a between-group design, because in a between-group design the people in the different groups can have different characteristics.

What is meant by the concept of randomization?

In order to keep the non-systematic variation as small as possible and to make the test as accurate as possible, scientists use randomization. Randomization is important because it removes other sources of systematic variation, so that differences between conditions can be attributed to the experimental manipulation. In a within-group design there are two further important sources of systematic variation:

  1. Practice effects. This means that participants can behave differently during the test because they have become familiar with the test.

  2. Boredom effects. This means that participants can behave differently in the second test because they have become bored and/or tired of the first test.

To keep these effects as small as possible, different participants receive the conditions in a different order. This is called counterbalancing. One person first gets condition 1 and then condition 2, while another person first gets condition 2 and then condition 1. Which order a participant receives is determined randomly.
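The counterbalancing procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `counterbalance`, the participant labels and the condition names are made up, and the choice to alternate the two orders over a shuffled list (so that both orders are used equally often) is one possible way to do it.

```python
import random

def counterbalance(participants, conditions=("condition 1", "condition 2"), seed=None):
    """Randomly decide which participant gets which order of two conditions.

    Hypothetical helper: the two possible orders are alternated over a
    shuffled participant list, so both orders are used (almost) equally often.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)  # random order decides who ends up with which sequence
    first, second = conditions
    orders = {}
    for index, person in enumerate(shuffled):
        orders[person] = (first, second) if index % 2 == 0 else (second, first)
    return orders

# Made-up participant labels, for illustration only.
orders = counterbalance(["P1", "P2", "P3", "P4"], seed=1)
for person, order in sorted(orders.items()):
    print(person, "->", " then ".join(order))
```

With four participants, two start with condition 1 and two start with condition 2, so practice and boredom effects are spread evenly over both conditions.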

In a between-group design, randomization is done by randomly assigning the participants to the different conditions. After all, people differ in characteristics, and these characteristics are possible confounding variables. If the participants are randomly distributed over the conditions, this variation becomes part of the non-systematic variation. The groups then do not differ from each other in a systematic way other than in the experimental manipulation.
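Random assignment in a between-group design can be sketched as follows. This is a minimal example, not a prescribed procedure: the function name `assign_to_conditions`, the participant labels and the condition names "reward" and "no reward" are assumptions chosen for illustration.

```python
import random

def assign_to_conditions(participants, conditions, seed=None):
    """Randomly distribute participants over conditions in (almost) equal groups.

    Hypothetical helper: participants are shuffled and then dealt out to the
    conditions in turn, like dealing cards.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups

# Made-up participants and condition names, for illustration only.
participants = [f"P{i}" for i in range(1, 9)]
groups = assign_to_conditions(participants, ["reward", "no reward"], seed=42)
for condition, members in groups.items():
    print(condition, members)
```

Because chance alone decides who ends up in which group, personal characteristics are not systematically linked to a condition.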

What are frequency distributions?

Once a researcher has collected all the data, he or she wants to analyze the data. In order to do so, it is useful to make a graphical representation of the data. This is possible with a frequency distribution (also called a histogram). This graph shows how often a certain score occurs in your data. This graph is useful when calculating the proportions.

In the ideal situation, a vertical line drawn through the center of the histogram divides it into two symmetrical halves. This is called a normal distribution. A normal distribution is a bell-shaped curve, meaning that most scores lie around the middle of the distribution. Many phenomena are normally distributed. In a histogram, the frequency is usually on the vertical axis and the scores on the horizontal axis.

When a histogram is not symmetrical, it is skewed. If most scores are clustered on the left, with a tail to the right, the distribution is positively skewed. If most scores are clustered on the right, with a tail to the left, it is negatively skewed. Kurtosis indicates the extent to which scores fall in the tails of the distribution, which can be seen from how pointed the histogram is. A leptokurtic distribution has positive kurtosis and a pointed shape. A platykurtic distribution has negative kurtosis and is flatter than normal.
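The sign of skew and kurtosis can be checked numerically. The sketch below assumes the simple moment-based formulas (the third and fourth standardized moments, with 3 subtracted from the latter so that a normal distribution has kurtosis 0). Statistical packages often report bias-corrected versions, so their values can differ slightly. The function name and the scores are made up for illustration.

```python
def shape_statistics(scores):
    """Return (skewness, excess kurtosis) using simple moment-based formulas."""
    n = len(scores)
    mean = sum(scores) / n
    # Central moments: average of the deviations raised to the power 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis

# Many low scores and a long tail to the right: positive skew expected.
clustered_left = [1, 1, 1, 2, 2, 3, 4, 8]
# Evenly spread, symmetrical scores: zero skew and a flat (platykurtic) shape.
flat_symmetric = [1, 2, 3, 4, 5]

skew_left, kurt_left = shape_statistics(clustered_left)
skew_flat, kurt_flat = shape_statistics(flat_symmetric)
print(f"clustered left: skew={skew_left:.2f}, kurtosis={kurt_left:.2f}")
print(f"flat symmetric: skew={skew_flat:.2f}, kurtosis={kurt_flat:.2f}")
```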

What is meant by the mode?

One can calculate where the center of the frequency distribution lies (central tendency). The simplest method for this is the mode. This is the score with the highest frequency, so the score that occurs most often. There can be multiple scores with the same frequency. If there are two most common scores, the distribution is bimodal. With more than two modes, the distribution is multimodal.
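As a small illustration, the mode or modes of a set of scores can be found with `multimode` from Python's standard `statistics` module (available from Python 3.8). The helper `describe_modality` and the example scores are made up. Note that when every score occurs exactly once, `multimode` returns all scores, so the label is only meaningful when some scores repeat.

```python
from statistics import multimode

def describe_modality(scores):
    """Label a set of scores by its number of modes (hypothetical helper)."""
    modes = multimode(scores)  # all scores that share the highest frequency
    if len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label

# Made-up scores: 3 and 5 both occur twice, all other scores occur once.
modes, label = describe_modality([2, 3, 3, 4, 5, 5, 6])
print(modes, label)
```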

What is meant by the median?

The second way to calculate the center of the distribution is with the median. The median is the middle score when all scores are arranged in order from small to large. The position of the median can be calculated with the formula (n + 1) / 2. If a researcher has 11 scores, the median is therefore the sixth score. If a researcher has an even number of scores, the median falls between two scores. The median is then calculated by taking the average of those two scores. Extreme scores and a skewed distribution have little influence on the median. The median can be used with data at ordinal, interval and ratio measurement level. It cannot be used for nominal data, because these data cannot be arranged in order from small to large.
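The steps for finding the median can be written out as a short Python sketch. The function names and the scores are made up for illustration. In practice, `statistics.median` from the standard library gives the same result.

```python
def median_position(n):
    """Position of the median in the sorted scores, counting from 1."""
    return (n + 1) / 2

def median(scores):
    """Middle score, or the average of the two middle scores for an even n."""
    ordered = sorted(scores)
    n = len(ordered)
    if n % 2 == 1:
        # Odd number of scores: the position is a whole number.
        return ordered[int(median_position(n)) - 1]
    # Even number of scores: the position falls between two scores.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

# Made-up scores, for illustration only.
eleven_scores = [7, 3, 9, 4, 12, 5, 8, 6, 10, 2, 11]
four_scores = [2, 4, 6, 8]

print(median_position(len(eleven_scores)), median(eleven_scores))
print(median_position(len(four_scores)), median(four_scores))
```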

What is meant by the mean?

The third way to calculate the center of the distribution is with the mean. You calculate the average by adding up all scores and dividing them by the number of participants. In formula form:

x̅ = (Σ x) / n

x̅ is the mean and Σ is the summation sign (sigma), which indicates that all scores x are added up. Each x is the score of a participant. n is the number of participants, also called the sample size.

The mean can be influenced by extreme scores and by a skewed distribution. It can also only be used with interval or ratio data. The advantage of the mean over the median and mode is that all scores are taken into account when calculating it. With the median and mode, most scores in the dataset are ignored.
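The difference in sensitivity to extreme scores can be seen in a small example. The scores below are made up; `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median

# Made-up scores, first without and then with one extreme score.
scores = [2, 3, 3, 4, 5]
with_extreme_score = scores + [40]

print("mean:  ", mean(scores), "->", mean(with_extreme_score))
print("median:", median(scores), "->", median(with_extreme_score))
```

Adding the single score of 40 pulls the mean up from 3.4 to 9.5, while the median only moves from 3 to 3.5.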

What does the spread of a distribution look like?

In addition to the center of the distribution, a researcher may also be interested in how the scores are spread out. The range of the scores is the highest score minus the lowest score. Because only the highest and lowest scores are used to calculate the range, extreme scores have a lot of influence. To reduce this influence, the top 25% and the bottom 25% of the scores are removed and the range of the middle 50% is calculated. This is called the interquartile range. Quartiles are the three values that divide the sorted data into four equal parts. The median of the data is the second quartile. The first quartile is the median of the lower half, and the third quartile is the median of the upper half.
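The range and the interquartile range can be computed with the "median of each half" approach described above. The sketch below makes one assumption that the text does not spell out: with an odd number of scores, the median itself is left out of both halves. Other quartile conventions exist and can give slightly different values. The function name and the scores are made up.

```python
from statistics import median

def quartiles(scores):
    """Return (Q1, Q2, Q3) using the median of the lower and upper half."""
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # Assumption: for an odd n, the middle score belongs to neither half.
    upper_half = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower_half), median(ordered), median(upper_half)

# Made-up scores with one extreme value (40).
scores = [1, 3, 4, 6, 7, 8, 10, 12, 15, 18, 40]
q1, q2, q3 = quartiles(scores)
score_range = max(scores) - min(scores)
interquartile_range = q3 - q1

print("quartiles:", q1, q2, q3)
print("range:", score_range)
print("interquartile range:", interquartile_range)
```

The extreme score of 40 stretches the range to 39, while the interquartile range of 11 is unaffected by it.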

The disadvantage of the interquartile range is that half of the data is not used. If a researcher chooses to use all data, he or she can look at how far each score is from the center of the distribution. This deviation is calculated as the score minus the mean. The total deviation is calculated by adding up all deviations.

Because some scores are above the mean and others below it, the total deviation is always 0. For this reason, the deviations are not simply added up. Instead, they are first squared and then added. The result is called the sum of squared errors, or sum of squares (SS). Squaring makes the negative deviations positive. This means that the SS is never negative. It is exactly zero only if all scores are exactly equal.

The problem with the sum of squares is that its size depends on the number of scores that are added. As a result, sums of squares from samples of different sizes cannot be compared with each other. That is why an average measure of spread is used, which is called the variance. The variance (s²) is the sum of squares divided by the sample size minus 1: s² = SS / (n - 1).

The disadvantage of the variance is that it is expressed in squared units. Therefore, the square root of the variance is usually used. This is called the standard deviation (s).
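The chain from deviations to sum of squares, variance and standard deviation can be followed step by step in a short Python sketch. The function name and the scores are made up for illustration. `statistics.variance` and `statistics.stdev` from the standard library use the same n - 1 denominator.

```python
from math import sqrt

def spread_measures(scores):
    """Compute deviations, sum of squares, variance and standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]          # score minus mean
    sum_of_squares = sum(d ** 2 for d in deviations)  # SS
    variance = sum_of_squares / (n - 1)               # s squared
    standard_deviation = sqrt(variance)               # s
    return {
        "total_deviation": sum(deviations),
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "standard_deviation": standard_deviation,
    }

# Made-up scores with a mean of 5.
result = spread_measures([2, 4, 4, 6, 9])
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

For these scores the deviations are -3, -1, -1, 1 and 4, which sum to 0. The sum of squares is 28 and the variance is 28 / 4 = 7.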

The sum of squares (SS), the variance (s²) and the standard deviation (s) are all measures that indicate how far the data are spread around the mean. A large standard deviation means a large spread, with many scores far from the mean. With a small standard deviation, most scores are close to the mean. With a large standard deviation the distribution becomes flatter, and with a small standard deviation it becomes more pointed. As a result, a distribution may resemble a platykurtic or leptokurtic distribution while it is not.

In what other way can a frequency distribution be used?

Frequency distributions can be used not only to see how often certain scores actually occurred, but also to make a statement about how likely it is that a score will occur. If there were 172 suicides, of which 36 involved people aged between 30 and 35, this gives a proportion of 36/172 = 0.21, or 21%. With such proportions you can estimate how likely it is that a certain score will occur. Probabilities take a value between 0 (there is no chance that it will happen) and 1 (it will certainly happen). To calculate these probabilities, a probability distribution is used. The area under a part of the probability distribution indicates the probability that a value in that range will be obtained.

A commonly used probability distribution is the standard normal distribution (z-distribution), in which the mean is always 0 and the standard deviation is always 1. Any data set can be converted to this scale by changing the scores into z-scores: subtract the mean from the score and divide the result by the standard deviation, z = (x - x̅) / s.
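Converting scores to z-scores can be sketched as follows. The function name and the scores are made up for illustration. `stdev` from the standard `statistics` module divides by n - 1, like the sample standard deviation.

```python
from statistics import mean, stdev

def z_scores(scores):
    """Convert raw scores to z-scores: (score - mean) / standard deviation."""
    average = mean(scores)
    standard_deviation = stdev(scores)
    return [(x - average) / standard_deviation for x in scores]

# Made-up scores, for illustration only.
scores = [2, 4, 4, 6, 9]
converted = z_scores(scores)
print([round(z, 2) for z in converted])
```

After conversion the z-scores have a mean of 0 and a standard deviation of 1. The shape of the distribution itself does not change.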

These z-scores can be used to look up the corresponding proportions in the table of the standard normal distribution. A proportion under the standard normal curve corresponds to a probability. A z-score of 2.6 is a score that lies 2.6 standard deviations above the mean. The proportion of the normal curve above a z-score of 2.6 is 0.0047, which means that there is a 0.47% chance of obtaining a value this high or higher. It also means that 99.53% of the curve lies below this value, because the entire area under the curve is 1 (100%).
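Instead of a printed table, the proportions for a given z-score can be checked with `NormalDist` from Python's standard `statistics` module (Python 3.8 or later). A minimal sketch for a z-score of 2.6:

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

z = 2.6
proportion_below = standard_normal.cdf(z)  # area under the curve left of z
proportion_above = 1 - proportion_below    # the "smaller portion" in the table

print(f"below z = {z}: {proportion_below:.4f}")
print(f"above z = {z}: {proportion_above:.4f}")
```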

The table of the standard normal distribution can also be used to find the range that contains the middle 95% of the scores. Because the normal curve is symmetrical, 2.5% of the distribution lies on each side of this middle part. To find out which z-score belongs to a proportion of 0.025, look for this value in the table under the "smaller portion" column. The corresponding z-score is 1.96. Because the distribution is symmetrical, -1.96 is the other limit. The middle 95% of the curve therefore lies between the z-scores -1.96 and 1.96.
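The same limits can be found without a table by using the inverse of the cumulative distribution, available as `NormalDist.inv_cdf` in Python's standard `statistics` module. A minimal sketch:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# The middle 95% leaves 2.5% in each tail.
lower_limit = standard_normal.inv_cdf(0.025)
upper_limit = standard_normal.inv_cdf(0.975)

print(f"middle 95% lies between z = {lower_limit:.2f} and z = {upper_limit:.2f}")
```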

How to report the data?

After the research has been completed, a report containing the findings must be written and submitted to a scientific journal. A scientific journal is a periodical collection of articles written by researchers. These articles describe a new study, review existing articles or present a new theory.

It is important that a researcher knows how a study, including its data, must be presented and reported. For reporting, the rules and guidelines of the American Psychological Association (APA) are usually followed. Standards may vary per journal.

When data is reported, it is important to determine whether text, a graph or a table will be used. A researcher must be consistent. APA offers the following guidelines:

  • Choose a method of presentation that ensures that the data is understood as well as possible.

  • When you present three or fewer numbers, do so in the form of a sentence.

  • When you present between 4 and 20 numbers, use a table.

  • If you present more than 20 numbers, use a graph. This is more useful than a table.
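The size-based guidelines above amount to a simple decision rule, sketched here in Python. The function name `suggested_presentation` is made up, and the rule only covers the number of values. The first guideline (choose whatever makes the data best understood) still takes priority.

```python
def suggested_presentation(number_count):
    """Suggest a presentation form based on how many numbers are reported."""
    if number_count <= 3:
        return "sentence"
    if number_count <= 20:
        return "table"
    return "graph"

for count in (2, 12, 50):
    print(count, "numbers ->", suggested_presentation(count))
```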
