Link to The Stroop Effect (https://en.wikipedia.org/wiki/Stroop_effect)
In a Stroop task, participants are presented with a list of words, with each word displayed in a color of ink. The participant’s task is to say out loud the color of the ink in which the word is printed. The task has two conditions: a congruent words condition, and an incongruent words condition. In the congruent words condition, the words being displayed are color words whose names match the colors in which they are printed: for example RED, BLUE. In the incongruent words condition, the words displayed are color words whose names do not match the colors in which they are printed: for example PURPLE, ORANGE. In each case, we measure the time it takes to name the ink colors in equally-sized lists. Each participant will go through and record a time from each condition.
In the congruent words condition, the words displayed are color words whose names match the colors in which they are printed. The value of the variable in this case is "True".
In the incongruent words condition, the words displayed are color words whose names do not match the colors in which they are printed. The value of the variable in this case is "False".
The set of hypotheses:
1) Supposing the value of the variable "Condition" (congruent words or incongruent words) does not significantly affect the value of the variable "Reading time", the null hypothesis is H0: μ1 = μ2, where μ1 is the population mean time for the congruent words condition and μ2 is the population mean time for the incongruent words condition.
2) Supposing instead that in the incongruent words condition the value of the variable "Reading time" exceeds the value of this indicator for the congruent words condition, the alternative hypothesis is HA: μ2 > μ1.
Here we can use the one-sided alternative hypothesis, since the tendency toward longer reading times in the incongruent words condition is clearly visible.
I think a t-test will be needed in this case.
This test helps us to decide whether two groups have different average values. The t-test asks whether the difference between the averages of two groups could be due to random chance in sample selection. A difference is more likely to be meaningful if it is large, if the sample size is big, or if the responses are not widely spread out (the standard deviation is small).
More specifically, because each participant is measured under both conditions, it should be a test of the null hypothesis that the difference between two responses measured on the same statistical unit has a mean value of zero. This is often referred to as the "paired" or "repeated measures" t-test.
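To make the paired t-test concrete, here is a minimal sketch of the statistic computed by hand: the mean of the per-participant differences divided by its standard error. The timings below are hypothetical and are not taken from stroopdata.csv, and the function name `paired_t_statistic` is only an illustration. On the same inputs the result should agree with the statistic from `scipy.stats.ttest_rel`.

```python
import math
import statistics

def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for a paired t-test on two equal-length samples."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # Work on the per-participant differences, not on the two samples separately
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1

# Hypothetical timings in seconds, NOT taken from stroopdata.csv
demo_incongruent = [20.0, 23.0, 19.0, 25.0, 22.0]
demo_congruent = [12.0, 16.0, 10.0, 15.0, 14.0]

demo_t, demo_dof = paired_t_statistic(demo_incongruent, demo_congruent)
print(demo_t, demo_dof)
```

The formula shows why a large mean difference, a large sample, or a small spread of the differences each push the t-statistic up.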
As for the confidence level: in applied practice it is typically set at 95%. But here it is possible to use the 99% level (α = 0.01) because of the obvious trend in the data.
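As a rough illustration of what the choice between 95% and 99% means, the sketch below computes a confidence interval for the mean paired difference using Student's t distribution. The differences are hypothetical values, not the notebook's data, and `mean_difference_ci` is a helper name introduced only for this example.

```python
import numpy as np
from scipy import stats

def mean_difference_ci(diffs, confidence=0.99):
    """Two-sided confidence interval for the mean of paired differences."""
    diffs = np.asarray(diffs, dtype=float)
    n = diffs.size
    mean_diff = diffs.mean()
    standard_error = diffs.std(ddof=1) / np.sqrt(n)
    # Critical value of Student's t with n - 1 degrees of freedom
    t_critical = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)
    margin = t_critical * standard_error
    return mean_diff - margin, mean_diff + margin

# Hypothetical differences in seconds, NOT taken from stroopdata.csv
demo_diffs = [8.0, 7.0, 9.0, 10.0, 8.0]
low_95, high_95 = mean_difference_ci(demo_diffs, confidence=0.95)
low_99, high_99 = mean_difference_ci(demo_diffs, confidence=0.99)
print((low_95, high_95), (low_99, high_99))
```

The 99% interval is wider than the 95% interval. If even the wider interval stays above zero, the data support a positive mean difference at the stricter level.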
Let's do a basic statistical calculation on the data using code.
import pandas as pd
path = r'~/Downloads/stroopdata.csv'
dataFrame = pd.read_csv(path)
dataFrame.head(3)
The data show a clear difference in reading time: times under the congruent words condition are shorter. This conclusion is confirmed by the descriptive statistics of each DataFrame column. For the congruent words condition, all the important indicators (mean, min, max) of reading time are lower.
dataFrame['Congruent'].describe()
dataFrame['Incongruent'].describe()
With the central tendency clearly visible, we can compute the difference between the two measures and evaluate it using the t-test.
difference = dataFrame['Incongruent'] - dataFrame['Congruent']
difference.describe()
from scipy.stats import ttest_rel
a = dataFrame['Incongruent']
b = dataFrame['Congruent']
t_statistic, pvalue_2tailed = ttest_rel(a, b, axis=0)
t_statistic
pvalue_2tailed
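pvalue_1tailed = pvalue_2tailed / 2  # one-sided p-value is half the two-sided one; valid because t_statistic > 0 matches the predicted direction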
pvalue_1tailed
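The relation between the two-tailed and one-tailed p-values can be checked directly. The sketch below assumes SciPy 1.6 or later, where `ttest_rel` accepts an `alternative` argument, and it uses hypothetical timings rather than the notebook's data.

```python
import numpy as np
from scipy.stats import ttest_rel

# Hypothetical timings in seconds, NOT taken from stroopdata.csv
demo_incongruent = np.array([20.0, 23.0, 19.0, 25.0, 22.0])
demo_congruent = np.array([12.0, 16.0, 10.0, 15.0, 14.0])

# Default two-sided test: H0 is "mean difference = 0", HA is "mean difference != 0"
demo_two_sided = ttest_rel(demo_incongruent, demo_congruent)

# One-sided test: HA is "mean of (incongruent - congruent) > 0"
# The alternative argument is assumed to be available (SciPy >= 1.6)
demo_one_sided = ttest_rel(demo_incongruent, demo_congruent, alternative='greater')

print(demo_two_sided.statistic, demo_two_sided.pvalue, demo_one_sided.pvalue)
```

When the t-statistic is positive, the one-sided p-value for 'greater' is half of the two-sided one. Halving is not valid when the observed effect points in the direction opposite to the alternative.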
Let us also look at the percentage difference between the indicators for the two conditions.
difference_in_percentages = 100*(dataFrame['Incongruent'] - dataFrame['Congruent']) / dataFrame['Congruent']
difference_in_percentages.describe()
It is also possible to describe how many times one indicator exceeds the other.
coefficient = dataFrame['Incongruent']/dataFrame['Congruent']
coefficient.describe()
Let's combine the obtained data into one table of indicators.
difference_df = pd.DataFrame(data={'Coefficient': coefficient,
'Difference in percentages': difference_in_percentages,
'Difference': difference,})
difference_df
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
The difference in the time of reading is clearly visible on the graph.
plt.rcParams['figure.figsize'] = (12, 4)
dataFrame.plot()
The difference (incongruent minus congruent) is greater than zero for every participant.
plt.rcParams['figure.figsize'] = (12, 4)
difference.plot()
The next graph shows how many times longer the reading time under the incongruent words condition is than under the congruent words condition.
plt.rcParams['figure.figsize'] = (12, 4)
coefficient.plot()
It is easy to see that this coefficient is always greater than 1.
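What the plots suggest visually can also be confirmed numerically. The following is a small sketch on a hypothetical frame with the same column names as the notebook's data. The `demo_` names are used so that nothing defined earlier is overwritten.

```python
import pandas as pd

# Hypothetical timings in seconds, NOT taken from stroopdata.csv
demo_frame = pd.DataFrame({
    'Congruent': [12.0, 16.0, 10.0, 15.0, 14.0],
    'Incongruent': [20.0, 23.0, 19.0, 25.0, 22.0],
})

demo_difference = demo_frame['Incongruent'] - demo_frame['Congruent']
demo_coefficient = demo_frame['Incongruent'] / demo_frame['Congruent']

# True only if the condition holds for every participant (row)
all_positive = bool((demo_difference > 0).all())
all_above_one = bool((demo_coefficient > 1).all())
print(all_positive, all_above_one)
```

For positive times the two checks are equivalent: the difference is above zero exactly when the ratio is above 1.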
We can see that the t-statistic has a large value. The p-value is the probability of observing a t-statistic at least this extreme if the null hypothesis μ1 = μ2 ("the population means for the two measures, for the congruent words condition and for the incongruent words condition, are equal") were true. For the one-sided alternative, p ≈ 2.05e-08 (half of the two-tailed p ≈ 4.10e-08 returned by ttest_rel). This is far below the significance level α = 0.01 (99% confidence), so we reject the null hypothesis. This result is in line with expectations and easily predictable.
In practice, it means that the time of word reading under the incongruent words condition is significantly longer than under the congruent words condition.
During this experiment, the time for reading under the incongruent words condition was on average 1.6 times greater than the time under the congruent words condition. For other groups of participants, this coefficient will likely take other values due to differences in cognitive control.
The Stroop test helps to identify the flexibility or rigidity of cognitive control. This description corresponds to the simplest explanation of the effect.
The results of the test can characterize the degree of subjective difficulty in changing ways of processing information in a situation of cognitive conflict. Rigid control indicates difficulties in the transition from verbal functions to sensory-perceptual functions due to the low degree of automation. Flexible control shows ease of this transition because of the high degree of automation.
However, the Stroop task and its associated effects are a multi-dimensional phenomenon that cannot be fully explained by the concepts of cognitive control. A thorough understanding of the other mechanisms requires more research.
Of course, many modifications of this test can be useful for measuring and improving selective attention, cognitive flexibility and processing speed. Here are some concrete examples:
1) It can be applied in the study of bilingualism. Participants can read words in one language on the first card, words in another language on the second card, and words printed in the two languages one after another on the third card. This will help to understand whether one language dominates the other and what difficulties a person experiences when switching.
2) Exploring the emotional sphere could also be interesting. Asking participants to read the words on 4 different cards (positive, negative, neutral and mixed sets of words) should show some results as well, I am sure.
3) Exercises in alternate counting in different number systems (for example, decimal and binary, one after another) can significantly extend the abilities of a particular individual.