I need to split my data by date. They want to split each year in the middle, before and after June. This gives 6 groups: before and after June in each of the years 1999, 2000, and 2001. They want to compare how much money they got before and after the interventions in each of these years. However, they are also thinking of running t-tests on a control group over the same dates, where the control group did not receive any intervention. They would then compare the before/after difference in the control group with the before/after difference in the experimental group. This seems like a lot of t-tests, which can inflate the type I error rate. Would a one-way ANOVA be the best way to go about a test like this? Is there a better way to do this? Also, is there a point of having a control group in this situation? Would the before count as a control group? This is very confusing, but I am hoping a stats guru can help.
Why do a before and after in a control group on a t-test
-
You are correct to be concerned about the overall true significance level of a collection of several t tests. – 2017-02-04
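To put a rough number on that concern, here is a minimal sketch of the familywise error rate for several tests run at the same level. It assumes the tests are independent, which is not guaranteed for this study, and the choice of 6 tests is only an illustrative count, not something stated in the question.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive among m independent tests,
    each run at significance level alpha (independence is an assumption)."""
    return 1 - (1 - alpha) ** m


# Illustrative only: 6 t tests at the 5% level.
rate = familywise_error_rate(0.05, 6)
print(round(rate, 3))
```

With a single test the rate is just alpha; it grows quickly as more tests are added, which is the motivation for a single model that covers all comparisons.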
2 Answers
From what you say I would propose the following two-factor ANOVA model, with interaction.
Time effect ($\alpha_i$), two levels, fixed: early and late in the year
Intervention effect ($\beta_j$), two levels: intervention and control
Interaction ($\gamma_{ij}$): four cells.
Replication: three years (in each cell).
$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ijk},$$ where $i = 1,2;\; j = 1,2;\; k = 1,2,3;$ and $e_{ijk} \stackrel{\text{indep}}{\sim} \mathsf{Norm}(0, \sigma).$
There are $2 \times 2 \times 3 = 12$ observations (dollars received).
The ANOVA table will have rows as shown below:
Source          DF    SS    MS    F
-----------------------------------
Time             1
Intervention     1
Interaction      1
Error            8
-----------------------------------
Total           11
This is assuming there are three total income figures (one for each year) in each of four Time $\times$ Intervention cells in the data table.
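As a rough illustration of how the table would be filled in, here is a minimal pure-Python sketch of the balanced two-factor ANOVA described above. The dollar figures are made up for illustration and are not from the study; the function name `two_way_anova` and the data layout are hypothetical. The sketch assumes a balanced design (the same number of years in every cell) and stops at the F ratios, leaving the p-values to an F table or a statistics package.

```python
# Hypothetical totals: (time, group) -> one value per year (1999, 2000, 2001).
data = {
    ("early", "control"): [30.0, 32.0, 31.0],
    ("late", "control"): [50.0, 49.0, 51.0],
    ("early", "intervention"): [10.0, 11.0, 9.0],
    ("late", "intervention"): [20.0, 21.0, 19.0],
}


def mean(xs):
    return sum(xs) / len(xs)


def two_way_anova(data):
    """Sums of squares, DF, MS and F for a balanced two-factor design."""
    times = sorted({t for t, _ in data})
    groups = sorted({g for _, g in data})
    a, b = len(times), len(groups)
    n = len(next(iter(data.values())))  # replicates per cell (balanced)

    all_values = [y for vals in data.values() for y in vals]
    grand = mean(all_values)
    cell = {key: mean(vals) for key, vals in data.items()}
    t_mean = {t: mean([y for (ti, _), v in data.items() if ti == t for y in v])
              for t in times}
    g_mean = {g: mean([y for (_, gi), v in data.items() if gi == g for y in v])
              for g in groups}

    ss = {
        "Time": b * n * sum((t_mean[t] - grand) ** 2 for t in times),
        "Intervention": a * n * sum((g_mean[g] - grand) ** 2 for g in groups),
        "Interaction": n * sum(
            (cell[(t, g)] - t_mean[t] - g_mean[g] + grand) ** 2
            for t in times for g in groups),
        "Error": sum((y - cell[k]) ** 2 for k, v in data.items() for y in v),
        "Total": sum((y - grand) ** 2 for y in all_values),
    }
    df = {"Time": a - 1, "Intervention": b - 1,
          "Interaction": (a - 1) * (b - 1),
          "Error": a * b * (n - 1), "Total": a * b * n - 1}

    ms_error = ss["Error"] / df["Error"]
    table = {}
    for source in ("Time", "Intervention", "Interaction", "Error", "Total"):
        ms = ss[source] / df[source] if source != "Total" else None
        is_effect = source not in ("Error", "Total")
        f = ms / ms_error if is_effect else None
        table[source] = {"df": df[source], "ss": ss[source], "ms": ms, "f": f}
    return table


for source, row in two_way_anova(data).items():
    print(source, row)
```

In this layout the Interaction row is the one that speaks to the question: it asks whether the early-to-late change differs between the intervention and control groups.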
From what you say, it seems there may be a significant interaction effect. If so, it will be difficult to interpret the 'main' effects Time and Intervention. There is a question whether incomes will be normally distributed. One would have to test for that.
If there is more to the structure of the study than what I have included, it's because you have not yet provided that information. Please leave a Comment or edit the question to include any additional relevant information.
-
@MichaelHardy. Thanks for whatever edit. Now I'm fixing a few typos and awkward phrasing. – 2017-02-04
I am not sure who "they" are in your question. However, I believe you are talking about difference-in-differences estimation. The wiki should get you started, but it is not a substitute for a decent econometrics book or course.
With this in mind, let me explain the basic logic of difference-in-differences (DiD) estimation. Through this, I am hoping to answer your "Also, is there a point of having a control group in this situation? Would the before count as a control group?"
Suppose you do an intervention and, as a result of that intervention, your variable of interest increases for the treated group from $10$ to $20$. If you are using "before counts as a control group", you would conclude that the effect of the intervention was to increase your variable of interest by $10$. However, your variable of interest can increase for reasons unrelated to your intervention. Maybe it increased as part of a general trend.
One way to figure out whether your variable of interest increased due to your intervention or as part of a general trend (or for any other reason) is to use a control group. Suppose that in the control group the variable of interest increased from $30$ to $50$. This lets you conclude that the effect of the general trend is $+20$, and hence that your intervention in fact decreased the variable of interest, since its change net of the trend is $(20-10)-(50-30)=-10$. Notice how the $-10$ is calculated: as a difference of differences.
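The arithmetic above can be written as a tiny sketch. The function name `diff_in_diff` is made up for illustration; the numbers are the ones from the example.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    treated_change = treated_after - treated_before
    control_change = control_after - control_before
    return treated_change - control_change


# Treated group goes 10 -> 20, control group goes 30 -> 50.
print(diff_in_diff(10, 20, 30, 50))  # -10
```

A naive before/after comparison on the treated group alone would report $+10$; subtracting the control group's change is what removes the shared trend.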
Let me add two more things. First, the explanation just provided shows the key assumption of DiD: absent the intervention, the effect of the trend would have been the same in the control group and in the treated group. This is an assumption of the model and cannot be tested directly, because the treated group's outcome without the intervention is never observed. Let me call it the "equal trend assumption".
Second, there is a way to judge how plausible the equal trend assumption is if one has data from more than one year prior to the intervention. Suppose you have, for both groups, data from, say, three years prior to the intervention. Then you can check whether the trend in the data is the same for both groups, and hence how likely it is that the trend would have been the same absent the intervention.
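A minimal sketch of that informal check, using made-up pre-intervention values for three years (the data and function names are hypothetical). It compares year-to-year changes in the two groups; gaps near zero are consistent with equal trends, though this is only a descriptive look and not a formal test.

```python
# Hypothetical values for three years before the intervention.
pre_treated = [6.0, 8.0, 10.0]
pre_control = [26.0, 28.0, 30.0]


def yearly_changes(series):
    """Year-to-year differences of a list of yearly values."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def trend_gaps(treated, control):
    """Treated-group change minus control-group change, per year pair."""
    return [t - c for t, c in
            zip(yearly_changes(treated), yearly_changes(control))]


print(yearly_changes(pre_treated))
print(yearly_changes(pre_control))
print(trend_gaps(pre_treated, pre_control))
```

Note that the two groups may sit at very different levels, as here; what matters for the equal trend assumption is that their changes over time track each other.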
-
A possibly relevant link, but this is a Comment, not an Answer. – 2017-02-03
-
@BruceET Completely agree, I was too brief and not helpful enough. I expanded the answer hoping not to repeat what you've included in yours. – 2017-02-04