1

Background

I am trying to extract data from scientific publications. Sometimes an experiment tests two factors, e.g. in a standard two-way factorial design analyzed by ANOVA.

Call these factors $A$ and $B$. If factor $A$ has two levels and $B$ has three, there are six total treatments.

If the effects of $A$ and $B$ are significant but there is no interaction, only the main effects might be presented: five results, one for each level of $A$ and one for each level of $B$, each averaged across all levels of the other factor.

Here is an example from Ma 2001 Table 2, in which $A$ would be the row spacing and $B$ would be the nitrogen rate.

[Image: Ma 2001, Table 2]

Thus,

$7577 = \frac{X_{A_{20},B_{0}} + X_{A_{20},B_{112}} + X_{A_{20},B_{224}}} {3}$

$9186 = \frac{X_{A_{80},B_{0}} + X_{A_{80},B_{112}} + X_{A_{80},B_{224}}} {3}$

$3706 = \frac{X_{A_{20},B_{0}} + X_{A_{80},B_{0}}} {2}$

$9402 = \frac{X_{A_{20},B_{112}} + X_{A_{80},B_{112}}} {2}$

$12038 = \frac{X_{A_{20},B_{224}} + X_{A_{80},B_{224}}} {2}$

Question

Is it possible to calculate the means of each of the six treatments $X_{A,B}$, for $A\in\{20,80\}$ by $B\in\{0,112,224\}$, from these results?

  • 0
    These are actually only four independent equations, because thrice the sum of the first two must equal twice the sum of the last three (it's just the total of all six variables). 2011-01-25

3 Answers

3

You can do it if you make some assumption that reduces the number of free unknowns to match the number of independent equations. You are saying you have an array

$\begin{array} {ccc} & 20 & 80 \\ 0 & a & b \\ 112 & c & d \\ 224 & e & f \end{array}$

where $a$ through $f$ are what you want to solve for. If the effects are independent and additive, you would expect $b-a=d-c=f-e$, $e-c=f-d$, and $c-a=d-b$. These reduce the data to only three values, which you can check for consistency. But without at least one more relation, you will get a one-dimensional continuum of solutions.
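To make the additive case concrete, here is a minimal sketch in Python. It assumes strict additivity (no interaction) and a balanced design, in which case each cell mean equals its row mean plus its column mean minus the grand mean. The variable names are illustrative, and averaging the two slightly different grand-mean estimates (they differ because the published values are rounded) is an arbitrary choice, not something taken from the paper.

```python
# Published main-effect means quoted in the question (Ma 2001, Table 2).
row_means = {20: 7577.0, 80: 9186.0}                 # factor A: row spacing
col_means = {0: 3706.0, 112: 9402.0, 224: 12038.0}   # factor B: nitrogen rate

# The grand mean can be estimated from either margin. The two estimates
# differ by 0.5 here due to rounding; averaging them is an assumption.
grand_from_rows = sum(row_means.values()) / len(row_means)
grand_from_cols = sum(col_means.values()) / len(col_means)
grand = (grand_from_rows + grand_from_cols) / 2

# Additive model (assumed): cell mean = row mean + column mean - grand mean.
cells = {
    (a, b): ra + cb - grand
    for a, ra in row_means.items()
    for b, cb in col_means.items()
}

for (a, b), value in cells.items():
    print(f"A={a:>2}, B={b:>3}: {value:.2f}")
```

In the notation of the array above, this gives $b-a=d-c=f-e=1609$, which is just the difference between the two row-spacing means. The result is only as good as the additivity assumption; if there is any interaction, these values are not the true treatment means.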

It sounds like you believe 3706 is some sort of weighted average of $a$ and $b$ and similarly for the other entries. Is that right?

  • 0
    Thanks for helping me work through this. 2011-01-26
2

The general rule is that $n$ independent equations allow you to solve for $n$ unknowns. So I don't think you'll be able to recover each of the original 6 data points. The best you can do is produce a set of constraints that, given enough additional information (known data points, or relationships between them that aren't redundant with what you already have), would allow you to find the rest.
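One way to count how many of the five published-mean equations are actually independent is to compute the rank of their coefficient matrix. The sketch below does this with exact fractions and plain Gaussian elimination; the ordering of the unknowns and the helper name `rank_of` are my own choices for illustration.

```python
from fractions import Fraction

third, half, zero = Fraction(1, 3), Fraction(1, 2), Fraction(0)

# Unknown order: X(20,0), X(20,112), X(20,224), X(80,0), X(80,112), X(80,224).
# Each row holds the coefficients of one published-mean equation.
coefficients = [
    [third, third, third, zero, zero, zero],  # mean over B at A = 20
    [zero, zero, zero, third, third, third],  # mean over B at A = 80
    [half, zero, zero, half, zero, zero],     # mean over A at B = 0
    [zero, half, zero, zero, half, zero],     # mean over A at B = 112
    [zero, zero, half, zero, zero, half],     # mean over A at B = 224
]


def rank_of(matrix):
    """Return the rank of a matrix of Fractions via Gaussian elimination."""
    rows = [list(row) for row in matrix]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate this column from all rows below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


n_unknowns = len(coefficients[0])
rank = rank_of(coefficients)
print(f"rank = {rank}, free dimensions = {n_unknowns - rank}")
```

The rank comes out as four rather than five, matching the comment on the question, so without further assumptions the six treatment means are determined only up to a two-dimensional family.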

  • 0
    @picakhu: of course, but I declined to discuss the exceptions here because as far as I can remember, special cases like that don't occur for linear equations like the ones in this question. 2011-01-24
1

[Update Note] I saw the related question at mathoverflow. There it seemed we were dealing with a frequency table, so I repeated that scheme here. But now I see the question is focused on means in a two-factor ANOVA. I'll see whether those two concepts can be interchanged here; for instance, a 1:1 correspondence should only be possible if the coefficients under treatments (means?) are based on the same number of observations. Possibly it is better to delete this answer later. [end note]

Here is a solution. I computed the "expected frequencies" based on your values, where I compensated for the $/3$ and $/2$ divisions by working with row and column totals instead of means. Also, I corrected 9186 to 9187 to make the totals consistent.
$ \begin{array}{r|rrr|rl}
 & B: 0 & B: 112 & B: 224 & \text{(all)} & \\ \hline
A: 20 & 3350.08 & 8499.04 & 10881.88 & 22731 & /3 = 7577 \\
A: 80 & 4061.92 & 10304.96 & 13194.12 & 27561 & /3 = 9187 \\ \hline
\text{(all)} & 7412 & 18804 & 24076 & 50292 & \\
 & /2 = 3706 & /2 = 9402 & /2 = 12038 & &
\end{array} $
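For anyone wanting to reproduce the table, the following Python sketch applies the usual expected-frequency rule, cell = (row total × column total) / grand total, to the marginal totals. It uses the adjusted value 9187 rather than the published 9186, as in the table; the variable names are illustrative. Note that this rule corresponds to a multiplicative (proportional) structure, which is an assumption and not the same as an additive ANOVA model.

```python
# Marginal totals recovered from the published means (undoing the /3 and /2).
row_totals = {20: 3 * 7577, 80: 3 * 9187}   # 9187 is the adjusted value, not the published 9186
col_totals = {0: 2 * 3706, 112: 2 * 9402, 224: 2 * 12038}

grand_total = sum(col_totals.values())
# With the adjusted value, both margins add up to the same grand total.
assert grand_total == sum(row_totals.values())

# Expected-frequency rule (assumed): cell = row total * column total / grand total.
cells = {
    (a, b): rt * ct / grand_total
    for a, rt in row_totals.items()
    for b, ct in col_totals.items()
}

for (a, b), value in cells.items():
    print(f"A={a:>2}, B={b:>3}: {value:.2f}")
```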

  • 0
    @David: The last example in Wikipedia just reproduces the "expected frequencies" method. However, from the given 2x2 table example I cannot tell whether this is **always** the case (then the solution would be simple, but I doubt it). Possibly any meaningful assumption about the measures in the treatment groups would have to explicitly assume chi-square = 0 for the frequencies. 2011-01-25