6

In psychology we sometimes use balanced Latin squares to order our tests, counterbalancing first-order carry-over effects (fatigue, learning, etc.).

For our current study we want to pretest 12 stimuli (let's call them A-L) to see whether they're useful for a later study. We don't want to bore our participants, so we want to give each of them only half of the material we need to test. We're indifferent about the size of the subset, as long as each participant sees between 4 and 8 of the 12 stimuli.

For a different reason (to achieve sufficient statistical power) we need at least 132 participants (at least 11 runs in which each stimulus occurs first), and we don't want to exceed this by much.

A balanced $6 \times 6$ Latin square isn't too hard to construct. There is a Matlab script as well.

    A   B   F   C   E   D
    B   C   A   D   F   E
    C   D   B   E   A   F
    D   E   C   F   B   A
    E   F   D   A   C   B
    F   A   E   B   D   C
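The square above matches the usual zig-zag construction (often called a Williams design): the first row runs 0, 1, n-1, 2, n-2, ..., and every later row adds 1 modulo n. Here is a minimal Python sketch of that construction, assuming an even number of stimuli; the function names are made up for illustration and this is not the Matlab script mentioned above.

```python
from collections import Counter
from string import ascii_uppercase


def balanced_latin_square(n):
    """Zig-zag construction; the result is balanced when n is even."""
    # First row: 0, 1, n-1, 2, n-2, 3, ...
    first = [0]
    lo, hi = 1, n - 1
    while len(first) < n:
        first.append(lo)
        lo += 1
        if len(first) < n:
            first.append(hi)
            hi -= 1
    # Row i is the first row shifted by i (mod n).
    return [[(x + i) % n for x in first] for i in range(n)]


def adjacent_pair_counts(rows):
    """Count each ordered pair (left, right) of horizontally adjacent cells."""
    return Counter((r[j], r[j + 1]) for r in rows for j in range(len(r) - 1))


square = balanced_latin_square(6)
for row in square:
    print(" ".join(ascii_uppercase[x] for x in row))

counts = adjacent_pair_counts(square)
# For n = 6 all 30 ordered pairs should occur exactly once.
print(len(counts), set(counts.values()))
```

With `n = 6` the printed rows should reproduce the square shown above, and the pair counter gives a direct check of the "balanced" property.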

But is it also possible to construct a balanced (Latin) rectangle (6 columns wide), in which each letter is followed by every other letter an equal number of times? How many rows (participants) would this require?

Maybe somebody with a bit more handle on this problem will enjoy the puzzle!

Sorry if my language is too idiosyncratic. If I can clarify with the appropriate jargon, I will; this is quite outside my field.

Splitting it in the middle and then adding the broken-up orders seemed the wrong approach to me.


Edit: Can I find one computationally? I have no idea how ridiculous that question is, but the sheer number of permutations (479 001 600) does seem daunting.


Edit 2: I didn't want to make this question too much about our experiment, but apparently that made it less clear. I'm sorry. I edited the clarifications into the question.

  • 0
    I don't understand the question. First of all, you write about numbers followed by other numbers, but I don't see any numbers - do you mean letters? Second, what does it mean for one number to follow another? Does "12 stimuli" mean you actually have 12 letters and you're trying to find a something-by-6 array such that each letter occurs equally often, no letter occurring twice in any row or column? Maybe you could give an example with 12 and 6 replaced by, say, 4 and 2 so we could see what you really want. 2011-07-01

3 Answers

2

I don't understand the question, so this is probably not an answer, but maybe any objection will clarify the question.

    A B D G K E
    B C E H L F
    C D F I A G
    D E G J B H
    E F H K C I
    F G I L D J
    G H J A E K
    H I K B F L
    I J L C G A
    J K A D H B
    K L B E I C
    L A C F J D

There are 12 symbols in 6 columns; each symbol occurs exactly once in each column, no symbol appears twice in any row, and no symbol appears with the same symbol immediately to its right more than once.
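These three properties are easy to check mechanically. A short Python sketch, with the rows copied from the array in this answer (the variable names are only illustrative):

```python
from collections import Counter

# The 12 rows of the array, copied from the answer above.
ROWS = [
    "ABDGKE", "BCEHLF", "CDFIAG", "DEGJBH", "EFHKCI", "FGILDJ",
    "GHJAEK", "HIKBFL", "IJLCGA", "JKADHB", "KLBEIC", "LACFJD",
]
symbols = set("ABCDEFGHIJKL")

# Each symbol occurs exactly once in each column.
columns_ok = all({row[j] for row in ROWS} == symbols for j in range(6))

# No symbol appears twice in any row.
rows_ok = all(len(set(row)) == len(row) for row in ROWS)

# Count every ordered pair (left, right) of horizontally adjacent cells.
pairs = Counter((row[j], row[j + 1]) for row in ROWS for j in range(5))
pairs_ok = max(pairs.values()) == 1

print(columns_ok, rows_ok, pairs_ok)
print(len(pairs), "of", 12 * 11, "ordered pairs occur")
```

The last line shows why this array is not balanced in the asker's sense: 12 rows with 5 adjacent pairs each can cover only 60 of the 132 ordered pairs.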

  • 2
    @Ruben, OK, we're getting somewhere. I'll give it some more thought (but I suspect that if Douglas thinks it's hard, I'll get nowhere fast). 2011-07-01
3

If I understand the question correctly, you are seeking a $12 \times 6$ Latin rectangle (strictly speaking, this is the transpose of a Latin rectangle) in which each of 12 symbols (each representing a stimulus) occurs exactly once in each column (representing a round of experimentation) and at most once in each row (representing a participant). There is an additional condition that the Latin rectangle be balanced (i.e. each possible ordered pair of symbols occurs exactly once in horizontally neighbouring cells).

This cannot be achieved: the rectangle contains exactly 6 copies of each symbol, so a symbol has at most 6 right-hand neighbours, yet each of the 11 other symbols would have to follow it. Simply put, they won't fit. The closest you could get is to take a $12 \times 12$ balanced Latin square and chop off the last 6 columns (this is probably not suitable for your experiment).

A natural next step would be to modify the scheme so as to incorporate more rows (i.e. participants) and weaken the "Latin" property, i.e. several participants may receive the same stimulus at the same time. In such a modified scheme, we could balance the carry-over effects by ensuring that each ordered pair of symbols (A,B) appears in horizontally adjacent cells an equal number of times (as opposed to exactly once in the Latin square case). If you stick to exactly 6 rounds of testing, then each row contains exactly 5 ordered pairs. Unfortunately, there are $12 \times 11=132$ ordered pairs in total, and 132 is coprime to 5. If each ordered pair appears $\lambda$ times in a scheme with $r$ rows, then $5r = 132\lambda$, so $r$ must be a multiple of 132: the smallest such balanced scheme would require at least 132 participants.

If you're flexible on the number of rounds and had 7 rounds of testing, it is likely possible to find such a scheme with only 22 participants (since there are 6 ordered pairs per row, and 132 is divisible by 6). That is, it seems likely there is a $22 \times 7$ matrix containing the symbols $1,2,\ldots,12$ that is balanced (although actually finding one might be a bit tricky).
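The counting argument can be tabulated for other numbers of rounds. The sketch below computes only a necessary lower bound: with $n$ symbols and $k$ rounds, $r(k-1)$ must be a multiple of $n(n-1)$. It says nothing about whether a design with that many rows exists. The optional `column_balanced` flag is an extra assumption of mine, reflecting the asker's wish that each stimulus occurs first (and in every other position) equally often, which forces $r$ to be a multiple of $n$ as well.

```python
from math import gcd


def min_rows(n, k, column_balanced=False):
    """Smallest row count r permitted by counting alone (not an existence proof).

    Each row has k - 1 adjacent ordered pairs and there are n * (n - 1)
    ordered pairs, so r * (k - 1) must be a multiple of n * (n - 1).
    """
    pairs = n * (n - 1)
    r = pairs // gcd(pairs, k - 1)
    if column_balanced:
        # Assumption: every symbol appears equally often in each column,
        # so r must also be a multiple of n; take lcm(r, n).
        r = r * n // gcd(r, n)
    return r


for k in range(4, 9):
    print(k, min_rows(12, k), min_rows(12, k, column_balanced=True))
```

For 12 symbols this gives 132 rows at 6 rounds and 22 rows at 7 rounds, as above. With the column-balance assumption switched on, the bound is 132 for every $k$ from 4 to 8.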

  • 0
    [Sorry for slow response, I'm travelling at the moment] @Jyrki: ah, good point. So it sounds like the 132 x 6 is the most suitable (although requires quite a few participants). 2011-07-03
3

I need to change your parameters a bit, but it sounded like you would be flexible, so let me suggest the following idea.

The idea only works when the number of stimuli is a prime number $p$. So if you need to test exactly 12 stimuli, then this is useless, but maybe you can leave one out, or add a placebo/null test to the mix, and use this idea with $p=11$ or $p=13$.

The scheme has $p(p-1)$ rows and $k$ columns, where $k$ is any number between $2$ and $p$ inclusive.

Instead of letters A,B,... I use numbers $0,1,\ldots,p-1$ as table entries. One of the standard constructions for Latin squares is the following: First pick a parameter $m$ that is an integer in the range $1\le m<p$. Then put on row #$i$ and column #$j$ the number that equals the remainder of $i+mj$ when divided by $p$. This gives us a $p\times p$ Latin square. Call it $L(m,p)$. In this Latin square all the pairs of consecutive entries on all rows differ by $m$. Therefore this square alone is the very opposite of balanced. Within this square a zero is always followed by an $m$, a one by $m+1$ et cetera. Note that we count cyclically $\pmod p$, so an entry $p-m$ is always followed by $(p-m)+m=p\equiv 0$, and so forth.
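As a small illustration of this construction, here is a Python sketch of $L(m,p)$; the function names `latin_square` and `is_latin` are chosen just for this example.

```python
def latin_square(m, p):
    """L(m, p): the entry in row i, column j is (i + m*j) mod p."""
    return [[(i + m * j) % p for j in range(p)] for i in range(p)]


def is_latin(square):
    """True if every row and every column contains each of 0..n-1 once."""
    n = len(square)
    full = set(range(n))
    rows_ok = all(set(row) == full for row in square)
    cols_ok = all({row[j] for row in square} == full for j in range(n))
    return rows_ok and cols_ok


square = latin_square(2, 5)
for row in square:
    print(row)
print(is_latin(square))

# Consecutive entries in a row always differ by m (mod p).
steps = {(row[j + 1] - row[j]) % 5 for row in square for j in range(4)}
print(steps)
```

For $m=2$, $p=5$ the first row is 0, 2, 4, 1, 3, and the set of steps contains only 2, which is the "opposite of balanced" behaviour described above.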

Here comes the trick. We build a larger table with $p(p-1)$ rows by putting all the Latin squares $L(1,p)$, $L(2,p)$, $\ldots$, $L(p-1,p)$ on top of each other. By using all the possible values of $m$ we get a balanced table in the end!

We can take the first $k$ columns of this large table, and be done with it. Each entry occurs in every column exactly $p-1$ times. If two distinct entries, say $a$ and $b$, differ by $m=b-a$, the pair $(a,b)$ appears in the rows of $L(m,p)$ exactly $k-1$ times (once for each pair of adjacent columns), and doesn't appear on any other rows. There are no repetitions of stimuli within rows, as the rows are parts of rows of a Latin square.
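The claims in this paragraph can be checked numerically. A minimal sketch for $p=11$, $k=6$ (the helper name `stacked_design` is hypothetical):

```python
from collections import Counter


def stacked_design(p, k):
    """First k columns of L(1,p), L(2,p), ..., L(p-1,p) stacked vertically."""
    return [[(i + m * j) % p for j in range(k)]
            for m in range(1, p)
            for i in range(p)]


p, k = 11, 6
design = stacked_design(p, k)

# Ordered pairs of horizontally adjacent entries.
pair_counts = Counter((r[j], r[j + 1]) for r in design for j in range(k - 1))

# How often each entry occurs in each column.
column_counts = [Counter(r[j] for r in design) for j in range(k)]

print(len(design))                # p * (p - 1) rows
print(len(pair_counts))           # number of distinct ordered pairs seen
print(set(pair_counts.values()))  # each pair should occur k - 1 times
print(all(set(c.values()) == {p - 1} for c in column_counts))
print(all(len(set(r)) == k for r in design))  # no repeats within a row
```

With these parameters there are 110 rows, all 110 ordered pairs occur, and each occurs 5 times.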

Why doesn't this work with $p=12$? In other words, why do I insist that $p$ must be a prime? The reason is that the formula $(i+mj) \bmod p$ gives a Latin square only when $m$ is coprime to $p$. For example, if $p=12$ and $m=6$, then the rows of $L(6,12)$ look like 0,6,0,6,...; 1,7,1,7,...
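A quick Python check of this failure, listing which values of $m$ still give rows without repeats when $p=12$ (the name `first_row` is just for this sketch):

```python
from math import gcd


def first_row(m, p):
    """Row 0 of L(m, p): the entries (m * j) mod p for j = 0..p-1."""
    return [(m * j) % p for j in range(p)]


p = 12
print(first_row(6, p))  # alternates between 0 and 6

# Values of m whose rows contain no repeated entry.
no_repeats = [m for m in range(1, p) if len(set(first_row(m, p))) == p]
# Values of m that are coprime to p.
coprime = [m for m in range(1, p) if gcd(m, p) == 1]
print(no_repeats, coprime)
```

Both lists come out as 1, 5, 7, 11, so for $p=12$ only four of the eleven steps $m$ are usable and the stacked table cannot cover every difference $b-a$.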

With $p=13$ you get 156 rows, so the table may be too large for you. Another possibly troubling feature of this construction is the following. The rows of the Latin square $L(1,p)$ look like 0,1,2,...; 1,2,3,...; so they have intersections of $k-1$ elements. This may be bad for eliminating secondary correlations from your test. If you do $k=6$ tests per participant, then five participants will see stimulus #1 followed by stimulus #2. That's ok, but I am a bit troubled by the fact that four out of those five will see stimulus #3 next. And the fifth person won't see anything, because his/her day ends after stimulus #2. Similar patterns appear in other component squares $L(m,p)$. If these shortcomings make the construction unusable, then I apologize for wasting your time.
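The overlap described here can be counted directly. A small sketch for $p=13$, $k=6$, again using a hypothetical helper `stacked_design` for the stacked table:

```python
def stacked_design(p, k):
    """First k columns of L(1,p), ..., L(p-1,p) stacked vertically."""
    return [[(i + m * j) % p for j in range(k)]
            for m in range(1, p)
            for i in range(p)]


design = stacked_design(13, 6)

saw_1_then_2 = 0  # participants who see stimulus 1 immediately before 2
then_saw_3 = 0    # ... and who then see stimulus 3 next

for row in design:
    for j in range(len(row) - 1):
        if row[j] == 1 and row[j + 1] == 2:
            saw_1_then_2 += 1
            if j + 2 < len(row) and row[j + 2] == 3:
                then_saw_3 += 1

print(len(design), saw_1_then_2, then_saw_3)
```

This reproduces the numbers above: 156 rows, five participants see 1 followed by 2, and four of those five see 3 immediately afterwards.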

[Edit: Oh boy, I should learn to proofread and not post in such haste. I apologize for the mostly illegible first version :-(]

  • 2
    @Ruben: Correct. That formula gives a $p\times p$ Latin square $L(m,p)$, whenever $m$ and $p$ are coprime. But to get a balanced design, we need to use all $m$ in the range $[1,p-1]$, and as you noticed, this forces $p$ to be a prime. 2011-07-04