Suppose that we have access to only the clues of a crossword puzzle along with the number of letters that the answers are supposed to be. Is there an algorithm that we can use to reconstruct the crossword puzzle? Apart from the clues and the length of answers we also know the grid size of the crossword.
Is there an algorithm to recover a crossword grid based on the clues alone?
-
0@TheChaz I do not see why multiple answers matter for the purpose of reconstructing the *empty* grid. – 2011-10-28
4 Answers
Edit: Here are two American crossword shapes which have the same clue numbers and lengths. They are rotationally symmetric, every letter is "checked", the white squares are connected, and the board is a square with odd side length (7).
- Across 1(3), 4(6), 7(6), 8(3), 9(6), 12(6), 13(3)
- Down 1(7), 2(7), 3(7), 4(2), 5(2), 6(2), 9(2), 10(2), 11(2)
$\left[\begin{array}{rrrrrrr} ◼& ◼& 1 & 2 & 3 & ◼ & ◼ \\ ◼& 4& ◻ & ◻ & ◻ & 5 & 6 \\ ◼& 7& ◻ & ◻ & ◻ & ◻ & ◻ \\ ◼& ◼& 8 & ◻ & ◻ & ◼ & ◼ \\ 9&10& ◻ & ◻ & ◻ &11 & ◼ \\ 12& ◻& ◻ & ◻ & ◻ & ◻ & ◼ \\ ◼& ◼&13 & ◻ & ◻ & ◼ & ◼ \\ \end{array}\right]\qquad \left[\begin{array}{rrrrrrr} ◼ & ◼ & 1 & 2 & 3 & ◼ & ◼ \\ 4 & 5 & ◻ & ◻ & ◻ & 6 & ◼ \\ 7 & ◻ & ◻ & ◻ & ◻ & ◻ & ◼ \\ ◼ & ◼ & 8 & ◻ & ◻ & ◼ & ◼ \\ ◼ & 9 & ◻ & ◻ & ◻ &10 &11 \\ ◼ &12 & ◻ & ◻ & ◻ & ◻ & ◻ \\ ◼ & ◼ &13 & ◻ & ◻ & ◼ & ◼ \\ \end{array}\right]$
I think the result is not unique in general. You might want to look for some smaller-but-realistic crosswords and see to what extent "reasonable" examples are unique. Here is a very short crossword:
The crossword is 2×3
- 1 Across [3 letters]
- 2 Down [2 letters]
- 3 Across [1 letters]
Solution (empty grid) #1: $\begin{array}{r} \square\square\square\\ \blacksquare\square\blacksquare\\ \end{array}$
Solution (empty grid) #2: $\begin{array}{r} \square\square\square\\ \blacksquare\blacksquare\square\\ \end{array}$
You might like a similar type of puzzle: Griddlers.
There they tell you which row the across clues are in and which column the down clues are in. If we knew that for the 2×3 example, then the answer would be unique. A "griddler" wouldn't have to be unique, but I believe the website requires posted puzzles to have unique solutions.
Apparently I don't know much about how crosswords are supposed to look, btu in my defense it looks like there are some serious cultural differences in crosswords. See the wikipedia article and a bit lower on the diagramless rules.
-
0hmm, that seems to work a bit better but the crossword is not square. :-) Anyway, I guess you are right about the lack of a unique solution. – 2011-10-28
Kathryn Lybarger constructed a small example to show that having the clues and the grid size, but not the clue lengths, is not enough information to reconstruct the grid for an American crossword puzzle.
Consider the following two grids:
$ \begin{array}{rrrrr} \blacksquare \square \square \square \blacksquare \\ \square \square \square \square \square \\ \square \square \blacksquare \square \square \\ \square \square \square \square \square \\ \blacksquare \square \square \square \blacksquare \\ \end{array} \quad \text{and}\quad \begin{array}{rrrrr} \blacksquare \blacksquare \square \square \square \\ \square \square \square \square \square \\ \square \square \blacksquare \square \square \\ \square \square \square \square \square \\ \square \square \square \blacksquare \blacksquare \\ \end{array} $
Then one can confirm that for both puzzles, the Across clues are 1, 4, 6, 7, 8, and 10, while the Down clues are 1, 2, 3, 4, 5, and 9. Of course, the clue lengths differ in this example.
Not an answer, but too much for a comment: This is similar to (but not the same as) the problem of doing diagramless crossword puzzles. In that case, you are given the size of the grid and the numbers of each clue. The number of the second down clue gives you the length of the first across clue, and the clue numbering continues to be important as you solve the puzzle. It is very important in that case to specify that you are playing by crossword rules: diagram symmetric under 180 degree rotation, all letters checked, no two letter words. One thing to note is that puzzles get harder as the number of black squares increases. I haven't done a study, but strongly suspect that you cannot recover the diagram just from the clue number information. If you can, it isn't obvious.
Providing the number of letters in the answer is in the spirit of British or cryptic crosswords, though in those cases the information is also available from the diagram. Many variety cryptics provide that information and it is an important clue.
I've done a few crosswords under such conditions before (clues with word-lengths, missing grid) and I would make a few points:
(1) Uniqueness of grid is certainly not guaranteed - this is especially obvious if you are also in the position of not knowing the size of the grid.
(2) Therefore any program to produce the right grid will probably require at least some human intervention to narrow down the possible solutions.
(3) In practical terms, you may as well enjoy the combined puzzle of solving the crossword and reconstructing the grid simultaneously - once you are reasonably confident of a couple of the answers, the need for intersecting letters to agree will rapidly reduce the range of possibilities. (And after all, the grid mostly only helps solve the clues once you have made a start). But use a pencil - you'll have to move some of the words around.
(4) I'll probably now go off and try to write a computer program to make a 'best guess' at the grid for a given set of clues, so if you're really keen to have such a program, email me sometime on i.bendr@gmail.com and I'll report on my progress.