30
$\begingroup$

100 prisoners are imprisoned in solitary cells. Each cell is windowless and soundproof. There's a central living room with one light bulb; the bulb is initially off. No prisoner can see the light bulb from his or her own cell. Each day, the warden picks a prisoner equally at random, and that prisoner visits the central living room; at the end of the day the prisoner is returned to his cell. While in the living room, the prisoner can toggle the bulb if he or she wishes. Also, the prisoner has the option of asserting the claim that all 100 prisoners have been to the living room. If this assertion is false (that is, some prisoners still haven't been to the living room), all 100 prisoners will be shot for their stupidity. However, if it is indeed true, all prisoners are set free and inducted into MENSA, since the world can always use more smart people. Thus, the assertion should only be made if the prisoner is 100% certain of its validity.

Before this whole procedure begins, the prisoners are allowed to get together in the courtyard to discuss a plan. What is the optimal plan they can agree on, so that eventually, someone will make a correct assertion?

http://www.ocf.berkeley.edu/~wwu/riddles/hard.shtml

I was wondering does anyone know the solution for the 4,000 days solution. Also, anyone got any ideas on how to solve this?

Just wanted to know if anyone got any ideas how to solve this. http://www.siam.org/ is offering prize money for a proof of a optimal solution for it. I suppose to make it clear the people want a plan that would mean on average it would take least amount of time to free the prisoner. The point is the average run time is as low as possible.

  • 2
    Depends entirely on the values of the outcomes (freedom, death) and cost of another day in prison.2012-03-04
  • 3
    Why did someone downvote this. The riddle is well defined and if you click on the link it's okay. Plus there is a riddle tag.2012-03-04
  • 0
    @AlexBecker no . The point is on average you want a strategry that takes the least amount of time so everyone is freed.2012-03-04
  • 2
    @Alex The cost of death is infinity; the cost of another day in prison is 1; and the value of freedom doesn't matter, because any strategy that avoids infinite punishment will result in freedom.2012-03-04
  • 1
    That SIAM link is a bit unhelpful, steadyboy! Can you be more specific?2012-03-04
  • 2
    @AlexBecker, it's a hard condition that failure must he impossible. Here's a simple strategy that provides a small postive chance of success and zero risk of failure: prisoner 0 turns the lamp on on days divisible by 100 and off on all other days. No other prisoner ever turns it on. Prisoner $n$ for $0$n$ modulo 100. Prisoner 99 declares success if he's in the living room on day 99 modulo 100 and sees the light being on. Expected running time about $100^{101}$ days; will succeed eventually with probability 1. – 2012-03-04
  • 1
    @TonyK: Indeed. Google turns up only one other mention of this and the wording is almost identical with no link to any "prize". Color me skeptical for various reasons.2012-03-04
  • 0
    This question is highly flawed. You state no prisoner can see the light bulb from his/her own cell but I am wondering can they see the light? If so, then any of the $100$ prisoners that can can be a "counter" and any "new" person that enters the central living area can toggle the light to indicate it is their first time in the room. Problems are you didn't mention if the warden can also toggle the light to mess up the system. You also stated prisoners can be shot but you didn't say with what. Perhaps a camera?2015-01-03
  • 0
    I have a hard time thinking of a [nice] plan. But if I were forced into this I would advise each prisoner to decline a second invite. Remove random, 101 days minimum. Sounds like Alcatraz.2017-07-11

6 Answers 6

18

The "standard" solution for the puzzle is the following. Prisoners choose one of them, lets call him "The Counter". The job of The Counter is to count the prisoners who have been in living room. Other prisoners' job is to give a signal to The Counter that they have been in the living room. This is done as follows. The Counter is the only person who can turn the light off. Others can only turn the light on and do it only once. That is if a prisoner walks into the living and the light bulb is off and he has never turned the bulb on, then he turns it on. The bulb will stay on, until The Counter comes and turns it off. When The Counter has turned the bulb off 99 times, he knows all prisoners have been in the living room.

I don't know what is efficiency of this solution, but in your link it is stated that the standard solution takes 27-28 years on average. So I suspect that is efficiency of this solution.

  • 0
    Back of the envelope calculuation: Each time the bulb is off we have to wait until a yet uncounted prisoner comes around. This takes longer as time goes by, but in total we should expect about $$\frac{100}{99}+\frac{100}{98}+\frac{100}{97}+\cdots+\frac{100}{1} \approx 100\log 99$$ days of waiting in this phase. Once it is on we must wait for the counter to show up, on average 100 days per prisoner we count. This works out to between 28 and 29 years, but the logarithmic estimate may be slightly inflated.2012-03-04
  • 0
    @Henning: Each time the Counter is waiting to go to the living room, the probability that the light will be turned on is the probability that one of the uncounted prisoners gets to the living room before the Counter. If there are $u$ uncounted prisoners, this probability is $\dfrac{u}{u+1}$. Thus, the number of passes needed for the Counter to see the light on is $\dfrac{u+1}{u}$.2012-03-04
  • 0
    @Henning: Therefore, the expected number of days to count all $99$ other prisoners is $$\left(\frac{100}{99}+\frac{99}{98}+\frac{98}{97}+\dots+\frac32+\frac21\right) \cdot100$$ and this is approximately $(99+\log(99)+\gamma)\cdot100\approx10417$ days, which is about $28.5$ years.2012-03-04
  • 0
    @Henning: now that I reread your answer, I notice something I missed before, and I believe that our answers are the same. Sorry for the interruption :-)2012-03-05
  • 0
    If we would not prescribe the person doing the counting, but instead take the person who is with the warden the second day as the counter, we would improve right?2017-04-09
9

Here's one way to improve on Levon's solution. I don't have time to do a full probability analysis just now, but my very rough estimates suggest that with careful tuning, techniques like this should be able to get the average running time down to around 4000 days:

It's not certain but "pretty likely" that every prisoner gets to visit the living room at least once during the first, say 800 days. So divide the strategy into several phases:

  • Day 1 to 800: Everyone toggles the lamp the first time they see the living room. If we're lucky, after day 800 there will be 50 prisoners who have ever turned off the lamp. They are the "active" prisoners. (If we're unlucky and there's someone who hasn't been in the living room yet, there will be less than 50 active prisoners).

  • Day 801 to 1600: Ever prisoner who is active after the first phase toggles the lamp the first time they see the living room during this period. (The prisoner who enters on day 801 makes sure the lamp is already off before he does his part -- if so, something will have gone wrong but proceed according to plan anyway). The prisoners who turned the lamp off during this phase are still active. If we're lucky there are 25 off them.

  • Day 1601 to 2400: Halve the number of active prisones once again. This time those who turned the lamp on stay active; there are now 13 active prisoners unless something went wrong.

  • Day 2401 to 6000: Count the 13 still active prisoners using the standard algorithm, with a Counter selected in advance.

If anybody finds themselves still in prison on day 6001, everyone shrugs and starts the entire algorithm over from day 1. Since 3600 days should be plenty of time to count 13 active prisoners, the main risk is that someone who needed to didn't get to participate in one of the first three phases, and since the risk of that is small, it won't affect the expected running time of the algorithm much.

With the parameters given here (and several approximation shortcuts in the calculation) this strategy has an average running time of 4353 days. But there's room for some optimizations by tweaking the numbers. It appears that 800 days is close to optimal for the first halving, but the later halvings can be shortened a bit without incurring quite as large a total risk of missing an active prisoner and having to restart.

EDIT: After a semi-systematic search for better parameters it seems that the best this method can get is an average running time of 4227 days (about 11½ years), which is achieved by three halving phases of 840, 770 and 700 days, followed by a counting phase of 2200 days. Neither more halving phases nor fewer ones improved the overall efficiency. Back to the drawing board ...

1

A survey of various strategies can be found at https://www.math.washington.edu/~morrow/336_11/papers/yisong.pdf. It describes a strategy ("two-stage counting") which is believed, based on simulations, to yield expected run times between 3500 and 4000 days, and references a hybrid strategy by a Bertram Felgenhauer which is supposedly proved to run in 3949 days average.

1

A random person chosen could mean that all $100$ prisoners will never all be chosen in their lifetime so it seems to me that there is no minimum amount of time to satisfy the "$100$% certain" situation (in the general case) so what is the point of this question? It may take anywhere from $100$ days to a lifetime. Why don't they just each get a number assigned to them from $1$ to $100$ and make a mark in the living room when they are in it? Prisoners are good at that type of stuff. Perhaps scratch a binary (stick men) code?

Also, what happens if one of the unchosen ones dies before all $100$ are chosen? How is it verified if all the prisoners have visited the living room or not? If it is written down somewhere then maybe the prisoners should find out where and use that info.

Too many unknowns in this badly worded question to come up with a reasonable answer.

  • 2
    You can do it as a practical exercise (highly unlikely to ever occur in real life) or you can do it as a mathematical puzzle. In the latter case you idealize the problem so that it works as designed.2015-01-08
  • 1
    Changing the problem by "idealizing" it will then change the answer depending on what "liberties" the person solving it takes. This problem could have been worded much better to clearly state any restrictions and reasonable assumptions. Without those, people can just make assumptions (even unreasonable ones). For example, if you allow me liberties I have a $100$ day solution. I will assume the warden is new and doesn't know the prisoners so when they are "randomly" called, the prisoners will "come forward" in order (from $1$ to $100$) that they preagreed on. What do I win for my solution?2015-01-08
  • 0
    I think most people would interpret "imprisoned in solitary cells" to mean that the prisoners spend _all_ their time there except the explicitly specified visits to the living room, so one can hardly "step forward" when one's day arrives. You can assume the prisoners can communicate reliably via the light switch and _only_ via the light switch, or you can assume other communication; if you don't assume well-defined limits on that communication then the problem becomes uninteresting, but no doubt there are interesting variations that can be devised.2015-01-08
  • 1
    My point is either ask the question clearly or expect to have answers to "changed" questions cuz many different interpretations/variations are possible. I've seen a similar problem on a show called Brain Games I think. They were able to toggle 3 lights with 3 switches but could not see the lights and they had only 1 opportunity to ID which was light 1, which was 2, and which was 3. The common problem I see with word problems like this is they have flaws in them that can be exploited easily. My solution was to just set up mirrors to see which lights were being toggled. The host "forgot" that.2015-01-08
  • 0
    Some mathematical problems are much easier to state with words, as in this case. If you provide the statement as some kind of Markov chain; it becomes less "engaging" and perhaps harder to visualize. Steven Pinker (indirectly) holds that this way of conveying ideas is much better than the directly mathematical version.2017-02-06
1

The shortest time solutions that I am aware of are based on this scheme, first suggested by Paul Hammond here.

Time is broken into Rounds, and each round is further broken into two Stages. In Paul Hammond's original scheme, the first stage in each round is 2600 days, while the second is 2700 days.

Each prisoner is considered to start with a virtual "token". If the light is off, they can leave their token in the room by turning the light on (so they no longer have a token themselves). They can pick up a token left in the room by turning the light off. At the initial meeting, A head counter and 9 assistant counters are picked. Everyone else is a "drone". Drones never pick up a token (except on the last day of a stage). They can only leave one.

During the first stage, the drones leave their token in the room when the light is off. If the light is on, or they have already given away their token, they do nothing. All counters pick up any tokens left in the room until they have 10 tokens (including their own). After that, they do nothing in this stage. On the last day of the stage, whoever visits the room picks up any token that was there (if this was a drone, and he had not yet given his own token away, then he will have two tokens to pass on in the next round, which he must do one at a time).

During the 2nd stage, tokens are passed 10 at a time from assistant counters to the head counter. Drones never turn the light on or off. Asssistant counters can only turn on the light if they have 10 tokens to pass. After passing, they become drones. The head counter turns the light off on his next visit, and adds 10 tokens to his inventory. If he has all 100 tokens, he declares everyone has visited. As with the first stage, if the light is on on the last day of the stage, the visitor picks up the 10 tokens there, and saves them for stage 2 of the next round, when he will act as an assistant counter.

If the round is unsuccessful, an additional round with new stage 1 and stage 2 take place, and so on, until the head counter finally reaches 100.

Paul Hammond originally had them start over with each new round (everyone gets their tokens back and all counters reset their counts to just their own token). Computer simulations estimate the expected run time at 3964 days. AlexH suggested saving the tokens as I described above. This change and tweaking of stage lengths (which need not be the same between rounds) produced a claimed estimated expected run time of 3600 days. However, the person posting that result did not say what stage lengths he used. However a later variant of the scheme which picks the counters during a special 40 day period at the start and uses initial stage lengths of 2000 and 1500, while all later stages are 300 days each, results in an estimated expected run time of 3489 days.

A later scheme appears to have dropped that under 3400 days, but I have not yet examined that scheme.

0

The halving strategy seems to work well - that is in a frrst phase of x1 days everyone toggles the lamp on first entry (lamp is always turned off on the last day of a phase). On the next phase i+1 everyone that toggled the lamp to a certain state in phase i, continue. This certain state depends on the number of players in phase i - if it was even (as initially - 100), then everyone who turned off the lamp continue for i+1, if it was odd - then everyone that turned it on - this is to ensure noone "slipping". This should continue until there is a phase with 2 players. Then the if one of the two enters and sees the lamp on, he knows that he is second, and in retrospect that in each previous phase all the active players managed to enter the room, including the first phase, so everyone has been at least once there.

The phases are with 100, 50, 25, 13, 7, 4, and 2 players, so if we try to solve the following minimization problem - minimize the sum of the phase lengths, with the constraint that the total probability of all phase players go into the rooms at least once on each phase is exactly (or at least - depending on what minimization algorithm is used) 0.5. Of course, a different target probability can be set.

There are two approaches: 1. solve a constrained problem where objective is total days, and constraint is total probability = 0.5 (or something else). 2. solve unconstrained problem where the objective function somehow includes total length and target probability (this allows greater variety of methods).

In all cases there is a general problem that the actual domain is integers and minimizing functions over integers is hard to do, so at least I defaulted to approximations. Additionally it seems that the constrained task is hard and the unconstrained task is nonconvex.

For Nelder-Mead and somewhat contrived target function I get different solutions (all very close to p=0.5) depending on starting point. The minimum, achieved from a sane starting point is around 3439 days. (See below for sane starting point).

SLSQP and COBYLA seem to behave more consistently once some numerical problems are resolved (both in objective function and constraint) and give 3455.

Regarding sane initial guess - it seems reasonable to guess that if target probability is 0.5. then we can pick initial phase lengths so that all phase probabilities are equal - to 0.5^(1/7). Experimentation confirmed that this may be a good initial guess.

  • 0
    P.S. Of course this is minimizing only the length for *this* particular scheme. I have no guess on a more generic approach.2016-03-15
  • 0
    I found this [paper](https://www.ocf.berkeley.edu/~wwu/papers/100prisonersLightBulb.pdf) that features an O(n*ln(n)^2) solution - that would be 2120 day..2016-03-15
  • 0
    $O$ notation doesn't work that way. It cannot give you a specific value for any fixed $n$. It only tells you how the sequence grows as $n$ increases. In this case, William Wu produced an upper bound formula for the expected time, and used that formula to justify $O(n(\ln n)^2)$. If you calculate the upper bound formula for $n = 100$, the value is $7585$ days. But in fact, if you read through the Wu Forums thread, you will see that computer testing of this scheme estimates the expected time to be around $4250$ or $4500$ days. Some other schemes given are $< 4000$ days.2016-05-15