2
$\begingroup$

I am studying the article at the following link,

http://www-stat.stanford.edu/~cgates/PERSI/papers/MCMCRev.pdf

Which applies Monte Carlo analysis to a decryption problem. The math is admittedly over my head at this point but I believe I understand the basics.

The goal is to compute a 'plausibility' (scalar) value, I think, that represents how close a decoded message is to making sense in English. The benchmark is defined by a 2-dimensional matrix that represents the frequency that characters occur together in a string of english language.

Is there a method using the dot-product of two matrices that produces a measurement of similarity between those two matrices on a scale of 0 - 1?

(The relevant function is on the first page of the pdf I linked).

Thanks.

1 Answers 1

1

So in the equation there, $M$ is a 26x26 (say) matrix. For letters $\ell_1$ and $\ell_2$, $M(\ell_1, \ell_2)$ is the conditional probability that the next letter is going to be $\ell_2$ given that the current letter is $\ell_1$. For instance:

We take a book, and count the number of times the letter 'e' occurs in that book -- say 15,000. Then we count the number of times the next letter is 'f' -- say, 200. With this in hand, we put

M(\text{'e'}, \text{'f'}) = 200/15000 = 1.33%.

Now the plausibility score,

$\text{Pl}(f) = \Pi_i M(f(s_i), f(s_{i+1}))$,

is just the associated probability of picking $f(s)$ among all possible strings of text of that length, based on our model that says that the only thing that influences what letter appears at a particular position is the letter before it. So it's nothing as complicated as you were imagining.

  • 0
    @user3296: I see. Thanks for the explanation. Since he accepted your answer, I presume you understood the problem better than I did.2011-09-14