1
$\begingroup$

say, i have got 3x'a', 5x'b', 2x'c',4x'd', as char collection.

2 strings are formed, each consists of all the chars given. eg. string A = 'aaabbbbbccdddd' B='abcdabcdabbbdd'

so both strings have length 14

now, we can randomly choose 3 consecutive chars from each string. say, I choose X ='aaa' from A, and Y = 'abc' from B.

define a function good(), which compares the two selected string segments. original flag = false. if the first char from X = the second char from Y then flag = true. if the second char from X = the third char from Y then flag = true. if the second char from X = the second char from Y then flag = true (yes it is not symmetric). good() returns 'good' only if the flag remains false after all the checks.

I know i can write a computer program to figure out the 'good' probability given A and B. But now I would like to find out the max and min of the good probability of different A and B from the char collection. Iterate all possible permutation would be physically infeasible. so I seek an approximate approach.

Any thoughts?

2 Answers 2

1

All three comparisons involve one of the two second characters. Only segments that differ at the second character can be good, and we want to minimize/maximize the chance that if they do, the second character also differs from the neighbouring characters on the other segment. That means we should try to minimize/maximize the number of repetitions. Also, the last character in A and the first character in B never get matched; you should put rare/frequent characters there to minimize/maximize the chances of the remaining ones differing from others. I suspect that any strings that obey these two constraints (minimal/maximal number of repetitions, most rare/most frequent characters in the two special slots) will have the same minimal/maximal chance of producing good pairs, or at least to a good approximation.

0

I think another way to formulate your problem is to number the letters within strings $A$ and $B$ from $1$ to $14$, and let $A_m$ and $B_n$ denote letters number $m$ and $n$ from the two strings. Define the function $g$ over $A,$ $B,$ and $n$ by $ g(A,B,n) = \begin{cases} 1 & A_n \neq B_{n+1} \land A_{n+1} \neq B_{n+2} \land A_{n+1} \neq B_{n+1}, \\ 0 & \text{otherwise}. \end{cases} $ Then you wish to choose fixed strings $A$ and $B$ in order to minimize (or maximize) $P(g(A,B,n)=1),$ where $n$ is a random integer such that $1\leq n\leq 12.$

Note that if we just set $B_n=A_{n-1}$ for $n=2,3,\ldots,13,$ you guarantee that $g(A,B,n)=0.$ For example, let $A=aaabbbbbccdddd$ and $B=daaabbbbbccddd.$ Then clearly $P(g(A,B,n)=1)=0,$ which is the minimum possible probability.

On the other hand, suppose you let $A=aaabbbbbccdddd$ and $B=bbccddddaaabbb.$ Then since $A_n \neq B_{n+1}$ whenever $1\leq n\leq 13$ and $A_n \neq B_n$ whenever $2\leq n\leq 13,$ You have guaranteed that $g(A,B,n)=1,$ that is, $P(g(A,B,n)=1)=1,$ which is the maximum possible probability.