2
$\begingroup$

I would like to construct a length $N$ string over a $k$-letter alphabet, $S$, such that any substring of $P$ sequential characters in $S$ is unique for as small a value of $P$ as possible. To clarify, "unique" means that the fixed-length substring or its transpose occurs at most once in $S$.

For fixed $N$ and $k$, what is the smallest possible value of $P$, and how do I construct $S$ in such a manner as to minimize $P$?

  • 0
    I can't parse this question. I don't know what it means for a substring to be unique - I suppose you mean you want to minimize the value of $P$ such that no substring of length $P$ appears twice? I don't know what "between any substring" means. Usually, "between" takes *two* objects - between such-and-such *and* somehting else. Also, you're trying to minimize one thing while at the same time maximizing something else, which seems like asking the impossible. Maybe an example or three would clarify things.2011-09-06
  • 0
    @Gerry, "Also, you're trying to minimize one thing while at the same time maximizing something else, which seems like asking the impossible." - fair criticism, I've removed the optimization component of the problem. As for "between any substring" I mean any pair of fixed-length arbitrarily selected substrings in 'S'. Hopefully I clarified in the above problem what I meant by "unique".2011-09-06
  • 0
    This might be relevant: http://en.wikipedia.org/wiki/De_Bruijn_sequence2011-09-06
  • 0
    What is the "transpose" of a string?2011-09-07
  • 0
    Perhaps http://math.stackexchange.com/questions/15510/what-is-the-shortest-string-that-contains-all-permutations-of-an-alphabet is of interest.2011-09-07

1 Answers 1

1

If P is odd, there are $\frac{k^P}{2}+\frac{k^\left({\frac{P+1}{2}}\right)}{2}$ distinct substrings (avoiding transpositions). If P is even, there are $\frac{k^P}{2}+\frac{k^\left({\frac{P}{2}}\right)}{2}$. $N$ must be at most this plus $P-1$. Constructing such a string shouldn't be hard, but I don't have an algorithm. Maybe this naive algorithm will work-append a character that doesn't produce a duplicate substring and within those, the one used least so far.

  • 0
    thanks for your answer! My main interest here, and the difficulty I'm having, is in finding an algorithm that is guaranteed to construct optimal 'S'...2011-09-07