Let's say I have a string such as "AAAABBBBAAAABAAAA" and I want some quantitative measure of the most likely run length, where a run is a set of consecutive identical characters. In the above example (4 runs of length 4, one run of length 1), it is "humanly" evident that the next run will be 4 in length, but simply taking the average yields a length of 3.4.
What would be the best method to get an answer closer to 4?
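
To make the numbers concrete, here is a minimal Python sketch that splits the example string into blocks of consecutive identical characters and compares the plain average with the median and the mode. The helper name `run_lengths` is hypothetical. The median and mode are only my guesses at candidate measures that a single short block would not pull down, not a claim that either is the best method.

```python
from itertools import groupby
from statistics import mean, median, mode

def run_lengths(s):
    """Return the length of each block of consecutive identical characters."""
    # groupby yields one group per maximal stretch of equal characters
    return [len(list(group)) for _, group in groupby(s)]

lengths = run_lengths("AAAABBBBAAAABAAAA")
print(lengths)          # [4, 4, 4, 1, 4]
print(mean(lengths))    # 3.4, pulled down by the single block of length 1
print(median(lengths))  # 4
print(mode(lengths))    # 4, the most frequent block length
```

On this example both the median and the mode give 4, while the mean gives 3.4, which is the behaviour I am after; whether they stay this well-behaved on noisier strings is part of what I am asking.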