Disclaimer: I do not have any education on strings/matrices, so I am having trouble finding an answer for this question that does not rely on that precondition. If that is the only way to answer the question, any assistance in explaining the notation, etc. would be appreciated.

Problem: RNA has four nucleotides. An ordered triple of nucleotides is called a codon, so there are $4^3 = 64$ possible codons. The codon table pairs each of these 64 codons with one of 21 values: 20 amino acids and 1 STOP. The only requirement for a "viable" codon table is that each of the 21 values is assigned to at least one codon; beyond that, any level of repetition is allowed. There are $21^{64}$ arrangements in total, but how do I count the nonviable arrangements, those that omit at least one of the 21 values?

1 Answer

Start by including all $21^{64}$ genetic codes (codon tables). Then inclusion/exclusion starts:

  • Exclude the codes that use at most 20 of the meanings. There are $\binom{21}{20}$ ways to choose those 20 meanings, then $20^{64}$ ways to assign the chosen meanings to the codons: $-\binom{21}{20}\times20^{64}$.
  • Include the codes that use at most 19 meanings, since the previous step subtracted each of them more than once: $+\binom{21}{19}\times19^{64}$.
  • Exclude the codes that use at most 18 meanings: $-\binom{21}{18}\times18^{64}$, etc.

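This continues until we reach the codes with only one meaning. The number of viable codes therefore works out to $$N=\sum_{j=1}^{21}(-1)^{21-j}\binom{21}{j}j^{64}\approx1.51\times10^{84}$$ and the nonviable arrangements asked about number $21^{64}-N\approx2.68\times10^{84}$.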
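
Python integers are exact at any size, so the sum above can be evaluated without rounding. The following is a minimal sketch; the names `K`, `M`, `terms`, `viable` and `nonviable` are chosen here for illustration and do not come from the answer.

```python
from math import comb

K = 21  # meanings: 20 amino acids plus STOP
M = 64  # codons

# One signed term per j, where j is the number of meanings allowed:
# (-1)^(K-j) * C(K, j) * j^M, for j = 1 .. K.
terms = [(-1) ** (K - j) * comb(K, j) * j ** M for j in range(1, K + 1)]

total = K ** M               # every assignment of a meaning to each codon
viable = sum(terms)          # assignments that use all K meanings
nonviable = total - viable   # assignments that miss at least one meaning

print(f"total     = {total:.3e}")
print(f"viable    = {viable:.3e}")
print(f"nonviable = {nonviable:.3e}")
```

The last entry of `terms` is the full $21^{64}$ and the one before it is $-21\times20^{64}$, matching the first two steps of the list above.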

In a generalised setting, with $k$ meanings and $n$ codons, the number of viable codes is $$N=k!\left\{n\atop k\right\}=\sum_{j=1}^k(-1)^{k-j}\binom{k}{j}j^n$$ where $\left\{n\atop k\right\}$ is the Stirling number of the second kind, the number of ways to partition $n$ distinct objects into $k$ non-empty subsets.
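
The general identity can be sanity-checked on small cases. The sketch below compares the inclusion/exclusion sum against $k!$ times a Stirling number computed from the standard recurrence $S(n,k)=k\,S(n-1,k)+S(n-1,k-1)$, and against a brute-force enumeration. The function names `viable_codes`, `stirling2` and `brute_force` are hypothetical helpers introduced for this illustration.

```python
from itertools import product
from math import comb, factorial


def viable_codes(n: int, k: int) -> int:
    """Count assignments of n codons onto all k meanings (inclusion/exclusion)."""
    return sum((-1) ** (k - j) * comb(k, j) * j ** n for j in range(1, k + 1))


def stirling2(n: int, k: int) -> int:
    """Stirling number of the second kind via S(n,k) = k*S(n-1,k) + S(n-1,k-1)."""
    row = [1] + [0] * k  # row[j] holds S(0, j); only S(0, 0) is 1
    for _ in range(n):
        row = [0] + [j * row[j] + row[j - 1] for j in range(1, k + 1)]
    return row[k]


def brute_force(n: int, k: int) -> int:
    """Enumerate all k^n codes and keep those that use every meaning."""
    return sum(1 for code in product(range(k), repeat=n) if len(set(code)) == k)


# Compare the three counts on cases small enough to enumerate.
for n in range(1, 7):
    for k in range(1, 5):
        assert viable_codes(n, k) == factorial(k) * stirling2(n, k) == brute_force(n, k)

print(viable_codes(4, 3))  # 3 meanings, 4 codons
```

Brute force is only feasible for tiny inputs, since it visits all $k^n$ codes; for the real codon table only the closed-form sum or the recurrence is practical.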

  • 0
    I really appreciate your response, but I have a lot of questions. Firstly, why would we exclude the codes with only twenty meanings, but include the codes with 19 meanings? 2017-02-08
  • 0
    @tomkat364 By naïvely subtracting the codes with 20 meanings we subtract too much, so we must add back the codes with 19 meanings. But then we have too much, so we must subtract the codes with 18. And so on. This is the principle of inclusion/exclusion. 2017-02-08
  • 0
    Sorry, as I said I'm very new to this. So your notation (21 over 20) means we are looking at a set of 20 positions with 21 values to place in those positions. This effectively excludes one value from each set, and actually equals 21 sets. Having excluded one, we really only have 20 values to choose from, and so we now have 20^64 arrangements that are missing one of those values. 2017-02-08
  • 0
    This continues for excluding 2 values (19 values to choose), excluding 3 values (18 values to choose), etc. Each of these is considered a separate set, so from the entire 21^64 set, we begin by excluding all the 20s, but this is too few, so we include the 19s, but that's too many, so we exclude 18s, and so on and so forth? 2017-02-08
  • 0
    @tomkat364 Absolutely correct. 2017-02-08
  • 0
    Okay, I follow so far. But when I actually do that calculation (longhand because I have no experience with summations) I get 4 x 10^84. By longhand I mean 21^64 - 20^64 + 19^64 ... - 2^64 + 1^64 = 4.010 e 84. I'm including the odds and excluding the evens, but my answer is slightly off. Any reason the longhand is not accurate? 2017-02-08
  • 0
    @tomkat364 You need to multiply the relevant binomial coefficients with the powers, as I explained above. 2017-02-08
  • 0
    (21 over 19) being 210, and (21 over 18) being 1330... So longhand should really be 21^64 - 21(20^64) + 210(19^64) - 1330(18^64) ... Now I see why you use summations :) Thanks a lot. 2017-02-08
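
The over- and under-counting discussed in the comments can be followed by hand on a toy case, assumed here purely for illustration: 3 meanings A, B, C and 4 codons. There are $3^4=81$ codes in total. Each of the $\binom{3}{2}=3$ pairs of meanings allows $2^4=16$ codes, and each of the $\binom{3}{1}=3$ single meanings allows $1^4=1$ code, so $$N=3^4-\binom{3}{2}\times2^4+\binom{3}{1}\times1^4=81-48+3=36$$ The code AAAA shows why the last term is added back: it is counted once in the 81, subtracted twice (once for the pair A, B and once for the pair A, C), and so must be added once to end up counted zero times. The result agrees with $3!\left\{4\atop 3\right\}=6\times6=36$.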