I collected data from an experiment. This data is basically a set of sequences. For instance:

seq 1: 1,3,4,4,5,4,5,4,...,5
seq 2: 3,4,1,5,7,8,9,4,...,7
seq 3: 2,3,2,3,2,3,2,2,...,3
...
seq N: 1,2,3,4,5,6,10,20,...,50

I would like to know if a given set is nearly uniformly distributed.

A naive way would be to look at the difference between the maximum and minimum values. If the difference is relatively small, this approach is okay (e.g., as in seq 3). In the opposite case, however, this method fails, since the numbers can grow linearly, logarithmically, or exponentially.

  • Google "qq plot" (2017-01-04)
  • @N74 thanks for the suggestion, but I need to quantify how normally distributed the sequence is (not visualize it). (2017-01-04)
  • So make a linear regression on the qq plot and use the $\rho^2$ as the quality factor. (2017-01-04)
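
A minimal sketch of the QQ-plot regression idea from the comments, assuming plotting positions $(i+0.5)/n$ and using the squared correlation between the sorted sample and the theoretical quantiles as the quality factor. The function name `qq_r_squared` and the sample values are hypothetical, not taken from the experiment.

```python
def qq_r_squared(sample, quantile=lambda p: p):
    """Squared correlation between the sorted sample and theoretical quantiles.

    `quantile` maps a probability in (0, 1) to a theoretical quantile.
    The default is the quantile function of Uniform(0, 1); since the squared
    correlation is unchanged by shifting and scaling, it covers any uniform range.
    """
    xs = sorted(sample)
    n = len(xs)
    # Plotting positions (i + 0.5) / n stay strictly inside (0, 1).
    qs = [quantile((i + 0.5) / n) for i in range(n)]
    mean_x = sum(xs) / n
    mean_q = sum(qs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sqq = sum((q - mean_q) ** 2 for q in qs)
    sxq = sum((x - mean_x) * (q - mean_q) for x, q in zip(xs, qs))
    if sxx == 0 or sqq == 0:
        raise ValueError("sample or quantiles are constant; correlation is undefined")
    return sxq ** 2 / (sxx * sqq)


# Hypothetical data: evenly spread values versus exponentially growing values.
r2_even = qq_r_squared([1, 2, 3, 4, 5])
r2_exp = qq_r_squared([1, 2, 4, 8, 16, 32])
print(r2_even, r2_exp)
```

Values close to 1 mean the QQ plot is close to a straight line. To check against a normal distribution instead, one could pass `statistics.NormalDist().inv_cdf` as `quantile`.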

1 Answer

I'm not sure about your suggestion: the difference between the largest and smallest outcomes (the last and first order statistics) gives you an idea of the range (support) of the data, but not of how the data are distributed over this range.

There are a number of statistical testing procedures that can be used to test whether a given sample $X_1,\ldots,X_n$ was drawn from a distribution with distribution function $F$, see for example:

https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test

https://en.wikipedia.org/wiki/Anderson%E2%80%93Darling_test

https://en.wikipedia.org/wiki/Cram%C3%A9r%E2%80%93von_Mises_criterion

They have in common that they all compare the hypothesized distribution function $F$ (in your case, I guess, that of a uniform distribution on some subset of the natural numbers) to the empirical distribution function $F_n$, which you obtain from your sample(s).
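
To make the comparison of $F_n$ and $F$ concrete, here is a minimal sketch that computes the Kolmogorov-Smirnov-type distance $\sup_x |F_n(x) - F(x)|$ for a discrete uniform $F$ on the integers `lo..hi`. The function name, the choice of support, and the sample values are assumptions for illustration. Note that the standard Kolmogorov-Smirnov critical values assume a continuous $F$, so for integer data this number is better read as a descriptive measure than as an exact test statistic.

```python
from collections import Counter


def ks_distance_discrete_uniform(sample, lo, hi):
    """Largest gap between the empirical CDF F_n of `sample` and the CDF F
    of a discrete uniform distribution on the integers lo..hi.

    Assumes the sample consists of integers.
    """
    n = len(sample)
    k = hi - lo + 1  # number of support points
    counts = Counter(sample)
    # Observations below the support: F is 0 there, F_n may already be > 0.
    cum = sum(c for v, c in counts.items() if v < lo)
    d = cum / n
    # Both CDFs are step functions that jump only at integers,
    # so it is enough to compare them at each support point.
    for i, v in enumerate(range(lo, hi + 1), start=1):
        cum += counts.get(v, 0)
        d = max(d, abs(cum / n - i / k))
    return d


# Hypothetical data: one evenly spread sample and one concentrated sample.
print(ks_distance_discrete_uniform([1, 2, 3, 4], 1, 4))
print(ks_distance_discrete_uniform([1, 1, 1, 1], 1, 4))
```

A distance near 0 means the sample is spread over `lo..hi` much as a uniform distribution would be; a distance near 1 means it is concentrated on a small part of the range.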