
I have a uniformly collected sample of 10000 data points from a population of about 200000. I'd like to find out what the distribution of the population is. How can I do this rigorously?

  • 0
    The question as posed is not clear. What do you mean by "distribution of population"? 2011-04-07
  • 0
    Ah, sorry, stats isn't my first language, so please excuse me. The elements of the population may be, say, normally distributed, or uniformly distributed, etc. I would like to ascertain whether there is a well-known distribution that models the distribution of the points in my population. Is there a more succinct way to articulate this in statsy lingo? 2011-04-07
  • 0
    Are these "elements" numbers on the real number line? 2011-04-07
  • 0
    Yes, they are natural numbers. 2011-04-07
  • 0
    Typically the analysis is done the other way around, where you know the distribution. This is an interesting question! 2011-04-07
  • 0
    Can you sample only once, or can you draw many independent samples from this data set? 2011-04-07
  • 0
    I'm at liberty to make as many samples of size < 10000 as I like. Although I haven't proven it, I suspect that my samples are independent. 2011-04-07
  • 0
    Yes, I do not know if you can prove it, but let us make that assumption. 2011-04-07
  • 0
    @Undercover You will get helpful answers by migrating this question to the [stats site](http://stats.stackexchange.com/questions). 2011-04-07

1 Answer

There are several tests you can use to check the hypothesis that the sample was drawn from a specified distribution. The simplest approach is to bin the data and then compare the observed frequency counts with the counts expected under a candidate distribution (normal, uniform, Poisson, etc.) using a chi-square goodness-of-fit test. Alternatively, you can use the more powerful Kolmogorov-Smirnov test, which does not require binning.

http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test
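
To make the chi-square approach concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the helper names `chi2_sf` and `chi_square_gof`, the simulated samples, and the candidate distribution (discrete uniform on 0..9) are all made up, not taken from the question's data. Since the data here are natural numbers, each distinct value serves as its own bin. Note that the classical Kolmogorov-Smirnov test assumes a continuous, fully specified distribution, so for integer-valued data the chi-square route is probably the easier one to justify.

```python
import math
import random
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) of a chi-square variable with df degrees of freedom.

    Evaluates the power series of the regularized lower incomplete gamma
    function in log space, then returns one minus that value.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    log_term = -math.log(a)  # n = 0 term of the series: 1 / a
    log_sum = log_term
    n = 1
    while log_term > log_sum - 40.0:  # stop once new terms are negligible
        log_term += math.log(z / (a + n))
        log_sum += math.log1p(math.exp(log_term - log_sum))
        n += 1
    log_lower = log_sum + a * math.log(z) - z - math.lgamma(a)
    return max(0.0, 1.0 - math.exp(min(0.0, log_lower)))


def chi_square_gof(sample, pmf, support, n_estimated_params=0):
    """Pearson chi-square goodness-of-fit test against a discrete distribution.

    pmf(k) is the hypothesized probability of value k, and `support` lists
    the values used as bins. Returns (statistic, degrees of freedom, p-value).
    A common rule of thumb is that every expected count should be at least 5.
    """
    counts = Counter(sample)
    outside = set(counts) - set(support)
    if outside:
        raise ValueError(f"values outside the hypothesized support: {sorted(outside)}")
    n = len(sample)
    stat = 0.0
    for k in support:
        expected = n * pmf(k)
        stat += (counts[k] - expected) ** 2 / expected
    # One degree of freedom is lost because the counts must sum to n,
    # plus one for each parameter estimated from the same data.
    df = len(support) - 1 - n_estimated_params
    return stat, df, chi2_sf(stat, df)


# Hypothetical data: 10000 draws each, simulated rather than real.
rng = random.Random(0)
support = range(10)
uniform_sample = [rng.randint(0, 9) for _ in range(10000)]
skewed_sample = rng.choices(support, weights=range(10, 0, -1), k=10000)


def uniform_pmf(k):
    # Candidate distribution: every value 0..9 equally likely.
    return 1 / 10


stat_unif, df_unif, p_unif = chi_square_gof(uniform_sample, uniform_pmf, support)
stat_skew, df_skew, p_skew = chi_square_gof(skewed_sample, uniform_pmf, support)
print(f"uniform sample: chi2 = {stat_unif:.2f}, df = {df_unif}, p = {p_unif:.4f}")
print(f"skewed sample:  chi2 = {stat_skew:.2f}, df = {df_skew}, p = {p_skew:.4g}")
```

A small p-value is evidence against the candidate distribution; a large one only means the data are compatible with it, not that the distribution has been proven. If SciPy is available, `scipy.stats.chisquare` and `scipy.stats.kstest` provide library versions of both tests.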