
I want to know whether it is possible to scale a set of numbers without knowing the upper limit.

Say, for example, I have 1000 numeric values and want to plot each of them within a range of 0 to 90. Is this possible without finding the biggest value in the set of 1000?

  • 0
    If it's a normal distribution, you could always sample it and estimate the range of values from that sample. 2011-07-22
  • 0
    Do you mean pick some random values from the set and use the maximum of those? 2011-07-22
  • 0
    For instance, or you could simply calculate the sample standard deviation and use that as a basis for setting your scale. 2011-07-22
  • 0
    Could you put an example of calculating the sample standard deviation in an answer, please, so I can accept it? 2011-07-22
  • 0
    Do you want to linearly scale the answer, or are you happy with a nonlinear scaling? It seems to me that computing the sample standard deviation as a basis for setting the range is *more* work than just calculating the minimum and maximum, and using those for your scale! 2011-07-22

1 Answer


Here's a suggestion:

Take a sample of $n$ values. The larger $n$ is, the better the estimate of the true variance will be; but if you can afford a very large $n$, you might as well compute the minimum and maximum of the whole set directly.

Compute the sample variance:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})^2 \; ,$$

where $$\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i$$ is the sample mean.
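
As a small worked illustration (the four sample values here are made up for the example, not taken from the question), take $n = 4$ and the sample $2, 4, 4, 6$:

$$\bar{x} = \frac{2 + 4 + 4 + 6}{4} = 4, \qquad s^2 = \frac{(2-4)^2 + (4-4)^2 + (4-4)^2 + (6-4)^2}{4 - 1} = \frac{8}{3} \approx 2.67 \; .$$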

The square root of $s^2$ is the sample standard deviation $s$, a good estimate of the population standard deviation (i.e. the standard deviation of all your data). Now, if your data are normally distributed, over 99% of your data will lie within three standard deviations of the mean, which gives you a good range, $[\bar{x} - 3s, \bar{x} + 3s]$, that you can then scale to the interval you wish. If your data are not normally distributed, you might want to take 5 standard deviations or more: for any distribution with finite variance, Chebyshev's inequality guarantees that at least 96% of the values lie within 5 standard deviations of the mean.
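
The procedure could be sketched in Python roughly as follows. This is only an illustration under some assumptions: the function names `estimate_range` and `scale`, the default sample size `n=100`, and the choice to clamp values that fall outside the estimated range are all hypothetical and not part of the question. The target interval 0 to 90 is the one from the question.

```python
import random
import statistics


def estimate_range(values, n=100, k=3.0, seed=None):
    """Estimate (low, high) as mean +/- k sample standard deviations,
    using a random sample of at most n values (hypothetical helper)."""
    rng = random.Random(seed)
    sample = rng.sample(values, min(n, len(values)))
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return mean - k * s, mean + k * s


def scale(x, low, high, out_min=0.0, out_max=90.0):
    """Linearly map x from [low, high] to [out_min, out_max].
    Values outside the estimated range are clamped to the ends (an assumption)."""
    if high == low:
        return (out_min + out_max) / 2
    t = (x - low) / (high - low)
    t = max(0.0, min(1.0, t))
    return out_min + t * (out_max - out_min)


# Usage: 1000 synthetic, roughly normal values scaled into 0..90.
data = [random.gauss(50, 10) for _ in range(1000)]
low, high = estimate_range(data, n=100, k=3.0, seed=0)
scaled = [scale(x, low, high) for x in data]
```

Because the range is only an estimate, a few values may land outside it; the clamping step pins them to 0 or 90 instead of letting them fall off the plot. Whether that is acceptable depends on how much the outliers matter to you.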