I have been going in circles with this problem. I may be overcomplicating it; any advice?

I have the mean and the total sample size (the number of data points), and I need to find the standard deviation (SD).

I know I can back-calculate the sum of the individual scores from the formula for the mean, i.e.:

$M = \frac{\Sigma X}{N}$

where $X$ = individual data points and $N$ = number of data points.

However, after this step I am stuck. To find the SD via the variance I need to know the individual data points, which I don't have.

I then end up with two "unknown" variables, $S^2$ and $X$, in this formula:

$S^2 = \frac{\Sigma(X-M)^2}{N - 1}$

Thanks!


Thank you, André and Jonathan. I now have some extra information: I am given $N$ and the mean (maximum), e.g. $N = 596$, mean (maximum): $5.86$ ($39.1$). Any further advice?

  • 0
    You are stuck. If the data points are all equal (they might be), your sample variance would be $0$. If they wiggle all over the place, the sample variance would be high. (2012-12-01)
  • 0
    What do you mean by "mean(maximum)"? (2012-12-01)

2 Answers

1

If all you know is the mean and sample size, then no. The standard deviation could be 6 or $1.5\times 10^{10^{10}}$.
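To make this concrete, here is a minimal sketch in Python that builds two data sets with the same sample size and the same mean but very different standard deviations. The helper name `same_mean_different_sd` and the `spread` values are illustrative assumptions, not anything from the question.

```python
import statistics

def same_mean_different_sd(n, mean, spread):
    """Build n points with the given mean; `spread` controls how far they sit from it."""
    # Half the points sit at mean - spread and half at mean + spread,
    # so the deviations cancel and the mean is unchanged.
    half = n // 2
    data = [mean - spread] * half + [mean + spread] * half
    if n % 2:
        data.append(mean)  # an odd point goes exactly at the mean
    return data

tight = same_mean_different_sd(596, 5.86, 0.1)
wide = same_mean_different_sd(596, 5.86, 1000.0)

for label, data in (("tight", tight), ("wide", wide)):
    # statistics.stdev uses the N - 1 denominator, matching the formula in the question
    print(label, len(data), round(statistics.fmean(data), 6), statistics.stdev(data))
```

Both data sets report $N = 596$ and a mean of $5.86$, yet their standard deviations differ by a factor of about ten thousand, so the mean and sample size alone cannot pin the SD down.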

  • 0
    Thank you, André and Jonathan! I now have some extra information: I am given $N$ and the mean (maximum), e.g. $N = 596$, mean (maximum): $5.86$ ($39.1$). Any further advice? (2012-12-01)
  • 0
    If you knew what distribution the data came from then you could estimate the variance using the distribution of the maximum order statistic, probably, but it won't be a very good estimator. If you don't know what distribution it comes from then you're still in a pretty bad place. What are you trying to do here, anyway? (2012-12-01)
  • 0
    Thanks, Jonathan. I am back-calculating from reported data to see whether I can pool a few studies together once I have the same data for each. (2012-12-02)
  • 0
    Ah, I see. I don't think you'll be able to get any estimate of the standard deviation reasonable enough to pool with, unfortunately. Maximum order statistics tend to have fairly dispersed distributions. (2012-12-03)
1

If you know:

  • the number of observations is $596$
  • the sample mean of the observations is $5.86$
  • the sample maximum of the observations is $39.1$

then the variance and standard deviation will be minimised by having one observation of $39.1$ and $595$ observations of about $5.804134$. With your sample variance formula, this would give a sample variance of about $1.86$ and a sample standard deviation of about $1.36$.
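The figures for this least-spread configuration can be checked with a short Python sketch. The variable names and the `sample_variance` helper are illustrative; the helper simply applies the $N - 1$ formula from the question.

```python
import math

n, mean, maximum = 596, 5.86, 39.1

# One observation sits at the maximum; the other n - 1 share the remaining total equally.
rest = (n * mean - maximum) / (n - 1)
data_min = [maximum] + [rest] * (n - 1)

def sample_variance(xs):
    """Sample variance with the N - 1 denominator."""
    m = math.fsum(xs) / len(xs)
    return math.fsum((x - m) ** 2 for x in xs) / (len(xs) - 1)

var_min = sample_variance(data_min)
sd_min = math.sqrt(var_min)
print(rest, var_min, sd_min)
```

Running this should reproduce the common value of about $5.804134$, a sample variance of about $1.86$, and a standard deviation of about $1.36$.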

If additionally you knew:

  • all the observations are non-negative

then the variance and standard deviation will be maximised by having $89$ observations of $39.1$, $506$ observations of $0$, and one observation of $12.66$. With your sample variance formula, this would give a sample variance of about $194.55$ and a sample standard deviation of about $13.95$.
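The non-negative case can be checked the same way. The sketch below pushes as many observations as possible to the two extremes, $0$ and $39.1$, with one leftover value in between to keep the mean fixed; this is the arrangement with the largest spread under these constraints. The variable names and the `sample_variance` helper are illustrative assumptions.

```python
import math

n, mean, maximum = 596, 5.86, 39.1
total = n * mean  # sum of all observations, recovered from the mean

# Put as many observations as possible at the maximum without exceeding the total.
k = int(total // maximum)
leftover = total - k * maximum  # one in-between value that keeps the mean fixed
data_max = [maximum] * k + [leftover] + [0.0] * (n - k - 1)

def sample_variance(xs):
    """Sample variance with the N - 1 denominator."""
    m = math.fsum(xs) / len(xs)
    return math.fsum((x - m) ** 2 for x in xs) / (len(xs) - 1)

var_max = sample_variance(data_max)
sd_max = math.sqrt(var_max)
print(k, leftover, var_max, sd_max)
```

This should give $89$ observations at the maximum, a leftover value of about $12.66$, a sample variance of about $194.55$, and a standard deviation of about $13.95$.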

Without more information, the sample variance and standard deviation could be anywhere between these bounds. With less information (e.g. if the observations could be negative), they could be higher still.