4

Guys I am having trouble with the standard normal distribution.

http://www.regentsprep.org/Regents/math/algtrig/ATS2/NormalLesson.htm

We know the $x$ values run from $-\infty$ to $+\infty$, but what are the $y$ values? The normal distribution $\mathcal{N}(\mu, \sigma^2)$ takes two parameters, but what is the range of $y$?

$y>0$ obviously, and the "y" will depend on the mean and variance you picked, since $y=\frac{\exp(-z^2/2)}{\sqrt{2\pi\sigma^2}}$ with $z=\frac{x-\mu}{\sigma}$. But I have trouble understanding what it means. If I take the S&P 500 and difference the series (SPX-SPX(-1)), the histogram of the returns will be approximately normally distributed and will show the number of times I have had a return of -1%, -.5%, 0%, .5%, 1%, etc. throughout the history. So is the "y" of the normal distribution the number of times I have had that x as a value? In practical terms, should I think of the normal distribution as the number of times that a given point event has occurred? In some plots of normal distributions the y-axis ranges from 0 to 4; in others it ranges from 0 to 1, as a probability should. I know the area underneath the curve should equal 1, but shouldn't the y values always be less than 1?

https://statistics.laerd.com/statistical-guides/standard-score.php

Thanks guys!

2 Answers

3

You may be thinking of the cumulative distribution function, which takes on all values in the interval $(0,1)$. Or else you may be thinking of the (probability) density function $\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}},$ the familiar "bell-shaped" curve. This density function is positive, but not necessarily between $0$ and $1$. It reaches a maximum when $x=\mu$. The maximum value (what your post would call the maximum $y$-value) is $\dfrac{1}{\sqrt{2\pi}\sigma}$. The range of the density function is the interval $\left(0,\frac{1}{\sqrt{2\pi}\sigma}\right]$.

In particular, when $\sigma$ is small, the maximum value can be quite large: the density function has a sharp, high peak. Indeed, the maximum exceeds $1$ whenever $\sigma<\frac{1}{\sqrt{2\pi}}\approx 0.4$. If $\sigma$ is large, the density function, though still characteristically bell-shaped, is flat and low. The area under the density curve, and above the $x$-axis, is always $1$. So if the density function drops to near $0$ within a short distance of the mean (small variance), it is intuitively clear that the curve must reach quite high.
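
To put numbers on this, here is a minimal Python sketch (standard library only). The helper names `normal_pdf` and `area_under_pdf` are made up for illustration, and the area is only a midpoint-rule estimate over $\mu \pm 10\sigma$, not an exact integral. For $\sigma = 0.1$ the peak comes out near $4$, while the area stays at about $1$ for every $\sigma$.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def area_under_pdf(mu, sigma, half_width=10.0, steps=20_000):
    """Midpoint-rule estimate of the area over mu +/- half_width * sigma."""
    a = mu - half_width * sigma
    h = 2.0 * half_width * sigma / steps
    return h * sum(normal_pdf(a + (i + 0.5) * h, mu, sigma) for i in range(steps))


for sigma in (0.1, 1.0, 4.0):
    peak = normal_pdf(0.0, 0.0, sigma)   # the maximum sits at x = mu
    area = area_under_pdf(0.0, sigma)    # should stay close to 1
    print(f"sigma={sigma}: peak={peak:.4f}, area={area:.6f}")
```

The peak scales like $1/\sigma$, so halving $\sigma$ doubles the height of the curve while the total area is unchanged.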

Remark: Let $f(x)$ be our probability density function. Then for small $h$, the probability that our random variable lies between $x$ and $x+h$ is approximately $hf(x)$. In that sense, you can pick up a pretty good picture of $f(x)$ if you have a largish number of data points.
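
The remark can be tried out on simulated data. The sketch below is only an illustration: the samples are pseudo-random draws from $\mathcal{N}(0, 0.2^2)$, not real market returns, and `density_estimate` is a hypothetical helper. It counts the points falling in $[x, x+h)$, divides by the number of points to get a relative frequency, and then divides by the bin width $h$; that last step is what turns a histogram of counts into something comparable to $f(x)$.

```python
import math
import random


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def density_estimate(samples, x, h):
    """Fraction of samples in [x, x + h), divided by the bin width h."""
    count = sum(1 for s in samples if x <= s < x + h)
    return count / (len(samples) * h)


random.seed(0)  # fixed seed so the run is repeatable
sigma = 0.2     # assumed value, chosen so that the peak density exceeds 1
samples = [random.gauss(0.0, sigma) for _ in range(200_000)]

h = 0.02
for x in (-0.2, 0.0, 0.2):
    fraction = density_estimate(samples, x, h) * h   # relative frequency, below 1
    estimate = density_estimate(samples, x, h)       # frequency per unit length
    exact = normal_pdf(x + h / 2, 0.0, sigma)        # density at the bin midpoint
    print(f"bin [{x:+.2f}, {x + h:+.2f}): fraction={fraction:.4f}, "
          f"estimate={estimate:.3f}, pdf={exact:.3f}")
```

The fraction of points in each bin is always below $1$, but the estimate of $f(x)$ near the mean is close to $2$, because the fraction has been divided by a bin width much smaller than $1$.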

  • 0
    I have added a little to the post. Hope it helps answer your question. 2012-10-20
2

To add some commentary, the "bell curve" shape is governed by the PDF, as @AndreNicolas pointed out. However, the actual "y"-value of this curve is itself more or less meaningless. The integral of the PDF $f$ gives the probability that your random variable $x$ is less than some value $X$: $P(x < X) = \int_{-\infty}^X f(t)\,dt$. This is known as the CDF, or cumulative distribution function. By the fundamental theorem of calculus, the PDF is then the derivative of the CDF; that is, the PDF is the derivative of a function that returns a probability. So what is that intuitively? Honestly... it's not really anything. The "units" of the vertical axis in the PDF plot don't lead to anything intuitive; they are meaningful, but only in a derived, mathematical sense.
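
The derivative relationship can be checked numerically. This is a rough sketch for the standard normal only: the CDF is written with `math.erf`, and `central_difference` is a hypothetical helper that approximates the slope with a small step `eps`.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Probability that a N(mu, sigma^2) variable is less than x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def central_difference(func, x, eps=1e-5):
    """Numerical slope of func at x using a symmetric step."""
    return (func(x + eps) - func(x - eps)) / (2.0 * eps)


for x in (-1.0, 0.0, 2.0):
    slope = central_difference(normal_cdf, x)
    print(f"x={x:+.1f}: slope of CDF={slope:.6f}, pdf={normal_pdf(x):.6f}")
```

The two columns should agree to several decimal places: the height of the PDF at a point is the rate at which the CDF is accumulating probability there.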

Some people wish to think that $f(X)$ is the probability that $x = X$, but this is untrue for continuous distributions ($P(x = X) = 0$). However, for the PDF's discrete analog, the Probability Mass Function (PMF), this statement is quite true.
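
One way to see why $P(x = X) = 0$ is to shrink an interval around $X$ and watch its probability. The sketch below assumes a standard normal and an arbitrary point $X = 0$; the function names are just for illustration.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Probability that a N(mu, sigma^2) variable is less than x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def window_probability(center, width):
    """Probability of landing within width / 2 of center (standard normal)."""
    return normal_cdf(center + width / 2.0) - normal_cdf(center - width / 2.0)


X = 0.0
for width in (1.0, 0.1, 0.001, 1e-6):
    p = window_probability(X, width)
    print(f"width={width:g}: probability={p:.3e}, density at X={normal_pdf(X):.4f}")
```

The probability goes to zero along with the width, while $f(X)$ stays fixed at about $0.4$, so $f(X)$ cannot be the probability of hitting $X$ exactly.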

  • 0
    @gabriel The _area_ under the pdf equals $1$. _If_ the pdf value $f(x)$ exceeds $1$ for some and indeed many values of $x$, that is perfectly fine: but $f(x)$ cannot exceed $1$ for **all** $x$ in an _interval_ $I$ of length exceeding $1$. If the latter condition were to hold, then $\int_I f(x)\,\mathrm dx=\text{area under pdf in interval}~ I>1$, in violation of the constraint that the _total_ area is $1$. The value of $f(x)$ is _not_ a probability. The _units_ of $f(x)$ are _probability per unit length_ and you _must_ multiply by length (more generally, find an area) to get a probability. 2012-10-20