2

The figure shows the normal Q-Q plot of my standardized empirical data against the theoretical standard normal distribution, generated with the `qqnorm()` function in R.

[Image: normal Q-Q plot of the sample, with a red reference line]

How can I describe the right tail (top right), which does not follow the red reference line? What does it mean when the points show a "trend" that runs away from the line?

Thank you

  • 1
    While I think the question is OK for math.SE, it would be even better on stats.SE. 2011-08-26
  • 0
    OK, thank you, I had that doubt too! If I'm not lucky here, I will post it on stats.SE :) 2011-08-26
  • 0
    Could this be a software bug? 2011-08-26
  • 0
    What R commands did you use to get the "theoretical quantiles"? 2011-08-26
  • 0
    I see: that's built into the qqnorm command. But when I tried this, I got respectable-looking results. 2011-08-26
  • 0
    Oh, now I see: your data are _not_ coming from rnorm or the like; they're actual data. So there's no reason to suspect a software bug. 2011-08-26
  • 0
    @Michael Hardy, yes, my data are not coming from rnorm :) Sorry, I forgot to specify that! 2011-08-27

2 Answers

4

It means that in the right tail your data do not fit a normal distribution well; specifically, there are far fewer values there than there would be in a normal sample. If the black curve bent upward instead, there would be more values than in a typical normal sample.

You can think of the black curve as the graph of a function that, if applied to a normal sample, would produce a sample distributed like your data.

In the following image, a random sample is generated by applying Ilmari Karonen's function to a normal sample.

[Image: screenshot of a Mathematica session]
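The same effect can be reproduced numerically. The sketch below is only an illustration using the Python standard library, not the code behind the screenshot. It draws a normal sample, applies a hypothetical hard cap at 1.5 (an arbitrary value chosen to make the effect visible), and pairs the sorted values with theoretical quantiles. The plotting positions $(i - 0.5)/n$ and the helper name `qq_points` are assumptions made for this example.

```python
import random
from statistics import NormalDist

def qq_points(sample):
    """Pair each sorted sample value with a theoretical standard normal quantile."""
    n = len(sample)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are an assumed convention.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(sample)))

rng = random.Random(0)  # fixed seed so the run is repeatable
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]

CAP = 1.5  # hypothetical cutoff, chosen only for illustration
capped_sample = [min(x, CAP) for x in normal_sample]

# In the right tail the sample quantiles fall below the reference line y = x.
for theo, samp in qq_points(capped_sample)[-5:]:
    print(f"theoretical {theo:5.2f}   sample {samp:5.2f}")

# Count values beyond the cap: the capped data has none, while a normal
# sample of this size is expected to have about 2000 * P(Z > CAP) of them.
expected = 2000 * (1 - NormalDist().cdf(CAP))
observed = sum(x > CAP for x in capped_sample)
print(f"expected above {CAP}: {expected:.0f}, observed: {observed}")
```

The last few printed pairs have a theoretical quantile well above the sample quantile, which is exactly the downward bend in the top right of a Q-Q plot.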

  • 0
    Thank you! Please correct me if I say something wrong. In my case, the right tail of my distribution (sample quantiles) has fewer values than the theoretical one (theoretical quantiles), because the black points curve down. I don't want to confuse the axes :) 2011-08-26
  • 0
    And suppose I could count the values of my distribution: in the right tail of the sample I would find fewer values than in the theoretical normal distribution. That is what the Q-Q plot says. Sorry for the elementary-school explanation. 2011-08-26
  • 1
    @Maurizio You got it right; I am merely rephrasing your statement to make it slightly more precise. The count of points at the right extreme of your data is lower than expected in a sample drawn from a normal distribution. 2011-08-26
2

Looks like your data has a cutoff at $4$. You could probably fit the samples you plotted fairly well with a curve such as

$$y = \frac 1 2 \left( x + 4 - \sqrt{c+(x-4)^2} \right),$$

where $c > 0$ is a free parameter that describes the sharpness of the cutoff. Just by eyeballing the graph, I'd guess that $c \approx 0.1$ for your data.
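The formula is a smoothed version of the identity $\min(a, b) = \frac12\left(a + b - |a - b|\right)$ with $a = x$ and $b = 4$: replacing $|x - 4|$ by $\sqrt{c + (x-4)^2}$ rounds off the corner, and letting $c \to 0$ recovers $\min(x, 4)$. A minimal sketch of this in Python follows; the function name `soft_cutoff` is hypothetical, and the default $c = 0.1$ is just the eyeballed guess above.

```python
import math
import random

def soft_cutoff(x, c=0.1, cutoff=4.0):
    """Smooth approximation of min(x, cutoff); c > 0 controls the sharpness."""
    return 0.5 * (x + cutoff - math.sqrt(c + (x - cutoff) ** 2))

# Far below the cutoff the function is close to the identity;
# far above it the function levels off just under the cutoff.
for x in (0.0, 2.0, 4.0, 6.0, 10.0):
    print(f"x = {x:4.1f}   y = {soft_cutoff(x):.4f}   min(x, 4) = {min(x, 4.0):.1f}")

# Applying it to a normal sample yields data with a soft ceiling at the cutoff.
rng = random.Random(1)  # fixed seed so the run is repeatable
transformed = [soft_cutoff(rng.gauss(0.0, 1.0)) for _ in range(1000)]
print(max(transformed) < 4.0)
```

Because $\sqrt{c + (x-4)^2} > |x - 4|$ whenever $c > 0$, the output always lies strictly below $\min(x, 4)$, so no transformed value reaches 4.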

  • 0
    Thank you! But if I only want to explain it in words, what should I say? For example, that, compared with the theoretical distribution, there is a higher probability of having data in the right tail of my empirical distribution? (In any case, the goodness-of-fit tests show no problem; I have already checked!) 2011-08-26
  • 0
    @Maurizio: I'm no statistician, but I'd just say that the real data is not quite normally distributed because it has a (soft) cutoff at 4. Since the cutoff is quite far out in the tail of the distribution, I suppose it doesn't affect the fit much. Do you have any idea what might be causing it? 2011-08-26
  • 0
    I don't know why. I'm analyzing the distribution of a VoIP traffic generator that should follow a normal distribution, but first I have to understand how to "read" Q-Q plots :) 2011-08-26