I had previously posted this question on Stack Overflow, but since it seems to be more about probability than programming, I thought this forum might be a more suitable place for it.
I need to understand the Survival Analysis concept of MEAN RESIDUAL LIFE.
I haven't found anything that, to my mind, gives a complete and clear definition. Definitions I have found embedded in papers include:
The mean residual life function (mrlf) of a subject is defined as the average residual lifetime of the subject given that the subject has survived up to a given time point.
"http://www.stat.ncsu.edu/information/library/papers/mimeo2612.pdf"
Another:
The function E[S|t], the expected additional lifetime given that a component has survived until time t, is called the mean residual lifetime (MRL) function.
"http://www.jstor.org/pss/3213124"
One more:
In life testing situations, the expected additional lifetime given that a component has survived until time t is a function of t, called the mean residual life. More specifically, if the random variable X represents the life of a component, then the mean residual life is given by m(t) = E(X - t|X>t).
"http://bmif.unde.ro/docs/20082/5%20DIsbasoiu.pdf"
Buried in Mathematica's documentation I found two references for calculating mean residual life, which I've put into similar functions.
MRL1[variable_, distribution_] := NExpectation[X \[Conditioned] X > variable, X \[Distributed] distribution] - variable
MRL2[variable_, distribution_] := NExpectation[X - variable \[Conditioned] X > variable, X \[Distributed] distribution]
You can find the references at:
"http://www.wolfram.com/mathematica/new-in-8/nonparametric-derived-and-formula-distributions/analyze-left--right--and-interval-censored-data.html"
and
"http://reference.wolfram.com/mathematica/ref/SkewNormalDistribution.html"
The second of these corresponds in form to the third of the definitions above. The two functions produce the same results:
dist = LogNormalDistribution[1.75, 0.65];
var = 6;
MRL1[var, dist]
MRL2[var, dist]

4.80314
4.80314
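As a further cross-check, here is a minimal sketch, assuming the lifetime is nonnegative with a finite mean. Under that assumption the mean residual life can also be written as the integral of the survival function from t to infinity, divided by the survival function at t. MRL3 is a hypothetical name of my own, not something taken from the Mathematica documentation; SurvivalFunction and NIntegrate are built-in.

```mathematica
(* MRL3 is a hypothetical helper, not from the Mathematica documentation. *)
(* It computes the area under the survival function beyond t, *)
(* divided by the probability of surviving past t. *)
MRL3[t_?NumericQ, distribution_] :=
  NIntegrate[SurvivalFunction[distribution, x], {x, t, Infinity}] /
   SurvivalFunction[distribution, t]

dist = LogNormalDistribution[1.75, 0.65];
MRL3[6, dist]
```

If the identity holds numerically, MRL3[6, dist] should agree with the 4.80314 above up to integration error.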
For those of you with access to Mathematica, I put together a Manipulate to illustrate this idea using the first of the above functions:
dist = LogNormalDistribution[1.75, 0.65];
Manipulate[
 Module[{mrl},
  mrl = MRL1[t, dist];
  Column[{
    "MEAN RESIDUAL LIFE?", " ", "PDF",
    Show[
     ListPlot[{{t, N[PDF[dist, t]]}, {mrl, N[PDF[dist, mrl]]}},
      ImageSize -> 350, PlotRange -> All,
      Prolog -> {Text["MRL", {mrl, .01}], Text["t", {t, .02}]},
      Filling -> Axis, FillingStyle -> Lighter[Red], AxesOrigin -> {0, 0}],
     Plot[PDF[dist, x], {x, 0, 40}]],
    " ", "CDF",
    Show[
     ListPlot[{{t, N[CDF[dist, t]]}, {mrl, N[CDF[dist, mrl]]}},
      ImageSize -> 350, PlotRange -> All,
      Prolog -> {Text["MRL", {mrl, .1}], Text["t", {t, .2}]},
      Filling -> Axis, FillingStyle -> Lighter[Red], AxesOrigin -> {0, 0}],
     Plot[CDF[dist, x], {x, 0, 40}]],
    Grid[{{"t ", t}, {"MRL ", mrl}, {"Mean[dist] ", Mean[dist]},
      {"PDF at t", N[PDF[dist, t]]}, {"CDF at t", N[CDF[dist, t]]}},
     Alignment -> Left]}]],
 {t, 0, 40}]
To put this in some context, the x axis could represent time (life expectancy, time to failure of a part) or some other quantity such as distance (miles a car runs before breaking down, or how far a trend in a stock can run).
The value "t" corresponding to the x axis initializes at 0.
Some observations:
Mean[dist] = 7.10821.
Where t = 0, MRL = 7.10819, which matches the mean up to numerical error (for a lognormal, conditioning on X > 0 changes nothing).
As t increases towards MRL, MRL decreases.
At t = 4.7985, t = MRL (at least to 4 decimal places).
As t increases beyond MRL it drags MRL higher.
When (for this distribution) t reaches a bit over 25, MRL begins to exceed the mean of the distribution.
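To make these observations easier to reproduce without the slider, here is a minimal sketch that tabulates MRL1 at a few values of t. The sample points and the names tValues and rows are my own choices for illustration. The third column, t + MRL, is simply E[X | X > t], the quantity MRL1 subtracts t from, shown for comparison with Mean[dist].

```mathematica
(* Same definition as MRL1 above, repeated so the snippet runs on its own *)
MRL1[variable_, distribution_] :=
  NExpectation[X \[Conditioned] X > variable,
    X \[Distributed] distribution] - variable

dist = LogNormalDistribution[1.75, 0.65];

(* Sample points are arbitrary choices for illustration *)
tValues = {1, 4.7985, 10, 25, 30};

(* Each row: t, MRL at t, and t + MRL, which equals E[X | X > t] *)
rows = Table[With[{m = MRL1[t, dist]}, {t, m, t + m}], {t, tValues}];

TableForm[rows, TableHeadings -> {None, {"t", "MRL", "t + MRL"}}]
```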
When I first came across mean residual life, I thought it would represent the expectation of everything beyond t and that as t increased MRL would have to increase. If someone lives to 80 years old, wouldn't they have some (if small) expectation of living longer? Some average in the tail of the distribution?
Maybe MRL doesn't make sense if t > MRL.
If 0 < t < MRL why should one's expectation be less than the mean of the distribution?
I feel a bit thick headed on this, as if I've missed some basic intuitive grasp of the idea. I just hoped someone on the forum could give me an insight into this and how to apply it.
Thanks, J