0
$\begingroup$

I have a piecewise linear function $f$ and I'd like to calculate the probability density or cumulative distribution function of $f(X)$, where $X$ has uniform distribution over the regions where $f$ is defined. For example,

$f(x)=\begin{cases} 6x+1 & 0\leq x < 1 \\ -3x+10 & 1\leq x < 2 \end{cases} $

Let me emphasize that this function is not a probability density. I want to obtain the density of $f(X)$. I don't care about $f$ outside of the limits given, which is why I haven't specified an otherwise condition.

I'm able to obtain the density of $f(X)$ empirically. I do this by generating a million numbers $x_1, x_2,\ldots, x_{10^6}$ uniformly from $0$ to $2$ and then creating the histogram of the $f(x_i)$. However, I'd like an analytical solution, for any piecewise function.

I'm going to implement this solution as an algorithm, so if the solution is best explained as a program, even better.

  • 0
    You mean by probability the area under the graph (as a first approach as CDF)?2017-01-27
  • 0
    @Masacroso No, I think that would make $f(x)$ a probability density itself, but I want *its* probability density, if that makes sense.2017-01-27
  • 0
    probability of what events?2017-01-27
  • 0
    @Masacroso I guess you can think of them as events, an infinite number of them, whose values are $f(x)$.2017-01-27
  • 0
    @Masacroso Perhaps a better way to explain it is that $X$ is a random variable from a uniform distribution (from 0 to 2). What is the density function of $f(X)$?2017-01-27
  • 1
    @Masacroso The question has been clarified by grand_chat.2017-01-28
  • 0
    I see. I voted to reopen the question. I think you must add your comment about $X$ as a uniform distribution over the x-axis to the question.2017-01-28
  • 0
    @Masacroso I believe grand_chat added that comment in the question title ("$X$ is uniform") and in the first sentence of the question ("$X$ has uniform distribution over the regions where $f$ is defined").2017-01-28

2 Answers 2

2

If the piecewise linear function $f$ is defined on an interval, and $X$ is chosen uniformly at random over that interval, then $f(X)$ will have the distribution of a finite mixture of uniform random variables. The density of $f(X)$ will look like a stack of blocks, which is not that easy to specify analytically, though you should be able to code up an algorithm to compute the density.

Suppose the subintervals where the pieces of $f$ are being defined are $[a_1,a_2]$ and $[a_2, a_3]$, etc, assumed disjoint. If the randomly chosen $X$ lands in the first of these subintervals, say subinterval $[a_1,a_2]$, it will be transformed linearly, so the result will have a uniform distribution with endpoints corresponding to a new interval whose endpoints will be $f_1(a_1)$ and $f_1(a_2)$, where $f_1$ is the equation of the line defined on subinterval $[a_1, a_2]$. Similarly if $X$ lands in the second subinterval, the result will be uniform over the interval with endpoints $f_2(a_2)$ and $f_2(a_3)$. You will need to swap the order of these endpoints if the linear function has a negative slope over the subinterval.

The end result, the density for $f(X)$, will be a mixture of the densities of these new uniform variables. Now a single uniform distribution has density $1/(\text{width of interval})$, which is easy to code because it's basically a rectangular block. However, if you have a mixture of these you need to scale the height of each block according to the relative width of the starting subinterval. So you have to scale the output uniform density arising from starting subinterval $[a_1,a_2]$ by the fraction of time that $X$ would land in $[a_1, a_2]$, and similarly scale the output density from the second subinterval by the fraction of time that $X$ would land in $[a_2, a_3]$. In the simple case where the starting subintervals all have the same length, the density of $f(X)$ will be the average of the individual output uniform densities.

In your example, the subintervals are $[a_1,a_2]=[0,1]$ and $[a_2,a_3]=[1,2]$. These get mapped to a uniform distribution over $[1,7]$ and a uniform distribution over $[4,7]$ respectively. The corresponding densities would be rectangular blocks with widths $6$ and $3$. Their heights would be $1/6$ and $1/3$ respectively. The mixture weights would be $1/2$ and $1/2$ respectively because the two subintervals have equal length, which changes the block heights to $1/12$ and $1/6$ respectively. The final density of $f(X)$ is the pointwise sum of these two block functions.

  • 0
    Great explanation, thank you! I hope I understood correctly. I've drawn what I think you mean: http://imgur.com/a/fdN0I. Is that right?2017-01-27
  • 0
    @redcurry Exactly right! More fun is the case where the blocks you are stacking happen to overlap, and some will get broken into pieces. Then you end up with possibly more blocklike shapes than starting subintervals.2017-01-28
  • 0
    Yes, my actual problem will have hundreds of subintervals, so it will be quite interesting to implement! Do you have any recommendations to how to phrase my question better? It was put on hold for being confusing, yet you and Nafiz understood it perfectly.2017-01-28
  • 0
    @redcurry I've edited your question to clarify what you are seeking. Is the function $f$ piecewise linear, or can the pieces be any function, not necessarily linear?2017-01-28
  • 0
    Thank you! For my problem, the function $f$ is piecewise linear.2017-01-28
2

As already mentioned and explained by grand_chat, the final answer is not analytic, rather piecewise. We could also bluntly compute it as follows:

Let us generalize a little bit first. Suppose we are given a piecewise function: $$f(x) := g_i(x) \mbox{ if } x \in I_i\,,$$ where $I_i$'s are various mutually disjoint intervals and for each interval $I_i$ we have a function $g_i$. Let us define the union of all these intervals: $$I := \bigcup_i I_i\,,$$ and the two extremal values of $f$: $$ A := \mathrm{min}\{f(x)\, |\, x \in I\}\,, \qquad B := \mathrm{max}\{f(x)\, |\, x \in I\}\,. $$ Given two numbers $a, b \in I$ such that $a \le b$, let us define the intervals where the values of $f$ lies between $a$ and $b$: $$ J^{ab}_i := \{x \in I_i\,|\,a \le f(x) \le b\}\,. $$ We will denote the length of an interval by $L$, i.e., $L([a,b]) = b-a$, $L(I) = \sum_i L(I_i)$, etc. Now going back to the probabilities we can write: $$ P(a \le f(x) \le b) = \sum_i P(a \le g_i(x) \le b) = \frac{\sum_i L(J_i^{ab})}{\sum_j L(I_j)}\,. \tag{P} $$ If the functions $g_i$ are sufficiently nice (for example, if they are linear as in the given problem) we can compute the above sum. Before doing so for the particular problem you gave let us write down the pdf and the cdf: $$ \mathrm{cdf}_f(a) = P(A \le f(x) \le a)\,, \qquad \mathrm{pdf}_f(a) = \frac{\mathrm d}{\mathrm d a} \mathrm{cdf}_f(a)\,. $$

In the given problem there are two intervals with two functions: \begin{align} I_1 = [0,1]\,, &\qquad I_2 = [1,2]\,, \\ g_1(x) = 6x+1\,, &\qquad g_2(x) = -3x+10\,. \end{align} The minima and the maxima of the function $f$ occurs at $x=0$ and $x=1$ respectively, and the values are: $$A = 1, \qquad B=7\,.$$ To find the intervals $J^{ab}_1$ and $J^{ab}_2$ we just need to solve the following equations: $$6x+1=a\,, \quad 6x+1=b\,, \quad \mbox{and,} \quad -3x+10=a\,, \quad -3x+10=b\,.$$ From the solutions we construct the intervals (with the assumption $1 \le a \le b \le 7$): ‌\begin{align} J_1^{ab} =&\; \left[\frac{a-1}{6}, \mathrm{min}\left(\frac{b-1}{6},1\right)\right]\,, \\ J_2^{ab} =&\; \left[\mathrm{min}\left(\mathrm{max}\left(\frac{10-b}{3},1\right),2\right), \mathrm{min}\left(\frac{10-a}{3},2\right)\right]\,. \end{align} This construction is only valid for linear functions of course. The presence of the min and max looks awkward, it will get slightly better. Using (P) we get: \begin{align} P(a \le f(x) \le b) =&\; \frac{1}{2}(L(J_1^{ab}) + L(J_2^{ab}))\,, \\ =&\; \frac{1}{12}\left( b-a+6\, \mathrm{min}\left( \frac{10-a}{3},2 \right) -6\, \mathrm{min}\left( \frac{10-b}{3},2 \right) \right)\,. \end{align} So the cdf is given by a piecewise function: $$ \mathrm{cdf}_f(a) = P(1 \le f(x) \le a) = \left\{ \begin{array}{ll} \frac{a-1}{12} & 1 \le a \le 4 \\ \frac{a-3}{4} & 4 \le a \le 7 \end{array} \right. \,. $$ And, finally for the pdf we get: $$ \mathrm{pdf}_f(a) = \left\{ \begin{array}{ll} \frac{1}{12} & 1 \le a \le 4 \\ \frac{1}{4} & 4 \le a \le 7 \end{array} \right. \,. $$

  • 0
    This is also an excellent explanation, thanks! I wish I could mark both your and grand_chat's explanations as answers, but I must choose one. I chose grand_chat's because I can more clearly see how to implement it in code.2017-01-28
  • 0
    @redcurry you're welcome! :)2017-01-28