
Could someone please help me understand this KDE equation by working out an example?

$f_X(x)=\frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$

Say, for example, I have the following data:

$$\begin{array}{c|c} x & \text{density} \\ \hline 1.2 & 12 \\ 2.2 & 24 \\ 3.8 & 18 \\ 6.5 & 6 \\ 7.0 & 12 \end{array}$$

I want the density when $x = 4.5$. Assume $h$ is $0.5$.

  • Then R is a good choice, the syntax is very close to MATLAB. (2013-04-07)

1 Answer


I picture KDE as pitching one tent (the kernel $K$) over each sample $x_i$, with the widths of the tents controlled by the bandwidth $h$. The estimate at a point $x$ is the sum of the tent heights at $x$, scaled by $\frac{1}{nh}$. The point of using KDE is to estimate a function, specifically a pdf, from its samples. In go the numbers, out comes a (typically continuous) function.

The two questions you have to ponder when using KDE are

  1. What kernel should I be using?
  2. What should I set the bandwidth to?

A great way to grasp this is to generate some samples in MATLAB from a distribution of your choice, and try to recreate the pdf using KDE. Watch what happens when you change the bandwidth or swap one kernel for another.
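The experiment described above can be sketched in a few lines. This is only an illustration under my own assumptions: it uses Python rather than MATLAB, draws 2000 samples from a standard normal, uses a Gaussian kernel, and the names `gaussian_kernel`, `kde`, and the three bandwidths are arbitrary choices, not anything prescribed by the question.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, samples, h, kernel=gaussian_kernel):
    """Evaluate (1 / (n h)) * sum_i K((x - x_i) / h) at a single point x."""
    n = len(samples)
    return sum(kernel((x - xi) / h) for xi in samples) / (n * h)


# Draw samples from a known distribution so the estimate can be compared
# against the true pdf (standard normal, chosen only for illustration).
random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(2000)]

grid = (-2.0, 0.0, 2.0)
print("true pdf :", [round(gaussian_kernel(x), 3) for x in grid])
for h in (0.05, 0.3, 3.0):
    print(f"h = {h:<4} :", [round(kde(x, samples, h), 3) for x in grid])
```

A very small bandwidth gives a noisy estimate, while a very large one flattens the peak well below the true value at $x = 0$; a moderate bandwidth should land close to the true pdf.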

For $x_i = \{1.2, 2.2, 3.8, 6.5, 7.0\}$, $h=0.5$, and $x=4.5$:

$f_X(4.5)=\frac{1}{5 \cdot 0.5} \left( K( 2(4.5-1.2) ) + K( 2(4.5-2.2) ) + K( 2(4.5-3.8) ) + K( 2(4.5-6.5) ) + K( 2(4.5-7.0) ) \right)$
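To turn this expression into a number you have to pick a kernel, which the question does not specify. Here is a minimal sketch that evaluates the sum for two common choices, a Gaussian kernel and a boxcar (uniform) kernel. Both kernels and the helper name `kde_at` are my assumptions for illustration; the samples, $h$, and $x$ are the ones from the question.

```python
import math


def gaussian_kernel(u):
    """K(u) = exp(-u^2 / 2) / sqrt(2 pi); an assumed kernel choice."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def boxcar_kernel(u):
    """K(u) = 1/2 for |u| <= 1 and 0 otherwise; another assumed choice."""
    return 0.5 if abs(u) <= 1.0 else 0.0


def kde_at(x, samples, h, kernel):
    """Evaluate (1 / (n h)) * sum_i K((x - x_i) / h) at the point x."""
    n = len(samples)
    return sum(kernel((x - xi) / h) for xi in samples) / (n * h)


samples = [1.2, 2.2, 3.8, 6.5, 7.0]  # the x_i from the question
h = 0.5
x = 4.5

# Show each term of the sum, then the final estimate for the Gaussian kernel.
for xi in samples:
    u = (x - xi) / h
    print(f"x_i = {xi}: u = {u:+.1f}, K(u) = {gaussian_kernel(u):.6f}")

print("Gaussian kernel:", kde_at(x, samples, h, gaussian_kernel))
print("Boxcar kernel:  ", kde_at(x, samples, h, boxcar_kernel))
```

With the Gaussian kernel the estimate is roughly 0.06, and almost all of it comes from the sample at 3.8. With the boxcar kernel it is exactly 0, because no sample lies within $h = 0.5$ of $x = 4.5$. This is a concrete case of the kernel and bandwidth choices changing the answer.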

  • You feed the $x_i$ (as many as you have) as input, then the resulting function of $x$ is the pdf. (2011-05-03)