
I come from a non-statistics background and am working with probability density functions (PDF) and kernel density estimation (KDE). My data set has 842 records in total:

     frequency  | DAY
     ---------------------
        1          2
        1          20
        1          21
        1          5
        3          10
        3          28
        5          60
        7          7
        10         14
        11         180
        50         90
        309        0
        440        30

As you can see, day = 30 is by far the most frequent value in my data set, while days 20 and 21 each occur only once. Yet in the PDF and KDE curves, the values at 20 and 21 are high. I know that a density is not the same as a probability, but I can't understand why the graphs are high at 20 and 21. I expected the curve to be low at these points, since each of them appears only once in the data set. Could you explain the reason in simple words, please? The graphs are attached.

[PDF graph] [KDE graph]
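For reference, here is a minimal sketch of how a frequency-weighted Gaussian KDE could be computed from the table above using only the Python standard library. It is not the code that produced the attached graphs (that code is not shown), and the function name `weighted_kde` and the two bandwidth values are assumptions chosen purely for illustration. It only shows one possible effect: with a wide bandwidth, the curve at day 20 can sit high because it borrows mass from the 440 records at day 30, not because of the single record at day 20.

```python
import math

# (frequency, day) pairs copied from the table in the question
table = [(1, 2), (1, 20), (1, 21), (1, 5), (3, 10), (3, 28), (5, 60),
         (7, 7), (10, 14), (11, 180), (50, 90), (309, 0), (440, 30)]

total = sum(freq for freq, _ in table)  # number of records


def weighted_kde(x, bandwidth):
    """Gaussian KDE at x, with each day weighted by its frequency.

    `bandwidth` is a hypothetical smoothing width in days; the value
    behind the attached graphs is unknown.
    """
    norm = total * bandwidth * math.sqrt(2 * math.pi)
    kernel_sum = sum(
        freq * math.exp(-0.5 * ((x - day) / bandwidth) ** 2)
        for freq, day in table
    )
    return kernel_sum / norm


# Height of the curve at day 20 relative to day 30, for two bandwidths
for h in (1.0, 10.0):
    ratio = weighted_kde(20, h) / weighted_kde(30, h)
    print(f"bandwidth={h}: density(20) / density(30) = {ratio:.4f}")
```

With the narrow bandwidth the curve at 20 is tiny compared with 30, while with the wide bandwidth it is a large fraction of the peak. If the frequencies were ignored altogether and the DAY column were passed to the KDE as 13 raw observations, the cluster 20, 21, 28, 30 would also produce a bump in that region.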

  • I agree that it is not obvious how the graphs were made from the data. It might help to know what 'frequency' means here. What is being counted? (2017-02-17)
  • @BruceET The columns of the data set are: x2 (DAY), the number of days for which this_medicine_id has been prescribed, and x1 (frequency), the frequency of x2. For example, 440 prescriptions have been issued for 30 days. (2017-02-20)
