
I've been trying to find a statistics-style formula for calculating the rate of change of HTML tags that are added to or removed from various websites.

For example, the scraper I'm writing obtains the initial tag count and caches that value. On the next round, I compare the current tag count with the cached one and express the difference between the two as a percentage rate of change.

Other factors are involved as well, such as the number of times the website has been scraped and the dates on which those scrapes occurred.

What would be the ideal formula for something of this nature?

Note: I'm guessing at the tags for this question; I'm not sure whether they are correct or whether there are better options.

1 Answer

Say you have a sequence, $a_t$, that represents the number of tags at the $t^{th}$ increment of time. The rate of change between times $t$ and $t+1$ is

$\Delta a_t=a_{t+1}-a_t$
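
As a rough illustration, the difference above can be computed directly from the cached counts. The sketch below is in Python and makes a few assumptions that are not in the question: the function names, the sample counts, and the sample dates are all hypothetical. It also shows one way to turn the difference into a percentage and one way to account for scrape dates that are not evenly spaced, by dividing by the number of days elapsed.

```python
from datetime import date


def rate_of_change(counts):
    """Forward differences a[t+1] - a[t] for equally spaced scrapes."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


def percent_change(previous, current):
    """Relative change in percent; undefined when the previous count is 0."""
    if previous == 0:
        return None
    return 100.0 * (current - previous) / previous


def change_per_day(previous, current, previous_date, current_date):
    """Difference quotient (a[t+h] - a[t]) / h, with h measured in days."""
    h = (current_date - previous_date).days
    if h <= 0:
        raise ValueError("scrape dates must be strictly increasing")
    return (current - previous) / h


# Hypothetical tag counts from four successive scrapes of one site.
counts = [120, 126, 126, 117]
print(rate_of_change(counts))    # [6, 0, -9]
print(percent_change(120, 126))  # 5.0
print(change_per_day(126, 117, date(2012, 8, 1), date(2012, 8, 4)))  # -3.0
```

Whether to report the raw difference, the percentage, or the per-day figure depends on what the scraper needs to compare; the per-day version is the only one of the three that uses the scrape dates.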


Another approach is to fit an $m^{th}$-degree least-squares polynomial to the points, which gives a smooth function through the data. The approximate derivative at time $t$ can then be obtained from that polynomial (by the power rule for derivatives, or by other approximations).
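
A minimal sketch of the polynomial approach, again in Python, might look like the following. It solves the normal equations with plain Gaussian elimination so that it needs no external libraries; the function names and the sample data are hypothetical, and this is only one of several ways to do the fit. In practice a library routine such as NumPy's `polyfit` does the same job and is numerically more robust for higher degrees.

```python
def polyfit_least_squares(ts, ys, degree):
    """Coefficients c[0..degree] of the least-squares polynomial
    c[0] + c[1]*t + ... + c[degree]*t**degree, via the normal equations."""
    n = degree + 1
    # Normal equations (V^T V) c = V^T y, where V[i][j] = ts[i] ** j.
    A = [[sum(t ** (i + j) for t in ts) for j in range(n)] for i in range(n)]
    b = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("not enough distinct time points for this degree")
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


def poly_derivative_at(coeffs, t):
    """Power rule: d/dt sum(c[k] * t**k) = sum(k * c[k] * t**(k - 1))."""
    return sum(k * c * t ** (k - 1) for k, c in enumerate(coeffs) if k > 0)


# Hypothetical tag counts at scrape times t = 0, 1, 2, 3, 4.
times = [0, 1, 2, 3, 4]
counts = [120, 126, 126, 117, 110]
coeffs = polyfit_least_squares(times, counts, degree=2)
print(poly_derivative_at(coeffs, 4))  # smoothed rate of change at t = 4
```

The degree is a trade-off: a low degree smooths out one-off jumps in the tag count, while a degree close to the number of points follows the data so closely that the derivative becomes noisy.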

  • @MichaelChernick I am not finding a percentage. I am finding the approximate rate of change of the sequence. I treat the sequence like a continuous function and use the definition of the derivative to find $\frac{d}{dt} a_t \approx \frac{a_{t+h}-a_t}{h}$ for small $h$. Letting $h=1$, the result follows. 2012-08-14