Project Abstract
Historically, it has been difficult to measure how far the notion of a
concept has deviated. The central aim of work in this area is to detect the
change point at which a data mining model deviates significantly from the
characteristics of the data on which it was trained or built. The phenomenon
behind such change points is commonly termed concept drift. Current
state-of-the-art algorithms (a) assume attribute independence and (b) treat
the problem as a supervised learning problem, and therefore need tagged
data. The proposed algorithm makes no assumption of attribute independence
and uses a covariance summary to detect concept drift in an unsupervised
setting. The algorithm proposed in this thesis monitors the underlying
characteristics of the input data, maintains data summaries of snapshots
taken at various points in time, and develops effective distance metrics to
determine when concept drift occurs. We evaluate our technique on synthetic
and real data sets.
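
The idea of comparing covariance summaries across snapshots can be illustrated with a minimal sketch. It is not the algorithm developed in the thesis: the function names (`covariance_summary`, `frobenius_distance`, `detect_drift`), the choice of the Frobenius norm as the distance metric, the use of the first snapshot as the reference, and the fixed threshold are all assumptions made for illustration only.

```python
from math import sqrt

def covariance_summary(rows):
    """Sample covariance matrix of one snapshot (a list of equal-length rows)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (r[i] - means[i]) * (r[j] - means[j])
    return [[c / (n - 1) for c in row] for row in cov]

def frobenius_distance(a, b):
    """Frobenius norm of the difference of two matrices (assumed metric)."""
    return sqrt(sum((x - y) ** 2
                    for row_a, row_b in zip(a, b)
                    for x, y in zip(row_a, row_b)))

def detect_drift(snapshots, threshold):
    """Flag each later snapshot whose summary is far from the first one.

    No labels are used, so the check is unsupervised. The threshold is a
    hypothetical tuning parameter.
    """
    reference = covariance_summary(snapshots[0])
    return [frobenius_distance(reference, covariance_summary(s)) > threshold
            for s in snapshots[1:]]

# Two snapshots with identical per-attribute values {0, 1, 2} but opposite
# correlation between the attributes.
stable = [[0, 0], [1, 1], [2, 2]]
drifted = [[0, 2], [1, 1], [2, 0]]
flags = detect_drift([stable, stable, drifted], threshold=1.0)  # [False, True]
```

In this toy example each attribute taken alone has the same distribution in both snapshots, so a check that assumes attribute independence would see no change; only the off-diagonal covariance entries reveal the drift.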