Abstract:
This article describes a new algorithm for processing
time-incremental data by hierarchical clustering. Although
hierarchical clustering techniques enable one to
automatically determine the number of clusters in a data
set, they are rarely used in industrial applications, because
they require a large amount of memory when processing more
than 10,000 elements. To solve this problem, the proposed
method proceeds by updating the hierarchical
representation of the data instead of re-computing the
whole tree when new patterns have to be taken into
account. Memory savings, evaluated on a real problem
(handwritten digit recognition), make it possible to process
databases containing 7 times more data than the classical
algorithm can handle.