
k-means

Instead of having a constant learning rate, we can also decrease it over time. A particularly interesting way of doing so is to have a separate learning rate for each unit $c$ and to set it according to the harmonic series:
\[
  \epsilon(t) \;=\; \frac{1}{t}.
\]
Here the time parameter $t$ stands for the number of input signals for which this particular unit has been winner so far. This algorithm is known as k-means (MacQueen, 1967), which is a rather appropriate name, because each reference vector $w_c$ is always the exact arithmetic mean of the input signals $\xi_1, \xi_2, \ldots$ it has been winner for so far. The sequence of successive values of $w_c$ is the following:


\begin{eqnarray*}
  w_c(1) &=& w_c(0) + 1 \cdot (\xi_1 - w_c(0)) \;=\; \xi_1 \\
  w_c(2) &=& w_c(1) + \tfrac{1}{2}\,(\xi_2 - w_c(1)) \;=\; \tfrac{1}{2}\,(\xi_1 + \xi_2) \\
  w_c(3) &=& w_c(2) + \tfrac{1}{3}\,(\xi_3 - w_c(2)) \;=\; \tfrac{1}{3}\,(\xi_1 + \xi_2 + \xi_3) \\
         & \vdots & \\
  w_c(t) &=& w_c(t-1) + \tfrac{1}{t}\,(\xi_t - w_c(t-1)) \;=\; \tfrac{1}{t}\sum_{i=1}^{t} \xi_i
\end{eqnarray*}

One should note that the set of signals $\{\xi_1, \ldots, \xi_t\}$ for which a particular unit $c$ has been winner may contain elements which lie outside the current Voronoi region of $c$. The reason is that each adaptation of $w_c$ changes the borders of the Voronoi region $V_c$. Therefore, although $w_c$ represents the arithmetic mean of the signals it has been winner for, at time $t$ some of these signals may well lie in Voronoi regions belonging to other units.
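
The adaptation just described can be stated compactly in code. Below is a minimal sketch in Python/NumPy (an assumption of this text, not part of the original; the function name online_kmeans and the initialization from the first k signals are illustrative choices): each unit keeps a win counter, and the winning unit moves toward the current signal with learning rate 1/t, so that its reference vector is always the arithmetic mean of the signals it has won so far.

  import numpy as np

  def online_kmeans(signals, k):
      # Hard competitive learning with per-unit harmonic learning rate (k-means).
      signals = np.asarray(signals, dtype=float)
      w = signals[:k].copy()                 # reference vectors w_c (illustrative init)
      wins = np.zeros(k, dtype=int)          # t_c: number of wins of unit c so far
      for xi in signals:
          c = np.argmin(np.sum((w - xi) ** 2, axis=1))   # winner: nearest unit
          wins[c] += 1
          eps = 1.0 / wins[c]                # harmonic learning rate eps(t) = 1/t
          w[c] += eps * (xi - w[c])          # keeps w_c equal to the mean of its signals
      return w, wins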

Another important point about k-means is that there is no strict convergence (as is present, e.g., in LBG), the reason being that the sum of the harmonic series diverges:
\[
  \sum_{t=1}^{\infty} \frac{1}{t} \;=\; \infty.
\]
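Concretely, the partial sums grow only logarithmically,
\[
  H_n \;=\; \sum_{t=1}^{n} \frac{1}{t} \;\approx\; \ln n + \gamma, \qquad \gamma \approx 0.5772,
\]
so the individual terms $1/t$ shrink toward zero, while the tail sum $\sum_{t>n} 1/t$ remaining after any finite number $n$ of wins is still infinite.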

Because of this divergence, even after a large number of input signals and correspondingly low values of the learning rate $\epsilon(t)$, arbitrarily large modifications of each reference vector may occur in principle. Such large modifications, however, are very improbable, and in simulations where the signal distribution is stationary the reference vectors usually rather quickly take on values which change little thereafter. In fact, it has been shown that k-means does converge asymptotically to a configuration where each reference vector $w_c$ is positioned such that it coincides with the expectation value
\[
  E(\xi \mid \xi \in V_c) \;=\; \frac{\int_{V_c} \xi \, p(\xi)\, d\xi}{\int_{V_c} p(\xi)\, d\xi} \qquad (4.17)
\]
of its Voronoi region $V_c$ (MacQueen, 1965). One can note that (4.17) is the continuous variant of the centroid condition (4.2). Figure 4.5 shows some stages of a simulation for a simple ring-shaped data distribution. Figure 4.6 displays the final results after 40000 adaptation steps for three other distributions.
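
In the spirit of the simulation in figure 4.5, the following experiment sketch (the ring-sampling helper and all names in it are illustrative assumptions; it reuses the online_kmeans sketch given above) runs k-means on signals from a ring-shaped uniform distribution and then checks empirically how close each final reference vector is to the mean of the signals in its Voronoi region, i.e. to a sample estimate of (4.17).

  import numpy as np

  def sample_ring(n, r_inner=0.7, r_outer=1.0, seed=1):
      # Draw n points uniformly from a ring (annulus) by rejection sampling.
      rng = np.random.default_rng(seed)
      pts = []
      while len(pts) < n:
          p = rng.uniform(-r_outer, r_outer, size=2)
          if r_inner <= np.linalg.norm(p) <= r_outer:
              pts.append(p)
      return np.array(pts)

  signals = sample_ring(40000)
  w, wins = online_kmeans(signals, k=20)

  # Empirical check of the centroid condition: compare each w_c with the mean
  # of the signals that fall into its final Voronoi region V_c.
  labels = np.argmin(((signals[:, None, :] - w[None, :, :]) ** 2).sum(-1), axis=1)
  for c in range(len(w)):
      if np.any(labels == c):
          centroid = signals[labels == c].mean(axis=0)
          print(c, np.linalg.norm(w[c] - centroid))

One would expect the printed distances to be small but nonzero after finitely many signals, in line with the asymptotic character of the convergence result above.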

Figure 4.5: k-means simulation sequence for a ring-shaped uniform probability distribution. a) Initial state. b)-f) Intermediate states. g) Final state. h) Voronoi tessellation corresponding to the final state. The final distribution of the reference vectors still reflects the clusters present in the initial state (see in particular the region of higher vector density at the lower left).

Figure 4.6: k-means simulation results after 40000 input signals for three different probability distributions (described in the caption of figure 4.4).

