Deprecated: Array and string offset access syntax with curly braces is deprecated in /var/www/web100356/html/old_website/old_wiki/inc/init.php on line 535

Warning: Declaration of action_plugin_captcha::register(&$controller) should be compatible with DokuWiki_Action_Plugin::register(Doku_Event_Handler $controller) in /var/www/web100356/html/old_website/old_wiki/lib/plugins/captcha/action.php on line 20

Warning: Declaration of action_plugin_wrap::register(&$controller) should be compatible with DokuWiki_Action_Plugin::register(Doku_Event_Handler $controller) in /var/www/web100356/html/old_website/old_wiki/lib/plugins/wrap/action.php on line 22

Warning: Declaration of action_plugin_discussion::register(&$contr) should be compatible with DokuWiki_Action_Plugin::register(Doku_Event_Handler $controller) in /var/www/web100356/html/old_website/old_wiki/lib/plugins/discussion/action.php on line 33

Warning: Declaration of action_plugin_blockquote::register(&$controller) should be compatible with DokuWiki_Action_Plugin::register(Doku_Event_Handler $controller) in /var/www/web100356/html/old_website/old_wiki/lib/plugins/blockquote/action.php on line 42

Warning: Cannot modify header information - headers already sent by (output started at /var/www/web100356/html/old_website/old_wiki/inc/init.php:535) in /var/www/web100356/html/old_website/old_wiki/inc/auth.php on line 532

Warning: Cannot modify header information - headers already sent by (output started at /var/www/web100356/html/old_website/old_wiki/inc/init.php:535) in /var/www/web100356/html/old_website/old_wiki/inc/actions.php on line 216
public:som [juergen's work wiki]

====== Self Organizing Map ====== The [[http://en.wikipedia.org/wiki/Self-organizing_map|Self Organizing Map]] (SOM, Kohonen map) can be considered as a tool for [[http://en.wikipedia.org/wiki/Dimension_reduction|dimension reduction]]. It was invented by [[http://www.cis.hut.fi/teuvo/|Teuvo Kohonen]]. High-dimensional input vectors are mapped to a (typically) 2D map such that input vectors that are 'near' in the high-dimensional input space are mapped to 'near' positions in the 2D map. In this sense the mapping is neighborhood-preserving. That is a nice property since if we move slowly in this low-dimensional latent space, we also move - more or less - continously in the high-dimensional space. To achieve this, two mechanisms are combined: - [[http://en.wikipedia.org/wiki/Vector_quantization|vector quantization]] to prototype nodes - simultaneous adaptation of neighbored prototype nodes to the current input vector The following visualizations should help to understand the coaction of these two mechanisms better. There is also a [[http://www.cis.hut.fi/somtoolbox/theory/somalgorithm.shtml|short intro]] to SOMs written by Teuvo Kohonen. ===== Vector quantization only ===== * blue circles: sample 2D input vectors from a [[http://en.wikipedia.org/wiki/Swiss_roll|2D swiss roll]] * red circles: current prototype vector represented by a prototype SOM node * yellow lines between 2 nodes: these nodes are neighbored in the 2D map If we just initialize 10x10=100 protoype nodes with zero vectors and adapt only the best matching unit (BMU) a little bit into the direction of the current training vector, we get that: {{:public:som_01_adapt_just_bmu_init_nodes_with_zero_vecs.gif|}} Only 2 of 100 nodes are used to represent the input data. If we initialize the 100 nodes with random vectors around the mean input vector, some nodes more are used - but still only a small part of all nodes that we could use to represent the input data, i.e. we do not use the full representative power of all SOM nodes. {{:public:som_02_adapt_just_bmu_init_nodes_with_mean_vec.gif|}} Neighborhood adaptation coefficients used: <file> 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 </file> Note, that Teuvo Kohonen stressed the point that a random initialization of the SOM can be regarded as some 'historical ballast': <blockquote> \\ Due to the many stages in the development of the SOM method and its variations, there is often useless historical ballast in the computations. For instance, one old ineffective principle is random initialization of the model vectors tex2html_wrap_inline117. Random initialization was originally used to show that there exists a strong self-organizing tendency in the SOM, so that the order can even emerge when starting from a completely unordered state, but this need not be demonstrated every time. On the contrary, if the initial values for the model vectors are selected as a regular array of vectorial values that lie on the subspace spanned by the eigenvectors corresponding to the two largest principal components of input data, computation of the SOM can be made orders of magnitude faster, since (i) the SOM is then already approximately organized in the beginning, (ii) one can start with a narrower neighborhood function and smaller learning-rate factor. <cite>Teuvo Kohonen</cite> </blockquote> ===== Vector quantization with adaptation of neighbored prototype nodes ===== Something cool happens if we do not only adapt the BMU to the current training input vector but also its 8 neighbors (assuming a 2D topology): {{:public:som_03_adapt_bmu_plus_its_neighbors.gif|}} The nodes now distribute onto the places in 2D input space where we get the input from. Neighborhood adaptation coefficients used: <file> 0.000 0.000 0.000 0.000 0.000 0.000 0.500 0.500 0.500 0.000 0.000 0.500 1.000 0.500 0.000 0.000 0.500 0.500 0.500 0.000 0.000 0.000 0.000 0.000 0.000 </file> ===== SOM with inhibition ===== In theory often a neighborhood adaptation function is proposed that also incorporates some small inhibition. E.g. the [[http://en.wikipedia.org/wiki/Mexican_hat_wavelet|mexican hat function]]. But if one uses too big inhibition the SOM can 'explode': {{:public:som_04_adapt_with_inhibition.gif|}} Neighborhood adaptation coefficients used: <file> 0.000 -0.200 -0.200 -0.200 0.000 -0.200 0.500 0.500 0.500 -0.200 -0.200 0.500 1.000 0.500 -0.200 -0.200 0.500 0.500 0.500 -0.200 0.000 -0.200 -0.200 -0.200 0.000 </file> If you use instead a small inhibition, the SOM does not explode, but we get nearly the same result as with no inhibition. For this, in practice most people use a non-inhibitory adaptation function as e.g. a [[http://en.wikipedia.org/wiki/Gaussian_function#Two-dimensional_Gaussian_function|2D Gaussian]]. {{:public:som_05_adapt_with_small_inhibition.gif|}} Neighborhood adaptation coefficients used: <file> 0.000 -0.010 -0.010 -0.010 0.000 -0.010 0.500 0.500 0.500 -0.010 -0.010 0.500 1.000 0.500 -0.010 -0.010 0.500 0.500 0.500 -0.010 0.000 -0.010 -0.010 -0.010 0.000 </file> ===== SOM and isolated input data clusters ===== A drawback of the SOM is its fixed topology. Using a fixed topology means that each node has some neighbors (defined by the topology) that are adapted always - even if the neighbor represents a very different prototype vector. One can see this effect when the input data stems not only from a coherent region in high-dimensional space, but from different isolated regions: Two isolated regions: Swiss roll + one rectangle {{:public:som_06_adapt_to_2dswissroll_plus_one_rectangle.gif|}} Three isolated regions: Swiss roll + two rectangles {{:public:som_07_adapt_to_2dswissroll_plus_two_rectangles.gif|}} Some nodes are attracted by samples from the Swiss roll and one of the two rectangles at the same time such that they are pulled to a prototype vector somewhere between both regions - but actually no input data stems from that region! In this sense, the representation becomes errorneously! Neighborhood adaptation coefficients used: <file> 0.000 -0.010 -0.010 -0.010 0.000 -0.010 0.500 0.500 0.500 -0.010 -0.010 0.500 1.000 0.500 -0.010 -0.010 0.500 0.500 0.500 -0.010 0.000 -0.010 -0.010 -0.010 0.000 </file> Note that there is a good solution to this drawback of a Self Organizing Map. It is called [[public:growing neural gas|Growing Neural Gas]] and allows for isolated connected prototype clusters by not only adapting the prototype nodes but learning a topology as well.

 
public/som.txt · Last modified: 2010/12/17 19:00 (external edit) · []
Recent changes RSS feed Powered by PHP Valid XHTML 1.0 Valid CSS Driven by DokuWiki