%> @brief Hierarchical Clustering
%>
%> Clustering generates another dataset whose variables are cluster indexes. Each variable is a different clustering
%> scheme, corresponding to a cut of the dendrogram into a different number of clusters. Once the distance matrix is
%> built and the dendrogram is derived, generating further dendrogram cuts is cheap.
%>
%> @sa uip_clus_hca.m, pdist, linkage (MATLAB functions)
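%>
%> The steps described above can be sketched with MATLAB's own functions alone. This is a minimal illustration, not
%> this class's API: the matrix @c X and the variables @c idx3 and @c idx5 are hypothetical names introduced for the
%> example, and the Statistics and Machine Learning Toolbox is assumed to be available.
%> @code
%> X = rand(20, 3);                   % hypothetical data: 20 observations, 3 features
%> Y = pdist(X, 'euclidean');         % pairwise distances between observations
%> Z = linkage(Y, 'ward');            % hierarchical cluster tree (dendrogram)
%> idx3 = cluster(Z, 'MaxClust', 3);  % cut the tree into at most 3 clusters
%> idx5 = cluster(Z, 'MaxClust', 5);  % re-cut the same tree into at most 5 clusters
%> @endcode
%> The distance matrix and the tree are computed once; each additional cut only calls @c cluster() again.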
        %> Minimum number of clusters
        %> Maximum number of clusters
        %> ='euclidean'. Distance type. See help for @c pdist() for possible types.
        distancetype = 'euclidean';
        %> ='ward'. Linkage type. Defaults to Ward's method. See help for @c linkage() for possible types.
        linkagetype = 'ward';
            o.classtitle = 'Hierarchical Cluster Analysis';
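    methods(Access=protected)
        function dout = do_use(o, data)
            irverbose(sprintf('Starting calculation of the distance matrix with %d observations...', data.no));
            t = tic(); % start the timer read by toc(t) below
            % Only observations with a non-negative class take part in the clustering
            b_obsidxs = data.classes >= 0;
            no_new = sum(b_obsidxs);
            % Pairwise distances between the selected observations
            Y = pdist(single(data.X(b_obsidxs, :)), o.distancetype);
            irverbose(sprintf('...finished (took %g seconds)', toc(t)));
            % Hierarchical cluster tree (dendrogram)
            Z = linkage(Y, o.linkagetype);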
            % 2.5) Organizing into dataset
            Xnew = ones(data.no, o.nc_max-o.nc_min+1)*-3; % default value is -3: "refuse-to-cluster"
            for i = o.nc_min:o.nc_max
                % One column per number of clusters; cluster indexes are made zero-based
                Xnew(b_obsidxs, i-o.nc_min+1) = cluster(Z, 'MaxClust', i)-1;
            end
            dout = dout.import_from_struct(data);
            dout.X = double(Xnew);
            dout.fea_x = o.nc_min:o.nc_max;
            dout.title = [dout.title, ' - HCA'];
            dout.xname = 'Number of clusters';
            dout.yname = 'Cluster number';