dendrogram. The dendrogram illustrates how each cluster is composed by drawing a U-shaped link between a non-singleton cluster and its children. The height of the top of the U-link is the distance between its children clusters. It is also the cophenetic distance between original observations in the two children clusters. It is expected that the distances in Z[:,2] be monotonic, otherwise crossings appear in the dendrogram.
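The relationship between U-link heights, the merge distances in Z[:,2], and cophenetic distances can be checked numerically. The following is a minimal sketch, assuming SciPy is installed; the four 1-D observations and the choice of single linkage are made up for illustration and do not come from the original text.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, dendrogram, is_monotonic, linkage

# Four 1-D observations (illustrative data, an assumption of this sketch).
X = np.array([[0.0], [1.0], [5.0], [7.0]])
Z = linkage(X, method='single')

# The merge distances in Z[:, 2] should be monotonic to avoid crossings.
monotonic = is_monotonic(Z)

# no_plot=True returns the plot data structures without rendering.
R = dendrogram(Z, no_plot=True)

# Each dcoord entry holds the four y-values of one U-link; the two
# middle values are the height of the top of the U.
link_heights = sorted(d[1] for d in R['dcoord'])

# Condensed cophenetic distances between the original observations.
coph = cophenet(Z)
```

Here the sorted link heights match the sorted values of Z[:,2], and each cophenetic distance equals the height of the U-link that first joins the two observations.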
Arguments:
color_threshold : double For brevity, let t be the color_threshold. Colors all the descendent links below a cluster node k the same color if k is the first node below the cut threshold t. All links connecting nodes with distances greater than or equal to the threshold are colored blue. If t is less than or equal to zero, all nodes are colored blue. If color_threshold is None or ‘default’, corresponding with MATLAB(TM) behavior, the threshold is set to 0.7*max(Z[:,2]).
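As a rough illustration of the default threshold, the sketch below computes 0.7*max(Z[:,2]) by hand and compares the resulting link colors with those of the default call. It assumes SciPy is installed and uses made-up data; the actual color codes depend on the SciPy version, so only their equality is compared.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Illustrative data (an assumption of this sketch).
X = np.array([[0.0], [1.0], [5.0], [7.0]])
Z = linkage(X, method='single')

# The default threshold described above.
t = 0.7 * Z[:, 2].max()

# Links whose merge distance is below t receive a per-cluster color.
below = Z[:, 2] < t

# Passing t explicitly should reproduce the default coloring.
R_default = dendrogram(Z, no_plot=True)
R_explicit = dendrogram(Z, color_threshold=t, no_plot=True)

# A threshold of zero (or less) gives every link the same color.
R_zero = dendrogram(Z, color_threshold=0, no_plot=True)
```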
get_leaves : bool Includes a list R['leaves']=H in the result dictionary. For each i, H[i] == j means that cluster node j appears in the i-th position in the left-to-right traversal of the leaves, where j < 2n-1 and i < n.
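A short sketch of reading the leaf order back from the result dictionary follows. It assumes SciPy is installed; the data and the names X, n, and H are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Illustrative data (an assumption of this sketch).
X = np.array([[0.0], [1.0], [5.0], [7.0]])
Z = linkage(X, method='single')
n = Z.shape[0] + 1  # number of original observations

R = dendrogram(Z, get_leaves=True, no_plot=True)

# H[i] is the id of the cluster node drawn at leaf position i.
H = R['leaves']
for i, j in enumerate(H):
    print('position %d -> node %d' % (i, j))
```

Without truncation every leaf is an original observation, so H is a permutation of 0..n-1.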
orientation : string The direction to plot the dendrogram, which can be any of the following strings:
- ‘top’: plots the root at the top, and plots descendent links going downwards (default).
- ‘bottom’: plots the root at the bottom, and plots descendent links going upwards.
- ‘left’: plots the root at the left, and plots descendent links going right.
- ‘right’: plots the root at the right, and plots descendent links going left.
labels : ndarray By default labels is None so the index of the original observation is used to label the leaf nodes.
Otherwise, this is an n-sized list (or tuple). The labels[i] value is the text to put under the i-th leaf node only if it corresponds to an original observation and not a non-singleton cluster.
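The effect of labels can be seen in the 'ivl' entry of the result dictionary, which lists the leaf label text from left to right. This is a minimal sketch, assuming SciPy is installed; the data and label strings are made up.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Illustrative data and labels (assumptions of this sketch).
X = np.array([[0.0], [1.0], [5.0], [7.0]])
Z = linkage(X, method='single')
labels = ['a', 'b', 'c', 'd']

# With labels=None the leaf text is the observation index as a string.
R_default = dendrogram(Z, no_plot=True)

# With labels given, leaf j is labelled labels[j].
R_named = dendrogram(Z, labels=labels, no_plot=True)
print(R_named['ivl'])
```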
count_sort : string/bool For each node n, the order (visually, from left to right) in which n’s two descendent links are plotted is determined by this parameter, which can be any of the following values:
- False: nothing is done.
- ‘ascending’/True: the child with the minimum number of original objects in its cluster is plotted first.
- ‘descending’: the child with the maximum number of original objects in its cluster is plotted first.
Note distance_sort and count_sort cannot both be True.
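To see count_sort at work, the sketch below builds a tree whose root joins a single observation with a three-observation cluster and compares the two orderings. It assumes SciPy is installed and a version that accepts the string 'descending'; the data are made up.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Observation 3 (value 10.0) is far from the rest, so the root joins
# it, as a singleton, with the cluster {0, 1, 2}. Illustrative data.
X = np.array([[0.0], [1.0], [3.0], [10.0]])
Z = linkage(X, method='single')

asc = dendrogram(Z, count_sort='ascending', no_plot=True)
desc = dendrogram(Z, count_sort='descending', no_plot=True)

# Smaller child first: the singleton sits at the far left.
print(asc['leaves'])
# Larger child first: the singleton sits at the far right.
print(desc['leaves'])
```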
distance_sort : string/bool For each node n, the order (visually, from left to right) in which n’s two descendent links are plotted is determined by this parameter, which can be any of the following values:
- False: nothing is done.
- ‘ascending’/True: the child with the minimum distance between its direct descendents is plotted first.
- ‘descending’: the child with the maximum distance between its direct descendents is plotted first.
Note distance_sort and count_sort cannot both be True.
show_leaf_counts : bool When True, leaf nodes representing k>1 original observations are labeled with the number of observations they contain in parentheses.
no_plot : bool When True, the final rendering is not performed. This is useful if only the data structures computed for the rendering are needed or if matplotlib is not available.
no_labels : bool When True, no labels appear next to the leaf nodes in the rendering of the dendrogram.
leaf_rotation : double Specifies the angle (in degrees) to rotate the leaf labels. When unspecified, the rotation is based on the number of nodes in the dendrogram. (Default=0)
leaf_font_size : int Specifies the font size (in points) of the leaf labels. When unspecified, the size is based on the number of nodes in the dendrogram.
leaf_label_func : lambda or function When leaf_label_func is a callable function, it is called for each leaf with cluster index k < 2n-1. The function is expected to return a string with the label for the leaf.
Indices k < n correspond to original observations while indices k >= n correspond to non-singleton clusters.
For example, to label singletons with their node id and non-singletons with their id, count, and inconsistency coefficient, simply do:
    # Assumes Z is the linkage matrix, n = Z.shape[0] + 1 is the number
    # of original observations, and R = inconsistent(Z).
    # First define the leaf label function.
    def llf(id):
        if id < n:
            return str(id)
        else:
            # Row id - n of Z and R describes non-singleton cluster id;
            # Z[id - n, 3] is its count of original observations.
            return '[%d %d %1.2f]' % (id, Z[id - n, 3], R[id - n, 3])

    # The text for the leaf nodes is going to be big so force
    # a rotation of 90 degrees.
    dendrogram(Z, leaf_label_func=llf, leaf_rotation=90)
show_contracted : bool When True, the heights of non-singleton nodes contracted into a leaf node are plotted as crosses along the link connecting that leaf node. This is only useful when truncation is used (see the truncate_mode parameter).
link_color_func : lambda/function When a callable function, link_color_func is called with each non-singleton id corresponding to each U-shaped link it will paint. The function is expected to return the color to paint the link, encoded as a matplotlib color string code.
For example:
    dendrogram(Z, link_color_func=lambda k: colors[k])
colors the direct links below each untruncated non-singleton node k using colors[k].
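A fuller version of that call might look like the sketch below. It assumes SciPy is installed; the data, the colors dictionary, and the hex color values are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Illustrative data (an assumption of this sketch).
X = np.array([[0.0], [1.0], [5.0], [7.0]])
Z = linkage(X, method='single')
n = Z.shape[0] + 1

# One color per non-singleton node id n .. 2n-2 (hypothetical choices).
colors = {4: '#1f77b4', 5: '#ff7f0e', 6: '#2ca02c'}

R = dendrogram(Z, link_color_func=lambda k: colors[k], no_plot=True)

# Pair each link's color with the height of the top of its U.
color_by_height = {d[1]: c for c, d in zip(R['color_list'], R['dcoord'])}

# Node n + i is formed by row i of Z, at height Z[i, 2].
expected = {float(Z[i, 2]): colors[n + i] for i in range(n - 1)}
```

Comparing color_by_height with expected confirms that each link takes the color returned for its own node id.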
Returns: