stoclust

Clustering algorithms using stochastic analysis and ensemble techniques.

Hierarchy

Hierarchy(items_idx,cluster_idx,cluster_children)

A class for describing a hierarchical clustering.

Hierarchies are defined by three primary attributes: their Index of items, their Index of clusters, and a dict whose keys are cluster indices and whose values are tuples. The first element of each tuple is a scale parameter, which must increase from subset to superset; the second is an array containing the indices of the immediate child clusters of the key cluster.
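
As a rough illustration of the three pieces described above, the sketch below builds them from plain Python containers and checks the scale condition. It does not call stoclust: the lists stand in for Index objects, the helper scales_increase is hypothetical, and listing leaf clusters with an empty child list and a scale of 0.0 is an assumption made for the example.

```python
# Hypothetical toy data; plain lists stand in for stoclust's Index objects.
items = ["a", "b", "c", "d"]
clusters = [0, 1, 2, 3, 4, 5, 6]

# cluster index -> (scale, indices of immediate child clusters)
# Assumption: clusters 0-3 are leaves with no children; cluster 6 is the root.
cluster_children = {
    0: (0.0, []),
    1: (0.0, []),
    2: (0.0, []),
    3: (0.0, []),
    4: (1.0, [0, 1]),
    5: (2.0, [2, 3]),
    6: (3.0, [4, 5]),
}


def scales_increase(children):
    """Return True if every parent has a larger scale than each of its children."""
    return all(
        scale > children[child][0]
        for scale, kids in children.values()
        for child in kids
    )


print(scales_increase(cluster_children))  # True
```

A dict that gave the root a smaller scale than one of its children would fail this check, which is the condition the constructor's third argument is described as requiring.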

The attributes that can be obtained are self.items and self.clusters. Hierarchies act like dictionaries in that clusters may be used as keys. That is, for a Hierarchy H and a cluster c, H[c] returns an Index containing all items under cluster c. When iterated over, H yields tuples of the form (c,H[c]), much like the dictionary items() iterator. The length of a Hierarchy, len(H), is the number of distinct clusters.
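
The dictionary-like behaviour described above can be mimicked with a small stand-in class. This is only a sketch of the access pattern, not stoclust's implementation: ToyHierarchy, its leaf_items argument, and the convention that items are attached to leaf clusters are all assumptions, and plain lists are returned where the library returns an Index.

```python
class ToyHierarchy:
    """Illustrative stand-in for the access pattern; not stoclust's Hierarchy."""

    def __init__(self, leaf_items, children):
        # leaf_items: leaf cluster label -> list of items (assumed convention)
        # children: cluster label -> list of immediate child cluster labels
        self._leaf_items = leaf_items
        self._children = children

    def __getitem__(self, c):
        # Gather the items of every leaf reachable below cluster c.
        found = list(self._leaf_items.get(c, []))
        for child in self._children.get(c, []):
            found.extend(self[child])
        return found

    def __iter__(self):
        # Yield (cluster, items) pairs, much like dict.items().
        for c in self._children:
            yield c, self[c]

    def __len__(self):
        # Number of distinct clusters.
        return len(self._children)


H = ToyHierarchy(
    leaf_items={0: ["a"], 1: ["b"], 2: ["c"]},
    children={0: [], 1: [], 2: [], 3: [0, 1], 4: [3, 2]},
)

print(H[3])    # ['a', 'b']
print(H[4])    # ['a', 'b', 'c']
print(len(H))  # 5
for c, members in H:
    print(c, members)
```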

Attributes

Attribute | Visibility | Description
items | Public | An Index whose elements are the items organized into clusters by the Hierarchy.
clusters | Public | An Index whose elements are labels corresponding to the main clusters.
_children | Private | A dict whose keys are cluster indices and whose values are arrays containing the indices of child clusters. It is better to retrieve the clustering information either by treating the Hierarchy like a dictionary or through public methods such as cluster_children and cluster_groups.
_scales | Private | An array whose indices correspond to clusters and whose entries give the scale parameter of each cluster. It is better to retrieve this information through the public method get_scales and to modify it through the method set_scales.

Methods

  • at_scale

    Hierarchy.at_scale(scale)
    Returns the Aggregation corresponding to the coarsest partition made from clusters not exceeding the given scale.
  • cluster_children

    Hierarchy.cluster_children()
    Returns a dictionary where keys are cluster labels and the values are the labels of the immediate child clusters.
  • cluster_groups

    Hierarchy.cluster_groups()
    Returns a dictionary where keys are cluster labels and the values are a Group of all items under the key cluster.
  • clusters_containing

    Hierarchy.clusters_containing(items_list)
    Returns a Group containing all cluster labels for clusters that contain the given items.
  • get_scales

    Hierarchy.get_scales()
    Returns the array of scales, indexed by clusters.
  • get_ultrametric

    Hierarchy.get_ultrametric()
    Returns a Parisi matrix P of nested diagonal blocks, such that P[i,j] is the smallest scale at which i and j are in the same cluster.
  • join

    Hierarchy.join(cluster_list)
    Returns the Aggregation corresponding to the coarsest partition made from clusters not exceeding the given scale.
  • measure

    Hierarchy.measure(field,axis=0)
    Given a field (array of values over items), gives the partial sums of the field over each cluster in the form of an array whose indices correspond to clusters.
  • set_scales

    Hierarchy.set_scales(scales)
    Allows the user to define the scales through an array indexed according to clusters.
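
To make the descriptions of at_scale, get_ultrametric and measure more concrete, the sketch below reimplements the same ideas on a toy tree in plain Python. It is not the library's code: the functions toy_at_scale, toy_ultrametric and toy_measure are hypothetical, nested lists replace the Aggregation and array return types, and each leaf cluster k is assumed to hold the single item k.

```python
import math

# Hypothetical toy hierarchy: cluster -> (scale, immediate child clusters).
# Assumption: leaf cluster k holds exactly one item, also labelled k.
tree = {
    0: (0.0, []), 1: (0.0, []), 2: (0.0, []), 3: (0.0, []),
    4: (1.0, [0, 1]),
    5: (2.0, [2, 3]),
    6: (3.0, [4, 5]),
}


def items_under(c):
    """All items below cluster c (here, the labels of its leaves)."""
    _, kids = tree[c]
    if not kids:
        return [c]
    return [i for k in kids for i in items_under(k)]


def toy_at_scale(scale):
    """Coarsest partition built from clusters whose scale does not exceed `scale`."""
    all_children = {k for _, kids in tree.values() for k in kids}
    stack = [c for c in tree if c not in all_children]  # start from the roots
    blocks = []
    while stack:
        c = stack.pop()
        s, kids = tree[c]
        if s <= scale or not kids:
            # Keep this cluster whole (leaves are always kept in this toy).
            blocks.append(sorted(items_under(c)))
        else:
            # Too coarse for the requested scale: split into its children.
            stack.extend(kids)
    return sorted(blocks)


def toy_ultrametric():
    """P[i][j] is the smallest scale of any cluster containing both i and j."""
    n = sum(1 for _, kids in tree.values() if not kids)
    P = [[math.inf] * n for _ in range(n)]
    for c, (s, _) in tree.items():
        members = items_under(c)
        for i in members:
            for j in members:
                P[i][j] = min(P[i][j], s)
    return P


def toy_measure(field):
    """Sum of `field` over the items of each cluster, ordered by cluster label."""
    return [sum(field[i] for i in items_under(c)) for c in sorted(tree)]


print(toy_at_scale(1.5))     # [[0, 1], [2], [3]]
print(toy_ultrametric()[0])  # [0.0, 1.0, 3.0, 3.0]
print(toy_measure([1.0, 2.0, 3.0, 4.0]))
```

In the matrix produced by toy_ultrametric, items 0 and 1 first meet at scale 1.0 and items 2 and 3 at scale 2.0, while every cross pair meets only at the root scale 3.0, which gives the nested diagonal block pattern mentioned for get_ultrametric.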