Class BisectingKMeansClusteringAlgorithm

  • All Implemented Interfaces:
    AcceptingVisitor, ClusteringAlgorithm

    public class BisectingKMeansClusteringAlgorithm
    extends AttrComposite
    implements ClusteringAlgorithm
    A very simple implementation of bisecting k-means clustering. Unlike other algorithms in Carrot2, this one produces a hard clustering: each document belongs to exactly one cluster. On the other hand, clusters are labeled only with individual words, which may not fully describe every document in the cluster.
    • Field Detail

      • clusterCount

        public final AttrInteger clusterCount
        The number of clusters to create. The algorithm will create at most the specified number of clusters.
      • maxIterations

        public final AttrInteger maxIterations
        The maximum number of k-means iterations to perform.
      • partitionCount

        public final AttrInteger partitionCount
        Partition count. The number of partitions to create at each k-means clustering iteration.
      • labelCount

        public final AttrInteger labelCount
        Label count. The minimum number of labels to return for each cluster.
      • queryHint

        public final AttrString queryHint
        Query terms used to retrieve documents. The query is used as a hint to avoid trivial clusters.
      • useDimensionalityReduction

        public final AttrBoolean useDimensionalityReduction
        Use dimensionality reduction. If true, k-means is applied to the dimensionality-reduced term-document matrix, with the number of dimensions equal to twice the number of requested clusters. If that number of dimensions is not lower than the number of input documents, reduction is not performed. If false, k-means is performed directly on the original term-document matrix.
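
The following is a minimal, self-contained sketch of how the clusterCount, partitionCount and maxIterations attributes could interact in a bisecting k-means procedure. It is not the Carrot2 implementation. The class name BisectingKMeansSketch, the use of dense double vectors with squared Euclidean distance, the evenly spaced seeding and the "split the largest cluster first" rule are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; names and splitting policy are hypothetical, not Carrot2 code. */
final class BisectingKMeansSketch {

  /** Returns one cluster index per input point, so every point is in exactly one cluster. */
  static int[] cluster(double[][] points, int clusterCount, int partitionCount, int maxIterations) {
    List<List<Integer>> clusters = new ArrayList<>();
    List<Integer> all = new ArrayList<>();
    for (int i = 0; i < points.length; i++) all.add(i);
    clusters.add(all);

    while (clusters.size() < clusterCount) {
      // Assumption: always split the currently largest cluster.
      int largest = 0;
      for (int c = 1; c < clusters.size(); c++) {
        if (clusters.get(c).size() > clusters.get(largest).size()) largest = c;
      }
      List<Integer> toSplit = clusters.get(largest);
      if (toSplit.size() < 2) break; // nothing left to split

      // Cap the split so the total never exceeds clusterCount ("at most" semantics).
      int k = Math.min(partitionCount, clusterCount - clusters.size() + 1);
      k = Math.min(k, toSplit.size());
      List<List<Integer>> parts = kMeans(points, toSplit, k, maxIterations);
      if (parts.size() < 2) break; // e.g. all points identical; stop instead of looping forever

      clusters.remove(largest); // removes by index
      clusters.addAll(parts);
    }

    int[] labels = new int[points.length];
    for (int c = 0; c < clusters.size(); c++) {
      for (int idx : clusters.get(c)) labels[idx] = c;
    }
    return labels;
  }

  /** Plain k-means on a subset of points; returns only the non-empty partitions. */
  private static List<List<Integer>> kMeans(double[][] points, List<Integer> subset, int k, int maxIterations) {
    int n = subset.size();
    int dim = points[0].length;
    double[][] centroids = new double[k][];
    for (int j = 0; j < k; j++) {
      centroids[j] = points[subset.get(j * n / k)].clone(); // deterministic, evenly spaced seeds
    }
    int[] assignment = new int[n];
    Arrays.fill(assignment, -1);

    for (int iter = 0; iter < Math.max(1, maxIterations); iter++) {
      boolean changed = false;
      for (int i = 0; i < n; i++) {
        int best = nearest(points[subset.get(i)], centroids);
        if (best != assignment[i]) {
          assignment[i] = best;
          changed = true;
        }
      }
      if (!changed) break; // converged before reaching maxIterations

      double[][] sums = new double[k][dim];
      int[] counts = new int[k];
      for (int i = 0; i < n; i++) {
        double[] p = points[subset.get(i)];
        counts[assignment[i]]++;
        for (int d = 0; d < dim; d++) sums[assignment[i]][d] += p[d];
      }
      for (int j = 0; j < k; j++) {
        if (counts[j] == 0) continue; // keep the old centroid for an empty partition
        for (int d = 0; d < dim; d++) centroids[j][d] = sums[j][d] / counts[j];
      }
    }

    List<List<Integer>> parts = new ArrayList<>();
    for (int j = 0; j < k; j++) parts.add(new ArrayList<>());
    for (int i = 0; i < n; i++) parts.get(assignment[i]).add(subset.get(i));
    parts.removeIf(List::isEmpty);
    return parts;
  }

  /** Index of the centroid with the smallest squared Euclidean distance; ties go to the first. */
  private static int nearest(double[] p, double[][] centroids) {
    int best = 0;
    double bestDist = Double.MAX_VALUE;
    for (int j = 0; j < centroids.length; j++) {
      double dist = 0;
      for (int d = 0; d < p.length; d++) {
        double diff = p[d] - centroids[j][d];
        dist += diff * diff;
      }
      if (dist < bestDist) {
        bestDist = dist;
        best = j;
      }
    }
    return best;
  }
}
```

In this sketch, clusterCount acts as an upper bound: splitting stops early when the remaining clusters cannot be divided further, which mirrors the "at most the specified number of clusters" wording above. The real algorithm works on a term-document matrix and also derives cluster labels, neither of which is modeled here.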
    • Constructor Detail

      • BisectingKMeansClusteringAlgorithm

        public BisectingKMeansClusteringAlgorithm()