Class ClusterBuilder

  • All Implemented Interfaces:
    AcceptingVisitor

    public class ClusterBuilder
    extends AttrComposite
    Builds cluster labels based on the reduced term-document matrix and assigns documents to the labels.
    • Field Detail

      • phraseLabelBoost

        public AttrDouble phraseLabelBoost
        Phrase label boost. The weight of multi-word labels relative to one-word labels. Lower values produce more one-word labels; higher values favor multi-word labels.
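
        A minimal sketch of how a relative weight like this could act on label scores. It is an illustration only, not the library's implementation: the class PhraseLabelBoostSketch, the method boostedScore, and the multiplicative scoring are all assumptions.

```java
// Hypothetical illustration; none of these names come from the library.
final class PhraseLabelBoostSketch {

    /**
     * Scales the score of a multi-word label by the boost and leaves
     * one-word labels unchanged (assumed multiplicative weighting).
     */
    static double boostedScore(double baseScore, int wordCount, double phraseLabelBoost) {
        return wordCount > 1 ? baseScore * phraseLabelBoost : baseScore;
    }

    public static void main(String[] args) {
        double oneWord = boostedScore(0.6, 1, 1.5);
        double twoWords = boostedScore(0.5, 2, 1.5);
        // With a boost above 1, the multi-word label can outrank a
        // one-word label that had a higher base score.
        System.out.println("one-word: " + oneWord + ", two-word: " + twoWords);
    }
}
```

        Under this assumption, a boost below 1 has the opposite effect and pushes one-word labels ahead of phrases with similar base scores.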
      • phraseLengthPenaltyStart

        public AttrInteger phraseLengthPenaltyStart
        Phrase length penalty start. The phrase length at which overlong multi-word labels start to be penalized. Phrases shorter than phraseLengthPenaltyStart are not penalized.
      • phraseLengthPenaltyStop

        public AttrInteger phraseLengthPenaltyStop
        Phrase length penalty stop. The phrase length at which overlong multi-word labels are removed completely. Phrases longer than phraseLengthPenaltyStop are removed.
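
        The interaction between phraseLengthPenaltyStart and phraseLengthPenaltyStop can be sketched as a score multiplier. Only the two boundary rules come from the descriptions above (no penalty below the start, removal above the stop). The linear ramp in between, the class PhraseLengthPenaltySketch, and the method lengthPenalty are assumptions made for illustration; the actual penalty curve may differ.

```java
// Hypothetical illustration; the ramp shape is an assumption.
final class PhraseLengthPenaltySketch {

    /**
     * Returns a multiplier in [0, 1] for a label with the given word count:
     * 1.0 means "not penalized", 0.0 means "removed".
     */
    static double lengthPenalty(int length, int penaltyStart, int penaltyStop) {
        if (length < penaltyStart) {
            return 1.0; // shorter than the start: no penalty
        }
        if (length > penaltyStop) {
            return 0.0; // longer than the stop: removed
        }
        // Assumed linear ramp that decreases from the start length to the stop length.
        return (double) (penaltyStop + 1 - length) / (penaltyStop + 2 - penaltyStart);
    }

    public static void main(String[] args) {
        for (int length = 1; length <= 7; length++) {
            System.out.println(length + " words -> " + lengthPenalty(length, 3, 5));
        }
    }
}
```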
      • clusterMergingThreshold

        public AttrDouble clusterMergingThreshold
        Cluster merging threshold. The percentage overlap between two clusters' documents required for the clusters to be merged into one cluster. Low values result in more aggressive merging, which may lead to irrelevant documents in clusters. High values result in fewer clusters being merged, which may lead to very similar or duplicated clusters.
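
        The effect of the threshold can be sketched with a simple overlap test on document id sets. The page does not define how overlap is computed, so this sketch assumes the shared documents are measured against the smaller cluster; the class ClusterMergingSketch and its methods are hypothetical names, not library API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; the overlap formula is an assumption.
final class ClusterMergingSketch {

    /** Fraction of the smaller cluster's documents that also occur in the other cluster. */
    static double overlap(Set<Integer> a, Set<Integer> b) {
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0;
        }
        Set<Integer> shared = new HashSet<>(a);
        shared.retainAll(b); // keep only the documents present in both clusters
        return (double) shared.size() / Math.min(a.size(), b.size());
    }

    /** Two clusters are merge candidates when their overlap reaches the threshold. */
    static boolean shouldMerge(Set<Integer> a, Set<Integer> b, double threshold) {
        return overlap(a, b) >= threshold;
    }

    public static void main(String[] args) {
        Set<Integer> first = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> second = new HashSet<>(Arrays.asList(3, 4, 5));
        // Two of the three documents in the smaller cluster are shared.
        System.out.println("merge at 0.5: " + shouldMerge(first, second, 0.5));
        System.out.println("merge at 0.7: " + shouldMerge(first, second, 0.7));
    }
}
```

        In this sketch the same pair of clusters merges at a low threshold and stays separate at a high one, which matches the trade-off described above.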
      • labelAssigner

        public LabelAssigner labelAssigner
        Cluster label assignment method.
    • Constructor Detail

      • ClusterBuilder

        public ClusterBuilder()