Class ClusterBuilder

java.lang.Object
org.carrot2.attrs.AttrComposite
org.carrot2.clustering.lingo.ClusterBuilder
All Implemented Interfaces:
AcceptingVisitor

public class ClusterBuilder
extends AttrComposite
Builds cluster labels based on the reduced term-document matrix and assigns documents to the labels.
  • Field Details

    • phraseLabelBoost

      public AttrDouble phraseLabelBoost
      Phrase label boost. The weight of multi-word labels relative to one-word labels. Low values will result in more one-word labels being produced; higher values will favor multi-word labels.
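As an illustration of how such a boost can shift label selection, here is a minimal sketch. It assumes the boost is a plain multiplier applied to the raw score of any label with more than one word; the class `PhraseLabelBoostSketch`, its methods, and the scores are hypothetical and are not taken from this class's implementation.

```java
// Hypothetical sketch: not the library's scoring code.
class PhraseLabelBoostSketch {
    /** Multiplies the raw score by the boost when the label has more than one word (assumed rule). */
    static double boostedScore(String label, double rawScore, double phraseLabelBoost) {
        boolean multiWord = label.trim().split("\\s+").length > 1;
        return multiWord ? rawScore * phraseLabelBoost : rawScore;
    }

    /** Returns the candidate label with the highest boosted score. */
    static String pick(String[] labels, double[] rawScores, double phraseLabelBoost) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < labels.length; i++) {
            double score = boostedScore(labels[i], rawScores[i], phraseLabelBoost);
            if (score > bestScore) {
                bestScore = score;
                best = labels[i];
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] labels = {"clustering", "document clustering"};
        double[] rawScores = {0.8, 0.6}; // made-up scores for illustration
        System.out.println(pick(labels, rawScores, 1.0)); // boost of 1.0 leaves scores unchanged
        System.out.println(pick(labels, rawScores, 1.5)); // larger boost favors the phrase
    }
}
```

Under this assumed rule, a boost of 1.0 leaves the one-word label ahead, while 1.5 lifts the two-word label above it.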
    • phraseLengthPenaltyStart

      public AttrInteger phraseLengthPenaltyStart
      Phrase length penalty start. The phrase length at which overlong multi-word labels start to be penalized. Phrases shorter than phraseLengthPenaltyStart will not be penalized.
    • phraseLengthPenaltyStop

      public AttrInteger phraseLengthPenaltyStop
      Phrase length penalty stop. The phrase length beyond which overlong multi-word labels are removed completely. Phrases longer than phraseLengthPenaltyStop will be removed.
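The interplay of the two penalty bounds can be sketched as a score multiplier. The documentation only fixes the two ends (no penalty below the start, removal above the stop); the linear ramp in between is an assumption made for this example, and `PhraseLengthPenaltySketch` and `penalty` are hypothetical names rather than part of this class.

```java
// Hypothetical sketch: the shape of the penalty between the two bounds is assumed to be linear.
class PhraseLengthPenaltySketch {
    /**
     * Returns a multiplier in [0, 1] for a label of the given length (in words).
     * 1.0 means no penalty, 0.0 means the label is removed.
     */
    static double penalty(int phraseLength, int penaltyStart, int penaltyStop) {
        if (phraseLength < penaltyStart) {
            return 1.0; // shorter than the start: not penalized
        }
        if (phraseLength > penaltyStop) {
            return 0.0; // longer than the stop: removed
        }
        // Assumed linear ramp: the penalty begins at penaltyStart and
        // the multiplier is still above zero at penaltyStop.
        int steps = penaltyStop - penaltyStart + 2;
        return 1.0 - (phraseLength - penaltyStart + 1) / (double) steps;
    }

    public static void main(String[] args) {
        // Illustrative bounds only; these are not the attribute defaults.
        for (int length = 3; length <= 7; length++) {
            System.out.println(length + " words -> " + penalty(length, 4, 6));
        }
    }
}
```

With the illustrative bounds 4 and 6, a three-word label keeps its full score, labels of four to six words are scaled down progressively, and a seven-word label is dropped.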
    • clusterMergingThreshold

      public AttrDouble clusterMergingThreshold
      Cluster merging threshold. The percentage overlap between two clusters' documents required for the clusters to be merged into one cluster. Low values will result in more aggressive merging, which may lead to irrelevant documents in clusters. High values will result in fewer clusters being merged, which may lead to very similar or duplicated clusters.
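To make the effect of the threshold concrete, the following sketch compares two clusters by their document identifiers. It assumes the overlap is the number of shared documents divided by the size of the smaller cluster, expressed as a fraction between 0 and 1; the actual overlap measure is not stated here, and `ClusterMergeSketch` with its methods is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: the overlap measure below is an assumption, not the library's definition.
class ClusterMergeSketch {
    /** Fraction of the smaller cluster's documents that also appear in the other cluster. */
    static double overlap(Set<Integer> a, Set<Integer> b) {
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0;
        }
        Set<Integer> shared = new HashSet<>(a); // copy so the inputs are not modified
        shared.retainAll(b);
        return (double) shared.size() / Math.min(a.size(), b.size());
    }

    /** Two clusters are merged when their overlap reaches the threshold. */
    static boolean shouldMerge(Set<Integer> a, Set<Integer> b, double clusterMergingThreshold) {
        return overlap(a, b) >= clusterMergingThreshold;
    }

    public static void main(String[] args) {
        Set<Integer> first = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> second = new HashSet<>(Arrays.asList(3, 4, 5));
        // Two of the smaller cluster's three documents are shared.
        System.out.println(shouldMerge(first, second, 0.5)); // low threshold: merged
        System.out.println(shouldMerge(first, second, 0.7)); // high threshold: kept apart
    }
}
```

The same pair of clusters is merged at a threshold of 0.5 and kept separate at 0.7, which mirrors the trade-off described above: lower thresholds merge more aggressively, higher ones leave near-duplicate clusters in place.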
    • labelAssigner

      public LabelAssigner labelAssigner
      Cluster label assignment method.
  • Constructor Details