Package org.carrot2.clustering.lingo
Class UniqueLabelAssigner
- java.lang.Object
- org.carrot2.attrs.AttrComposite
- org.carrot2.clustering.lingo.UniqueLabelAssigner
- All Implemented Interfaces: AcceptingVisitor, LabelAssigner
public class UniqueLabelAssigner extends AttrComposite implements LabelAssigner
Assigns a unique label to each base vector using a greedy algorithm. For each base vector, it chooses the label that has not been previously selected and that maximizes the cosine similarity between the base vector and the label's term vector. Once a label is selected, it is not used to label any other vector. Because this algorithm does not create duplicate cluster labels, this assignment method usually creates more clusters than SimpleLabelAssigner. This method is slightly slower than SimpleLabelAssigner.
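The greedy selection described above can be sketched in plain Java. This is an illustration only, not the Carrot2 implementation: the class name GreedyUniqueLabels, the method assign, the cos[v][l] matrix layout (rows are base vectors, columns are candidate labels), the index-order traversal of base vectors, and the -1 marker for "no label left" are all assumptions made for the example.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and matrix layout are hypothetical. */
final class GreedyUniqueLabels {
  /**
   * @param cos cos[v][l] is the cosine similarity between base vector v
   *            and candidate label l (assumed layout)
   * @return for each base vector, the index of its label, or -1 if every
   *         label had already been taken
   */
  static int[] assign(double[][] cos) {
    int labelCount = cos.length == 0 ? 0 : cos[0].length;
    boolean[] used = new boolean[labelCount];
    int[] assigned = new int[cos.length];
    Arrays.fill(assigned, -1);

    for (int v = 0; v < cos.length; v++) {
      // Find the most similar label that no earlier vector has claimed.
      int best = -1;
      for (int l = 0; l < labelCount; l++) {
        if (!used[l] && (best < 0 || cos[v][l] > cos[v][best])) {
          best = l;
        }
      }
      if (best >= 0) {
        used[best] = true; // a selected label is never reused
        assigned[v] = best;
      }
    }
    return assigned;
  }

  public static void main(String[] args) {
    double[][] cos = {
      {0.9, 0.1, 0.3},
      {0.8, 0.7, 0.2}
    };
    // Vector 1 is most similar to label 0, but label 0 is already taken.
    System.out.println(Arrays.toString(assign(cos))); // prints [0, 1]
  }
}
```

In this sketch the second vector falls back to its second-best label, which is how the uniqueness constraint can yield more distinct cluster labels than an assignment that allows duplicates.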
-
Field Summary
-
Fields inherited from class org.carrot2.attrs.AttrComposite
attributes
-
Constructor Summary
Constructor: UniqueLabelAssigner()
-
Method Summary
Modifier and Type: void
Method: assignLabels(LingoProcessingContext context, org.carrot2.math.mahout.matrix.DoubleMatrix2D stemCos, com.carrotsearch.hppc.IntIntHashMap filteredRowToStemIndex, org.carrot2.math.mahout.matrix.DoubleMatrix2D phraseCos)
Description: Assigns labels to base vectors found by the matrix factorization.
-
Methods inherited from class org.carrot2.attrs.AttrComposite
accept
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.carrot2.attrs.AcceptingVisitor
accept
-
Method Detail
-
assignLabels
public void assignLabels(LingoProcessingContext context, org.carrot2.math.mahout.matrix.DoubleMatrix2D stemCos, com.carrotsearch.hppc.IntIntHashMap filteredRowToStemIndex, org.carrot2.math.mahout.matrix.DoubleMatrix2D phraseCos)
Description copied from interface: LabelAssigner
Assigns labels to base vectors found by the matrix factorization. The results must be stored in the LingoProcessingContext.clusterLabelFeatureIndex and LingoProcessingContext.clusterLabelScore arrays.
- Specified by:
assignLabels in interface LabelAssigner
- Parameters:
context - contains all information about the current clustering request
stemCos - base vector -- single stems cosine matrix
filteredRowToStemIndex - mapping between row indices of stemCos and indices of stems in PreprocessingContext.allStems
phraseCos - base vector -- phrase cosine matrix