Package org.carrot2.text.vsm
Class VectorSpaceModelContext
- java.lang.Object
  - org.carrot2.text.vsm.VectorSpaceModelContext
public class VectorSpaceModelContext extends Object
Stores data related to the Vector Space Model of the processed documents.
-
Field Summary
Fields
| Modifier and Type | Field | Description |
| --- | --- | --- |
| PreprocessingContext | preprocessingContext | Preprocessing context for the underlying documents. |
| com.carrotsearch.hppc.IntIntHashMap | stemToRowIndex | Stem index to row index mapping for the term-document matrix (termDocumentMatrix). |
| org.carrot2.math.mahout.matrix.DoubleMatrix2D | termDocumentMatrix | Term-document matrix. |
| org.carrot2.math.mahout.matrix.DoubleMatrix2D | termPhraseMatrix | Term-document-like matrix for phrases from PreprocessingContext.AllLabels. |
-
Constructor Summary
Constructors
| Constructor | Description |
| --- | --- |
| VectorSpaceModelContext(PreprocessingContext preprocessingContext) | Creates a vector space model context with the provided preprocessing context. |
-
Field Detail
-
preprocessingContext
public final PreprocessingContext preprocessingContext
Preprocessing context for the underlying documents.
-
termDocumentMatrix
public org.carrot2.math.mahout.matrix.DoubleMatrix2D termDocumentMatrix
Term-document matrix. Rows of the matrix correspond to word stems, columns correspond to the processed documents. For mapping between rows of this matrix and PreprocessingContext.AllStems, see stemToRowIndex.
This matrix is produced by TermDocumentMatrixBuilder.buildTermDocumentMatrix(VectorSpaceModelContext).
-
termPhraseMatrix
public org.carrot2.math.mahout.matrix.DoubleMatrix2D termPhraseMatrix
Term-document-like matrix for phrases from PreprocessingContext.AllLabels. If there are no phrases in PreprocessingContext.AllLabels, the phrase matrix is null. For mapping between rows of this matrix and PreprocessingContext.AllStems, see stemToRowIndex.
This matrix is produced by TermDocumentMatrixBuilder.buildTermPhraseMatrix(VectorSpaceModelContext).
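Because the phrase matrix can be null, callers may want to guard every access to it. The following is a minimal sketch of such a guard, not Carrot2 code: the nested Context class is a hypothetical stand-in for VectorSpaceModelContext, a plain double[][] stands in for DoubleMatrix2D, and the reading of columns as phrases is an assumption made by analogy with the term-document matrix.

```java
public class PhraseMatrixGuard {

    /** Hypothetical stand-in for VectorSpaceModelContext; holds only the field discussed here. */
    static final class Context {
        /** Stand-in for DoubleMatrix2D; null when there are no phrases. */
        double[][] termPhraseMatrix;
    }

    /**
     * Returns the number of columns in the phrase matrix, or 0 when the
     * matrix is null or has no rows. Columns are assumed to be phrases.
     */
    static int phraseCount(Context ctx) {
        double[][] matrix = ctx.termPhraseMatrix;
        if (matrix == null || matrix.length == 0) {
            return 0;
        }
        return matrix[0].length;
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        // No phrase matrix was built: the guard reports zero phrases instead of failing.
        System.out.println("phrases without matrix: " + phraseCount(ctx));

        // Two stem rows, three phrase columns (made-up weights).
        ctx.termPhraseMatrix = new double[][] {
            {1.0, 0.0, 2.0},
            {0.0, 3.0, 0.0}
        };
        System.out.println("phrases with matrix: " + phraseCount(ctx));
    }
}
```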
-
stemToRowIndex
public com.carrotsearch.hppc.IntIntHashMap stemToRowIndex
Stem index to row index mapping for the term-document matrix (termDocumentMatrix). Keys in this map are indices of entries in the PreprocessingContext.AllStems arrays; values are the indices of the termDocumentMatrix rows corresponding to those stems. Note that, depending on the limit on the size of the matrix, some stems may not have a corresponding matrix row.
This object is produced by TermDocumentMatrixBuilder.buildTermDocumentMatrix(VectorSpaceModelContext).
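Since some stems may have no matrix row, a lookup through the mapping needs to handle the missing case explicitly. The sketch below illustrates the idea with standard-library stand-ins only: a java.util.Map replaces the HPPC IntIntHashMap and a double[][] replaces DoubleMatrix2D. The class name StemRowLookup and the method weight are hypothetical and are not part of Carrot2.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalDouble;

public class StemRowLookup {

    /**
     * Looks up the weight of a stem in a document.
     * Returns an empty result when the stem has no row in the matrix.
     *
     * @param stemToRowIndex     stem index to row index mapping (stand-in for IntIntHashMap)
     * @param termDocumentMatrix rows are stems, columns are documents (stand-in for DoubleMatrix2D)
     */
    static OptionalDouble weight(Map<Integer, Integer> stemToRowIndex,
                                 double[][] termDocumentMatrix,
                                 int stemIndex,
                                 int documentIndex) {
        Integer row = stemToRowIndex.get(stemIndex);
        if (row == null) {
            // The stem was left out of the matrix, for example because of the size limit.
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(termDocumentMatrix[row][documentIndex]);
    }

    public static void main(String[] args) {
        // Made-up data: stems 0 and 2 have rows, stem 1 does not.
        Map<Integer, Integer> stemToRowIndex = new HashMap<>();
        stemToRowIndex.put(0, 1);
        stemToRowIndex.put(2, 0);

        double[][] termDocumentMatrix = {
            {0.5, 0.0, 1.5},  // row 0, stem 2
            {2.0, 0.25, 0.0}  // row 1, stem 0
        };

        System.out.println(weight(stemToRowIndex, termDocumentMatrix, 2, 2));
        System.out.println(weight(stemToRowIndex, termDocumentMatrix, 1, 0));
    }
}
```

With a primitive map such as IntIntHashMap, a missing key cannot be signalled by null, so the presence check would presumably be made with a call such as containsKey before reading the value.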
-
Constructor Detail
-
VectorSpaceModelContext
public VectorSpaceModelContext(PreprocessingContext preprocessingContext)
Creates a vector space model context with the provided preprocessing context.
-