Package org.carrot2.text.vsm
Class VectorSpaceModelContext
java.lang.Object
    org.carrot2.text.vsm.VectorSpaceModelContext

public class VectorSpaceModelContext extends Object
Stores data related to the Vector Space Model of the processed documents.
Field Summary
Fields (Modifier and Type, Field, Description)
PreprocessingContext preprocessingContext
    Preprocessing context for the underlying documents.
com.carrotsearch.hppc.IntIntHashMap stemToRowIndex
    Stem index to row index mapping for the termDocumentMatrix.
org.carrot2.math.mahout.matrix.DoubleMatrix2D termDocumentMatrix
    Term-document matrix.
org.carrot2.math.mahout.matrix.DoubleMatrix2D termPhraseMatrix
    Term-document-like matrix for phrases from PreprocessingContext.AllLabels.
-
Constructor Summary
Constructors (Constructor, Description)
VectorSpaceModelContext(PreprocessingContext preprocessingContext)
    Creates a vector space model context with the provided preprocessing context.
Field Detail
-
preprocessingContext
public final PreprocessingContext preprocessingContext
Preprocessing context for the underlying documents.
-
termDocumentMatrix
public org.carrot2.math.mahout.matrix.DoubleMatrix2D termDocumentMatrix
Term-document matrix. Rows of the matrix correspond to word stems, columns correspond to the processed documents. For mapping between rows of this matrix and PreprocessingContext.AllStems, see stemToRowIndex.
This matrix is produced by TermDocumentMatrixBuilder.buildTermDocumentMatrix(VectorSpaceModelContext).
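The row and column layout described above can be illustrated with a small standalone sketch. It does not use Carrot2 classes: a plain double[][] stands in for DoubleMatrix2D, whitespace-free tokens stand in for stems, and the class and method names (TermDocumentLayoutSketch, buildCounts) are hypothetical. Raw occurrence counts are used purely for illustration; the values that TermDocumentMatrixBuilder actually stores may be weighted or filtered differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TermDocumentLayoutSketch {

    /**
     * Builds a raw-count matrix where matrix[row][column] is the number of
     * times stem number `row` occurs in document number `column`.
     * (Hypothetical helper, not part of Carrot2.)
     */
    static double[][] buildCounts(List<String> stems, List<String[]> documents) {
        // Map each stem to the row it occupies in the matrix.
        Map<String, Integer> rowOfStem = new HashMap<>();
        for (int row = 0; row < stems.size(); row++) {
            rowOfStem.put(stems.get(row), row);
        }

        // Rows correspond to stems, columns correspond to documents.
        double[][] matrix = new double[stems.size()][documents.size()];
        for (int column = 0; column < documents.size(); column++) {
            for (String token : documents.get(column)) {
                Integer row = rowOfStem.get(token);
                if (row != null) {
                    matrix[row][column] += 1.0;
                }
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        List<String> stems = Arrays.asList("cluster", "search");
        List<String[]> documents = new ArrayList<>();
        documents.add(new String[] {"cluster", "search", "cluster"});
        documents.add(new String[] {"search", "engine"});

        double[][] matrix = buildCounts(stems, documents);
        for (int row = 0; row < matrix.length; row++) {
            System.out.println(stems.get(row) + " " + Arrays.toString(matrix[row]));
        }
    }
}
```

Tokens that are not in the stem list (such as "engine" in the second document) simply get no row, which loosely mirrors the note that some stems may not have a corresponding matrix row.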
-
termPhraseMatrix
public org.carrot2.math.mahout.matrix.DoubleMatrix2D termPhraseMatrix
Term-document-like matrix for phrases from PreprocessingContext.AllLabels. If there are no phrases in PreprocessingContext.AllLabels, the phrase matrix is null. For mapping between rows of this matrix and PreprocessingContext.AllStems, see stemToRowIndex.
This matrix is produced by TermDocumentMatrixBuilder.buildTermPhraseMatrix(VectorSpaceModelContext).
-
stemToRowIndex
public com.carrotsearch.hppc.IntIntHashMap stemToRowIndex
Stem index to row index mapping for the termDocumentMatrix. Keys in this map are indices of entries in the PreprocessingContext.AllStems arrays, values are the indices of the termDocumentMatrix rows corresponding to the stems. Please note that, depending on the limit on the size of the matrix, some stems may not have corresponding matrix rows.
This object is produced by TermDocumentMatrixBuilder.buildTermDocumentMatrix(VectorSpaceModelContext).
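A lookup through this mapping might look like the following standalone sketch. It deliberately avoids Carrot2 and HPPC types: java.util.Map stands in for IntIntHashMap, a double[][] stands in for the matrix, and the names StemRowLookupSketch and rowForStem are hypothetical. The point it illustrates is that a stem index must be checked for presence in the map before its row is read, because not every stem is guaranteed a row.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class StemRowLookupSketch {

    /**
     * Returns the matrix row for the given stem index, or null when the stem
     * has no row (for example because the matrix size limit excluded it).
     * (Hypothetical helper, not part of Carrot2.)
     */
    static double[] rowForStem(Map<Integer, Integer> stemToRowIndex,
                               double[][] matrix,
                               int stemIndex) {
        Integer row = stemToRowIndex.get(stemIndex);
        return row == null ? null : matrix[row];
    }

    public static void main(String[] args) {
        // Stems 0 and 2 have rows; stem 1 was left out of the matrix.
        Map<Integer, Integer> stemToRowIndex = new HashMap<>();
        stemToRowIndex.put(0, 0);
        stemToRowIndex.put(2, 1);

        double[][] matrix = {
            {1.0, 0.0},
            {0.0, 3.0}
        };

        for (int stemIndex = 0; stemIndex < 3; stemIndex++) {
            double[] row = rowForStem(stemToRowIndex, matrix, stemIndex);
            System.out.println("stem " + stemIndex + " -> "
                + (row == null ? "no row" : Arrays.toString(row)));
        }
    }
}
```

With a primitive-keyed map such as the HPPC one, a missing key cannot be signalled by null, so an explicit presence check before the read would likely be the analogous step there.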
Constructor Detail
-
VectorSpaceModelContext
public VectorSpaceModelContext(PreprocessingContext preprocessingContext)
Creates a vector space model context with the provided preprocessing context.