Package org.carrot2.text.preprocessing
Class PreprocessingContext.AllPhrases
java.lang.Object
org.carrot2.text.preprocessing.PreprocessingContext.AllPhrases
Enclosing class: PreprocessingContext
public class PreprocessingContext.AllPhrases extends Object
Information about all frequently appearing sequences of words found in the input documents.
Each entry in each array corresponds to one sequence.
All arrays in this class have the same length, and values at the same index in different arrays describe the same sequence.
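The parallel-array layout described above can be illustrated with a small standalone sketch. It does not use the Carrot2 classes: the class ParallelPhraseArrays, the method summarize, and the sample values are hypothetical, and only the names wordIndices and tf mirror the fields documented on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the parallel-array layout; not the Carrot2 class.
class ParallelPhraseArrays {
    /**
     * Builds one summary line per phrase. Index i in every array refers to
     * the same phrase, so both arrays must have the same length.
     */
    static List<String> summarize(int[][] wordIndices, int[] tf) {
        if (wordIndices.length != tf.length) {
            throw new IllegalArgumentException("Parallel arrays must have equal length");
        }
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < tf.length; i++) {
            lines.add("phrase " + i + ": words=" + wordIndices[i].length + ", tf=" + tf[i]);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up sample data describing two phrases.
        int[][] wordIndices = { {0, 1}, {1, 2, 3} };
        int[] tf = { 5, 3 };
        summarize(wordIndices, tf).forEach(System.out::println);
    }
}
```

Reading a phrase always means reading the same index from each array; there is no per-phrase object in this layout.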
Field Summary
Fields
Modifier and Type    Field           Description
int[]                tf              Term frequency of the phrase.
int[][]              tfByDocument    Term frequency of the phrase for each document.
int[][]              wordIndices     Pointers to PreprocessingContext.AllWords for each word in the phrase sequence.
Constructor Summary
Constructors
Constructor     Description
AllPhrases()
Method Summary
Modifier and Type    Method                  Description
CharSequence         getPhrase(int index)    Returns space-separated words that constitute this phrase.
int                  size()                  Returns length of all arrays in this PreprocessingContext.AllPhrases.
String               toString()              For debugging purposes.
Field Details
wordIndices
public int[][] wordIndices
Pointers to PreprocessingContext.AllWords for each word in the phrase sequence.
This array is produced by PhraseExtractor.
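As a rough illustration of how one row of wordIndices can be turned into readable text, the sketch below resolves each pointer against a word table and joins the results with spaces, which is what getPhrase(int index) is documented to return. This is an assumption-laden stand-in, not the library's implementation: the class PhraseText, the method phraseText, and the String[] word table are hypothetical, and the real word data lives in PreprocessingContext.AllWords.

```java
// Hypothetical stand-in; the real word data lives in PreprocessingContext.AllWords.
class PhraseText {
    /** Joins the words referenced by one row of wordIndices with single spaces. */
    static String phraseText(String[] words, int[] phraseWordIndices) {
        StringBuilder sb = new StringBuilder();
        for (int j = 0; j < phraseWordIndices.length; j++) {
            if (j > 0) {
                sb.append(' ');
            }
            // Each entry is a pointer (an index) into the word table.
            sb.append(words[phraseWordIndices[j]]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up word table and phrase rows.
        String[] words = { "data", "mining", "text", "clustering" };
        int[][] wordIndices = { {0, 1}, {2, 3} };
        for (int[] row : wordIndices) {
            System.out.println(phraseText(words, row));
        }
    }
}
```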
tf
public int[] tf
Term frequency of the phrase.
This array is produced by PhraseExtractor.
tfByDocument
public int[][] tfByDocument
Term frequency of the phrase for each document. The encoding of this array is similar to PreprocessingContext.AllWords.tfByDocument: each row holds consecutive (document index, frequency) pairs.
This array is produced by PhraseExtractor. The order of documents in this array is not defined.
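The pair encoding can be decoded with a short loop. The following is a minimal sketch, assuming a row has the form [doc0, freq0, doc1, freq1, ...] as described above; the class TfByDocumentDecoder, the method decode, and the sample row are hypothetical and not part of Carrot2.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating the (document index, frequency) pair encoding.
class TfByDocumentDecoder {
    /** Decodes [doc0, freq0, doc1, freq1, ...] into a document index to frequency map. */
    static Map<Integer, Integer> decode(int[] encoded) {
        if (encoded.length % 2 != 0) {
            throw new IllegalArgumentException("Expected an even number of entries");
        }
        Map<Integer, Integer> result = new LinkedHashMap<>();
        for (int i = 0; i < encoded.length; i += 2) {
            // Even positions hold the document index, odd positions the frequency.
            result.put(encoded[i], encoded[i + 1]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up row: the phrase occurs in documents 4, 0 and 7.
        int[] row = { 4, 2, 0, 1, 7, 3 };
        decode(row).forEach((doc, freq) ->
            System.out.println("document " + doc + ": " + freq));
    }
}
```

Because the order of documents within a row is not defined, a map keyed by document index is a safer way to look up a frequency than relying on the position of a pair.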
Constructor Details
AllPhrases
public AllPhrases()
Method Details