Package org.carrot2.text.preprocessing
Class PreprocessingContext.AllPhrases
java.lang.Object
org.carrot2.text.preprocessing.PreprocessingContext.AllPhrases
Enclosing class: PreprocessingContext
public class PreprocessingContext.AllPhrases extends Object
Information about all frequently appearing sequences of words found in the input documents.
Each entry in each array corresponds to one sequence.
All arrays in this class have the same length, and values at the same index in different arrays refer to the same phrase.
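The parallel-array layout described above can be illustrated with a minimal sketch. It does not use the Carrot2 classes: the local arrays `tf` and `wordIndices` are hypothetical stand-ins that only mirror the shape of the fields of the same name, and `ParallelArraysSketch` and `mostFrequentPhrase` are names invented for this illustration.

```java
import java.util.Arrays;

/** Sketch only: plain arrays stand in for the fields of AllPhrases. */
class ParallelArraysSketch {
    /** Returns the index of the phrase with the highest frequency, or -1 if there are no phrases. */
    static int mostFrequentPhrase(int[] tf) {
        int best = -1;
        for (int i = 0; i < tf.length; i++) {
            // Strict comparison keeps the first index when frequencies tie.
            if (best < 0 || tf[i] > tf[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical data: two phrases; index i means the same phrase in every array.
        int[][] wordIndices = { {0, 1}, {2, 3, 4} };
        int[] tf = { 5, 2 };

        int i = mostFrequentPhrase(tf);
        System.out.println("phrase " + i
                + " wordIndices=" + Arrays.toString(wordIndices[i])
                + " tf=" + tf[i]);
    }
}
```

Because an index found in one array is valid in every other array, no separate lookup structure is needed to move from a phrase's frequency to its words.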
Field Summary
Modifier and Type   Field          Description
int[]               tf             Term frequency of the phrase.
int[][]             tfByDocument   Term frequency of the phrase for each document.
int[][]             wordIndices    Pointers to PreprocessingContext.AllWords for each word in the phrase sequence.
Constructor Summary
Constructor    Description
AllPhrases()
Method Summary
Modifier and Type   Method                 Description
CharSequence        getPhrase(int index)   Returns space-separated words that constitute this phrase.
int                 size()                 Returns length of all arrays in this PreprocessingContext.AllPhrases.
String              toString()             For debugging purposes.
Field Details
wordIndices
public int[][] wordIndices
Pointers to PreprocessingContext.AllWords for each word in the phrase sequence.
This array is produced by PhraseExtractor.
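As a rough illustration of how one row of wordIndices maps to phrase text, the sketch below joins the referenced words with spaces, which is the form getPhrase(int index) is documented to return. It is not the library's implementation: `wordImages` is a hypothetical array standing in for the word data held by PreprocessingContext.AllWords, and `PhraseTextSketch` and `phraseText` are names invented here.

```java
/** Sketch only: resolves word pointers against a stand-in word table. */
class PhraseTextSketch {
    /**
     * Joins the words referenced by one wordIndices row with single spaces.
     * wordImages is a hypothetical substitute for the data in AllWords.
     */
    static String phraseText(int[] phraseWordIndices, String[] wordImages) {
        StringBuilder sb = new StringBuilder();
        for (int j = 0; j < phraseWordIndices.length; j++) {
            if (j > 0) {
                sb.append(' ');
            }
            sb.append(wordImages[phraseWordIndices[j]]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical word table and one phrase made of words 2 and 0.
        String[] wordImages = { "mining", "web", "data" };
        int[] phrase = { 2, 0 };
        System.out.println(phraseText(phrase, wordImages));
    }
}
```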
tf
public int[] tf
Term frequency of the phrase.
This array is produced by PhraseExtractor.
tfByDocument
public int[][] tfByDocument
Term frequency of the phrase for each document. The encoding of this array is similar to PreprocessingContext.AllWords.tfByDocument: consecutive pairs of document index and frequency.
This array is produced by PhraseExtractor. The order of documents in this array is not defined.
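The pair encoding can be decoded as in the following minimal sketch. It assumes that one row of tfByDocument has the form {documentIndex, frequency, documentIndex, frequency, ...}, as the description above states. `TfByDocumentSketch` and `decode` are names invented for this illustration and are not part of Carrot2.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: decodes one row of a pair-encoded tfByDocument array. */
class TfByDocumentSketch {
    /**
     * Turns {doc, freq, doc, freq, ...} into a map from document index to frequency.
     * Entries keep the order of the input, which carries no meaning because the
     * order of documents in the array is not defined.
     */
    static Map<Integer, Integer> decode(int[] pairs) {
        if (pairs.length % 2 != 0) {
            throw new IllegalArgumentException("expected an even number of values");
        }
        Map<Integer, Integer> byDocument = new LinkedHashMap<>();
        for (int k = 0; k < pairs.length; k += 2) {
            byDocument.put(pairs[k], pairs[k + 1]);
        }
        return byDocument;
    }

    public static void main(String[] args) {
        // Hypothetical row: the phrase occurs twice in document 3 and once in document 0.
        int[] row = { 3, 2, 0, 1 };
        for (Map.Entry<Integer, Integer> e : decode(row).entrySet()) {
            System.out.println("document " + e.getKey() + ": " + e.getValue());
        }
    }
}
```

Since the document order is undefined, code that needs a particular document should look it up by index, as the map allows, instead of relying on its position in the row.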
Constructor Details
AllPhrases
public AllPhrases()
Method Details