Package org.carrot2.text.preprocessing
Class PreprocessingContext.AllPhrases
java.lang.Object
    org.carrot2.text.preprocessing.PreprocessingContext.AllPhrases

Enclosing class: PreprocessingContext
public class PreprocessingContext.AllPhrases extends Object
Information about all frequently appearing sequences of words found in the input documents. Each entry in each array corresponds to one sequence. All arrays in this class have the same length, and values at the same index in different arrays describe the same sequence.
-
Field Summary
Modifier and Type    Field           Description
int[]                tf              Term frequency of the phrase.
int[][]              tfByDocument    Term frequency of the phrase for each document.
int[][]              wordIndices     Pointers to PreprocessingContext.AllWords for each word in the phrase sequence.
-
Constructor Summary
Constructor          Description
AllPhrases()
-
Method Summary
Modifier and Type    Method                  Description
CharSequence         getPhrase(int index)    Returns space-separated words that constitute this phrase.
int                  size()                  Returns the length of all arrays in this PreprocessingContext.AllPhrases.
String               toString()              For debugging purposes.
-
Field Detail
-
wordIndices
public int[][] wordIndices
Pointers to PreprocessingContext.AllWords for each word in the phrase sequence. This array is produced by PhraseExtractor.
-
tf
public int[] tf
Term frequency of the phrase. This array is produced by PhraseExtractor.
-
tfByDocument
public int[][] tfByDocument
Term frequency of the phrase for each document. The encoding of this array is similar to PreprocessingContext.AllWords.tfByDocument: consecutive pairs of document index and frequency. This array is produced by PhraseExtractor. The order of documents in this array is not defined.
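The pair encoding described above can be read two ints at a time. The following is a minimal sketch, assuming each row of tfByDocument has the layout [documentIndex, frequency, documentIndex, frequency, ...]; the class and method names are hypothetical and not part of Carrot2. Because the order of documents is not defined, the lookup scans the row linearly rather than relying on sorted document indices.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Carrot2: works on one row of a
// tfByDocument-style array laid out as [doc0, freq0, doc1, freq1, ...].
class PhraseTfByDocumentSketch {

    /** Decodes one encoded row into a documentIndex -> frequency map. */
    static Map<Integer, Integer> decode(int[] encoded) {
        Map<Integer, Integer> result = new LinkedHashMap<>();
        for (int i = 0; i + 1 < encoded.length; i += 2) {
            result.put(encoded[i], encoded[i + 1]);
        }
        return result;
    }

    /**
     * Returns the frequency recorded for the given document, or 0 if the
     * document does not appear in the row. The scan is linear because the
     * order of documents within a row is not defined.
     */
    static int frequencyIn(int[] encoded, int documentIndex) {
        for (int i = 0; i + 1 < encoded.length; i += 2) {
            if (encoded[i] == documentIndex) {
                return encoded[i + 1];
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Made-up row: 3 occurrences in document 4, 1 occurrence in document 0.
        int[] row = {4, 3, 0, 1};
        System.out.println(decode(row));
        System.out.println(frequencyIn(row, 4));
        System.out.println(frequencyIn(row, 7));
    }
}
```

With a real AllPhrases instance, the row for phrase i would be tfByDocument[i].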
-
Method Detail
-
getPhrase
public CharSequence getPhrase(int index)
Returns space-separated words that constitute this phrase.
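One way such a phrase string could be assembled from wordIndices is sketched below. This is an illustration under assumptions, not the library's implementation: the wordImages parameter is a hypothetical stand-in for the per-word text held by PreprocessingContext.AllWords, and the class and method names are invented for the example.

```java
// Hypothetical sketch, not Carrot2 code: joins the words referenced by one
// row of wordIndices into a space-separated phrase.
class PhraseTextSketch {

    /**
     * @param wordIndices indices of the words making up one phrase
     * @param wordImages  stand-in lookup table: wordImages[i] is the text of word i
     */
    static CharSequence joinWords(int[] wordIndices, char[][] wordImages) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < wordIndices.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(wordImages[wordIndices[i]]);
        }
        return sb;
    }

    public static void main(String[] args) {
        // Made-up word table and phrase.
        char[][] words = {
            "data".toCharArray(), "mining".toCharArray(), "text".toCharArray()
        };
        int[] phrase = {2, 1};
        System.out.println(joinWords(phrase, words));
    }
}
```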
-
size
public int size()
Returns the length of all arrays in this PreprocessingContext.AllPhrases.
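Because all arrays share this length and the same index refers to the same phrase in each of them, a single loop index is enough to read the fields together. The sketch below uses plain arrays as stand-ins for tf and wordIndices; the class and method names are hypothetical and not part of Carrot2.

```java
// Hypothetical sketch, not Carrot2 code: walks parallel arrays by a shared index.
class PhraseArraysSketch {

    /**
     * Returns the index of the phrase with the highest term frequency,
     * or -1 if there are no phrases. Ties resolve to the lowest index.
     */
    static int mostFrequent(int[] tf) {
        int best = -1;
        for (int i = 0; i < tf.length; i++) {
            if (best < 0 || tf[i] > tf[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up parallel arrays: entry i of each array describes phrase i.
        int[] tf = {2, 7, 3};
        int[][] wordIndices = {{0, 1}, {2, 1}, {0, 2, 1}};

        int best = mostFrequent(tf);
        // The same index selects the matching row in the other array.
        System.out.println("phrase " + best + " has " + wordIndices[best].length
            + " words and tf " + tf[best]);
    }
}
```

With a real AllPhrases instance, the loop bound would be size() rather than tf.length.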