Package org.carrot2.text.suffixtree
Class SuffixTree
java.lang.Object
org.carrot2.text.suffixtree.SuffixTree
public final class SuffixTree extends Object
Builds a suffix tree (or generalized suffix tree) on a sequence of any integers (or objects that
 can be represented as unique integers). A direct implementation of Esko Ukkonen's algorithm, but
 optimized for Java to use primitive data types instead of objects (or boxed types).
- See Also:
- "E. Ukkonen, On-line construction of suffix trees, Algorithmica, 1995, volume 14, number 3, pages 249-260."
- 
Nested Class SummaryNested Classes Modifier and Type Class Description static interfaceSuffixTree.IProgressCallbackProgress callback is invoked when iterating forward through the input sequence elements.static interfaceSuffixTree.IStateCallbackA callback invoked when new states are added to the tree.static interfaceSuffixTree.IVisitorVisitor interface for traversals.static classSuffixTree.VisitorAdapterEmpty implementation recursively walking the entire suffix tree.
- 
Field SummaryFields Modifier and Type Field Description static intNO_EDGEMarker for the state's last edge intransitions.
- 
Constructor SummaryConstructors Constructor Description SuffixTree(Sequence sequence, SuffixTree.IStateCallback newStateCallback, SuffixTree.IProgressCallback progressCallback)Build a suffix tree for a given input sequence of symbols.
- 
Method SummaryModifier and Type Method Description booleancontainsSuffix(Sequence seq)intfindEdge(int state, int symbol)Find a transition from statestate, labeled with a given symbol.intfirstEdge(int state)Returns the index of the first edge from a given state orNO_EDGEif a given state has no edges.intgetEndIndex(int edge)Returns the edge label's end index (inclusive).intgetRootState()For procedural traversals (not visitors).intgetStartIndex(int edge)Returns the edge label's start index (inclusive).intgetStatesCount()intgetToState(int edge)Returns the target state for a given edge.intgetTransitionsCount()booleanisLeaf(int state)Check ifstateis a leaf (has no outgoing edges).intnextEdge(int edge)Returns the index of the next edge (sibling) orNO_EDGEifedgeis the last edge in its state.voidvisit(SuffixTree.IVisitor visitor)Walks the states and edges of the suffix tree, depth-first.voidvisitState(int state, SuffixTree.IVisitor visitor)Start visiting from a given state.
- 
Field Details- 
NO_EDGEpublic static final int NO_EDGEMarker for the state's last edge intransitions.- See Also:
- Constant Field Values
 
 
- 
- 
Constructor Details- 
SuffixTreepublic SuffixTree(Sequence sequence, SuffixTree.IStateCallback newStateCallback, SuffixTree.IProgressCallback progressCallback)Build a suffix tree for a given input sequence of symbols.
 
- 
- 
Method Details- 
getTransitionsCountpublic final int getTransitionsCount()- Returns:
- Return the number of transitions (edges) in the tree.
 
- 
getStatesCountpublic final int getStatesCount()- Returns:
- Return the number of states in the tree.
 
- 
containsSuffix- Returns:
- trueif this suffix tree has a path from the root state to a leaf state corresponding to a given sequence of objects. This indicates the input sequence had a suffix identical to- sequence.
 
- 
visitWalks the states and edges of the suffix tree, depth-first.
- 
visitStateStart visiting from a given state.
- 
getRootStatepublic int getRootState()For procedural traversals (not visitors).
- 
isLeafpublic final boolean isLeaf(int state)Check ifstateis a leaf (has no outgoing edges).
- 
firstEdgepublic final int firstEdge(int state)Returns the index of the first edge from a given state orNO_EDGEif a given state has no edges. Does not perform any sanity check on the input state.
- 
nextEdgepublic final int nextEdge(int edge)Returns the index of the next edge (sibling) orNO_EDGEifedgeis the last edge in its state.
- 
findEdgepublic final int findEdge(int state, int symbol)Find a transition from statestate, labeled with a given symbol.NO_EDGEis returned if there is no such edge.
- 
getToStatepublic int getToState(int edge)Returns the target state for a given edge.
- 
getStartIndexpublic int getStartIndex(int edge)Returns the edge label's start index (inclusive).
- 
getEndIndexpublic int getEndIndex(int edge)Returns the edge label's end index (inclusive).
 
-