Class PreprocessedDocumentScanner

    • Field Detail

      • ON_DOCUMENT_SEPARATOR

        public static final com.carrotsearch.hppc.predicates.ShortPredicate ON_DOCUMENT_SEPARATOR
        Predicate for splitting on the document separator.
      • ON_FIELD_SEPARATOR

        public static final com.carrotsearch.hppc.predicates.ShortPredicate ON_FIELD_SEPARATOR
        Predicate for splitting on the field separator.
      • ON_SENTENCE_SEPARATOR

        public static final com.carrotsearch.hppc.predicates.ShortPredicate ON_SENTENCE_SEPARATOR
        Predicate for splitting on the sentence separator.
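The separator predicates above can be read as tests applied to token type codes in order to cut a sequence into ranges. The following is a minimal sketch of that idea, not the library's code: the nested ShortPredicate interface is a local stand-in for the HPPC type, and the type codes, the split helper, and the decision to skip empty ranges are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SeparatorSplitSketch {
  // Local stand-in for com.carrotsearch.hppc.predicates.ShortPredicate (assumption).
  interface ShortPredicate {
    boolean apply(short value);
  }

  // Hypothetical token type codes; the real values are defined by the library.
  static final short TERM = 0;
  static final short SENTENCE_SEP = 1;
  static final short FIELD_SEP = 2;

  // Returns {start, length} pairs for the ranges lying between matched separators.
  // Empty ranges are skipped; that is a choice made for this sketch.
  static List<int[]> split(short[] types, int start, int length, ShortPredicate separator) {
    List<int[]> ranges = new ArrayList<>();
    int rangeStart = start;
    int end = start + length;
    for (int i = start; i < end; i++) {
      if (separator.apply(types[i])) {
        if (i > rangeStart) {
          ranges.add(new int[] {rangeStart, i - rangeStart});
        }
        rangeStart = i + 1;
      }
    }
    if (end > rangeStart) {
      ranges.add(new int[] {rangeStart, end - rangeStart});
    }
    return ranges;
  }

  public static void main(String[] args) {
    short[] types = {TERM, TERM, SENTENCE_SEP, TERM, FIELD_SEP, TERM, TERM};
    ShortPredicate onFieldSeparator = t -> t == FIELD_SEP;
    for (int[] r : split(types, 0, types.length, onFieldSeparator)) {
      System.out.println("start=" + r[0] + " length=" + r[1]);
    }
  }
}
```

With the sample input, the field separator at index 4 yields two ranges: one starting at 0 with length 4 and one starting at 5 with length 2.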
    • Constructor Detail

      • PreprocessedDocumentScanner

        public PreprocessedDocumentScanner()
    • Method Detail

      • equalTo

        public static final com.carrotsearch.hppc.predicates.ShortPredicate equalTo(short t)
        Returns a new ShortPredicate that evaluates to true if its argument equals the given value t.
      • document

        protected void document(PreprocessingContext context,
                                int start,
                                int length)
        Invoked for each document. Splits the document further into fields.
      • field

        protected void field(PreprocessingContext context,
                             int start,
                             int length)
        Invoked for each field of a document. Splits the field further into sentences.
      • sentence

        protected void sentence(PreprocessingContext context,
                                int start,
                                int length)
        Invoked for each sentence of a document's field.
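The method descriptions above suggest a three-level descent: documents are split into fields, fields into sentences, and each level receives a (start, length) range. The sketch below imitates that shape under stated assumptions. It is not the Carrot2 implementation: the PreprocessingContext parameter is replaced by a plain short[] of hypothetical type codes, ShortPredicate and RangeCallback are local stand-ins, and iterate and split are helper names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ScannerSketch {
  // Local stand-in for HPPC's ShortPredicate (assumption).
  interface ShortPredicate {
    boolean apply(short value);
  }

  // Hypothetical callback receiving one range; not part of the documented API.
  interface RangeCallback {
    void accept(int start, int length);
  }

  // Hypothetical token type codes, chosen for this sketch only.
  static final short TERM = 0, SENTENCE_SEP = 1, FIELD_SEP = 2, DOCUMENT_SEP = 3;

  // Mirrors the documented contract of equalTo(short t).
  static ShortPredicate equalTo(final short t) {
    return value -> value == t;
  }

  final short[] types;
  final List<String> sentences = new ArrayList<>();

  ScannerSketch(short[] types) {
    this.types = types;
  }

  // Hypothetical entry point: splits the whole input into documents.
  void iterate() {
    split(0, types.length, equalTo(DOCUMENT_SEP), this::document);
  }

  // Invoked for each document; splits further into fields.
  protected void document(int start, int length) {
    split(start, length, equalTo(FIELD_SEP), this::field);
  }

  // Invoked for each field; splits further into sentences.
  protected void field(int start, int length) {
    split(start, length, equalTo(SENTENCE_SEP), this::sentence);
  }

  // Invoked for each sentence; here it only records the range as "start+length".
  protected void sentence(int start, int length) {
    sentences.add(start + "+" + length);
  }

  // Reports each non-empty range between separators to the callback.
  private void split(int start, int length, ShortPredicate separator, RangeCallback callback) {
    int rangeStart = start;
    int end = start + length;
    for (int i = start; i < end; i++) {
      if (separator.apply(types[i])) {
        if (i > rangeStart) {
          callback.accept(rangeStart, i - rangeStart);
        }
        rangeStart = i + 1;
      }
    }
    if (end > rangeStart) {
      callback.accept(rangeStart, end - rangeStart);
    }
  }

  public static void main(String[] args) {
    short[] types = {TERM, TERM, SENTENCE_SEP, TERM, FIELD_SEP, TERM, DOCUMENT_SEP, TERM, TERM};
    ScannerSketch scanner = new ScannerSketch(types);
    scanner.iterate();
    System.out.println(scanner.sentences); // prints [0+2, 3+1, 5+1, 7+2]
  }
}
```

Because the three callbacks are protected, a subclass can override just one level, for example sentence, and still rely on the levels above it for splitting.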