Interface ClusteringAlgorithm

    • Method Detail

      • requiredLanguageComponents

        Set<Class<?>> requiredLanguageComponents()
        Returns:
        A set of classes required to be present in the LanguageComponents instance provided for clustering.
      • optionalLanguageComponents

        default Set<Class<?>> optionalLanguageComponents()
        Returns:
        A set of classes the algorithm uses if they are present; they are optional in the LanguageComponents instance provided for clustering.
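
        The split between required and optional components suggests a check a caller could run before clustering. The following is a minimal sketch under stated assumptions: the marker interfaces Stemmer, Tokenizer and LabelFormatter and the helper missing(...) are hypothetical stand-ins invented for illustration, and the "available" set stands in for whatever a LanguageComponents instance actually provides. None of this is the library's own code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class ComponentCheck {
  // Hypothetical marker types standing in for language component classes.
  interface Stemmer {}
  interface Tokenizer {}
  interface LabelFormatter {}

  /** Returns the classes from 'wanted' that are absent from 'available'. */
  static Set<Class<?>> missing(Set<Class<?>> wanted, Set<Class<?>> available) {
    Set<Class<?>> result = new LinkedHashSet<>(wanted);
    result.removeAll(available);
    return result;
  }

  public static void main(String[] args) {
    // Assumed results of requiredLanguageComponents() and optionalLanguageComponents().
    Set<Class<?>> required = Set.of(Stemmer.class, Tokenizer.class);
    Set<Class<?>> optional = Set.of(LabelFormatter.class);
    // Assumed set of components actually supplied for clustering.
    Set<Class<?>> available = Set.of(Stemmer.class);

    Set<Class<?>> missingRequired = missing(required, available);
    if (!missingRequired.isEmpty()) {
      // A missing required component is treated here as a hard failure.
      System.out.println("Cannot cluster, missing required: " + missingRequired);
    }
    // A missing optional component is only reported.
    System.out.println("Optional components not supplied: " + missing(optional, available));
  }
}
```

        In this sketch a missing required component blocks clustering, while a missing optional component is merely reported, which mirrors the distinction the two methods draw.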
      • cluster

        <T extends Document> List<Cluster<T>> cluster(Stream<? extends T> documents,
                                                      LanguageComponents languageComponents)
        Cluster a set of documents.
        Type Parameters:
        T - Any subclass of Document. Clusters of objects of the same type are returned.
        Parameters:
        documents - A stream of documents for clustering.
        languageComponents - LanguageComponents with a set of suppliers for the required language-specific components.
        Returns:
        A list of top-level clusters (clusters can form a hierarchy via Cluster.getClusters()).
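
        To illustrate how the type parameter T carries the caller's document type into the returned clusters, and how a hierarchy exposed through getClusters() can be walked, here is a minimal self-contained sketch. Everything in it is an assumption made for illustration: Document, Cluster and Note are simplified stand-ins rather than the library's types, the languageComponents parameter is omitted, and the placeholder cluster(...) simply puts every document into one cluster with an empty subcluster instead of running a real algorithm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class ClusterWalk {
  // Hypothetical stand-ins, NOT the library's Document and Cluster types.
  interface Document {}

  static final class Cluster<T> {
    final String label;
    final List<T> documents = new ArrayList<>();
    final List<Cluster<T>> subclusters = new ArrayList<>();

    Cluster(String label) { this.label = label; }

    List<Cluster<T>> getClusters() { return subclusters; }
  }

  // A caller-defined document type.
  static final class Note implements Document {
    final String text;
    Note(String text) { this.text = text; }
  }

  // Placeholder with the same generic shape as the documented method
  // (languageComponents omitted): one top-level cluster, one empty subcluster.
  static <T extends Document> List<Cluster<T>> cluster(Stream<? extends T> documents) {
    Cluster<T> all = new Cluster<>("all");
    documents.forEach(all.documents::add);
    all.subclusters.add(new Cluster<>("nested"));
    List<Cluster<T>> top = new ArrayList<>();
    top.add(all);
    return top;
  }

  // Counts clusters at every level by recursing through getClusters().
  static <T> int countClusters(List<Cluster<T>> clusters) {
    int n = 0;
    for (Cluster<T> c : clusters) {
      n += 1 + countClusters(c.getClusters());
    }
    return n;
  }

  public static void main(String[] args) {
    // T is inferred as Note, so the clusters hold Note objects without casts.
    List<Cluster<Note>> top = cluster(Stream.of(new Note("a"), new Note("b")));
    Note first = top.get(0).documents.get(0);
    System.out.println("first document: " + first.text);
    System.out.println("clusters at all levels: " + countClusters(top));
  }
}
```

        Because the returned list only contains top-level clusters, code that needs every cluster has to recurse as countClusters does here.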