Hello, Carrot²!

Carrot² is a programming library for clustering text. It can automatically discover groups of related documents and label them with short key terms or phrases.

Carrot² can turn, for example, search result titles and snippets into groups like these:

Figure: Search result titles and snippets (left) for the query "salsa" and the corresponding cluster labels (right).
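
To make the idea of labeled groups concrete, here is a deliberately naive sketch in plain Java. It does not use the Carrot² API and is not how the Carrot² algorithms work. The class name NaiveLabelClustering, the stop-word list and the grouping rule (a term becomes a label when at least two, but not all, snippets contain it) are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only: not part of Carrot² and not one of its algorithms.
public class NaiveLabelClustering {
    // Tiny, assumed stop-word list; a real system would use a proper lexical resource.
    private static final Set<String> STOP_WORDS = Set.of("the", "and", "for", "with", "from");

    /** Groups snippets under every term shared by at least two, but not all, of them. */
    public static Map<String, List<String>> cluster(List<String> snippets) {
        // term -> indices of snippets containing it; TreeMap keeps labels alphabetical.
        Map<String, Set<Integer>> byTerm = new TreeMap<>();
        for (int i = 0; i < snippets.size(); i++) {
            for (String term : snippets.get(i).toLowerCase().split("[^\\p{L}]+")) {
                if (term.length() < 3 || STOP_WORDS.contains(term)) {
                    continue; // skip empty tokens, very short words and stop words
                }
                byTerm.computeIfAbsent(term, t -> new TreeSet<>()).add(i);
            }
        }

        Map<String, List<String>> clusters = new LinkedHashMap<>();
        for (Map.Entry<String, Set<Integer>> entry : byTerm.entrySet()) {
            int size = entry.getValue().size();
            // A term found in every snippet (such as the query itself) separates nothing.
            if (size >= 2 && size < snippets.size()) {
                List<String> members = new ArrayList<>();
                for (int index : entry.getValue()) {
                    members.add(snippets.get(index));
                }
                clusters.put(entry.getKey(), members);
            }
        }
        return clusters;
    }

    public static void main(String[] args) {
        List<String> snippets = List.of(
            "Salsa dance lessons for beginners",
            "Learn salsa dance steps",
            "Fresh tomato salsa recipe",
            "Spicy salsa recipe with mango");
        cluster(snippets).forEach((label, docs) -> System.out.println(label + " -> " + docs));
    }
}
```

For the four sample snippets, this sketch produces two groups, labeled "dance" and "recipe". The word "salsa" is dropped as a label because it occurs in every snippet. Real clustering algorithms go well beyond this, for example by producing multi-word phrase labels.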

What's in the box

Carrot² provides a common infrastructure and a number of algorithms for clustering text. Out of the box, the Carrot² distribution comes with:

a Java API and several clustering algorithm implementations,
a REST service for mash-ups or integration with languages other than Java,
a simple search-engine-like demo application that clusters search results,
code snippets and examples for reuse in your code.

Additionally, several downstream projects provide integration between Carrot² and popular document retrieval services:

Apache Solr has built-in support for clustering search results via Carrot² algorithms,
the elasticsearch-carrot2 plugin provides search results clustering for Elasticsearch.

Try Carrot² now

The quickest way to try out Carrot² is to use the public live demo. The demo lets you cluster web search results provided by the eTools search engine or explore abstracts from the PubMed database of medical literature.

Figure: Screenshot of the live Carrot² demo application clustering search results for the query "JSON".
