Hello, Carrot2!

Carrot2 is a programming library for clustering text. It can automatically discover groups of related documents and label them with short key terms or phrases.

Carrot2 can turn, for example, search result titles and snippets into groups like these:

Search result titles and snippets for the query "salsa" (left) and the corresponding cluster labels (right).
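
To make the idea of "groups labeled with key terms" concrete, here is a minimal toy sketch in Python. It is not Carrot2's algorithm and does not use the Carrot2 API: the function names (`tokenize`, `toy_cluster`), the stop-word list, and the sample snippets are all hypothetical and exist only for illustration. The sketch assigns each snippet to the most widely shared term that is not part of the query, which is far cruder than what a real clustering algorithm does.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, tiny stop-word list used only for this illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "with", "is", "on"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens, minus stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def toy_cluster(snippets, query):
    """Group snippets under the term shared by the most snippets.

    Query words are ignored as labels because every result contains them.
    Snippets that share no term with any other snippet go to "(unclustered)".
    """
    query_terms = set(tokenize(query))
    doc_terms = [set(tokenize(s)) - query_terms for s in snippets]

    # Document frequency: in how many snippets does each term occur?
    df = Counter(term for terms in doc_terms for term in terms)

    clusters = defaultdict(list)
    for snippet, terms in zip(snippets, doc_terms):
        shared = [t for t in terms if df[t] > 1]
        if shared:
            # Highest document frequency wins; ties are broken alphabetically.
            label = min(shared, key=lambda t: (-df[t], t))
        else:
            label = "(unclustered)"
        clusters[label].append(snippet)
    return dict(clusters)


# Made-up sample data, not taken from any real search engine.
snippets = [
    "Salsa recipe with fresh tomatoes",
    "Easy tomato salsa recipe",
    "Salsa dance lessons for beginners",
    "Learn salsa dance steps",
    "Salsa music history",
]

clusters = toy_cluster(snippets, "salsa")
for label, members in clusters.items():
    print(label, len(members))
```

With this sample input, the two recipe snippets land under `recipe`, the two dance snippets under `dance`, and the remaining snippet stays unclustered. A real text clustering library additionally handles phrases, word forms, and overlapping groups, none of which this sketch attempts.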

What's in the box

Carrot2 provides a common infrastructure and several algorithms for clustering text. Out of the box, the Carrot2 distribution comes with:

Additionally, several downstream projects provide integration between Carrot2 and popular document retrieval services:

Try Carrot2 now

The quickest way to try Carrot2 is the public live demo. The demo lets you cluster web search results provided by the eTools search engine or explore documents from the PubMed database of medical abstracts.

Screenshot of the live Carrot2 demo application clustering search results for the query "JSON".