Getting started

The quickest way to try Carrot² is to visit the on-line demo. For more options, such as the Java API or the REST API, read on.

On-line demo

You can use the on-line demo to play with clustering of web search results provided by eTools and explore medical documents from the PubMed database.

Screenshot of the live Carrot² demo application clustering search results for the query *data mining*.

Clustering your own data

The on-line demo also includes the Carrot² Clustering Workbench, a more advanced application for clustering content from files or from Solr or Elasticsearch instances. You can use it to test Carrot² clustering with your own content.

Screenshot of the Carrot² Clustering Workbench application clustering results from a local file.

Heads up: limitations of the on-line demos.

Please note that Workbench will transfer the contents of your files or search results to the Carrot² server for clustering. The server will keep the data in memory for the duration of the clustering process. None of the data you submit will be permanently stored or logged.

Additionally, our server limits the rate and size of clustering requests to keep the service from overloading.

If you'd rather keep your data private, or if you hit the processing limits, install Carrot² on your own machine.

Excel, OpenOffice or CSV

To cluster data from an Excel, OpenOffice or CSV spreadsheet:

Make sure your spreadsheet contains one document per row. The first row will be treated as a header with field names, for example:

	A	B	C	D	E
1	id	title	question	score	views
2	67	PDF Viewer on Windows	I've tried Foxit and Adobe's reader, but I'm not satisfied with either. Foxit has update nagging for non-critical junk.	39	1975
3	94	What Windows services can I safely disable?	I'm trying to improve the boot time and general performance of a Windows XP machine and ...	28	4808
4	135	Log viewer on Windows	I'm a developer, and I generate big log files. I've tried several log viewer applications ...	31	26011

The spreadsheet can contain fields of all types. Workbench will try to identify the natural text fields to use for clustering.

Open Carrot² Clustering Workbench in a modern browser.
Choose Local file in the Data source combo box and upload the spreadsheet with your data. If necessary, refine the selection of fields to cluster using the Fields to cluster check boxes.
Press the Cluster button to generate the clusters.
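If your documents live in a program rather than in a spreadsheet, a CSV file in the layout described above can be produced with a few lines of code. The following is a minimal sketch in Python, not part of Carrot² itself; the file name documents.csv and the helper write_documents_csv are hypothetical, and the sample rows are copied from the table above.

```python
import csv

# Sample documents taken from the example table; one dict per document.
documents = [
    {"id": 67, "title": "PDF Viewer on Windows",
     "question": "I've tried Foxit and Adobe's reader, but I'm not satisfied "
                 "with either. Foxit has update nagging for non-critical junk.",
     "score": 39, "views": 1975},
    {"id": 94, "title": "What Windows services can I safely disable?",
     "question": "I'm trying to improve the boot time and general performance "
                 "of a Windows XP machine and ...",
     "score": 28, "views": 4808},
    {"id": 135, "title": "Log viewer on Windows",
     "question": "I'm a developer, and I generate big log files. I've tried "
                 "several log viewer applications ...",
     "score": 31, "views": 26011},
]


def write_documents_csv(path, rows):
    """Write a header row with field names, then one document per row."""
    fieldnames = list(rows[0].keys())
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical output file name; upload the resulting file in Workbench.
write_documents_csv("documents.csv", documents)
```

The resulting file can then be uploaded through the Local file data source like any other spreadsheet.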

Solr or Elasticsearch

If your data is stored in an Apache Solr or Elasticsearch instance:

Choose Solr or Elasticsearch in the Data source combo box.
Provide the service URL of your search server and press the Connect button.

Make sure that your server is configured to emit CORS HTTP headers; otherwise, Workbench will not be able to query it.
Choose the collection to search, type a query and press Cluster.
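One rough way to see whether a search server emits CORS headers is to send it a request with an Origin header and inspect the Access-Control-Allow-Origin header of the response. The sketch below is an illustration under assumptions, not a Carrot² tool: the helper names cors_allows and check_server are hypothetical, and both URLs in the usage comment are placeholders you would replace with your server's URL and the address you open Workbench from. Browsers may also send preflight requests, which this check does not cover.

```python
import urllib.request


def cors_allows(headers, origin):
    """Return True if Access-Control-Allow-Origin permits the given origin."""
    allowed = headers.get("Access-Control-Allow-Origin")
    return allowed == "*" or allowed == origin


def check_server(url, origin):
    """Send a GET request with an Origin header and check the response headers."""
    request = urllib.request.Request(url, headers={"Origin": origin})
    with urllib.request.urlopen(request, timeout=10) as response:
        return cors_allows(response.headers, origin)


# Example usage (placeholder URLs, requires a running search server):
# check_server("http://localhost:9200/", "http://localhost:8080")
```

If the check returns False, consult your search server's documentation on enabling CORS for the origin that serves Workbench.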

JSON file

You can submit a file containing an array of flat JSON objects for clustering, for example:

[
  { "title": "Title 1", "body": "Text", "views": 583 },
  { "title": "Title 2", "body": "Text", "views": 23 }
]

Each object represents one document. The object can contain both textual and non-textual properties. Workbench will try to determine which fields contain natural text.

To cluster the contents of a JSON file:

Choose Local file in the Data source combo box.
Upload or drag and drop your JSON file.
Choose the fields to cluster and press Cluster.
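A file in this shape can be generated from existing data with a standard JSON library, which also takes care of quoting property names correctly. The following Python sketch is one possible approach; the file name documents.json and the helper write_documents_json are hypothetical, and treating "flat" as "no nested objects or arrays" is an assumption made for this example.

```python
import json

# Sample documents matching the example above; one flat object per document.
documents = [
    {"title": "Title 1", "body": "Text", "views": 583},
    {"title": "Title 2", "body": "Text", "views": 23},
]


def write_documents_json(path, docs):
    """Write docs as a JSON array, rejecting nested values (assumed not flat)."""
    for doc in docs:
        for key, value in doc.items():
            if isinstance(value, (dict, list)):
                raise ValueError(f"Field {key!r} is nested; flatten it first")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(docs, handle, ensure_ascii=False, indent=2)


# Hypothetical output file name; upload the resulting file in Workbench.
write_documents_json("documents.json", documents)
```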

Workbench can also cluster files in the Carrot² legacy XML format, but that format is discouraged because it does not support arbitrary field types.

Local installation

You can install Carrot² on your own machine to use the search results clustering and Workbench applications without any limitations.

To run Carrot² on your machine:

Download the latest release package.
Follow the Carrot² Document Clustering Server installation instructions.
Open http://localhost:8080 in a modern browser to access the applications.

APIs and integrations

If you'd like to integrate Carrot² with your existing systems, use one of the following options:

the Java API,
the REST API for other programming languages,
Apache Solr, which has built-in support for clustering search results with Carrot² algorithms (in versions up to Solr 8.7 and again starting with Solr 9, which is not yet released),
the elasticsearch-carrot2 plugin, which provides search results clustering for Elasticsearch.
