Wednesday, October 2, 2013

ElasticSearch - Indexing via Java API

There are several ways to load data into the ElasticSearch data store. The most basic is to send PUT or POST requests to the REST API.
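For comparison, here is a minimal sketch of that REST route using only the JDK. It assumes a node listening on the default HTTP port 9200 and reuses the index and type names from later in this post (categories, category). The class and method names are made up for illustration, and the JSON escaping only covers quotes and backslashes.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RestIndexSketch {

    // Escapes backslashes and double quotes so the value is a valid JSON string.
    // Control characters are not handled; this is only a sketch.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the document body, e.g. {"name":"Books"}.
    static String categoryJson(String name) {
        return "{\"name\":\"" + escape(name) + "\"}";
    }

    // POST /{index}/{type} lets ES generate the document id.
    static String indexUrl(String baseUrl, String index, String type) {
        return baseUrl + "/" + index + "/" + type;
    }

    // Sends one category document and returns the HTTP status code.
    static int postCategory(String baseUrl, String name) throws IOException {
        URL url = new URL(indexUrl(baseUrl, "categories", "category"));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/json");
            byte[] body = categoryJson(name).getBytes(StandardCharsets.UTF_8);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body);
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Pass the node URL (for example http://localhost:9200) to actually send.
            System.out.println(postCategory(args[0], "Books"));
        } else {
            // Without arguments, only show what would be sent.
            System.out.println("POST " + indexUrl("http://localhost:9200", "categories", "category"));
            System.out.println(categoryJson("Books"));
        }
    }
}
```

This works for a handful of documents, but you have to build and escape the JSON yourself, which is one reason to prefer the Java API for the rest of this tutorial.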

In this tutorial, we will index data through the Java API instead. My data lives in MySQL and my web application is built on Spring.

Here's my setup:

  • Ubuntu 12.04 Amazon EC2
  • JDK 1.7
  • Spring 3.2
  • MySQL


Install ElasticSearch (ES) via Maven

Put the following into your pom.xml file.
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>0.90.5</version>
</dependency>

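Make sure the same version of ES is installed on your server. The Java client talks to the cluster over ES's internal transport protocol, and mismatched versions may fail to communicate. Read How to Install ElasticSearch on EC2.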

Let's create a search service called ElasticSearchService:

Interface:

package com.developer24hours.elasticsearch.service;

public interface ElasticSearchService {
    boolean indexCategories();
}


Implementation:

We will use ElasticSearch's native Java API. A TransportClient, used through the Client interface, connects to the ElasticSearch cluster, and an XContentBuilder builds the JSON document for each category. The category data is stored in MySQL and retrieved through the categoryDao object. Finally, each document is sent to the ES cluster as an index request over the transport port (9300), not over HTTP; the HTTP GET endpoint defined further down only triggers this method.

package com.developer24hours.elasticsearch.service.impl;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import org.elasticsearch.client.*;
import org.elasticsearch.client.transport.*;
import org.elasticsearch.common.settings.*;
import org.elasticsearch.common.xcontent.*;
import org.elasticsearch.common.transport.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.IOException;
import java.util.List;
import com.developer24hours.elasticsearch.service.ElasticSearchService;
// Category, ICategoryDao and config are application classes that this post
// does not show; import or inject them from your own packages.

/**
 *
 * @author kenneth
 *
 * ES Guide
 * https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch
 */
@Service("elasticSearchService")
@Transactional
public class ElasticSearchServiceImpl implements ElasticSearchService {
    private final Logger log = LoggerFactory.getLogger(getClass());
    private static final String CLUSTER_NAME_LABEL = "cluster.name";
    private static final String INDEX = "categories";
    private static final String TYPE = "category";
    private static final String NAME = "name";

    @Autowired
    ICategoryDao categoryDao;

    @Override
    @Transactional(readOnly=true)
    public boolean indexCategories() {
        boolean isValid = false;
        Client client = null;
        try {
            // config.getElasticSearchClusterName() is the name of your cluster; you can hard code this or put this in spring's resources file
            Settings settings = ImmutableSettings.settingsBuilder().put(CLUSTER_NAME_LABEL, config.getElasticSearchClusterName()).build();
            // 9300 is the transport port, not the HTTP port (9200)
            client = new TransportClient(settings).addTransportAddress(new InetSocketTransportAddress("localhost", 9300));
            List<Category> categories = categoryDao.getAll();
            for (Category category : categories) {
                // Build the JSON document {"name": "..."} for this category
                XContentBuilder jsonBuilder = XContentFactory.jsonBuilder()
                    .startObject()
                    .field(NAME, category.getName())
                    .endObject();
                // No id is passed, so ES generates one; running this method
                // again indexes the same categories a second time
                client.prepareIndex(INDEX, TYPE)
                    .setSource(jsonBuilder)
                    .execute()
                    .actionGet();
            }
            isValid = true;
        } catch (IOException ioe) {
            log.error("ElasticSearchServiceImpl: indexCategories error", ioe);
        } finally {
            // Release the client's threads and connections
            if (client != null) {
                client.close();
            }
        }
        return isValid;
    }
}


Now let's expose a REST endpoint that you can call to trigger the indexing.

Interface:

package com.developer24hours.elasticsearch;

public interface SearchApiService {
    boolean indexCategories() throws EpubServerException;
}


Implementation:

package com.developer24hours.elasticsearch.impl;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import com.developer24hours.elasticsearch.SearchApiService;
import com.developer24hours.elasticsearch.service.ElasticSearchService;
// AbstractApiService and EpubServerException are application classes that
// this post does not show; import them from your own packages.

@Path("/search")
@Service("srvSearchApi")
public class SearchApiServiceImpl extends AbstractApiService implements SearchApiService {
    private final Logger log = LoggerFactory.getLogger(getClass());

    @Autowired
    ElasticSearchService elasticSearchService;

    @GET
    @Path("/index/categories")
    @Override
    public boolean indexCategories() throws EpubServerException {
        try {
            return elasticSearchService.indexCategories();
        } catch (Exception e) {
            log.error("Error indexing Categories", e);
            // Report the failure to the caller; without this return the method does not compile
            return false;
        }
    }
}
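
To trigger the indexing, issue a GET request to /search/index/categories under wherever your JAX-RS application is mounted. The sketch below does that with the JDK only. The base URL http://localhost:8080/api is a placeholder, the class and method names are made up, and it assumes the endpoint writes the boolean as the plain text "true" or "false", which depends on how your CXF providers are configured.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class TriggerIndexSketch {

    // Joins the application's base URL with the path from the @Path annotations.
    static String endpoint(String baseUrl) {
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return trimmed + "/search/index/categories";
    }

    // Interprets the response body; anything other than "true" counts as failure.
    static boolean parseResult(String body) {
        return body != null && "true".equals(body.trim());
    }

    // Calls the endpoint and returns whether indexing reported success.
    static boolean triggerIndexing(String baseUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint(baseUrl)).openConnection();
        try {
            conn.setRequestMethod("GET");
            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                return false;
            }
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                return parseResult(in.readLine());
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Pass your application's base URL to actually make the call.
            System.out.println(triggerIndexing(args[0]));
        } else {
            // Without arguments, only show the URL that would be requested.
            System.out.println("GET " + endpoint("http://localhost:8080/api"));
        }
    }
}
```

Keep in mind that every call re-indexes all categories, so calling the endpoint twice leaves two copies of each category in the index.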
