Monday, September 30, 2013

ElasticSearch Query - how to insert and retrieve search data

ElasticSearch uses HTTP methods (e.g. GET, POST, PUT, DELETE) to retrieve, save, and delete search data in its index.

For simplicity, we will use curl to demonstrate some common operations. If you haven't done so already, start ElasticSearch in your terminal.


Adding a document

We will send an HTTP POST request to add the subject "sports" to an index. The request will have the following form:
curl -XPOST "http://localhost:9200/{index}/{type}/{id}" -d '{"key0":  "value0", ... , "keyX": "valueX"}'
Example:
curl -XPOST "http://localhost:9200/subjects/subject/1" -d '{"name":  "sports",  "creator": {"first_name":"John", "last_name":"Smith"}}'
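To confirm that the document was stored, it can be fetched directly by its ID using the same {index}/{type}/{id} path. The sketch below wraps that request in two small shell functions. The names doc_url and get_doc are hypothetical helpers, and the host is assumed to be the default local node used throughout this post.

```shell
# Assumption: ElasticSearch is listening on the default local address.
ES_HOST="http://localhost:9200"

# Print the URL of a single document: {host}/{index}/{type}/{id}
doc_url() {
  printf '%s/%s/%s/%s\n' "$ES_HOST" "$1" "$2" "$3"
}

# Fetch one document by ID. This needs a running node, so it is only defined here.
get_doc() {
  curl -s -XGET "$(doc_url "$1" "$2" "$3")"
}
```

With a node running, get_doc subjects subject 1 should return the stored document in the _source field of the response.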

Retrieving the document

We can get back the document by sending a GET request.
curl -X GET "http://localhost:9200/subjects/_search?q=sports"
We can also use a POST request to query the above.
curl -X POST "http://localhost:9200/subjects/_search" -d '{
"query": {"term":{"name":"sports"}}
}'
Both of the above will give you the following:
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":0.30685282,"hits":[{"_index":"subjects","_type":"subject","_id":"1","_score":0.30685282, "_source" : {"name":  "sports"}}]}}
The _source field above holds the document that matched the query.
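One thing to keep in mind: a term query looks up the value exactly as given, without analyzing it, so a capitalized "Sports" may not match a field whose text was lowercased at index time. A match query analyzes the search text first. Below is a minimal sketch of that variant. The names match_body and search_match are hypothetical helpers, and the field and text are assumed to contain no characters that need JSON escaping.

```shell
# Print a request body for a match query on one field.
# Assumption: field name and text contain no quotes or backslashes.
match_body() {
  printf '{"query": {"match": {"%s": "%s"}}}\n' "$1" "$2"
}

# Send the match query to an index. This needs a running node.
search_match() {
  curl -s -XPOST "http://localhost:9200/$1/_search" -d "$(match_body "$2" "$3")"
}
```

For example, search_match subjects name Sports sends the same kind of request as the POST above, with match in place of term.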

To search based on the nested properties (Ex. first_name, last_name), we can do the following:
curl -XGET "http://localhost:9200/subjects/_search?q=subject.creator.first_name:John"
curl -XGET "http://localhost:9200/subjects/subject/_search?q=creator.first_name:John"
curl -XGET "http://localhost:9200/subjects/subject/_search?q=subject.creator.first_name:John" 
All the above queries will return the same results.
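When the value in a q parameter contains spaces or other reserved characters, the URL has to be encoded. curl can handle that itself with the -G and --data-urlencode options. The sketch below shows one way to do it; field_query and search_field are hypothetical helper names.

```shell
# Print a "field:value" expression for the q parameter.
field_query() {
  printf '%s:%s\n' "$1" "$2"
}

# Search an index and type with a URL-encoded q parameter. Needs a running node.
# -G makes curl append the encoded data to the URL and send a GET request.
search_field() {
  curl -s -G "http://localhost:9200/$1/$2/_search" \
       --data-urlencode "q=$(field_query "$3" "$4")"
}
```

For example, search_field subjects subject creator.first_name John issues the second query shown above, with the q value encoded by curl.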


Deleting the document

Similarly, we can delete the entire subjects index, along with every document in it, with a DELETE request.
curl -X DELETE "http://localhost:9200/subjects"
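The request above removes the whole index. To remove just one document, a DELETE request can be sent to that document's own URL, in the same {index}/{type}/{id} form used when adding it. A minimal sketch follows; delete_url and delete_doc are hypothetical helper names, and the host is assumed to be the default local node.

```shell
# Print the URL of the document to delete: {host}/{index}/{type}/{id}
delete_url() {
  printf 'http://localhost:9200/%s/%s/%s\n' "$1" "$2" "$3"
}

# Delete a single document and leave the rest of the index intact.
# This needs a running node, so it is only defined here.
delete_doc() {
  curl -s -XDELETE "$(delete_url "$1" "$2" "$3")"
}
```

For example, delete_doc subjects subject 1 removes only the "sports" document added earlier.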

Creating an index with settings and mappings

If you want to adjust settings such as the number of shards and replicas, you may find the following useful. As a rule of thumb, more shards let indexing work spread across more nodes, and more replicas let more nodes serve searches; the actual gains depend on cluster size and hardware.
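curl -X PUT "http://localhost:9200/subjects" -d '
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 2,
      "analysis": {"analyzer": {"full_text": {"type": "standard"}}}
    }
  },
  "mappings": {
    "document": {
      "properties": {
        "name": {"type": "string", "analyzer": "full_text"}
      }
    }
  }
}'
The above creates an index called subjects. Each document of type document in the index has a property called name. Note that settings and mappings are sibling keys of a single JSON object. The full_text analyzer is not built in, so it is defined here as a plain standard analyzer; that definition is an assumption and can be replaced with your own.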
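With 3 shards and 2 replicas, each primary shard has 2 extra copies, so the cluster has to place 3 x (1 + 2) = 9 shard copies in total. The sketch below does that arithmetic and builds the URL of the _settings endpoint, which can be used to check what was actually applied. The function names are hypothetical, and the host is assumed to be the default local node.

```shell
# Print how many shard copies an index needs: primaries plus all their replicas.
total_shard_copies() {
  echo $(( $1 * (1 + $2) ))
}

# Print the settings URL for an index (assumes the default local node).
settings_url() {
  printf 'http://localhost:9200/%s/_settings?pretty=true\n' "$1"
}

# Show the live settings of an index. This needs a running node.
show_settings() {
  curl -s -XGET "$(settings_url "$1")"
}
```

For example, total_shard_copies 3 2 prints 9, and show_settings subjects should list the shard and replica counts of the index.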


Checking the Mapping
curl -X GET "http://localhost:9200/subjects/_mapping?pretty=true"
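You should see the mapping for the index. If it comes back empty, as below, no mapping was applied, which can happen when the request body was not a single valid JSON object: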
{
  "subjects" : { }
}
The pretty parameter above just formats the JSON result so that it is human-readable.
