Introduction
Elasticsearch is a distributed, scalable, real-time search and analytics engine built on top of Apache Lucene™. Lucene (a library) is arguably the most advanced, high-performance, and fully featured search engine library in existence today, both open source and proprietary. Elasticsearch enables you to search, analyze, and explore your data, whether you need full-text search, real-time analytics of structured data, or a combination of the two.
Elasticsearch is much more than just Lucene and much more than “just” full-text search. It can also be described as follows:
- A distributed real-time document store where every field is indexed and searchable
- A distributed search engine with real-time analytics
- Capable of scaling to hundreds of servers and petabytes of structured and unstructured data
It packages up all this functionality into a standalone server that your application can talk to via a simple RESTful API, using a web client from your favorite programming language, or even from the command line.
Installing and Running
[hdpsysuser@hdpmaster elk]$ java -version
[hdpsysuser@hdpmaster elk]$ tar -xvf elasticsearch-6.2.3.tar.gz
[hdpsysuser@hdpmaster elk]$ tar -xvf kibana-6.2.3-linux-x86_64.tar.gz
[hdpsysuser@hdpmaster elk]$ tar -xvf logstash-6.2.3.tar.gz
export ES_HOME=/usr/hadoopsw/elk/elasticsearch-6.2.3
export PATH=$PATH:$ES_HOME/bin
export KIBANA_HOME=/usr/hadoopsw/elk/kibana-6.2.3-linux-x86_64
export PATH=$PATH:$KIBANA_HOME/bin
2- Go to the Elasticsearch home directory and into the bin folder. The default port for the Elasticsearch HTTP interface is 9200. You can change it by setting http.port in the elasticsearch.yml file in the config directory.
[hdpsysuser@hdpmaster elasticsearch-6.2.3]$ cd /usr/hadoopsw/elk/elasticsearch-6.2.3/bin
Config location: /usr/hadoopsw/elk/elasticsearch-6.2.3/config/elasticsearch.yml
network.host: 0.0.0.0
http.port: 9200
In config/elasticsearch.yml, add the following; do the same for Kibana if you run it as well.
network.host: 0.0.0.0
Elasticsearch loads its configuration from the $ES_HOME/config/elasticsearch.yml file by default. Any settings that can be specified in the config file can also be specified on the command line, using the -E syntax as follows:
./bin/elasticsearch -d -Ecluster.name=my_cluster -Enode.name=node_1
./bin/elasticsearch
-- To run Elasticsearch as a daemon
./bin/elasticsearch -d -p pid
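Equivalently, the same settings can be made permanent in the config file instead of being passed on the command line; a minimal sketch of config/elasticsearch.yml using the same cluster and node names as the -E example above:

```yaml
# config/elasticsearch.yml -- equivalent of the -E flags above
cluster.name: my_cluster
node.name: node_1
network.host: 0.0.0.0
http.port: 9200
```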
[hdpsysuser@hdpmaster ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 15673
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
vi /etc/security/limits.conf
* hard nofile 65536
* soft nofile 65536
You may also get the error below while starting Elasticsearch:
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[hdpsysuser@hdpmaster ~]$ sudo sysctl -w vm.max_map_count=262144
vm.max_map_count = 262144
The vm.max_map_count setting should be set permanently in /etc/sysctl.conf:
vi /etc/sysctl.conf
vm.max_map_count=262144
Logs and Shutdown
Log messages can be found in the $ES_HOME/logs/ directory.
To shut down Elasticsearch, kill the process ID recorded in the pid file:
kill `cat pid`
On Linux, Elasticsearch uses a lot of file descriptors or file handles. Running out of file descriptors can be disastrous and will most probably lead to data loss. Make sure to increase the limit on the number of open files descriptors for the user running Elasticsearch to 65,536 or higher. Set nofile to 65536 in /etc/security/limits.conf.
4- You can check whether the server is up and running by browsing to http://localhost:9200. It returns a JSON object containing information about the installed Elasticsearch instance:
[hdpsysuser@hdpmaster ~]$ curl http://localhost:9200
{
"name" : "w_ykJGL",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "bie9YKCvRlWqetugFWPawg",
"version" : {
"number" : "6.2.3",
"build_hash" : "c59ff00",
"build_date" : "2018-03-13T10:06:29.741383Z",
"build_snapshot" : false,
"lucene_version" : "7.2.1",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
Talking to Elasticsearch
You can talk to Elasticsearch using the Java API or the RESTful API. For Java, Elasticsearch comes with two built-in clients (the node client and the transport client) that you can use in your code over port 9300. All other languages can communicate with Elasticsearch over port 9200 using the RESTful API, accessible with your favorite web client. Elasticsearch also provides official clients for several languages: Groovy, JavaScript, .NET, PHP, Perl, Python, and Ruby.
A request to Elasticsearch consists of the same parts as any HTTP request:
curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'
The parts marked with < > above are:
- VERB: The appropriate HTTP method or verb: GET, POST, PUT, HEAD, or DELETE.
- PROTOCOL: Either http or https (if you have an HTTPS proxy in front of Elasticsearch).
- HOST: The hostname of any node in your Elasticsearch cluster, or localhost for a node on your local machine.
- PORT: The port running the Elasticsearch HTTP service, which defaults to 9200.
- PATH: The API endpoint (for example, _count returns the number of documents in the cluster). A path may contain multiple components, such as _cluster/stats or _nodes/stats/jvm.
- QUERY_STRING: Any optional query-string parameters (for example, ?pretty pretty-prints the JSON response to make it easier to read).
- BODY: A JSON-encoded request body (if the request needs one).
For instance, to count the number of documents in the cluster, we could use this:
curl -XGET 'http://localhost:9200/_count?pretty' -H 'Content-Type: application/json' -d '
{
"query": {
"match_all": {}
}
}
'
Elasticsearch returns an HTTP status code like 200 OK and (except for HEAD requests) a JSON-encoded response body. The preceding curl request would respond with a JSON body like the following:
{
"count" : 0,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
}
}
We don’t see the HTTP headers in the response because we didn’t ask curl to display them. To see the headers, use the curl command with the -i switch:
[hdpsysuser@hdpmaster bin]$ curl -i -XGET 'localhost:9200/'
Elasticsearch is document oriented, meaning that it stores entire objects or documents. It not only stores them, but also indexes the contents of each document in order to make them searchable. In Elasticsearch, you index, search, sort, and filter documents—not rows of columnar data. This is a fundamentally different way of thinking about data and is one of the reasons Elasticsearch can perform complex full-text search.
Elasticsearch uses JavaScript Object Notation, or JSON, as the serialization format for documents. JSON serialization is supported by most programming languages, and has become the standard format used by the NoSQL movement. It is simple, concise, and easy to read.
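As a quick illustration, here is what such a JSON document looks like; the field values are made up, and Python's standard library is used only to pretty-print it locally (no Elasticsearch involved):

```shell
# A sample customer document serialized as JSON (values are illustrative).
doc='{"name": "Jane Doe", "age": 32, "interests": ["search", "analytics"]}'
# Pretty-print it with Python's stdlib json tool to confirm it parses.
echo "$doc" | python3 -m json.tool
```

Any language with a JSON library can produce or consume documents in this form, which is why the REST API is so widely accessible.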
To have a feel for what is possible in Elasticsearch and how easy it is to use, let’s start by walking through simple statements that cover basic concepts such as indexing, search, and aggregations.
Indexing
The act of storing data in Elasticsearch is called indexing, but before we can index a document, we need to decide where to store it. An Elasticsearch cluster can contain multiple indices, which in turn contain multiple types. These types hold multiple documents, and each document has multiple fields.
An index is like a database in a traditional relational database. Indexing is much like the INSERT statement in SQL, except that if the document already exists, the new document replaces the old.
Relational databases add an index, such as a B-tree index, to specific columns in order to improve the speed of data retrieval. Elasticsearch and Lucene use a structure called an inverted index for exactly the same purpose.
By default, every field in a document is indexed (has an inverted index) and thus is searchable. A field without an inverted index is not searchable.
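To make the idea concrete, here is a toy inverted index built in the shell, with no Elasticsearch involved: three tiny "documents" (an ID followed by words) are turned into a term-to-document-ID mapping. The file path and contents are made up for the demo:

```shell
# Three toy documents, one per line: "<doc id> <terms...>".
printf '1 mill lane\n2 mill road\n3 park lane\n' > /tmp/docs.txt
# Build the inverted index: each term maps to the IDs of documents containing it.
# Output: lane: 1 3 / mill: 1 2 / park: 3 / road: 2
awk '{ for (i = 2; i <= NF; i++) terms[$i] = terms[$i] " " $1 }
     END { for (t in terms) print t ":" terms[t] }' /tmp/docs.txt | sort
```

Searching for "lane" is now a single lookup in this mapping (documents 1 and 3) rather than a scan of every document, which is why every indexed field in Elasticsearch is cheap to search.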
Create Index
[hdpsysuser@hdpmaster bin]$ curl -X PUT http://localhost:9200/elktest
{"acknowledged":true,"shards_acknowledged":true,"index":"elktest"}
Using index
Insert a “Hello, world” test document to verify that your new index is available:
curl --header "content-type: application/JSON" -XPOST http://localhost:9200/elktest/test/hello -d '{"title":"Hello world"}'
[hdpsysuser@hdpmaster bin]$ curl --header "content-type: application/JSON" -XPOST http://localhost:9200/elktest/test/hello -d '{"title":"Hello world"}'
{"_index":"elktest","_type":"test","_id":"hello","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
Retrieve document
You may view this document by issuing a GET request to the document endpoint:
curl -XGET 'http://localhost:9200/elktest/test/hello'
[hdpsysuser@hdpmaster bin]$ curl -XGET 'http://localhost:9200/elktest/test/hello'
{"_index":"elktest","_type":"test","_id":"hello","_version":1,"found":true,"_source":{"title":"Hello world"}}
Basic health check
[hdpsysuser@hdpmaster bin]$ curl -XGET 'localhost:9200/_cat/health?v&pretty'
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1523186823 11:27:03 elasticsearch yellow 1 1 5 5 0 0 5 0 - 50.0%
List of nodes in cluster
[hdpsysuser@hdpmaster bin]$ curl -XGET 'localhost:9200/_cat/nodes?v&pretty'
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
127.0.0.1 15 67 5 0.00 0.01 0.05 mdi * w_ykJGL
List of indices
[hdpsysuser@hdpmaster bin]$ curl -XGET 'localhost:9200/_cat/indices?v&pretty'
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open elktest PAqzGiWRTeeSea8qAvAN3A 5 1 1 0 4.4kb 4.4kb
Create an index named "customer" and then list all the indices again:
[hdpsysuser@hdpmaster bin]$ curl -XPUT 'localhost:9200/customer?pretty&pretty'
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "customer"
}
[hdpsysuser@hdpmaster bin]$ curl -XGET 'localhost:9200/_cat/indices?v&pretty'
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open elktest PAqzGiWRTeeSea8qAvAN3A 5 1 1 0 4.4kb 4.4kb
yellow open customer f-9cBDSlQ-aNmAWhHhj_rA 5 1 0 0 230b 230b
Now let's put something into the customer index by indexing a simple customer document:
curl -XPUT 'localhost:9200/customer/_doc/1?pretty&pretty' -H 'Content-Type: application/json' -d'
{
"name": "Inam Bukhari"
}
'
[hdpsysuser@hdpmaster bin]$ curl -XPUT 'localhost:9200/customer/_doc/1?pretty&pretty' -H 'Content-Type: application/json' -d'
> {
> "name": "Inam Bukhari"
> }
> '
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
Retrieve document
Retrieve the document that we just indexed:
curl -XGET 'localhost:9200/customer/_doc/1?pretty&pretty'
[hdpsysuser@hdpmaster bin]$ curl -XGET 'localhost:9200/customer/_doc/1?pretty&pretty'
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "Inam Bukhari"
}
}
Delete the index
curl -XDELETE 'localhost:9200/elktest?pretty&pretty'
[hdpsysuser@hdpmaster bin]$ curl -XDELETE 'localhost:9200/elktest?pretty&pretty'
{
"acknowledged" : true
}
[hdpsysuser@hdpmaster bin]$ curl -XGET 'localhost:9200/_cat/indices?v&pretty'
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open customer f-9cBDSlQ-aNmAWhHhj_rA 5 1 1 0 4.3kb 4.3kb
Replace the existing doc
Replace the existing document with a new one using the same ID (e.g., 1), and note the version info:
curl -XPUT 'localhost:9200/customer/_doc/1?pretty&pretty' -H 'Content-Type: application/json' -d'
{
"name": "Inam Bukhari"
}
'
[hdpsysuser@hdpmaster bin]$ curl -XPUT 'localhost:9200/customer/_doc/1?pretty&pretty' -H 'Content-Type: application/json' -d'
> {
> "name": "Inam Bukhari"
> }
> '
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1
}
Update documents
curl -XPOST 'localhost:9200/customer/_doc/1/_update?pretty&pretty' -H 'Content-Type: application/json' -d'
{
"doc": { "name": "Inaam Bukhary" }
}
'
[hdpsysuser@hdpmaster bin]$ curl -XPOST 'localhost:9200/customer/_doc/1/_update?pretty&pretty' -H 'Content-Type: application/json' -d'
> {
> "doc": { "name": "Inaam Bukhary" }
> }
> '
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 3,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1
}
curl -XPOST 'localhost:9200/customer/_doc/1/_update?pretty&pretty' -H 'Content-Type: application/json' -d'
{
"doc": { "name": "Inaam Bukhary", "age": 30 }
}
'
[hdpsysuser@hdpmaster bin]$ curl -XPOST 'localhost:9200/customer/_doc/1/_update?pretty&pretty' -H 'Content-Type: application/json' -d'
> {
> "doc": { "name": "Inaam Bukhary", "age": 30 }
> }
> '
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 4,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 3,
"_primary_term" : 1
}
Update by scripts
Updates can also be performed by using simple scripts.
curl -XPOST 'localhost:9200/customer/_doc/1/_update?pretty&pretty' -H 'Content-Type: application/json' -d'
{
"script" : "ctx._source.age += 5"
}
'
[hdpsysuser@hdpmaster bin]$ curl -XPOST 'localhost:9200/customer/_doc/1/_update?pretty&pretty' -H 'Content-Type: application/json' -d'
> {
> "script" : "ctx._source.age += 5"
> }
> '
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 5,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 4,
"_primary_term" : 1
}
Deleting a document
[hdpsysuser@hdpmaster bin]$ curl -XDELETE 'localhost:9200/customer/_doc/1?pretty&pretty'
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 6,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 5,
"_primary_term" : 1
}
Delete by Query
POST logstash-2018.04.23/_delete_by_query
{
"query": {
"match": {
"message": "HDFS"
}
}
}
POST logstash-2018.04.23/_delete_by_query
{
"query": {
"match_all": {}
}
}
POST logstash-2018.04.23/_delete_by_query
{
"query": {
"match": {"@timestamp":"2018-04-23"}
}
}
Batch Operations
Elasticsearch also provides the ability to perform any of the above operations in batches using the _bulk API.
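The body of a _bulk request is newline-delimited JSON (NDJSON): an action line followed, for index and update actions, by a source line. You can assemble and sanity-check such a body locally before sending it; the file path below is just for the demo:

```shell
# Assemble an NDJSON bulk body: one action line, then one source line, per doc.
cat > /tmp/bulk.ndjson <<'EOF'
{"index":{"_id":"1"}}
{"name": "Inam Bukhari"}
{"index":{"_id":"2"}}
{"name": "Abuzar Bukhari"}
EOF
# Each line must be a standalone JSON object; validate every line with Python.
while IFS= read -r line; do
  printf '%s' "$line" | python3 -c 'import json,sys; json.load(sys.stdin)' || exit 1
done < /tmp/bulk.ndjson
echo "bulk body OK"
```

A file like this is best sent with curl's --data-binary "@/tmp/bulk.ndjson" (as done with accounts.json later), because plain -d strips the newlines that the bulk format depends on.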
curl -XPOST 'localhost:9200/customer/_doc/_bulk?pretty&pretty' -H 'Content-Type: application/json' -d'
{"index":{"_id":"1"}}
{"name": "Inam Bukhari" }
{"index":{"_id":"2"}}
{"name": "Abuzar Bukhari" }
'
[hdpsysuser@hdpmaster bin]$ curl -XPOST 'localhost:9200/customer/_doc/_bulk?pretty&pretty' -H 'Content-Type: application/json' -d'
> {"index":{"_id":"1"}}
> {"name": "Inam Bukhari" }
> {"index":{"_id":"2"}}
> {"name": "Abuzar Bukhari" }
> '
{
"took" : 41,
"errors" : false,
"items" : [
{
"index" : {
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 6,
"_primary_term" : 1,
"status" : 201
}
},
{
"index" : {
"_index" : "customer",
"_type" : "_doc",
"_id" : "2",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1,
"status" : 201
}
}
]
}
curl -XPOST 'localhost:9200/customer/_doc/_bulk?pretty&pretty' -H 'Content-Type: application/json' -d'
{"update":{"_id":"1"}}
{"doc": { "name": "Inaam Bukhary" } }
{"delete":{"_id":"2"}}
'
[hdpsysuser@hdpmaster bin]$ curl -XPOST 'localhost:9200/customer/_doc/_bulk?pretty&pretty' -H 'Content-Type: application/json' -d'
> {"update":{"_id":"1"}}
> {"doc": { "name": "Inaam Bukhary" } }
> {"delete":{"_id":"2"}}
> '
{
"took" : 45,
"errors" : false,
"items" : [
{
"update" : {
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 7,
"_primary_term" : 1,
"status" : 200
}
},
{
"delete" : {
"_index" : "customer",
"_type" : "_doc",
"_id" : "2",
"_version" : 2,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1,
"status" : 200
}
}
]
}
Exploring Data
Loading Dataset
curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_doc/_bulk?pretty&refresh" --data-binary "@/data/accounts.json"
[hdpsysuser@hdpmaster bin]$ curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_doc/_bulk?pretty&refresh" --data-binary "@/data/accounts.json"
[hdpsysuser@hdpmaster bin]$ curl "localhost:9200/_cat/indices?v"
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open customer f-9cBDSlQ-aNmAWhHhj_rA 5 1 1 0 4.4kb 4.4kb
yellow open bank ymIPSIiDS9i_iiQeM6mD1w 5 1 1000 0 498.3kb 498.3kb
Simple searches
[hdpsysuser@hdpmaster bin]$ curl -XGET 'localhost:9200/bank/_search?q=*&sort=account_number:asc&pretty&pretty'
[hdpsysuser@hdpmaster bin]$ curl -XGET 'localhost:9200/bank/_search?q=_id:9&sort=account_number:asc&pretty&pretty'
-- the same search as above, using the alternative request-body method
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": { "match_all": {} },
"sort": [
{ "account_number": "asc" }
]
}
'
Query Language
We can also pass other parameters to influence the search results. In the example in the section above we passed in sort; here we pass in size (like LIMIT in SQL):
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": { "match_all": {} },
"size": 1
}
'
-- documents 10 through 19:
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": { "match_all": {} },
"from": 10,
"size": 10
}
'
-- matches all documents, sorts the results by account balance in descending order, and returns the top 10 (default size) documents:
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": { "match_all": {} },
"sort": { "balance": { "order": "desc" } }
}
'
Executing Searches
-- return two fields, account_number and balance (inside of _source), from the search:
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": { "match_all": {} },
"_source": ["account_number", "balance"]
}
'
-- basic fielded search query
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": { "match": { "account_number": 20 } }
}
'
-- returns all accounts containing the term "mill" in the address
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": { "match": { "address": "mill" } }
}
'
-- returns all accounts containing the term "mill" or "lane" in the address:
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": { "match": { "address": "mill lane" } }
}
'
-- a variant of match (match_phrase) that returns all accounts containing the phrase "mill lane" in the address:
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": { "match_phrase": { "address": "mill lane" } }
}
'
-- The bool query allows us to compose smaller queries into bigger queries using boolean logic.
-- returns all accounts containing "mill" and "lane" in the address:
-- bool must clause specifies all the queries that must be true for a document to be considered a match.
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
}
'
-- the bool should clause specifies a list of queries, at least one of which must be true for a document to be considered a match.
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"should": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
}
'
-- the bool must_not clause specifies a list of queries that must all be false for a document to be considered a match.
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must_not": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
}
'
-- We can combine must, should, and must_not clauses simultaneously inside a bool query.
-- below returns all accounts of anybody who is 40 years old but doesn’t live in ID:
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must": [
{ "match": { "age": "40" } }
],
"must_not": [
{ "match": { "state": "ID" } }
]
}
}
}
'
Executing Filters
-- return all accounts with balances between 20000 and 30000, inclusive:
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"must": { "match_all": {} },
"filter": {
"range": {
"balance": {
"gte": 20000,
"lte": 30000
}
}
}
}
}
}
'
Executing Aggregation
-- group all the accounts by state, and then return the top 10 (default) states sorted by count descending (also default):
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "state.keyword"
}
}
}
}
'
-- calculates the average account balance by state (again only for the top 10 states sorted by count in descending order):
curl -XGET 'localhost:9200/bank/_search?pretty' -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "state.keyword"
},
"aggs": {
"average_balance": {
"avg": {
"field": "balance"
}
}
}
}
}
}
'