Note: All posts take a practical approach and avoid lengthy theory. Everything has been tested on development servers; please don't try any post on production servers until you are sure.

Monday, August 08, 2022

Satisfy your search use cases using OpenSearch & OpenSearch Dashboards

OpenSearch is a distributed search and analytics engine that performs full-text searches with features such as search by field, search across multiple indices, boost fields, rank results by score, sort results by field, and aggregate results. You can think of it as the backend for a search application like Wikipedia or an online store. It organizes data into indices; each index is a collection of JSON documents.

In this post I'll show how to use OpenSearch without going into much theoretical detail, which can be found in its documentation.

OpenSearch is well-suited to the following use cases:
  • Log analytics
  • Real-time application monitoring
  • Clickstream analytics
  • Search backend


You can download OpenSearch from the location below:

https://opensearch.org/downloads.html

At the above link you will find the different components to install: OpenSearch (the search/analytics engine), OpenSearch Dashboards (the default visualization tool for data in OpenSearch), Logstash and Data Prepper (ingestion tools that process received events and send them to the OpenSearch engine), opensearch-cli (a command-line tool to manage an OpenSearch cluster), and the JDBC/ODBC drivers.

OpenSearch

Let's install and use OpenSearch first, then we'll do the same with the other components.

[hdpsysuser@hdpmaster sw]$ tar -zxf opensearch-2.1.0-linux-x64.tar.gz
[hdpsysuser@hdpmaster apps]$ mv opensearch-2.1.0 /data/apps

-- Config Location

vim /data/apps/opensearch-2.1.0/config/opensearch.yml

discovery.type: single-node


[hdpsysuser@hdpmaster apps]$ sudo useradd opensearch
[hdpsysuser@hdpmaster apps]$ sudo passwd opensearch
[hdpsysuser@hdpmaster apps]$ sudo chown opensearch:opensearch -R /data/apps/opensearch-2.1.0/


[hdpsysuser@hdpmaster apps]$ su - opensearch
[opensearch@hdpmaster ~]$ export OPENSEARCH_HOME=/data/apps/opensearch-2.1.0
[opensearch@hdpmaster ~]$ export PATH=$PATH:$OPENSEARCH_HOME/bin
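
Optionally, to keep these variables across sessions for the opensearch user, append them to ~/.bash_profile (a small convenience step, not something OpenSearch itself requires):

[opensearch@hdpmaster ~]$ echo 'export OPENSEARCH_HOME=/data/apps/opensearch-2.1.0' >> ~/.bash_profile
[opensearch@hdpmaster ~]$ echo 'export PATH=$PATH:$OPENSEARCH_HOME/bin' >> ~/.bash_profile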

Security Enablement

OpenSearch comes with built-in security, but you need to configure it before running OpenSearch. Let's install the demo configuration.

-- tools for administering the security setup
[opensearch@hdpmaster ~]$ cd /data/apps/opensearch-2.1.0/plugins/opensearch-security/tools/
[opensearch@hdpmaster tools]$ chmod +x install_demo_configuration.sh
[opensearch@hdpmaster tools]$ ./install_demo_configuration.sh
**************************************************************************
** This tool will be deprecated in the next major release of OpenSearch **
** https://github.com/opensearch-project/security/issues/1755           **
**************************************************************************
OpenSearch Security Demo Installer
 ** Warning: Do not use on production or public reachable systems **
Install demo certificates? [y/N] y
Initialize Security Modules? [y/N] y
Cluster mode requires maybe additional setup of:
  - Virtual memory (vm.max_map_count)

Enable cluster mode? [y/N] N
Basedir: /data/apps/opensearch-2.1.0
OpenSearch install type: .tar.gz on CentOS Linux release 7.9.2009 (Core)
OpenSearch config dir: /data/apps/opensearch-2.1.0/config
OpenSearch config file: /data/apps/opensearch-2.1.0/config/opensearch.yml
OpenSearch bin dir: /data/apps/opensearch-2.1.0/bin
OpenSearch plugins dir: /data/apps/opensearch-2.1.0/plugins
OpenSearch lib dir: /data/apps/opensearch-2.1.0/lib
Detected OpenSearch Version: x-content-2.1.0
Detected OpenSearch Security Version: 2.1.0.0

### Success
### Execute this script now on all your nodes and then start all nodes
### OpenSearch Security will be automatically initialized.
### If you like to change the runtime configuration
### change the files in ../../../config/opensearch-security and execute:
"/data/apps/opensearch-2.1.0/plugins/opensearch-security/tools/securityadmin.sh" -cd "/data/apps/opensearch-2.1.0/config/opensearch-security" -icl -key "/data/apps/opensearch-2.1.0/config/kirk-key.pem" -cert "/data/apps/opensearch-2.1.0/config/kirk.pem" -cacert "/data/apps/opensearch-2.1.0/config/root-ca.pem" -nhnv
### or run ./securityadmin_demo.sh
### To use the Security Plugin ConfigurationGUI
### To access your secured cluster open https://<hostname>:<HTTP port> and log in with admin/admin.
### (Ignore the SSL certificate warning because we installed self-signed demo certificates)

We tell the installer that we want to install the built-in demo TLS certificates (do not use them in production!) and also initialize the security module with the demo users and roles.

The script modified opensearch.yml and installed the following files in the config folder:

root-ca.pem: This is the certificate of the root CA that signed all other TLS certificates

esnode.pem: This is the certificate that this node uses when communicating with other nodes on the transport layer (inter-node traffic)

esnode-key.pem: The private key for the esnode.pem node certificate

kirk.pem: This is the admin TLS certificate used when making changes to the security configuration. This certificate gives you full access to the cluster

kirk-key.pem: The private key for the admin TLS certificate
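
If you want to inspect any of these certificates, a standard openssl check will show the subject, issuer and validity dates (a generic inspection command, not part of the demo installer; shown here for root-ca.pem):

[opensearch@hdpmaster tools]$ openssl x509 -in /data/apps/opensearch-2.1.0/config/root-ca.pem -noout -subject -issuer -dates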

[opensearch@hdpmaster tools]$ opensearch



Test on command line

[hdpsysuser@hdpmaster sw]$ curl -XGET https://hdpmaster:9200 -u admin:admin -k

{
  "name" : "hdpmaster",
  "cluster_name" : "opensearch",
  "cluster_uuid" : "c0TKAaa2SFa9YAQTwwTpuQ",
  "version" : {
    "distribution" : "opensearch",
    "number" : "2.1.0",
    "build_type" : "tar",
    "build_hash" : "388c80ad94529b1d9aad0a735c4740dce2932a32",
    "build_date" : "2022-06-30T21:31:04.823801692Z",
    "build_snapshot" : false,
    "lucene_version" : "9.2.0",
    "minimum_wire_compatibility_version" : "7.10.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "The OpenSearch Project: https://opensearch.org/"
}
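
Since each index is just a collection of JSON documents, you can also index a sample document and search it from the command line. A quick sketch (idxdemo is only an example index name, not something created earlier in this post):

[hdpsysuser@hdpmaster sw]$ curl -XPUT "https://hdpmaster:9200/idxdemo/_doc/1" -u admin:admin -k -H 'Content-Type: application/json' -d '{"title": "OpenSearch quick test", "views": 10}'
[hdpsysuser@hdpmaster sw]$ curl -XGET "https://hdpmaster:9200/idxdemo/_search?q=title:quick&pretty" -u admin:admin -k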



Access the OpenSearch URL from a browser: https://hdpmaster:9200/

The browser will complain that this self-signed root CA is not on the list of trusted CAs. Proceed anyway and provide the username/password, i.e., admin/admin.


[hdpsysuser@hdpmaster ~]$ netstat -an | grep :9200
tcp6       0      0 192.168.56.104:9200     :::*                    LISTEN

Test using your browser






Disable security


You might want to temporarily disable the security plugin to make testing or internal usage more straightforward. To disable the plugin, add the following line in opensearch.yml:

plugins.security.disabled: true
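
After restarting OpenSearch with this setting, the REST API should answer over plain HTTP without credentials; a quick sanity check:

[hdpsysuser@hdpmaster ~]$ curl -XGET "http://hdpmaster:9200/_cluster/health?pretty"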


If you want to disable it permanently, you need to remove the security plugins.


[hdpsysuser@hdpmaster ~]$ sudo /data/apps/opensearch-2.1.0/bin/opensearch-plugin list
[hdpsysuser@hdpmaster ~]$ sudo /data/apps/opensearch-2.1.0/bin/opensearch-plugin remove opensearch-security

-> removing [opensearch-security]...
-> preserving plugin config files [/data/apps/opensearch-2.1.0/config/opensearch-security] in case of upgrade; use --purge if not needed

[hdpsysuser@hdpmaster ~]$ sudo /data/apps/opensearch-dashboards-2.1.0/bin/opensearch-dashboards-plugin list

OpenSearch Dashboards should not be run as root. Use --allow-root to continue.
[opensearch@hdpmaster ~]$ /data/apps/opensearch-dashboards-2.1.0/bin/opensearch-dashboards-plugin list
alertingDashboards@2.1.0.0
anomalyDetectionDashboards@2.1.0.0
ganttChartDashboards@2.1.0.0
indexManagementDashboards@2.1.0.0
notificationsDashboards@2.1.0.0
observabilityDashboards@2.1.0.0
queryWorkbenchDashboards@2.1.0.0
reportsDashboards@2.1.0.0
securityDashboards@2.1.0.0



[opensearch@hdpmaster ~]$ /data/apps/opensearch-dashboards-2.1.0/bin/opensearch-dashboards-plugin remove securityDashboards
Removing securityDashboards...
Plugin removal complete
Restart opensearch and opensearch dashboards
[opensearch@hdpmaster ~]$ opensearch
[opensearch@hdpmaster ~]$ opensearch-dashboards



OpenSearch Dashboards

Install OpenSearch Dashboards after downloading it, and set up its configuration.

[hdpsysuser@hdpmaster sw]$ tar -xvf opensearch-dashboards-2.1.0-linux-x64.tar.gz
[hdpsysuser@hdpmaster sw]$ mv opensearch-dashboards-2.1.0 /data/apps
[hdpsysuser@hdpmaster apps]$ sudo chown -R opensearch:opensearch /data/apps/opensearch-dashboards-2.1.0/
[hdpsysuser@hdpmaster apps]$ su - opensearch
[opensearch@hdpmaster ~]$ export OPENSEARCH_DASHBOARDS_HOME=/data/apps/opensearch-dashboards-2.1.0
[opensearch@hdpmaster ~]$ export PATH=$PATH:$OPENSEARCH_DASHBOARDS_HOME/bin

-- Config Location

vim /data/apps/opensearch-dashboards-2.1.0/config/opensearch_dashboards.yml

server.host: "hdpmaster"
server.name: hdpmaster
opensearch.hosts: ["http://hdpmaster:9200"]

[opensearch@hdpmaster ~]$ opensearch-dashboards


Access http://hdpmaster:5601/
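
You can also check Dashboards from the command line via its status endpoint (assuming the server settings above; /api/status is the standard OpenSearch Dashboards status API):

[opensearch@hdpmaster ~]$ curl -XGET "http://hdpmaster:5601/api/status"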





Logstash

Logstash is a real-time event processing engine. You can send events to Logstash from many different sources. Logstash processes the events and sends them to one or more destinations, e.g., OpenSearch. Sending events to Logstash lets you decouple event processing from your app: your app only needs to send events to Logstash and doesn't need to know anything about what happens to the events afterwards.

[hdpsysuser@hdpmaster sw]$ tar -xvf logstash-oss-with-opensearch-output-plugin-7.16.3-linux-x64.tar.gz
[hdpsysuser@hdpmaster sw]$ mv logstash-7.16.3/ /data/apps/
[hdpsysuser@hdpmaster sw]$ sudo chown opensearch:opensearch -R /data/apps/logstash-7.16.3/
[opensearch@hdpmaster ~]$ export LS_HOME=/data/apps/logstash-7.16.3
[opensearch@hdpmaster ~]$ export PATH=$LS_HOME/bin:$PATH

-- Test installation
Use the -e argument to pass a pipeline configuration directly to the Logstash binary. In this case, stdin is the input plugin and stdout is the output plugin:

[opensearch@hdpmaster ~]$ logstash -e "input { stdin { } } output { stdout { } }"

Hi world
{
       "message" => "Hi world",
    "@timestamp" => 2022-08-07T11:02:24.953Z,
      "@version" => "1",
          "host" => "hdpmaster"
}

Write to OpenSearch (config below):
vim /data/apps/logstash-7.16.3/config/logstash-test.conf

input {
  beats {
    port => 5044
  }
}#input

filter {

# example log generation below
# [opensearch@hdpmaster ~]$ echo 192.168.0.3 - - [2022-08-07T00:39:02.912Z] GET /opensearch/opensearch-1.0.0.deb_1 200 6219 >> /data/apps/logstash-7.16.3/testiis.log

  # parse the example log line
  grok {
    match => { "message" => "%{IPORHOST:ip_address} %{USER:identity} %{USER:auth} \[%{TIMESTAMP_ISO8601:log_timestamp}\] %{WORD:http_method} %{URIPATH:uri_path} %{NUMBER:http_status} %{NUMBER:num_bytes}" }
  }

}#filter

output {
  stdout{ }

  opensearch {
     hosts => ["https://hdpmaster:9200"]
     #index => "opensearch_dashboards_sample_data_logs"
     index => "idxtest1"
     user => "admin"
     password => "admin"
     ssl => true
     ssl_certificate_verification => false
   }
}#output


[opensearch@hdpmaster ~]$ logstash -f $LS_HOME/config/logstash-test.conf --config.reload.automatic

Beats

Beats send data to Logstash. It is assumed that Filebeat has already been installed on the system and is monitoring one file, i.e., testiis.log.
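
For reference, a minimal filebeat.yml along these lines would cover this scenario (a sketch only; the log path and Logstash host/port are assumptions based on this setup):

-- /etc/filebeat/filebeat.yml (minimal sketch)
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /data/apps/logstash-7.16.3/testiis.log

output.logstash:
  hosts: ["hdpmaster:5044"]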

[hdpsysuser@hdpmaster ~]$ sudo systemctl start filebeat.service

Write a test event to the file; Filebeat will forward it to Logstash, which will process the event and send it on to OpenSearch.

[opensearch@hdpmaster ~]$ echo 192.168.0.3 - - [2022-08-07T00:39:02.912Z] GET /opensearch/opensearch-1.0.0.deb_1 200 6219 >> /data/apps/logstash-7.16.3/testiis.log
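
You can then confirm the event landed in OpenSearch by searching the index configured in the Logstash output above:

[opensearch@hdpmaster ~]$ curl -XGET "https://hdpmaster:9200/idxtest1/_search?pretty" -u admin:admin -k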

OpenSearch CLI

Download and extract the appropriate installation package for your computer.
[hdpsysuser@hdpmaster sw]$ unzip opensearch-cli-1.1.0-linux-x64.zip
[hdpsysuser@hdpmaster sw]$ sudo mv opensearch-cli /data/apps/opensearch-2.1.0/bin/
[opensearch@hdpmaster bin]$ opensearch-cli --version
opensearch-cli version 1.1.0 linux/amd64

-- create a profile, which lets you easily switch between different clusters and user credentials

[opensearch@hdpmaster bin]$ opensearch-cli profile create --auth-type basic --endpoint https://hdpmaster:9200 --name local-profile
Username: admin
Password:
Profile created successfully.

-- request to the OpenSearch CAT API
[opensearch@hdpmaster bin]$ opensearch-cli curl get --path _cat/plugins --profile local-profile
hdpmaster opensearch-alerting                  2.1.0.0
hdpmaster opensearch-anomaly-detection         2.1.0.0
hdpmaster opensearch-asynchronous-search       2.1.0.0
hdpmaster opensearch-cross-cluster-replication 2.1.0.0
hdpmaster opensearch-index-management          2.1.0.0
hdpmaster opensearch-job-scheduler             2.1.0.0
hdpmaster opensearch-knn                       2.1.0.0
hdpmaster opensearch-ml                        2.1.0.0
hdpmaster opensearch-notifications             2.1.0.0
hdpmaster opensearch-notifications-core        2.1.0.0
hdpmaster opensearch-observability             2.1.0.0
hdpmaster opensearch-performance-analyzer      2.1.0.0
hdpmaster opensearch-reports-scheduler         2.1.0.0
hdpmaster opensearch-security                  2.1.0.0
hdpmaster opensearch-sql                       2.1.0.0

-- retrieves information about a detector
[opensearch@hdpmaster bin]$ opensearch-cli ad get my-detector --profile local-profile

-- help
[opensearch@hdpmaster bin]$ opensearch-cli -h
[opensearch@hdpmaster bin]$ opensearch-cli ad -h

Data Prepper

Data Prepper is an independent component, not an OpenSearch plugin, that converts data for use with OpenSearch.

[hdpsysuser@hdpmaster sw]$ tar -xvf opensearch-data-prepper-jdk-1.5.1-linux-x64.tar.gz
[hdpsysuser@hdpmaster sw]$ sudo mv opensearch-data-prepper-jdk-1.5.1-linux-x64 /data/apps
[hdpsysuser@hdpmaster sw]$ sudo chown opensearch:opensearch -R /data/apps/opensearch-data-prepper-jdk-1.5.1-linux-x64/


-- Define a pipeline
Create a Data Prepper pipeline file, pipelines.yaml, with a source (random UUIDs) sending data to a sink (stdout). Each pipeline is a combination of a source, a buffer (which stores data as it passes through the pipeline), zero or more preppers, and one or more sinks.

vim /data/apps/opensearch-data-prepper-jdk-1.5.1-linux-x64/config/pipelines.yaml
simple-sample-pipeline:
  workers: 2
  delay: "5000"
  source:
    random:
  sink:
    - stdout:

Example 2:
simple-sample-pipeline:
  workers: 2 # the number of workers
  delay: 5000 # in milliseconds, how long workers wait between read attempts
  source:
    random:
  buffer:
    bounded_blocking:
      buffer_size: 1024 # max number of records the buffer accepts
      batch_size: 256 # max number of records the buffer drains after each read
  processor:
    - string_converter:
        upper_case: true
  sink:
    - stdout:

Example 3:
log-pipeline:
  source:
    http:
      ssl: false
      port: 5040
  processor:
    - grok:
        match:
          log: [ "%{IPORHOST:ip_address} %{USER:identity} %{USER:auth} %{TIMESTAMP_ISO8601:log_timestamp} %{WORD:http_method} %{URIPATH:uri_path} %{NUMBER:http_status} %{NUMBER:num_bytes}" ]
  sink:
    - opensearch:
        hosts: [ "https://hdpmaster:9200" ]
        insecure: true
        username: admin
        password: admin
        index: idx-prepper

-- Run data prepper pipeline now
[opensearch@hdpmaster opensearch-data-prepper-jdk-1.5.1-linux-x64]$ ./data-prepper-tar-install.sh

Error: Paths to pipeline and data-prepper configuration files are required. Example:
./data-prepper-tar-install.sh config/example-pipelines.yaml config/example-data-prepper-config.yaml


[opensearch@hdpmaster opensearch-data-prepper-jdk-1.5.1-linux-x64]$ ./data-prepper-tar-install.sh /data/apps/opensearch-data-prepper-jdk-1.5.1-linux-x64/config/pipelines.yaml /data/apps/opensearch-data-prepper-jdk-1.5.1-linux-x64/config/example-data-prepper-config.yaml
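
With the Example 3 pipeline running, you can push a test event into the http source from the command line (a sketch; /log/ingest is Data Prepper's default ingest path for the http source, and the log line follows the grok pattern above):

[opensearch@hdpmaster ~]$ curl -X POST "http://hdpmaster:5040/log/ingest" -H "Content-Type: application/json" -d '[{"log": "192.168.0.3 - - 2022-08-07T00:39:02.912Z GET /opensearch/opensearch-1.0.0.deb_1 200 6219"}]'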

Troubleshooting

Virtual memory

ERROR: [2] bootstrap checks failed

[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

[2]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_cluster_manager_nodes / cluster.initial_master_nodes] must be configured

OpenSearch uses an mmapfs directory by default to store its indices. The default operating system limit on mmap counts is likely to be too low, which may result in out-of-memory exceptions.

On Linux, you can increase the limits by running the following command as root:

sysctl -w vm.max_map_count=262144

To set this value permanently, update the vm.max_map_count setting in /etc/sysctl.conf. To verify after rebooting, run sysctl vm.max_map_count.

[hdpsysuser@hdpmaster ~]$ sudo sysctl -w vm.max_map_count=262144

vm.max_map_count = 262144
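
To make the change survive reboots (the permanent setting mentioned above):

[hdpsysuser@hdpmaster ~]$ echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
[hdpsysuser@hdpmaster ~]$ sudo sysctl -p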

ERROR

[1]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_cluster_manager_nodes / cluster.initial_master_nodes] must be configured

If you are running OpenSearch locally (single node), or with just a single node in the cloud, use the config below in your opensearch.yml to avoid the production check and make it work:

discovery.type: single-node
