What is Hive?
Apache Hive (which originated at Facebook) is a data warehouse system built on top of Hadoop to manage large datasets residing in HDFS. Hive is designed to enable easy data summarization, ad-hoc querying, and analysis of large volumes of data using SQL. At the same time, Hive's SQL gives users multiple places to integrate their own functionality for custom analysis, such as User Defined Functions (UDFs). It is not designed for online transaction processing and is best used for traditional data warehousing tasks.
Note that Hadoop is a batch processing system, and Hadoop jobs tend to have high latency and incur substantial overheads in job submission and scheduling. As a result, latency for Hive queries is generally very high (minutes) even when the data sets involved are very small (say, a few hundred megabytes). Hive therefore cannot be compared with systems such as Oracle, where analyses are conducted on a significantly smaller amount of data but proceed much more iteratively, with response times between iterations of less than a few minutes. Hive aims to provide acceptable (but not optimal) latency for interactive data browsing, queries over small data sets, or test queries. Hive does not offer real-time queries or row-level updates; it is best used for batch jobs over large sets of immutable data (like web logs).
Hive needs a metastore for access to its tables and partitions, so it requires an RDBMS, e.g. Oracle or MySQL. We will use Derby instead. Hive provides a SQL dialect, called Hive Query Language (abbreviated HiveQL or just HQL), for querying data stored in a Hadoop cluster.
Environment
- Already running 3 node Hadoop cluster 2.7.3 (1 name node, 2 data nodes) on RHEL 7.3
- We have 1 more node (hadoopedge1) as Hadoop client
- We want to install Hive on hadoopedge1 node for analyzing data in HDFS using SQL
Installing Hive
Step 1: Check JAVA Installation
[hdpclient@hadoopedge1 ~]$ java -version
openjdk version "1.8.0_102"
OpenJDK Runtime Environment (build 1.8.0_102-b14)
Step 2: Verifying Hadoop Installation
[hdpclient@hadoopedge1 ~]$ hadoop version
Hadoop 2.7.3
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
Compiled by root on 2016-08-18T01:41Z
Compiled with protoc 2.5.0
From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
This command was run using /usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.jar
Step 3: Downloading and extracting Hive
Download Hive from http://www-us.apache.org/dist/hive/hive-2.1.1/ and place the downloaded file in the folder /usr/hadoopsw.
Extracting and verifying the Hive archive
The following commands verify the download and extract the Hive archive:
[root@hadoopedge1 hadoopsw]# pwd
/usr/hadoopsw
[root@hadoopedge1 hadoopsw]# ls -ltr
total 355900
drwxr-xr-x. 9 hdpclient hadoop_edge 149 Aug 18 04:49 hadoop-2.7.3
-rw-r--r--. 1 root root 214092195 Feb 1 12:08 hadoop-2.7.3.tar.gz
-rw-r--r--. 1 root root 149756462 Feb 12 14:07 apache-hive-2.1.1-bin.tar.gz
-rw-r--r--. 1 root root 587484 Feb 12 14:10 db-derby-10.13.1.1-bin.tar.gz
[root@hadoopedge1 hadoopsw]# tar zxvf apache-hive-2.1.1-bin.tar.gz
[root@hadoopedge1 hadoopsw]# ls -ltr
total 355900
drwxr-xr-x. 9 hdpclient hadoop_edge 149 Aug 18 04:49 hadoop-2.7.3
-rw-r--r--. 1 root root 214092195 Feb 1 12:08 hadoop-2.7.3.tar.gz
-rw-r--r--. 1 root root 149756462 Feb 12 14:07 apache-hive-2.1.1-bin.tar.gz
-rw-r--r--. 1 root root 587484 Feb 12 14:10 db-derby-10.13.1.1-bin.tar.gz
drwxr-xr-x. 9 root root 171 Feb 12 18:08 apache-hive-2.1.1-bin
[root@hadoopedge1 hadoopsw]# chown -R hdpclient:hadoop_edge apache-hive-2.1.1-bin
[root@hadoopedge1 hadoopsw]# ls -ltr
total 355900
drwxr-xr-x. 9 hdpclient hadoop_edge 149 Aug 18 04:49 hadoop-2.7.3
-rw-r--r--. 1 root root 214092195 Feb 1 12:08 hadoop-2.7.3.tar.gz
-rw-r--r--. 1 root root 149756462 Feb 12 14:07 apache-hive-2.1.1-bin.tar.gz
-rw-r--r--. 1 root root 587484 Feb 12 14:10 db-derby-10.13.1.1-bin.tar.gz
drwxr-xr-x. 9 hdpclient hadoop_edge 171 Feb 12 18:08 apache-hive-2.1.1-bin
Step 4: Setting up environment for Hive
You need to set up the Hive environment by appending the following lines to .bash_profile
[hdpclient@hadoopedge1 ~]$ vi .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/.local/bin:$HOME/bin
export PATH
##Hadoop Variables
export HADOOP_HOME=/usr/hadoopsw/hadoop-2.7.3
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
#Hive Variables
export HIVE_HOME=/usr/hadoopsw/apache-hive-2.1.1-bin
export PATH=$PATH:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib/*:.
export CLASSPATH=$CLASSPATH:$HIVE_HOME/lib/*:.
[hdpclient@hadoopedge1 ~]$ source .bash_profile
[hdpclient@hadoopedge1 ~]$ echo $HIVE_HOME
/usr/hadoopsw/apache-hive-2.1.1-bin
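Beyond echoing individual variables, the profile can be sanity-checked programmatically. The snippet below is a minimal, hypothetical Python helper (not part of Hive or Hadoop) that reports which expected variables are missing from an environment mapping; on the edge node itself you would pass `os.environ` instead of the sample dict:

```python
# Minimal sketch: report which expected variables are missing from an
# environment mapping. The variable names mirror the .bash_profile above;
# the helper itself is hypothetical, for illustration only.
def missing_vars(env, required):
    """Return the names in `required` that are absent or empty in `env`."""
    return [name for name in required if not env.get(name)]

required = ["HADOOP_HOME", "HIVE_HOME"]

# Example against a partial environment: HIVE_HOME is not set yet.
partial = {"HADOOP_HOME": "/usr/hadoopsw/hadoop-2.7.3"}
print(missing_vars(partial, required))  # ['HIVE_HOME']
```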
Hive software installation is complete. Now you need an external database server to configure the metastore. We will use the Apache Derby database.
Installing/Configuring Apache Derby
Apache Derby, an Apache DB subproject, is a relational database implemented in Java. Its footprint is so small that you can easily embed it in any Java-based solution. In addition to its embedded framework, Derby supports a more familiar client/server framework with the Derby Network Server.
Step 1: Extracting and verifying Derby archive
Download apache derby from http://www-us.apache.org/dist//db/derby/db-derby-10.13.1.1/db-derby-10.13.1.1-bin.tar.gz
The following commands are used for extracting and verifying the Derby archive:
[root@hadoopedge1 hadoopsw]# tar zxvf db-derby-10.13.1.1-bin.tar.gz
[root@hadoopedge1 hadoopsw]# ls -ltr
total 373416
drwxr-xr-x. 9 hdpclient hadoop_edge 149 Aug 18 04:49 hadoop-2.7.3
-rw-r--r--. 1 root root 214092195 Feb 1 12:08 hadoop-2.7.3.tar.gz
-rw-r--r--. 1 root root 149756462 Feb 12 14:07 apache-hive-2.1.1-bin.tar.gz
-rw-r--r--. 1 root root 18523564 Feb 12 15:53 db-derby-10.13.1.1-bin.tar.gz
drwxr-xr-x. 9 hdpclient hadoop_edge 171 Feb 12 18:08 apache-hive-2.1.1-bin
drwxr-xr-x. 8 root root 164 Feb 12 18:32 db-derby-10.13.1.1-bin
[root@hadoopedge1 hadoopsw]# chown -R hdpclient:hadoop_edge db-derby-10.13.1.1-bin
Step 2: Setting up environment for Derby
[hdpclient@hadoopedge1 ~]$ vi .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/.local/bin:$HOME/bin
export PATH
##Hadoop Variables
export HADOOP_HOME=/usr/hadoopsw/hadoop-2.7.3
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
#Hive Variables
export HIVE_HOME=/usr/hadoopsw/apache-hive-2.1.1-bin
export PATH=$PATH:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib/*:.
export CLASSPATH=$CLASSPATH:$HIVE_HOME/lib/*:.
#Derby Variables
export DERBY_HOME=/usr/hadoopsw/db-derby-10.13.1.1-bin
export PATH=$PATH:$DERBY_HOME/bin
export CLASSPATH=$CLASSPATH:$DERBY_HOME/lib/derby.jar:$DERBY_HOME/lib/derbyclient.jar:$DERBY_HOME/lib/derbytools.jar
export DERBY_OPTS=-Dij.protocol=jdbc:derby://hadoopedge1/
[hdpclient@hadoopedge1 ~]$ source .bash_profile
[hdpclient@hadoopedge1 ~]$ echo $DERBY_HOME
/usr/hadoopsw/db-derby-10.13.1.1-bin
Edit the hosts file and add an entry for the hadoopedge1 node where we are installing Hive.
-- added in hosts file
[hdpsysuser@hdpmaster ~]$ sudo vi /etc/hosts
[hdpsysuser@hdpmaster ~]$ cat /etc/hosts
192.168.44.170 hdpmaster
192.168.44.171 hdpslave1
192.168.44.172 hdpslave2
192.168.44.161 hadoopedge1
--run the ij utility located in $DERBY_HOME/bin
[hdpclient@hadoopedge1 bin]$ ./ij
ij version 10.13
ij>
Step 3: Run/verify the derby network server
[hdpclient@hadoopedge1 ~]$ /usr/hadoopsw/db-derby-10.13.1.1-bin/bin/startNetworkServer -h hadoopedge1 -p 1527 &
--get info about network derby server
[hdpclient@hadoopedge1 bin]$ ./NetworkServerControl sysinfo -h hadoopedge1
You can run the Derby database as a network server so that you can access the same database from any network node if you want.
--get db info on the Derby network server
[hdpclient@hadoopedge1 bin]$ ./dblook -d 'jdbc:derby://hadoopedge1/hive_metastore_db' -- created in next step
Step 4: Create database for Hive in derby database
[hdpclient@hadoopedge1 bin]$ ./ij
ij version 10.13
ij>
--create database on derby
ij> CONNECT 'jdbc:derby://hadoopedge1:1527/hive_metastore_db;create=true';
The hive_metastore_db database is now created under /home/hdpclient (the working directory of the user who started the network server).
--create table using ij for testing purpose
ij> CREATE TABLE TABLE1(ID INT PRIMARY KEY,NAME VARCHAR(14));
0 rows inserted/updated/deleted
-- Insert three records into the table.
ij> INSERT INTO TABLE1 VALUES
(100,'ONE HUNDRED'),(200,'TWO HUNDRED'),(300,'THREE HUNDRED');
3 rows inserted/updated/deleted
-- Select all of the records in the table.
ij> SELECT * FROM TABLE1;
ID |NAME
--------------------------
100 |ONE HUNDRED
200 |TWO HUNDRED
300 |THREE HUNDRED
3 rows selected
ij>
ij> SELECT * FROM TABLE1 WHERE ID=200;
ID |NAME
--------------------------
200 |TWO HUNDRED
1 row selected
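The Derby network JDBC URL used in the CONNECT statement above follows a fixed pattern. As a sketch, a small hypothetical helper can assemble it from the host, port, and database names used in this walkthrough:

```python
# Sketch: assemble a Derby network-server JDBC URL of the form
# jdbc:derby://<host>:<port>/<database>[;attribute=value...]
# The host, port, and database name below are the ones used in this
# walkthrough; the helper function itself is illustrative only.
def derby_url(host, database, port=1527, **attrs):
    url = f"jdbc:derby://{host}:{port}/{database}"
    for key, value in attrs.items():
        url += f";{key}={value}"
    return url

print(derby_url("hadoopedge1", "hive_metastore_db", create="true"))
# jdbc:derby://hadoopedge1:1527/hive_metastore_db;create=true
```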
OPTIONAL: Now that the Derby database (hive_metastore_db) is functioning, you can access it from any other machine if you want; you just need a client, or you can install Derby there and use the ij utility. I used the same Derby database from another machine in my network, hdpslave1, as I wanted to use it as my Hive client. I installed Hive and Derby on that node and then performed the steps below.
[hdpsysuser@hdpslave1 ~]$ tar -xzf db-derby-10.13.1.1-bin.tar.gz
[hdpsysuser@hdpslave1 ~]$ nc -v hadoopedge1 1527
Ncat: Version 6.40 ( http://nmap.org/ncat )
Ncat: Could not resolve hostname "hadoopedg1": Name or service not known. QUITTING.
--the hostname could not be resolved; edit the hosts file on hdpslave1, add the hadoopedge1 entry, and now it connects
[hdpsysuser@hdpslave1 ~]$ nc -v hadoopedge1 1527
Ncat: Version 6.40 ( http://nmap.org/ncat )
Ncat: Connected to 192.168.44.161:1527.
[hdpsysuser@hdpslave1 ~]$ cd /usr/hadoopsw/db-derby-10.13.1.1-bin/bin/
[hdpsysuser@hdpslave1 bin]$ ./NetworkServerControl ping -h hadoopedge1 -p 1527
Thu Feb 23 11:52:06 AST 2017 : Connection obtained for host: hadoopedge1, port number 1527.
[hdpsysuser@hdpslave1 bin]$ ./ij
ij version 10.13
ij>
ij> CONNECT 'jdbc:derby://hadoopedge1:1527/hive_metastore_db;';
ij> show tables;
TABLE_SCHEM |TABLE_NAME |REMARKS
------------------------------------------------------------------
SYS |SYSALIASES |
SYS |SYSCHECKS |
SYS |SYSCOLPERMS |
SYS |SYSCOLUMNS |
ij> select * from table1;
ID |NAME
--------------------------
100 |ONE HUNDRED
200 |TWO HUNDRED
300 |THREE HUNDRED
3 rows selected
ij>
Here you can see that the Derby database is accessible from a network node, so you can configure Hive on that node as well.
Configuring Metastore for Hive
Configuring the metastore means telling Hive where the database is stored. You can do this by editing the hive-site.xml file, which is in the $HIVE_HOME/conf directory.
Step 1: Set the hive-env.sh
To configure Hive with Hadoop, you need to edit the hive-env.sh file, which is placed in the $HIVE_HOME/conf directory. Use the following commands to change to the Hive config folder and copy the template file:
[hdpclient@hadoopedge1 ~]$ cd $HIVE_HOME/conf
[hdpclient@hadoopedge1 conf]$ cp hive-env.sh.template hive-env.sh
Edit the hive-env.sh file by appending the following line:
vi hive-env.sh
export HADOOP_HOME=/usr/hadoopsw/hadoop-2.7.3
Step 2: Set properties in hive-site.xml and initialize schema for hive
Edit hive-site.xml and append/modify the following lines between the <configuration> and </configuration> tags:
[hdpclient@hadoopedge1 conf]$ vi hive-site.xml
[hdpclient@hadoopedge1 ~]$ cat /usr/hadoopsw/apache-hive-2.1.1-bin/conf/hive-site.xml
<configuration>
<property>
<name>hive.metastore.local</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby://hadoopedge1:1527/hive_metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.apache.derby.jdbc.ClientDriver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
</configuration>
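A malformed hive-site.xml surfaces later only as a terse "[Fatal Error]" when Hive starts, so it is worth checking the file's well-formedness up front. A minimal sketch using only the Python standard library (the inline sample stands in for the real file at $HIVE_HOME/conf/hive-site.xml):

```python
# Sketch: verify a hive-site.xml fragment is well-formed XML and list its
# properties. The `sample` string below stands in for the real file.
import xml.etree.ElementTree as ET

sample = """<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
</configuration>"""

try:
    root = ET.fromstring(sample)
    props = {p.findtext("name"): p.findtext("value")
             for p in root.findall("property")}
    print(props)  # {'hive.metastore.warehouse.dir': '/user/hive/warehouse'}
except ET.ParseError as err:
    # A broken file (e.g. an unclosed <property>) fails here with the
    # line/column of the problem, much like Hive's "[Fatal Error]".
    print("malformed hive-site.xml:", err)
```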
[hdpsysuser@hdpmaster ~]$ hdfs dfs -mkdir -p /user/hive/warehouse
[hdpsysuser@hdpmaster ~]$ hdfs dfs -mkdir -p /tmp/hive
[hdpsysuser@hdpmaster ~]$ hdfs dfs -chmod 777 /tmp
[hdpsysuser@hdpmaster ~]$ hdfs dfs -chmod 777 /user/hive/warehouse
[hdpsysuser@hdpmaster ~]$ hdfs dfs -chmod 777 /tmp/hive
[hdpclient@hadoopedge1 ~]$ schematool -dbType derby -initSchema hive_metastore_db
which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/hdpclient/.local/bin:/home/hdpclient/bin:/usr/java/default/bin:/usr/hadoopsw/hadoop-2.7.3/sbin:/usr/hadoopsw/hadoop-2.7.3/bin:/usr/hadoopsw/apache-hive-2.1.1-bin/bin:/usr/hadoopsw/db-derby-10.13.1.1-bin/bin)
Metastore connection URL: jdbc:derby://hadoopedge1:1527/hive_metastore_db;create=true
Metastore Connection Driver : org.apache.derby.jdbc.ClientDriver
Metastore connection User: APP
Starting metastore schema initialization to 2.1.0
Initialization script hive-schema-2.1.0.derby.sql
Initialization script completed
schemaTool completed
Optional: Local storage for Hive (no Hadoop required if using Presto)
If you wish to use the local file system instead of HDFS, you can change the properties below. You may work with local storage if you use Presto as your query layer; this way you can also use other storage backends such as MinIO or Ceph.
--Config hive to use local storage hive-site.xml
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>file:///data/hivelocalfs</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>fs.default.name</name>
<value>file:///data/hivelocalfs</value>
</property>
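The file:// value above is an ordinary URI, and the local directory Hive would use can be seen by parsing it. A small illustration (the path is the one assumed in this example):

```python
# Illustration: the warehouse dir / fs.default.name value above is a
# file:// URI; parsing it yields the local directory it points at.
from urllib.parse import urlparse

uri = "file:///data/hivelocalfs"
parsed = urlparse(uri)
print(parsed.scheme)  # file
print(parsed.path)    # /data/hivelocalfs
```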
Optional: MySQL as metastore
You can use MySQL instead of Derby if you want.
Download the mysql-connector-java-5.0.5.jar file and copy it to the $HIVE_HOME/lib directory.
Create a database in MySQL to serve as the metastore for Hive.
[root@en01 hdpclient]# mysql -u root -p
mysql> create database hivemetastore;
mysql> CREATE USER 'hiveuser'@'en01' IDENTIFIED BY 'P@ssw0rd';
mysql> GRANT all on *.* to 'hiveuser'@'en01' identified by 'P@ssw0rd' with grant option;
mysql> flush privileges;
[hdpclient@en01 ~]$ schematool -initSchema -dbType mysql -userName hiveuser -passWord 'P@ssw0rd' -verbose
<configuration>
<property>
<name>hive.metastore.local</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://en01:3306/hivemetastore?createDatabaseIfNotExist=true</value>
<description>metadata is stored in a MySQL server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>MySQL JDBC driver class</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hiveuser</value>
<description>user name for connecting to mysql server </description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>P@ssw0rd</value>
<description>password for connecting to mysql server </description>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
<description> Setting this property to true will have HiveServer2 execute Hive operations as the user making the calls to it. </description>
</property>
</configuration>
Hive metastore service
The Hive metastore can be made available as a service. All you need to do is run the following command on the terminal of your machine:
hive --service metastore &
In the case of a remote metastore configuration, all clients connect to the metastore service to query the underlying datastore (MySQL, Oracle, and so on). The communication is done through the Thrift protocol. On the client side, a user needs to add at least the following configuration (note the property name is hive.metastore.uris) so the client can connect to the metastore service:
hive.metastore.uris=thrift://localhost:9083
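In Hive, the hive.metastore.uris property may hold one or more comma-separated thrift:// endpoints. As a sketch, a hypothetical helper can split such a value into (host, port) pairs:

```python
# Sketch: hive.metastore.uris may list one or more thrift:// endpoints,
# comma-separated; parse them into (host, port) pairs. The helper is
# illustrative only, not a Hive API.
from urllib.parse import urlparse

def parse_metastore_uris(value):
    endpoints = []
    for uri in value.split(","):
        parsed = urlparse(uri.strip())
        endpoints.append((parsed.hostname, parsed.port))
    return endpoints

print(parse_metastore_uris("thrift://localhost:9083"))
# [('localhost', 9083)]
```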
Step 3: Run hive now
[hdpclient@hadoopedge1 ~]$ hive
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/hadoopsw/.local/bin:/usr/hadoopsw/bin:/usr/hadoopsw/hadoop-2.7.3/sbin:/usr/hadoopsw/hadoop-2.7.3/bin:/usr/hadoopsw/.local/bin:/usr/hadoopsw/bin:/usr/hadoopsw/hadoop-2.7.3/sbin:/usr/hadoopsw/hadoop-2.7.3/bin:/usr/hadoopsw/db-derby-10.13.1.1-bin/bin:/usr/hadoopsw/.local/bin:/usr/hadoopsw/bin:/usr/hadoopsw/hadoop-2.7.3/sbin:/usr/hadoopsw/hadoop-2.7.3/bin:/usr/hadoopsw/apache-hive-2.1.1-bin/bin:/usr/hadoopsw/db-derby-10.13.1.1-bin/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hadoopsw/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
[Fatal Error] hive-site.xml:19:1: XML document structures must start and end within the same entity.
You got the SLF4J multiple-bindings warning because log4j-slf4j-impl-2.4.1.jar exists at two locations, both of which are on the classpath (under HIVE_HOME and HADOOP_HOME). The [Fatal Error] indicates malformed XML in hive-site.xml; review the file and make sure every <property> element is properly closed inside a single <configuration> element.
Delete log4j-slf4j-impl-2.4.1.jar from /usr/hadoopsw/apache-hive-2.1.1-bin/lib
[hdpclient@hadoopedge1 ~]$ rm /usr/hadoopsw/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar
Try again to run hive
[hdpclient@hadoopedge1 ~]$ hive
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/hadoopsw/.local/bin:/usr/hadoopsw/bin:/usr/hadoopsw/hadoop-2.7.3/sbin:/usr/hadoopsw/hadoop-2.7.3/bin:/usr/hadoopsw/.local/bin:/usr/hadoopsw/bin:/usr/hadoopsw/hadoop-2.7.3/sbin:/usr/hadoopsw/hadoop-2.7.3/bin:/usr/hadoopsw/db-derby-10.13.1.1-bin/bin:/usr/hadoopsw/.local/bin:/usr/hadoopsw/bin:/usr/hadoopsw/hadoop-2.7.3/sbin:/usr/hadoopsw/hadoop-2.7.3/bin:/usr/hadoopsw/apache-hive-2.1.1-bin/bin:/usr/hadoopsw/db-derby-10.13.1.1-bin/bin)
Logging initialized using configuration in jar:file:/usr/hadoopsw/apache-hive-2.1.1-bin/lib/hive-common-2.1.1.jar!/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:591)
... <<output truncated>>
... <<output truncated>>
Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "BONECP" plugin to create a ConnectionPool gave an error : The specified datastore driver ("org.apache.derby.jdbc.ClientDriver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
From the error, it seems there is an issue with the classpath.
[hdpclient@hadoopedge1 ~]$ ll /usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/derb*
ls: cannot access /usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/derb*: No such file or directory
The Hadoop common lib directory does not contain the required Derby client jar (derbyclient.jar), so copy it from the Derby distribution:
[hdpclient@hadoopedge1 ~]$ cp /usr/hadoopsw/db-derby-10.13.1.1-bin/lib/derbyclient.jar /usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/
[hdpclient@hadoopedge1 ~]$ ll /usr/hadoopsw/apache-hive-2.1.1-bin/lib/derb*
-rw-r--r--. 1 hdpsysuser hdpsysuser 2838580 Nov 28 21:11 /usr/hadoopsw/apache-hive-2.1.1-bin/lib/derby-10.10.2.0.jar
Run hive again now
[hdpsysuser@hdpslave1 bin]$ hive
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/hadoopsw/.local/bin:/usr/hadoopsw/bin:/usr/hadoopsw/hadoop-2.7.3/sbin:/usr/hadoopsw/hadoop-2.7.3/bin:/usr/hadoopsw/.local/bin:/usr/hadoopsw/bin:/usr/hadoopsw/hadoop-2.7.3/sbin:/usr/hadoopsw/hadoop-2.7.3/bin:/usr/hadoopsw/db-derby-10.13.1.1-bin/bin:/usr/hadoopsw/.local/bin:/usr/hadoopsw/bin:/usr/hadoopsw/hadoop-2.7.3/sbin:/usr/hadoopsw/hadoop-2.7.3/bin:/usr/hadoopsw/apache-hive-2.1.1-bin/bin:/usr/hadoopsw/db-derby-10.13.1.1-bin/bin)
Logging initialized using configuration in jar:file:/usr/hadoopsw/apache-hive-2.1.1-bin/lib/hive-common-2.1.1.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
Note: If you still encounter any hive related error then run the hive in debug mode to get more details eg;
[hdpclient@hadoopedge1 ~]$ hive -hiveconf hive.root.logger=DEBUG,console
hive> show tables;
OK
dept
Time taken: 1.189 seconds, Fetched: 1 row(s)
hive>
hive> select * from dept;
OK
1 Ali NULL
2 Zeeshan NULL
3 Saleem NULL
Time taken: 1.403 seconds, Fetched: 3 row(s)
hive>
Congratulations! Your Hive installation is working now.