Please see my other blog for Oracle EBusiness Suite Posts - EBMentors

Note: All the posts are based on a practical approach, avoiding lengthy theory. All have been tested on development servers. Please don't test any post on production servers until you are sure.

Saturday, February 18, 2017

Setting up single node Apache Hadoop Cluster 2.7.3 on RHEL 7.3

The purpose of this post is to set up a single-node Apache Hadoop 2.7.3 cluster for BigData enthusiasts who want to learn parallel computation on large datasets. Knowledge of Linux is a prerequisite for this post.

Below is the information for the environment I'm using (on RHV, Red Hat Virtualization):
OS: Red Hat Enterprise Linux Server 7.3
Master Node:   hostname -> hdpmaster     IP -> 192.166.44.170     root pwd -> hadoop123

Software Locations : 

Java installation -> /usr/java
Hadoop installation -> /usr/hadoopsw

Hadoop File System Storage Locations:

DataNode -> /opt/volume/hadoop_tmp/hdfs/datanode
NameNode -> /opt/volume/hadoop_tmp/hdfs/namenode

Hadoop System related objects:

Hadoop system user group -> hadoop_grp
Dedicated hadoop system user -> hdpsysuser

1- Preparing the OS environment

a) Check OS release
$ cat /etc/*-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.3 (Maipo)"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.3 (Maipo)"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
Red Hat Enterprise Linux Server release 7.3 (Maipo)

Red Hat Enterprise Linux Server release 7.3 (Maipo)

b) Set Hostname of machine
[root@hdpmaster ~]# hostnamectl set-hostname hdpmaster
[root@hdpmaster ~]# hostname
hdpmaster

c) Add machine name & IP in hosts file
[root@hdpmaster ~]# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.166.44.170   hdpmaster

d) Creating a Hadoop user for accessing HDFS and MapReduce
To avoid security issues, it is recommended to set up a new Hadoop user group and user account for all Hadoop-related activities. I will create hadoop_grp as the system group and hdpsysuser as the system user, and use them for the Hadoop installation path and working environment.

[root@hdpmaster ~]# groupadd hadoop_grp
[root@hdpmaster ~]# useradd -d /usr/hadoopsw hdpsysuser
[root@hdpmaster ~]# passwd hdpsysuser

e) Add hdpsysuser to the sudoers list
You can add hdpsysuser to the sudoers list so that it can run commands that require root privileges. First add the Hadoop system user to the required groups, then use the visudo utility to edit the sudoers file and un-comment the related line as mentioned below.

[root@hdpmaster ~]# usermod -aG wheel hdpsysuser
[root@hdpmaster ~]# usermod -aG hadoop_grp hdpsysuser

[root@hdpmaster ~]# visudo
## Same thing without a password
%wheel        ALL=(ALL)       NOPASSWD: ALL   ## <-- uncomment this line

Now test sudo functionality by switching to hdpsysuser
[hdpsysuser@hdpmaster ~]$ whoami
hdpsysuser
[hdpsysuser@hdpmaster ~]$ sudo whoami
root

f) Configuring SSH for ‘hdpsysuser’ Hadoop account
The Hadoop control scripts rely on SSH to perform cluster-wide operations. For example, there is a script for stopping and starting all the daemons in the cluster. To work seamlessly, SSH needs to be set up to allow password-less login for the Hadoop user from machines in the cluster.

Hadoop requires SSH access to manage its nodes, i.e. remote machines plus your local machine. For our single-node setup of Hadoop, we therefore need to configure SSH access to localhost for the 'hdpsysuser' user we created earlier.

Check the status of the SSH service
[root@hdpmaster ~]# systemctl status sshd.service
[root@hdpmaster ~]# systemctl start sshd.service   ##start if not started

Generate SSH key
[hdpsysuser@hdpmaster ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/usr/hadoopsw/.ssh/id_rsa):
Created directory '/usr/hadoopsw/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /usr/hadoopsw/.ssh/id_rsa.
Your public key has been saved in /usr/hadoopsw/.ssh/id_rsa.pub.
The key fingerprint is:
25:5f:ed:f2:81:38:86:dc:a3:8a:6e:a2:70:c0:c1:4a hdpsysuser@hdpmaster
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|.            .   |
| E      . . . .  |
|+ .    . * o o   |
|o.      S B o o  |
| .       o o o . |
|. .     .     .  |
|... .. .         |
|.. +o .          |
+-----------------+

To copy the public key to a remote machine (in our case the same machine, i.e. hdpmaster), issue a command in the following format:
ssh-copy-id user@hostname
This will copy the most recently modified ~/.ssh/id*.pub public key if it is not yet installed. 

[hdpsysuser@hdpmaster ~]$ ssh-copy-id hdpsysuser@hdpmaster
The authenticity of host 'hdpmaster (192.166.44.170)' can't be established.
ECDSA key fingerprint is 04:86:d2:4c:2d:3e:38:1c:61:f4:39:24:52:f4:09:4c.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hdpsysuser@hdpmaster's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'hdpsysuser@hdpmaster'"
and check to make sure that only the key(s) you wanted were added.

Confirm ssh working
[hdpsysuser@hdpmaster ~]$ ssh hdpmaster
Last login: Mon Feb  6 16:19:19 2017 from
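
The interactive key setup above can also be scripted. The sketch below is a non-interactive variant run against a scratch directory so it is safe to try; the scratch directory (KEYDIR) is my own addition for illustration, and you would point it at ~/.ssh (mode 700) for the real setup:

```shell
#!/bin/sh
# Non-interactive version of the key setup above.
# KEYDIR is a throwaway scratch directory here so this sketch is safe to run;
# use ~/.ssh (mode 700) for the real hdpsysuser setup.
KEYDIR=$(mktemp -d)
KEYFILE="$KEYDIR/id_rsa"

# -N "" gives an empty passphrase (as in the transcript); -q suppresses
# the banner and randomart output
ssh-keygen -q -t rsa -N "" -f "$KEYFILE"

# On a single node, appending the public key to authorized_keys is
# equivalent to "ssh-copy-id hdpsysuser@hdpmaster"
cat "$KEYFILE.pub" >> "$KEYDIR/authorized_keys"
chmod 600 "$KEYDIR/authorized_keys"
```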

2- Installing and Configuring Java

Apache Hadoop is a Java framework, so we need Java installed on the machine to run it on the operating system. Download the Java RPM (jdk-8u121-linux-x64.rpm, used below) and place it in the /usr/java directory.

[root@hdpmaster ~]# cd /usr/java
[root@hdpmaster java]# ls
[root@hdpmaster java]# rpm -Uvh  jdk-8u121-linux-x64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:jdk1.8.0_121-2000:1.8.0_121-fcs  ################################# [100%]
Unpacking JAR files...
[root@hdpmaster java]#

Please note that RHEL 7.3 already ships with OpenJDK; you can use that as well.

Confirm Java version
[root@hdpmaster java]# java -version
openjdk version "1.8.0_102"
OpenJDK Runtime Environment (build 1.8.0_102-b14)
OpenJDK 64-Bit Server VM (build 25.102-b14, mixed mode)

Configuration of Java

[hdpsysuser@hdpmaster ~]$ cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin

export PATH

Add the lines below at the end of .bash_profile

[root@hdpmaster ~]# vi .bash_profile
###################Inside .bash_profile########################
## JAVA env variables
export JAVA_HOME=/usr/java/default
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar

Confirm java environment variables
[root@hdpmaster ~]# echo $JAVA_HOME

[root@hdpmaster ~]# source .bash_profile
[root@hdpmaster ~]# echo $JAVA_HOME

3- Installing and Configuring Hadoop

a) Installing Hadoop

Download the latest Hadoop version (hadoop-2.7.3.tar.gz, from an Apache mirror); I downloaded it to the /usr/hadoopsw location.

b) Extract Hadoop source
[root@hdpmaster ~]# cd /usr/hadoopsw/
[root@hdpmaster hadoopsw]# tar xfz hadoop-2.7.3.tar.gz

It will create the folder hadoop-2.7.3

[root@hdpmaster hadoopsw]# ls
hadoop-2.7.3  hadoop-2.7.3.tar.gz

Optional: Create symbolic link for easy access to Hadoop installation directory
[root@hdpmaster hadoopsw]# ln -s /usr/hadoopsw/hadoop-2.7.3 /hadoop
[root@hdpmaster /]# cd /
[root@hdpmaster /]# ll
total 20
-rw-r--r--.   1 root root    0 Feb  6 11:58 1
lrwxrwxrwx.   1 root root    7 Feb  6 11:54 bin -> usr/bin
dr-xr-xr-x.   3 root root 4096 Feb  6 12:10 boot
drwxr-xr-x.  21 root root 3200 Feb  6 12:09 dev
drwxr-xr-x. 140 root root 8192 Feb  6 16:26 etc
lrwxrwxrwx.   1 root root   26 Feb  6 17:54 hadoop -> /usr/hadoopsw/hadoop-2.7.3
drwxr-xr-x.   3 root root   18 Feb  6 12:30 home

c) Assign ownership of folder to ‘hdpsysuser’ Hadoop account
[root@hdpmaster hadoopsw]# chown -R hdpsysuser:hadoop_grp /usr/hadoopsw/hadoop-2.7.3

d) Create Hadoop temp directories for Namenode and Datanode
[root@hdpmaster hadoopsw]# mkdir -p /opt/volume/hadoop_tmp/hdfs/namenode
[root@hdpmaster hadoopsw]# mkdir -p /opt/volume/hadoop_tmp/hdfs/datanode
Again assign ownership of this hadoop_tmp folder to hdpsysuser Hadoop account

[root@hdpmaster]# chown hdpsysuser:hadoop_grp -R /opt/volume/hadoop_tmp/

Confirm permissions
[hdpsysuser@hdpmaster volume]$ ls -al
total 0
drwxr-xr-x. 3 root       root       24 Feb  6 17:31 .
drwxr-xr-x. 4 root       root       30 Feb  6 17:31 ..
drwxr-xr-x. 3 hdpsysuser hadoop_grp 18 Feb  6 17:31 hadoop_tmp

e) Configuring Hadoop Environment Variables
Switch to hdpsysuser and add the hadoop environment variables in .bash_profile for hdpsysuser.

[hdpsysuser@hdpmaster ~]$ vi .bash_profile
## Hadoop env variables
export HADOOP_HOME=/usr/hadoopsw/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin   ## needed so hdfs and the start/stop scripts used later resolve
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

Verify the changes
[hdpsysuser@hdpmaster ~]$ source .bash_profile
[hdpsysuser@hdpmaster ~]$ echo $HADOOP_HOME

f) Update Hadoop configuration files

In order to work properly, Hadoop needs its variables set via configuration (XML) files.

[hdpsysuser@hdpmaster ~]$ cd /hadoop/etc/hadoop/

Remember here /hadoop is the symbolic link for  /usr/hadoopsw/hadoop-2.7.3 created earlier

[hdpsysuser@hdpmaster hadoop]$ pwd

[hdpsysuser@hdpmaster hadoop]$ ls
capacity-scheduler.xml      httpfs-env.sh            mapred-env.sh
configuration.xsl           httpfs-log4j.properties  mapred-queues.xml.template
container-executor.cfg      httpfs-signature.secret  mapred-site.xml.template
core-site.xml               httpfs-site.xml          slaves
hadoop-env.cmd              kms-acls.xml             ssl-client.xml.example
hadoop-env.sh               kms-env.sh               ssl-server.xml.example
hadoop-metrics.properties   kms-log4j.properties     yarn-env.cmd
hadoop-metrics2.properties  kms-site.xml             yarn-env.sh
hadoop-policy.xml           log4j.properties         yarn-site.xml
hdfs-site.xml               mapred-env.cmd

Configuration file : hadoop-env.sh

This file specifies environment variables that affect the JDK used by the Hadoop daemons (bin/hadoop). As the Hadoop framework is written in Java and uses the Java Runtime Environment, one of the important environment variables for the Hadoop daemons is $JAVA_HOME in hadoop-env.sh. This variable directs the Hadoop daemons to the Java path on the system. This file is also used to set other parts of the Hadoop daemon execution environment, such as heap size (HADOOP_HEAPSIZE), Hadoop home (HADOOP_HOME), and log file location (HADOOP_LOG_DIR). For simplicity of understanding the cluster setup, I have configured only the parameters necessary to start a cluster.

[hdpsysuser@hdpmaster]$ vi /hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/default/

Configuration file : core-site.xml

This file informs Hadoop daemon where NameNode runs in the cluster. It contains the configuration settings for Hadoop Core such as I/O settings that are common to HDFS and MapReduce.

Add the below code between configuration tag <configuration> ... </configuration>.

[hdpsysuser@hdpmaster hadoop]$ vi core-site.xml

<property>
  <name>hadoop.proxyuser.hdpsysuser.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hdpsysuser.hosts</name>
  <value>*</value>
</property>
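
The snippet above shows only the proxyuser entries; the property that actually tells the daemons where the NameNode runs appears to have been lost from this copy of the post. A minimal entry for this setup would look like the following; the port (9000) is my assumption, a value commonly used in single-node tutorials, not taken from the original:

```xml
<!-- Assumed NameNode address for this single-node setup;
     port 9000 is an example value, not from the original post -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hdpmaster:9000</value>
</property>
```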

Configuration file : hdfs-site.xml

The hdfs-site.xml file contains the configuration settings for HDFS daemons; the NameNode, the Secondary NameNode, and the DataNodes. We can configure hdfs-site.xml to specify default block replication and permission checking on HDFS.

Add the below code between configuration tag <configuration> ... </configuration>.

[hdpsysuser@hdpmaster hadoop]$ vi hdfs-site.xml
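
The property block for this step did not survive in this copy of the post. A minimal hdfs-site.xml for a single-node setup, sketched here using the storage directories created in step (d) above (replication factor 1 is my assumption, the usual choice when there is only one DataNode):

```xml
<property>
  <name>dfs.replication</name>
  <value>1</value> <!-- single node, so one copy of each block (assumed) -->
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/opt/volume/hadoop_tmp/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/opt/volume/hadoop_tmp/hdfs/datanode</value>
</property>
```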


Configuration file : mapred-site.xml

The mapred-site.xml file contains the configuration settings for the MapReduce daemons: the job tracker and the task trackers. A new configuration option for Hadoop 2 is the capability to specify a framework name for MapReduce by setting the mapreduce.framework.name property. In this install we will use the value "yarn" to tell MapReduce that it will run as a YARN application.

[hdpsysuser@hdpmaster hadoop]$ vi mapred-site.xml
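
The XML for this step is missing from the recovered post. Note that the distribution only ships mapred-site.xml.template (see the directory listing above), so the file usually has to be created from it first (cp mapred-site.xml.template mapred-site.xml). Based on the paragraph above, the property block would be:

```xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value> <!-- run MapReduce as a YARN application -->
</property>
```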

Configuration file : yarn-site.xml

This file stores the YARN compute framework configuration options that override the default values for YARN parameters.

[hdpsysuser@hdpmaster hadoop]$ vi yarn-site.xml

<!-- Site specific YARN configuration properties -->
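
The property block itself did not survive in this copy of the post. The setting nearly every single-node tutorial uses here is the shuffle auxiliary service, so a minimal sketch (my reconstruction, not the original snippet) would be:

```xml
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value> <!-- lets MapReduce jobs shuffle data under YARN -->
</property>
```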

Configuration file : slaves

The ‘slaves’ file at the master node contains a list of hosts, one per line, that each host a DataNode. The ‘slaves’ file at a slave node contains only its own IP address, not those of the other DataNodes in the cluster.

Replace the localhost value from slaves file to point to your machine hostname set up at the beginning.

[hdpsysuser@hdpmaster hadoop]$ vi slaves
[hdpsysuser@hdpmaster hadoop]$ cat slaves
hdpmaster

In our case hdpmaster is the slave too.

g) Format Namenode

The Hadoop NameNode is the centralized place of an HDFS file system; it keeps the directory tree of all files in the file system and tracks where across the cluster the file data is kept. In short, it keeps the metadata related to the DataNodes. When we format the NameNode, that metadata is wiped: all information about the data on the DataNodes is lost, and they can be reused for new data. The command below formats your file system at the location specified in hdfs-site.xml.

[hdpsysuser@hdpmaster hadoop]$ hdfs namenode -format
17/02/07 12:55:38 INFO namenode.NameNode: STARTUP_MSG:
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = hdpmaster/
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.7.3
STARTUP_MSG:   classpath = /usr/hadoopsw/hadoop-2.7.3/etc/hadoop:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/stax-api-1.0-2.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/activation-1.1.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/jersey-server-1.9.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/asm-3.2.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/log4j-1.2.17.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/jets3t-0.9.0.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/httpclient-4.2.5.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/httpcore-4.2.5.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/commons-lang-2.6.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/commons-configuration-1.6.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/commons-digester-1.8.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/slf4j-api-1.7.10.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/avro-1.7.4.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/paranamer-2.3.jar:/usr/hadoopsw/hadoop-2.7.3/share/hadoop/common/lib/snappy-java-*.jar
STARTUP_MSG:   build = -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff; compiled by 'root' on 2016-08-18T01:41Z
STARTUP_MSG:   java = 1.8.0_121
17/02/07 12:55:38 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
17/02/07 12:55:38 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-ea1208e4-20db-4aad-8a80-74f9e77ea02a
17/02/07 12:55:39 INFO namenode.FSNamesystem: No KeyProvider found.
17/02/07 12:55:39 INFO namenode.FSNamesystem: fsLock is fair:true
17/02/07 12:55:39 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
17/02/07 12:55:39 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
17/02/07 12:55:39 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
17/02/07 12:55:39 INFO blockmanagement.BlockManager: The block deletion will start around 2017 Feb 07 12:55:39
17/02/07 12:55:39 INFO util.GSet: Computing capacity for map BlocksMap
17/02/07 12:55:39 INFO util.GSet: VM type       = 64-bit
17/02/07 12:55:39 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
17/02/07 12:55:39 INFO util.GSet: capacity      = 2^21 = 2097152 entries
17/02/07 12:55:39 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
17/02/07 12:55:39 INFO blockmanagement.BlockManager: defaultReplication         = 1
17/02/07 12:55:39 INFO blockmanagement.BlockManager: maxReplication             = 512
17/02/07 12:55:39 INFO blockmanagement.BlockManager: minReplication             = 1
17/02/07 12:55:39 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
17/02/07 12:55:39 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
17/02/07 12:55:39 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
17/02/07 12:55:39 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
17/02/07 12:55:39 INFO namenode.FSNamesystem: fsOwner             = hdpsysuser (auth:SIMPLE)
17/02/07 12:55:39 INFO namenode.FSNamesystem: supergroup          = supergroup
17/02/07 12:55:39 INFO namenode.FSNamesystem: isPermissionEnabled = true
17/02/07 12:55:39 INFO namenode.FSNamesystem: HA Enabled: false
17/02/07 12:55:39 INFO namenode.FSNamesystem: Append Enabled: true
17/02/07 12:55:40 INFO util.GSet: Computing capacity for map INodeMap
17/02/07 12:55:40 INFO util.GSet: VM type       = 64-bit
17/02/07 12:55:40 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
17/02/07 12:55:40 INFO util.GSet: capacity      = 2^20 = 1048576 entries
17/02/07 12:55:40 INFO namenode.FSDirectory: ACLs enabled? false
17/02/07 12:55:40 INFO namenode.FSDirectory: XAttrs enabled? true
17/02/07 12:55:40 INFO namenode.FSDirectory: Maximum size of an xattr: 16384
17/02/07 12:55:40 INFO namenode.NameNode: Caching file names occuring more than 10 times
17/02/07 12:55:40 INFO util.GSet: Computing capacity for map cachedBlocks
17/02/07 12:55:40 INFO util.GSet: VM type       = 64-bit
17/02/07 12:55:40 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
17/02/07 12:55:40 INFO util.GSet: capacity      = 2^18 = 262144 entries
17/02/07 12:55:40 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
17/02/07 12:55:40 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
17/02/07 12:55:40 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
17/02/07 12:55:40 INFO metrics.TopMetrics: NNTop conf: = 10
17/02/07 12:55:40 INFO metrics.TopMetrics: NNTop conf: = 10
17/02/07 12:55:40 INFO metrics.TopMetrics: NNTop conf: = 1,5,25
17/02/07 12:55:40 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
17/02/07 12:55:40 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
17/02/07 12:55:40 INFO util.GSet: Computing capacity for map NameNodeRetryCache
17/02/07 12:55:40 INFO util.GSet: VM type       = 64-bit
17/02/07 12:55:40 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
17/02/07 12:55:40 INFO util.GSet: capacity      = 2^15 = 32768 entries
17/02/07 12:55:40 INFO namenode.FSImage: Allocated new BlockPoolId: BP-728904171-
17/02/07 12:55:40 INFO common.Storage: Storage directory /opt/volume/hadoop_tmp/hdfs/namenode has been successfully formatted.
17/02/07 12:55:40 INFO namenode.FSImageFormatProtobuf: Saving image file /opt/volume/hadoop_tmp/hdfs/namenode/current/fsimage.ckpt_0000000000000000000 using no compression
17/02/07 12:55:40 INFO namenode.FSImageFormatProtobuf: Image file /opt/volume/hadoop_tmp/hdfs/namenode/current/fsimage.ckpt_0000000000000000000 of size 357 bytes saved in 0 seconds.
17/02/07 12:55:40 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
17/02/07 12:55:40 INFO util.ExitUtil: Exiting with status 0
17/02/07 12:55:40 INFO namenode.NameNode: SHUTDOWN_MSG:
SHUTDOWN_MSG: Shutting down NameNode at hdpmaster/
[hdpsysuser@hdpmaster hadoop]$

h) Start Hadoop daemons

Start hdfs daemons:

Hadoop commands are located in $HADOOP_HOME/sbin directory. In order to start Hadoop services run the below commands on your console:

[hdpsysuser@hdpmaster hadoop]$ start-dfs.sh
Starting namenodes on [hdpmaster]
hdpmaster: starting namenode, logging to /usr/hadoopsw/hadoop-2.7.3/logs/hadoop-hdpsysuser-namenode-hdpmaster.out
hdpmaster: starting datanode, logging to /usr/hadoopsw/hadoop-2.7.3/logs/hadoop-hdpsysuser-datanode-hdpmaster.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is 04:86:d2:4c:2d:3e:38:1c:61:f4:39:24:52:f4:09:4c.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/hadoopsw/hadoop-2.7.3/logs/hadoop-hdpsysuser-secondarynamenode-hdpmaster.out

Check the hdfs cluster services  using jps.
[hdpsysuser@hdpmaster hadoop]$ jps
32162 Jps
31748 DataNode
31929 SecondaryNameNode
31611 NameNode

Start MapReduce (YARN) daemons :

[hdpsysuser@hdpmaster hadoop]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/hadoopsw/hadoop-2.7.3/logs/yarn-hdpsysuser-resourcemanager-hdpmaster.out
hdpmaster: starting nodemanager, logging to /usr/hadoopsw/hadoop-2.7.3/logs/yarn-hdpsysuser-nodemanager-hdpmaster.out

Check hdfs and yarn services
[hdpsysuser@hdpmaster hadoop]$ jps
31748 DataNode
32324 NodeManager
31929 SecondaryNameNode
32218 ResourceManager
31611 NameNode
32638 Jps

start-dfs.sh/stop-dfs.sh and start-yarn.sh/stop-yarn.sh: start/stop the HDFS and YARN daemons separately on all the nodes from the master machine. It is advisable to use these commands now over start-all.sh & stop-all.sh.

hadoop-daemon.sh (namenode/datanode) and yarn-daemon.sh (resourcemanager/nodemanager): start individual daemons on an individual machine manually. You need to go to the particular node and issue these commands.

Use case : Suppose you have added a new data node to your cluster and you need to start the DN daemon only on that machine.

$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode

i) Track/Verify Hadoop using web UI

If you wish to track Hadoop MapReduce as well as HDFS, you can explore the Hadoop web views of the ResourceManager and NameNode, which are commonly used by Hadoop administrators. Open your default browser and visit the following links.

For ResourceManager – http://hdpmaster:8088

For NameNode – http://hdpmaster:50070

Note: if you are unable to access the UI in web browser, check ports and firewall on Linux OS.

-- current listening ports
[root@hdpmaster ~]# netstat -tulpn | grep LISTEN
tcp        0      0 *               LISTEN      8460/java
tcp        0      0 *               LISTEN      8460/java
tcp        0      0 *               LISTEN      8460/java
tcp        0      0*               LISTEN      8460/java
tcp        0      0*               LISTEN      8323/java
tcp        0      0 *               LISTEN      8323/java
tcp        0      0    *               LISTEN      777/sshd

tcp6       0      0 :::22                   :::*                    LISTEN      777/sshd

-- Check Firewall
[root@hdpmaster ~]# service iptables status
Redirecting to /bin/systemctl status iptables.service
● iptables.service - IPv4 firewall with iptables
   Loaded: loaded (/usr/lib/systemd/system/iptables.service; enabled; vendor preset: disabled)
   Active: active (exited) since Fri 2018-03-16 00:34:28 UTC; 1h 36min left
 Main PID: 483 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/iptables.service

Mar 16 00:34:27 localhost.localdomain systemd[1]: Starting IPv4 firewall with iptables...
Mar 16 00:34:28 localhost.localdomain iptables.init[483]: iptables: Applying firewall rules: [  OK  ]
Mar 16 00:34:28 localhost.localdomain systemd[1]: Started IPv4 firewall with iptables.

[root@hdpmaster ~]# service iptables stop
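
Stopping the firewall entirely is the quick fix used above. On RHEL 7 the default firewall is firewalld; if that is what is running on your box, a narrower alternative (my suggestion, not from the original post) is to open just the two web UI ports from the URLs above:

```shell
# Open only the NameNode (50070) and ResourceManager (8088) web UI ports
firewall-cmd --permanent --add-port=50070/tcp
firewall-cmd --permanent --add-port=8088/tcp
firewall-cmd --reload
```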

Congratulations! You have successfully setup Apache Hadoop single node cluster in Redhat Linux.