Note: All the posts take a practical approach and avoid lengthy theory. All have been tested on development servers. Please don't try any post on production servers until you are sure.

Saturday, February 25, 2017

Hive for Oracle Developers and DBAs - Part I

The Hadoop ecosystem emerged as a cost-effective way of working with large data sets. It imposes a particular programming model, called MapReduce, for breaking up computation tasks into units that can be distributed around a cluster of commodity, server-class hardware, thereby providing cost-effective, horizontal scalability.
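To make the MapReduce idea concrete, here is a minimal single-process sketch of a word count in plain Python. It is only an illustration of the map, shuffle, and reduce steps, not Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this example, and on a real cluster each step would run distributed across nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, the work can be split into independent units, which is what allows it to be spread over many machines.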

Thursday, February 23, 2017

Hadoop Administration: Accessing HDFS (File system & Shell Commands)

You can access HDFS in many different ways. HDFS provides a native Java application programming interface (API) and a native C-language wrapper for the Java API. In addition, you can use a web browser to browse HDFS files. I'll use only the CLI in this post.

Tuesday, February 21, 2017

Setting up Hadoop Edge/Gateway Node (Hadoop Client)

We already have a 3-node Hadoop cluster (2.7.3), with one master and two slaves, running in our environment. Now we want to set up a fourth instance as a client machine (analogous to an Oracle client) and submit commands from it to the Hadoop cluster.

Monday, February 20, 2017

Setting up multi node Apache Hadoop Cluster 2.7.3 on RHEL 7.3

A multi-node Hadoop cluster uses a master-slave architecture to spread Big Data processing across multiple nodes. To set up a multi-node Hadoop cluster, I am going to use three machines (one as the MasterNode and the other two as SlaveNodes).

Saturday, February 18, 2017

Setting up single node Apache Hadoop Cluster 2.7.3 on RHEL 7.3

The purpose of this post is to set up a single-node Apache Hadoop Cluster 2.7.3 for Big Data enthusiasts who want to learn parallel computation for tackling large datasets. Knowledge of Linux is a prerequisite for this post.

Below are the details of my environment; I'm using RHV.
RHV, Red Hat Enterprise Linux Server 7.3
Master Node:   hostname → hdpmaster           IP → 192.166.44.170     rootpwd → hadoop123

Hadoop Ecosystem - Quick Introduction

This is the data age: data, data everywhere. Although we cannot measure the total volume of data stored electronically, it was estimated at 4.4 zettabytes in 2013 and is forecast to grow tenfold to 44 zettabytes by 2020. Clearly, we can call this the Zettabyte Era. A zettabyte is equal to one thousand exabytes, one million petabytes, or one billion terabytes.
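The unit conversions above can be checked with a few lines of Python. This is a small sketch that assumes decimal (SI) prefixes, where each step up is a factor of 1,000; the constant and function names are chosen just for this example.

```python
# Sizes in bytes, assuming decimal (SI) prefixes.
TB = 10**12  # terabyte
PB = 10**15  # petabyte
EB = 10**18  # exabyte
ZB = 10**21  # zettabyte


def zettabytes_to(unit_bytes, zettabytes=1):
    """Convert a whole number of zettabytes into the given smaller unit."""
    return zettabytes * ZB // unit_bytes


if __name__ == "__main__":
    print("1 ZB in exabytes: ", zettabytes_to(EB))
    print("1 ZB in petabytes:", zettabytes_to(PB))
    print("1 ZB in terabytes:", zettabytes_to(TB))
    print("44 ZB in terabytes:", zettabytes_to(TB, 44))
```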

Sunday, February 12, 2017

Big Data - The Bigger Picture

I've titled this post "The Bigger Picture" instead of "The Big Picture" because even the big picture comes with a lot of detail. The aim of this post is to provide a broad understanding of the topic without going into deeper details.