DBMentors - Inam Bukhari's Blog: August 2017

Thursday, August 10, 2017

Working with Talend for Big Data (TOSBD)

Introduction

Talend (Eclipse-based) provides unified development and management tools for integrating and processing all of your data through an easy-to-use visual designer. It helps companies become data-driven by making data more accessible, improving its quality, and quickly moving it to where it is needed for real-time decision making.

Talend for Big Data is built on top of Talend's data integration solution. It enables users to access, transform, move, and synchronize big data by leveraging the Apache Hadoop platform, and it makes Hadoop considerably easier to use.

Analyzing/Parsing syslogs using Hive and Presto

Scenario

My company asked me to provide a syslog aggregation solution covering all environments so that the logs can be analyzed for insights. Logs should be captured first, then retained, and finally processed by the analyst team in the same way they already query and process data in a database. The requirements are not yet clear, and the volume of data cannot be determined at this stage.

Working with Apache Cassandra (RHEL 7)

Introduction

Cassandra (created at Facebook for inbox search) is, like HBase, a NoSQL database, which generally means you cannot manipulate the data with SQL. However, Cassandra implements CQL (Cassandra Query Language), whose syntax is modeled after SQL and which provides data manipulation capabilities for extremely large data sets. Cassandra is a distributed database: clients can connect to any node in the cluster and access any data.

Hortonworks - Using HDP Spark SQL

Using SQLContext, Apache Spark SQL can read data directly from the file system. This is useful when the data you are trying to analyze does not reside in Apache Hive (for example, JSON files stored in HDFS).
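
As a rough illustration of the idea above, here is a minimal PySpark sketch, assuming a Spark 2.x or later installation. The helper name query_json_files, the HDFS path, the view name, and the columns in the query are all hypothetical and only serve the example; they are not taken from the original post.

```python
def query_json_files(sql_context, path, view_name, query):
    """Load JSON files from `path`, expose them as a temporary view, run `query`."""
    # Spark infers the schema from the JSON records themselves.
    df = sql_context.read.json(path)
    # Spark 2.0+ call; on Spark 1.x the equivalent is df.registerTempTable(view_name).
    df.createOrReplaceTempView(view_name)
    return sql_context.sql(query)


def run_example():
    """Hypothetical usage; needs pyspark and a reachable file system."""
    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="json-from-hdfs")
    sql_context = SQLContext(sc)
    result = query_json_files(
        sql_context,
        "hdfs:///tmp/people.json",  # hypothetical path, replace with your own
        "people",                   # hypothetical view name
        "SELECT name, age FROM people WHERE age > 21",  # hypothetical columns
    )
    result.show()
    sc.stop()
```

Calling run_example() from a script submitted with spark-submit should print the matching rows, provided the file exists and has name and age fields. No Hive table is involved at any point: the data is read straight from the files.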

Please see my other blog for Oracle EBusiness Suite Posts - EBMentors

Thursday, August 10, 2017

Working with Talend for Big Data (TOSBD)

Tuesday, August 08, 2017

Analyzing/Parsing syslogs using Hive and Presto

Wednesday, August 02, 2017

Working with Apache Cassandra (RHEL 7)

Tuesday, August 01, 2017

Hortonworks - Using HDP Spark SQL
