

Note: All posts take a practical approach, avoiding lengthy theory. Everything has been tested on development servers. Please don't test any post on a production server until you are sure.

Thursday, July 28, 2022

Hello Lakehouse! Building Your First On-Prem Data Lakehouse

As the emerging concept of a data lakehouse continues to gain traction, I thought I would write its "hello world", which I have named Hello Lakehouse. In this post I'll first explain some necessary concepts and then move on to the implementation using open source technologies.

Monday, July 25, 2022

Centralized Logging with Fluentd/Fluent-bit and Minio

Fluentd is an open source data collector for building a unified logging layer. Once installed on a server, it runs in the background to collect, parse, transform, analyze and store various types of data. It is written in Ruby for flexibility, with performance-sensitive parts in C. td-agent is a stable distribution package of Fluentd with a 30-40 MB memory footprint.

Fluent Bit is a lightweight data forwarder (with a roughly 450 KB memory footprint) for Fluentd. It is specifically designed to forward data from the edge (containers, servers, embedded systems) to Fluentd aggregators.
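
To illustrate how an application hands structured records to this logging layer, here is a minimal Python sketch (not from the post) that emits events to a local Fluentd/td-agent instance over the forward protocol using the fluent-logger package; the tag, host, port, and record fields are assumptions and should match your own td-agent configuration.

```python
# Minimal sketch: sending structured events to Fluentd/td-agent over the
# forward protocol (port 24224). Tag, endpoint and fields are assumptions.
# Requires: pip install fluent-logger
from fluent import sender

logger = sender.FluentSender("app", host="localhost", port=24224)

# Each emit() sends one structured record; Fluentd routes it by tag
# (here "app.access") to whatever output you configure, e.g. an S3/MinIO bucket.
if not logger.emit("access", {"user": "demo", "path": "/index.html", "status": 200}):
    print(logger.last_error)
    logger.clear_last_error()

logger.close()
```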

Thursday, July 07, 2022

Using Filebeat/Logstash to send logs to Minio Data Lake

To aggregate logs directly into an object store like Minio, you can use the Logstash S3 output plugin. Logstash aggregates events and periodically writes objects to S3, which are then available for later analysis. For more information, please review the related post at the end of this post.
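
Because MinIO exposes the S3 API, the objects Logstash writes can be inspected with any S3 client. The following is a small illustrative sketch (not the post's code) using boto3; the endpoint, credentials, bucket, and prefix are assumptions.

```python
# Minimal sketch: listing the log objects Logstash has written to a MinIO
# bucket through the S3 API. Endpoint, credentials and bucket are assumptions.
# Requires: pip install boto3
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.example.com:9000",  # assumed MinIO endpoint
    aws_access_key_id="minioadmin",                # assumed access key
    aws_secret_access_key="minioadmin",            # assumed secret key
)

# Logstash's s3 output rolls events into time-based objects under a prefix.
response = s3.list_objects_v2(Bucket="logs", Prefix="logstash/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```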


Tuesday, July 05, 2022

Create Data Lake Without Hadoop

In this post, the focus is on building a modern data lake using only open source technologies. I will walk through a step-by-step process to demonstrate how we can leverage an S3-compatible object store (MinIO) and a distributed SQL query engine (Presto) to achieve this. For some administrative work we may use Hive as well.
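
As a rough illustration of the query side (not the post's own code), the sketch below uses the presto-python-client package to run a query through Presto's Hive catalog over data stored in MinIO; the coordinator host, port, user, catalog, schema, and table name are all assumptions.

```python
# Minimal sketch: querying the data lake through a Presto coordinator whose
# Hive catalog points at MinIO. Host, port, catalog, schema and table name
# are assumptions. Requires: pip install presto-python-client
import prestodb

conn = prestodb.dbapi.connect(
    host="presto-coordinator.example.com",
    port=8080,
    user="analyst",
    catalog="hive",    # Hive connector configured with s3a:// paths on MinIO
    schema="default",
)

cur = conn.cursor()
cur.execute("SELECT count(*) FROM web_logs")  # hypothetical table
print(cur.fetchone()[0])
```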