When I first joined my company more than two years ago, my tasks were heavily focused on analytics and dashboard reporting. One of these tasks was to prepare a daily business performance email showing the performance metrics for our different business segments (B2C, B2B, Ikea delivery, e-commerce parcel delivery for Shopee and Lazada, and so on).

The data for this daily email report came from multiple sources, such as our SQL database, Facebook, Google Analytics, our third-party keyword search API, and our cloud storage (for manually uploaded data), and was then transformed and…


As our customer base grew, we started to see the importance of understanding our customers' journey inside our products. Before this, we relied heavily on Google Analytics for our website to understand the customer journey. Then we noticed that more and more of our new customers were using our mobile app rather than our website to make their bookings.

While Firebase itself does provide a visualization dashboard, it is pretty basic and could not show us the flow between events and pages. …


Welcome to Part 2 of Building a Simple Batch Data Pipeline from AWS RDS to Google BigQuery. Before you begin, you can check out Part 1, which covers how to set up a batch data pipeline using AWS Data Pipeline.

As described in Part 1, our approach for this batch data pipeline is ELT. We Extract data from AWS RDS, Load it into BigQuery, and Transform it inside BigQuery using SQL.

The key point to note in the ELT process is that raw data is transformed inside the data warehouse without the need for a staging server; your data warehouse now…


A couple of months ago, I was assigned to set up a batch data pipeline using the AWS Data Pipeline service to bring our operational (production) data from AWS RDS (our production database) into Google BigQuery (our data warehouse) on a daily basis. The goal was to set up faster and more efficient dashboard queries in Holistics (an awesome BI tool).

I have divided this guide into two sections: the first covers the necessary configuration inside AWS, and the second covers the configuration required inside GCP. …


When I did my master's project, one of the obstacles I faced was scraping data from multiple websites. At first, I contemplated manual scraping, but after hours of tedious copying and pasting from websites into an Excel spreadsheet, with multiple typos and mis-entries along the way, I finally gave up.


The magnificent view of Kuala Lumpur

I guess one of the biggest dilemmas faced by a house buyer or seller is determining the fair price of a house. This inspired me to develop machine learning models that can help determine the price of a house with high accuracy.

I developed three ensemble models: Bagging (Random Forest), Boosting (XGBoost), and a Stacked Ensemble using multiple base learners. My main focus was high-rise property (Serviced Residence, Condominium, Apartment, and Flat) in Kuala Lumpur from 2008 to 2018.

This post covers how I acquired the dataset, cleaned the data, conducted Exploratory Data Analysis, feature…

Muhammad Izzuddin
