ETL stands for extract, transform, load, and it is generally used for data
warehousing and data integration. ETL is a product of the relational database
era and has not evolved much in the last decade. With the arrival of new
cloud-native tools and platforms, ETL...
Kubernetes has emerged as the go-to container orchestration platform for data
engineering teams. It has massive community support and momentum behind it.
In 2018, widespread adoption of Kubernetes for big data processing is
anticipated. Organisations are already using Kubernetes for a
variety of...
To unlock the true value of data, organisations will need internal data
services. Data services provide streamlined, centralised data access to a
diverse set of users, which removes friction and speeds up the delivery of
insights, products and services. Data services promote innovation. In addition, effective...
I have been playing with Apache Drill for quite some time now. In layman's
terms, Apache Drill is a SQL query engine that can run queries against any
type of data store, in particular any non-relational data store. Apache Drill's
ability to query against raw...
Currently, the majority of cloud-based database and data warehouse services are
provisioned with fixed storage and compute resources. Resources cannot be
resized without compromising availability and performance. This
means service users typically end up with over-provisioned, under-utilised,
expensive resources to accommodate possible...