Category
Data Engineering
Data Engineering involves the design, development, and management of data infrastructure and pipelines. It focuses on collecting, processing, transforming, and storing data in a scalable and efficient manner. Data Engineers build systems that enable data analytics, machine learning, and other data-driven applications, ensuring reliable and timely access to high-quality data for insights and decision-making.
Data integration generally requires in-depth domain knowledge and a strong understanding of data schemas and their underlying relationships. This can be time-consuming and a bit challenging if you are dealing with hundreds of data sources and thousands of event types (see my recent article on ELT architecture). Various...
A case for ELT (i.e. extract, load, and transform) and the difference between ETL and ELT
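As a rough illustration of the ETL/ELT distinction, the sketch below contrasts the two flows: ETL transforms records in the pipeline before loading, while ELT loads the raw records first and transforms them where the data lives. It uses an in-memory SQLite database as a stand-in warehouse; the sample events, table names, and functions are hypothetical, not taken from the articles themselves.

```python
# Minimal sketch contrasting ETL and ELT, using in-memory SQLite as a
# stand-in "warehouse". Events, table names, and functions are hypothetical.
import sqlite3

raw_events = [  # hypothetical source records; values arrive as strings
    {"user_id": 1, "event": "click", "value": "3"},
    {"user_id": 2, "event": "view", "value": "7"},
]

def etl(conn: sqlite3.Connection) -> None:
    """ETL: transform in the pipeline, then load only the shaped result."""
    transformed = [(e["user_id"], e["event"], int(e["value"])) for e in raw_events]
    conn.execute("CREATE TABLE etl_events (user_id INT, event TEXT, value INT)")
    conn.executemany("INSERT INTO etl_events VALUES (?, ?, ?)", transformed)

def elt(conn: sqlite3.Connection) -> None:
    """ELT: load the raw records as-is, then transform inside the warehouse."""
    conn.execute("CREATE TABLE raw_events (user_id INT, event TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO raw_events VALUES (?, ?, ?)",
        [(e["user_id"], e["event"], e["value"]) for e in raw_events],
    )
    # The transformation step runs where the data lives, typically as SQL.
    conn.execute(
        "CREATE TABLE elt_events AS "
        "SELECT user_id, event, CAST(value AS INTEGER) AS value FROM raw_events"
    )

conn = sqlite3.connect(":memory:")
etl(conn)
elt(conn)
print(conn.execute("SELECT * FROM elt_events").fetchall())
```

In the ELT branch, the raw table keeps the original payload, so new transformations can be added later without re-extracting from the source systems.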
To unlock the true value of data, organisations will need internal data services. Data services provide streamlined and centralised data access to a diverse set of users, which removes friction in delivering faster insights, products, and services. Data services promote innovation. In addition, effective...
Currently, the majority of cloud-based database and data warehouse services are provisioned with fixed storage and compute resources. Resources cannot be resized without compromising availability and performance. This means service users typically end up with over-provisioned, under-utilised, and expensive resources to accommodate possible...
In 2005, Stonebraker et al. published a paper that outlined eight key requirements for stream processing. These requirements translate readily into the building blocks of a stream processing architecture. Although this article predates systems such as Apache Kafka, Amazon Kinesis, Apache Spark,...
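As a rough illustration of one such building block, processing events in flight rather than storing them first and querying later, the sketch below computes tumbling-window counts over a hypothetical event stream in plain Python. The stream, window size, and event shape are all assumptions; a production system would read from Kafka or Kinesis and use a stream processor such as Spark rather than a generator.

```python
# Minimal sketch of in-flight stream processing: events are aggregated into
# tumbling windows as they arrive, and each window is emitted when it closes.
from collections import Counter
from typing import Iterable, Iterator

def event_stream() -> Iterator[dict]:
    """Stand-in for an unbounded source (e.g. a Kafka topic); data is hypothetical."""
    events = [
        {"ts": 0, "type": "click"}, {"ts": 1, "type": "view"},
        {"ts": 2, "type": "click"}, {"ts": 5, "type": "view"},
        {"ts": 6, "type": "click"}, {"ts": 9, "type": "click"},
    ]
    yield from events

def tumbling_window_counts(events: Iterable[dict], window_s: int = 5) -> Iterator[tuple]:
    """Emit per-window event-type counts as soon as each window closes."""
    current_window, counts = 0, Counter()
    for e in events:
        window = e["ts"] // window_s
        if window != current_window:
            yield current_window, counts      # flush the finished window
            current_window, counts = window, Counter()
        counts[e["type"]] += 1
    if counts:
        yield current_window, counts          # flush the final partial window

for window, counts in tumbling_window_counts(event_stream()):
    print(f"window {window}: {dict(counts)}")
```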