Big Data - Abhishek Tiwari (Page 2)

Big Data

Requirements for stream processing architecture

Posted by Abhishek Tiwari on May 6th, 2015

In 2005, Stonebraker et al. published a paper outlining eight key requirements for stream processing architecture. These requirements translate readily into the building blocks of a stream processing architecture. Although this article predates systems such as Apache Kafka, Amazon Kinesis, Apache Spark,...

Big Data

Building Distributed Systems with Mesos

Posted by Abhishek Tiwari on February 27th, 2014

Apache Mesos is a popular open-source cluster manager that enables building resource-efficient distributed systems. Mesos provides efficient, dynamic resource isolation and sharing across multiple distributed applications such as Hadoop, Spark, Memcache, MySQL, etc. on a dynamic shared pool of resource nodes. This means with...

Hadoop

Hadoop Ecosystem- Deployment And Management

Posted by Abhishek Tiwari on October 12th, 2012

My notes and thoughts on the Hadoop ecosystem from the book Hadoop Operations[1]. One of the key takeaways is the emergence of Hadoop cluster deployment and management tools such as hstack and Apache Ambari. In our own setup, we managed to deploy and scale...

Big Data

Traditional Ways To Solve Scalability Problems With RDBMS

Posted by Abhishek Tiwari on October 5th, 2012

Notes and thoughts from my recent read, Cassandra: The Definitive Guide. Common ways to solve scalability bottlenecks with relational databases:

Throw more/better hardware (memory and CPU)
* Vertical scaling
* Faster disks (SSD vs RAID)

Move to a database cluster
* With master-slave configuration:
* Master is now...

Big Data

Knowing your data vs relying on it

Posted by Abhishek Tiwari on October 31st, 2010

It is good to know your data. But there is a clear distinction between being data-driven and being data-informed. No matter which area you work in, there is always an opportunity to make additional gains by closely observing the characteristics and quality of your data. By...