Archive for the ‘MapReduce’ Category
Big Data: Hadoop is tuned for availability, not efficiency
UC Berkeley Professor Joe Hellerstein has written a very interesting post on his blog about two very different big data deployments on Hadoop and Greenplum. Joe contrasts a recent Yahoo implementation on Hadoop to sort a petabyte using approximately 3800 [...]
MapReduce is now reaching mainstream science
Most of you will be aware of MapReduce, a framework developed by Google to analyze large data sets in parallel on clusters of computers. It is suited to certain kinds of distributable problems that can be spread across a large number of computers. There [...]

