Angel "Java" Lopez on Blog

August 9, 2011

MapReduce: Links, News and Resources (1)

Filed under: Algorithms, Distributed Computing, Links — ajlopez @ 9:38 am

Two of my favorite topics in programming are algorithms and distributed computing. You can have both with MapReduce. Here are some of my links (thanks to @asehmi, who sent me several of them).

MapReduce is a software framework introduced by Google in 2004 to support distributed computing on large data sets on clusters of computers.[1] Parts of the framework are patented in some countries.[2]

The framework is inspired by the map and reduce functions commonly used in functional programming,[3] although their purpose in the MapReduce framework is not the same as their original forms.[4]

MapReduce libraries have been written in C++, C#, Erlang, Java, OCaml, Perl, Python, PHP, Ruby, F#, R and other programming languages.
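To make the map and reduce idea concrete, here is a minimal single-process sketch of the classic word count in Python. It is not taken from any of the libraries linked below. The names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are my own, chosen for illustration, and a real framework would run the map and reduce calls on many machines and do the grouping step for you.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key (a framework does this between phases)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values, 0)


def word_count(documents):
    """Run map, shuffle and reduce in sequence, in a single process."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["map reduce map", "reduce on clusters"]))
```

Because each `map_phase` call looks at only one document and each `reduce_phase` call at only one key, both steps can be spread over a cluster without sharing state, which is the property the framework exploits.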

MapReduce: Simplified Data Processing on Large Clusters

Parallel Processing Using the Map Reduce Programming Model

Graph Twiddling in a MapReduce World

Cloud9: a MapReduce library for Hadoop

An implementation of Map-Reduce in C#

Twister: iterative MapReduce

MySpace Qizmt – MySpace’s Open Source Mapreduce Framework

Cascading is a Data Processing API, Process Planner, and Process Scheduler used for defining and executing complex, scale-free, and fault tolerant data processing workflows on an Apache Hadoop cluster. All without having to ‘think’ in MapReduce.

Project Daytona – Microsoft Research
Iterative MapReduce on Windows Azure

InfoQ: Introduction to Oozie
Combine multiple Map/Reduce jobs into a logical unit of work

InfoQ: Ville Tuulos on Big Data and Map/Reduce in Erlang and Python with Disco

Spark Cluster Computing Framework

Preview of Storm: The Hadoop of Realtime Processing – BackType Technology

Hadoop in Azure – Distributed Development – Site Home – MSDN Blogs

MapReduce: A Soft Introduction

Mapreduce & Hadoop Algorithms in Academic Papers

MSDN Magazine: MapReduce in F# – Parsing Log Files with F#, MapReduce and Windows Azure

F#: With a few lines of code entered into the powershell and analyze gigabytes of cloud data! – Systems, architecture and engineering solutions!

Data-Intensive Text Processing with MapReduce

The Geomblog: Workshop on Parallelism, and a "breakthrough" in combinatorial geometry

Pragmatic Programming Techniques: Designing algorithms for Map Reduce

Mapreduce and Hadoop Algorithms in Bioinformatics Papers | Abhishek Tiwari

Pragmatic Programming Techniques: Map/Reduce to recommend people connection

High Scalability – Dremel: Interactive Analysis of Web-Scale Datasets – Data as a Programming Paradigm

Tutorial: MapReduce with Riak « myNoSQL

High Scalability – How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data


Pregel: Google’s other data-processing infrastructure | Scalable web architectures

Apache Mahout – Overview
The Apache Mahout™ machine learning library’s goal is to build scalable machine learning libraries.

InfoQ: Billy Newport Discusses Parallel Programming in Java

Sector/Sphere: High Performance Distributed Data Storage and Processing

MapReduce – The Fanfiction « Snail in a Turtleneck

Map / Reduce – A visual explanation

An Introduction to JavaScript Map/Reduce in Riak on Vimeo

Graph algorithms (and MapReduce)

Using MapReduce Functionality To Process Data

My Links

More links about Hadoop and other systems are coming.

Stay tuned!

Angel "MapReduced" Lopez
