Angel “Java” Lopez on Blog

September 9, 2011

AjFabriq on NodeJs (Part 3) A Distributed Simple Application

Filed under: AjFabriq, Distributed Computing, JavaScript, NodeJs, Open Source Projects — ajlopez @ 11:22 am

Previous Post

Let’s run our “killer” application (a simple counter) on two nodes. In the repo, under examples\numbers, I have an appserver.js program:

It’s similar to my local example. The difference is that the top message processor listens on a port:

/**
 * Module dependencies.
 */
var ajfabriq = require('ajfabriq');
/**
 * Host.
 */
var host = ajfabriq.createLocalHost();
/**
 * Application configuration.
 */
 
var app = host.createProcessor('numbers', 'application');
var node = app.createProcessor('processor', 'node');
node.on('decrement', function (message) {
	console.log("Processing number " + message.number);
	
	if (message.number <= 1) {
		console.log("End Processing");
		return;
	}
		
	var number = message.number-1;
	
	this.post({ action: 'decrement', number: number });
});
// Accept messages from other nodes on port 3000
host.listen(3000);
// Start the countdown with a first message
host.process({ application: 'numbers', node: 'processor', action: 'decrement', number: 10 });

In this code, I’m using ajfabriq.createLocalHost() instead of .createProcessor(), and host.listen(3000) to accept messages from other nodes.

I run another program: appclient.js. It has the same local processors:

/**
 * Application configuration.
 */
 
var app = host.createProcessor('numbers', 'application');
var node = app.createProcessor('processor', 'node');
node.on('decrement', function (message) {
	console.log("Processing number " + message.number);
	if (message.number <= 1)
		return;
		
	var number = message.number-1;
	
	this.post({ action: 'decrement', number: number });
});

But it connects to the first server, and posts a new message:

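var net = require('net');

// Connect to the first server, which listens on port 3000
var socket = new net.Socket();
socket.connect(3000, 'localhost',
	function() {
		// Wrap the socket in a Channel and register it with the local host
		host.connect(new ajfabriq.Channel(socket), true);
		// Send the first message to the remote server
		socket.write(JSON.stringify({name : 'ajfmessage', message: { application: 'numbers', node: 'processor', action: 'decrement', number: 10 }}));
	}
);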

ajfabriq.Channel is the bidirectional channel between two ajfabriq servers.

This server’s output:

Note the exchange of messages between the two servers at the beginning. They are announcing their local processors to each other, so each server knows whether a message could be processed by the other server.

The first server’s reaction:

Some of the numbers are processed by the second server, and the others are routed to the first server. The routing is a simple random choice in this demo. LocalHost objects have a new .post method:

LocalHost.prototype.post = function (message) {
	var hosts = [ this ];
	
	for (var remote in this.remotes) {
		if (this.remotes[remote].accepts(message)) {
			hosts.push(this.remotes[remote]);
		}
	}
	var n = Math.floor(Math.random() * hosts.length);
	
	hosts[n].process(message);
};

Next steps: better routing, improved socket communication (detecting and splitting large JSON messages), logging, more sample apps.
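Why the socket item matters: a TCP socket delivers a byte stream, so one socket.write of JSON can arrive split across several 'data' events, or two messages can arrive glued together. One common approach (an assumption on my side, not necessarily what AjFabriq will end up using) is newline-delimited JSON. A minimal sketch, where encodeMessage and createLineParser are hypothetical names that do not exist in the library:

```javascript
// Hypothetical helpers, not part of AjFabriq: newline-delimited JSON framing.

// Serialize one message as a single line. JSON.stringify escapes newlines
// inside string values, so '\n' is safe to use as the delimiter.
function encodeMessage(message) {
	return JSON.stringify(message) + '\n';
}

// Returns a function that is fed raw chunks and calls onMessage
// once for every complete line received so far.
function createLineParser(onMessage) {
	var buffer = '';

	return function (chunk) {
		buffer += chunk.toString();
		var index;

		while ((index = buffer.indexOf('\n')) >= 0) {
			var line = buffer.substring(0, index);
			buffer = buffer.substring(index + 1);

			if (line.length > 0)
				onMessage(JSON.parse(line));
		}
	};
}
```

With a plain net socket, the parser would be attached with something like socket.on('data', createLineParser(handler)), after calling socket.setEncoding('utf8') so that multi-byte characters are not cut between chunks.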

Keep tuned!

Angel “Java” Lopez

http://www.ajlopez.com

http://twitter.com/ajlopez

September 8, 2011

AjFabriq on NodeJs (Part 2) A local Simple Application

Filed under: AjFabriq, Distributed Computing, JavaScript, NodeJs, Open Source Projects — ajlopez @ 11:40 am

Previous Post
Next Post

Let’s explore how to use AjFabriq on NodeJs. Here is a simple application:

https://github.com/ajlopez/AjFabriqJs/tree/master/examples/numbers

It implements the ultimate killer application: it receives a message with a number, and posts a message with the number minus one ;-). Let’s see how the application is defined:

/**
 * Module dependencies.
 */
 
var ajf = require('ajfabriq');

I included c:\Git in my NODE_PATH environment variable, and there is a c:\Git\ajfabriq folder containing my local git repo under development. You can clone the repo into your node_modules folder if you are using NodeJs 0.5.x (see Playing With NodeJs (1) Running on Windows (and Azure)).

AjFabriq defines an object that exposes some methods. This is the way to create a local message processor, one that is not exposed to other servers (there is a distributed example in the repo, to be reviewed in an upcoming post):

/**
 * Host.
 */
var host = ajf.createProcessor();

Now, a message processor (see previous post) can be a composite. Calling createProcessor on an existing processor creates a new processor and adds it to the parent processor:

/**
 * Application configuration.
 */
 
var app = host.createProcessor('numbers', 'application');
var node = app.createProcessor('processor', 'node');

The first processor, app, will accept and process messages whose “application” property has the value “numbers”. Its child processor, node, will process messages that also have a “node” property with the value “processor”. In this way, we can define a tree of message processors. The message properties and their values are the routing information, so each message is sent to the appropriate message processor.
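The matching rule can be sketched in a few lines. The real accepts method is not shown in this post, so this is an assumption: a processor created with createProcessor(name, kind) is taken to accept a message when message[kind] equals name. SketchProcessor is a hypothetical stand-in, not the AjFabriq class:

```javascript
// A minimal sketch of the matching rule, under the assumption stated above.
// SketchProcessor is hypothetical; it is not the AjFabriq implementation.
function SketchProcessor(name, kind) {
	this.name = name;
	this.kind = kind;
}

SketchProcessor.prototype.accepts = function (message) {
	// A processor without a kind (like the top one) accepts everything
	if (!this.kind)
		return true;

	return message[this.kind] === this.name;
};

var app = new SketchProcessor('numbers', 'application');
var node = new SketchProcessor('processor', 'node');
var message = { application: 'numbers', node: 'processor', action: 'decrement', number: 10 };

console.log(app.accepts(message));   // true
console.log(node.accepts(message));  // true
console.log(node.accepts({ application: 'numbers', node: 'other' })); // false
```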

But how do we define the leaf processor’s behavior? To align with NodeJs async processing, each processor inherits from EventEmitter. Since the previous post, I have defined Processor as a “subclass” of process.EventEmitter in the AjFabriq code:

var EventEmitter = process.EventEmitter;
// ...
function Processor(name, kind)
{
	this.name = name;
	this.kind = kind;
	this.processors = [];
}
Processor.prototype.__proto__ = EventEmitter.prototype;
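The process.EventEmitter alias and the __proto__ assignment reflect the NodeJs of this moment; process.EventEmitter is no longer available in current Node releases. A minimal sketch of the same idea using the events module and util.inherits (a restructuring for illustration only, not the AjFabriq source):

```javascript
// Sketch only: the same "subclass" relationship, written with the
// events module and util.inherits instead of process.EventEmitter.
var EventEmitter = require('events').EventEmitter;
var util = require('util');

function Processor(name, kind) {
	// Initialize the EventEmitter part of the object
	EventEmitter.call(this);
	this.name = name;
	this.kind = kind;
	this.processors = [];
}

// Processor.prototype now inherits from EventEmitter.prototype
util.inherits(Processor, EventEmitter);

var processor = new Processor('processor', 'node');

processor.on('decrement', function (message) {
	console.log('Processing number ' + message.number);
});

processor.emit('decrement', { number: 10 }); // prints "Processing number 10"
```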

Now, we can define the leaf processor behavior:

node.on('decrement', function (message) {
	console.log("Processing number " + message.number);
	
	if (message.number <= 1) {
		console.log("End Processing");
		return;
	}
		
	var number = message.number-1;
	
	this.post({ action: 'decrement', number: number });
});

The message processing detects an action property, and raises the corresponding event. Note the emit method invocation in the AjFabriq code:

Processor.prototype.process = function (message)
{
	if (this.processors == null || this.processors.length == 0) {
		this.emit(message.action, message);
		return;
	}
	
	for (var processor in this.processors)
		if (this.processors[processor].accepts(message)) 
		{
			this.processors[processor].process(message);
		}
}

 

At the end of my “killer” app sample, a message is sent to the top processor:

host.process({ application: 'numbers', node: 'processor', action: 'decrement', number: 10 });

The output:

But wait! Did you notice the ‘decrement’ code I showed above? There is a:

this.post({ action: 'decrement', number: number });

No application or node properties! Yes, that’s right. If you post a message from a processor (node in this case), the parent processors fill in the missing properties. That is, the application property is set to “numbers”, and the node property is set to “processor”. If you want to send a message to another application, you can explicitly set the corresponding property in your new message. Note that the recommended practice is not to change the original message: it may still be processed by other processors.
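How the parents could fill in the missing properties can be sketched like this. The post does not show the real post implementation, so everything here is an assumption: each processor copies the message, adds its own routing key if it is missing, and hands the copy to its parent. SketchProcessor and its parent argument are hypothetical:

```javascript
// Hypothetical sketch, not the AjFabriq source: completing the routing
// properties while a posted message travels up the processor tree.
function SketchProcessor(name, kind, parent) {
	this.name = name;
	this.kind = kind;
	this.parent = parent;
}

SketchProcessor.prototype.post = function (message) {
	// Work on a copy, so the original message is never changed
	var copy = {};

	for (var key in message)
		copy[key] = message[key];

	// Fill in this processor's routing property only if it is missing
	if (this.kind && copy[this.kind] === undefined)
		copy[this.kind] = this.name;

	if (this.parent)
		return this.parent.post(copy);

	// Top processor: a real host would route the message now;
	// this sketch just returns the completed message.
	return copy;
};

var host = new SketchProcessor();
var app = new SketchProcessor('numbers', 'application', host);
var node = new SketchProcessor('processor', 'node', app);

var original = { action: 'decrement', number: 9 };
var completed = node.post(original);

console.log(completed.application + ' / ' + completed.node); // numbers / processor
console.log(original.application); // undefined
```

Because a property is only filled in when it is missing, a message posted with an explicit application value keeps that value, which matches the way of sending a message to another application described above.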

For upcoming posts: new apps, a distributed example, more fun! ;-)

Keep tuned!


August 25, 2011

AjFabriq on NodeJs (Part 1) Introduction

Filed under: AjFabriq, Distributed Computing, JavaScript, NodeJs, Open Source Projects — ajlopez @ 10:28 am

Next Post

Some years ago I discovered the Fabriq project (thanks @asehmi!):

Remember Fabriq
FABRIQ has gone public!
Arvindra Sehmi Fabriq Articles
Clemens Vasters Fabriq Articles

Key points:

FABRIQ is an infrastructure for constructing networks of nodes processing and relaying messages. These nodes are hosted in machines running into a serviced application.

You can have multiple machines running the same or different applications. The “network” is the application, a “node” is a collection of “actions”, and each action processes a message. More from the documentation:

These nodes can be hosted in any distribution on several machines according to a defined configuration, so there may be machines running a single node or several nodes, this association are made by specifying the host-name or machine identification associated with each node in the network.

Each of these machines is running a serviced application responsible for starting and stopping its Host and Nodes which are the application main components. The host is responsible for handling the configuration, loading and unloading nodes and receives the messages and delivers them to the appropriate Node.

Last weekend, I started a JavaScript project to run on NodeJs, based on Fabriq ideas:

A Distributed Application Framework for NodeJs https://github.com/ajlopez/AjFabriqJs

A simple one-file implementation, with four JavaScript “classes”.

I want to run many NodeJs servers that host AjFabriq applications, sending messages across their network:

The original Fabriq had applications, nodes, and configuration. I simplified it, and now I have a simple Processor that accepts messages. It can produce zero, one, or more messages:

Messages are schema-less JSON objects that contain their own routing information:

A Processor could handle all the messages containing the key/value application: “webcrawler”. Some processors are composites: they contain other processors. So the webcrawler processor could have a child processor specialized in handling messages with the key/value node: “downloader”. See the picture? But the key/values and the depth of the processor tree ARE DECIDED by the developer. You decide whether to have “applications”, “actions”, “nodes”, or whatever you want.
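To make the idea concrete, here is a small sketch of such a tree for the webcrawler example. It is not the AjFabriq code: createNode, dispatch, the url property, and the “harvester” value are all hypothetical, and the routing keys (“application”, “node”) are just the developer’s choice:

```javascript
// Hypothetical sketch of a processor tree; not the AjFabriq source.
function createNode(key, value, handler) {
	return { key: key, value: value, handler: handler, children: [] };
}

// Returns the number of leaf handlers that processed the message
function dispatch(node, message) {
	// A node with a key only sees messages carrying its key/value
	if (node.key && message[node.key] !== node.value)
		return 0;

	// Leaf: handle the message, if a handler was given
	if (node.children.length === 0) {
		if (!node.handler)
			return 0;

		node.handler(message);
		return 1;
	}

	// Composite: offer the message to every child
	var count = 0;

	node.children.forEach(function (child) {
		count += dispatch(child, message);
	});

	return count;
}

var downloaded = [];
var root = createNode();
var crawler = createNode('application', 'webcrawler');
var downloader = createNode('node', 'downloader', function (message) {
	downloaded.push(message.url);
});

root.children.push(crawler);
crawler.children.push(downloader);

dispatch(root, { application: 'webcrawler', node: 'downloader', url: 'http://www.ajlopez.com' });
dispatch(root, { application: 'webcrawler', node: 'harvester', url: 'ignored' });

console.log(downloaded); // [ 'http://www.ajlopez.com' ]
```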

A host can have many processors defined. When a message is posted, a simple local method decides where to send it. It could be processed locally or it could be sent to a remote host. The host network is dynamic: a new host can be added at any time, and it can then collaborate in processing the current messages.

Pending topics for upcoming blog posts: implementation details, the simplest application, using plain net sockets, my failure with Socket.IO. Work in progress: make AjFabriq more NodeJs friendly using EventEmitter at key points, better dissemination of host info (hosted applications), a more robust implementation, and renaming Socket to Channel, a more descriptive word.

My previous work on distributed application samples:

AjMessages
AjAgents

Keep tuned!


August 23, 2011

Node.Js: Links, news, Resources (2)

Filed under: Distributed Computing, JavaScript, Links, NodeJs — ajlopez @ 10:15 am

Previous post
Next post

Since my previous post, I have been playing a bit with Node.Js, on Ubuntu and Windows (there is now a precompiled Windows version, ready to download and run, see http://nodejs.org/#download). I’m working on rewriting Fabriq ideas from @asehmi to run on pure NodeJs (check my progress at https://github.com/ajlopez/AjFabriqJs; for Fabriq, check http://ajlopez.wordpress.com/2007/10/15/remember-fabriq/)

http://en.wikipedia.org/wiki/Node.js

Node.js is an event-driven I/O server-side JavaScript environment based on V8. It is intended for writing scalable network programs such as web servers.[1] It was created by Ryan Dahl in 2009, and its growth is sponsored by Joyent, which employs Dahl.[2] [3]

Similar environments written in other programming languages include Twisted for Python, Perl Object Environment for Perl, libevent for C and EventMachine for Ruby. Unlike most JavaScript, it is not executed in a web browser, but is instead a form of server-side JavaScript. Node.js implements some CommonJS specifications.[4] Node.js includes a REPL environment for interactive testing.

Deploying Node.js With Upstart and Monit – How To Node – NodeJS
http://howtonode.org/deploying-node-upstart-monit

Custom Node.js Modules « joshdulac.com
http://joshdulac.com/index.php/custom-node-js-modules/

Writing Node.js Native Extensions | Cloudkick, manage servers better
https://www.cloudkick.com/blog/2010/aug/23/writing-nodejs-native-extensions/

How To Module – How To Node – NodeJS
http://howtonode.org/how-to-module

Installing Node.js and NPM on Ubuntu 10.04 and try a simple chat application
http://www.giantflyingsaucer.com/blog/?p=1688

Installing node.js on ubuntu 10.04
http://www.codediesel.com/linux/installing-node-js-on-ubuntu-10-04/

BDD and TDD for node.js? – Stack Overflow
http://stackoverflow.com/questions/4706020/bdd-and-tdd-for-node-js

Ubuntu 11.10 to support the Cloud Foundry Platform-as-a-Service
http://www.h-online.com/open/news/item/Ubuntu-11-10-to-support-the-Cloud-Foundry-Platform-as-a-Service-1324917.html

TJ Holowaychuk • commander.js – nodejs command-line interfaces made easy
http://tjholowaychuk.com/post/9103188408/commander-js-nodejs-command-line-interfaces-made-easy

nodechat.js – Using node.js, backbone.js, socket.io, and redis to make a real time chat app
http://fzysqr.com/2011/02/28/nodechat-js-using-node-js-backbone-js-socket-io-and-redis-to-make-a-real-time-chat-app/

Node.js on Windows: Who Needs NPM? « ICED IN CODE
http://icewalker2g.wordpress.com/2011/07/23/node-js-on-windows-who-needs-npm/

node.js & socket.io fun – till’s blog
http://till.klampaeckel.de/blog/archives/133-node.js-socket.io-fun.html

Adam Coffman – Getting your feet wet with node.js and socket.io – Part 1
http://thecoffman.com/2011/02/21/getting-your-feet-wet-with-node.js-and-socket.io/

node.js: Building a graph of build times using the Go API at Mark Needham
http://www.markhneedham.com/blog/2011/08/13/node-js-building-a-graph-of-build-times-using-the-go-api/

Episode 10: Sprite 3D, Candy, NodeJS discussion, jQuery plugins | The Javascript Show
http://javascriptshow.com/episodes/10

Nodejs – A quick tour (v4)
http://www.slideshare.net/the_undefined/nodejs-a-quick-tour-v4

NodeSocket to launch hosting for Node.js apps
http://binzaman.com/2011/08/10/nodesocket-to-launch-hosting-for-node-js-apps/

Dear PHP, I’m leaving and yes, she’s sexier
http://blog.nodeping.com/2011/08/12/dear-php-im-leaving-and-yes-shes-sexier/

InfoQ: Ephemeralization or Heroku’s Evolution to a Polyglot Cloud OS
http://www.infoq.com/news/2011/08/heroku_polyglot

Understanding node.js » Debuggable Ltd
http://debuggable.com/posts/understanding-node-js:4bd98440-45e4-4a9a-8ef7-0f7ecbdd56cb

The Node.js Philosophy – blog.nodejitsu.com – scaling node.js applications one callback at a time.
http://blog.nodejitsu.com/the-nodejs-philosophy

Using node.js async library reminds me of continuations and monads – The Web and all that Jazz
http://iamwil.posterous.com/64271154

Real time online activity monitor example with node.js and WebSocket
http://lchandara.wordpress.com/2011/08/07/real-time-online-activity-monitor-example-with-node-js-and-websocket/

Windows Azure Storage for Node.js
http://blogs.southworks.net/jpgarcia/2011/08/06/windows-azure-storage-for-node-js

Step by step instructions to install NodeJS on Windows
http://lchandara.wordpress.com/2011/08/05/step-by-step-instructions-to-install-nodejs-on-windows/

Easy Node.js Apps With Lisp
http://confreaks.net/videos/485-larubyconf2011-easy-node-js-apps-with-lisp

LearnBoost/tobi – GitHub
https://github.com/LearnBoost/tobi

NodeJS is easy, just ask her out
http://vincentwoo.com/2011/06/10/nodejs-is-easy-just-ask-her-out/

npm – Node Package Manager
http://npmjs.org/

nowjs for Node – Directly call remote functions in Javascript
http://nowjs.com/

Websockets everywhere with Socket.IO
http://howtonode.org/websockets-socketio

Running the “Express” web development framework on Node for Windows
http://weblogs.asp.net/cibrax/archive/2011/08/05/running-the-express-web-development-framework-on-node-for-windows.aspx

Orkis – online web based multiplayer tetris
http://orkis.skdev.me/

node.js – A giant step backwards – Fenn’s Thoughts
http://fenn.posterous.com/nodejs-a-giant-step-backwards

Learning Javascript with Object Graphs
http://howtonode.org/object-graphs

Bricks.js – Documentation
http://bricksjs.com/documentation.html

Node.js Beginner Book
http://davidhayden.com/blog/dave/archive/2011/07/31/NodejsBeginnerBook.aspx

The Node Beginner Book » A comprehensive Node.js tutorial
http://www.nodebeginner.org/

Node.js Creator Ryan Dahl’s Keynote from NodeConf
http://www.readwriteweb.com/hack/2011/07/nodejs-creator-ryan-dahls-keyn.php

Node.js, Ruby, and Python in Windows Azure
http://channel9.msdn.com/Shows/Cloud+Cover/Cloud-Cover-Episode-48-Nodejs-Ruby-and-Python-in-Windows-Azure

Reactor pattern
http://en.wikipedia.org/wiki/Reactor_pattern

The New Heroku (Part 2 of 4): Node.js & New HTTP Capabilities
http://blog.heroku.com/archives/2011/6/22/the_new_heroku_2_node_js_new_http_routing_capabilities/

Microsoft working with Joyent and the Node community to bring Node.js to Windows
http://blogs.msdn.com/b/interoperability/archive/2011/06/23/microsoft-working-with-joyent-and-the-node-community-to-bring-node-js-to-windows.aspx

Porting Node to Windows With Microsoft’s Help
http://blog.nodejs.org/2011/06/23/porting-node-to-windows-with-microsoft%E2%80%99s-help/

Do I need a server to use HTML5′s WebSockets? – Stack Overflow
http://stackoverflow.com/questions/1530023/do-i-need-a-server-to-use-html5s-websockets

Single Page Apps with Node.js. – blog.nodejitsu.com – scaling node.js applications one callback at a time.
http://blog.nodejitsu.com/single-page-apps-with-nodejs

Node Inspector – Node.js Debugger
http://www.youtube.com/watch?v=AOnK3NVnxL8

Where does node.js fit in a Microsoft / Azure / WCF / MVC project? – Stack Overflow
http://stackoverflow.com/questions/5269892/where-does-node-js-fit-in-a-microsoft-azure-wcf-mvc-project

NodeJS Tutorial with CouchDB and Haml – ErdNodeFlips
http://www.robsearles.com/2010/05/28/nodejs-tutorial-with-couchdb-and-haml-erdnodeflips/

isaacs/nave – GitHub
https://github.com/isaacs/nave
Virtual Environments for Node

Comparing clojure and node.js for speed
http://swizec.com/blog/comparing-clojure-and-node-js-for-speed/swizec/1593

node core vs userland – GitHub
https://github.com/joyent/node/wiki/node-core-vs-userland

Calipso – A Node CMS
http://calip.so/

My links:
http://www.delicious.com/ajlopez/nodejs

August 15, 2011

Hadoop: Links, News and Resources (1)

Filed under: Distributed Computing, Open Source Projects, Scalability — ajlopez @ 9:54 am

After my posts with links about Scalability and MapReduce, it’s time to share my links about Hadoop (thanks to @asehmi for his links):

http://hadoop.apache.org/

The Apache™ Hadoop™ project develops open-source software for reliable, scalable, distributed computing.

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

The project includes these subprojects:

Other Hadoop-related projects at Apache include:

  • Avro™: A data serialization system.
  • Cassandra™: A scalable multi-master database with no single points of failure.
  • Chukwa™: A data collection system for managing large distributed systems.
  • HBase™: A scalable, distributed database that supports structured data storage for large tables.
  • Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
  • Mahout™: A Scalable machine learning and data mining library.
  • Pig™: A high-level data-flow language and execution framework for parallel computation.
  • ZooKeeper™: A high-performance coordination service for distributed applications.

http://wiki.apache.org/hadoop/

Papers – Hadoop Wiki
http://wiki.apache.org/hadoop/Papers

HDFS
http://hadoopblog.blogspot.com/

Realtime Hadoop usage at Facebook — Part 1
http://hadoopblog.blogspot.com/2011/05/realtime-hadoop-usage-at-facebook-part.html

HDFS: Realtime Hadoop usage at Facebook — Part 2 – Workload Types
http://hadoopblog.blogspot.com/2011/05/realtime-hadoop-usage-at-facebook-part_28.html

The top five most powerful Hadoop projects – SD Times: Software Development News
http://www.sdtimes.com/l/35596

How to Deploy a Hadoop Cluster on Windows Azure – Windows Azure
http://blogs.msdn.com/b/windowsazure/archive/2011/05/17/how-to-deploy-a-hadoop-cluster-on-windows-azure.aspx

Hadoop in Azure – Distributed Development
http://blogs.msdn.com/b/mariok/archive/2011/05/11/hadoop-in-azure.aspx

Radoop – It’s Like Yahoo Pipes for Hadoop | SiliconANGLE
http://siliconangle.com/blog/2011/08/11/radoop-its-like-yahoo-pipes-for-hadoop/?

Introduction to MapReduce and Hadoop
http://www.theserverside.com/discussions/thread.tss?thread_id=62376

Mapreduce & Hadoop Algorithms in Academic Papers (4th update – May 2011)
http://atbrox.com/2011/05/16/mapreduce-hadoop-algorithms-in-academic-papers-4th-update-may-2011/

Interning at Facebook: Bridging Marketing and Engineering (18)
http://www.facebook.com/note.php?note_id=10150254305343920

High Performance Computing: Understanding What is Hadoop
http://patodirahul.blogspot.com/2011/03/understanding-what-is-hadoop.html

Microsoft adds Hadoop support to SQL Server, data warehouse
http://www.tmcnet.com/usubmit/2011/08/10/5696037.htm

Parallel Data Warehouse News and Hadoop Interoperability Plans – SQL Server Team Blog
http://blogs.technet.com/b/dataplatforminsider/archive/2011/08/08/parallel-data-warehouse-news-and-hadoop-interoperability-plans.aspx

Cascading
http://www.cascading.org/
Cascading is a Data Processing API, Process Planner, and Process Scheduler used for defining and executing complex, scale-free, and fault tolerant data processing workflows on an Apache Hadoop cluster. All without having to ‘think’ in MapReduce.

Twitter Engineering: A Storm is coming: more details and plans for release
http://engineering.twitter.com/2011/08/storm-is-coming-more-details-and-plans.html
"A Storm cluster is superficially similar to a Hadoop cluster"

Preview of Storm: The Hadoop of Realtime Processing – BackType Technology
http://tech.backtype.com/preview-of-storm-the-hadoop-of-realtime-proce

Mesos: Dynamic Resource Sharing for Clusters
http://www.mesosproject.org/
Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, MPI, Hypertable, Spark (a new framework for low-latency interactive and iterative jobs), and other applications.

Big Analytics for Big Data on Hadoop
http://karmasphere.com/

More about Big Data
http://www.bigdata.com/bigdata/blog
Good white papers

Hadoop Summit 2010 – Yahoo! Developer Network
http://developer.yahoo.com/events/hadoopsummit2010/

DBMS Musings: Hadoop’s tremendous inefficiency on graph data management (and how to avoid it)
http://dbmsmusings.blogspot.com/2011/07/hadoops-tremendous-inefficiency-on.html

Hoop – Hadoop HDFS over HTTP | Apache Hadoop for the Enterprise | Cloudera
http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/

Bioinformatics and the Future of Hadoop
http://www.genomeweb.com/blog/bioinformatics-and-future-hadoop

Seven Java projects that changed the world – O’Reilly Radar
http://radar.oreilly.com/2011/07/7-java-projects.html

InfoQ: Introduction to Oozie
http://www.infoq.com/articles/introductionOozie
Within the Hadoop ecosystem, there is a relatively new component Oozie, which allows one to combine multiple Map/Reduce jobs into a logical unit of work, accomplishing the larger task

The Future of Hadoop in Bioinformatics | insideHPC.com
http://insidehpc.com/2011/07/03/the-future-of-hadoop-in-bioinformatics/

HDFS: Realtime Hadoop usage at Facebook: The Complete Story
http://hadoopblog.blogspot.com/2011/07/realtime-hadoop-usage-at-facebook.html

SNA Projects Blog : Tech Talk: Anil Madan (eBay) — “Hadoop at eBay”
http://sna-projects.com/blog/2011/06/hadoop-at-ebay/

Ceph as a scalable alternative to the Hadoop Distributed File System
http://www.usenix.org/publications/login/2010-08/openpdfs/maltzahn.pdf

The elephant in the room … Hadoop and BigData!
http://mikethetechie.com/post/6822576191/the-elephant-in-the-room-hadoop-and-bigdata

Hadoop, Hive and Redis for Foursquare Analytics :: myNoSQL
http://nosql.mypopescu.com/post/3872483038/hadoop-hive-and-redis-for-foursquare-analytics

The Hadoop Distributed File System
http://storageconference.org/2010/Papers/MSST/Shvachko.pdf

IBM Jeopardy: Building Watson: An Overview of the DeepQA Project
https://www.stanford.edu/class/cs124/AIMagzine-DeepQA.pdf
"To preprocess the corpus and create fast runtime indices we used Hadoop"

Jeopardy Goes to Hadoop :: myNoSQL
http://nosql.mypopescu.com/post/3406224331/jeopardy-goes-to-hadoop

ElephantDB, a Distributed Database for Working with Hadoop
http://www.readwriteweb.com/hack/2011/02/ravendb-a-distributed-database.php

InfoQ: Hadoop Redesign for Upgrades and Other Programming Paradigms
http://www.infoq.com/news/2011/02/hadoop_redesign

Riding the Elephant | The Molecular Ecologist
http://tomato.biol.trinity.edu/blog/2011/02/riding-the-elephant/

Yahoo focusing on Apache Hadoop, discontinuing “The Yahoo Distribution of Hadoop”
http://developer.yahoo.com/blogs/hadoop/posts/2011/01/announcement-yahoo-focusing-on-apache-hadoop-discontinuing-the-yahoo-distribution-of-hadoop/

Lessons learned putting Hadoop into production « Cloudera » Apache Hadoop for the Enterprise
http://www.cloudera.com/blog/2010/12/lessons-learned-putting-hadoop-into-production/

Dimensional Reduction – Apache Mahout – Apache Software Foundation
https://cwiki.apache.org/confluence/display/MAHOUT/Dimensional+Reduction

Beyond Hadoop – Next-Generation Big Data Architectures – NYTimes.com
https://www.nytimes.com/external/gigaom/2010/10/23/23gigaom-beyond-hadoop-next-generation-big-data-architectu-81730.html

Large Scale Natural Language Processing
http://us.pycon.org/media/2010/talkdata/PyCon2010/098/large-scale-nlp-pycon-2010.pdf

Hadoop and Realtime Cloud Computing | Cloud Computing Journal
http://cloudcomputing.sys-con.com/node/1572508

Hadoop and NoSQL Downfall Parody on Vimeo
http://vimeo.com/15782414

Hadoop: The Definitive Guide, Second Edition – O’Reilly Media
http://oreilly.com/catalog/9781449389734

Hadoop Ecosystem World-Map « Sanjay Sharma’s Weblog
http://indoos.wordpress.com/2010/08/16/hadoop-ecosystem-world-map/

MapReduce, Hadoop: Young, But Worth A Look — Data Management — InformationWeek
http://www.informationweek.com/news/business_intelligence/warehouses/showArticle.jhtml?articleID=226600088

Distributed data processing with Hadoop – Part-3: App Build
http://www.gnarc.com/tutorials/distributed-data-processing-with-hadoop-part-3-app-build

HDFS: Facebook has the world’s largest Hadoop cluster!
http://hadoopblog.blogspot.com/2010/05/facebook-has-worlds-largest-hadoop.html

High Availability MySQL: Hadoop and MySQL
http://mysqlha.blogspot.com/2007/10/hadoop-and-mysql.html

High Scalability – How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data

Realtime Search for Hadoop – Scalable Log Data Management with Hadoop, Part 3 « mgm technology blog
http://blog.mgm-tp.com/2010/06/hadoop-log-management-part3/

Behind Caffeine May Be Software to Inspire Hadoop 2.0
http://gigaom.com/2010/06/11/behind-caffeine-may-be-software-to-inspire-hadoop-2-0

Hadoop in a box
http://www.slideshare.net/tim.lossen.de/hadoop-in-a-box

Scalability of the Hadoop Distributed File System (Hadoop and Distributed Computing at Yahoo!)
http://developer.yahoo.net/blogs/hadoop/2010/05/scalability_of_the_hadoop_dist.html

Introduction to Hadoop, HBase, and NoSQL
http://www.slideshare.net/xefyr/introduction-to-hadoop-hbase-and-nosql

InfoQ: Horizontal Scalability via Transient, Shardable, and Share-Nothing Resources
http://www.infoq.com/presentations/Horizontal-Scalability

Neuroph on Hadoop: Massive Parallel Neural Network System? | NetBeans Zone
http://netbeans.dzone.com/neuroph-hadoop-nb

Pushing the Limits of Distributed Processing « Cloudera » Apache Hadoop for the Enterprise
http://www.cloudera.com/blog/2010/04/pushing-the-limits-of-distributed-processing/
April Fools’ joke ;-)

My Links
http://www.delicious.com/ajlopez/hadoop
http://www.delicious.com/ajlopez/hadoop+tutorial
http://www.delicious.com/ajlopez/hadoop+video
http://www.delicious.com/ajlopez/hadoop+nosql
http://www.delicious.com/ajlopez/hadoop+distributedcomputing
http://www.delicious.com/ajlopez/hadoop+scalability
http://www.delicious.com/ajlopez/hadoop+machinelearning
http://www.delicious.com/ajlopez/hadoop+artificialintelligence

More links are coming (distributed computing? NoSql?).

Keep tuned!


August 9, 2011

MapReduce: Links, News and Resources (1)

Filed under: Algorithms, Distributed Computing, Links — ajlopez @ 9:38 am

Two of my favorite topics in programming are algorithms and distributed computing. You can have both with MapReduce. These are some of my links (thanks to @asehmi for his help; he sent me some of these links).

http://en.wikipedia.org/wiki/MapReduce

MapReduce is a software framework introduced by Google in 2004 to support distributed computing on large data sets on clusters of computers.[1] Parts of the framework are patented in some countries.[2]

The framework is inspired by the map and reduce functions commonly used in functional programming,[3] although their purpose in the MapReduce framework is not the same as their original forms.[4]

MapReduce libraries have been written in C++, C#, Erlang, Java, OCaml, Perl, Python, PHP, Ruby, F#, R and other programming languages

MapReduce: Simplified Data Processing on Large Clusters
http://labs.google.com/papers/mapreduce.html

Parallel Processing Using the Map Reduce Programming Model
http://blog.diskodev.com/parallel-processing-using-the-map-reduce-prog

Graph Twiddling in a MapReduce World
http://www.computer.org/portal/web/csdl/doi/10.1109/MCSE.2009.120

Cloud9: a MapReduce library for Hadoop
http://www.umiacs.umd.edu/~jimmylin/cloud9/docs/index.html

MapSharp
http://mapsharp.codeplex.com/
An implementation of Map-Reduce in C#

Twister: iterative MapReduce
http://www.iterativemapreduce.org/

MySpace Qizmt – MySpace’s Open Source Mapreduce Framework
http://code.google.com/p/qizmt/

Cascading
http://www.cascading.org/
Cascading is a Data Processing API, Process Planner, and Process Scheduler used for defining and executing complex, scale-free, and fault tolerant data processing workflows on an Apache Hadoop cluster. All without having to ‘think’ in MapReduce.

Project Daytona – Microsoft Research
http://research.microsoft.com/en-us/projects/azure/daytona.aspx
Iterative MapReduce on Windows Azure

InfoQ: Introduction to Oozie
http://www.infoq.com/articles/introductionOozie
Combine multiple Map/Reduce jobs into a logical unit of work

InfoQ: Ville Tuulos on Big Data and Map/Reduce in Erlang and Python with Disco
http://www.infoq.com/interviews/tuulos-erlang-mapreduce

Spark Cluster Computing Framework
http://www.spark-project.org/

Preview of Storm: The Hadoop of Realtime Processing – BackType Technology
http://tech.backtype.com/preview-of-storm-the-hadoop-of-realtime-proce

Hadoop in Azure – Distributed Development – Site Home – MSDN Blogs
http://blogs.msdn.com/b/mariok/archive/2011/05/11/hadoop-in-azure.aspx

MapReduce: A Soft Introduction
http://www.javacodegeeks.com/2011/05/mapreduce-soft-introduction.html

Mapreduce & Hadoop Algorithms in Academic Papers
http://atbrox.com/2011/05/16/mapreduce-hadoop-algorithms-in-academic-papers-4th-update-may-2011/

MSDN Magazine: MapReduce in F# – Parsing Log Files with F#, MapReduce and Windows Azure
http://msdn.microsoft.com/en-us/magazine/gg983490.aspx

F#: With a few lines of code entered into the powershell and analyze gigabytes of cloud data! – Systems, architecture and engineering solutions!
http://blogs.msdn.com/b/socal-sam/archive/2011/04/26/f-with-a-few-lines-of-code-entered-into-the-powershell-and-analyze-gigabytes-of-cloud-data.aspx

Data-Intensive Text Processing with MapReduce
http://www.umiacs.umd.edu/~jimmylin/MapReduce-book-final.pdf

The Geomblog: Workshop on Parallelism, and a "breakthrough" in combinatorial geometry
http://geomblog.blogspot.com/2010/11/workshop-on-parallelism-and.html

Pragmatic Programming Techniques: Designing algorithms for Map Reduce
http://horicky.blogspot.com/2010/08/designing-algorithmis-for-map-reduce.html

Mapreduce and Hadoop Algorithms in Bioinformatics Papers | Abhishek Tiwari
http://www.abhishek-tiwari.com/2010/08/mapreduce-and-hadoop-algorithms-in-bioinformatics-papers.html

Pragmatic Programming Techniques: Map/Reduce to recommend people connection
http://horicky.blogspot.com/2010/08/mapreduce-to-recommend-people.html

High Scalability – Dremel: Interactive Analysis of Web-Scale Datasets – Data as a Programming Paradigm
http://highscalability.com/blog/2010/8/4/dremel-interactive-analysis-of-web-scale-datasets-data-as-a.html

Tutorial: MapReduce with Riak « myNoSQL
http://nosql.mypopescu.com/post/849130434/tutorial-mapreduce-with-riak

High Scalability – How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data

Pregel
http://portal.acm.org/citation.cfm?id=1807167.1807184

Pregel: Google’s other data-processing infrastructure | Scalable web architectures
http://www.royans.net/arch/pregel-googles-other-data-processing-infrastructure/

Apache Mahout – Overview
http://mahout.apache.org/
The Apache Mahout™ machine learning library’s goal is to build scalable machine learning libraries.

InfoQ: Billy Newport Discusses Parallel Programming in Java
http://www.infoq.com/interviews/billy-newport-parallel

Sector/Sphere: High Performance Distributed Data Storage and Processing
http://sector.sourceforge.net/

MapReduce – The Fanfiction « Snail in a Turtleneck
http://www.snailinaturtleneck.com/blog/2010/03/15/mapreduce-the-fanfiction/

Map / Reduce – A visual explanation
http://ayende.com/Blog/archive/2010/03/14/map-reduce-ndash-a-visual-explanation.aspx

An Introduction to JavaScript Map/Reduce in Riak on Vimeo
http://vimeo.com/9188550

Graph algorithms (and MapReduce)
http://www.umiacs.umd.edu/~jimmylin/cloud-2010-Spring/session5-slides.pdf

Using MapReduce Functionality To Process Data
http://freemakelove.info/http:/freemakelove.info/html/y2010/2145_using-mapreduce-functionality-to-process-data.html

My Links
http://www.delicious.com/ajlopez/mapreduce

More links about Hadoop and other systems are coming.

Keep tuned!

Angel "MapReduced" Lopez
http://www.ajlopez.com
http://twitter.com/ajlopez

June 13, 2011

Running AjSharp in Azure

Filed under: .NET, AjSharp, Azure, Distributed Computing, Open Source Projects — ajlopez @ 9:37 am

My weekend code kata was something I had been thinking about since last year: running AjSharp in Azure worker roles. The idea: a worker role instance receives queue messages containing AjSharp code as text, and executes it. The output is sent as a message to another queue.

The result was committed to my AjCodeKatas project: you must download trunk\Azure\AzureAjSharp AND trunk\AjLanguage (where the AjSharp projects reside).

The solution:

The projects:

AzureAjSharp.WorkerRole: sample worker role, with these lines added:

CloudStorageAccount account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
Processor processor = new Processor(account);
processor.Start();

Azure.AjSharp: the class library. It contains the Processor class. Its constructors need a cloud account and the names of the requests queue, the responses queue and the blob container. The requests queue holds messages with AjSharp code to execute. The responses queue receives the output text of those executions. The processor.Start() call above starts reading and processing AjSharp code.

AzureAjSharp.Console: it reads lines from the console. When it reads a “send” line, the text is converted to a cloud message and sent to the requests queue. It has a thread that reads the responses queue and prints the results.

AzureLibrary: auxiliary classes.

AjSharpVS2010, AjLanguageVS2010: AjSharp implementation.
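The request/response flow between the console and the worker role can be sketched with in-memory arrays. This is only an illustration in JavaScript, not the code of the Processor class: the `requests` and `responses` arrays, `execute` and `processNext` are hypothetical names, and the toy `execute` only understands `PrintLine("...")` lines instead of calling the AjSharp machine.

```javascript
// In-memory stand-ins for the two Azure queues (hypothetical, for illustration only).
const requests = [];
const responses = [];

// Toy interpreter: it only understands lines of the form PrintLine("<text>");
// The real worker role would hand the text to the AjSharp machine instead.
function execute(code) {
  const output = [];
  for (const line of code.split("\n")) {
    const match = /^\s*PrintLine\("(.*)"\);\s*$/.exec(line);
    if (match) output.push(match[1]);
  }
  return output.join("\n");
}

// One polling step: take a request, run it, publish its output text.
function processNext() {
  const code = requests.shift();
  if (code === undefined) return false; // nothing to do
  responses.push(execute(code));
  return true;
}

// The console side: send code, then read the responses.
requests.push('PrintLine("Hello, world");');
while (processNext()) {}
console.log(responses);
```

The point of the design is that the sender never talks to a worker directly: any worker instance can pick up the request, and the output comes back through the second queue.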

When I run the console application, I can send AjSharp code to worker roles:

And more: AjSharp supports Include(“filetobeincluded”); where the file contains AjSharp code. I modified the startup of the AjSharp machine so that its Include subroutine reads the content from a blob container.
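A minimal sketch of that idea, in JavaScript and with a Map instead of a real blob container. The names `makeInclude`, `container` and `run` are assumptions made for the example, and the blob content is a placeholder, not the real file:

```javascript
// Hypothetical blob container: blob name -> AjSharp source text (placeholder content).
const container = new Map([
  ["HelloWorld.ajs", "<source of HelloWorld.ajs>"],
]);

// Build an Include function bound to a container and to a `run` callback
// that stands in for "execute this source in the current machine".
function makeInclude(blobs, run) {
  return function include(name) {
    if (!blobs.has(name)) throw new Error("Blob not found: " + name);
    return run(blobs.get(name));
  };
}

const executed = [];
const include = makeInclude(container, (source) => executed.push(source));
include("HelloWorld.ajs");
console.log(executed); // the sources that were handed to the interpreter
```

Because the lookup is injected when the machine starts, the same Include call can read from the file system on a desktop and from blob storage in a worker role.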

A graph:

Then, I uploaded some simple code (the files are in the Examples folder of the Azure.AjSharp project) to the ajsfiles blob container (DevStorage in this test):

(I’m using Neudesic Azure Storage Explorer, but I could use CloudBerry Explorer for Azure Storage: it supports folders in a tree).

This is the test running (using Include) HelloWorld.ajs, and ForOneToTen.ajs:

Next steps:

- Write more utilities in AjSharp, to be included if they are needed: file and directory utilities, download and upload of blobs, send and receive messages using queues, broadcast messages to all worker instances, download and load of assemblies, etc. Sky is the limit! ;-)

Then, you (or your program) can dynamically send tasks and receive results. Nice to have: Guids to identify tasks and their results; web interface; results stored as blob texts; cache (and flush) of included blob files, etc…

Keep tuned!

Angel “Java” Lopez

http://www.ajlopez.com

http://twitter.com/ajlopez

February 15, 2011

Azure: Fractal application

Filed under: .NET, Azure, Cloud Computing, Distributed Computing — ajlopez @ 10:21 am

In January, I reimplemented my Fractal application, now using Azure (my Azure-related posts). The idea is to calculate each sector of a fractal image, using the power of worker roles, store the result in blobs, and consume them from a WinForm application.

This is the solution:

The source code is in my AjCodeKatas Google project. The code is at:

http://code.google.com/p/ajcodekatas/source/browse/#svn/trunk/Azure/AzureFractak

If you are too lazy to use SVN, this is the current frozen code: AzureFractal.zip.

The projects in the solution:

AzureFractal: the Azure cloud definition.

Fractal: it contains my original code from previous fractal applications. An independent library class.

Fractal.Azure: serialization utilities for fractal info, and a service facade to post that info to an Azure message queue.

AzureLibrary: utility classes I used in other Azure examples. They are evolving in each example.

FractalWorkerRole: the worker role that consumes messages indicating what sector (rectangle) of the Mandelbrot fractal to calculate.

Fractal.GUI: a client WinForm project that sends and receives messages to/from the worker role, using Azure queues.

You should configure the solution to have multiple startup projects:

The WinForm application sends a message to a queue, with the info about the fractal sector to calculate:

private void Calculate()
{
    Bitmap bitmap = new Bitmap(pcbFractal.Width, 
       pcbFractal.Height);
    pcbFractal.Image = bitmap;
    pcbFractal.Refresh();
    realWidth = realDelta * pcbFractal.Width;
    imgHeight = imgDelta * pcbFractal.Height;
    realMin = realCenter - realWidth / 2;
    imgMin = imgCenter - imgHeight / 2;
    int width = pcbFractal.Width;
    int height = pcbFractal.Height;
    Guid id = Guid.NewGuid();
    SectorInfo sectorinfo = new SectorInfo()
    {
        Id = id,
        FromX = 0,
        FromY = 0,
        Width = width,
        Height = height,
        RealMinimum = realMin,
        ImgMinimum = imgMin,
        Delta = realDelta,
        MaxIterations = colors.Length,
        MaxValue = 4
    };
    Calculator calculator = new Calculator();
    this.queue.AddMessage(
        SectorUtilities.FromSectorInfoToMessage(sectorinfo));
}
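The Calculator class is not shown in this post. As a rough sketch of what a sector calculation could look like with the SectorInfo fields above, here is an escape-time loop in JavaScript. It assumes that Delta is the step per pixel on both axes, that MaxValue (4) is the bound on the squared modulus, and that values are stored row by row; all three are guesses for illustration, not the original implementation.

```javascript
// Escape-time Mandelbrot over one sector (assumed layout: row by row).
function calculateSector(info) {
  const values = [];
  for (let y = 0; y < info.height; y++) {
    for (let x = 0; x < info.width; x++) {
      // Map the pixel to a point c = cr + ci*i in the complex plane.
      const cr = info.realMinimum + (info.fromX + x) * info.delta;
      const ci = info.imgMinimum + (info.fromY + y) * info.delta;
      let zr = 0, zi = 0, n = 0;
      // Iterate z = z*z + c until |z|^2 exceeds maxValue or we give up.
      while (n < info.maxIterations && zr * zr + zi * zi <= info.maxValue) {
        const next = zr * zr - zi * zi + cr;
        zi = 2 * zr * zi + ci;
        zr = next;
        n++;
      }
      values.push(n);
    }
  }
  return values;
}

const sample = calculateSector({
  fromX: 0, fromY: 0, width: 3, height: 2,
  realMinimum: -2, imgMinimum: -1, delta: 1,
  maxIterations: 50, maxValue: 4,
});
console.log(sample.length); // one value per pixel of the sector
```

Each value is an iteration count, which fits with MaxIterations being set to colors.Length: the count can index a color.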

The worker role reads messages from the queue, and deserializes each one into a SectorInfo:

while (true)
{
    CloudQueueMessage msg = queue.GetMessage();
    if (msg != null)
    {
        Trace.WriteLine(string.Format("Processing {0}", msg.AsString));
        SectorInfo info = SectorUtilities.FromMessageToSectorInfo(msg);

If the sector is too big, new messages are generated:

if (info.Width > 100 || info.Height > 100)
{
    Trace.WriteLine("Splitting message...");
    for (int x = 0; x < info.Width; x += 100)
        for (int y = 0; y < info.Height; y += 100)
        {
            SectorInfo newinfo = info.Clone();
            newinfo.FromX = x + info.FromX;
            newinfo.FromY = y + info.FromY;
            newinfo.Width = Math.Min(100, info.Width - x);
            newinfo.Height = Math.Min(100, info.Height - y);
            CloudQueueMessage newmsg = 
              SectorUtilities.FromSectorInfoToMessage(newinfo);
            queue.AddMessage(newmsg);
        }
}
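The splitting rule is easy to check in isolation. This is a small JavaScript sketch of the same nested loops, using plain objects instead of SectorInfo and an array instead of the queue (`splitSector` and the lowercase field names are made up for the example):

```javascript
// Split a rectangle into tiles of at most `size` x `size` pixels,
// mirroring the nested loops of the worker role.
function splitSector(info, size = 100) {
  const tiles = [];
  for (let x = 0; x < info.width; x += size) {
    for (let y = 0; y < info.height; y += size) {
      tiles.push({
        ...info, // like info.Clone(): keep id, minimums, delta, etc.
        fromX: info.fromX + x,
        fromY: info.fromY + y,
        width: Math.min(size, info.width - x),
        height: Math.min(size, info.height - y),
      });
    }
  }
  return tiles;
}

const tiles = splitSector({ fromX: 0, fromY: 0, width: 250, height: 120 });
console.log(tiles.length); // 3 columns x 2 rows = 6 tiles
```

The tiles at the right and bottom edges are smaller than 100 pixels, so the sum of the tile areas equals the area of the original sector and no pixel is calculated twice.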

If the sector is small enough, then it is processed:

Trace.WriteLine("Processing message...");
Sector sector = calculator.CalculateSector(info);
string blobname = string.Format("{0}.{1}.{2}.{3}.{4}", 
    info.Id, sector.FromX, sector.FromY, sector.Width, sector.Height);
CloudBlob blob = blobContainer.GetBlobReference(blobname);
MemoryStream stream = new MemoryStream();
BinaryWriter writer = new BinaryWriter(stream);
foreach (int value in sector.Values)
    writer.Write(value);
writer.Flush();
stream.Seek(0, SeekOrigin.Begin);
blob.UploadFromStream(stream);
stream.Close();
CloudQueueMessage outmsg = new CloudQueueMessage(blobname);
outqueue.AddMessage(outmsg);

A blob with the result is generated, and a message is sent to another queue to notify the client application.

The WinForm has a thread with a loop reading messages from the second queue:

string blobname = msg.AsString;
CloudBlob blob = this.blobContainer.GetBlobReference(blobname);
MemoryStream stream = new MemoryStream();
blob.DownloadToStream(stream);
blob.Delete();
this.inqueue.DeleteMessage(msg);
string[] parameters = blobname.Split('.');
Guid id = new Guid(parameters[0]);
int fromx = Int32.Parse(parameters[1]);
int fromy = Int32.Parse(parameters[2]);
int width = Int32.Parse(parameters[3]);
int height = Int32.Parse(parameters[4]);
int[] values = new int[width * height];
stream.Seek(0, SeekOrigin.Begin);
BinaryReader reader = new BinaryReader(stream);
for (int k = 0; k < values.Length; k++)
    values[k] = reader.ReadInt32();
stream.Close();
this.Invoke(
    (Action<int, int, int, int, int[]>)((x, y, w, h, v) => this.DrawValues(x, y, w, h, v)),
    fromx, fromy, width, height, values);

Note the use of .Invoke to run the drawing of the image in the UI thread.
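The worker and the client agree on two conventions: the blob name carries the sector coordinates, and the blob content is a flat list of 32-bit integers. A JavaScript sketch of both round trips follows. The function names and the sample id are invented for the example; the little-endian layout is the one BinaryWriter.Write(int) produces.

```javascript
// Build the blob name the same way the worker does: id.fromX.fromY.width.height
function toBlobName(id, s) {
  return [id, s.fromX, s.fromY, s.width, s.height].join(".");
}

// Parse it back. This works because a GUID contains hyphens but no dots.
function fromBlobName(name) {
  const [id, fromX, fromY, width, height] = name.split(".");
  return {
    id,
    fromX: Number(fromX), fromY: Number(fromY),
    width: Number(width), height: Number(height),
  };
}

// Pack values as 32-bit little-endian integers.
function packValues(values) {
  const view = new DataView(new ArrayBuffer(values.length * 4));
  values.forEach((v, k) => view.setInt32(k * 4, v, true));
  return view.buffer;
}

// Read them back in the same order.
function unpackValues(buffer) {
  const view = new DataView(buffer);
  const values = [];
  for (let k = 0; k < buffer.byteLength / 4; k++) {
    values.push(view.getInt32(k * 4, true));
  }
  return values;
}

const name = toBlobName("3f2504e0-4f89-11d3-9a0c-0305e82c3301",
  { fromX: 100, fromY: 200, width: 2, height: 2 });
console.log(name, unpackValues(packValues([1, 7, 255, 12])));
```

Putting the coordinates in the name means the client can size its array and place the sector without a second lookup.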

This is the WinForm app, after clicking the Calculate button. Note that the sectors are arriving:

Some sectors have still not arrived. You can drag the mouse to choose a new sector:

You can change the colors, clicking on New Colors button:

This is a sample application, a “proof of concept”. You will probably get better performance on a single machine. But the idea is that you can defer work to worker roles, especially if the work can be done in parallel (imagine a parallel render machine, for animations). If you run this application in Azure, with many worker roles, the performance could improve.

Next steps: implement a distributed web crawler, try distributed genetic algorithm, running in the Azure cloud.

Keep tuned!

Angel “Java” Lopez

http://www.ajlopez.com

http://twitter.com/ajlopez

December 13, 2010

Azure: Multithreads in Worker Role, an example

Filed under: .NET, Azure, Cloud Computing, Distributed Computing — ajlopez @ 9:41 am

In my previous post, I implemented a simple worker role, consuming and producing numbers from/to a queue. Now, I have a new app:

The worker role implements the generation of a Collatz sequence. See:

http://mathworld.wolfram.com/CollatzProblem.html
http://en.wikipedia.org/wiki/Collatz_conjecture
http://www.ericr.nl/wondrous/

You can download the solution from my AjCodeKatas Google project. The code is at:

http://code.google.com/p/ajcodekatas/source/browse/#svn/trunk/Azure/AzureCollatz

The initial page is simple:

Each number in the range is sent to the queue:

protected void btnProcess_Click(object sender, EventArgs e)
{
    int from = Convert.ToInt32(txtFromNumber.Text);
    int to = Convert.ToInt32(txtToNumber.Text);
    for (int k=from; k<=to; k++) 
    {
        CloudQueueMessage msg = new CloudQueueMessage(k.ToString());
        WebRole.Instance.NumbersQueue.AddMessage(msg);
    }
}

The worker role gets each of these messages, and calculates the Collatz sequence:

I added a new feature in Azure.Library: a MessageProcessor that consumes messages from a queue, in its own thread:

public MessageProcessor(CloudQueue queue, Func<CloudQueueMessage, bool> process)
{
    this.queue = queue;
    this.process = process;
}
public void Start()
{
    Thread thread = new Thread(new ThreadStart(this.Run));
    thread.Start();
}
public void Run()
{
    while (true)
    {
        try
        {
            CloudQueueMessage msg = this.queue.GetMessage();
            if (this.ProcessMessage(msg))
                this.queue.DeleteMessage(msg);
        }
        catch (Exception ex)
        {
            Trace.WriteLine(ex.Message, "Error");
        }
    }
}
public virtual bool ProcessMessage(CloudQueueMessage msg)
{
    if (msg != null && this.process != null)
        return this.process(msg);
    Trace.WriteLine("Working", "Information");
    Thread.Sleep(10000);
    return false;
}
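The important detail of this loop is that a message is deleted only when the handler returns true. A single iteration can be sketched in JavaScript with an in-memory queue. `FakeQueue` and `step` are hypothetical names, and the fake queue has no visibility timeout, so a message that keeps failing stays at the head:

```javascript
// In-memory queue with the get/delete split of Azure queues
// (simplified: getMessage just returns the head, or null when empty).
class FakeQueue {
  constructor(items) { this.items = [...items]; }
  getMessage() { return this.items.length > 0 ? this.items[0] : null; }
  deleteMessage(msg) { this.items = this.items.filter((m) => m !== msg); }
}

// One iteration of Run: delete the message only if the handler
// reports success. Errors are logged and the message is kept.
function step(queue, handler) {
  try {
    const msg = queue.getMessage();
    if (msg === null) return "idle"; // the C# version sleeps here
    if (handler(msg)) {
      queue.deleteMessage(msg);
      return "done";
    }
    return "kept";
  } catch (err) {
    console.log("Error: " + err.message);
    return "failed";
  }
}

const queue = new FakeQueue([{ body: "27" }]);
console.log(step(queue, (msg) => msg.body.length > 0), step(queue, () => true));
```

With a real Azure queue, a message that was read but not deleted becomes visible again after its timeout, so another processor can retry it.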

Then, the worker role launches a fixed number (12) of MessageProcessor instances: 11 in their own threads, plus one that runs in the main thread of the worker role. In this way, each worker role instance processes many messages at the same time. I guess that this is not needed in this example, but it was an easy “proof of concept” to test the idea. Part of the Run method in the worker role:

QueueUtilities qutil = new QueueUtilities(account);
CloudQueue queue = qutil.CreateQueueIfNotExists("numbers");
CloudQueueClient qclient = account.CreateCloudQueueClient();
for (int k=0; k<11; k++) 
{
    CloudQueue q = qclient.GetQueueReference("numbers");
    MessageProcessor p = new MessageProcessor(q, this.ProcessMessage);
    p.Start();
}
MessageProcessor processor = new MessageProcessor(queue, this.ProcessMessage);
processor.Run();

The ProcessMessage is in charge of the real work:

private bool ProcessMessage(CloudQueueMessage msg)
{
    int number = Convert.ToInt32(msg.AsString);
    List<int> numbers = new List<int>() { number };
    while (number > 1)
    {
        if ((number % 2) == 0)
        {
            number = number / 2;
            numbers.Add(number);
        }
        else
        {
            number = number * 3 + 1;
            numbers.Add(number);
        }
    }
    StringBuilder builder = new StringBuilder();
    builder.Append("Result:");
    foreach (int n in numbers)
    {
        builder.Append(" ");
        builder.Append(n);
    }
    Trace.WriteLine(builder.ToString(), "Information");
    return true;
}
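The same calculation, as a compact JavaScript sketch that is handy for checking the expected trace line (`collatz` is a name chosen for the example):

```javascript
// Collatz sequence from `start` down to 1: halve even numbers,
// replace odd numbers n with 3n + 1.
function collatz(start) {
  const numbers = [start];
  let n = start;
  while (n > 1) {
    n = n % 2 === 0 ? n / 2 : 3 * n + 1;
    numbers.push(n);
  }
  return numbers;
}

console.log("Result: " + collatz(6).join(" "));
// prints "Result: 6 3 10 5 16 8 4 2 1"
```

Note that the C# version uses int, so a sequence that climbs above Int32.MaxValue would overflow. For the small ranges entered in the web page this is unlikely to matter.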

The code of this example is in my

Next steps: more distributed apps (genetic algorithm, web crawler…)

Keep tuned!

Angel “Java” Lopez

http://www.ajlopez.com

http://twitter.com/ajlopez

December 9, 2010

Azure: a simple application

Filed under: .NET, Azure, Cloud Computing, Distributed Computing — ajlopez @ 9:19 am

This is my first post here, about Azure programming. An easy start: an application with one web role, and one worker role:

You can download the solution from my AjCodeKatas Google project. The code is at:

http://code.google.com/p/ajcodekatas/source/browse/#svn/trunk/Azure/AzureNumbers

In the initial web page you can enter a number to process:

If you send the number 10, this data is sent to a queue:

protected void btnProcess_Click(object sender, EventArgs e)
{
    int number = Convert.ToInt32(txtNumber.Text);
    CloudQueueMessage msg = new CloudQueueMessage(number.ToString());
    WebRole.NumbersQueue.AddMessage(msg);
}

The worker role reads the queue. It decrements the number, and if the result is still positive, puts it back in the queue:

        public override void Run()
        {
            // This is a sample worker implementation. Replace with your logic.
            Trace.WriteLine("NumberWorkerRole entry point called", "Information");
            CloudStorageAccount account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
            QueueUtilities qutil = new QueueUtilities(account);
            CloudQueue queue = qutil.CreateQueueIfNotExists("numbers");
            while (true)
            {
                CloudQueueMessage msg = queue.GetMessage();
                if (msg != null)
                {
                    int number = Convert.ToInt32(msg.AsString);
                    Trace.WriteLine(string.Format("Processing number: {0}", number), "Information");
                    number--;
                    if (number > 0)
                    {
                        CloudQueueMessage newmsg = new CloudQueueMessage(number.ToString());
                        queue.AddMessage(newmsg);
                    }
                    queue.DeleteMessage(msg);
                }
                else
                {
                    Thread.Sleep(10000);
                    Trace.WriteLine("Working", "Information");
                }
            }
        }
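To see what the worker does with the number 10, the loop can be simulated in JavaScript with an array as the queue. This is a sketch, not Azure code, and `drain` is a made-up name:

```javascript
// Process messages until the queue is empty, like the worker role,
// and return the numbers in the order they were processed.
function drain(queue) {
  const processed = [];
  while (queue.length > 0) {
    let number = Number(queue.shift()); // GetMessage + DeleteMessage
    processed.push(number);
    number--;
    if (number > 0) queue.push(String(number)); // put the new number back
  }
  return processed;
}

console.log(drain(["10"]).join(" ")); // prints "10 9 8 7 6 5 4 3 2 1"
```

One message entered in the web page produces ten messages in total, each one consumed by whichever worker instance reads the queue first.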

You can see the output at Development Fabric UI:

Note the use of AzureLibrary to create a Queue:

        public CloudQueue CreateQueueIfNotExists(string queuename)
        {
            CloudQueueClient queueStorage = this.account.CreateCloudQueueClient();
            CloudQueue queue = queueStorage.GetQueueReference(queuename);
            
            Trace.WriteLine("Creating queue...", "Information");
            Boolean queuecreated = false;
            while (queuecreated == false)
            {
                try
                {
                    queue.CreateIfNotExist();
                    queuecreated = true;
                }
                catch (StorageClientException e)
                {
                    if (e.ErrorCode == StorageErrorCode.TransportError)
                    {
                        Trace.TraceError(string.Format("Connect failure! The most likely reason is that the local " +
                            "Development Storage tool is not running or your storage account configuration is incorrect. " +
                            "Message: '{0}'", e.Message));
                        System.Threading.Thread.Sleep(5000);
                    }
                    else
                    {
                        throw;
                    }
                }
            }
            return queue;
        }

I borrowed part of this code from Azure SDK samples.
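The pattern here is "retry on transient errors, rethrow everything else". A JavaScript sketch of it follows. The names `retryTransient` and `isTransient` are assumptions, and unlike the C# code, which retries forever with a 5 second sleep, this version gives up after a maximum number of attempts:

```javascript
// Retry `action` while it fails with a transient error; rethrow anything else.
function retryTransient(action, isTransient, maxAttempts = 5) {
  for (let attempt = 1; ; attempt++) {
    try {
      return action();
    } catch (err) {
      if (!isTransient(err) || attempt >= maxAttempts) throw err;
      console.log("Attempt " + attempt + " failed: " + err.message);
      // A real implementation would wait here before the next attempt.
    }
  }
}

// Simulated storage that is not ready during the first two calls.
let calls = 0;
function createQueue() {
  calls++;
  if (calls < 3) {
    const err = new Error("storage not running");
    err.code = "TransportError";
    throw err;
  }
  return "numbers";
}

const created = retryTransient(createQueue, (err) => err.code === "TransportError");
console.log(created, calls);
```

Capping the attempts is a design choice: a worker that can never reach storage fails visibly instead of looping in silence.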

Next steps to explore:

- Add instrumentation to worker role

- Use more instances, and generate more messages (an explosion-like pattern)

- Add multithreading support in the worker role

- Example using table and blob storage

And the big ones:

- Inject and run AjSharp (or AjTalk) code at worker roles

- Implement a distributed application using roles (distributed genetic algorithm, distributed fractal or ray-tracer, Monte Carlo simulation, etc…)

Keep tuned!

Angel “Java” Lopez

http://www.ajlopez.com

http://twitter.com/ajlopez
