Angel “Java” Lopez on Blog

December 3, 2007

Grid Computing Programming

Filed under: Grid Computing, Software Development — ajlopez @ 10:53 am

In previous posts, I described my projects AjMessages and AjAgents, giving some source code to play with:

AjMessages: a message processor

Agents using Concurrency and Coordination Runtime (CCR)

Thanks to my experience with the venerable (and old) project AjServer (see Hacia el AjServer (Spanish)), I could write the core of AjMessages in a day, from scratch. It was a fun day of hard coding. Since then, I have spent many hours adapting the code to use a custom message instead of a WCF message, and adding support for pluggable input and output channels, so it can achieve transport independence, to some degree.

One of the features of AjMessages is to let many program instances communicate, possibly running on different machines. They can send messages with dynamic configuration, so we can distribute tasks to those machines at runtime. I must still resolve the remote distribution of assemblies; there are some ideas to explore at the end of this post. The same action (an action is the minimal logical step to execute) can be handled by different machines. The AjAgents project aims to use distributed agents, but for now it’s only a local application. I envision that any agent could run on any machine, in a transparent way. I think that the use of agents, or arbitrary tasks, could be a more flexible way of distribution, in contrast to message passing as in AjMessages. An agent is capable of launching many subtasks and assigning them to other agents in an asynchronous way; it can send partial results to partner agents, and it can talk and negotiate with many other agents in a more organized way.

Let’s explore some ideas for the AjMessages project. It could be distributed using a server machine that controls the other instances of the system:

This scenario has similarities with the concept of grid computing. I want to enumerate some scenarios that show how we can use the system in a grid of nodes and servers. “Grid Computing”, like many technological buzzwords, has a wide scope, but let’s try to define it.

According to the excellent article from IBM people:

New to Grid Computing

Grid Computing can use a pool of servers, storage systems, and networks as a single big system, in such a way that we can manage all those resources in the execution of a task. For the user, or for the application, the grid appears as a single system.

In the case of AjMessages, following Fabriq ideas, this behaviour can be obtained because the servers execute one or more applications, distributing each action in a transparent way.

The grid computing concept allows us to use more processing power without the need for expensive hardware or software, using load balancing and task distribution on common machines. Scalability is reached via scale out: the more machines in the grid, the better the results. Depending on the system that organizes the distribution, we can add more nodes and obtain more throughput without touching the application logic.
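
The scale-out idea can be sketched in a few lines of Python. This is only an illustration, not AjMessages code: a local thread pool stands in for the grid nodes, and the names process_action and run_on_grid are hypothetical. The point is that the application logic stays the same while the number of nodes changes.

```python
from concurrent.futures import ThreadPoolExecutor


def process_action(message: str) -> str:
    """Application logic: it knows nothing about how many nodes exist."""
    return message.upper()


def run_on_grid(messages: list[str], node_count: int) -> list[str]:
    """Distribute the messages over `node_count` workers.

    Assumption: a local thread pool stands in for remote grid nodes.
    """
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        # map() returns the results in the same order as the inputs.
        return list(pool.map(process_action, messages))


if __name__ == "__main__":
    messages = ["order-1", "order-2", "order-3", "order-4"]
    # Same application logic, different grid sizes.
    print(run_on_grid(messages, node_count=2))
    print(run_on_grid(messages, node_count=4))
```

Only the node_count argument changes between the two calls; in a real grid that number would be decided by the system that organizes the distribution, not by the application.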

I imagine a set of machines composing a grid, with that set exposed for use by users. I think that Grid as a Service is a term that can be coined to describe that arrangement.

Applications

Back to the main topic: what use cases or scenarios could we imagine for a grid?

Here is a tentative list:

- Genetic Algorithm Processing: A problem may not have a clear solution. Its complexity could grow exponentially, and then it can become intractable using conventional approaches. Using genetic algorithms, the program can test many partial solutions and, using variation and selection, discover better solutions. This work can be parallelized, making it an ideal task to run in a grid. I’m collecting some candidates at:

http://del.icio.us/ajlopez/geneticalgorithms

I’m impressed by the results of http://www.darwinathome.org, although I think most of those results are not emergent, but are consequences of the selected fitness function.

- Tree Search: In many artificial intelligence problems, it is necessary to explore the branches of a search tree. One case is the analysis of moves in a game. It can be extended to business decisions and planning. A grid can help in calculating the next move in a computer Go program, one of the hardest problems in artificial intelligence game programming.

- Web Crawler: The task that explores a site, gets its page contents, analyzes them, detects links, and continues the exploration to other linked pages is one that can be distributed across many nodes in a grid. While one node gets a page’s content, another generates tasks for other nodes, such as indexing the retrieved content and retrieving new pages.

- Batch Processing: A network of nodes can process a great amount of information, if this information can be split into parts. The job could be transforming data from a database table, analyzing logs, or generating statistics. If the input is divisible, each part can be sent to a different node. An example: one node could process January data while other nodes process the other months. ETL processing in general is another example.

- Email List Distribution: A typical case. A company that offers email list distribution needs to receive, process, and resend an incoming message to a list of recipients. The email could need some personalization. Then, the incoming email could be forwarded to one or more nodes in the grid for further processing.

- Message Processing: In the current SOA world, an application receives tons of XML messages. Each one needs control, transformation, and content routing. In a grid system, each message is routed to a node. When more throughput is needed, more nodes are added to the grid.

- Workflow Execution: As in the previous example, this is more a scale-out distributed task than a grid-specific one. A workflow can be designed, and each step can be assigned to a node or set of nodes. For example, in a SaaS application, the steps to provision a new tenant can be executed in a grid.

- Map Reduce: It’s a programming model for processing big data sets. A Map function is specified to process an input key/value pair (commonly many pairs), producing intermediate key/value pairs. A Reduce function is then applied to all intermediate key/value pairs that share the same key. For example, a Map function can receive a document to process and generate word/document pairs, and the Reduce function takes the pairs with the same word to make a list of the documents that contain that word. For a more detailed explanation, see the Google Labs paper MapReduce: Simplified Data Processing on Large Clusters.

- Biology and Genetic Software Applications: I’m interested in science in general, and in biology in particular. I guessed that there would be applications where a grid can be applied, and I think I’m right (recently, I reviewed the course material of Introducción a la Biología Molecular para Programadores, given by Sebastian Bassi and his partners). It’s interesting to find that there are implementations like BLAST that can be ported to a grid. See one such approach in the case studies of Digipede.

- Rendering and Image Processing: Many rendering, lighting, and realistic image generation tasks can be run in parallel.

- Animation Creation: Even if an image cannot be processed in parallel, sometimes we can launch different tasks, one for each image, in order to produce an animation. A grid can be used to scale out this heavy processing.

- Media Processing: Video compression, key frame detection, scene change detection, can be partitioned to be processed with a grid.

- Simulations: A wide subject. There are systems where it’s not clear what output would be produced given an input, so many input data sets must be processed. Each input data set could be given to a node or nodes in the grid. With more nodes, the simulation can produce more results.
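
The Map Reduce item in the list above can be sketched with its own word/document example. This is a minimal single-process illustration, not Google’s implementation: the function names map_document, reduce_word and map_reduce are hypothetical, and everything runs sequentially, while a grid would send each Map and Reduce call to a different node.

```python
from collections import defaultdict
from typing import Iterable


def map_document(doc_name: str, text: str) -> Iterable[tuple[str, str]]:
    """Map: emit one (word, document) pair per distinct word in the document."""
    for word in sorted(set(text.lower().split())):
        yield word, doc_name


def reduce_word(word: str, doc_names: list[str]) -> tuple[str, list[str]]:
    """Reduce: collect the documents that contain the word."""
    return word, sorted(set(doc_names))


def map_reduce(documents: dict[str, str]) -> dict[str, list[str]]:
    # Map phase: in a grid, each document could go to a different node.
    intermediate: dict[str, list[str]] = defaultdict(list)
    for doc_name, text in documents.items():
        for word, name in map_document(doc_name, text):
            # Group the intermediate pairs that share the same key.
            intermediate[word].append(name)
    # Reduce phase: each key could also be handled by a different node.
    return dict(reduce_word(word, names) for word, names in intermediate.items())


if __name__ == "__main__":
    docs = {"a.txt": "grid computing", "b.txt": "grid agents"}
    for word, names in map_reduce(docs).items():
        print(word, names)
```

Because every Map call depends on only one document and every Reduce call on only one word, both phases can be split among nodes without changing these two functions.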

Software and languages

A change of point of view: a grid can be exposed as a web service. An interface can be defined to send tasks to the grid, tasks that can be written by grid programmers using a special SDK or framework. What kind of software can be sent to a grid? Some options:

- Complete Assemblies, invoking some (predetermined or not) methods.

- Scripting Language Programs, running in a “sandbox” interpreter, in order to control the security and health of the node.

- Agents, consisting of assemblies or code to run in an agent virtual machine.

- Grid Domain Specific Languages, designed to take advantage of the grid computing concept.

Such a grid can be offered as a service to other services (even other grids). The concept of Grid as a Service emerges. Renting its power, service level control, health monitoring, and more are applications to consider in the future for these scenarios.
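
To make the idea of an interface for sending tasks more concrete, here is a rough sketch of what a client could see. Everything in it is an assumption for illustration: the GridService class and its submit, status and result methods are hypothetical names, and a local thread pool stands in for the grid nodes, while a real service would sit behind a web service and send code to remote machines.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable


class GridService:
    """Hypothetical facade for a grid offered as a service.

    Assumption: a local thread pool plays the role of the grid nodes.
    """

    def __init__(self, node_count: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=node_count)
        self._tasks: dict[str, Future] = {}

    def submit(self, task: Callable[..., Any], *args: Any) -> str:
        """Send a task to the grid and return an identifier for it."""
        task_id = str(uuid.uuid4())
        self._tasks[task_id] = self._pool.submit(task, *args)
        return task_id

    def status(self, task_id: str) -> str:
        """Report 'pending' or 'terminated', for monitoring purposes."""
        return "terminated" if self._tasks[task_id].done() else "pending"

    def result(self, task_id: str) -> Any:
        """Block until the task finishes and return its result."""
        return self._tasks[task_id].result()

    def shutdown(self) -> None:
        """Wait for the running tasks and release the workers."""
        self._pool.shutdown(wait=True)


if __name__ == "__main__":
    grid = GridService(node_count=2)
    task_id = grid.submit(sum, [1, 2, 3])
    print(grid.result(task_id), grid.status(task_id))
    grid.shutdown()
```

The task identifier is what would make renting, service level control and health monitoring possible: the caller only holds an id, and the service decides where and when the task runs.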

Links and resources

I’ll write in more detail about grid computing. For now, you can read the mentioned IBM article:

New to Grid Computing

There is an interesting open source implementation in Java:

GridGain

(The drawing at the beginning of this post was “inspired” by one from GridGain; but in my version, the nodes can communicate with each other, using the location independence of each action in AjMessages).

I’ve collected links about grid computing in my del.icio.us account (del.icio.us is addictive):

http://del.icio.us/ajlopez/gridcomputing

For this post, I’ve pay attention to

http://www.gridgain.com
http://www.digipede.net
http://www.gridgistics.net/

Digipede’s implementation is very interesting. They distribute assemblies. There is a server that receives tasks and distributes them to the grid nodes, where the Digipede agents are running. The system keeps a database of the launched, pending, and terminated tasks. It exposes a web interface for control. A user application can communicate with the Digipede server using a dedicated web service.

GridGain has a feature: “gridifying” a Java method, using an annotation: interesting idea to explore.
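
The same idea can be explored outside Java. The sketch below is not GridGain’s API: it is a Python analogy where a hypothetical gridify decorator plays the role of the annotation, and a shared local thread pool stands in for the grid. Calling a decorated function submits it for execution and returns a Future instead of the direct result.

```python
import functools
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable

# Assumption: a shared local thread pool plays the role of the grid.
_GRID = ThreadPoolExecutor(max_workers=4)


def gridify(func: Callable[..., Any]) -> Callable[..., Future]:
    """Hypothetical decorator: calling the function submits it to the grid.

    The caller gets a Future and asks for the result when it needs it.
    """

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Future:
        return _GRID.submit(func, *args, **kwargs)

    return wrapper


@gridify
def count_words(text: str) -> int:
    """Ordinary application code; only the decorator mentions the grid."""
    return len(text.split())


if __name__ == "__main__":
    futures = [count_words(t) for t in ["grid computing", "agents in a grid"]]
    print([f.result() for f in futures])
```

The attraction of this style is that the method body does not change: removing the decorator gives back a plain local function.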

Some crazy ideas

I might need medication, but here is a list of crazy ideas to implement:

- Code Generation in a Grid: Generating code, using my project AjGenesis or anything else, means executing a list of steps. Not all of these steps must be executed in order: most of them could be launched in parallel, ideally in a grid. A code generation engine can consist of agents, mini expert systems specialized in completing the model, making transformations, making decisions, and more, in order to generate a system. A grid can host all these pieces.

- Computer Go in a Grid: I mentioned it above, in relation to tree search. There is some work on gridifying GNUGo. For me, it’s a super interesting topic. Again, a community of agents running distributed in a grid can achieve more than a conventional approach. The game of Go is not like chess: no game program can beat a professional human yet. It merits more creative approaches to the problem. More about computer Go at:

http://del.icio.us/ajlopez/computergo
Computer Go

- “Gridified” Programming Language: I have ideas to extend AjBasic with CCR or something similar, or to implement something more oriented to functional programming, where some operators (list processing, others) could be easily gridifiable. It would be interesting to write such a language: its programs could run on a single machine but, transparently, could be distributed to multiple nodes on a grid. AjG# is coming… ;-)

Conclusion

As you can see, grid computing is a great topic. I want to thank Gabriel Szlechtman here: he suggested many of the enumerated scenarios.

Any other applications or implementations to comment on?

Angel “Java” Lopez
http://www.ajlopez.com/en

14 Comments »

  6. hi all,
    i am working in grid computing, my idea is to replicate checkpoints on all checkpoint servers, anyone if has any experience or program related to it please send me, i’ll b greatful.
    thanx in advance

    Comment by neha — February 9, 2010 @ 10:40 am

  7. Great post really thanks for all information by you.I like Genetic Algorithm Processing thing because if don’t about it we never cure for that.Generally i follow some alogrithm for optimization of website they tough but very effective…

    Comment by sami — September 17, 2010 @ 12:17 pm

  8. I am just getting into this stuff, thank you for the great info.

    Comment by Rolie R — July 27, 2011 @ 8:15 pm

  9. I am glad to get such type of useful article.

    Comment by Nazre Imam — June 4, 2012 @ 6:41 am

  10. Excellent post..Thanks for sharing it.

    Comment by Jewellery store in Ranchi — September 12, 2012 @ 8:01 am

  13. Nice Article.i really leran something .keep it up.

    Comment by Web Development Surrey — July 5, 2013 @ 6:57 am

