Category Archives: Algorithms

Bioinformatics: Links, News And Resources (4)

Birds of a Feather: Functional Programming in Bioinformatics | Commercial Users of Functional Programming
http://cufp.org/conference/sessions/2013/birds-feather-functional-programming-bioinformatic

Needleman Wunsch Algorithm in C# – CodeProject
http://www.codeproject.com/Tips/638377/Needleman-Wunsch-Algorithm-in-Csharp

BioSmalltalk: A pure object system and library for bioinformatics
http://bioinformatics.oxfordjournals.org/content/early/2013/07/09/bioinformatics.btt398.abstract

L Fu – Dao: a novel programming language for bioinformatics
http://www.slideshare.net/jandot/l-fu-dao-a-novel-programming-language-for-bioinformatics

Homepage: Max-Planck-Institut für Informatik
http://www.mpi-inf.mpg.de/

Functional DSLs for Biocomputation
http://www.infoq.com/presentations/Functional-DSL-Biocomputation

BioSmalltalk: How to do a BLAST from Smalltalk
http://biosmalltalk.blogspot.com.ar/2012/07/how-to-do-blast-from-smalltalk.html

BioSmalltalk
http://biosmalltalk.blogspot.com.ar/

BLAST URLAPI
http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/new/node1.html

Oxford Journals | Life Sciences & Mathematics & Physical Sciences | Bioinformatics
http://bioinformatics.oxfordjournals.org/

Cell – A Whole-Cell Computational Model Predicts Phenotype from Genotype
http://www.cell.com/abstract/S0092-8674(12)00776-3

My Links
http://delicious.com/ajlopez/bioinformatics

Stay tuned!

Angel “Java” Lopez
http://www.ajlopez.com
http://twitter.com/ajlopez

Bioinformatics: Links, News And Resources (3)

bioinformatics – node.js modules
https://nodejsmodules.org/tags/bioinformatics

YOKOFAKUN: Server-side javascript: translating a DNA with Node.js
http://plindenbaum.blogspot.com.ar/2010/12/server-side-javascript-translating-dna.html

Bioinformatics
http://paper.li/Karelman/1353414629

Karelman (Karelman) on Twitter
https://twitter.com/Karelman

sbassi/DNAFilter · GitHub
https://github.com/sbassi/DNAFilter

Bioinformatics Web Servers – University of Reading
http://www.reading.ac.uk/bioinf/

UCL-CS Bioinformatics: Introduction
http://bioinf.cs.ucl.ac.uk/introduction/

Python for Bioinformatics – Sebastian Bassi – Google Books
http://books.google.com.ar/books/about/Python_for_Bioinformatics.html?id=JCjRAAAACAAJ&redir_esc=y

Perl and Javascript: bioinformatics in a browser window
http://act.perl.org.il/ilpw2012/talk/3971

EMBER: Login
http://www.ember.man.ac.uk/login.php

Web apps for bioinformatics | KurzweilAI
http://www.kurzweilai.net/web-apps-for-bioinformatics

Contact Bioinformatics.fr
http://www.bioinformatics.fr/index.php

Bio-Javascript? – BioStar
http://www.biostars.org/p/5735/

bio-js – A bioinformatics framework in JavaScript – Google Project Hosting
http://code.google.com/p/bio-js/

The Sequence Manipulation Suite
http://www.bioinformatics.org/sms2/

biosmalltalk – Bioinformatics Library for Smalltalk – Google Project Hosting
http://code.google.com/p/biosmalltalk/

BioSmalltalk: A pure object system for doing bioinformatics with Smalltalk – SEQanswers
http://seqanswers.com/forums/showthread.php?t=25985

My Links
http://delicious.com/ajlopez/bioinformatics

Bioinformatics: Links, News And Resources (2)

Data mining, forecasting and bioinformatics competitions on Kaggle
http://www.kaggle.com/

thebird.nl
http://thebird.nl/
Pjotr is a scientist/biologist/open source programmer.

BioTeam
http://bioteam.net/
BioTeam is a high-performance consulting practice. We are dedicated to delivering objective, technology agnostic solutions to the life science researchers. We leverage the right technologies customized to our client’s unique needs in order to enable them to reach their scientific objectives.

ANNOVAR website
http://www.openbioinformatics.org/annovar/annovar_db.html
Preparation of local annotation databases

biotoolbox – Tools for querying and analysis of genomic data
http://code.google.com/p/biotoolbox/

NCBI HomePage
http://www.ncbi.nlm.nih.gov/

Calling SNPs with Samtools
http://ged.msu.edu/angus/tutorials-2011/snp_tutorial.html

Cytoscape: An Open Source Platform for Complex Network Analysis and Visualizatio…
http://www.cytoscape.org/

ROCR: Classifier Visualization in R
http://rocr.bioinf.mpi-sb.mpg.de/

OpenWetWare
http://openwetware.org/wiki/Main_Page
OpenWetWare is an effort to promote the sharing of information, know-how, and wisdom among researchers and groups who are working in biology & biological engineering.

Circos
http://circos.ca/

DNA seen through the eyes of a coder
http://ds9a.nl/amazing-dna/

Ensembl
http://www.ensembl.org/index.html
The Ensembl project produces genome databases for vertebrates and other eukaryotic species, and makes this information freely available online.

DNAnexus
https://dnanexus.com/
Powering the genomics revolution

Integrated system of aging biomarkers
http://www.sens.org/node/2223

Monte Carlo Method
http://en.wikipedia.org/wiki/Monte_Carlo_Method

The PubChem Project
http://pubchem.ncbi.nlm.nih.gov/

RCSB Protein Data Bank
http://www.rcsb.org/pdb/home/home.do

Welcome to BioConductor — bioconductor.org
http://www.bioconductor.org/

GoPubMed
http://www.gopubmed.org/

KEGG: Kyoto Encyclopedia of Genes and Genomes
http://www.genome.jp/kegg/

Entrez cross-database search
http://www.ncbi.nlm.nih.gov/sites/gquery

European Bioinformatics Institute
http://www.ebi.ac.uk/

UCSC Genome Browser Home
http://genome.ucsc.edu/

Uri Alon’s Molecular Cell Biology Lab
http://www.weizmann.ac.il/mcb/UriAlon/

the Gene Ontology
http://www.geneontology.org/

B.A.B.A
http://baba.sourceforge.net/
BABA is an applet that tries to explain how some basic algorithms of bioinformatics work.

Human Genome Project Information
http://www.ornl.gov/sci/techresources/Human_Genome/home.shtml

Main Page – partsregistry.org
http://partsregistry.org/Main_Page
Collection of genetic parts that can be mixed and matched to build synthetic biology devices and systems.

Python course in Bioinformatics
http://www.pasteur.fr/recherche/unites/sis/formation/python/

BiologicalNetworks
http://biologicalnetworks.net/index.php

Vadlo
http://www.vadlo.com/
Life Sciences Search Engine

What is Quirrel?
http://whatisquirrel.com/index.html
Quirrel is a purely declarative query language designed for performing analytics and statistics on large-scale, multi-structured data sets.

Bioinformatica
http://www.bioinformatica.info/

My Links
http://delicious.com/ajlopez/bioinformatics

Bioinformatics: Links, News and Resources (1)

Bioinformatics has many interesting problems, algorithms and software related to parallelism, distributed computing and scalability. This is my first list of links about this fascinating topic; more lists are coming.

Bioinformatics and the Future of Hadoop
http://www.genomeweb.com/blog/bioinformatics-and-future-hadoop

The Future of Hadoop in Bioinformatics | insideHPC.com
http://insidehpc.com/2011/07/03/the-future-of-hadoop-in-bioinformatics/

Clojure or Scala for bioinformatics/biostatistics/medical research – Stack Overflow
http://stackoverflow.com/questions/5250459/clojure-or-scala-for-bioinformatics-biostatistics-medical-research

Riding the Elephant | The Molecular Ecologist
http://tomato.biol.trinity.edu/blog/2011/02/riding-the-elephant/

Protein Structure Methods and Algorithms
http://www.amazon.com/Protein-Structure-Methods-Algorithms-Bioinformatics/dp/0470470593

HPCwire: Scientists Ratchet Up Understanding of Cellular Protein Factory
http://www.hpcwire.com/offthewire/Scientists-Ratchet-Up-Understanding-of-Cellular-Protein-Factory-111146994.html

Molecular Animation – Where Cinema and Biology Meet
http://www.nytimes.com/2010/11/16/science/16animate.html

Microsoft Research Makes Microsoft Biology Foundation and MODISAzure-Based Environmental Service
http://www.microsoft.com/Presspass/press/2010/oct10/10-12MSeSciencePR.mspx

Bioinformatics Programming Using Python
http://oreilly.com/catalog/9780596154509/

Computer gamers crack protein-folding puzzle
http://www.newscientist.com/article/mg20727725.100-computer-gamers-crack-proteinfolding-puzzle.html

Mapreduce and Hadoop Algorithms in Bioinformatics Papers | Abhishek Tiwari
http://www.abhishek-tiwari.com/2010/08/mapreduce-and-hadoop-algorithms-in-bioinformatics-papers.html

Gamers beat algorithms at finding protein structures
http://arstechnica.com/science/news/2010/08/gamers-beat-algorithms-for-finding-protein-structures.ars

Nature paper decision | Foldit
http://fold.it/portal/node/987897

Microsoft Biology Foundation 1.0 Released – Parallel Programming with .NET
http://blogs.msdn.com/b/pfxteam/archive/2010/07/12/10037496.aspx

The Molecular Programming Project – Caltech – U
http://www.molecular-programming.org/

Boris Schmid, PhD
http://www.mendeley.com/profiles/boris-schmid/

Research field: Biological Sciences – Bioinformatics
Theoretical / Systems Biology: modeling of evolution, population dynamics, epidemiology, immunology, virology, networks.

bioinformatics toolkit in clojure: what would that look like? – Clojure | Google Groups
http://groups.google.com/group/clojure/browse_thread/thread/46945143998def7f?hl=en

Saaien Tist: Encounter with incanter – about clojure, incanter and bioinformatics
http://saaientist.blogspot.com/2010/06/encounter-with-incanter-about-clojure.html

Paul W.K. Rothemund
http://www.dna.caltech.edu/~pwkr/
I am interested in how processes in biology and chemistry can actually act as computers and execute molecular algorithms

Python and databases (Mysql and SQLite) « Python for Bioinformatics
http://py4bio.com/2010/05/28/python_databases_mysql_sqlite/

DataAllure: Hadoop for DNA sequence analysis
http://blog.dataallure.com/2010/04/hadoop-for-dna-sequence-analysis.html

Multi-core Parallelization in Clojure – a Case Study
http://www.slideshare.net/adorepump/multicore-parallelization-in-clojure-a-case-study

Hadoop for Bioinfomatics – Deepak Singh on Vimeo
http://vimeo.com/7351342

Analyzing Human Genomes with Hadoop » Cloudera Hadoop & Big Data Blog
http://www.cloudera.com/blog/2009/10/15/analyzing-human-genomes-with-hadoop/

My Links:
http://www.delicious.com/ajlopez/bioinformatics

Angel “YesIHaveAGenoma” Lopez 🙂
http://www.ajlopez.com
http://twitter.com/ajlopez

SimpleGA (1) Genetic Algorithms in Javascript/Node.js

Two weeks ago, I started writing:

https://github.com/ajlopez/SimpleGA

a simple genetic algorithm base framework that supports populations, evaluation and mutator operators (I should add crossover operators). The algorithm is based on

A Genetic Algorithm Tutorial (pdf)

You can create a population of genotypes, each one with an evaluation function. The library evaluates each genotype and prepares a new population based on those values, copying the best individuals and then applying mutators. The relative fitness is v = fi / fa, where fi is the individual's fitness evaluation and fa is the average fitness. The integer part of v is the number of copies of an individual that survive to the next generation. The fractional part is the probability of adding one more copy. Each member of the new population is then passed through a randomly chosen mutator:

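// Excerpt: l, values, total, population and mutators come from the surrounding engine code
var newpopulation = [];

for (var k = 0; k < l; k++) {
	// skip genotypes with a negative evaluation
	if (values[k] < 0)
		continue;

	// relative fitness v = fi / fa (assuming total holds the average fitness fa described above)
	var fitness = values[k] / total;

	if (fitness < 0)
		continue;

	var ntimes = Math.floor(fitness);
	var fraction = fitness - ntimes;

	// the integer part is the number of guaranteed copies
	for (var j = 0; j < ntimes; j++)
		newpopulation.push(population[k]);

	// the fractional part is the probability of one extra copy
	if (fraction > 0 && Math.random() <= fraction)
		newpopulation.push(population[k]);
}

// pass every member of the new population through a randomly chosen mutator
if (mutators && mutators.length > 0) {
	l = newpopulation.length;
	var lm = mutators.length;

	for (k = 0; k < l; k++) {
		var mutator = mutators[Math.floor(Math.random() * lm)];
		newpopulation[k] = mutator.mutate(newpopulation[k]);
	}
}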
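
As a minimal sketch (not the SimpleGA code itself), the selection rule can be tried in isolation. The function name survivorCounts and the injectable random argument are assumptions made for this illustration, and the sketch assumes that fitness values are compared against their plain average:

```javascript
// Hypothetical helper for illustration only; it is not part of SimpleGA.
// Given raw fitness values, returns how many copies of each individual
// would go into the next generation, using v = fi / fa.
function survivorCounts(values, random) {
    random = random || Math.random; // injectable so a run can be reproduced

    var sum = 0;
    for (var i = 0; i < values.length; i++)
        sum += values[i];

    var average = sum / values.length; // fa
    var counts = [];

    for (var k = 0; k < values.length; k++) {
        // negative values (or a non-positive average) produce no copies
        if (values[k] < 0 || !(average > 0)) {
            counts.push(0);
            continue;
        }

        var v = values[k] / average;
        var ntimes = Math.floor(v);   // guaranteed copies
        var fraction = v - ntimes;    // chance of one extra copy

        if (fraction > 0 && random() <= fraction)
            ntimes++;

        counts.push(ntimes);
    }

    return counts;
}

// Fitness values 1, 2, 3 and 6 have average 3, so v is 1/3, 2/3, 1 and 2.
var counts = survivorCounts([1, 2, 3, 6], function () { return 0.5; });
console.log(counts); // [ 0, 1, 1, 2 ] with the fixed "random" value 0.5
```

With a real Math.random the first two individuals survive only some of the time, which is how weak genotypes still get an occasional chance.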

There is a sample implementing the Travelling Salesman Problem:

https://github.com/ajlopez/SimpleGA/blob/master/samples/tsp/tsp.js

It runs in the console using https://github.com/ajlopez/SimpleGA/blob/master/samples/tsp/program.js

Alternatively, you can launch a local web page:

https://github.com/ajlopez/SimpleGA/blob/master/samples/tsphtml/index.html

I wrote a client/server version (a web page in the browser, a Node.js program on the server) and a distributed version (the evaluation of many populations runs on many server nodes, coordinated by a Node.js server, with results shown in a browser).

Next steps: add crossover operators, short descriptions for the samples, and more samples.

Stay tuned!

Angel “Java” Lopez

http://www.ajlopez.com

http://twitter.com/ajlopez

MapReduce: Links, News and Resources (1)

Two of my favorite topics in programming are algorithms and distributed computing. You can have both with MapReduce. These are some of my links (thanks to @asehmi for his help; he sent me some of these links).

http://en.wikipedia.org/wiki/MapReduce

MapReduce is a software framework introduced by Google in 2004 to support distributed computing on large data sets on clusters of computers.[1] Parts of the framework are patented in some countries.[2]

The framework is inspired by the map and reduce functions commonly used in functional programming,[3] although their purpose in the MapReduce framework is not the same as their original forms.[4]

MapReduce libraries have been written in C++, C#, Erlang, Java, OCaml, Perl, Python, PHP, Ruby, F#, R and other programming languages.
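
To make the map and reduce roles concrete, here is a minimal single-process sketch in JavaScript of the classic word count. It only illustrates the programming model (map emits key/value pairs, the pairs are grouped by key, reduce folds each group) and does nothing distributed. The names mapReduce, wordMap and sumReduce are assumptions made for this example, not part of any library:

```javascript
// Illustration only: a single-process imitation of the MapReduce model.
function mapReduce(inputs, map, reduce) {
    var groups = new Map();

    // Map phase: every input yields a list of [key, value] pairs,
    // which are grouped by key (the "shuffle" step).
    inputs.forEach(function (input) {
        map(input).forEach(function (pair) {
            if (!groups.has(pair[0]))
                groups.set(pair[0], []);
            groups.get(pair[0]).push(pair[1]);
        });
    });

    // Reduce phase: the values of each key are folded into one result.
    var results = new Map();
    groups.forEach(function (values, key) {
        results.set(key, reduce(key, values));
    });

    return results;
}

// Emits [word, 1] for every word in a line of text.
function wordMap(line) {
    return line.toLowerCase().split(/\s+/).filter(Boolean).map(function (word) {
        return [word, 1];
    });
}

// Adds up all the counts emitted for one word.
function sumReduce(key, values) {
    return values.reduce(function (a, b) { return a + b; }, 0);
}

var wordCounts = mapReduce(['map reduce map', 'reduce data'], wordMap, sumReduce);
console.log(wordCounts.get('map'), wordCounts.get('reduce'), wordCounts.get('data')); // 2 2 1
```

In a real framework the inputs, the grouping step and the reducers are spread over many machines; the shape of the two user-supplied functions is what stays the same.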

MapReduce: Simplified Data Processing on Large Clusters
http://labs.google.com/papers/mapreduce.html

Parallel Processing Using the Map Reduce Programming Model
http://blog.diskodev.com/parallel-processing-using-the-map-reduce-prog

Graph Twiddling in a MapReduce World
http://www.computer.org/portal/web/csdl/doi/10.1109/MCSE.2009.120

Cloud9: a MapReduce library for Hadoop
http://www.umiacs.umd.edu/~jimmylin/cloud9/docs/index.html

MapSharp
http://mapsharp.codeplex.com/
An implementation of Map-Reduce in C#

Twister: iterative MapReduce
http://www.iterativemapreduce.org/

MySpace Qizmt – MySpace’s Open Source Mapreduce Framework
http://code.google.com/p/qizmt/

Cascading
http://www.cascading.org/
Cascading is a Data Processing API, Process Planner, and Process Scheduler used for defining and executing complex, scale-free, and fault tolerant data processing workflows on an Apache Hadoop cluster. All without having to ‘think’ in MapReduce.

Project Daytona – Microsoft Research
http://research.microsoft.com/en-us/projects/azure/daytona.aspx
Iterative MapReduce on Windows Azure

InfoQ: Introduction to Oozie
http://www.infoq.com/articles/introductionOozie
Combine multiple Map/Reduce jobs into a logical unit of work

InfoQ: Ville Tuulos on Big Data and Map/Reduce in Erlang and Python with Disco
http://www.infoq.com/interviews/tuulos-erlang-mapreduce

Spark Cluster Computing Framework
http://www.spark-project.org/

Preview of Storm: The Hadoop of Realtime Processing – BackType Technology
http://tech.backtype.com/preview-of-storm-the-hadoop-of-realtime-proce

Hadoop in Azure – Distributed Development – Site Home – MSDN Blogs
http://blogs.msdn.com/b/mariok/archive/2011/05/11/hadoop-in-azure.aspx

MapReduce: A Soft Introduction
http://www.javacodegeeks.com/2011/05/mapreduce-soft-introduction.html

Mapreduce & Hadoop Algorithms in Academic Papers
http://atbrox.com/2011/05/16/mapreduce-hadoop-algorithms-in-academic-papers-4th-update-may-2011/

MSDN Magazine: MapReduce in F# – Parsing Log Files with F#, MapReduce and Windows Azure
http://msdn.microsoft.com/en-us/magazine/gg983490.aspx

F#: With a few lines of code entered into the powershell and analyze gigabytes of cloud data! – Systems, architecture and engineering solutions!
http://blogs.msdn.com/b/socal-sam/archive/2011/04/26/f-with-a-few-lines-of-code-entered-into-the-powershell-and-analyze-gigabytes-of-cloud-data.aspx

Data-Intensive Text Processing with MapReduce
http://www.umiacs.umd.edu/~jimmylin/MapReduce-book-final.pdf

The Geomblog: Workshop on Parallelism, and a "breakthrough" in combinatorial geometry
http://geomblog.blogspot.com/2010/11/workshop-on-parallelism-and.html

Pragmatic Programming Techniques: Designing algorithms for Map Reduce
http://horicky.blogspot.com/2010/08/designing-algorithmis-for-map-reduce.html

Mapreduce and Hadoop Algorithms in Bioinformatics Papers | Abhishek Tiwari
http://www.abhishek-tiwari.com/2010/08/mapreduce-and-hadoop-algorithms-in-bioinformatics-papers.html

Pragmatic Programming Techniques: Map/Reduce to recommend people connection
http://horicky.blogspot.com/2010/08/mapreduce-to-recommend-people.html

High Scalability – Dremel: Interactive Analysis of Web-Scale Datasets – Data as a Programming Paradigm
http://highscalability.com/blog/2010/8/4/dremel-interactive-analysis-of-web-scale-datasets-data-as-a.html

Tutorial: MapReduce with Riak « myNoSQL
http://nosql.mypopescu.com/post/849130434/tutorial-mapreduce-with-riak

High Scalability – How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data

Pregel
http://portal.acm.org/citation.cfm?id=1807167.1807184

Pregel: Google’s other data-processing infrastructure | Scalable web architectures
http://www.royans.net/arch/pregel-googles-other-data-processing-infrastructure/

Apache Mahout – Overview
http://mahout.apache.org/
The Apache Mahout™ machine learning library’s goal is to build scalable machine learning libraries.

InfoQ: Billy Newport Discusses Parallel Programming in Java
http://www.infoq.com/interviews/billy-newport-parallel

Sector/Sphere: High Performance Distributed Data Storage and Processing
http://sector.sourceforge.net/

MapReduce – The Fanfiction « Snail in a Turtleneck
http://www.snailinaturtleneck.com/blog/2010/03/15/mapreduce-the-fanfiction/

Map / Reduce – A visual explanation
http://ayende.com/Blog/archive/2010/03/14/map-reduce-ndash-a-visual-explanation.aspx

An Introduction to JavaScript Map/Reduce in Riak on Vimeo
http://vimeo.com/9188550

Graph algorithms (and MapReduce)
http://www.umiacs.umd.edu/~jimmylin/cloud-2010-Spring/session5-slides.pdf

Using MapReduce Functionality To Process Data
http://freemakelove.info/http:/freemakelove.info/html/y2010/2145_using-mapreduce-functionality-to-process-data.html

My Links
http://www.delicious.com/ajlopez/mapreduce

More links about Hadoop and other systems are coming.

Stay tuned!

Angel "MapReduced" Lopez
http://www.ajlopez.com
http://twitter.com/ajlopez

Algorithms: Links, News, Resources (1)

Sometimes I find an article or paper about an algorithm, or a class of algorithms. These are some of my recent discoveries:

Texture synthesis
http://en.wikipedia.org/wiki/Texture_synthesis#Patch-based_texture_synthesis

First results from GHC’s new garbage collector – GHC
http://hackage.haskell.org/trac/ghc/blog/new-gc-preview

Superformula
http://en.wikipedia.org/wiki/Superformula

FunctionSource: Path finding with Canvas
http://functionsource.com/post/path-finding-with-canvas

F# Code: Hindley Milner Type Inference Sample Implementation
http://fsharpcode.blogspot.com/2010/08/hindley-milner-type-inference-sample.html

What is Hindley-Milner? (and why is it cool?) – Code Commit
http://www.codecommit.com/blog/scala/what-is-hindley-milner-and-why-is-it-cool

compiler – implementing type inference – Stack Overflow
http://stackoverflow.com/questions/415532/implementing-type-inference

LEGO Mindstorms Rubik’s Cube Solver
http://tiltedtwister.com/index.html

Algorithmic Game Theory and Artificial Intelligence
http://agtb.wordpress.com/2011/01/15/agt-and-ai/

Eternity II Solver
http://www.shortestpath.se/eii/index.html

the { buckblogs :here }: Maze Generation: Growing Tree algorithm
http://weblog.jamisbuck.org/2011/1/27/maze-generation-growing-tree-algorithm

Sorting Obsession
http://pepijndevos.nl/sorting-obsession/

Las máquinas poéticas de los libros imaginarios (i): Raimundo Lulio (Spanish)
http://laexcepciondelaregla.wordpress.com/2010/01/05/las-maquinas-poeticas-de-los-libros-imaginarios-i/

How many numbers are squares mod m
http://www.johndcook.com/blog/2008/11/19/how-many-numbers-are-squares-mod-m/

Maze Generation: Prim’s Algorithm
http://weblog.jamisbuck.org/2011/1/10/maze-generation-prim-s-algorithm

Maze Generation: Kruskal’s Algorithm
http://weblog.jamisbuck.org/2011/1/3/maze-generation-kruskal-s-algorithm

The Craig Web Experience: Understanding the Halting Problem
http://www.cgl.uwaterloo.ca/~csk/halt/

Amazon.com: Protein Structure Methods and Algorithms (Wiley Series in Bioinformatics) (9780470470596): Huzefa Rangwala, George Karypis: Books
http://www.amazon.com/Protein-Structure-Methods-Algorithms-Bioinformatics/dp/0470470593

Algorithmia
http://algorithmia.codeplex.com/
Algorithm and data-structure library for .NET 3.5 and up. Algorithmia contains sophisticated algorithms and data-structures like graphs, priority queues, command, undo-redo and more.

Azul’s Pauseless Garbage Collector
http://www.artima.com/lejava/articles/azul_pauseless_gc.html

Rete Algorithm
http://en.wikipedia.org/wiki/Rete_algorithm

Next generation of algorithms inspired by problem-solving ants
http://www.physorg.com/news/2010-12-algorithms-problem-solving-ants.html

Langton’s Ant
http://www.youtube.com/watch?v=1X-gtr4pEBU

An Events Based Algorithm for Distributing Concurrent Tasks on Multi-Core Architectures
http://geonumerics.mit.edu/publications/FinalReport01.pdf

YouTube – What different sorting algorithms sound like
http://www.youtube.com/watch?v=t8g-iYGHpEA

To Trie or not to Trie – a comparison of efficient data structures
http://bhavin.directi.com/to-trie-or-not-to-trie-a-comparison-of-efficient-data-structures/

How to differentiate a non-differentiable function — The Endeavour
http://bhavin.directi.com/to-trie-or-not-to-trie-a-comparison-of-efficient-data-structures/

The greatest program ever written
http://www.kuro5hin.org/story/2001/8/10/12620/2164
http://users.ox.ac.uk/~uzdm0006/scans/1kchess/

My links:
http://www.delicious.com/ajlopez/algorithm

Angel “Java” Lopez
http://www.ajlopez.com
http://twitter.com/ajlopez