Angel “Java” Lopez on Blog

May 25, 2013

Mass Programming Language (4) Lexer and Parser

Filed under: .NET, C Sharp, Mass, Open Source Projects, Programming Languages — ajlopez @ 10:38 am

Previous Post

In the Mass programming language implementation, I have an enumeration and a class:

A token represents a word of code to be processed. The Lexer is in charge of separating the code into words/tokens, and the Parser takes that token stream and returns expressions and commands.

The Lexer constructor receives a string:

public Lexer(string text)
{
    this.text = text;
}

This string is processed to be separated into tokens. Notice that the lexer distinguishes between operators (like +) and separators (like parentheses). It takes the end of line into account as a token, too (in other programming languages, like C, the end of line is simply blank space). The main method of Lexer is NextToken, which returns the next token in the code text. In some situations a consumed token needs to be saved for later, so there are methods like PushToken and its variants.
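To make the description concrete, here is a minimal sketch of such a lexer. It is not the actual Mass code: only the Lexer(string) constructor, NextToken and PushToken come from the text above. The TokenType values, the Token class and the exact operator and separator characters are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical token kinds; the real Mass enumeration may differ.
public enum TokenType { Name, Integer, String, Operator, Separator, EndOfLine }

public class Token
{
    public Token(TokenType type, string value)
    {
        this.Type = type;
        this.Value = value;
    }

    public TokenType Type { get; private set; }
    public string Value { get; private set; }
}

public class Lexer
{
    private const string Operators = "+-*/=<>";  // assumed operator characters
    private const string Separators = "()[]{},"; // assumed separator characters

    private readonly string text;
    private readonly Stack<Token> pushed = new Stack<Token>();
    private int position;

    public Lexer(string text)
    {
        this.text = text ?? string.Empty;
    }

    // Saves a consumed token so the next call to NextToken returns it again.
    public void PushToken(Token token)
    {
        this.pushed.Push(token);
    }

    // Returns the next token, or null when the text is exhausted.
    public Token NextToken()
    {
        if (this.pushed.Count > 0)
            return this.pushed.Pop();

        // Skip blanks, but not '\n': the end of line is a token of its own.
        while (this.position < this.text.Length && " \t\r".IndexOf(this.text[this.position]) >= 0)
            this.position++;

        if (this.position >= this.text.Length)
            return null;

        char ch = this.text[this.position++];

        if (ch == '\n')
            return new Token(TokenType.EndOfLine, "\n");
        if (Operators.IndexOf(ch) >= 0)
            return new Token(TokenType.Operator, ch.ToString());
        if (Separators.IndexOf(ch) >= 0)
            return new Token(TokenType.Separator, ch.ToString());

        if (ch == '"')
        {
            int end = this.text.IndexOf('"', this.position);
            if (end < 0)
                throw new InvalidOperationException("Unclosed string");
            string value = this.text.Substring(this.position, end - this.position);
            this.position = end + 1;
            return new Token(TokenType.String, value);
        }

        if (!char.IsLetterOrDigit(ch) && ch != '_')
            throw new InvalidOperationException("Unexpected character '" + ch + "'");

        // Names and integers: consume letters, digits and underscores.
        int start = this.position - 1;
        while (this.position < this.text.Length
            && (char.IsLetterOrDigit(this.text[this.position]) || this.text[this.position] == '_'))
            this.position++;

        string word = this.text.Substring(start, this.position - start);
        return new Token(char.IsDigit(ch) ? TokenType.Integer : TokenType.Name, word);
    }
}
```

With this sketch, a parser that reads one token too many can call PushToken to give it back, and the following NextToken call returns that same token.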

Internally, the Parser manages a Lexer. You can get the next command by calling the ParseCommand method, and the next expression by calling ParseExpression. When the text is exhausted, those methods return null.

I should modify the Lexer to consume a text stream instead of a string, so it could process a live input, like the console.

I should think about unifying commands and expressions, a la Ruby, where everything is an expression.

Keep tuned!

Angel “Java” Lopez

http://www.ajlopez.com

http://twitter.com/ajlopez

May 10, 2013

TDD Kata (4): Lawnmower

Previous Post

In the previous post, I wrote about my experience in the pre-round of Google Code Jam.

Exercise B was Lawnmower:

https://code.google.com/codejam/contest/2270488/dashboard#s=p1

Problem

Alice and Bob have a lawn in front of their house, shaped like an N metre by M metre rectangle. Each year, they try to cut the lawn in some interesting pattern. They used to do their cutting with shears, which was very time-consuming; but now they have a new automatic lawnmower with multiple settings, and they want to try it out.

The new lawnmower has a height setting – you can set it to any height h between 1 and 100 millimetres, and it will cut all the grass higher than h it encounters to height h. You run it by entering the lawn at any part of the edge of the lawn; then the lawnmower goes in a straight line, perpendicular to the edge of the lawn it entered, cutting grass in a swath 1m wide, until it exits the lawn on the other side. The lawnmower’s height can be set only when it is not on the lawn.

Alice and Bob have a number of various patterns of grass that they could have on their lawn. For each of those, they want to know whether it’s possible to cut the grass into this pattern with their new lawnmower. Each pattern is described by specifying the height of the grass on each 1m x 1m square of the lawn.

The grass is initially 100mm high on the whole lawn.

I solved it, see:

https://github.com/ajlopez/TddOnTheRocks/tree/master/Lawnmover

As usual, the commit history by test:

https://github.com/ajlopez/TddOnTheRocks/commits/master/Lawnmover

I could solve the first small set provided by Google, but I failed with the big set because my algorithm was not efficient enough. The big set must be solved within 8 minutes, and if you miss that window, there is no second chance. My initial algorithm (see the comments in the code) attacked the largest numbers in the cells first. My second algorithm switched to starting with the smallest numbers (my first guess was that both algorithms would take the same time, but I was wrong). What I learnt: some Code Jam problems are hard to test by hand because we don’t have the use cases (an input with its expected output), and that is the main difficulty.

More katas are coming.
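For reference, a well-known efficient check for this problem is sketched below. It is not either of the two algorithms mentioned above. It relies on this observation: the mower setting used on a row can never be lower than the tallest cell required in that row (and the same holds for columns), so a pattern is possible exactly when every cell equals the maximum of its row or the maximum of its column. The names Lawn and IsPossible are made up for this example.

```csharp
using System;

public static class Lawn
{
    // heights[r, c] is the desired grass height of each cell (1..100).
    public static bool IsPossible(int[,] heights)
    {
        int rows = heights.GetLength(0);
        int cols = heights.GetLength(1);
        int[] rowMax = new int[rows];
        int[] colMax = new int[cols];

        // First pass: tallest required height per row and per column.
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
            {
                rowMax[r] = Math.Max(rowMax[r], heights[r, c]);
                colMax[c] = Math.Max(colMax[c], heights[r, c]);
            }

        // Second pass: a cell lower than both maxima cannot be produced,
        // because any pass low enough to cut it would also cut a taller cell.
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                if (heights[r, c] < rowMax[r] && heights[r, c] < colMax[c])
                    return false;

        return true;
    }
}
```

The check visits each cell twice, so it runs in time proportional to N x M.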

Keep tuned!

Angel “Java” Lopez
http://www.ajlopez.com
http://twitter.com/ajlopez

May 8, 2013

TDD Kata (3): Tic-Tac-Toe-Tomek

Previous Post
Next Post

Two weeks ago, I participated in the pre-round of Google Code Jam. Exercise A was TicTacToeTomek:

https://code.google.com/codejam/contest/2270488/dashboard

Problem

Tic-Tac-Toe-Tomek is a game played on a 4 x 4 square board. The board starts empty, except that a single ‘T’ symbol may appear in one of the 16 squares. There are two players: X and O. They take turns to make moves, with X starting. In each move a player puts her symbol in one of the empty squares. Player X’s symbol is ‘X’, and player O’s symbol is ‘O’.

After a player’s move, if there is a row, column or a diagonal containing 4 of that player’s symbols, or containing 3 of her symbols and the ‘T’ symbol, she wins and the game ends. Otherwise the game continues with the other player’s move. If all of the fields are filled with symbols and nobody won, the game ends in a draw. See the sample input for examples of various winning positions.

Given a 4 x 4 board description containing ‘X’, ‘O’, ‘T’ and ‘.’ characters (where ‘.’ represents an empty square), describing the current state of a game, determine the status of the Tic-Tac-Toe-Tomek game going on. The statuses to choose from are:

  • “X won” (the game is over, and X won)
  • “O won” (the game is over, and O won)
  • “Draw” (the game is over, and it ended in a draw)
  • “Game has not completed” (the game is not over yet)

If there are empty cells, and the game is not over, you should output “Game has not completed”, even if the outcome of the game is inevitable.

An input file with different board positions can be downloaded, and our program should generate an output file with the result for each one: X won, O won, draw, or the game has not completed yet.
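The rules above can be sketched in a few lines of C#. This is only an illustration, not the code from the repository linked below; the names TomekBoard, Status, Lines and Wins are hypothetical. It assumes the board arrives as four strings of four characters each.

```csharp
using System.Collections.Generic;

public static class TomekBoard
{
    // rows: four strings of four characters, using 'X', 'O', 'T' and '.'.
    public static string Status(string[] rows)
    {
        bool hasEmpty = false;
        foreach (string row in rows)
            if (row.IndexOf('.') >= 0)
                hasEmpty = true;

        foreach (string line in Lines(rows))
        {
            if (Wins(line, 'X')) return "X won";
            if (Wins(line, 'O')) return "O won";
        }

        return hasEmpty ? "Game has not completed" : "Draw";
    }

    // Yields the 4 rows, the 4 columns and the 2 diagonals as strings.
    private static IEnumerable<string> Lines(string[] rows)
    {
        string diagonal = "", antiDiagonal = "";

        for (int k = 0; k < 4; k++)
        {
            yield return rows[k];

            string column = "";
            for (int r = 0; r < 4; r++)
                column += rows[r][k];
            yield return column;

            diagonal += rows[k][k];
            antiDiagonal += rows[k][3 - k];
        }

        yield return diagonal;
        yield return antiDiagonal;
    }

    // A line wins for a player when every square holds her symbol or 'T'.
    // The board has at most one 'T', so this means 4 symbols, or 3 plus 'T'.
    private static bool Wins(string line, char player)
    {
        foreach (char ch in line)
            if (ch != player && ch != 'T')
                return false;
        return true;
    }
}
```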

I wrote my solution using TDD, you can see the result at:

https://github.com/ajlopez/TddOnTheRocks/tree/master/TicTacToeTomek

The commit history, with test granularity:

https://github.com/ajlopez/TddOnTheRocks/commits/master/TicTacToeTomek

It was a simple exercise, well suited to TDD. Google accepted my solution for the small input file, and for the big one too. I should write about the other exercises (B, C, D), which are a bit more complicated for TDD:

- Each problem must be solved quickly. With TDD you can arrive at a working algorithm, but it may not be efficient enough.

- The algorithm to implement is not evident, and you have to write input cases of increasing difficulty.

- Sometimes, given an input state, it is hard to reach the right output (even by hand). Many of my attempts were rejected by Google without much information about the right solution, so you must review each result manually.

Keep tuned!

Angel “Java” Lopez
http://www.ajlopez.com
http://twitter.com/ajlopez

May 6, 2013

TDD Kata (2): Alien Language

Previous Post
Next Post

Some weeks ago, Google Code Jam was mentioned on the Spanish list TDDev. One of the past problems of this contest was Alien Language.

After years of study, scientists at Google Labs have discovered an alien language transmitted from a faraway planet. The alien language is very unique in that every word consists of exactly L lowercase letters. Also, there are exactly D words in this language.

Once the dictionary of all the words in the alien language was built, the next breakthrough was to discover that the aliens have been transmitting messages to Earth for the past decade. Unfortunately, these signals are weakened due to the distance between our two planets and some of the words may be misinterpreted. In order to help them decipher these messages, the scientists have asked you to devise an algorithm that will determine the number of possible interpretations for a given pattern.

A pattern consists of exactly L tokens. Each token is either a single lowercase letter (the scientists are very sure that this is the letter) or a group of unique lowercase letters surrounded by parenthesis ( and ). For example: (ab)d(dc) means the first letter is either a or b, the second letter is definitely d and the last letter is either d or c. Therefore, the pattern (ab)d(dc) can stand for either one of these 4 possibilities: add, adc, bdd, bdc.

I liked the problem. The problem page explains that we should read an input file with the words and patterns to process. We can download input files from that page and upload an output file with our solution, generated by our program. Then, Google reports whether the answer is correct or not. So, I wrote a solution using TDD:

https://github.com/ajlopez/TddOnTheRocks/tree/master/AlienLanguage

You can read commit history, with test granularity, at:

https://github.com/ajlopez/TddOnTheRocks/commits/master/AlienLanguage

One of the last things I did was to refactor the detection of words that match a pattern. I could improve it, maybe using a word tree instead of a list of words, as I did in SimpleBoggle.

At the end, I wrote a console program that accepts an input file and writes an output file, as the Google contest requires. I downloaded a big input file from Google, processed it, and uploaded the solution. Google answered: OK.
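As an illustration of the list-of-words approach, matching a pattern could look like the sketch below. It is not the repository code; AlienLanguage, ParsePattern and CountMatches are names invented for this example, and the pattern is assumed to be well formed (every opening parenthesis has its closing one).

```csharp
using System.Collections.Generic;

public static class AlienLanguage
{
    // Splits "(ab)d(dc)" into one string of allowed letters per position:
    // "ab", "d", "dc".
    public static List<string> ParsePattern(string pattern)
    {
        var tokens = new List<string>();
        int i = 0;

        while (i < pattern.Length)
        {
            if (pattern[i] == '(')
            {
                int close = pattern.IndexOf(')', i);
                tokens.Add(pattern.Substring(i + 1, close - i - 1));
                i = close + 1;
            }
            else
            {
                tokens.Add(pattern[i].ToString());
                i++;
            }
        }

        return tokens;
    }

    // Counts the dictionary words whose every letter is allowed at its position.
    public static int CountMatches(IEnumerable<string> words, string pattern)
    {
        List<string> tokens = ParsePattern(pattern);
        int count = 0;

        foreach (string word in words)
        {
            if (word.Length != tokens.Count)
                continue;

            bool matches = true;
            for (int k = 0; k < word.Length && matches; k++)
                matches = tokens[k].IndexOf(word[k]) >= 0;

            if (matches)
                count++;
        }

        return count;
    }
}
```

Each pattern is checked against every word, so the cost grows with D times L per pattern; a word tree would let whole groups of words be discarded at once.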

TDD rules ;-)

Keep tuned!

Angel “Java” Lopez
http://www.ajlopez.com
http://twitter.com/ajlopez

May 5, 2013

Mass Programming Language (3) Commands

Filed under: .NET, C Sharp, Mass, Open Source Projects, Programming Languages — ajlopez @ 6:16 pm

Previous Post
Next Post

Today let’s review command implementation in Mass (see repo). In the class library project, I have a dedicated namespace and folder for commands:

There are commands for if, while, for, for each, etc… Every command implements the ICommand interface:

public interface ICommand
{
    object Execute(Context context);
}

Notice that it is very similar to IExpression. But I wanted to keep commands and expressions separate, at least for this first implementation, in order to have a clear separation of basic concerns.

A typical example of a command is WhileCommand (partial code):

public class WhileCommand : ICommand
{
    private static int hashcode = typeof(WhileCommand).GetHashCode();

    private IExpression condition;
    private ICommand command;

    public WhileCommand(IExpression condition, ICommand command)
    {
        this.condition = condition;
        this.command = command;
    }

    public object Execute(Context context)
    {
        for (object value = this.condition.Evaluate(context); 
            value != null && !false.Equals(value);
            value = this.condition.Evaluate(context))
        {
            this.command.Execute(context);
            if (context.HasContinue())
                context.ClearContinue();
            if (context.HasBreak())
            {
                context.ClearBreak();
                break;
            }
        }

        return null;
    }
}

In Mass, null and false are the only false values; every other value is true. I should refactor the code to have a central IsFalse method, to be invoked in the While code above and in other commands, like IfCommand.

Another example of a command is ForEachCommand (partial code):
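Such a helper could be as small as the sketch below. The method name IsFalse comes from the paragraph above; the containing class Predicates is a hypothetical name, and the final location of the method in Mass may well be different.

```csharp
public static class Predicates
{
    // Only null and the boolean false count as false; anything else is true.
    public static bool IsFalse(object value)
    {
        return value == null || false.Equals(value);
    }
}
```

The loop condition in WhileCommand would then read !Predicates.IsFalse(value), which is equivalent to the current value != null && !false.Equals(value).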

public class ForEachCommand : ICommand
{
    private string name;
    private IExpression expression;
    private ICommand command;

    public ForEachCommand(string name, IExpression expression, ICommand command)
    {
        this.name = name;
        this.expression = expression;
        this.command = command;
    }

    public object Execute(Context context)
    {
        var values = (IEnumerable)this.expression.Evaluate(context);

        foreach (var value in values)
        {
            context.Set(this.name, value);
            this.command.Execute(context);
            if (context.HasContinue())
                context.ClearContinue();
            if (context.HasBreak())
            {
                context.ClearBreak();
                break;
            }
        }

        return null;
    }
}

Notice that in Mass every IEnumerable value can be used as the element provider for ForEachCommand. I added some methods to the current context to signal the presence of a break or continue (in the current version, I added return treatment, too).
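A minimal sketch of that signalling, reduced to two flags, is shown below. HasBreak, ClearBreak, HasContinue and ClearContinue are the methods used by the loops above; SetBreak and SetContinue are hypothetical names for whatever the break and continue commands call, and the real Context also stores variables.

```csharp
public class Context
{
    private bool hasBreak;
    private bool hasContinue;

    // Hypothetical setters, to be called when a break or continue is executed.
    public void SetBreak() { this.hasBreak = true; }
    public void SetContinue() { this.hasContinue = true; }

    // Queried and reset by the enclosing loop after each iteration.
    public bool HasBreak() { return this.hasBreak; }
    public void ClearBreak() { this.hasBreak = false; }
    public bool HasContinue() { return this.hasContinue; }
    public void ClearContinue() { this.hasContinue = false; }
}
```

For this to work, a command that runs a list of commands presumably has to stop at the first one that raises a flag, so control gets back to the loop.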

And, as usual, all this code was developed using TDD flow: you can read repo commit history.

Next posts: Lexer, Parser, Mass samples, Mass scripting, using Mass from our .NET programs.

Keep tuned!

Angel “Java” Lopez

http://www.ajlopez.com

http://twitter.com/ajlopez

April 30, 2013

Mass Programming Language (2) First Expressions

Filed under: .NET, C Sharp, Mass, Open Source Projects, Programming Languages — ajlopez @ 9:31 am

Previous Post
Next Post

Before using the Mass language in samples (see repo), I want to visit some implementation points. First, some news: there is now a solution (at https://github.com/ajlopez/Mass/blob/master/Src/Mass.sln) that can be compiled with Visual Studio C# Express.

The Mass solution has a class library project. There is a namespace dedicated to expressions:

An expression implements IExpression:

  
public interface IExpression
{
	object Evaluate(Context context);
}

A Context keeps a key/value dictionary, to save the current variables:

  
public void Set(string name, object value)
{
	this.values[name] = value;
}

public object Get(string name)
{
	if (this.values.ContainsKey(name))
		return this.values[name];

	if (this.parent != null)
		return this.parent.Get(name);

	return null;
}

(Notice that it supports nested contexts: each context can have a parent.)

A simple expression is ConstantExpression, which returns a constant without using the received context. A constant is any .NET value/object:
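To see the parent lookup at work, here is a self-contained sketch of a Context with the two methods shown above. The fields values and parent appear in the original fragment; the two constructors are an assumption about how a parent gets attached.

```csharp
using System.Collections.Generic;

public class Context
{
    private readonly Dictionary<string, object> values = new Dictionary<string, object>();
    private readonly Context parent;

    // Assumed constructors: a top-level context, and a child of a parent.
    public Context() : this(null) { }

    public Context(Context parent)
    {
        this.parent = parent;
    }

    // Always writes in this context, never in the parent.
    public void Set(string name, object value)
    {
        this.values[name] = value;
    }

    // Looks up the name here first, then in the chain of parents.
    public object Get(string name)
    {
        if (this.values.ContainsKey(name))
            return this.values[name];

        if (this.parent != null)
            return this.parent.Get(name);

        return null;
    }
}
```

With var global = new Context() and var local = new Context(global), a name set in global is visible through local.Get, while a Set on local shadows it without changing the value stored in global.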

  
public class ConstantExpression : IExpression
{
	// ...

	private object value;

	public ConstantExpression(object value)
	{
		this.value = value;
	}

	public object Evaluate(Context context)
	{
		return this.value;
	}

	// ...
}

Another expression is the one that evaluates a variable name in the current context:

  
public class NameExpression : IExpression
{
	// ...

	private string name;

	public NameExpression(string name)
	{
		this.name = name;
	}

	public object Evaluate(Context context)
	{
		return context.Get(this.name);
	}

	// ...
}

The context is important: it could be the context of the current function, of a closure, of the current object, of the current module, etc… The same name could refer to different variables depending on the context, as in other programming languages.

As usual, all these classes (expressions, Context, …) were written using the flow of TDD. You can check the test project and the repo commit history, where there is evidence of that flow.

Next posts: some additional expressions, commands, using Mass for scripting, using Mass from our .NET programs.

Keep tuned!

Angel “Java” Lopez

http://www.ajlopez.com

http://twitter.com/ajlopez

April 26, 2013

TDD Kata (1): Rock Paper Scissors Lizard Spock

Two weeks ago, I read on the Spanish list TDDev about a new kata published at the blog Aprendiendo TDD:

Piedra Papel Tijera Lagarto Spock

based on a problem published in (Spanish post)

http://www.solveet.com/exercises/Kata-Piedra-Papel-Tijera-Lagarto-Spock/20

I took the description from Wikipedia article:

http://en.wikipedia.org/wiki/Rock-paper-scissors-lizard-Spock

The rules of Rock-paper-scissors-lizard-Spock are:

  • Scissors cut paper
  • Paper covers rock
  • Rock crushes lizard
  • Lizard poisons Spock
  • Spock smashes (or melts) scissors
  • Scissors decapitate lizard
  • Lizard eats paper
  • Paper disproves Spock
  • Spock vaporizes rock
  • Rock breaks scissors

And then, I started to code the solution using TDD, in C#. You can check the result at:

https://github.com/ajlopez/TddOnTheRocks/tree/master/SpockGame

The commit history reveals commit by test:

https://github.com/ajlopez/TddOnTheRocks/commits/master/SpockGame

See the first test/commit: it didn’t compile. Then it compiled, but with a red result. Then I added the minimal code to pass the test, refactored, and so on.

I decided to have two tests, one per permutation: one testing that Scissors cut Paper, and the other testing that Paper is cut by Scissors. I could refactor the tests to have an internal method that checks both ways.

The initial design was based on:

- Having an instance of Game class (the alternative was to have static methods, directly in the class)

- Having an enumeration for game options (Play.Scissors, etc…)

- Having an enumeration for the play result (PlayResult.Tie, PlayResult.FirstPlayer…)

Instead of having a play result, I could add a method that compares two options, deciding which one is “greater”. Today, I could refactor the implementation to use that approach.

See the last refactor: I could have left the code as it was, with if commands deciding which play wins, to decide when the first player wins.

But I wrote a new way, based on the Wikipedia article’s suggestion:

One way to remember the rules is to remember the standard "rock-paper-scissors" ordering, where each gesture defeats the one before it, and is defeated by the one after. But then add the two novel gestures near the word they approximately rhyme with:

  1. Rock
  2. Spock
  3. Paper
  4. Lizard
  5. Scissors

In this expanded list, each gesture is defeated by the following two options, and defeats the preceding two.
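That ordering turns the ten rules into a single modular distance. The sketch below illustrates the idea and is not the repository code: Play.Scissors, PlayResult.Tie and PlayResult.FirstPlayer are mentioned in this post, while the numeric values, PlayResult.SecondPlayer, SpockRules and Compare are assumptions.

```csharp
// Enumeration values follow the expanded list: each gesture defeats the
// two that precede it (cyclically) and is defeated by the two that follow.
public enum Play { Rock = 0, Spock = 1, Paper = 2, Lizard = 3, Scissors = 4 }

public enum PlayResult { Tie, FirstPlayer, SecondPlayer }

public static class SpockRules
{
    public static PlayResult Compare(Play first, Play second)
    {
        // How many steps ahead of 'second' is 'first' in the circular list.
        int distance = ((int)first - (int)second + 5) % 5;

        if (distance == 0)
            return PlayResult.Tie;

        // One or two steps ahead means 'first' defeats 'second'.
        return distance <= 2 ? PlayResult.FirstPlayer : PlayResult.SecondPlayer;
    }
}
```

For example, Spock (1) against Scissors (4) gives a distance of (1 - 4 + 5) % 5 = 2, so the first player wins, matching "Spock smashes scissors".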

But I feel the previous code is clearer. Sometimes, we must decide between clear code and clever code.

More TDD katas are coming.

Keep tuned!

Angel “Java” Lopez
http://www.ajlopez.com
http://twitter.com/ajlopez

April 25, 2013

Mass Programming Language (1) Inception

Filed under: .NET, C Sharp, Mass, Open Source Projects, Programming Languages — ajlopez @ 9:09 am

Next Post

Three weeks ago, I was working on the implementation of an interpreted language, written in C#. The new language is called Mass (dedicated to @MArtinSaliaS):

https://github.com/ajlopez/Mass

The current solution has three projects: a class library, its tests, and a console program, mass.exe, that launches Mass programs.

You can run a hello.ms:

mass hello.ms

The classic Hello world source code:

println("Hello, world")

An example with classes and objects:

class Person
	define initialize(firstname, lastname)
		self.firstname = firstname
		self.lastname = lastname
	end
	
	define getName()
		return self.lastname + ", " + self.firstname
	end
end

adam = new Person("Adam", "TheFirst")

println(adam.getName())

An example with access to .NET types and objects:

dirinfo = new System.IO.DirectoryInfo(".")

for fileinfo in dirinfo.GetFiles()
	println(fileinfo.Name)
end

The idea is to have a dynamic language that leverages an underlying language provided with a rich class library and ecosystem, like in AjSharp and others of my projects. Before Mass, I was working on:

- Implementing Python in C# (see PythonSharp)

- Implementing Ruby in C# (see RubySharp)

- AjSharp (see repo and post)

But this time I wanted to implement something with simple syntax and semantics. Indeed, I was playing with “simple” ideas for a compiler over JavaScript, see SimpleScript (1) First Ideas.

Then, with Mass, I deliberately wanted to avoid:

- Multiple commands in the same line (I discarded ‘;’ like in Ruby)

- Syntax based on spaces and indentation (Python discarded)

- Function invocation using only the name; Mass imposes the explicit use of parentheses (Ruby discarded; Mass is like JavaScript)

- Base values and classes (integers, strings, lists, etc…) having a crowd of methods (like Ruby and Python). No, Mass prefers to expose and use the underlying language/class library.

Then, I wanted:

- Functional values, as first-class citizens, like in JavaScript. So, having to put explicit parentheses to invoke a function allows me to use the bare name of the function as a functional value

- Dynamic objects: each object can be extended at any moment, with new instance variables, object functions, a la JavaScript

- Syntax based on lines: each command has its own line. No command separation

- Syntax based on keywords: the end of a command list is marked with ‘end’, no braces

- As far as possible, only one way to do something, instead of the “many ways” motto a la Perl

- Complete keywords: ‘define’ instead of ‘def’

- Single inheritance for classes. But Mass could be expressive without written classes, using native classes from the .NET framework and other libraries. It could be used as a scripting language.

- Explicit setting of variables that are out of the local scope (a topic for next posts)

- Variable scope by file, like in the require of JavaScript/NodeJs/CommonJS

- Module by file, with a require that automatically searches in directories, a la NodeJs/CommonJs. Notably, Mass can consume the node_modules folder, so a Mass module can be published and installed using NPM!

- Package manager, using NPM. You can use package.json to declare the dependencies, and publish new modules at NPM (using ‘mass-’ as the suggested namespace).

In upcoming posts, I will write more details about the implementation, the guiding design ideas, and examples. For now, you can see the code and the test examples at the public repo. And yes, everything was written in baby steps, using TDD.

Keep tuned!

Angel “Java” Lopez

http://www.ajlopez.com

http://twitter.com/ajlopez

February 2, 2013

New Month’s Resolutions: February 2013

Filed under: C Sharp, JavaScript, NodeJs, Open Source Projects — ajlopez @ 3:07 pm

Time to review my January Resolutions:

- Start SimpleScript [complete] see repo
- Start SimpleBoard [complete] see repo
- Start SimpleChess [complete] see repo
- Start SimpleGo [complete] see repo
- Start and publish a version of SimpleMapReduce, with local and distributed sample [partial] see repo (only local version)
- Start and publish a version of SimpleFunc, object with functions serialization [complete] see repo
- Start Memolap, C# in-memory multidimensional OLAP-like library and sample [partial] see repo (sample WIP)
- Start SimpleMemolap, the same but in JavaScript/Node.js [complete] see repo
- Start SimpleRules, forward-chaining rule engine, that compiles to JavaScript [complete] see repo

Additionally, I was working on:

- Update AjConsorSite [complete] see repo
- Update Inmob [complete] see repo
- Start SimpleKeeper, Zookeeper-like server [complete] see repo
- Start and publish first version MultiNodes [complete] see repo
- Update SimpleStorm, publish new version using SimpleQueue 0.0.2 [complete] see repo
- Update SimpleQueue publish new version [complete] see repo
- Update SimpleRemote publish new version using SimpleMessages 0.0.3, and async methods [complete] see repo
- Publish first version AjFabriqNode [complete] see repo
- Start and publish first version NodeDelicious [complete] see repo
- Refactor SimpleBroadcast to use SimpleMessages 0.0.3 [complete] see repo
- Publish new version SimpleMessages 0.0.3 [complete] see repo
- Start and publish first version MProc, middleware layer for message processing in Node.js [complete] see repo
- Start and publish first version SimpleTags, engine to manage items with arbitrary data and tags [complete] see repo
- Start and publish first version ObjectStream, bidirectional and unidirectional object streams for Node.js [complete] see repo
- Start and publish first version SimplePipes, yet another flow library in Node.js [complete] see repo
- Start and publish first version SimpleSudoku, sudoku solver in JavaScript/Node.js [complete] see repo
- Start and publish first version SimplePermissions, permissions by Subject, Role, and Context. Model in-memory [complete] see repo
- Start and publish first version SimpleGlobals, inspired by Mumps Globals [complete] see repo
- Start and publish first version SimpleInvoke, chained invocation of functions with callbacks, JavaScript/NodeJs [complete] see repo

For this new month:

- Start SimpleDatabase
- Update SimplePermissions
- Update SimpleStorm (implements ack, maybe integrate with MultiNodes)
- Update SimpleMapReduce
- Update SimpleKeeper, leader, distributed example
- Update AjFabriqNode, integrate with new SimpleMessages, maybe with MultiNodes
- Update MultiNodes
- Update AjGenesisNode, to use global commands and tasks

I will give a full-day seminar on Node.js in Rosario, Argentina.

Keep tuned!

Angel “Java” Lopez
http://www.ajlopez.com
http://twitter.com/ajlopez

December 26, 2012

AjTalk in C# (3) Environments

Filed under: AjTalk, C Sharp, Open Source Projects, Programming Languages, Smalltalk — ajlopez @ 4:46 pm

Previous Post

Some weeks ago, I added environments to my open source AjTalk Smalltalk Virtual Machine, C# version. What is an environment, in my jargon? It’s a dictionary for named artifacts, like classes. The Smalltalk global is a classical environment. But I want to add support for other named environments, to avoid class name collisions. Classic Smalltalk usually has pool dictionaries, but I want something more dynamic. So I added Environment, see my tests:

https://github.com/ajlopez/AjTalk/blob/master/Src/AjTalk.Tests/AssertTests/EnvironmentTests.st

At first, Smalltalk is the current environment:

"Current environment is Smalltalk"
[Environment current == Smalltalk] assert.

You can create new environments:

env := Environment new: #MyEnvironment.

Automatically, the new environment is registered/added to the current one, in this case, Smalltalk:

"The new environment was defined as global at Smalltalk"

[(Smalltalk at: #MyEnvironment) isNil not] assert.
[(Smalltalk at: #MyEnvironment) == MyEnvironment] assert.
[(Smalltalk at: #MyEnvironment) == env] assert.

[MyEnvironment isNil not] assert.
[MyEnvironment == env] assert.

Every new Environment has an entry to Smalltalk:

"Dotted expression syntax sugar for MyEnvironment at: #Smalltalk"

[MyEnvironment.Smalltalk == Smalltalk] assert.

You can switch to a new environment:

env setCurrent.

"Current environment check"

[Environment current == env] assert.
[Environment current == Smalltalk.MyEnvironment] assert.

And then, and this is the key feature, a class definition defines the new class in the CURRENT environment:

"Define a class at current env environment, no change to syntax"

Object subclass:#MyClass
    instanceVariableNames:''
    classVariableNames:''
    poolDictionaries:''
    category:''
.

[(env at: #MyClass) isNil not] assert.
[(Smalltalk at: #MyClass) isNil] assert.

Orthogonally to environments, I implemented modules: a way to search for and load file-outs, running them in a new environment (similar to Python import or, more or less, to Node.js/CommonJS require). But this is a topic for another post ;-)

Keep tuned!

Angel “Java” Lopez

http://www.ajlopez.com

http://twitter.com/ajlopez
