Angel “Java” Lopez on Blog

September 8, 2010

Writing an Interpreter in .NET (Part 4)

Filed under: .NET, Programming Languages, Test-Driven Development — ajlopez @ 9:47 am

In this post and example, I add a lexer to process text containing the source code of our program. The lexer splits the code into “words”, the tokens of our input.

The code can be downloaded from InterpreterStep04.zip. I refactored the previous version: the Interpreter class library project now has three new folders: Commands, Expressions, and Compiler.

TokenType is an enumeration:

    public enum TokenType
    {
        Name,
        String,
        Integer,
        Operator,
        Separator
    }

Token represents one “word” in our language:

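    public class Token
    {
        // Kind of token (name, integer, operator, ...)
        public TokenType TokenType { get; private set; }

        // Detected value, e.g. the integer value for an integer or the word for a name
        public object Value { get; private set; }

        public Token(TokenType type, object value)
        {
            this.TokenType = type;
            this.Value = value;
        }
    }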

Value holds the detected word: its integer value if it is an integer, the word if it is a name.
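Because Value is typed as object, the same Token class can carry an int or a string. The following is a minimal sketch of that idea, not code from the download: it repeats the TokenType and Token definitions so it compiles on its own, and TokenExamples.Describe is a hypothetical helper added only for illustration.

```csharp
using System;

public enum TokenType { Name, String, Integer, Operator, Separator }

public class Token
{
    public TokenType TokenType { get; private set; }
    public object Value { get; private set; }

    public Token(TokenType type, object value)
    {
        this.TokenType = type;
        this.Value = value;
    }
}

public static class TokenExamples
{
    // Hypothetical helper (not in the post): shows the token type,
    // the value, and the runtime type stored in the object-typed Value.
    public static string Describe(Token token)
    {
        return string.Format("{0}: {1} ({2})",
            token.TokenType, token.Value, token.Value.GetType().Name);
    }
}
```

For example, Describe(new Token(TokenType.Integer, 42)) gives "Integer: 42 (Int32)", while Describe(new Token(TokenType.Name, "one")) gives "Name: one (String)".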

Lexer is in charge of detecting the next token from a text reader or a string:

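        // Reads characters from any TextReader
        public Lexer(TextReader reader)
        {
            this.reader = reader;
        }

        // Convenience constructor: wraps the text in a StringReader
        public Lexer(string text)
            : this(new StringReader(text))
        {
        }

        // Returns the next token, or null at the end of the input
        public Token NextToken()
        {
            int ch;

            // Skip leading whitespace; -1 signals the end of the input
            for (ch = this.NextChar(); ch != -1 && char.IsWhiteSpace((char)ch); ch = this.NextChar())
                ;

            if (ch == -1)
                return null;

            //...
        }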
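The rest of NextToken is elided above. As a rough, self-contained sketch of how such a method could continue (this is not the code in InterpreterStep04.zip), the class below recognizes names, the separator “;” and the operator “=”. SketchLexer, Separators and Operators are names invented for illustration, and the sketch reads from the TextReader directly instead of using the real Lexer's NextChar method, which is not shown in this post.

```csharp
using System.IO;
using System.Text;

public enum TokenType { Name, String, Integer, Operator, Separator }

public class Token
{
    public TokenType TokenType { get; private set; }
    public object Value { get; private set; }

    public Token(TokenType type, object value)
    {
        this.TokenType = type;
        this.Value = value;
    }
}

public class SketchLexer
{
    // Assumed character sets; the post only mentions ";" and "=".
    private const string Separators = ";";
    private const string Operators = "=";

    private readonly TextReader reader;

    public SketchLexer(string text)
    {
        this.reader = new StringReader(text);
    }

    public Token NextToken()
    {
        // Skip leading whitespace; Read() returns -1 at end of input.
        int ch = this.reader.Read();
        while (ch != -1 && char.IsWhiteSpace((char)ch))
            ch = this.reader.Read();

        if (ch == -1)
            return null;

        char first = (char)ch;

        if (Separators.IndexOf(first) >= 0)
            return new Token(TokenType.Separator, first.ToString());

        if (Operators.IndexOf(first) >= 0)
            return new Token(TokenType.Operator, first.ToString());

        // Simplification: anything else starts a name. Consume letters and
        // digits, using Peek() so the character after the name is not lost.
        StringBuilder name = new StringBuilder();
        name.Append(first);
        while (this.reader.Peek() != -1 && char.IsLetterOrDigit((char)this.reader.Peek()))
            name.Append((char)this.reader.Read());

        return new Token(TokenType.Name, name.ToString());
    }
}
```

With this sketch, new SketchLexer("one;") yields a Name token with value "one", then a Separator token with value ";", and then null. Peeking before reading matters here: without it, the “;” right after the name would be consumed and lost.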

Lexer was written in “baby-steps”, following the evolution of tests (I’m using TDD in this series). Tests like:

        [TestMethod]
        public void ProcessNameWithWhitespaces()
        {
            Lexer lexer = new Lexer("  one  ");
            Token token = lexer.NextToken();
            Assert.IsNotNull(token);
            Assert.AreEqual(TokenType.Name, token.TokenType);
            Assert.AreEqual("one", token.Value);
            Assert.IsNull(lexer.NextToken());
        }

        [TestMethod]
        public void ProcessTwoNames()
        {
            Lexer lexer = new Lexer("one two");
            Token token = lexer.NextToken();
            Assert.IsNotNull(token);
            Assert.AreEqual(TokenType.Name, token.TokenType);
            Assert.AreEqual("one", token.Value);
            token = lexer.NextToken();
            Assert.IsNotNull(token);
            Assert.AreEqual(TokenType.Name, token.TokenType);
            Assert.AreEqual("two", token.Value);
            Assert.IsNull(lexer.NextToken());
        }

        [TestMethod]
        public void ProcessNameAndSeparator()
        {
            Lexer lexer = new Lexer("one;");
            Token token = lexer.NextToken();
            Assert.IsNotNull(token);
            Assert.AreEqual(TokenType.Name, token.TokenType);
            Assert.AreEqual("one", token.Value);
            token = lexer.NextToken();
            Assert.IsNotNull(token);
            Assert.AreEqual(TokenType.Separator, token.TokenType);
            Assert.AreEqual(";", token.Value);
            Assert.IsNull(lexer.NextToken());
        }

The current version of the lexer only recognizes names, some separators like “;”, and operators like “=”. It doesn’t process delimited strings, real numbers, or other separators and operators. My idea is to add more capabilities as I write the tests for them.
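To give an idea of what one of those future capabilities might involve, here is a small sketch of reading a delimited string from a TextReader. It is only an illustration under my own assumptions: StringScanner and ReadDelimitedString are hypothetical names, the delimiter is assumed to be a double quote, the opening quote is assumed to have been consumed already, and escape sequences are not handled.

```csharp
using System;
using System.IO;
using System.Text;

public static class StringScanner
{
    // Hypothetical helper: reads characters up to the closing double quote.
    // Assumes the caller has already consumed the opening quote.
    public static string ReadDelimitedString(TextReader reader)
    {
        StringBuilder text = new StringBuilder();
        int ch;

        // Read() returns -1 at end of input.
        for (ch = reader.Read(); ch != -1 && ch != '"'; ch = reader.Read())
            text.Append((char)ch);

        // Reaching the end without a closing quote is reported as an error.
        if (ch == -1)
            throw new InvalidOperationException("Unclosed string");

        return text.ToString();
    }
}
```

A lexer could call such a helper when it sees an opening quote and wrap the result in a Token of type TokenType.String.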

All tests are green, with good code coverage.

Next steps: write an expression parser, then add command parsing, new commands like if and for, and function definitions. I could also add compilation to native .NET code, using Reflection.Emit or CodeDom.

Stay tuned!

Angel “Java” Lopez

http://www.ajlopez.com

http://twitter.com/ajlopez
