Sunday, January 14, 2007

[Tech]EAExpression - building a new DSL

These days everybody is talking about DSLs (Domain Specific Languages) and how they can help solve problems in a specific domain better and more easily than general-purpose languages.

Senactive is developing a Sense and Respond System (InTime) which uses runtime objects, so-called events, to send information from one component to another. For more information on the system see our website. The problem we had was how to describe criteria such as filters or rules on runtime objects at design time, in a form that users can write easily and that the runtime can evaluate. Our first approach was XPath, because C# has a very flexible XPath engine that lets you implement the navigator on your own objects (maybe I will post about this another time, or Rupert will, as he did it :-)). We implemented the navigator some time ago, and the events can be navigated using XPath expressions. The power the language gave us was really great (functions, addressing ...), but the tradeoff was that the language is not really intuitive for business users.

So we decided to implement a new language called the EAExpression (Event Access Expression) Language. It should be easier for business users to understand, but should still offer power similar to what XPath gave us. After some brainstorming we came up with the following key points our DSL must provide:
  • addressing events - this includes the events themselves and their attributes, which can be primitive types, collections, dictionaries, other events ...
    we use a "." notation for this, e.g. Event1.Attr1.Attr2 addresses the attribute "Attr1" of Event1, which in this case is again an event whose attribute "Attr2" is evaluated. For collections we use the syntax Attr1[1] and for dictionaries Attr1["Test"]
  • constant values (e.g. strings "Test", integer 12, float 12.5f, boolean true|false, ...)
  • calculation - at least we need +, -, *, / and % (modulo)
  • boolean expressions AND, OR, XOR, NOT
  • comparison expressions such as = and <, chained if possible
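The addressing rules from the first bullet can be illustrated with a small sketch. This is not the EAExpression implementation: SketchEvent, PathSketch and Resolve are hypothetical names, the path is passed as already-split steps instead of being parsed from text, and collection indices follow .NET's zero-based convention, which is an assumption about how Attr1[1] is meant.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for an event: a named bag of attributes.
public class SketchEvent
{
    private readonly Dictionary<string, object> attributes =
        new Dictionary<string, object>();

    public object this[string name]
    {
        get { return attributes[name]; }
        set { attributes[name] = value; }
    }
}

public static class PathSketch
{
    // Walks the steps one by one. A string step names an event attribute
    // or a dictionary key; an int step is a collection index (zero-based
    // here, which is an assumption of this sketch).
    public static object Resolve(object root, params object[] steps)
    {
        object current = root;
        foreach (object step in steps)
        {
            if (current is SketchEvent && step is string)
                current = ((SketchEvent)current)[(string)step];
            else if (current is IDictionary)
                current = ((IDictionary)current)[step];
            else if (current is IList && step is int)
                current = ((IList)current)[(int)step];
            else
                throw new InvalidOperationException(
                    "Cannot apply step '" + step + "'");
        }
        return current;
    }
}
```

With this sketch, Event1.Attr1.Attr2 corresponds to PathSketch.Resolve(event1, "Attr1", "Attr2"), Attr1[1] to PathSketch.Resolve(event1, "Attr1", 1), and Attr1["Test"] to PathSketch.Resolve(event1, "Attr1", "Test").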
Once we knew what we wanted to support, we looked for a flexible and easy-to-use lexer and parser generator and came up with ANTLR. It is a very powerful tool with a great amount of helpful documentation, which you need when you develop a language for the first time. ANTLR is a Java tool that can generate the code for the Lexer, Parser and TreeParser in Java, C#, C++ and Python.

Starting to play with it felt like sitting in a compiler course at the university :-) . I really didn't believe I would need this stuff ever again - I always said "compilers are just for geeks" - but I changed my mind as I got deeper into it. It took us about one and a half weeks to learn what we needed and to build the lexer and parser. If I had to do it again, I would estimate a maximum of two days for a comparable grammar.

After the expression is parsed, you have an AST (Abstract Syntax Tree) of your language code, which can be easily navigated with ANTLR by implementing a TreeParser. The TreeParser can then be used to generate the code that is executed at runtime. This was the hardest part of building our language, because the C# API has no support for evaluating numeric operations on the different datatypes in a generic way. We looked for an expression library that could help us, but in the end we had to do it on our own. We used wrapper classes for the datatypes in order to evaluate the calculations in a type-safe way; another solution would be to use reflection.
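To give an idea of what the wrapper approach can look like, here is a minimal sketch for addition only. NumericValue, IntValue and DoubleValue are hypothetical names, and the promotion rule (int + int stays int, anything else becomes double) is an assumption of this sketch, not necessarily how EAExpression handles it.

```csharp
using System;

// Hypothetical wrapper hierarchy for numeric operands.
public abstract class NumericValue
{
    // Every wrapper can widen itself to double for mixed-type operations.
    public abstract double AsDouble();

    // Adds two wrapped values without reflection: two ints stay an int,
    // every other combination is promoted to double.
    public static NumericValue Add(NumericValue left, NumericValue right)
    {
        IntValue l = left as IntValue;
        IntValue r = right as IntValue;
        if (l != null && r != null)
            return new IntValue(l.Value + r.Value);
        return new DoubleValue(left.AsDouble() + right.AsDouble());
    }
}

public sealed class IntValue : NumericValue
{
    public readonly int Value;
    public IntValue(int value) { Value = value; }
    public override double AsDouble() { return Value; }
}

public sealed class DoubleValue : NumericValue
{
    public readonly double Value;
    public DoubleValue(double value) { Value = value; }
    public override double AsDouble() { return Value; }
}
```

A tree walker could then call NumericValue.Add on the results of the two child nodes of a "+" node without knowing their concrete types. The remaining operators would follow the same pattern.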

In the end we built a wrapper class for the whole language, called EAExpression, similar to the XPathExpression object in C#.
Now you can do things like this:



----Code----
Event ev1 = new Event();
ev1["Attr1"] = 12;
ev1["Attr2"] = 15;
ev1["Attr3"] = 1;

// Compile the expression, then evaluate it against the event.
// 12 < 15 + 1, so val should be true.
EAExpression expr = EAExpression.Compile("Attr1 < Attr2 + Attr3");
bool val = (bool)expr.Evaluate(ev1);
----Code----

This is just a very simple example; the first 4 lines show how an event is created.

We have learned several things while building the language:
  • don't be afraid to create a language for a specific purpose - sometimes it is really useful to do so
  • it was much less work than I expected, because there are several tools out there that can help you
  • the language you build should be as simple and easy as possible. Don't try to do fancy stuff or allow several ways to do the same thing. This will be more confusing than helpful to users.
  • in the end we needed to introduce predefined functions, e.g. Now() for DateTime.Now in C#, which was a little bit tricky
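One possible way to expose predefined functions such as Now() is a lookup table from function name to delegate. The following is only a sketch under that assumption; FunctionTableSketch and Call are hypothetical names and say nothing about how EAExpression actually solves it.

```csharp
using System;
using System.Collections.Generic;

public static class FunctionTableSketch
{
    // Maps a function name used in an expression to the code behind it.
    private static readonly Dictionary<string, Func<object[], object>> functions =
        new Dictionary<string, Func<object[], object>>
        {
            // Now() takes no arguments and maps to DateTime.Now.
            { "Now", args => DateTime.Now },
        };

    // Looks up the function by name and invokes it with the given arguments.
    public static object Call(string name, params object[] args)
    {
        Func<object[], object> function;
        if (!functions.TryGetValue(name, out function))
            throw new ArgumentException("Unknown function: " + name);
        return function(args);
    }
}
```

An evaluator that hits a function-call node could pass the function name and the evaluated arguments to Call; adding a new function then only means adding one entry to the table.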
Things we don't have yet:
  • Autocompletion and syntax highlighting for user input within the GUI - I will post as soon as we have it and describe how we solved it, because I think it is a very important part of a language for business folks.

2 comments:

Alexander Schatten said...

Gerd, interesting article!

However, as a non-expert in that field, I wonder how your approach compares to technologies like Open Architectureware, where you can define domain specific languages derived from meta-models and the framework supports you in building either graphical editors or text editors for DSLs, including syntax highlighting, ...

The second technology I think of are "object query languages" like GPath in Groovy. Would this not do more or less what you are searching for?

Gerd Saurer said...

Open Architectureware is a powerful tool in the Java world. It integrates very nicely into Eclipse and gives you the possibility to design whole languages, including the editor for Eclipse. It was not our intention to build a language that can be used within the development environment; rather, we wanted to build an evaluation and expression language for our clients. Also, it is not available for C#, which is the language the product is implemented in. There is a similar possibility for C# - the DSL Extension Tools for VS. I considered them, but we wanted to build a more lightweight solution, especially because we didn't need the integration within VS.

GPath is a subset of the language we created. It uses nearly the same syntax as we do for addressing objects, but lacks support for boolean expressions, comparison expressions ... I haven't found anything equivalent in the C# area, which is nothing new: the community is much smaller there and it's sometimes hard to find libraries that support your ideas :-).