Internal DSL Becomes External DSL

Written By John Sonmez

Lately I have been working on creating a language for automated testing to allow for even easier syntax than that of the internal DSL which I had previously created.

I have been thinking a lot lately about internal DSLs vs external DSLs.  It started to feel a little strange to me to twist a language like Java to look like I wanted it to look.

Let me give you an example:

Yes, the syntax is fluent-like, but it is using static methods, modifying enumerations, using parameter lists, and a bit of a strange naming convention, all with the goal of making it more readable and easier to write from an IDE (autocomplete-friendly).
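To make those trade-offs concrete, here is a minimal sketch of an internal DSL in that style. It is not the code from my framework: every name in it (`InternalDsl`, `Enter`, `Into`, `Click`, `Field`, `Button`) is made up for illustration, and the steps are only recorded in a list rather than driving a browser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical enumerations naming the things a test can touch.
enum Field { USERNAME, PASSWORD }
enum Button { LOGIN }

// Hypothetical internal DSL; illustration only, not a real framework's API.
final class InternalDsl {
    // Recorded steps stand in for real browser automation.
    static final List<String> steps = new ArrayList<>();

    // Capitalized static method names break Java convention so the line reads like a sentence.
    static TextEntry Enter(String... values) {
        return new TextEntry(values);
    }

    static void Click(Button button) {
        steps.add("click " + button);
    }

    static final class TextEntry {
        private final String[] values;

        TextEntry(String[] values) {
            this.values = values;
        }

        // Pairs each value with the field at the same position in the parameter list.
        void Into(Field... fields) {
            if (fields.length != values.length) {
                throw new IllegalArgumentException("Expected one field per value");
            }
            for (int i = 0; i < values.length; i++) {
                steps.add("enter " + values[i] + " into " + fields[i]);
            }
        }
    }

    // What a test line ends up looking like.
    static void loginExample() {
        Enter("bob", "secret").Into(Field.USERNAME, Field.PASSWORD);
        Click(Button.LOGIN);
    }
}
```

The test lines read reasonably well, but only because of the static methods, the varargs pairing, and the odd capitalization that the host language forces on the design.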

What I really want the line to say is something more like:

The second syntax is much easier to read. Someone can write it without knowing any Java.  Someone can probably even write tests without any training at all, just by looking at other tests.  If I were to create a small executable that could run the scripts, someone could write and run tests without having to set up an IDE or worry about jars, DLLs, or even JUnit.

Taking it to the next level…

So I decided to give it a shot and see how hard it would be to create an external DSL from my internal DSL.  Part of what kept me leaning back on the internal DSL vs the external DSL is that I thought writing a language would be hard.

Turns out it is not.

Enter ANTLR, enter IBESScript.

In researching, I found a really good tool called ANTLR, which is designed to create a lexer and parser for a custom language defined in EBNF notation.  That was a mouthful, so what does it mean?  Basically this:  ANTLR lets you specify the syntax of a language, then generates Java or C# code that parses text written in that syntax, breaks it up into the individual language constructs, and executes the actions you attach to each construct.  So really, really, what does that mean?  OK, let me give you an example:

I tell ANTLR a little bit about the syntax of:

I give it some instructions about how to break apart the pieces of the new language and what Java code should be generated to perform the language constructs in that language.

ANTLR produces for me some Java code to read in my new language and creates code that looks like:

Pretty cool and pretty easy to do actually.  Especially for a simple DSL that is mostly just commands.
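To give a feel for what the generated code has to do, here is a hand-written sketch of a tiny interpreter for a command-style script. This is not ANTLR output and not IBESScript syntax: the commands (`goto`, `enter ... into ...`, `click`) and the `ScriptRunner` class are assumptions made up for illustration, and the "actions" are just strings collected in a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-rolled lexer and interpreter; illustration only.
final class ScriptRunner {

    // Lexer step: split a line into words and double-quoted strings (quotes removed).
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '"') {
                int end = line.indexOf('"', i + 1);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated string: " + line);
                }
                tokens.add(line.substring(i + 1, end));
                i = end + 1;
            } else {
                int end = i;
                while (end < line.length() && !Character.isWhitespace(line.charAt(end))) {
                    end++;
                }
                tokens.add(line.substring(i, end));
                i = end;
            }
        }
        return tokens;
    }

    // Parser step: match each line against a known command shape and emit an action.
    static List<String> run(String script) {
        List<String> actions = new ArrayList<>();
        for (String line : script.split("\n")) {
            List<String> t = tokenize(line);
            if (t.isEmpty()) {
                continue; // skip blank lines
            }
            String command = t.get(0);
            if (command.equals("goto") && t.size() == 2) {
                actions.add("navigate:" + t.get(1));
            } else if (command.equals("enter") && t.size() == 4 && t.get(2).equals("into")) {
                actions.add("type:" + t.get(3) + "=" + t.get(1));
            } else if (command.equals("click") && t.size() == 2) {
                actions.add("click:" + t.get(1));
            } else {
                throw new IllegalArgumentException("Cannot parse line: " + line);
            }
        }
        return actions;
    }
}
```

With a grammar tool, the tokenizing and matching above come from the grammar file instead of hand-written loops; only the per-command actions are written by hand.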

The entry barrier from internal DSL to external DSL has just been obliterated.  I may have to change my mind on internal DSLs.

Put down the pitchforks

So, am I saying internal DSLs are now bad and external DSLs are now good?  No, far from it (although I may change my mind sometime in the future).  All I am really saying here is that I thought creating an external DSL was going to be much harder than it really is. Now that I am finding out how easy it is, I am starting to see it as a real alternative to an internal DSL.

I was wondering about some of the tools we use that are built on internal DSLs.  This post prompted me to think about it more deeply, which prompted this post, which ultimately led me here.  Thanks, Richard Cirerol, for making me think about this.

Perhaps we should consider replacing some of our internal DSLs with external DSLs, especially in cases where the internal DSL code looks like a real bastardization of the normal source code of the host language.  At that point, aren't we just trying to force a bizarre syntax into a box it doesn't fit in?  Yes, yes, MSpec, I am talking about you.

What do you think?  Has anyone created an external DSL out there?