On the last day of the PDC, Chris Anderson and Giovanni Della-Libera showed us how to build textual DSLs with the Oslo modeling language. I wrote a blog post about this session, and in that post you can find this image:

image

Normally you would use the MG.exe compiler or the MG Build Task to compile your languages. But as the slide above indicates, the MGrammar team provided us with an in-memory version of the MG build task. So let’s try this out!

For this exercise I reuse the language and instance data I created in my blog post: My First Language in MGrammar

Module

First, I created the Module class. This class holds the MGrammar source of the textual DSL. It also contains the name of the parser you want to create (more on that later in this post) and the bytes of the compiled DSL:

    public class Module
    {
        public string MGrammar { get; set; }
        public string ParserName { get; set; }
        public byte[] Bytes { get; set; }
    }

 

DynamicMGrammarCompiler

The dynamic compiler class I created can be used in two scenarios:

  1. You compile the module, cache it in your host application and reuse this module when parsing data through your DSL. Compiling a module is an expensive operation; it burns quite some CPU cycles, so reusing it seems like a good idea.
  2. You compile the module and parse your data through your DSL every single call.

The above scenarios result in three methods: CompileModule, ParseData and CompileModuleAndParseData:

    using System;
    using System.IO;
    using System.Dataflow;      // DynamicParser
    using Microsoft.M.Grammar;  // MGrammarCompiler

    public class DynamicMGrammarCompiler
    {
        /// <summary>
        /// Compile the module and store the compiled image in Module.Bytes,
        /// so the client application can cache it.
        /// </summary>
        public Module CompileModule(Module module)
        {
            MGrammarCompiler compiler = new MGrammarCompiler();
            ErrorReporter reporter = ErrorReporter.Standard;
            string tempFile = Path.GetTempFileName();
            // The compiled image ends up next to the temp file, with an .mgx extension
            string mgxPath = Path.Combine(Path.GetDirectoryName(tempFile), Path.GetFileNameWithoutExtension(tempFile) + ".mgx");
            try
            {
                using (TextReader reader = new StringReader(module.MGrammar))
                {
                    SourceItem item = new SourceItem(module.ParserName, reader);
                    item.ContentType = GContentType.Mg;
                    compiler.SourceItems = new SourceItem[] { item };
                    compiler.Target = Target.Mgx;
                    compiler.TypeCheckActions = true;
                    compiler.OutFile = tempFile;
                    compiler.Execute(reporter);
                }
                // ReadAllBytes reads the complete file; a single Stream.Read call may return fewer bytes
                module.Bytes = File.ReadAllBytes(mgxPath);
            }
            finally
            {
                // Do not leave temporary files behind
                File.Delete(tempFile);
                File.Delete(mgxPath);
            }
            return module;
        }

        /// <summary>
        /// Parse the instance data through the compiled image within the module.
        /// </summary>
        public object ParseData(Module module, string instanceData)
        {
            using (MemoryStream ms = new MemoryStream(module.Bytes))
            using (StringReader sr = new StringReader(instanceData))
            {
                DynamicParser parser = MGrammarCompiler.LoadParserFromMgx(ms, module.ParserName);
                if (null == parser)
                {
                    throw new InvalidOperationException(string.Format("Language with name '{0}' not found in MGrammar image!", module.ParserName));
                }
                return parser.ParseObject(sr, ErrorReporter.Standard);
            }
        }

        /// <summary>
        /// Both compile and parse.
        /// </summary>
        public object CompileModuleAndParseData(Module module, string instanceData)
        {
            module = CompileModule(module);
            return ParseData(module, instanceData);
        }
    }

CompileModule creates an MGrammarCompiler instance and loads the MGrammar we want to compile into a TextReader. Then I create a SourceItem instance, which tells the MGrammarCompiler what type of source to expect and provides the source (the textual DSL) itself.
Next, I tell the compiler that the target of the compilation should be an MGX file. I also provide a temporary file name to write the result to, and then call Execute!
The last step is opening the created file, reading its bytes and returning them within the module instance.
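
Reading the compiled file back is the step where a partial read or a leftover temp file is easiest to miss. Here is a minimal sketch of that step using only the base class library; TempImageReader and ReadAndDelete are hypothetical names of my own, not part of the Oslo SDK:

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of the Oslo SDK.
public static class TempImageReader
{
    // Reads the whole file into memory and then removes it from disk.
    public static byte[] ReadAndDelete(string path)
    {
        try
        {
            // File.ReadAllBytes keeps reading until the end of the file,
            // unlike a single Stream.Read call, which may return fewer bytes.
            return File.ReadAllBytes(path);
        }
        finally
        {
            // File.Delete does not throw when the file no longer exists.
            File.Delete(path);
        }
    }
}
```

The try/finally makes sure the temporary file is removed even when reading fails.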

ParseData receives a module and some input instance data. I first read the instance data and create a stream of the module bytes. Then I create a DynamicParser using the stream and the provided ParserName. The important thing to know here is that the parser name must be the module name and the language name joined with a dot. As I’m using my RssLanguage from the inwit module, my parser name should be “inwit.RssLanguage”.
If a correct parser is created, we use it to parse our instance data and create a graph. I just return this graph to the caller.

CompileModuleAndParseData just combines the above two steps in one call.
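
For the first scenario (compile once, reuse many times), the host application needs somewhere to keep the compiled bytes. The following is only a sketch of one way to do that: CompiledImageCache and GetOrCompile are hypothetical names, and the compile step is passed in as a delegate so the sketch does not depend on the Oslo assemblies. It assumes the grammar text for a given parser name does not change while the application runs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache for illustration; it keeps one compiled image per parser name.
public class CompiledImageCache
{
    private readonly Dictionary<string, byte[]> images = new Dictionary<string, byte[]>();
    private readonly object gate = new object();

    // Returns the cached image for parserName and calls compile only on the first request.
    public byte[] GetOrCompile(string parserName, Func<byte[]> compile)
    {
        lock (gate)
        {
            byte[] image;
            if (!images.TryGetValue(parserName, out image))
            {
                image = compile(); // the expensive step, done once per parser name
                images[parserName] = image;
            }
            return image;
        }
    }
}
```

With the classes from this post, a call could look like cache.GetOrCompile(module.ParserName, () => compiler.CompileModule(module).Bytes).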

Test Application

As a test application I created a console app:

    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine(">>Initializing...");
            string filePathMGrammar = @"C:\Users\Robert Jan\Desktop\My Documents\Oslo\MyOslo\RssLanguage.mg";
            string filePathInstanceData = @"C:\Users\Robert Jan\Desktop\My Documents\Oslo\MyOslo\FeedsInput.m";

            string mGrammar = File.ReadAllText(filePathMGrammar);
            string instanceData = File.ReadAllText(filePathInstanceData);

            inwit.Module module = new inwit.Module();
            module.MGrammar = mGrammar;
            module.ParserName = "inwit.RssLanguage";

            inwit.DynamicMGrammarCompiler compiler = new inwit.DynamicMGrammarCompiler();

            //first compile and cache module
            Console.WriteLine(">>Step 1: compiling module");
            module = compiler.CompileModule(module);

            //second reuse module, and just parse instancedata
            Console.WriteLine(">>Step 2a: parsing instance data");
            object graph = compiler.ParseData(module, instanceData);
            Console.WriteLine(">>Step 2b: Writing result:");
            inwit.Helper.WalkMGraphTree(graph);

            //third, do all in one call
            Console.WriteLine(">>Step 3a: compiling module and parsing data in one call...");
            graph = compiler.CompileModuleAndParseData(module, instanceData);
            Console.WriteLine(">>Step 3b: Writing result:");
            inwit.Helper.WalkMGraphTree(graph);

            Console.ReadLine();
        }
    }

I’m using the Helper.WalkMGraphTree method from my previous post again to output the result graph.

The result of the console app looks like:

image


Posted in: Oslo , MGrammar  Tags:

Robert Jan - Wednesday, November 05, 2008 - 9:23 AM

Update: Something went wrong with the code snippets; should be shown correctly now!

Now that we have MGrammar mode running correctly in Intellipad, let’s try out some stuff.

Let’s create a language that understands textual representation of the title, location, URL and email address of an RSS Feed. I also want the language to skip whitespace and comments, and the email address should be validated.

So as sample instance data I wrote this:


 

    Title: inwit.nl
    Url: http://inwit.nl
    RssFeedUrl: http://feeds.feedburner.com/inwitnl
    Email: rj@vanholland.net

    //this is comment
    /*
    this is also comment

    */

    Title: IntellipadBlog
    Url: http://blogs.msdn.com/intellipad
    RssFeedUrl: http://blogs.msdn.com/intellipad/rss.xml
    Email: oslo@microsoft.com

As you can see, it’s just two instances of a Feed type with some comments and whitespace in there.
Now, let’s write a language that swallows this data. In fact, I created three languages:

  • A common language with some stuff you’d want to use more often; perhaps the Email Language should be moved here also.
  • The language that understands an Email Address
  • The actual RSS language

The Common Language
This language should cover whitespace and comments. After looking at the demo given at the PDC and browsing the “C:\Program Files\Microsoft Oslo SDK 1.0\Samples\MGrammar\Languages” directory of the SDK installation, I came up with this:

    language InwitCommon
    {
        token Skippable = Whitespace | Comment;

        token Comment = CommentToken;
        token CommentToken
            = CommentDelimited
            | CommentLine;
        token CommentDelimited = "/*" CommentDelimitedContent* "*/";
        token CommentDelimitedContent
            = ^('*')
            | '*' ^('/');
        token CommentLine = "//" CommentLineContent*;
        token CommentLineContent = ^(
              '\u000A' // New Line
            | '\u000D' // Carriage Return
            | '\u0085' // Next Line
            | '\u2028' // Line Separator
            | '\u2029'); // Paragraph Separator

        token Whitespace = WhitespaceToken+;
        token WhitespaceToken = WhitespaceCharacter+;
        token WhitespaceCharacter
            = '\u0009' // Horizontal Tab
            | '\u000B' // Vertical Tab
            | '\u000C' // Form Feed
            | '\u0020' // Space
            | NewLineCharacter;

        token NewLineCharacter
            = '\u000A' // New Line
            | '\u000D' // Carriage Return
            | '\u0085' // Next Line
            | '\u2028' // Line Separator
            | '\u2029'; // Paragraph Separator
    }

Now, in your own language you can use the Skippable token from this language as an interleave; this lets your language skip whitespace and comments.

The Email Language
I wanted to have some sort of Email address validation within my language. So I came up with this:

    language EmailAddressLanguage
    {
        token EmailAddress =
            localpart
            at
            domainpart;

        token abzABZ = ('A'..'Z' | 'a'..'z')+;
        token digits = ('0'..'9')+;
        token otherChars = ('!' | '#' | '$' | '%' | '&' | "'" | '*' | '+' | '-' | '/' | '=' | '?' | '^' | '_' | '`' | '{' | '|' | '}' | '~')+;
        token allButDot = (abzABZ | digits | otherChars)+;
        token all = (allButDot | dot)+;
        token dot = ('.')#1;

        token localpart =
            (allButDot)+ |
            allButDot dot all* allButDot+;

        token at = "@";

        token domainpart =
            (allButDot)+ dot all* allButDot+;
    }

It’s far from perfect! It validates email addresses, but in some cases it doesn’t work correctly yet: an address like “bla..bla@hotmail..com” will still validate, because the ‘all’ token accepts consecutive dots.
I haven’t looked much deeper into it yet, because this was just a small test, but if someone feels like improving this part, please do so and post a comment with your solution!
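
To pin down the rule the grammar is still missing, here is a small sketch outside MGrammar, written in C# with a regular expression. It accepts dot-separated runs of non-dot characters on both sides of the “@” and requires at least one dot in the domain part, so two dots in a row are rejected. The character set is copied from the otherChars token; EmailShape and IsValid are hypothetical names, and this is not a complete RFC validator.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; it mirrors the intent of EmailAddressLanguage.
public static class EmailShape
{
    // One or more characters that are allowed anywhere except as a dot.
    private const string Atom = @"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]+";

    // atom(.atom)* @ atom(.atom)+ : every dot must sit between two atoms.
    private static readonly Regex Pattern = new Regex(
        "^" + Atom + @"(\." + Atom + ")*@" + Atom + @"(\." + Atom + @")+\z");

    public static bool IsValid(string candidate)
    {
        return candidate != null && Pattern.IsMatch(candidate);
    }
}
```

With this pattern, EmailShape.IsValid("rj@vanholland.net") returns true and EmailShape.IsValid("bla..bla@hotmail..com") returns false. The same “atom, then dot plus atom, repeated” shape is what the localpart and domainpart tokens would need to express.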

The RSS Language
Then, I wrote the RSS language itself, which looks like this:

    language RssLanguage
    {
        syntax Main = f:Feeds => f;

        syntax Feeds = Feed*;

        syntax Feed =
            "Title" ":" t:Title
            "Url" ":" u:Url
            "RssFeedUrl" ":" r:RssFeedUrl
            "Email" ":" e:EmailAddressLanguage.EmailAddress
            =>
            Feed{
                Title{t},
                Url{u},
                RSS{r},
                Email{e}
            };

        @{Classification["Keyword"]} token Title = ('A'..'Z' | 'a'..'z' | '.')+;

        token Url = "http://" ('A'..'Z' | 'a'..'z' | '.' | '/')+;

        token RssFeedUrl = Url;

        interleave WhiteSpacing = " " | "\r" | "\n";
        interleave Skippable = InwitCommon.Skippable;
    }

It defines Main as a sequence called ‘Feeds’, which contains items of the type Feed. An input Feed consists of a Title, Url, RssFeedUrl and Email, and is projected to a Feed node with Title, Url, RSS and Email elements.
You can see that I use the EmailAddressLanguage and the InwitCommon language within this language.
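
To make the resulting shape concrete, here is a rough hand-written C# equivalent of what the language does with the sample input. It is only a sketch for comparison: Feed and FeedTextReader are hypothetical names, the real parser returns an untyped graph rather than typed objects, and unlike the grammar this reader does not enforce the field order or validate the URL and email formats.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical type for illustration; it mirrors the Feed{Title, Url, RSS, Email} projection.
public class Feed
{
    public string Title;
    public string Url;
    public string Rss;
    public string Email;
}

public static class FeedTextReader
{
    public static List<Feed> Read(string text)
    {
        // Remove /* ... */ blocks, then lines that start with //.
        // A // inside a value such as http://inwit.nl is left alone.
        text = Regex.Replace(text, @"/\*.*?\*/", "", RegexOptions.Singleline);
        text = Regex.Replace(text, @"^[ \t]*//.*$", "", RegexOptions.Multiline);

        List<Feed> feeds = new List<Feed>();
        Feed current = null;
        foreach (string raw in text.Split('\n'))
        {
            string line = raw.Trim();
            if (line.Length == 0) continue; // whitespace is skipped

            int colon = line.IndexOf(':');
            if (colon < 0) throw new FormatException("Expected 'Key: value': " + line);
            string key = line.Substring(0, colon).Trim();
            string value = line.Substring(colon + 1).Trim();

            if (key == "Title")
            {
                current = new Feed { Title = value }; // a Title starts a new Feed
                feeds.Add(current);
            }
            else if (current == null) throw new FormatException("A feed must start with Title: " + line);
            else if (key == "Url") current.Url = value;
            else if (key == "RssFeedUrl") current.Rss = value;
            else if (key == "Email") current.Email = value;
            else throw new FormatException("Unknown key: " + key);
        }
        return feeds;
    }
}
```

Even for a format this small, the hand-written reader needs its own comment stripping and error handling, which is exactly the work the interleave and syntax rules take over.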

Full Listing
To simplify, here is the full listing in one module:

    module inwit
    {
        language RssLanguage
        {
            syntax Main = f:Feeds => f;

            syntax Feeds = Feed*;

            syntax Feed =
                "Title" ":" t:Title
                "Url" ":" u:Url
                "RssFeedUrl" ":" r:RssFeedUrl
                "Email" ":" e:EmailAddressLanguage.EmailAddress
                =>
                Feed{
                    Title{t},
                    Url{u},
                    RSS{r},
                    Email{e}
                };

            @{Classification["Keyword"]} token Title = ('A'..'Z' | 'a'..'z' | '.')+;

            token Url = "http://" ('A'..'Z' | 'a'..'z' | '.' | '/')+;

            token RssFeedUrl = Url;

            interleave WhiteSpacing = " " | "\r" | "\n";
            interleave Skippable = InwitCommon.Skippable;
        }

        language EmailAddressLanguage
        {
            token EmailAddress =
                localpart
                at
                domainpart;

            token abzABZ = ('A'..'Z' | 'a'..'z')+;
            token digits = ('0'..'9')+;
            token otherChars = ('!' | '#' | '$' | '%' | '&' | "'" | '*' | '+' | '-' | '/' | '=' | '?' | '^' | '_' | '`' | '{' | '|' | '}' | '~')+;
            token allButDot = (abzABZ | digits | otherChars)+;
            token all = (allButDot | dot)+;
            token dot = ('.')#1;

            token localpart =
                (allButDot)+ |
                allButDot dot all* allButDot+;

            token at = "@";

            token domainpart =
                (allButDot)+ dot all* allButDot+;
        }

        language InwitCommon
        {
            token Skippable = Whitespace | Comment;

            token Comment = CommentToken;
            token CommentToken
                = CommentDelimited
                | CommentLine;
            token CommentDelimited = "/*" CommentDelimitedContent* "*/";
            token CommentDelimitedContent
                = ^('*')
                | '*' ^('/');
            token CommentLine = "//" CommentLineContent*;
            token CommentLineContent = ^(
                  '\u000A' // New Line
                | '\u000D' // Carriage Return
                | '\u0085' // Next Line
                | '\u2028' // Line Separator
                | '\u2029'); // Paragraph Separator

            token Whitespace = WhitespaceToken+;
            token WhitespaceToken = WhitespaceCharacter+;
            token WhitespaceCharacter
                = '\u0009' // Horizontal Tab
                | '\u000B' // Vertical Tab
                | '\u000C' // Form Feed
                | '\u0020' // Space
                | NewLineCharacter;

            token NewLineCharacter
                = '\u000A' // New Line
                | '\u000D' // Carriage Return
                | '\u0085' // Next Line
                | '\u2028' // Line Separator
                | '\u2029'; // Paragraph Separator
        }
    }

And this is what it looks like when writing it within Intellipad:

lang

 

Language Compilation

The next step is to compile the ‘RssLanguage.mg’ module I just created; we use the mg.exe compiler provided by the Oslo SDK to do this:

mg
We get an .MGX file out of this. I renamed it to give it a .ZIP extension and tried to open it, but it’s password protected. Does anyone know the secret password? :)
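
If you want to check for yourself whether a file is a ZIP container without renaming it, you can look at its first four bytes. The sketch below assumes the .MGX image really is a regular, non-empty ZIP archive, as the rename experiment suggests; ZipSniffer and LooksLikeZip are hypothetical names and not part of the Oslo SDK.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of the Oslo SDK.
public static class ZipSniffer
{
    // Returns true when the file starts with the ZIP local file header signature "PK\x03\x04".
    public static bool LooksLikeZip(string path)
    {
        byte[] signature = { 0x50, 0x4B, 0x03, 0x04 };
        using (FileStream fs = File.OpenRead(path))
        {
            foreach (byte expected in signature)
            {
                // ReadByte returns -1 at the end of the file, which never equals a signature byte.
                if (fs.ReadByte() != expected) return false;
            }
            return true;
        }
    }
}
```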

 

Run-time Language utilization

Last but not least, I’d like to use my language within the .NET runtime. Luckily, the Oslo SDK provides us with some base classes to do this. I created a new C# Console Application to test things out.
First, add references to the System.Dataflow and Microsoft.M.Grammar assemblies, which can be found within the Bin directory of the Oslo SDK:

image

Then, I wrote this code:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Dataflow; // DynamicParser, GraphBuilder
    using Microsoft.M.Grammar; // MGrammarCompiler

    namespace ConsoleApplication
    {
        class Program
        {
            static void Main(string[] args)
            {
                try
                {
                    string imageFileName = @"C:\Users\Robert Jan\Desktop\My Documents\Oslo\MyOslo\ConsoleApplication\RssLanguage.mgx";
                    string inputFileName = @"C:\Users\Robert Jan\Desktop\My Documents\Oslo\MyOslo\ConsoleApplication\FeedsInput.m";
                    //inwit == module name
                    //RssLanguage == language name
                    string parserName = "inwit.RssLanguage";

                    DynamicParser parser = MGrammarCompiler.LoadParserFromMgx(imageFileName, parserName);

                    object output = parser.ParseObject(inputFileName, ErrorReporter.Standard);

                    Helper.WalkMGraphTree(output);
                }
                catch (Exception e)
                {
                    Console.WriteLine(e.Message);
                }
                Console.ReadLine();
            }
        }
    }

First, I create a DynamicParser instance and provide it with the compiled language image file (the .MGX file) and the parser name. The parser name is the name of the module and the name of the language, joined with a dot.

I then parse the input file using the ParseObject method, and we will get the result.

I wrote a nice Helper function that walks the result tree, and outputs its contents to the Console. Feel free to use it yourself (after giving me a comment here of course :)).

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Dataflow;

    namespace ConsoleApplication
    {
        class Helper
        {
            public static void WalkMGraphTree(object rootNode)
            {
                IGraphBuilder builder = new GraphBuilder();
                WalkNode(rootNode, builder);
            }

            private static void WalkNode(object node, IGraphBuilder builder)
            {
                if (node.GetType().Name == "SequenceNode")
                {
                    // A sequence: walk each element in order
                    foreach (object sequenceElement in builder.GetSequenceElements(node))
                    {
                        WalkNode(sequenceElement, builder);
                    }
                    Console.WriteLine();
                }
                else if (node.GetType().Name == "SimpleNode")
                {
                    // A labeled node: write the label, then walk its children
                    object label = builder.GetLabel(node);
                    Identifier id = label as Identifier;
                    // Fall back to the label's string form if it is not an Identifier
                    WriteLine(id != null ? id.Text : Convert.ToString(label), false);
                    foreach (object successorElement in builder.GetSuccessors(node))
                    {
                        WalkNode(successorElement, builder);
                    }
                    Console.WriteLine();
                }
                else
                {
                    // A leaf value: write it on its own line
                    WriteLine(Convert.ToString(node), true);
                }
            }

            private static void WriteLine(string line, bool newline)
            {
                Console.Write(line + " ");
                if (newline)
                {
                    Console.Write(Environment.NewLine);
                }
            }
        }
    }

Now when I run the Console App, the output looks like this:

image

 

Summary

Here’s the summary of the steps I took, and the end result accomplished:

  • First, we created our languages; we split some functionality into separate languages and used these within the RssLanguage.
  • We created some input data and tested the languages combined with the input data within Intellipad.
  • We compiled the languages with MG.exe into an .MGX image file.
  • We created a .NET application which loads the image file and parses the input data through the language.
  • We created a Helper method which walks the result graph tree and shows us the result in the console.

Valuable links
Steef-Jan gave some pretty good links last Monday, I’d like to highlight one of those and give you two others:

Go and read what Martin Fowler has to say about Oslo and also check out what MSDN has to say about MGrammar:


Posted in: MGrammar , Oslo  Tags:

Robert Jan - Sunday, November 02, 2008 - 4:06 PM

I installed the Oslo CTP SDK on my laptop, and wanted to try out some MGrammar stuff. I looked all around the tool to find the switch that enables the three-pane ‘MGrammar Mode’ view shown in the demos at the PDC.

So when browsing the SDK folder I found the Sample Sources directory:
“C:\Program Files\Microsoft Oslo SDK 1.0\Bin\Intellipad\Sample Sources”

It contains a VS solution file which contains a Microsoft.M.Grammar.IntellipadPlugin project; so it seems to me that I should do something with this project.
So I fired up the solution and built it right away.

Within the components directory of the Intellipad tool, I created a new folder called “Microsoft.M.Grammar.IntellipadPlugin” and I copied over a bunch of files:

image

The Private directory contains the “ModeMenuItem.xcml” file which can be found in the Private directory at the samples source location.

Also copy the Microsoft.M.Grammar.dll assembly into this new component directory and you are good to go.

Now when you fire up Intellipad and open an .mg file, you will see an “MGrammar Mode” menu item appear. Click it, select “Tree Preview” and open up the .mg file, and you will see the three-pane view like in the demos at the PDC.

UPDATE 3/11/2008:
Just found out that Owen Evans figured this out a bit earlier than I did (and his solution is easier :) ); check his blog at http://bgeek.net/2008/10/28/getting-oslos-intellipad-to-show-mgrammar-mode/
Also check out this intellipad blog: http://blogs.msdn.com/intellipad/archive/2008/10/29/creating-and-editing-mgrammar-files-with-intellipad.aspx

 


Posted in: MGrammar , Oslo  Tags:


    Disclaimer
    The opinions expressed herein are my own personal opinions and do not represent my employer's view in anyway.

    © Copyright 2010 Inwit.nl