
Write a custom Caching AST Transformation with Groovy

By breskeby | June 21, 2010

At the last JAX in Mainz I attended a talk by Hamlet D’Arcy called “code generation on the jvm”. The title wasn’t that inviting, but since I knew him and his Groovy addiction, I knew it would be worth it. Besides a short introduction to Spring Roo and another library I don’t remember, he gave a nice introduction to Groovy AST Transformations. BTW, AST is an abbreviation of Abstract Syntax Tree.

What is an AST Transformation?

In short:

The purpose of AST Transformations is to let developers hook into the compilation process so that they can modify the AST before it is turned into bytecode that will be run by the JVM.

Groovy ships with several built-in AST Transformations. If you still have no clue what an AST Transformation is, or what it can do for you, have a look at the singleton example, which explains how a simple (Groovy) class is converted into a singleton using AST Transformations.
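
As a minimal sketch of what such a built-in transformation looks like in use, the following script applies Groovy's @Singleton annotation. The class name Registry and its entries field are made up for illustration; only the annotation itself comes from Groovy.

```groovy
// @Singleton is groovy.lang.Singleton and is imported by default
@Singleton
class Registry {
    // hypothetical state, only here to show that one instance is shared
    List<String> entries = []
}

// the transformation generates a static 'instance' property at compile time
Registry.instance.entries << 'first'
assert Registry.instance.is(Registry.instance)
assert Registry.instance.entries == ['first']

// the generated constructor refuses to create a second instance
boolean rejected = false
try {
    new Registry()
} catch (RuntimeException expected) {
    rejected = true
}
assert rejected
```

Nothing in the class body mentions an instance field or a private constructor; both are added to the AST during compilation.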

Adding Caching
I won’t discuss general pros and cons of caching here. In this post I want to show how to create a custom AST Transformation which caches method calls. Think of an expensive method call like any kind of remote call or some image processing depending on one input parameter:

class SomeServiceClass {

   public String getRemoteValue(String input) {
        //...
        //make expensive remote call
        //or do a lot of calculations here
        //...
       
        return value
   }
}

In some cases it would be nice to cache the method results. There are different ways to do this:

  • In 1995 – The Java developer would change the implementation of SomeServiceClass into something like this:

    public class SomeServiceClass {
       private Map<String, String> cachedRemoteValues = new HashMap<String, String>();

       public String getRemoteValue(String input) {
            String returnValue = cachedRemoteValues.get(input);
            if(null == returnValue){
                //...
                //make expensive remote call
                //or do a lot of calculations here
                //and store value locally in returnValue
                //...
               
                //store calculated value in hashmap
                cachedRemoteValues.put(input, returnValue);
            }
           
            return returnValue;
       }
    }

  • In 2004 – The smart guys would have written an aspect that does this for you and compiled their code with the iajc compiler
  • In 2006 – The state of the art guys would have written an aspect too, but woven it into their code at runtime
  • In 2008 – Today (thanks to the OSGi hype), those of you who want to code at the bleeding edge would use Equinox Aspects ( http://www.eclipse.org/equinox/incubator/aspects/ ) to weave different versions of a caching aspect into your service bundle.

But what sexy solution could we use in 2010 to get this done? What is sexier than:

1. using a sexy modern language like Groovy
2. using a DSL (Domain Specific Language) to describe what you really want
3. hooking into the compilation, juggling with AST nodes and telling the compiler directly what you want?

So let’s get into the details. What is our target? I think it would be nice to mark all methods I want to be cacheable with an annotation named “@Cached”. The example above would then look like this (no surprises here):

class SomeServiceClass {

   @Cached
   public String getRemoteValue(String input) {
        //...
        //make expensive remote call
        //or do a lot of calculations here
        //...
       
        return value
   }
}

Writing an annotation that works as a marker for AST Transformations doesn’t differ much from writing a normal annotation. All it needs is one additional meta-annotation.

package com.breskeby.example

import org.codehaus.groovy.transform.GroovyASTTransformationClass
import java.lang.annotation.ElementType
import java.lang.annotation.Target
import java.lang.annotation.RetentionPolicy
import java.lang.annotation.Retention

/**
 * Created by IntelliJ IDEA.
 * User: Rene
 * Date: 10.06.2010
 * Time: 23:25:30
 * To change this template use File | Settings | File Templates.
 */

@Retention (RetentionPolicy.SOURCE)
@Target ([ElementType.METHOD])
@GroovyASTTransformationClass (["com.breskeby.example.CachedTransformation"])
@interface Cached {

}

The first two annotations should be familiar. RetentionPolicy.SOURCE means that the annotation is discarded by the compiler and is not available at runtime or in the generated class. Since this annotation is only needed as a marker during compilation, this is the obvious choice. ElementType.METHOD as the parameter of @Target indicates that our annotation is only applicable to methods.
The really interesting part of the code snippet above is

@GroovyASTTransformationClass (["com.breskeby.example.CachedTransformation"])

This annotation indicates that an ASTTransformation is linked to this annotation. As its parameter you pass the fully qualified class name of the associated ASTTransformation. The class c.b.e.CachedTransformation implements the ASTTransformation interface.

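import org.codehaus.groovy.ast.ASTNode
import org.codehaus.groovy.ast.AnnotationNode
import org.codehaus.groovy.ast.ClassNode
import org.codehaus.groovy.ast.MethodNode
import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.control.SourceUnit
import org.codehaus.groovy.transform.ASTTransformation
import org.codehaus.groovy.transform.GroovyASTTransformation

@GroovyASTTransformation(phase = CompilePhase.INSTRUCTION_SELECTION)
class CachedTransformation implements ASTTransformation {

 void visit(ASTNode[] astNodes, SourceUnit sourceUnit) {
    //we expect two nodes: the annotation and the annotated method
    //(instanceof is false for null, so null entries are rejected too)
    if(!astNodes || astNodes.length < 2) return
    if(!(astNodes[0] instanceof AnnotationNode)) return
    if(!(astNodes[1] instanceof MethodNode)) return

    //validate the annotated method: exactly one parameter, non-void result
    MethodNode annotatedMethod = astNodes[1]
    if(annotatedMethod.parameters.length != 1) return
    if(annotatedMethod.returnType.name == "void") return

    ClassNode declaringClass = annotatedMethod.declaringClass
    makeMethodCached(declaringClass, annotatedMethod)
  }
}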

The @GroovyASTTransformation annotation provides information about how and when to apply the transformation. Further information about compile phases can be found here. The AST Transformation itself is implemented via the visitor pattern.
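
To get a feeling for where INSTRUCTION_SELECTION sits in the compilation, the following small script lists the phases in the order the compiler runs them. It only uses the CompilePhase enum that ships with Groovy. The remark about the earliest phase for annotation-driven (local) transformations reflects my reading of the Groovy documentation, so treat it as an assumption.

```groovy
import org.codehaus.groovy.control.CompilePhase

// enum constants are declared in the order the compiler executes the phases
def names = CompilePhase.values()*.name()
names.eachWithIndex { name, index ->
    println "${index + 1}. ${name}"
}

// assumption: local transformations may run from SEMANTIC_ANALYSIS onwards;
// the @Cached transformation above runs two phases later
def earliestLocal = names.indexOf('SEMANTIC_ANALYSIS')
def cachedPhase = names.indexOf('INSTRUCTION_SELECTION')
assert cachedPhase > earliestLocal
```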
Our implementation of the visit method checks that the annotated method has exactly one parameter and that the return type isn’t void. The transformation cannot know what to cache if the method returns nothing. After all these checks are done, we call makeMethodCached to make the method cached (surprise! surprise!). The method makeMethodCached does the real work. We should take a look at it, shouldn’t we? The whole method is shown in the following listing:

void makeMethodCached(ClassNode classNode, MethodNode methodNode) {
   // add field of hashmap for cached objects
   def cachedFieldName = methodNode.getName();
   FieldNode cachedField =
    // the fourth argument is the owner, i.e. the class that declares the field
    new FieldNode("cache$cachedFieldName", Modifier.PRIVATE, new ClassNode(Map.class), classNode,
      new ConstructorCallExpression(new ClassNode(HashMap.class), new ArgumentListExpression()));
    classNode.addField(cachedField)

    //augment method with cache calls
    Parameter[] params = methodNode.getParameters()
    //methodNode
    String parameterName = params[0].getName()
    List<Statement> statements = methodNode.getCode().getStatements();
    Statement oldReturnStatement = statements.last();
    def ex = oldReturnStatement.getExpression();
    def ast = new AstBuilder().buildFromSpec  {
      expression{
          declaration {
                variable "cachedValue"
                token "="
                methodCall {
                    variable "cache$cachedFieldName"
                    constant 'get'
                    argumentList {
                      variable parameterName
                    }
                }
          }
      }
      ifStatement {
          booleanExpression {
              variable "cachedValue"
          }
          //if block
          returnStatement {
              variable "cachedValue"
          }
          //else block
          empty()
      }
      expression{
          declaration {
            variable "localCalculated$cachedFieldName"
            token "="
            {-> delegate.expression << ex}()
          }
        }
        expression {
          methodCall {
            variable "cache$cachedFieldName"
            constant 'put'
            argumentList {
              variable parameterName
              variable "localCalculated$cachedFieldName"
            }
          }
        }
        returnStatement {
              variable "localCalculated$cachedFieldName"
        }
    }

    statements.remove(oldReturnStatement)
    statements.add(0,ast[0]);
    statements.add(1,ast[1]);
    statements.add(ast[2])
    statements.add(ast[3])
    statements.add(ast[4])
  }

First we add a FieldNode to our ClassNode. This is the private Map we use to store our cached elements. After that we temporarily store the name of the parameter and the expression of the return statement. Trust me, we need both later…

Now it’s time to create some AST nodes. For that, Groovy has a built-in AstBuilder, which offers several ways to build nodes. In this example we use the buildFromSpec method. It is more verbose than buildFromCode or buildFromString, but it’s a nice exercise for getting a better understanding of an Abstract Syntax Tree. To explore the relationship between written code and the corresponding Abstract Syntax Tree in different compile phases, you can use the Groovy console and its “inspect AST” feature. The best documentation of the AST specification DSL I found on the internet was the AstBuilderFromSpecificationTest class in Groovy trunk.
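
To make the difference between the builder styles more concrete, here is a small sketch that builds the same call, cache.get(key), once with buildFromString and once with buildFromSpec. The names cache and key are placeholders chosen for this illustration and do not appear in the transformation itself.

```groovy
import org.codehaus.groovy.ast.builder.AstBuilder
import org.codehaus.groovy.ast.expr.MethodCallExpression
import org.codehaus.groovy.ast.stmt.BlockStatement
import org.codehaus.groovy.control.CompilePhase

def builder = new AstBuilder()

// 1) from source text: the result list starts with the script body as a block
def fromString = builder.buildFromString(CompilePhase.CONVERSION, true, 'cache.get(key)')
BlockStatement block = fromString[0]
// the single statement in the block wraps the method call expression
def callFromString = block.statements[0].expression

// 2) from the specification DSL: one builder call per AST node
def fromSpec = builder.buildFromSpec {
    methodCall {
        variable 'cache'      // the object the method is called on
        constant 'get'        // the method name
        argumentList {
            variable 'key'    // the single argument
        }
    }
}
def callFromSpec = fromSpec[0]

assert callFromString instanceof MethodCallExpression
assert callFromSpec instanceof MethodCallExpression
assert callFromString.methodAsString == 'get'
assert callFromSpec.methodAsString == 'get'
```

The string variant is shorter, while the spec variant mirrors the node structure one to one, which is why it is used for the rest of this post.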

Using AstBuilder.buildFromSpec we create five nodes here. Let’s take a look at each of them:

  1.  // def cachedValue =  cacheMethodName.get("parameter")
    expression{
        declaration {
           variable "cachedValue"
           token "="
           methodCall {
              variable "cache$cachedFieldName"
              constant 'get'
              argumentList {
                 variable parameterName
              }
           }
        }
     }
    

    This calls get on the HashMap with the value of the method parameter.

  2.  // if(cachedValue) return cachedValue
     ifStatement {
        booleanExpression {
           variable "cachedValue"
        }
        //if block
        returnStatement {
           variable "cachedValue"
        }
        //else block
        empty()
     }
    

    This is a simple if statement: if cachedValue is set (non-null and true according to the Groovy truth), return cachedValue.

  3.  // def localCalculatedCachedField = ...
     expression{
        declaration {
           variable "localCalculated$cachedFieldName"
           token "="
           {-> delegate.expression << ex}()
        }
     }
    

    The third expression assigns the expression of the return statement we stored at the beginning to a local variable. Doing this via the specification DSL is a bit tricky, because we have to bring our stored expression into the spec. What makes it work is that for "declaration {}", the third call has to be a closure execution that pushes one expression (type = Expression) into AstSpecificationCompiler's expression list. (Roshan Dawrani told me that. Ask him for further details...)

  4.  expression {
        methodCall {
           variable "cache$cachedFieldName"
           constant 'put'
           argumentList {
              variable parameterName
              variable "localCalculated$cachedFieldName"
           }
        }
     }
    

    The fourth expression puts the value stored in the variable from expression three into the HashMap.

  5.  returnStatement {
        variable "localCalculated$cachedFieldName"
     }
    

    The last node is a simple return statement. After we have stored the calculated value in the HashMap (see expression 4), we return it.

  6. After creating our different AST Nodes we have to rearrange the list of Statements of the method we want to cache. First we remove the old return statement, and then we add the expressions above to the statement list of the method:

        statements.remove(oldReturnStatement)
        statements.add(0,ast[0]);
        statements.add(1,ast[1]);
        statements.add(ast[2])
        statements.add(ast[3])
        statements.add(ast[4])
    

    Now we’re done with adding caching to a method via AST Transformations. I pushed the whole example including tests to github.
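
To summarise the effect of the five nodes, this is roughly what the annotated method behaves like after the transformation, written out by hand. It is a sketch of the resulting logic, not compiler output: the remoteCalls counter and the toUpperCase() call are stand-ins for the expensive work and exist only so that the script is runnable.

```groovy
class SomeServiceClass {
    // hypothetical counter, only here to observe how often the "remote call" runs
    int remoteCalls = 0

    // the field added by makeMethodCached: "cache" + method name
    private Map cachegetRemoteValue = new HashMap()

    String getRemoteValue(String input) {
        // node 1: look up the cache
        def cachedValue = cachegetRemoteValue.get(input)
        // node 2: return early on a hit (Groovy truth, so '' counts as a miss)
        if (cachedValue) {
            return cachedValue
        }
        // original method body without its return statement
        remoteCalls++
        // node 3: the old return expression is assigned to a local variable
        def localCalculatedgetRemoteValue = input.toUpperCase()
        // node 4: store the result
        cachegetRemoteValue.put(input, localCalculatedgetRemoteValue)
        // node 5: return the result
        return localCalculatedgetRemoteValue
    }
}

def service = new SomeServiceClass()
assert service.getRemoteValue('abc') == 'ABC'
assert service.getRemoteValue('abc') == 'ABC'
assert service.remoteCalls == 1
```

Note that the if statement relies on the Groovy truth, so results such as an empty string or 0 would be recalculated on every call.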

    Limitations of this example
    This example uses the simplest approach to caching. Introducing caching to your application can bring performance improvements, but it can also introduce new problems. We didn’t care about cache invalidation in the example above. Furthermore, using a simple HashMap can be a problem too, because its entries are never released. You should always use a SoftReference-based Map to do caching (see kabutz). Maybe I will change this in a later post.
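
As a rough idea of what a SoftReference-based cache could look like, here is a minimal sketch. The class SoftValueCache and its methods lookup and store are hypothetical names invented for this illustration. It only wraps the values in soft references and makes no attempt to remove stale keys or to be thread-safe.

```groovy
import java.lang.ref.SoftReference

class SoftValueCache {
    // values are held softly, so the garbage collector may reclaim them under memory pressure
    private final Map<Object, SoftReference<Object>> entries = [:]

    // returns the cached value, or null if it was never stored or has been collected
    def lookup(key) {
        entries.get(key)?.get()
    }

    void store(key, value) {
        entries.put(key, new SoftReference<Object>(value))
    }
}

def cache = new SoftValueCache()
def value = 'expensive result'   // still strongly referenced here, so it cannot be collected yet
cache.store('input', value)
assert cache.lookup('input') == 'expensive result'
assert cache.lookup('missing') == null
```

Because a value can disappear at any time, the calling code has to treat a null result as a cache miss and recalculate.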


    One Response to “Write a custom Caching AST Transformation with Groovy”

    1. Dierk König Says:
      June 21st, 2010 at 09:57

      see also: Groovy’s @Lazy AST transformation ;-)
