Multi-stage Programming for Mainstream Languages

Edwin Westbrook, Mathias Ricken, Jun Inoue, Yilong Yao, Tamer Abdelatif¹, Walid Taha
Rice University
{emw4,mgricken,ji2,yy3}@cs.rice.edu, eng.tamerabdo@gmail.com, taha@cs.rice.edu

¹ Ain Shams University.

Abstract

Multi-stage programming (MSP) constructs enable a disciplined approach to program generation. In the purely functional setting, it is possible to statically type-check MSP constructs to ensure that they can only generate well-typed programs. Despite numerous attempts, it has been difficult to extend this guarantee in the presence of key features of mainstream languages, especially imperative constructs. This paper proposes a new method for achieving this guarantee and shows that it is powerful enough to express classic applications of MSP in Java. Our key insight is that safety can be regained by ensuring that the bodies of escapes are weakly separable from the rest of the code. This means that computational effects occurring inside an escape can only be visible outside the escape through types guaranteed to not contain code. Our method is simpler than prior proposals, and we expect that it can be intuitively understood by programmers. We formalize a calculus to demonstrate the soundness of the proposed approach. An implementation called Mint, which extends the Java OpenJDK compiler, is used to validate both the expressivity of the system and the performance gains attainable by using MSP in this setting.

Categories and Subject Descriptors: D.3.1 [Programming Languages]: Formal Definitions and Theory; D.3.3 [Programming Languages]: Language Constructs and Features

General Terms: Languages

Keywords: Multi-staged languages, Multi-stage programming, Type systems, Java

1. Introduction

Multi-stage programming (MSP) languages provide a hygienic quasi-quotation mechanism intended for program generation. Hygiene ensures that generated programs are free of accidental variable capture, a problem that makes using strings to generate programs in preprocessors like cpp notoriously hard to use. Research on functional languages has shown that it is possible to statically check MSP programs to ensure that they can only be used to generate well-typed programs [22, 23, 4].

Unfortunately, extending this static typing guarantee to mainstream languages has proved to be challenging. In particular, standard features of mainstream languages, such as imperative assignment, lead to scope extrusion, in which variables in code escape the scopes where they are defined. Several approaches to solving this problem have been proposed. Two of these proposals use record polymorphism and index the types of code objects with their free variables [13, 1], while another one uses delimited control to express effects and to limit their scope [11]. These are powerful systems that give the expert MSP user fine-grained control over scoping in code. However, there is still a need for a type system that makes MSP accessible to general programmers and domain experts.

1.1 Contributions

To address this need, we propose a new approach to type-safe MSP. We argue that this approach is better suited for type-safe MSP in mainstream languages than previous proposals. Our contributions include:

• A minimal language extension to support MSP in Java, combining the three standard MSP constructs with a reflection library of staged versions of the standard Java reflection classes (Section 2.2).

• The notion of weak separability, which limits the computational effects that can occur inside the bodies of escapes (Section 3). Weak separability is enforced by a small set of restrictions that ensure that any effects that can be observed outside escape expressions do not involve code objects. We expect that the restrictions will be easily and intuitively understood by mainstream programmers.

• Demonstration of the expressivity of a language with these restrictions through both standard pure examples and examples with imperative features (Section 4).

• A type system based on weak separability, an operational semantics that formalizes the runtime behavior of an object-oriented MSP language with effects, and a proof that running any well-typed program is guaranteed to be free of any runtime errors, including possible scope extrusion and generation (and execution) of ill-formed code (Section 5). Full proofs are available in Appendix A.

• An implementation of this proposal, published online at http://plresearch.org/JavaMint (Section 6). The implementation is based on the Java OpenJDK compiler from Sun Microsystems.

• Validation of the performance impact of MSP in Mint, showing that it is consistent with prior studies (Section 7).

1.2 Comparisons with Related Efforts

Several efforts have been made to accommodate effects in the context of multi-stage programming, as well as to accommodate object-oriented features. In what follows we summarize the most closely related efforts.

Early efforts to develop sound type systems for MSP languages with effects focused on introducing imperative features to functional MSP languages [22, 3, 2, 13, 11]. All of these support manipulation of open terms and guarantee well-formedness of the generated code, but they significantly differ in the approaches and extents to which they support effects. Calcagno et al. [3] allow imperative operations on code but do not support imperative operations on open terms. Kim et al.
[13] support unrestricted imperative operations on open terms but choose not to provide α-equivalence for future-stage code. Their system delegates hygiene to a specialized binder λ∗, whose operation can be explained only in terms of an implicit "gensym." They present an inferable polymorphic type system. Ancona and Moggi [2] incorporate imperative operations on open terms and provide hygiene. The imperative primitives in all of these, except for Kameyama et al. [11], are ML-style "boxed" references, which are not in line with Java semantics. Pervasive, unboxed references, an essential feature of Java, exacerbate the problem. (See Section 3 for a detailed discussion.) Kameyama et al. [11] use delimited control as their imperative primitive, which is more general than mutable stores. They maintain hygiene and support imperative operations on open terms, but they choose not to allow any side effect that occurs inside a future-stage binder to be visible from the code outside.

Until recently, efforts to introduce MSP to the object-oriented setting focused on engineering aspects. The staged extensions of Java by Schultz et al. [16], Kamin et al. [12], and Zook et al. [24] focus on implementation, applications, and on quantifying the performance benefits. These extensions were not formalized. Neverov and Roe [14] formalize a core typed, Java-like calculus but leave the type soundness unproved. Their calculus also does not have side effects. Huang et al. [9] state that their system guarantees well-formedness and well-typedness of generated code, but they do not prove such a result or formalize their system. In later work, Huang et al. [8] focus on reflection, and do not allow manipulation of arbitrary code values (in particular open terms). They prove soundness, but their system does not model side effects.

Aktemur [1] and Kim et al. [13] rely on a form of record typing that makes the type and the type system complex. As such, our approach is closest to that of Calcagno et al. [3], in which code values involved in effects are checked against certain closedness criteria. In contrast, we identify and solve the problem of finding an appropriate notion that can work with Java's complex object model.

2. Programming in Mint

As noted earlier, Mint extends Java with three MSP constructs and a library of staged reflection primitives. The guiding principle in Mint's design is parsimony. In this section we introduce the design from the programmer's perspective.

2.1 Staging Constructs

Mint extends Java 1.6 with the three standard MSP constructs: brackets, escape, and run [22, 23, 4]. Brackets are written as <| |> and delay the enclosed computation by returning it as a code object. For example, <| 2 + 3 |> is a value. Brackets can contain a block of statements if it is surrounded by curly braces:

    <| { C.foo(); C.bar(); } |>;

Code objects have type Code<T>, where T is the type of the expression contained. For example, <| 2 |> has type Code<Integer>. A bracketed block of statements always has type Code<Void>.

Code objects can be escaped or run. Escapes are written as ` and allow code objects to be spliced into other brackets to create bigger code objects. For example,

    Code<Integer> x = <| 2 + 3 |>;
    Code<Integer> y = <| 1 + `x |>;

stores <| 1 + (2 + 3) |> into y. Run is provided as a method run() that code objects support. For example, executing

    int z = y.run();

after the above example sets z to 6.

Basic MSP in Mint can be illustrated using the classic power function example. Figure 1 displays the unstaged power function in Java. Figure 2 displays a staged version. The staged method spower takes an argument x that is a piece of code for an integer, along with an integer n, and returns a piece of code that multiplies x by itself n times. To use spower we create code for an anonymous inner class PowerFun. The generated class, which is assigned to the variable CodePower17, has an apply method with a body that is generated by spower called with exponent 17. This creates code that multiplies the input by itself 17 times. The code CodePower17 is then compiled and run with the run() method, which produces a PowerFun and assigns it to spower17. Finally, val is bound to the result of calling the apply method of spower17 on 2, computing 2^17.

    public static Integer power(Integer x, Integer n) {
        if (n == 1)
            return x;
        else
            return x * power(x, n-1);
    }

Figure 1. The unstaged power function.

    public static Code<Integer> spower(Code<Integer> x, int n) {
        if (n == 1)
            return x;
        else
            return <| `x * `(spower(x, n-1)) |>;
    }

    public static abstract class PowerFun {
        public abstract int apply(int x);
    }

    Code<? extends PowerFun> CodePower17 = <|
        new PowerFun() {
            public int apply(final int x) {
                return `(spower(<| x |>, 17));
            }
        } |>;
    PowerFun spower17 = CodePower17.run();
    int val = spower17.apply(2);

Figure 2. The staged power function.

2.2 Staged Reflection Primitives

Neverov observed that staging and reflection in languages like C# and Java can be highly synergistic [14]. He also noticed that fully exploiting this synergy requires providing a special library of staged reflection primitives. Mint provides such a library. The primitives are based on the standard reflection primitives in the Java library, including the Class and Field classes.¹

¹ The Mint reflection library does not support all reflection primitives. For example, Method and Constructor require multiple arguments. This requires adding indexed types to Java, and is therefore outside the scope of this work.

To represent these in Mint, the library adds two corresponding types, ClassCode and FieldCode. The ClassCode type is indexed by the class itself, just like the type it is modelled after. For example, the corresponding class for Integer objects has type ClassCode<Integer>. Any ClassCode object provides methods for manipulating classes corresponding to the methods of Class.
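For readers less familiar with the standard library type that ClassCode is modelled after, the following plain-Java sketch (no Mint involved) shows how Class<T> is indexed by the class it describes, how its cast method behaves, and how the public fields of a class can be enumerated. The names ReflectionSketch, Point, castTo, and publicFieldNames are hypothetical and exist only for this illustration.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ReflectionSketch {
    // Hypothetical example class with two public fields.
    public static final class Point {
        public int x = 3;
        public int y = 4;
    }

    // Class<T> is indexed by T, so cast returns a T. The check happens
    // immediately: a ClassCastException is thrown if obj is not a T.
    public static <T> T castTo(Class<T> type, Object obj) {
        return type.cast(obj);
    }

    // Names of the public fields of a class. The result is sorted because
    // getFields() does not promise any particular order.
    public static List<String> publicFieldNames(Class<?> type) {
        List<String> names = new ArrayList<String>();
        for (Field f : type.getFields()) {
            names.add(f.getName());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        Integer n = castTo(Integer.class, (Object) Integer.valueOf(42));
        System.out.println(n);
        System.out.println(publicFieldNames(Point.class));
    }
}
```

In standard Java the cast is checked as soon as cast is called. A staged counterpart would instead place the cast into generated code, so any failure would surface only when that code is run.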
For example, the cast method of ClassCode<A> takes any code object and inserts an unsafe cast in the code object, yielding a code object of type Code<A>. Because the cast is inserted into the code, any exceptions raised by the cast will not happen until the code is run with the run() method. The class also provides methods for looking up a class by name and for retrieving the fields, methods, and constructors of a class.

The type FieldCode, indexed by A and B, represents a field in class A that has type B. It provides a get method which takes a Code<A> value and returns a value of type Code<B>. This method constructs a field selection (intuitively, a <| (`a).f |> code fragment) on that object. The type also provides a getType method to return a ClassCode object for the type B.

The following example illustrates the use of these classes. The code defines a method fieldIter that uses the getFields() method to iterate over all the statically known fields of an object of type A:

    Code<Void> fieldIter(FieldFun fun, Code<A> o, ClassCode<A> clazz) {
        Code<Void> c = <| { } |>;
        for (FieldCode f : clazz.getFields())
            c = <| { `c; `(fun.call(f.get(o), f.getType())); } |>;
        return c;
    }

    interface FieldFun {
        // for any T
        <T> Code<Void> call(Code<T> c, ClassCode<T> t);
    }

For each field in class A, the method creates a projection of o for that field and passes it and the staged class object for the type of the field to fun.call(), where user-defined processing is performed. The code objects returned by the FieldFun.call method are accumulated in the code object c. For example, calling fieldIter with the following FieldFun creates a code object that recursively prints an object and all of its fields:

    public static class PrintFieldFun implements FieldFun {
        public separable <T> Code<Void> call(Code<T> c, ClassCode<T> t) {
            return <| { System.out.println(`c); } |>;
        }
    }

In Section 4 these two types will be used to implement a staged serializer in Mint.

3. The Scope Extrusion Problem

A key challenge in statically ensuring safety in imperative MSP languages is preventing scope extrusion. In MSP languages, free variables arise when escaped computations involve code fragments containing variables bound inside surrounding brackets. In other words, they arise in programs that build a code fragment containing a binding construct, and in the body of the binding construct there is an escaped computation that refers to a variable introduced by that binding construct. In the purely functional setting, this can never lead to scope extrusion. However, in the presence of effects, the escaped computation can leak the code value to, say, a global store. Generally, the store is taken to exist outside the scope of the term being evaluated, and therefore, there is no obvious way to associate a unique binder with a variable that occurs free in a code fragment in the store.

In this section, we review the basic types of errors that can arise in an MSP language, examine the scope extrusion problem in more detail, explain why simple approaches are inadequate, and show how the notion of separability can lead to a practical solution.

3.1 Basic Errors in Untyped MSP Programs

Three basic types of errors can arise in any language that supports staging constructs, namely, (1) running or escaping a non-code value, (2) using a variable before it is bound, and (3) similar behavior that can result from using the run construct. Any sound type system must prevent these types of errors. An example of the first type can be seen in what follows:

    <| { Code<Void> z = foo(); `(17); } |>

Because 17 is not a code value, this escape operation would fail. An example of the second type can be illustrated with the following code:

    <| { Code<Void> z = foo(); `(z); } |>

Here z is declared inside the brackets, so it is not yet bound when the escape is evaluated. Incorrect use of the run construct can lead to essentially the same kind of error. For example, we can write code that effectively reduces to the same code we have above:

    <| { Code<Void> z = foo(); `(<| z |>.run()); } |>

This is a classic example of how the run construct dynamically changes the level of a term.

3.2 When can Scope Extrusion Occur?

Scope extrusion occurs when any one of the following situations arises in the body of an escape:

1. Assigning a code object to a variable or field that is reachable outside the escape, for example:

    <| { Integer y = foo(); Integer z = `(x = <| y |>); } |>;

2. Throwing an exception that contains a code object, for example:

    Code<Integer> meth(Code<Integer> c) {
        throw new CodeContainerException(c);
    }
    <| { Integer y; `(meth(<| y |>)); } |>

3. Cross-stage persistence (CSP) of a code object, an example of which is displayed in Figure 3.

The first two cases are traditional conditions for scope extrusion. The first example extrudes y from its scope by assigning <| y |> to the variable x bound outside of the scope of y, while the second example throws an exception containing <| y |> outside the scope of y.

The third case is more subtle. The call to doCSP in the example injects the anonymous inner subclass of Thunk into the returned code using CSP, yielding

    <| new IntCodeFun() {
           Code<Integer> call(Integer y) { return T.call(); }
       }.call(1) |>

where T is the Thunk that returns <| y |>. In a substitution-based semantics, calling T with 1 would return <| 1 |>, and no scope extrusion would occur.
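The substitution-based reading can be mimicked in plain Java if, purely for illustration, code objects are modelled as source text. This is an assumption of the sketch and not how Mint represents code; the names SubstitutionSketch and substitute are hypothetical. Binding y to 1 rewrites the text that the thunk would return before it is ever handed out:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubstitutionSketch {
    // Assumption of this sketch: a code object is just its source text.
    // Replaces whole-word occurrences of a variable name by a value's text.
    public static String substitute(String code, String variable, String value) {
        Pattern wholeWord = Pattern.compile("\\b" + Pattern.quote(variable) + "\\b");
        return wholeWord.matcher(code).replaceAll(Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        // The text the thunk T would return, with y still free.
        String thunkResult = "<| y |>";
        // Applying the binding y = 1 by substitution closes the fragment.
        System.out.println(substitute(thunkResult, "y", "1"));
    }
}
```

Because the rewrite acts on the text itself, no occurrence of y survives to leave its scope.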
However, at the more realistic level of an environment-based semantics that uses an environment to implement substitution efficiently, T would return the literal value <| y |>; preventing this behavior would require the run() method to traverse the definition of the call() method of T and to replace <| y |> with <| 1 |>, which would be difficult in the JVM. This behavior, known as the "hidden free-variable problem," arises commonly in environment-based implementations of multi-stage languages [19]. A substitution-based semantics does not allow us to capture it. The object hiding the free variable is usually a closure in λ calculus-based models, and when computation proceeds via substitution, the entire λ term (as well as open terms therein) is not moved off to an explicit heap, which makes scope extrusion impossible. To prevent this problem, the type system must place special restrictions on CSP with reference types. In a language with unboxed references, this must be extended to all (non-primitive) types, and its impact on the expressivity of the language is more pervasive.

    interface IntCodeFun {
        Code<Integer> call(Integer y);
    }
    interface Thunk {
        Code<Integer> call();
    }
    class ThunkCSPer {
        static Code<Code<Integer>> doCSP(Thunk f) {
            return <| f.call() |>;
        }
    }
    <| new IntCodeFun() {
           Code<Integer> call(Integer y) {
               return `(ThunkCSPer.doCSP(new Thunk() {
                   Code<Integer> call() { return <| y |>; }
               }));
           }
       }.call(1) |>

Figure 3. Cross-stage Persistence of Code Objects.

3.3 Weak Separability

We can prevent the three situations mentioned in the previous section using the following notion:

Definition 1. A program fragment is separable if it is observable from the surrounding runtime environment only through its return value.

A separable program fragment appears purely functional; it does not have any side effects at all. Mint, however, does allow side effects as long as they only involve values that are code-free:

Definition 2. A type is code-free if all of its fields are code-free, and its class is final, meaning it is not allowed to be subclassed. A value is code-free if its type is code-free.

The requirement that a class is final ensures that a subclass with an additional field of type Code cannot be substituted at runtime. Code-free types include number types such as Integer and Double, the String class, arrays of code-free types, and all of Java's reflection classes such as Class and Field. They do not include Object, as this type is not final. This is justified because an Object could be a code object at runtime. We can now define the notion of weak separability that describes the code Mint allows inside escapes:

Definition 3. A program fragment is weakly separable if it is observable from the enclosing runtime environment only through its return value or through side effects involving only code-free values.

Requiring that code inside escapes is weakly separable is sufficient to prevent scope extrusion. This is proved in Section 5. The Mint type-checker enforces this by ensuring that the following hold of any term inside an escape:

1. Assignment is made only to variables bound within the term;

2. Exceptions are only thrown when the exception value is either an exception caught by a previous catch in the program fragment, or a constructor call new C(e1, ..., en) where the ei are code-free;

3. Cross-stage persistence occurs only for final variables of code-free types;

4. Only methods and constructors whose bodies are weakly separable are called.

The first three clauses directly address the three cases of scope extrusion in the previous section. Note that the requirement that CSP variables be final exists so that the value of the variable does not change over the lifetime of the code object; Java has a similar restriction for variables referenced inside anonymous inner classes. The last clause ensures that all methods called from the body of a weakly separable program fragment also satisfy weak separability. To check this condition, methods that are going to be called from the body of an escape are explicitly annotated in Mint with the new keyword separable. Note that a call to run() is not weakly separable, so incorrect use of it as described in Section 3.1 is prevented as well.

Weak separability statically ensures that no code object created inside an escape can leak out of the escape. Thus, scope extrusion is not possible. The restrictions are easy to understand and follow, and the compiler can point out exactly where the errors are if the user violates them. We also believe the reasons behind them are simple to understand, given an explanation of scope extrusion. The remaining question is whether the system is too restrictive. This is answered in the following section.

4. Expressivity

Weak separability does not severely restrict expressiveness, and many useful MSP programs can be written in Mint. Intuitively, this is because code generators do not rely heavily on computational effects. Most classic applications of MSP, such as interpreters, use code generators that are purely functional. This does not mean that the generated code is functional, just that the generators are. In addition, the run() method is only ever called at the top level in almost all applications of MSP, and cross-stage persistence is mostly used for primitive types.

To illustrate these points, the remainder of this section describes the implications of weak separability and examines a number of MSP examples in Mint, including: staging an interpreter, a classic MSP example; staging a for loop to do loop unrolling, demonstrating a generator for imperative code; and a staged serializer that uses Mint's reflection capabilities. The performance of these examples is evaluated in Section 7.
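The distinction between a purely functional generator and imperative generated code can be imitated in plain Java, with closures standing in for code objects. This is only an analogy under that assumption: nothing is compiled, Mint's brackets and escapes are not involved, and the names GeneratorSketch, IntCell, unroll, and beforeAndAfter are made up for this sketch.

```java
import java.util.function.IntFunction;

public class GeneratorSketch {
    // A final class with a single int field, used as mutable state
    // by the "generated" actions.
    public static final class IntCell {
        public int value;
    }

    // The generator: it only composes actions and performs no effect itself.
    public static Runnable unroll(int start, int stop, int step,
                                  IntFunction<Runnable> iteration) {
        Runnable acc = () -> { };
        for (int x = start; x < stop; x += step) {
            final Runnable previous = acc;
            final Runnable next = iteration.apply(x);
            acc = () -> { previous.run(); next.run(); };
        }
        return acc;
    }

    // Returns the cell's value after generation but before running,
    // followed by its value after running the composed action.
    public static int[] beforeAndAfter(int start, int stop, int step) {
        final IntCell cell = new IntCell();
        Runnable generated =
            unroll(start, stop, step, i -> () -> { cell.value += i; });
        int before = cell.value;   // generation alone changed nothing
        generated.run();           // the effects happen only here
        return new int[] { before, cell.value };
    }

    public static void main(String[] args) {
        int[] result = beforeAndAfter(0, 5, 1);
        System.out.println(result[0] + " " + result[1]);
    }
}
```

The generator never touches the cell; only the composed action does, and only once it is run. This mirrors the point that the restriction falls on generators, not on what they generate.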
4.1 Programming with Weak Separability While classes used in CSP and in escapes need to be code-free, the restrictions that this places on programs can be avoided in most cases. In practice, there are two main difficulties. First, only weakly separable methods can be called from within escapes. This excludes most existing classes, such as those in the standard Java API, from being used in escapes. However, there is no restriction placed on the code generated by an escape, so the restriction is essentially on code generators themselves. We have yet to find a case when inseparable calls were necessary inside an escape. The second difficulty is that subtype polymorphism cannot be used in CSP, because classes used in CSP need to be final. For example, programs that use the Runnable interface to implement the command design pattern [7] cannot execute the commands abstractly if they use CSP, as in this example: class MyCmd implements Runnable { ... } public void someMethod () { Runnable cmd = new MyCmd (); // error : Runnable not final Code < Void > cv = <| { cmd . run (); } | >; } return f . get ( _s ). apply ( _e . eval (e , f )); } } Variable lookup is performed in a variable environment by calling the Env.get(String s) method, returning an integer. In applications, function lookup is done using the FEnv.get(String s) method, returning a Fun object with an int apply(int v) method, which is then applied to the argument of the application. interface Env { public int get ( String y ); } interface FEnv { public Fun get ( String y ); } interface Fun { public int apply ( int param ); } We can regain the ability to perform dynamic dispatch by making subclasses be final and rewriting our program to use either static variables or final local variables, as follows: final class MyCmd implements Runnable { ... } public static Runnable cmd1 = new MyCmd (); public void someMethod () { final MyCmd cmd2 = new MyCmd (); // ok : cmd1 is static , not CSP Code < Void > cv2 = <| { cmd1 . 
run (); } | >; // ok : MyCmd is final Code < Void > cv2 = <| { cmd2 . run (); } | >; } Two empty environments, env0 and fenv0, unconditionally throw an exception in their get methods to signal a failed lookup. The environments are extended using the ext and fext methods. For instance, ext is static Env ext ( final Env env , final String x , final int v ) { return new Env () { public int get ( String y ) { if ( x . equals ( y )) return v ; else return env . get ( y ); } }; } 4.2 Staged Interpreter Staged interpreters are a classic application of MSP [20, 21]. To demonstrate that staged interpreters can be written in Mint, we have implemented an interpreter for a small programming language called lint [20], which supports integer arithmetic, conditionals, and recursive function definitions of one argument. The unstaged interpreter represents expressions with the Exp interface, and instantiates this interface with one class for each kind of AST node in the language. This interface specifies the single method eval for evaluating the given expression, which takes two environments, one for looking up variables and the other for looking up defined functions. For example, integers, addition, variables and application of defined functions are implemented as follows: interface Exp { public int eval ( Env e , FEnv f ); } class Int implements Exp { private int _v ; public Int ( int value ) { _v = v ; } public int eval ( Env e , FEnv f ) { return _v ; } } class Add implements Exp { private Exp _l , _r ; public Add ( Exp l , Exp r ) { _l = l ; _r = r ; } public int eval ( Env e , FEnv f ) { return _l . eval (e , f ) + _r . eval (e , f ); } } class Var implements Exp { private String _s ; public Var ( String s ) { _s = s ; } public int eval ( Env e , FEnv f ) { return e . 
get ( _s ); } } class App implements Exp { private String _s ; private Exp _e ; public App ( String s , Exp e ) { _s = s ; _e = e ; } public int eval ( Env e , FEnv f ) { Recursive functions are implemented using anonymous inner classes to express closures. The code below creates a function environment fenv1 with the declaration of the identity function id(x) = x: final Exp body = new Var (" x "); FEnv fenv1 = fext ( fenv0 , " id " , new Fun () { public int apply ( final int param ) { return ( body . eval ( ext ( env0 , " x " , param ) , fext ( fenv0 , " id " , this ))); } }); The staged interpreter redefines the Env.eval method to return Code, so that evaluating an expression yields code to compute its value. The variable environment returns Code, and the function environment returns Code. interface Exp { public escape_safe Code < Integer > eval ( Env e , FEnv f ); } interface Env { public escape_safe Code < Integer > get ( String y ); } interface FEnv { public escape_safe Code get ( String y ); } The return type of the FEnv.get method uses a wildcard with an upper bound of Fun. This is necessary since the type of the value produced by the code object is not exactly Fun, but rather a subtype of Fun. The Exp.eval, Env.get, and FEnv.get methods are marked as separable so that they can be called from inside an escape. Staging the above AST classes yields the following: interface Exp { public separable Code < Integer > eval ( Env e , FEnv f ); } class Int implements Exp { /* ... */ public separable SafeCode < Integer > eval ( Env e , FEnv f ) final int v = _v ; return <| v | >; } } class Var implements Exp { /* ... */ public separable SafeCode < Integer > eval ( Env e , FEnv f ) return e . get ( _s ); } } class Add implements Exp { /* ... */ public separable SafeCode < Integer > eval ( Env e , FEnv f ) return <| ‘( _l . eval (e , f )) + ‘( _r . eval (e , f )) | >; } } class App implements Exp { /* ... 
    */
    public separable SafeCode<Integer> eval(Env e, FEnv f) {
        return <| ‘(f.get(_s)).apply(‘(_body.eval(e, f))) |>;
    }
}

The Int and Var classes are straightforward. Operations like Add recursively evaluate their subtrees and splice together the returned code objects. Function application in App again uses the get method to look up the function named by _s. The return type of get is now Code, meaning that get returns code for the defined function. This code is spliced into the returned code, and its result is applied to the evaluation of the argument using the apply method. The argument for apply is obtained by splicing in the code object returned by _body.eval. In other words, App escapes both the function and the code value from evaluating the argument, and then returns code to apply the function to the argument. If _s does not name a valid, defined function, then the get method throws an exception. This is the only computational effect in the whole staged interpreter that happens inside a code generator. It is weakly separable because the thrown exception need only contain the string argument _s that was not found in the environment; the exception therefore is code-free.

The functions to extend the environments associate names with code objects now. During lookup, we have to use the == operator instead of the String.equals method, because the latter has not been declared separable. This is not a serious impediment, though, since Java strings are immutable and can be interned. Staging Env.ext yields:

    static separable Env ext(final Env env, final String x,
                             final SafeCode<Integer> v) {
        return new Env() {
            public separable SafeCode<Integer> get(String y) {
                // error: if (x.equals(y))
                if (x == y) return v;
                else return env.get(y);
            }
        };
    }

In the code that creates recursive functions, we again use anonymous inner classes for closures. Since the function environment associates names with code for functions, we have to put a reference to the function we are creating in brackets. However, a reference to this is not allowed, so we create a final local variable fthis and return code for it.

    final Exp body = new Var("x");
    FEnv fenv1 = fext(fenv0, "id",
        <| new Fun() {
               public int apply(final int param) {
                   final Fun fthis = this;
                   return ‘(body.eval(ext(env0, "x", <| param |>),
                                      fext(fenv0, "id", <| fthis |>)));
               }
           } |>);

Evaluating a program is now a two-step process. The eval method now returns code for an integer; running that code returns the integer. Section 7 provides performance comparisons between staged and unstaged interpretation.

4.3 Loop Unrolling

As discussed above, weak separability does not restrict the computational effects in generated code; it does so only in the code generators themselves. As an example of this, we consider a code generator for loop unrolling, and how it can be used to unroll a loop with non-local side effects. We can write a generic loop in standard Java as follows:

    public static void roll(int start, int stop, int step, Iter I) {
        for (int x = start; x < stop; x += step)
            I.iteration(x);
    }

This uses an interface called Iter to specify an arbitrary action for each iteration of the loop through the iteration method, which has return type void. To unroll this loop, we can stage the roll method as follows:

    public static separable Code<Void> unroll(int start, int stop,
                                              int step, SIter I) {
        Code<Void> c = <| { } |>;
        for (int x = start; x < stop; x += step) {
            c = <| { ‘c; ‘(I.iteration(x)); } |>;
        }
        return c;
    }

This method uses an interface SIter to specify a code object for each iteration of the loop through the iteration method, which for SIter has return type Code. These code objects are accumulated into a code object c containing the sequence of statements for the whole loop. This code generator is written in an imperative style consistent with the prevailing Java culture. The body of this method is weakly separable because c is bound inside the method. The code object returned by I.iteration is not. For example, the following class generates code that accumulates the indices used in the loop iteration into an object given by cell:

    static class sIncrIter implements SIter {
        Code<IntCell> cell;
        public separable Code<Void> iteration(final int i) {
            return <| { (‘cell).value += i; } |>;
        }
    }

4.4 Serializer Generator

A serializer is a program that recursively converts an object and all of its fields to a string representation. Serializers are often slow, however, because they must use Java’s reflection primitives to determine the fields of an object at runtime. Here we show how to write a staged serializer, which generates a serializer for a given static type. This approach performs the necessary reflection when the serializer is generated, and then generates code to serialize all of a given object’s fields:

    public static separable Code<Void> sserialize(ClassCode type,
                                                  final Code obj) {
        if (type.getCodeClass() == Byte.class)
            return <| { writeByte(‘((Code<Byte>) obj)); } |>;
        else if (type.getCodeClass() == Integer.class)
            return <| { writeInt(‘((Code<Integer>) obj)); } |>;
        Code<Void> result = <| { } |>;
        for (final FieldCode fc : type.getFields()) {
            result = <| { ‘result; ‘(sserializeField(fc, obj)); } |>;
        }
        return result;
    }

The code to write primitive fields is generated directly. Non-primitive fields are visited recursively. The code is then spliced together and returned. This example was inspired by a similar example in the Metaphor paper [14].

5. Type Safety

We now turn to formalizing a subset of Mint, called Lightweight Mint (LM), and to proving type safety. Type safety implies that scope extrusion is not possible in Mint.
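To make the phrase "scope extrusion" concrete, the following is a minimal sketch in plain Java, not Mint. It assumes code fragments are modelled as strings, and the names ScopeExtrusionSketch, stash, generateBinder, generateExtruded, and hasFreeX are all hypothetical. A side effect inside the generator lets a fragment that mentions x leave the scope in which x is bound, which is the kind of behaviour the type system is meant to rule out.

```java
/** String-based stand-in for code objects; an illustration only, not Mint's Code<T>. */
final class ScopeExtrusionSketch {
    /** Mutable field through which a generator can smuggle a code fragment. */
    static String stash = "";

    /** Generates code that binds x and, as a side effect, stashes a fragment mentioning x. */
    static String generateBinder() {
        String body = "x + 1";   // only meaningful where x is bound
        stash = body;            // the effect: the fragment leaves its binder's scope
        return "int x = 41; return " + body + ";";
    }

    /** Reuses the stashed fragment outside the binder, producing code with x free. */
    static String generateExtruded() {
        return "return " + stash + ";";
    }

    /** Crude check for this example only: x is mentioned but never declared. */
    static boolean hasFreeX(String code) {
        return code.contains("x") && !code.contains("int x");
    }
}
```

With strings nothing stops the second generator from emitting ill-scoped source; the error would only surface when the generated text is compiled.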
LM is based on Lightweight Java [18] (LJ), a subset of Java that includes imperative features. LM includes staging constructs (brackets, escapes, and run), assignments, and anonymous inner classes (AICs). These features, especially the staging constructs and AICs, make the operational semantics and type system large: staging constructs alone double the number of rules in the operational semantics, while AICs increase the complexity of the type system. All of these features, however, are necessary to capture the safety issues that arise in Mint. Specifically, assignments are required to cause many forms of scope extrusion, and AICs are required to create the scopes (i.e., the additional variable bindings) that can be extruded. AICs also lead to more complex possibilities for scope extrusion as shown in Section 3.2, which we wish to show are prevented by our system.

A significant development of the type system is the use of a sequence of store typings rather than a single store typing. This sequence is a stack that grows from left to right, where a new “frame” is pushed onto the stack when we enter a new scope (i.e., when new variables are bound) in a code object. Earlier frames can only refer to locations in later frames if the latter are code-free, ensuring that scope extrusion cannot occur through assignments. The key lemma involved in this approach is the Smashing Lemma, which allows a stack of n + 1 frames to be smashed into a valid stack of n frames by “smashing” the code-free locations in the last frame into the penultimate frame.

To simplify the formalism somewhat, we disallow assignments to local variables in LM. All assignments must instead be to object fields. This completely disallows assignments in escapes, however, in which assignments are only allowed to local variables. To rectify this problem, we add a restricted form of let, written as let x = new C(...) in ..., which always allocates a new instance of a class C which is not an AIC. We then relax the restrictions on escapes to allow field assignments if the object containing the field was allocated by a let inside the escape. Local variable assignment can then be modeled by replacing any local variable binding x of type C for which there is an assignment by a let-binding of a new variable x_cell of type CCell, defined as follows:

    public class CCell { public C x; }

Uses of x, including assignments to x, can then be replaced by uses of x_cell.x.

$$
\begin{array}{lrcl}
\text{extensible class names} & D \\
\text{final class names} & F \\
\text{variables} & x \\
\text{field names} & f \\
\text{method names} & m \\
\text{heap locations} & l \\
\text{classes} & C & ::= & D \mid F \\
\text{separability marker} & S & ::= & \mathsf{sep} \mid \mathsf{insep} \\
\text{types} & \tau & ::= & C \mid \mathsf{Code}\langle S, \tau\rangle \\
\text{class declarations} & CL & ::= & \mathsf{class}\ C\ \mathsf{extends}\ D\ \{\ \overline{\tau_i\ f_i}^{I};\ \overline{M_j^0}^{J}\ \} \\
\text{method declarations} & M^n & ::= & S\ \tau\ m(\overline{\tau_i\ x_i})\{e^n\} \\
\text{class hierarchy} & P & ::= & \overline{CL_i} \\
\text{programs} & p & ::= & P, e^0 \\
\text{expressions} & e^n & ::= & x \mid l \mid e^n.f \mid (e^n.f := e^n) \mid e^n.m(\overline{e_i^n}) \\
 & & \mid & \mathsf{let}\ x \Leftarrow \mathsf{new}\ C(\overline{e_i^n})\ \mathsf{in}\ e^n \mid \mathsf{new}\ D(\overline{e_i^n}^{I})\ \{\overline{M_j^n}^{J}\} \\
 & & \mid & \langle| e^{n+1} |\rangle \mid \text{‘}e^{n-1}\ [n > 0] \mid e^n.\mathsf{run}() \\
\text{values} & v^n & ::= & l \mid e^{n-1}\ [n > 0]
\end{array}
$$

NB: Production rules marked [n > 0] can be used only if n > 0.

Figure 4. Lightweight Mint syntax.

5.1 Syntax

In this section, we formalize the syntax of LM. We use the following sequence notation:

Notation. We write $\overline{A_i}_{i=I}^{J}$ for a sequence with index $i$ ranging over $I..J$, inclusive. $I$ may be omitted, and it defaults to 1. The superscript is omitted in addition if the index range is clear from context. In general, sequences indexed by different variables have different bounds. The sequence may be explicitly written out like $a, b, c, \ldots$ with no subscript. The empty sequence is written . Given sequences $s_1$ and $s_2$, their concatenation is written $s_1 \circ s_2$. We may write $\overline{A_i}, A$ to mean $\overline{A_i} \circ A$ if the intention is clear. $\overline{e_i}_{i=I}^{J}[i_0 \mapsto x]$ is the same sequence as $\overline{e_i}_{i=I}^{J}$ except that $e_{i_0}$ is replaced by $x$.

The syntax of LM is given in Figure 4. Expressions are stratified into levels. An expression is at level n if, for every point in the expression, the nesting of escapes is at most n levels deeper than brackets. Clearly, a level-n expression is also a level-(n + 1) expression. This stratification induces a similar structure on method declarations. A complete program must not have any unmatched escapes, so the bodies of methods declared in the class hierarchy are required to be at level 0. Likewise, the initial expression in a program is required to be at level 0. Values are also stratified: a value at level 0 is just a heap location, and a value at level > 0 is any lower-level expression.

We categorize classes as final (F) or extensible (D) depending upon their names. In the implementation, they are rather categorized according to the manner in which they are declared, but using disjoint sets of names gives a simpler system. $\mathsf{Code}\langle S, \tau\rangle$ falls under neither classification. We do not allow an AIC to have fields or methods that its parent does not, although we allow method overrides. Additional fields or methods can be emulated by declaring (statically) a new subclass with those fields and creating anonymous subclasses of those.

We do not include the syntax (new C(...)) for instantiating ordinary (i.e., non-AIC) classes because one can write (let x ⇐ new C(...) in x) instead. Sequencing (e1; e2) is also omitted because this sequence can be written seq.call(e1, e2), where seq.call is a method that ignores its first argument and returns its second.

As a technical point, the code type is indexed by a separability marker which indicates whether a code object is itself separable. Specifically, $\mathsf{Code}\langle\mathsf{sep}, \tau\rangle$ is the type of code objects containing separable code, which is a subtype of the standard code type, written $\mathsf{Code}\langle\mathsf{insep}, \tau\rangle$.
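The cell encoding of local-variable assignment described in the introduction to this section can be sketched in ordinary Java. The sketch below assumes an int-valued cell; IntCell, CellEncoding, sumDirect, and sumWithCell are hypothetical names chosen for illustration, with IntCell playing the role of CCell.

```java
/** Plays the role of CCell from the text, specialised to int for this sketch. */
class IntCell {
    public int x;
}

final class CellEncoding {
    /** Direct version: the local variable total is assigned to in the loop. */
    static int sumDirect(int[] values) {
        int total = 0;
        for (int v : values) {
            total = total + v;
        }
        return total;
    }

    /** Encoded version: total becomes a freshly allocated cell, and every use of
     *  total (including assignments) becomes a use of the field total_cell.x. */
    static int sumWithCell(int[] values) {
        final IntCell total_cell = new IntCell();   // corresponds to the let-bound allocation
        total_cell.x = 0;
        for (int v : values) {
            total_cell.x = total_cell.x + v;        // field assignment only
        }
        return total_cell.x;
    }
}
```

After the rewrite the only assignments left are field assignments on an object allocated in the same scope, which is the form the relaxed escape restriction permits.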
This distinction is necessary in the case of a separable expression which itself contains a nested escape ~e, since we must know for type preservation that ~e is guaranteed to reduce only to separable code. In this case, e must have type $\mathsf{Code}\langle\mathsf{sep}, \tau\rangle$.

All judgments and functions in the following discussions implicitly take a class hierarchy P as a parameter. We avoid writing it out explicitly because it is fixed for each program and there is no fear of confusion.

5.2 Operational Semantics

Figure 5 shows preliminary definitions that we need for the operational semantics. A heap is a finite mapping from locations to heap elements, where a heap element contains a runtime type tag with either the contents of the object or a code value if the tag is Code. We use the phrase pseudo-expressions to refer to syntactic elements that are either expressions or method declarations, and similarly we use pseudo-values to refer to values or method declarations.

An evaluation context $E^{n,k}$ is indexed by two levels, the level n outside of the context and the level k inside. The intent is for any well-typed level-n expression to be decomposed uniquely as $E^{n,k}[r^k]$ where $r^k$ is a redex at level k, unless the expression is a (level-n) value. There are two variants of evaluation contexts, one that yields an expression ($E_e^{n,k}$) when plugged in, and one that yields a method declaration ($E_M^{n,k}$). Both variants can be plugged with expressions only.

The function fields() extracts the fields of a type, while the method() function looks up a method. method() respects the method overriding rules. mbody() extracts the specified method’s formal arguments and body. Code types do not have methods (run() is formally not a method). mname extracts the method name from a method declaration.

Operational terms:

$$
\begin{array}{lrcl}
\text{heaps} & H & : & l \xrightarrow{\mathrm{fin}} h \\
\text{runtime type tags} & T & ::= & C \mid \mathsf{sub}\ D\ \{\overline{M_i^0}\} \mid \mathsf{Code} \\
\text{heap elements} & h & ::= & (C, \overline{l_i}) \mid (\mathsf{Code}, \langle| e^0 |\rangle) \mid (\mathsf{sub}\ D\ \{\overline{M_i^0}\}, \overline{l_j}) \\
\text{pseudo-expressions} & \hat e^n & ::= & e^n \mid M^n \\
\text{pseudo-values} & \hat v^n & ::= & v^n \mid M^{n-1}\ [n > 0]
\end{array}
$$

Evaluation contexts:

$$
\begin{array}{rcl}
E^{n,k} & ::= & E_e^{n,k} \mid E_M^{n,k} \\
E_M^{n,k} & ::= & S\ \tau\ m(\overline{\tau_i\ x_i})\{E_e^{n,k}\}\ [n > 0] \\
E_e^{n,k} & ::= & \bullet\ [n = k] \mid E_e^{n,k}.f \mid (E_e^{n,k}.f := e^n) \mid (v^n.f := E_e^{n,k}) \mid E_e^{n,k}.m(\overline{e_i^n}) \\
 & \mid & v^n.m(\overline{v_i^n}, E_e^{n,k}, \overline{e_j^n}) \\
 & \mid & \mathsf{let}\ x \Leftarrow \mathsf{new}\ C(\overline{v_i^n}, E_e^{n,k}, \overline{e_j^n})\ \mathsf{in}\ e^n \\
 & \mid & \mathsf{let}\ x \Leftarrow \mathsf{new}\ C(\overline{v_i^n})\ \mathsf{in}\ E_e^{n,k}\ [n > 0] \\
 & \mid & \mathsf{new}\ D(\overline{v_i^n}, E_e^{n,k}, \overline{e_j^n})\{\overline{M_a^n}\} \\
 & \mid & \mathsf{new}\ D(\overline{v_i^n})\{\overline{M_j^{n-1}}, E_M^{n,k}, \overline{M_a^n}\}\ [n > 0] \\
 & \mid & \langle| E_e^{n+1,k} |\rangle \mid \text{‘}E_e^{n-1,k}\ [n > 0] \mid E_e^{n,k}.\mathsf{run}()
\end{array}
$$

fields(τ) or fields(T):

$$
\begin{array}{l}
\mathrm{fields}(\mathsf{Object}) = \mathrm{fields}(\mathsf{Code}) = \mathrm{fields}(\mathsf{Code}\langle S, \tau\rangle) = \text{(the empty sequence)} \\
\mathrm{fields}(\mathsf{sub}\ D\ \{\overline{M_i^0}\}) = \mathrm{fields}(D) \\
\mathrm{fields}(C) = \overline{\tau_i\ f_i} \circ \mathrm{fields}(D) \quad \text{where } \mathsf{class}\ C\ \mathsf{extends}\ D\ \{\overline{\tau_i\ f_i}; \ldots\} \in P
\end{array}
$$

mname($M^n$):

$$
\mathrm{mname}(S\ \tau\ m(\ldots)\{\ldots\}) = m
$$

method(m, τ) or method(m, T):

$$
\begin{array}{l}
\mathrm{method}(m, \mathsf{sub}\ D\ \{\overline{M_i^0}\}) =
\begin{cases} M_i^0 & \text{if } \mathrm{mname}(M_i^0) = m \\ \mathrm{method}(m, D) & \text{otherwise} \end{cases} \\[2ex]
\mathrm{method}(m, C) =
\begin{cases} M_j^0 & \text{if } \mathrm{mname}(M_j^0) = m \\ \mathrm{method}(m, D) & \text{otherwise} \end{cases}
\quad \text{assuming that } \mathsf{class}\ C\ \mathsf{extends}\ D\ \{\ldots; \overline{M_j^0}\} \in P
\end{array}
$$

mbody($M^0$) or mbody(m, τ) or mbody(m, T):

$$
\begin{array}{l}
\mathrm{mbody}(S\ \tau\ m(\overline{\tau_i\ x_i})\{e^0\}) = (\overline{x_i}, e^0) \\
\mathrm{mbody}(m, \tau) = \mathrm{mbody}(\mathrm{method}(m, \tau)) \\
\mathrm{mbody}(m, T) = \mathrm{mbody}(\mathrm{method}(m, T))
\end{array}
$$

Variables returned by mbody are always fresh.

Figure 5. Preliminary definitions for operational semantics.

Figure 6 shows the small-step semantics for Lightweight Mint. This is given as the judgment $H_1, e_1 \xrightarrow{\,n\,} H_2, e_2$ stating that heap $H_1$ and expression $e_1$ take a single step at level n to heap $H_2$ and expression $e_2$. This judgment is the closure under n,k-evaluation contexts of the primitive one-step relation $\xrightarrow[\mathrm{prim}]{\,k\,}$ at level k. Most of the primitive reduction steps are straightforward, including rules for class instantiation, method invocation, and assignment. These reductions only occur at level 0, to prevent reductions from occurring inside code objects. Since local variables are immutable, we model method invocation and let-form execution by substitution.
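The level stratification from Section 5.1, which indexes both expressions and evaluation contexts, can be computed mechanically. The sketch below assumes a deliberately tiny term model: Term and its factory methods var, bracket, escape, run, and compound are hypothetical names, and compound stands in for every non-staging form such as field access, assignment, or method call.

```java
/** Tiny AST for staging constructs; a hypothetical model, not LM's full syntax. */
abstract class Term {
    /** Smallest n such that this term is a level-n expression. */
    abstract int level();

    /** Variables and locations contain no staging constructs. */
    static Term var(String name) {
        return new Term() { int level() { return 0; } };
    }

    /** Brackets: a level-(n+1) body yields a level-n term, never below 0. */
    static Term bracket(final Term body) {
        return new Term() { int level() { return Math.max(0, body.level() - 1); } };
    }

    /** Escapes: a level-(n-1) body yields a level-n term, so the result is at least 1. */
    static Term escape(final Term body) {
        return new Term() { int level() { return body.level() + 1; } };
    }

    /** e.run() stays at the level of e. */
    static Term run(final Term body) {
        return new Term() { int level() { return body.level(); } };
    }

    /** Any other compound form takes the maximum level of its parts. */
    static Term compound(final Term... parts) {
        return new Term() {
            int level() {
                int max = 0;
                for (Term p : parts) max = Math.max(max, p.level());
                return max;
            }
        };
    }
}
```

A term is acceptable as a method body in the class hierarchy, or as the initial expression of a program, exactly when level() returns 0 in this model.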
The local binding L found in LJ [18] and similar formalisms is therefore unnecessary, and the small-step judgment is made between heap-term pairs rather than bindings-heap-term triples. Note that using substitution is not the same as a substitution-based semantics such as discussed in Section 3.2, because substitution here does not substitute into data in the heap.

There are also three staging-related reduction rules, for escape, run, and brackets. The rules for escape and run remove an expression from its brackets, with the only difference being that escape reduces only at level 1 (escape is illegal at level 0) and run only reduces at level 0. These are standard in multi-stage languages [19], except that the code values are on the heap. The rule for brackets allocates a code object on the heap. CSP, which can be regarded as execution at arbitrarily high levels, is automatically taken care of by substitution and does not give rise to a redex.

Primitive steps $H, \hat e^n \xrightarrow[\mathrm{prim}]{\,n\,} H, \hat e^n$:

$$
\begin{gathered}
\dfrac{l \notin \mathrm{dom}\, H}
      {H,\ \mathsf{new}\ D(\overline{l_i})\{\overline{M_j^0}\} \xrightarrow[\mathrm{prim}]{\,0\,} H[l \mapsto (\mathsf{sub}\ D\ \{\overline{M_j^0}\}, \overline{l_i})],\ l}
\\[2ex]
\dfrac{H(l) = (T, \overline{l_i}) \qquad \mathrm{fields}(T) = \overline{\tau_i\ f_i}}
      {H,\ l.f_{i_0} \xrightarrow[\mathrm{prim}]{\,0\,} H,\ l_{i_0}}
\qquad
\dfrac{H(l) = (T, \overline{l_i})}
      {H,\ (l.f_{i_0} := l') \xrightarrow[\mathrm{prim}]{\,0\,} H[l \mapsto (T, \overline{l_i}[i_0 \mapsto l'])],\ l'}
\\[2ex]
\dfrac{H(l) = (T, \ldots) \qquad \mathrm{mbody}(m, T) = (\overline{x_i}, e^0)}
      {H,\ l.m(\overline{l_i}) \xrightarrow[\mathrm{prim}]{\,0\,} H,\ [\overline{l_i}/\overline{x_i}][l/\mathsf{this}]e^0}
\\[2ex]
\dfrac{l \notin \mathrm{dom}\, H}
      {H,\ \mathsf{let}\ x \Leftarrow \mathsf{new}\ C(\overline{l_i})\ \mathsf{in}\ e^0 \xrightarrow[\mathrm{prim}]{\,0\,} H[l \mapsto (C, \overline{l_i})],\ [l/x]e^0}
\\[2ex]
\dfrac{H(l) = (\mathsf{Code}, \langle| e^0 |\rangle)}
      {H,\ \text{‘}l \xrightarrow[\mathrm{prim}]{\,1\,} H,\ e^0}
\qquad
\dfrac{H(l) = (\mathsf{Code}, \langle| e^0 |\rangle)}
      {H,\ l.\mathsf{run}() \xrightarrow[\mathrm{prim}]{\,0\,} H,\ e^0}
\\[2ex]
\dfrac{l \notin \mathrm{dom}\, H}
      {H,\ \langle| e^0 |\rangle \xrightarrow[\mathrm{prim}]{\,0\,} H[l \mapsto (\mathsf{Code}, \langle| e^0 |\rangle)],\ l}
\end{gathered}
$$

Steps $H, \hat e^n \xrightarrow{\,n\,} H, \hat e^n$:

$$
\dfrac{H_1,\ e_1^k \xrightarrow[\mathrm{prim}]{\,k\,} H_2,\ e_2^k}
      {H_1,\ E^{n,k}[e_1^k] \xrightarrow{\,n\,} H_2,\ E^{n,k}[e_2^k]}
$$

Figure 6. Small-step semantics for Lightweight Mint.

5.3 Type System

Figure 7 gives preliminary definitions for the type system. A variable typing (or type environment) comes in pairs, separated by a |. The predicate iscf(τ) means that τ is code-free. Note that iscf() is defined co-inductively. The auxiliary functions ftypes(), ftype_i(), and mtype() are similar to those defined for the operational semantics, but they extract type information.

Typing terms:

$$
\begin{array}{lrcl}
\text{variable typing} & \Gamma & : & x \xrightarrow{\mathrm{fin}} \tau^n \\
\text{store typing} & \Sigma & : & l \xrightarrow{\mathrm{fin}} \tau \\
\text{variable typing pair} & \hat\Gamma & ::= & (\Gamma \mid \Gamma) \\
\text{pseudo-types} & \hat\tau & ::= & \tau \mid \overline{\tau_i} \xrightarrow{S} \tau
\end{array}
$$

iscf(τ):

$$
\neg\,\mathrm{iscf}(\mathsf{Code}\langle S, \tau\rangle)
\qquad
\neg\,\mathrm{iscf}(D)
\qquad
\dfrac{\exists i.\ \neg\,\mathrm{iscf}(\mathrm{ftype}_i(F))}{\neg\,\mathrm{iscf}(F)}
$$

NB: Object is a D.

cf(Σ) and locs($\hat e$):

$$
\begin{array}{l}
\mathrm{cf}(\Sigma) = \Sigma|_L \quad \text{where } L = \{ l \in \mathrm{dom}(\Sigma) : \mathrm{iscf}(\Sigma(l)) \} \\
\mathrm{locs}(\hat e) = \{ l : l \text{ is a subterm of } \hat e \}
\end{array}
$$

ftypes(τ), ftype_i(τ) or ftype(f, τ):

$$
\begin{array}{l}
\mathrm{ftypes}(\tau) = \overline{\tau_i} \quad \text{assuming } \mathrm{fields}(\tau) = \overline{\tau_i\ f_i} \\
\mathrm{ftype}_i(\tau) = \mathrm{ftype}(f_i, \tau) = \tau_i \quad \text{assuming } \tau_i\ f_i \in \mathrm{fields}(\tau)
\end{array}
$$

mtype(m, τ) or mtype($M^0$):

$$
\begin{array}{l}
\mathrm{mtype}(S\ \tau\ m(\overline{\tau_i\ x_i})\{e^0\}) = \overline{\tau_i} \xrightarrow{S} \tau \\
\mathrm{mtype}(m, \tau) = \mathrm{mtype}(\mathrm{method}(m, \tau)) \quad \text{assuming } \mathsf{class}\ C\ \mathsf{extends}\ D\ \{\ldots\} \in P
\end{array}
$$

Figure 7. Preliminary definitions for the type system.

Figure 8 shows the type system. The top-level judgment $\vdash p$ asserts that program p is well-formed. This ensures that p is a valid “initial state” of execution: the class hierarchy P contained in p must be well-formed; the expression e contained in p must be well-typed; and e must contain no store locations. This last check must be explicitly added here, because the typing rules for let forms and AICs allow frames to be pushed onto the stack of store typings.

A class hierarchy P is well-formed, given by judgment $\vdash P$, if P is acyclic, field names and types (including inherited ones) do not clash within each class, and each class is well-formed. We omit a formalization of the first two checks but will use them implicitly by assuming that auxiliary functions like fields(τ), mtype(m, τ) are always unambiguous and that the sequence returned by fields is finite and has no duplicates. Classes are well-formed if they contain no locations, their methods are well-typed, and any methods they share with their superclass have the same type as in the superclass.

The bottom half of Figure 8 concerns typing for pseudo-expressions.
This is given by the judgment $\overline{\Sigma_i}; \hat\Gamma \vdash^n \hat e^n : \tau \mid S$, which states that the pseudo-expression $\hat e^n$ has type τ at level n under the stack $\overline{\Sigma_i}$ of store typings and the pair $\hat\Gamma$ of contexts. If S = sep, this judgment further states that the pseudo-expression $\hat e^n$ is weakly separable. The reason the variable typing $\hat\Gamma$ is partitioned into two parts is to check weak separability: the right part of $\hat\Gamma$ contains the variables that were bound within the current method or enclosing escape. These are the variables whose fields can be assigned to without violating weak separability. We always assume that no variables are repeated in $\hat\Gamma$ and no locations are repeated in $\overline{\Sigma_i}$.

Most of the rules for typing pseudo-expressions are straightforward. The first rule generalizes subtypes to supertypes. The next two rules look up the types for variables and locations in the context and store typing, respectively, where CSP is only allowed (by allowing k or n, respectively, to be non-zero) if the associated type is code-free. Further, in order for a variable to be typed as separable, it must occur in the second half of the context pair. The next rule types let-expressions by extending the current context with the let-bound variable, while the rule after types field lookups by typing the object and then looking up the relevant field type. Note that, in typing the body of a let form, a new frame Σ can be added to the current stack $\overline{\Sigma_i}$, to allow for the possibility of heap locations containing code objects with the variable x free.

The next three rules type field assignments ($e_1.f := e_2$) by checking that the type of $e_1$ is some $\tau_1$ and then checking that the type $\tau_2$ of $e_2$ is the appropriate field type of $\tau_1$. The first of these rules applies to arbitrary $e_1$, and requires $\tau_2$ to be code-free if the assignment is to be weakly separable.
The second and third rules for assignments allow the assignment to be weakly separable if either e1 is a variable in the right half of Γ, or e1 is a location in the last frame of the store typings and the whole assignment is typable at level 0, respectively. τ τ τ1 τ Object C C τ2 τ1 τ2 τ3 τ3 Σi i ; Γ n sub D { Min } n Σi i ; Γ1 |Γ2 , this : Dn n Mj : τj |Sj j n n > 0 ∨ dom(∪i Σi ) ⊇ locs(sub D { Mj j }) n n Σi i ; Γ1 |Γ2 sub D { Mj j } n where τj = mtype(mname(Mj ), D). τ Code S , τ τ Code S , τ Code sep, τ Code insep, τ Σi i ; Γ n class C extends D { . . . } ∈ P C D M n : τi i → τ |S n S Σi i , Σ; Γ1 , Γ2 , xi : τin i |∅ Σi i ; Γ1 |Γ2 n en : τ |S i S τ m( τi xi i ){en } : τi → τ |S S p P ; ∅|∅ 0 Σi i ; Γ e0 : τ |S P, e0 Σi i ; Γ locs(e0 ) = ∅ H ∀l ∈ dom(∪i Σi ). Σi i ; Γ Σi i ; Γ h:τ h:τ τ h:τ Σi i ; Γ Σi i ; Γ ≥1 0 H(l) : (∪i Σi )(l) H P Σi i ; Γ CLi i τ (∪i Σi )(lj ) Σi i ; Γ ftypej (C) (C, lj ) : C j acyclic no field names clash CLi i CLi i Σi i ; Γ |e0 | : Code S , τ |S (Code, |e0 | ) : Code S , τ ≥1 CL ; ∅|this : C 0 0 Mi0 i locs(Mi0 ) = ∅ i mtype(mname(Mi ), D) = undef or mtype(Mi0 ) class C extends D { τj fj j ; Mi0 i } n 0 0 sub D { Mj j } Σi i ; Γ (∪i Σi )(lk ) ftypek (D) k Σi i ; Γ i 0 (sub D { Mj j }, lk k ) : D where Γ ≥1 (x) = τ n ⇐⇒ Γ(x) = τ n ∧ n ≥ 1 (likewise for Γ). 
Σi i ; Γ τ en : τ |S τ Σi i ; Γ Σi i ; Γ n n n en : τ |S b Γ(x) = τ n Σi i ; Γ iscf(τ ) ∨ k = 0 n+k (∪i Σi )(l) = τ Σi i ; Γ iscf(τ ) ∨ n = 0 n e : τ |S b x : τ |S n l : τ |S 2 j=1 Σi i ; Γ1 |Γ2 n en : ftypej (C)|S j j Σi i , Σ; Γ1 , Γ2 |x : C n n en : τ |S Σi i ; Γ1 |Γ2 n (let x ⇐ new C( en j ) in en ) : τ |S j n Γ(x) = τ1 Σi i ; Γ1 |Γ2 n en : τ2 |sep ftype(f, τ1 ) = τ2 iscf(τ1 ) ∨ x ∈ dom Γ2 Σi i ; Γ1 |Γ2 n (x.f := en ) : τ2 |sep Σi i ; Γ Σi i ; Γ n en : τ |S Σi i ; Γ n en : τj |S j ftype(f, τ1 ) = τ2 Σi i ; Γ n en .f : ftype(f, τ )|S (e1 .f := e2 ) : τ2 |insep (∪i Σi )(l) = τ1 iscf(τ1 ) ∨ (n = 0 ∧ l ∈ dom ΣI ) ftype(f, τ1 ) = τ2 Σi I ; Γ n en : τ2 |sep i Σi I ; Γ i j n (l.f := en ) : τ2 |sep n Σi i ; Γ n en : τ |S Σi i ; Γ n en : τj |S j S mtype(m, τ ) = τj j → τ n Σi i ; Γ Σi i ; Γ Σi i ; Γ n n en : ftypej (D)|S j n sub D { Mk k } j Σi i ; Γ en .m( en j ) : τ |S j Σi i ; Γ n n new D( en j ) { Mk k } : D|S j Σi i ; Γ1 , Γ2 |∅ n+1 e : τ |S Σi i ; Γ1 |Γ2 n |e| : Code S , τ |S en : Code S , τ |sep n+1 Σi i ; Γ Σi i ; Γ n e : Code S , τ |S e.run()|insep Σi i ; Γ ‘e : τ |S n Figure 8. Type system for Lightweight Mint. The next rule, after those for assignment, types method calls by looking up the type of the given method, while the following rule types AICs by checking the class definition and the argument types. Finally, the last three rules type brackets, escape, and run, where typing |e| requires typing e at the next level and adds the code type, typing ‘e requires typing e at a code type on the previous level and removes the code type, and typing e.run() types e at a code type on the same level and removes the code type. Brackets can always be weakly separable, run is never weakly separable, and escapes ‘e are only weakly separable if e has type Code sep, τ . The remainder of Figure 8 has rules for the following judgments. The judgment Σi i ; Γ n sub D { Min } states that an AIC that subclasses D with method definitions Min is wellformed. 
This requires the methods Min to have the appropriate types. It also requires, if n = 0, that all the locations in the AIC are contained in dom(∪i Σi ), effectively ensuring that no new frames can be added to the stack of store typings. The judgment S Σi i ; Γ n M n : τi i → τ |S states that method M has input types τi , output type τ , and further is weakly separable if S = sep. Note that this rule is allowed to push a new frame onto the stack of store typings when the level n > 0. This is because there may be some locations in the store that contain code that include the free variables bound inside M . Note also that passing inside a method resets the vertical bar | in Γ to the end, indicating that weakly separable expressions in the method cannot freely access variables bound at or before the method M . The judgment Σi i ; Γ H states that the store H is wellformed under the given stack of store typings. This judgment includes the typing context Γ because the store may contain code with free variables. This judgment requires that, for all locations l in the stack of store typings, the heap for H(l) is well-typed. Note that there may be more locations in H than in the domain of Σi i , allowing the possibility that other frames could be pushed onto this stack. The judgment Σi i ; Γ h : τ is then used to state that heap form h has type τ . The rules for this judgment require that the expressions contained in the heap form h are well-typed. The typing context used to type these expressions is the restriction of Γ to the variables of level greater than 0. This is because heap forms are allowed to have code objects with free variables in them, but these free variables must be bound in other code objects, meaning they must have been bound at level greater than 0. 
Note that, as a side effect of these definitions, if $\overline{\Sigma_i}; \Gamma \vdash H$ holds then H restricted to $\mathrm{dom}(\cup_i \Sigma_i)$ is closed under reachability, meaning that no location in this domain can reference a location outside of it.

5.4 Soundness

Type soundness is proved by the usual Preservation and Progress lemmas. Progress is proved with the following lemma:

Lemma 1 (Unique Decomposition). If $\overline{\Sigma_i}; \hat\Gamma \vdash^n \hat e^n : \tau \mid S$ and $\hat e^n$ is not a pseudo-value then $\hat e^n$ is uniquely decomposed as $\hat e^n = E^{n,m}[r^m]$, where = denotes syntactic equality modulo α-conversion.

Proof. By straightforward induction on $\hat e^n$.

Our statement of Unique Decomposition implies Progress because any well-typed expression is either a value or contains a redex that can be contracted by the operational rules. In addition, uniqueness also ensures that our semantics is deterministic.

The proof of Preservation is more complicated. One technical difficulty is that there is no restriction on the additional frames that may be introduced by the typing rule for methods; i.e., this rule could add locations to the store typing that are not in the current heap. To address this problem, we introduce typing for configurations, or pairs of heaps and pseudo-expressions. The judgment $\overline{\Sigma_i}; \hat\Gamma \vdash^n (H, \hat e^n) : \tau \mid S$ then specifies that the configuration $(H, \hat e^n)$ is well-typed. The rules for this judgment are identical to those for pseudo-expression typing except that each rule also requires the heap H be well-formed with respect to the current environment $\overline{\Sigma_i}; \hat\Gamma$. For example, the rule for let forms becomes:

$$
\dfrac{\overline{\overline{\Sigma_i}; \hat\Gamma \vdash^n (H, e_j^n) : \mathrm{ftype}_j(C) \mid S}^{\,j}
       \qquad \overline{\Sigma_i}, \Sigma; \hat\Gamma, x : C \vdash^n (H, e^n) : \tau \mid S
       \qquad \overline{\Sigma_i}; \hat\Gamma \vdash H}
      {\overline{\Sigma_i}; \hat\Gamma \vdash^n (H,\ \mathsf{let}\ x \Leftarrow \mathsf{new}\ C(\overline{e_j^n})\ \mathsf{in}\ e^n) : \tau \mid S}
$$

A second technical difficulty is that a reduction step inside a let form or AIC that pushes a new frame Σ onto $\overline{\Sigma_i}$ might modify a code-free location in $\mathrm{dom}(\cup_i \Sigma_i)$ to reference a location in the new frame Σ. The resulting heap would thus not be well-formed under $\overline{\Sigma_i}$, because this portion of the heap would not be closed under reachability. To deal with this problem requires the Smashing Lemma, which smashes the last two Σ’s of $\overline{\Sigma_i}$ into one, giving a shorter store typing sequence.

Lemma 2 (Smashing). If

1. $\overline{\Sigma_i}^{I}; \Gamma_1 \mid \Gamma_2 \vdash H_1$
2. $H_1|_L = H_2|_L$ where $L = \mathrm{dom}(\cup_i \Sigma_i) - \mathrm{dom}(\mathrm{cf}(\cup_i \Sigma_i))$
3. $\overline{\Sigma_i}^{I}, \Sigma; \Gamma_1' \mid \Gamma_2' \vdash H_2$
4. $\Gamma_1' \cup \Gamma_2' \supseteq \Gamma_1 \cup \Gamma_2$

then $\overline{\Sigma_i}^{I-1}, (\Sigma_I \cup \mathrm{cf}(\Sigma)); \Gamma_1' \mid \Gamma_2' \vdash H_2$.

Note that the Smashing Lemma is at the heart of proving the absence of scope extrusion, as it states that any code locations that could potentially cause scope extrusion are not reachable outside their respective scopes.

We are now ready to prove Preservation. The statement below is an abridged version. For technical reasons, we need to add some more hypotheses and conclusions to make the proof work. The details of those technicalities are left to the Appendix.

Lemma 3 (Preservation). If $\overline{\Sigma_i}, \Sigma_R; \Gamma_1 \mid \Gamma_2 \vdash^n (H_1, \hat e_1^n) : \tau \mid S$ and $(H_1, \hat e_1^n) \xrightarrow{\,n\,} (H_2, \hat e_2^n)$, then $\exists \Sigma_R'$ such that

1. $\Sigma_R' \supseteq \Sigma_R$
2. $\overline{\Sigma_i}, \Sigma_R'; \Gamma_1 \mid \Gamma_2 \vdash^n (H_2, \hat e_2^n) : \tau \mid S$
3. $H_1|_L = H_2|_L$ where $L = \mathrm{dom}(\cup_i \Sigma_i) - \mathrm{dom}(\mathrm{cf}(\cup_i \Sigma_i))$

Proof is by induction on the typing judgment, and is given in the appendix.

6. Implementation

To verify the expressivity of the design and obtain performance results, we created an implementation of Mint by modifying OpenJDK [15], a Java Development Kit (JDK) based entirely on open source. Since we only modified the compiler and maintain full binary compatibility, the generated class files can be executed with any Java Runtime Environment, version 6 or higher. The only change required when running multi-stage programs is the placement of a small library on the boot classpath, making the compiler for future-stage code available.

The compiler included in OpenJDK contains a pretty printer geared towards converting abstract syntax trees (ASTs) to Java source that can be compiled again. By using the pretty printer, we are able to generate source for future stages with minimal changes to the compiler. The fact that we are generating human-readable source also has debugging benefits. In the future, the performance of compiling code objects may be increased by serializing and deserializing the ASTs directly, thereby circumventing the compiler’s parser.

After the source input has been parsed and entered into symbol tables, the OpenJDK compiler without our modifications proceeds in five phases. The Mint compiler adds a sixth stage, called Staging Translation, yielding the following stages in order:

• Attribution: Names and expressions in the AST are resolved and types are assigned to the AST nodes. Most type errors are detected at this stage.
• Flow Analysis: Unreachable code and the use of uninitialized variables are detected.
• Staging Translation: Brackets are translated into ASTs that create code objects. This phase was introduced in the Mint compiler and does not exist in the original OpenJDK.
• Type Translation: Generic type information is erased.
• Lowering: “Syntactic sugar” like inner classes and foreach loops are replaced by simpler constructs.
• Generation: Bytecode is generated for the AST and class files are written.

The main modifications to the OpenJDK compiler, other than adding an additional compilation stage, were in Attribution. In Attribution, we perform the type-checking necessary for brackets, escape, and run, ensuring specifically that the body of each escape is weakly separable. Attribution also checks the separability of methods and constructors declared with the separable modifier and reports errors if unsafe operations are performed. Finally, Attribution also records the stage at which a variable is defined and the stage at which it is used: if the variable is used at a later stage than it is defined, it will be prepared for cross-stage persistence (CSP). If the variable is used in an earlier stage than the one it is defined in, an error is reported.

During Staging Translation, the new compiler stage in the Mint compiler, each bracket is replaced with a constructor call to MSPTreeCode, a concrete implementation of Code, which is given as an interface in Mint. The body of the bracket is passed to the constructor for MSPTreeCode as a simplified tree in which most of the AST has been converted into strings using the pretty printer; only escapes, CSP variables, and variable identifiers to be gensym-renamed are maintained as separate nodes.

Mint also extends Java to include a let construct to bind values in an expression, as opposed to a statement block. For instance, the expression let int x=1, y=2*x; 3*y evaluates to 6, and the scope of y begins after the comma. Our let construct therefore matches LISP’s let*. We employ the let construct in our implementation to store the freshly generated names when we rename variables in brackets to avoid accidental capture. For example, in the program fragment

    final int csp = 1;
    Code<Integer> c = <| 123 |>;
    Code<Integer> x = <| let int lv = 1;
                         2 * csp + 3 * ‘c + lv |>;

the body of the bracket in the last two lines is translated into a data structure containing:

• the string "let int "
• a gensym for lv
• the string "= 1; 2 * "
• the AST of CSP variable csp
• the string " + 3 * "
• the AST of escaped expression c
• the string " + "
• and a gensym for lv.

The node that represents an escape in a bracket body stores the AST of the expression that was escaped; a reference to a CSP variable contains the AST of the identifier. Furthermore, all variables introduced inside brackets are gensym-renamed [5]. For each such variable that needs to be renamed, a let expression binds a dynamically created, fresh name to a string variable (gensym$$1 in the example below), and the value of that string variable is used wherever the identifier to be renamed used to occur. More concretely, the code generated for the last bracket in the example above is approximately as follows:

    Code<Integer> x =
        let final String gensym$$1 = varGenSym();
        new MSPTreeCode(
            new InteriorNode(
                new StrTree("let int "),
                new StrTree(gensym$$1),
                new StrTree("= 1; 2 * "),
                new CSPTree(csp),
                new StrTree(" + 3 * "),
                MSPTreeCode.escape(c),
                new StrTree(" + "),
                new StrTree(gensym$$1)
            ));

This code first creates a gensym called gensym$$1 for the variable lv. It then creates an MSPTreeCode containing a tree of all the objects mentioned above: StrTree is used for nodes containing strings, including the string given by gensym$$1; CSPTree is used for CSP variables; and MSPTreeCode.escape(c) is used to implement escapes, by copying the tree contained in the code object c.

When a code object is run, the proper values are filled in for escapes, CSP variables, and gensym-renamed identifiers. The entire tree is flattened and pasted into a template to create the Java source of a class implementing Code, with the bracket’s body in its run method. The name for this class is also generated fresh. Escapes and gensym-renamed identifiers are simple to process: the subtrees included by escapes are processed recursively, and renamed identifiers are treated like strings. CSP variables, on the other hand, are not in scope inside the new code object and need to be treated specially: the code object contains an Object array, called the CSP table, that is initialized with the values of the CSP variables in the constructor. References to CSP variables are replaced with array accesses. The source that is compiled when the code object x in the example above is run looks like this:

    public class $$Code1$$ implements SafeCode<Integer> {
        private Object[] ct;
        public $$Code1$$(Object[] t) { ct = t; }
        public Integer run() {
            return (let int var$$$1 = 1;
                    2 * ((Integer) ct[0]) + 3 * (123) + var$$$1);
        }
    }
The fresh symbol var$$$1 has been substituted for all occurrences of variable lv. The body of the escaped code object c is present without overhead, as desired. The reference to the CSP variable csp has been replaced by an access to the CSP table ct.

The source is passed as a string to the Mint compiler, where it is parsed, analyzed, and translated as described above. The compiler then generates bytecode in memory. Since a single compilation unit in Java may be compiled into several class files, e.g. because of inner classes, the compiler returns a set of class name-bytecode pairs. Since anonymous inner classes are assigned names in the compiler using an internal numbering scheme, the class names have to be returned along with the generated bytecode.

The bytecode for the generated classes is added to a hash table, with the class names used as keys. A custom class loader intercepts attempts by the Java virtual machine (JVM) to load a class and checks if the hash table has bytecode available for the requested class. If so, the bytecode generated by the Mint compiler is used; otherwise, the custom class loader uses Java's default class loader, which attempts to load the class from a file. A new instance of the generated class is created using reflection and the values of the CSP variables are passed to the constructor, filling the code object's CSP table. Finally, the new instance's run method is called to execute the code in the bracket.

It is important to use the custom class loader for all classes, not just those generated from brackets. If the same class is loaded by different class loaders, the JVM considers their instances incompatible and throws a ClassCastException if an object is assigned to a variable of the same class loaded by another class loader. This problem can be avoided by installing the custom class loader in a small launcher application before the program's main method is executed.
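The table lookup performed by the custom class loader can be sketched as follows. This is a minimal illustration, not Mint's actual loader: the class name TableClassLoaderSketch and the helpers register, tryLoad, and readClassBytes are hypothetical, and the bytecode placed in the table here is simply a copy of an already compiled class rather than the output of a compiler run.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TableClassLoaderSketch extends ClassLoader {
    /** Stand-in for the hash table of generated bytecode, keyed by class name. */
    private final Map<String, byte[]> bytecode = new ConcurrentHashMap<>();

    public TableClassLoaderSketch(ClassLoader parent) { super(parent); }

    /** Adds one class name-bytecode pair to the table. */
    public void register(String className, byte[] classFile) { bytecode.put(className, classFile); }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                byte[] b = bytecode.get(name);
                if (b == null) {
                    return super.loadClass(name, resolve); // not in the table: default loading
                }
                c = defineClass(name, b, 0, b.length);     // in the table: use the stored bytecode
            }
            if (resolve) resolveClass(c);
            return c;
        }
    }

    /** Convenience wrapper that returns null instead of throwing. */
    public Class<?> tryLoad(String name) {
        try { return loadClass(name); } catch (ClassNotFoundException e) { return null; }
    }

    /** Reads a compiled class as bytes; returns null if it is not reachable as a resource. */
    public static byte[] readClassBytes(Class<?> c) {
        String resource = "/" + c.getName().replace('.', '/') + ".class";
        try (InputStream in = c.getResourceAsStream(resource)) {
            return in == null ? null : in.readAllBytes();
        } catch (IOException e) {
            return null;
        }
    }

    /** A trivial class used only for the demonstration. */
    public static class Greeter { }

    public static void main(String[] args) {
        TableClassLoaderSketch loader =
                new TableClassLoaderSketch(TableClassLoaderSketch.class.getClassLoader());
        // A class that is not in the table is delegated to the default loader.
        System.out.println(loader.tryLoad("java.lang.String") == String.class);
        byte[] bytes = readClassBytes(Greeter.class);
        if (bytes != null) {
            loader.register(Greeter.class.getName(), bytes);
            Class<?> copy = loader.tryLoad(Greeter.class.getName());
            // Same name, different defining loader: the JVM treats the two classes as distinct.
            System.out.println(copy != Greeter.class);
        }
    }
}
```

The second check in main illustrates why the same loader should be used for all classes: a class defined from the table and the copy loaded by the default loader share a name but are not interchangeable.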
The launcher is included in the runtime library, together with the Mint compiler and the Code and SafeCode interfaces.

7. Performance

In order to measure the performance impact of MSP in Mint, we have benchmarked a set of Mint examples. These include the following:
• power is the power example from Section 2.1, called with base 2 and exponent 17.
• fib recursively computes the 17th element of the generalized Fibonacci function starting from 2 and 3.
• mmult performs an optimized matrix multiplication, in which every 1 in the left matrix omits the floating-point multiplication at runtime and every 0 omits the multiplication and the addition. The benchmark is called with a four-dimensional rotation matrix as the left matrix and an arbitrary four-by-four matrix as the right.
• eval-fact calculates the factorial of 10 using the lint interpreter discussed in Section 4.2.
• eval-fib calculates the 10th number in the standard Fibonacci sequence using the lint interpreter.
• unroll performs the loop unrolling example of Section 4.3, using the accumulator discussed in that section over 10 iterations.
• serialize uses the serializer generator discussed in Section 4.4 to write the primitive fields contained in an object hierarchy two levels deep to an output stream.

Each operation in the benchmarking process (staged, gencode, compile, unstaged) is run for a number of repetitions so that the total time for that operation is 1-2 s. The average runtime of a single repetition is then calculated for each operation. Timings were recorded on an Apple MacBook with a 2.0 GHz Intel Core Duo processor, 2 MB of L2 cache, and 2 GB main memory, running Mac OS X Tiger.

The results are given in Figure 9. Performance improved in all cases. The speedups achieved range from 1.4 to 18.3, with speedup defined as unstaged time divided by staged time. The mmult and unroll benchmarks involved mostly tight for loops and could not be sped up substantially. On the other hand, the staged versions of power and fib reduced the call overhead involved in the recursive functions and executed almost five times faster than the unstaged code. Staging the lint interpreter improved the performance of the eval-fact and eval-fib benchmarks by about an order of magnitude. Finally, the serializer benchmark benefited the most from staging: the removal of call overhead and reflection reduced the execution time by a factor of 18.3. Note that the compiler overhead is currently significant. In the future, we hope to reduce this by circumventing the compiler's parser.

    Benchmark   speedup   unstaged (µs)   staged (µs)   gen (µs)   compile (µs)
    power       4.6x      0.079           0.017         1.3        33,000
    fib         4.1x      0.070           0.017         8.2        35,000
    mmult       1.5x      1.8             1.3           12.0       84,000
    eval-fact   8.4x      0.83            0.1           1.7        37,000
    eval-fib    10.0x     19.0            1.9           2.4        57,000
    unroll      1.4x      0.140           0.097         2.9        47,000
    serialize   18.0x     1.5             0.08          6.2        35,000

Figure 9. Benchmark results.

8. Related Work

Finding a practical static type system for safe imperative MSP has been a long-standing challenge. One of the earliest approaches to the problem introduced the notion of "closedness types" [3], which express that a code object has no free variables. Effects in this system are limited to closed code, so that no scope extrusion is possible. Two possible drawbacks of this approach are that (1) it requires additional program annotations to mark closed code, and (2) there are cases for which this constraint can be too restrictive.

Recently, Kameyama, Kiselyov, and Shan introduced a new approach to dealing with imperative MSP using delimited control [10, 11]. While the approach makes use of advanced control features (shift and reset), the essence of this approach is very similar to weak separability. Instead of limiting effects to be contained inside escapes, however, this work places an implicit reset just inside every variable-binding construct in code, so that no effects can move out of variable bindings. This is a strong version of the separability constraint used in Mint.

The system of Kim et al., and the more recent one by Aktemur extending this system, explicitly include the types of all free variables in the type of a code fragment [13, 1]. Because this would be too restrictive in a simply typed setting, record polymorphism (or rho polymorphism) is used in both proposals to make the types more flexible. There are two potential limitations to this approach. First, types can easily get quite big, and in languages where types must be explicitly stated (such as Java), this can be a burden on the programmer. A second, more technical point is that the practice of multi-stage programming shows that it is often convenient to use many common type conversions (isomorphisms) to convert a value from being a code of a function to a function that maps a code argument to a code result, and to use other similar conversions, known in the partial evaluation community as two-level eta-expansions. Many such expansions cannot be written in this system.

There have been a number of systems that combine MSP with object-oriented languages. Jumbo [12] adds MSP to Java, while Meta-AspectJ [24] adds MSP to AspectJ. Both of these systems ensure that generated code is syntactically well-formed, but they do not ensure that generated code satisfies type-checking, which means that code generation could fail at runtime. The Metaphor system [14] combines MSP with reflection in C#, allowing code objects that compute fields and types as well as expressions. Metaphor includes a type system that ensures that generated code passes typechecking, and that also allows simple operations on types such as conditions. Metaphor does not handle the scope extrusion problem, however.

There are also a number of other approaches to code generation in object-oriented languages.
Compile-time reflection [6] and the SafeGen [9] and MorphJ [8] systems are aimed at increasing the expressivity of object-oriented languages by adding a compile-time language of reflection and code generation. The compile-time languages allow the programmer to generate parts of class definitions in a generic manner by iterating over the methods and fields of the class, making it easy to write automatic unit testing and logging, for example. Other approaches are aimed at increasing performance. The JSpec system [16], for instance, performs automatic program specialization, which examines user code and unfolds method calls and other overheads that can be determined statically. Runtime code generation [17], in contrast, is a low-level means for a program to generate code at runtime in terms of virtual machine bytecodes, which has been shown to allow considerable speedup.

9. Conclusion

This paper has proposed a practical approach to adding MSP to mainstream languages in a type-safe manner that prevents scope extrusion. The approach is simpler than prior proposals, and we expect that it will be easily and intuitively understood by programmers. The key insight is that safety can be ensured with weak separability, which places straightforward restrictions on the forms and types of computational effects that occur inside escape expressions, so that these effects cannot cause code to leak outside of escapes. The proposal has been validated both by proving that weak separability is enough to ensure safety and by demonstrating by example that many useful MSP applications can still be written that adhere to these restrictions.

A future direction for this work is to try to simplify the idea of weak separability to more closely match the intuition behind the concept. We believe there is some system similar to environment classifiers, in which quantifying on type variables can be used to implicitly capture the property that we wish to express. Instead of quantifying a type variable at the occurrence of run() as in environment classifiers, however, we believe that weak separability can be expressed by quantifying a type variable at the occurrence of an escape. This would simplify the type system and possibly add more expressive power to the language.

Acknowledgments

We thank Yannis Smaragdakis for his helpful comments.

References

[1] Baris Aktemur. Type Checking Program Generators Using the Record Calculus, 2009. http://loome.cs.uiuc.edu/pubs/transformationForTyping.pdf.
[2] Davide Ancona and Eugenio Moggi. A fresh calculus for name management. In GPCE '04: Proceedings of the 3rd International Conference on Generative Programming and Component Engineering, volume 3286, pages 206–224, 2004.
[3] Cristiano Calcagno, Eugenio Moggi, and Walid Taha. Closed Types as a Simple Approach to Safe Imperative Multi-stage Programming. In ICALP '00: Proceedings of the 27th International Colloquium on Automata, Languages and Programming, pages 25–36, 2000.
[4] Cristiano Calcagno, Eugenio Moggi, and Walid Taha. ML-like inference for classifiers. In ESOP '04: Proceedings of the 13th European Symposium on Programming, pages 79–93, 2004.
[5] Cristiano Calcagno, Walid Taha, Liwen Huang, and Xavier Leroy. Implementing multi-stage languages using ASTs, gensym, and reflection. In GPCE '03: Proceedings of the 2nd International Conference on Generative Programming and Component Engineering, pages 57–76, New York, NY, USA, 2003. Springer-Verlag New York, Inc.
[6] Manuel Fähndrich, Michael Carbin, and James R. Larus. Reflective program generation with patterns. In GPCE '06: Proceedings of the 5th International Conference on Generative Programming and Component Engineering, pages 275–284, 2006.
[7] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design patterns: Abstraction and reuse of object-oriented design, 1993.
[8] Shan Shan Huang and Yannis Smaragdakis. Expressive and safe static reflection with MorphJ. In PLDI '08: Proceedings of the 2008 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 79–89, 2008.
[9] Shan Shan Huang, David Zook, and Yannis Smaragdakis. Statically safe program generation with SafeGen, 2005.
[10] Yukiyoshi Kameyama, Oleg Kiselyov, and Chung-chieh Shan. Closing the stage: from staged code to typed closures. In PEPM '08: Proceedings of the 2008 ACM SIGPLAN Symposium on Partial Evaluation and Semantics-based Program Manipulation, pages 147–157, 2008.
[11] Yukiyoshi Kameyama, Oleg Kiselyov, and Chung-chieh Shan. Shifting the stage: Staging with delimited control. In PEPM '09: Proceedings of the 2009 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, pages 111–120, 2009.
[12] Sam Kamin, Lars Clausen, and Ava Jarvis. Jumbo: Run-time code generation for Java and its applications. In CGO '03: Proceedings of the International Symposium on Code Generation and Optimization, pages 48–56, 2003.
[13] Ik-Soon Kim, Kwangkeun Yi, and Cristiano Calcagno. A polymorphic modal type system for Lisp-like multi-staged languages. SIGPLAN Not., 41(1):257–268, 2006.
[14] Gregory Neverov and Paul Roe. Metaphor: A Multi-stage, Object-Oriented Programming Language. In GPCE '04: Proceedings of the 3rd International Conference on Generative Programming and Component Engineering, pages 168–185, 2004.
[15] OpenJDK Project. http://openjdk.java.net.
[16] U.P. Schultz, J.L. Lawall, and C. Consel. Automatic Program Specialization for Java. ACM Transactions on Programming Languages and Systems, 25(4):452–499, 2003.
[17] Peter Sestoft. Runtime code generation with JVM and CLR, 2002. http://www.itu.dk/~sestoft/rtcg/rtcg.pdf.
[18] Rok Strniša, Peter Sewell, and Matthew Parkinson. The Java module system: Core design and semantic definition. In OOPSLA '07: Proceedings of the 22nd Annual ACM SIGPLAN Conference on Object Oriented Programming Systems and Applications, pages 499–514, 2007.
[19] Walid Taha. Multistage programming: Its theory and applications. PhD thesis, Oregon Graduate Institute, 1999. Supervisor-Tim Sheard.
[20] Walid Taha. A gentle introduction to multi-stage programming. In DSPG '03: Proceedings of the International Seminar on Domain-Specific Program Generation, 2003.
[21] Walid Taha. A gentle introduction to multi-stage programming, part II. In Generative and Transformational Techniques in Software Engineering II, 2007.
[22] Walid Taha, Zine el-abidine Benaissa, and Tim Sheard. Multi-Stage Programming: Axiomatization and Type Safety (Extended Abstract). In ICALP '98: 25th International Colloquium on Automata, Languages, and Programming, pages 918–929, 1998.
[23] Walid Taha and Michael Florentin Nielsen. Environment classifiers. SIGPLAN Not., 38(1):26–37, 2003.
[24] David Zook, Shan Shan Huang, and Yannis Smaragdakis. Generating AspectJ Programs with Meta-AspectJ. In GPCE '04: Proceedings of the 3rd International Conference on Generative Programming and Component Engineering, pages 1–18, 2004.

A. Proofs

We give a detailed proof of Lightweight Mint's type safety in this section. Due to space limitations, discussions of a number of subtle technical details of the type system have been omitted in the main text.

Notation. A variable typing pair Γ is assumed to decompose as Γ1|Γ2, and similarly Γ′ = Γ′1|Γ′2.

Definition 4. A store typing sequence $\langle \Sigma_i \rangle_{i=1}^{I}$ is well-formed iff all of its code-free bindings are gathered in the first Σ, and the individual store typings have pairwise disjoint domains:

\[
I \ge 2, \qquad \mathrm{cf}(\Sigma_1) = \Sigma_1, \qquad \mathrm{cf}(\Sigma_i) = \emptyset \ \ (2 \le i \le I), \qquad \mathrm{dom}(\Sigma_i) \cap \mathrm{dom}(\Sigma_j) = \emptyset \ \ (i \ne j).
\]

Proof. By hypothesis, ; ∅|this : C 0 0 M 0 : mtype(M 0 )|S. By a similar reasoning as Proposition 4, we can prepend ∅, ∅ to each store typing sequence in the typing derivation, which is necessarily of the form ⟨∅, ∅, . . . , ∅⟩.
NB: Proposition 5 replaces the ill-formed store typing sequence in the typing for $M^0$ with the well-formed $\langle \emptyset, \emptyset \rangle$.

Hereafter, the assumption that all store typing sequences and typing derivations are well-formed is in effect. It is needed for typings of configurations under updated environments to propagate through congruence rules. This statement is formalized below as Lemma 6. It states that when a small step from $H_1, e_1$ to $H_2, e_2$ at level $n$ has been taken on a subterm $e_1$ of some bigger term, the heap attached to the typing of any other disjoint subterm can be safely replaced by $H_2$. The freshness assumption is crucial to this lemma.

Lemma 6 is the only part of the proof that needs the well-formedness assumption. We will update the statements of the Smashing and Preservation Lemmas to observe the well-formedness assumption, but the effect on the Smashing Lemma is mainly simplification rather than a change in its meaning, and the modification to the Preservation Lemma is only concerned with propagating the right well-formedness conditions.

Lemma 6. Suppose Σi i ; Γ n .

Definition 5. Let the metavariable $PT$ range over proof trees of configuration typing judgments. $PT$ satisfies the disjointness criterion if any two store typing sequences $\langle \Sigma_i \rangle_{i}^{I}$ and $\langle \Sigma'_j \rangle_{j}^{J}$ that appear in $PT$ have a common prefix of length at least 2 ($\exists K \ge 2.\ \langle \Sigma_k \rangle_{k=1}^{K} = \langle \Sigma'_k \rangle_{k=1}^{K}$) and the rest have disjoint domains ($\mathrm{dom}(\Sigma_i) \cap \mathrm{dom}(\Sigma'_j) = \emptyset$ for $i, j > K$). $PT$ is well-formed iff it only uses well-formed store typing sequences and satisfies the disjointness criterion.

Definition 6. A location $l$ is said to appear in $PT$ iff $PT$ uses a store typing sequence that contains $l$ in the domain of the sequence's union. An $l$ is said to be fresh for $PT$ iff it does not appear in $PT$. An $l$ is local to $PT$ iff for any $PT'$, if $PT$ and $PT'$ are disjoint subtrees of a well-formed tree, then $l$ is fresh for $PT'$. An $l$ is fresh or local in a configuration typing judgment iff it is fresh or local, respectively, for some proof tree of the typing judgment.
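The well-formedness conditions on store typing sequences (Definition 4) and the disjointness criterion (Definition 5) can be illustrated with a small checker. This is only a sketch under stated assumptions: store typings are modeled as maps from location names to type names, the code-freeness predicate is supplied by the caller, the class and method names are hypothetical, and the minimum length of 2 is an assumption suggested by the common-prefix requirement.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

public class StoreTypingSketch {
    /**
     * Definition 4 (sketch): every binding in the first store typing is code-free,
     * no later store typing holds a code-free binding, and all domains are
     * pairwise disjoint.
     */
    public static boolean wellFormed(List<Map<String, String>> seq, Predicate<String> isCodeFree) {
        if (seq.size() < 2) return false; // assumption: sequences have length at least 2
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < seq.size(); i++) {
            for (Map.Entry<String, String> e : seq.get(i).entrySet()) {
                boolean codeFree = isCodeFree.test(e.getValue());
                if ((i == 0) != codeFree) return false;  // code-free bindings live only in the first typing
                if (!seen.add(e.getKey())) return false; // a location occurs in two store typings
            }
        }
        return true;
    }

    /**
     * Definition 5 (sketch): two sequences share a common prefix of length at
     * least 2, and the store typings after that prefix have disjoint domains.
     * Using the longest common prefix is the most permissive choice of K.
     */
    public static boolean disjointnessCriterion(List<Map<String, String>> a,
                                                List<Map<String, String>> b) {
        int k = 0;
        while (k < a.size() && k < b.size() && a.get(k).equals(b.get(k))) k++;
        if (k < 2) return false;
        Set<String> restOfA = new HashSet<>();
        for (Map<String, String> m : a.subList(k, a.size())) restOfA.addAll(m.keySet());
        for (Map<String, String> m : b.subList(k, b.size())) {
            for (String location : m.keySet()) {
                if (restOfA.contains(location)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Predicate<String> codeFree = t -> !t.startsWith("Code"); // hypothetical code-freeness test
        List<Map<String, String>> seq = List.of(
                Map.of("l1", "Point"), Map.<String, String>of(), Map.of("l2", "Code<Integer>"));
        System.out.println(wellFormed(seq, codeFree));
        System.out.println(disjointnessCriterion(seq, seq));
    }
}
```

The sequence of empty store typings used for user programs passes both checks trivially in this model, since it has no bindings at all.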
Note that l is local to P T iff it appears in a Σ that is introduced in P T as a result of an extension of the store typing sequence that happens within P T . Note also that this well-formedness concern for derivation trees is only for configurations, and expression typing derivations are always well-formed. We would like to assume that all proof trees and store typing sequences are well-formed. This does not reduce the expressivity of the type system because a user program must not contain locations, and therefore it can be typed by a derivation that only uses store typing sequences of the form ∅, ∅, . . . , ∅ . This claim is made precise by the following propositions. Proposition 4. If an initial configuration is typed as ; ∅|∅ 0 (H1 , en ) : τ |S b b (∗) and Σj j ; Γ H2 and ∪j Σj ⊇ ∪i Σi and ∀l. H1 (l) = H2 (l) =⇒ l is fresh for (∗) or l ∈ dom(∪j Σj ). Then Σi i ; Γ n (H2 , en ) : τ |S. b b Proof. Induction on en . If we can invoke IH on every immediate b suberm, then the conclusion becomes obvious. The only obstacle to invoking IH is that if a subterm is typed under an extended environment, say Σj j , Σ; Γ , we must check that Σj , Σ; Γ H2 . Note that Γ1 ∪ Γ2 ⊇ Γ1 ∪ Γ2 in general. For any l ∈ dom Σ, we have H1 (l) = H2 (l) since locations that the heaps disagree on are fresh for (∗) unless it is in ∪j Σj . Then (∗) =⇒ Σi i , Σ; Γ H1 (l) : Σ(l) H2 (l) : Σ(l). by inversion by Σh weakening and H1 (l) = H2 (l) (∅, e0 ) : τ |S (∗) =⇒ Σj j , Σ; Γ by a not necessarily well-formed derivation and locs(e0 ) = ∅, then ∅, ∅ ; ∅|∅ 0 (∅, e0 ) : τ |S by a well-formed derivation. Proof. The expression part contains no locations, so the store typing sequence is only used to type the heap which is empty and is therefore well-formed under, and only under, store typing sequences of the form ∅, ∅, . . . , ∅ . Thus, every configuration typing judgment in the derivation of (∗) is of the form ∅ I ; Γ n i (∅, en ) : τ |S and every heap well-formedness judgment is of the b b ∅. 
If we consistently replace I by I + 2 in all form ∅ I ; Γ i such judgments, then we have a derivation tree for ∅, ∅ ; ∅|∅ 0 (∅, e0 ) : τ |S. The typing derivation constructed here is clearly well-formed. Note that Proposition 4 starts by assuming typability under because that is what we used for program typing in Figure 8. Proposition 5. If M 0 appears in a well-typed class C, then ∅, ∅ ; this : C 0 |∅ 0 M 0 : mtype(M 0 )|S by a derivation that does not involve ill-formed store typing sequences. For any l ∈ dom(∪j Σj ), we have Σj j ; Γ H2 (l) : (∪j Σj )(l) by hypothesis. Then by Σh and Γh weakenings, we get Σj j , Σ; Γ H2 (l) : (∪j Σj )(l). Therefore, Σj j , Σ; Γ H2 . If en = |en+1 | , the variable typing’s partitioning bar (|) is b moved but the store typing sequence is not extended. In this case, we just use ΓH weakening. Lemma 7 (Σe relevance). If we have Σi I ; Γ n en : τ |S and b b i b b (∪i Σi )|locs(bn ) = (∪j Σj )|locs(bn ) , then Σj J ; Γ n en : τ |S. e e j Proof. Proof is by induction on en . When looking up a location the b Σ’s are always unioned together, so it clearly only matters what the union of the sequence contains. The only non-trivial inductive cases are the ones that extend the store typing sequence. Take en = S τ m( τk xk k ){en } for b example, and let L = locs(en ). By inversion we have n ∃Σ. Σi I , Σ; Γ1 , Γ2 , xk : τk k |∅ i n en : τ |S . Then ((∪i Σi ) ∪ Σ)|L = (∪i Σi )|L ∪ Σ|L = (∪j Σj )|L ∪ Σ|L = ((∪j Σj ) ∪ Σ)|L so we can use IH on en to obtain Σj J j , Σ; Γ1 , Γ2 , n xk : τk k |∅ n en : τ |S . The conclusion immediately follows. Lemma 8 (Γe weakening). If Σi i ; Γ (i = 1, 2) then Σi i ; Γ n en : τ |S. b b n A common issue with multi-stage type systems is the fact that run changes the level of a term dynamic. The Demotion Lemma ensures that this change does not destroy well-typedness. Since the code to run is always fetched from the heap, we get well-typedness of the term from well-formedness of the heap. Lemma 13 (Demotion). 
If Σi i ; ∅|∅ (Code, |e0 | ) : Code S , τ and Σi i ; ∅|∅ H then Σi i ; ∅|∅ 0 (H, e0 ) : τ |S. Proof. Generalize to: Σi i ; Γ ≥1 n+1 n en : τ |S and Γi ⊇ Γi b b Proof. Straightforward induction on en , noting that part-wise conb tainment Γi ⊇ Γi is preserved by manipulations of the form Γ1 |Γ2 → Γ1 , Γ2 , Γ |Γ for Γ and Γ that are independent of Γ1 and Γ2 . Lemma 9 (Γh weakening). If (Γ1 ∪ Γ2 ) Σi i ; Γ h : τ then Σi i ; Γ h : τ . ≥1 en : τ |S ∧ Σi i ; Γ ↓ b b n def H =⇒ Σi i ; Γ ↓ (H, e ) : τ |S b b ⊇ (Γ1 ∪ Γ2 ) ≥1 and Proof. The variable typing pair is used when h = (Code, |e0 | ), where ≥1 0 Σi i ; Γ |e0 | : Code S , τ |S 0 and τ = Code S , τ , or h = (sub D { Mj j }), in which case Σi i ; Γ ≥1 0 0 Mj where Γ ↓= Γ1 ↓ |Γ2 ↓ and Γ ↓ (x) = τ n ⇐⇒ Γ(x) = τ n+1 . We prove this by induction on en . b ≥1 If en = x then Γ (x) = τ k and 0 < k ≤ n so Γ ↓ (x) = b b τ k−1 . Therefore Σi i ; Γ ↓ n (H, x) : τ |S. b b If en = (let x ⇐ new C( en j ) in en ) then by IH, b j Σi i ; Γ ↓ n en : τj |Sj j . By inversion we have j ∃Σ. Σi i , Σ; Γ≥1 , Γ≥1 |x : C n+1 1 2 n+1 : 0 mtype(mname(Mj ), D) j . en : τ |S. b In both cases, we can replace Γ with Γ by Γe weakening. Notice that the partitioning bar is moved all the way to the right before the variable typing has a chance to be looked up, so that the assumption (Γ1 ∪ Γ2 )≥1 ⊇ (Γ1 ∪ Γ2 )≥1 is turned into pair-wise containment, matching the hypothesis of Γe weakening. Lemma 10 (ΓH weakening). If (Γ1 ∪ Γ2 )≥1 ⊇ (Γ1 ∪ Γ2 )≥1 and Σi i ; Γ H then Σi i ; Γ H. Proof. Immediate consequence of Γh weakening. Lemma 11. If Σi i ; Γ n+1 By Lemma 11 and Σe relevance, we may assume Σ = ∅, so by ΓH weakening, Σi i , Σ; (Γ≥1 , Γ≥1 |x : C n+1 ) ↓ H. Hence we can 1 2 use IH to get Σi i , Σ; (Γ1 , Γ2 ) ↓ |x : C n n (H, en ) : τ |S. The b desired conclusion follows immediately. If en = M n , the argument is essentially the same as for a let. b 0 If en = new D( en j ){ Mk k }, IH gives b j Σi i ; Γ ↓ n 0 0 (H, Mk ) : mtype(mname(Mk ), D)|Sk k . 
en : τ |S then locs(bn ) ⊆ dom Σ1 . b b e For n = 0, this is enough to get the conclusion. If n = 0, then 0 we need dom(∪i Σi ) ⊇ locs(sub D { Mk k }) in addition; this is ensured by Lemma 11. The remaining cases are straightforward. The statement of the Substitution Lemma is more or less standard. Lemma 14 (Substitution Lemma). Let Γ and Γ be identical, including the position of the partitioning bar (|), except that Γ(x) = τ 0 and x ∈ dom Γ . If Σi I ; Γ n (H, en ) : τ |S b b i and (∪i Σi )(l) = τ and x ∈ dom Γ2 ∨ l ∈ dom(Σ1 ∪ ΣI ), then Σi I ; Γ n (H, [l/x]bn ) : τ |S. e b i H is Proof. Induction on en . Note that in each case Σi i ; Γ b assured by ΓH weakening. If en = x, inversion on the configuration typing gives iscf(τ ) ∨ b n = 0, which is just the premise we need to justify Σi i ; Γ n (H, l) : τ |S. b If en = (x.f := en ), then τ = ftype(f, τ ) where τ = b b (∪i Σi )(l). The typing judgment must have been derived with one of two rules. If it used the rule for the generic form (en .f := en ), 1 2 then S = insep so by IH, the configuration after substitution can be typed by the same rule. Otherwise, S = sep, in which case the binding level 0 of x must equal the typing level n. IH gives Σi i ; Γ n Proof. Induction on en . It is evident from the typing rules that we b maintain the invariant that the term can always be seen as lowerlevel than the level of the typing judgment, and therefore that the level of the typing judgment is > 0. Hence when we encounter a rule that looks up the store typing (i.e. en = l or (l.f := en )), it is b the case that iscf((∪i Σi )(l)). By the assumption that the derivation is well-formed, the store typing sequence is well-formed, hence l ∈ Σ1 . Lemma 12 (Σh weakening). If Σi i ; Γ ∪i Σi then Σj j ; Γ h : τ . h : τ and ∪j Σj ⊇ Proof. If h = (T, lk k ) then ∪j Σj assigns the right types to the lk since they are all in dom(∪i Σi ), where ∪i Σi and ∪j Σj coincide. 
0 If T = sub D { Ma a } then Σi i ; Γ1 |Γ2 , this : D0 0 0 Ma : τa |Sa b a 0 where τa = mtype(mname(Ma ), D), and b 0 locs(sub D { Ma a }) ⊆ dom(∪i Σi ) ⊆ dom(∪j Σj ) so ∪i Σi |locs(T ) = ∪j Σj |locs(T ) . It follows by Σe relevance that Σj j ; Γ1 |Γ2 , this : D0 0 0 Ma : τa |Sa b a 0 sub D { Ma a }. Therefore h is typable under thus Σj j ; Γ Σj j . ≥1 0 If h = (Code, |e0 | ), then Σi i ; Γ |e0 | : τ |S where 0 τ = Code S , τ . By Lemma 11, locs( |e | ) ⊆ dom(∪i Σi ) ⊆ ≥1 0 dom(∪j Σj ), so Σj j ; Γ |e0 | : τ |S by Σe relevance. It follows that h is typable under Σj j . (H, [l/x]en ) : τ |sep. b Inversion gives x ∈ dom(Γ2 ) ∨ iscf(τ ). If x ∈ dom Γ2 then l ∈ dom(Σ1 ∪ ΣI ) by hypothesis, thus iscf(τ ) ∨ l ∈ dom ΣI . Therefore, we have iscf(τ ) ∨ (n = 0 ∧ l ∈ dom ΣI ), which establishes Σi i ; Γ n (H, (l.f := [l/x]en )) : τ |sep. b Suppose en = (let y ⇐ new C( en j ) in en ). If x = y, b j then the substitution is the identity on this term so the conclusion is immediate. Otherwise, we have Σi i ; Γ n (H, [l/x]en )b|S j j τ by IH. By inversion, Σi i , Σ; Γ1 , Γ2 |y : C n n (H, en ) : τ |S. b The store typing sequence is extended, so the l is no longer in the rightmost Σ (if it was in ΣI ), but the binding for x is no longer to the right of the partitioning bar either, so we can apply IH. This gives Σi i ; Γ1 , Γ2 |y : C n n (H, [l/x]en ) : τ |S. Thus b b Γi i ; Γ n (H, [l/x]en ) : τ |S. If en = M n , the argument is similar to the preceding case. b n If en = new D( ej j ){ Mk k }, the well-typedness of (H, ej ) b n and (H, Mk ) follow directly from IH. The only concern left is n whether dom(∪i Σi ) ⊇ locs([l/x](sub D { Mk k })) when n = 0. This clearly holds because the only addition to the set of locations is l, and l ∈ dom(∪i Σi ). The remaining cases are straightforward. We need a similar lemma that does not lower the level of the typing to handle escapes. e : τ |S b b Lemma 15 (Augmentation Lemma). 
If Σi i ; Γ and Σi i ; Γ H and dom(∪i Σi ) ⊇ locs(bn ) then Σi i ; Γ n e (H, en ) : τ |S. b b Proof. Induction on en . This is just a matter of checking b Σi i n n by (I), and we can replace the store typing sequence with Σi I due i to Σh weakening. For any other l ∈ dom(∪I Σi ), (III) tells us that l ∈ dom Σ1 i=1 so (∪I Σi )(l) = Σ1 (l) and iscf(Σ1 (l)). It follows that Σ1 (l) = i=1 F and by (II), H2 (l) = (F, lk k ). Moreover, ∀k. iscf(ftypek (F )) so iscf((∪i Σi )(l)), hence lk ∈ dom Σ1 . Therefore, H2 (l) is wellformed under any well-formed store typing sequence starting with Σ1 , including Σi I . i Thus Σi I ; Γ H2 . i Finally, we are ready to prove Preservation. As noted in the main text, there are some invariants that are not captured in the statement of Lemma 3. We give a complete statement here. Lemma 17 (Preservation (extended version)). If (I) Σi I ; Γ n (H1 , en ) : τ |S b1 b i n (II) H1 , en b1 H2 , en b2 >n (III) S = sep ∨ Γ = ∅|∅ (IV) Γ = Γ then ∃ Σi I i ≥1 such that n (i) (H2 , en ) : τ |S b2 b (ii) Σ1 ⊇ Σ1 ∧ ΣI ⊇ ΣI ∧ Σi = Σi I−1 i=2 (iii) H1 (l) = H2 (l) =⇒ (l ∈ dom H1 ) ∨ (l ∈ dom(Σ1 ∪ ΣI )) ∨ (l is local to (I)). Σi I ; Γ i ◦ Σj j ; Γ H (∗) for every extended environment Σi i ◦ Σj j ; Γ that appears in the derivation of Σi i ; Γ n en : τ |S. By Σe relevance, b b Σj j = ∅, ∅, . . . , ∅ without loss of generality. Then ∪i Σi = (∪i Σi ) ∪ (∪j Σj ), so Σi i where H1 (l) = H2 (l) includes the case where one is defined but not the other. Proof. Induction on the evaluation context E n,k . Conclusion (ii) will be obvious for each case, so we will not explicitly write out its justification. We first handle primitive reductions. There are three forms that extend the heap, all at n = 0. The only point of change on the heap for these case is the new location, which is fresh for H1 . Therefore (iii) holds, and only (i) remains to be proved. • Suppose en = new D( lj b1 0 bn j ){ Ma a }. Then e2 = l ∈ H1 and τ = D. Take Σ1 = Σ1 and ΣI = ΣI [l → D]. 
For every b l = l, we have H1 (l ) = H2 (l ) so by (I) and Σh weakening, ◦ Σj j ; Γ H(l) : (∪i Σi )(l) for every l ∈ dom(∪i Σi ) (= dom((∪i Σi ) ∪ (∪j Σj ))) by Σh weakening. Therefore, Σi i ◦ Σj j ; Γ H, and (∗) follows by ΓH weakening. We now turn to the Smashing Lemma. We refine its statement to take advantage of the well-formedness assumptions. The new lemma still captures the same idea but under a simpler setting: we need typing of code-containing locations only in the scope of any future-stage variables that they may refer to, and its proof relies on the fact that most of the heap has not changed. It is possible to prove it without the well-formedness assumptions, but it would only obfuscate the argument. Lemma 16 (Smashing Lemma (refined)). If (I) (II) (III) Σ i I ; Γ H1 i Σi I+1 ; Γ H2 i ∀l ∈ dom(∪I+1 Σi ). i=1 Σi I ; Γ i H2 (l ) : D. For H2 (l) = (sub D { lj j ), by inversion on (I) we know that both the tag and the fields are well-typed under Σi i ; Γ, so that the heap element itself is well-typed. Thus by Σh weakening, Σi I ; Γ i This shows that Σi I ; Γ i locs(sub D { 0 Ma a }) 0 Ma a }, H2 (l) : D. H2 . By inversion, ⊆ dom(∪i Σi ) ⊆ dom(∪i Σi ). H1 (l) = H2 (l) =⇒ l ∈ dom(Σ1 ∪ ΣI+1 ) (IV) Σ1 ⊇ Σ1 ∧ ΣI+1 ⊇ ΣI+1 ∧ Σi = Σi I i=2 where H1 (l) = H2 (l) includes the case where one is defined while the other is not, then Σi I ; Γ i H2 . All the other premises needed to justify (i) is directly obtained from IH. • Suppose en = (let x ⇐ new C( lj j ) in en ). Then en = b1 b2 [l/x]e0 and l ∈ dom H1 and lj ∈ dom(∪i Σi ) . By inversion ∃Σ. Σi i , Σ; Γ1 , Γ2 |x : C 0 0 (H1 , e0 ) : τ |S. b (∗1) Proof. For every location l ∈ dom(∪I Σi ) such that H1 (l) = i=1 H2 (l), we have Σi I ; Γ i H2 (l) : (∪I Σi )(l) i=1 If iscf(C), take Σ1 = Σ1 [l → C] ∧ ΣI = ΣI ∪ Σ. If not, take Σ1 = Σ1 ∧ ΣI = ΣI ∪ Σ[l → C]. In either case, (iii) and ∪i Σi ⊇ (∪i Σi ) ∪ Σ. 
For every l ∈ dom(∪i Σi ) such that l = l, H1 (l ) = H2 (l ) so Σi i ; Γ1 , Γ2 |x : C 0 H2 (l ) : (∪i Σi )(l) by (∗1) and Σh weakening. We also have Σi I ; Γ1 , Γ2 |x : i C0 (C, lj j ) : C because (∪i Σi )(lj ) = (∪i Σi )(lj ) ftypej (C) j , so Σi i ; Γ1 , Γ2 |x : C 0 H2 . (∗2) If m does not match one of the Mc c , or if T = C, then the method implementation comes from the static class hierarchy, P . In that case, by Proposition 5 0 ∅, ∅ ; this : τ 0 , xj : τj j |∅ 0 e0 : τ |S. b 0 l is fresh in H1 so it is fresh in (∗1). Thus by (∗1), (∗2), we can use Lemma 6 to get Σi i ; Γ1 , Γ2 |x : C Σi i ; Γ1 , Γ2 |∅ 0 0 0 By Σe relevance and Γe weakening, 0 Σi i ; Γ1 , Γ2 , this : τ 0 , xj : τj j |∅ (H2 , e ) : τ |S. b 0 e0 : τ |S. b Then, noting that l ∈ dom(Σ1 ∪ ΣI ), we have (H2 , [l/x]e0 ) : τ |S b by the Substitution Lemma. By Lemma 18, we can move the partitioning bar to the left, thus Σi i ; Γ 0 (H2 , [l/x]e0 ) : τ |S. b • Suppose en = |e0 | . Then en = l ∈ dom H1 and τ = b1 b2 b Then repeating the argument using the Augmentation and Substitution Lemmas gives (i). • Suppose en = ‘l and H1 (l) = (Code, |e0 | ). Then n = 1, and b1 b since H1 is well-formed, we have Σi i ; Γ 1 e0 : τ |S using Lemma 18. Then by Lemma 11 locs(e0 ) ⊆ dom(∪i Σi ) so by the Augmentation Lemma, Σi i ; Γ 1 (H2 , e0 ) : τ |S. b • Suppose en = en .run(). Then n = 0 and S = insep and b1 H1 (l) = (Code, |bn | ). By (III) and (IV), Γ = ∅|∅. Then by e2 well-formedness of H1 , Σi i ; ∅|∅ (Code, |bn | ) : Code S , τ . e2 b 0 Code S , τ . Take Σ1 = Σ1 and ΣI = ΣI [l → Code S , τ ]. By (I) and (IV), we have Σi i ; Γ ≥1 0 |e0 | : Code S , τ |S H1 = H2 so by the Demotion Lemma Σi i ; Γ en : τ |S. b2 b so the new heap element (Code, |e0 | ) is well-formed. All other locations are unmodified, so they are well-formed under Σi i ; Γ by Σh weakening. Thus Σi i ; Γ H2 , therefore Σi i ; Γ 0 We now consider non-trivial evaluation contexts. Let en = E n,k [rk ] b1 and en = E n,k [ek ]. 
b2 n • Suppose E n,k = (let x ⇐ new C( vj j) n,k in Ee ). Then (H2 , l) : Code S , τ |S. necessarily n > 0. By inversion, Σi i ; Γ n n (H1 , vj ) : ftypej (C)|S n n n,k (H1 , Ee [rk ]) j There is one case that modifies the heap without extending it: • Suppose (∗3) = (l.fj0 := l ). Then = l and H1 (l) = (T, lj j ). By inversion iscf((∪i Σi )(l)) ∨ (l ∈ dom ΣI ), and the first disjunct implies l ∈ Σ1 , so (iii) holds. Take Σi = Σi i . The updated heap element (T, lj j [j0 → l ]) is well-formed because by inversion (∪i Σi )(l ) ftypej0 (τ ), where τ = (∪i Σi )(l). The other locations are unchanged so they remain well-formed. Thus H2 is well-formed, and we have Σi i ; Γ 0 en b1 en b2 ∃Σ. Σi i , Σ; Γ1 , Γ2 |x : C : τ |S (∗4) b (H2 , l ) : τ |S. b and the derivations of these are disjoint subtrees of (I). We want to apply IH to (∗4), but to do so we must check (III) and (IV). Because n > 0 the x : C n is not a level-0 binding, hence (IV) holds for the subconfiguration. If S = insep then >n Γ = ∅|∅ so (Γ1 , Γ2 |x : C n )>n = ∅|∅ so (III) is satisfied. Therefore, by IH(i) (that is, conclusion (i) of IH), ∃ Σi I+1 i such that Σi I+1 ; Γ1 , Γ2 |x i For all other primitive reductions, we have H1 = H2 so with Σ1 = Σ1 and ΣI = ΣI , (ii) and (iii) hold. • Suppose en = l.fj0 , where n = 0 and H1 (l) = (T, lj b1 • Suppose en = l.m( lj b1 j ). : Cn n n,k (H2 , Ee [ek ]) : τ |S. b By inversion (∪i Σi )(lj0 ) τ so Σi i ; Γ 0 (H2 , lj0 ) : τ |S. b b j ) where H1 (l) = (T, la a ). Let mbody(m, T ) = ( xj J , e0 ) and τ = (∪i Σi )(l). By inverj S sion, mtype(m, τ ) = τj J → τ where (∪i Σi )(lj ) b j If T = sub D { Mc c } and m = mname(Mc0 ), then mtype(m, sub D { Mc c }) = mtype(m, D) = τj τj j . J S j → τ b because class well-formedness rules require method overrides to preserve types. Thus Σi i ; Γ1 , Γ2 |this : D0 0 Mc0 J j |∅ 0 0 =⇒ Σi i ; Γ1 , Γ2 , this : D0 , xj : τj e0 : τ |S. We also have dom(∪i Σi ) ⊇ locs(e0 ) from well-formedness of H1 . 
ΓH weakening gives Σi i ; Γ1 , Γ2 , this : D , xj : so by the Augmentation Lemma, 0 Σi i ; Γ1 , Γ2 , this : D0 , xj : τj J j |∅ 0 0 0 τj J |∅ j By IH(iii), ∀l ∈ dom(∪i Σi ). H1 (l) = H2 (l) implies l ∈ dom(Σ1 ∪ ΣI+1 ). Thus by the refined Smashing Lemma, Σ i I ; Γ H2 . i IH(iii) also states that any l that H1 and H2 disagree on satisfy one of: l ∈ dom H1 , in which case l is fresh for (∗3) because the domain of store typings in a configuration typing is bounded by the domain of the heap. l is local to (∗4), in which case it is fresh in (∗3) because they are disjoint subtrees. l ∈ dom(Σ1 ∪ Σ). If l ∈ dom Σ1 then l ∈ dom(Σ1 ∪ ΣI ). Else l ∈ Σ implies that l is fresh for (∗3) because it is a subtree of (I) that is disjoint from (∗4), and Σ is introduced at the root of (∗4). Hence l is fresh in (∗3) or l ∈ dom(∪I Σi ). Therefore, by i=1 Lemma 6, Σi I ; Γ i n (H2 , en ) : ftypej (C)|S j j H2 (H2 , e0 ) : τ |S. J 0 j ]e ) Using the Substitution Lemma J + 1 times, we get Σi i ; Γ1 , Γ2 |∅ 0 (H2 , [l/this][ lj J j/ xj : τ |S. which is the last piece needed for (i). Σi I satisfies (ii) because Σi I+1 obeys IH(ii). It also satisi i fies (iii) because by IH(iii), H1 (l) = H2 (l) ensures one of three conditions: l ∈ dom H1 . l ∈ dom(Σ1 ∪ Σ). If l ∈ dom Σ1 then l ∈ dom(Σ1 ∪ ΣI ). If l ∈ Σ, then it is local to (I). l is local to (∗4). Then it is also local to the supertree, (I). n,k • If E n,k = EM , the argument is mostly a repetition of the previous case. • The remaining cases are all straightforward. We simply use IH to obtain Σi I , and apply Lemma 6 to see that the subterms i that did not participate in the small step remain well-typed if we augment them with H2 . This concludes the proof. Lemma 18. If Σi i ; Γ1 , Γ2 |∅ Σi i ; Γ1 |Γ2 n (H, en ) : τ |S. b b n (H, en ) : τ |S holds, then b b Proof. Straightforward induction on en . 
The partitioning bar is irrelevant for checking heap well-formedness (as seen by ΓH weakening), and the only typing rule that uses the bar, the one for (x.f := en), only becomes more permissive when the bar is moved to the left. Typing rules that move the bar always move it all the way to the right (and perhaps add new bindings on the right end), so the invariant is maintained that the typing judgment in the hypothesis has the bar farther to the left than the judgment in the conclusion.
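For reference, the statement of Lemma 18 can be typeset as below. This is only a reading of the extracted text, under stated assumptions: the overline with index i stands for the sequence of store typings, the turnstile carries the level n, and any accents or primes lost in extraction are omitted. It agrees with the way the lemma is used in the proof above, namely to move the partitioning bar to the left.

```latex
% Lemma 18 (a reading of the extracted statement; the turnstile and
% overline notation are assumptions, and lost accents are omitted).
\[
  \overline{\Sigma_i}^{\,i};\ \Gamma_1, \Gamma_2 \mid \emptyset
    \ \vdash^{n}\ (H, e^{n}) : \tau \mid S
  \quad\Longrightarrow\quad
  \overline{\Sigma_i}^{\,i};\ \Gamma_1 \mid \Gamma_2
    \ \vdash^{n}\ (H, e^{n}) : \tau \mid S
\]
```

In this form the hypothesis has every binding to the left of the bar and the conclusion moves the bindings in Γ2 to its right, which is the direction in which the rule for field assignment becomes more permissive.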