# Compilation

MAlonzo

- The MAlonzo backend's FFI now handles universe polymorphism in a better way.

The translation of Agda types and kinds into Haskell now supports universe-polymorphic postulates. The core changes are that the translation of function types has been changed from

T[[ Pi (x : A) B ]] = if A has a Haskell kind then forall x. () -> T[[ B ]] else if x in fv B then undef else T[[ A ]] -> T[[ B ]]

into

T[[ Pi (x : A) B ]] = if x in fv B then forall x. T[[ A ]] -> T[[ B ]] -- Note: T[[A]] not Unit. else T[[ A ]] -> T[[ B ]],

and that the translation of constants (postulates, constructors and literals) has been changed from

T[[ k As ]] = if COMPILED_TYPE k T then T T[[ As ]] else undef

into

T[[ k As ]] = if COMPILED_TYPE k T then T T[[ As ]] else if COMPILED k E then () else undef.

For instance, assuming a Haskell definition

type AgdaIO a b = IO b,

we can set up universe-polymorphic IO in the following way:

postulate IO : ∀ {ℓ} → Set ℓ → Set ℓ return : ∀ {a} {A : Set a} → A → IO A _>>=_ : ∀ {a b} {A : Set a} {B : Set b} → IO A → (A → IO B) → IO B {-# COMPILED_TYPE IO AgdaIO #-} {-# COMPILED return (\_ _ -> return) #-} {-# COMPILED _>>=_ (\_ _ _ _ -> (>>=)) #-}

This is accepted because (assuming that the universe level type is translated to the Haskell unit type "()")

(\_ _ -> return) : forall a. () -> forall b. () -> b -> AgdaIO a b = T [[ ∀ {a} {A : Set a} → A → IO A ]]

and

(\_ _ _ _ -> (>>=)) : forall a. () -> forall b. () -> forall c. () -> forall d. () -> AgdaIO a c -> (c -> AgdaIO b d) -> AgdaIO b d = T [[ ∀ {a b} {A : Set a} {B : Set b} → IO A → (A → IO B) → IO B ]].

Epic

- New Epic backend pragma: STATIC.

In the Epic backend, functions marked with the STATIC pragma will be normalised before compilation. Example usage:

{-# STATIC power #-} power : ℕ → ℕ → ℕ power 0 x = 1 power 1 x = x power (suc n) x = power n x * x

Occurrences of "power 4 x" will be replaced by "((x * x) * x) * x".

- Some new optimisations have been implemented in the Epic backend:

- Removal of unused arguments.

A worker/wrapper transformation is performed so that unused arguments can be removed by Epic's inliner. For instance, the map function is transformed in the following way:

map_wrap : (A B : Set) → (A → B) → List A → List B map_wrap A B f xs = map_work f xs map_work f [] = [] map_work f (x ∷ xs) = f x ∷ map_work f xs

If map_wrap is inlined (which it will be in any saturated call), then A and B disappear in the generated code.

Unused arguments are found using abstract interpretation. The bodies of all functions in a module are inspected to decide which variables are used. The behaviour of postulates is approximated based on their types. Consider return, for instance:

postulate return : {A : Set} → A → IO A

The first argument of return can be removed, because it is of type Set and thus cannot affect the outcome of a program at runtime.

- Injection detection.

At runtime many functions may turn out to be inefficient variants of the identity function. This is especially true after forcing. Injection detection replaces some of these functions with more efficient versions. Example:

inject : {n : ℕ} → Fin n → Fin (1 + n) inject {suc n} zero = zero inject {suc n} (suc i) = suc (inject {n} i)

Forcing removes the Fin constructors' ℕ arguments, so this function is an inefficient identity function that can be replaced by the following one:

inject {_} x = x

To actually find this function, we make the induction hypothesis that inject is an identity function in its second argument and look at the branches of the function to decide if this holds.

Injection detection also works over data type barriers. Example:

forget : {A : Set} {n : ℕ} → Vec A n → List A forget [] = [] forget (x ∷ xs) = x ∷ forget xs

Given that the constructor tags (in the compiled Epic code) for Vec.[] and List.[] are the same, and that the tags for Vec._∷_ and List._∷_ are also the same, this is also an identity function. We can hence replace the definition with the following one:

forget {_} xs = xs

To get this to apply as often as possible, constructor tags are chosen /after/ injection detection has been run, in a way to make as many functions as possible injections.

Constructor tags are chosen once per source file, so it may be advantageous to define conversion functions like forget in the same module as one of the data types. For instance, if Vec.agda imports List.agda, then the forget function should be put in Vec.agda to ensure that vectors and lists get the same tags (unless some other injection function, which puts different constraints on the tags, is prioritised).

- Smashing.

This optimisation finds types whose values are inferable at runtime:

* A data type with only one constructor where all fields are inferable is itself inferable. * Set ℓ is inferable (as it has no runtime representation).

A function returning an inferable data type can be smashed, which

means that it is replaced by a function which simply returns the inferred value.

An important example of an inferable type is the usual propositional equality type (_≡_). Any function returning a propositional equality can simply return the reflexivity constructor directly without computing anything.

This optimisation makes more arguments unused. It also makes the Epic code size smaller, which in turn speeds up compilation.

JavaScript

- ECMAScript compiler backend.

A new compiler backend is being implemented, targetting ECMAScript (also known as JavaScript), with the goal of allowing Agda programs to be run in browsers or other ECMAScript environments.

The backend is still at an experimental stage: the core language is implemented, but many features are still missing.

The ECMAScript compiler can be invoked from the command line using the flag --js:

agda --js --compile-dir=<DIR> <FILE>.agda

Each source <FILE>.agda is compiled into an ECMAScript target <DIR>/jAgda.<TOP-LEVEL MODULE NAME>.js. The compiler can also be invoked using the Emacs mode (the variable agda2-backend controls which backend is used).

Note that ECMAScript is a strict rather than lazy language. Since Agda programs are total, this should not impact program semantics, but it may impact their space or time usage.

ECMAScript does not support algebraic datatypes or pattern-matching. These features are translated to a use of the visitor pattern. For instance, the standard library's List data type and null function are translated into the following code:

exports["List"] = {}; exports["List"]["[]"] = function (x0) { return x0["[]"](); }; exports["List"]["_∷_"] = function (x0) { return function (x1) { return function (x2) { return x2["_∷_"](x0, x1); }; }; }; exports["null"] = function (x0) { return function (x1) { return function (x2) { return x2({ "[]": function () { return jAgda_Data_Bool["Bool"]["true"]; }, "_∷_": function (x3, x4) { return jAgda_Data_Bool["Bool"]["false"]; } }); }; }; };

Agda records are translated to ECMAScript objects, preserving field names.

Top-level Agda modules are translated to ECMAScript modules, following the common.js module specification. A top-level Agda module "Foo.Bar" is translated to an ECMAScript module "jAgda.Foo.Bar".

The ECMAScript compiler does not compile to Haskell, so the pragmas related to the Haskell FFI (IMPORT, COMPILED_DATA and COMPILED) are not used by the ECMAScript backend. Instead, there is a COMPILED_JS pragma which may be applied to any declaration. For postulates, primitives, functions and values, it gives the ECMAScript code to be emitted by the compiler. For data types, it gives a function which is applied to a value of that type, and a visitor object. For instance, a binding of natural numbers to ECMAScript integers (ignoring overflow errors) is:

data ℕ : Set where zero : ℕ suc : ℕ → ℕ {-# COMPILED_JS ℕ function (x,v) { if (x < 1) { return v.zero(); } else { return v.suc(x-1); } } #-} {-# COMPILED_JS zero 0 #-} {-# COMPILED_JS suc function (x) { return x+1; } #-} _+_ : ℕ → ℕ → ℕ zero + n = n suc m + n = suc (m + n) {-# COMPILED_JS _+_ function (x) { return function (y) { return x+y; }; } #-}

To allow FFI code to be optimised, the ECMAScript in a COMPILED_JS declaration is parsed, using a simple parser that recognises a pure functional subset of ECMAScript, consisting of functions, function applications, return, if-statements, if-expressions, side-effect-free binary operators (no precedence, left associative), side-effect-free prefix operators, objects (where all member names are quoted), field accesses, and string and integer literals. Modules may be imported using the require("<module-id>") syntax: any impure code, or code outside the supported fragment, can be placed in a module and imported.