Emil Axelsson
Chalmers RAW FP Workshop, May 2012
Different DSLs often have a lot in common
Even within the same DSL there are opportunities for reuse
Using Haskell as host for embedded DSLs is supposed to make implementation much easier, but…
These issues are
Arithmetic expressions with equality and conditions:
data Arith a where
Lit :: Show a => a -> Arith a
Add :: Num a => Arith a -> Arith a -> Arith a
Mul :: Num a => Arith a -> Arith a -> Arith a
Equ :: Eq a => Arith a -> Arith a -> Arith Bool
Cond :: Arith Bool -> Arith a -> Arith a -> Arith a
Hardware description:
data Signal a where
Const :: Show a => a -> Signal a
Inv :: Signal Bool -> Signal Bool
And :: Signal Bool -> Signal Bool -> Signal Bool
Or :: Signal Bool -> Signal Bool -> Signal Bool
Mux :: Signal Bool -> Signal a -> Signal a -> Signal a
Similar?
evalArith :: Arith a -> a
evalArith (Lit a) = a
evalArith (Add a b) = evalArith a + evalArith b
evalArith (Mul a b) = evalArith a * evalArith b
evalArith (Equ a b) = evalArith a == evalArith b
evalArith (Cond c t f) = if evalArith c
then evalArith t
else evalArith f
evalSignal :: Signal a -> a
evalSignal (Const a) = a
evalSignal (Inv a) = not (evalSignal a)
evalSignal (And a b) = evalSignal a && evalSignal b
evalSignal (Or a b) = evalSignal a || evalSignal b
evalSignal (Mux c t f) = if evalSignal c
then evalSignal t
else evalSignal f
Similar?
optArith :: Arith a -> Arith a
optArith (Add a b) = Add (optArith a) (optArith b)
optArith (Mul a b) = Mul (optArith a) (optArith b)
optArith (Equ a b) = Equ (optArith a) (optArith b)
optArith (Cond c t f)
| eqArith t f = optArith t
| otherwise = Cond (optArith c) (optArith t) (optArith f)
optArith a = a
optSignal :: Signal a -> Signal a
optSignal (Inv a) = Inv (optSignal a)
optSignal (And a b) = And (optSignal a) (optSignal b)
optSignal (Or a b) = Or (optSignal a) (optSignal b)
optSignal (Mux c t f)
| eqSignal t f = optSignal t
| otherwise = Mux (optSignal c) (optSignal t) (optSignal f)
optSignal a = a
Similar?
Note the repetitive boilerplate code.
The Expression Problem [Wadler 1998] (rephrased):
How to extend a data type with new cases and new interpretation functions without recompiling existing code.
But we want more!
Existing generic programming frameworks (such as SYB) enable generic traversals, but not composable data types.
http://hackage.haskell.org/package/syntactic
Core library:
AST
typeAdditional “Lego” modules:
Idea:
Model data types as application tree:
data AST dom a where
Sym :: Signature a => dom a -> AST dom a
(:$) :: Typeable a => AST dom (a :-> b)
-> AST dom (Full a)
-> AST dom b
infixl 1 :$
type ASTF dom a = AST dom (Full a)
dom
AST
exampleReference DSL:
data DocRef a where
StrRef :: String -> DocRef String
AppendRef :: DocRef String -> DocRef String -> DocRef String
Representation using AST
:
data DocSym a where
Str :: String -> DocSym (Full String)
Append :: DocSym (String :-> String :-> Full String)
type Doc a = ASTF DocSym a
str :: String -> Doc String
str s = Sym (Str s)
append :: Doc String -> Doc String -> Doc String
append s1 s2 = Sym Append :$ s1 :$ s2
Doc
is non-recursive!AST
exampleIsomorphisms:
Doc
DocRef
str
StrRef
append
AppendRef
Define symbol domains for all language constructs:
data Lit2 a where -- Literals
Lit2 :: Show a => a -> Lit2 (Full a)
data Num2 a where -- Numeric operations
Add2 :: Num a => Num2 (a :-> a :-> Full a)
Mul2 :: Num a => Num2 (a :-> a :-> Full a)
data Equ2 a where -- Equality
Equ2 :: Eq a => Equ2 (a :-> a :-> Full Bool)
data Cond2 a where -- Condition (and mux)
Cond2 :: Cond2 (Bool :-> a :-> a :-> Full a)
data Logic2 a where -- Logic operations
Inv2 :: Logic2 (Bool :-> Full Bool)
And2 :: Logic2 (Bool :-> Bool :-> Full Bool)
Or2 :: Logic2 (Bool :-> Bool :-> Full Bool)
Use higher-kinded version of the Either
type:
data (dom1 :+: dom2) :: * -> * where
InjL :: dom1 a -> (dom1 :+: dom2) a
InjR :: dom2 a -> (dom1 :+: dom2) a
infixr :+:
Now we get:
type Arith2 a = ASTF (Lit2 :+: Num2 :+: Equ2 :+: Cond2) a
type Signal2 a = ASTF (Lit2 :+: Logic2 :+: Cond2) a
Isormorphisms:
Arith2
Arith
Signal2
Signal
Constructing the expression :
ex1 :: Arith2 Int
ex1 = Sym (InjR (InjL Add2))
:$ Sym (InjL (Lit2 1))
:$ Sym (InjL (Lit2 2))
Type class magic to the resque:
lit :: (Show a, Lit2 :<: dom) => a -> ASTF dom a
lit a = appSym (Lit2 a)
(<+>) :: (Num a, Typeable a, Num2 :<: dom) =>
ASTF dom a -> ASTF dom a -> ASTF dom a
(<+>) = appSym Add2
Isomorphisms: lit
Lit
, <+>
Add
Informally, a constraint
Sym :<: dom
is valid iff.
dom = ... :+: Sym :+: ...
Possible types of lit
:
lit :: (Show a, Lit2 :<: dom) => a -> ASTF dom a
lit :: Int -> Arith2 Int
lit :: Bool -> Signal2 Bool
See Wouter Swierstra, Data types à la carte, JFP 2008.
(<*>) :: (Num a, Typeable a, Num2 :<: dom) =>
ASTF dom a -> ASTF dom a -> ASTF dom a
(<=>) :: (Eq a, Typeable a, Equ2 :<: dom) =>
ASTF dom a -> ASTF dom a -> ASTF dom Bool
cond :: (Typeable a, Cond2 :<: dom) =>
ASTF dom Bool -> ASTF dom a -> ASTF dom a -> ASTF dom a
(<*>) = appSym Mul2
(<=>) = appSym Equ2
cond = appSym Cond2
The denotation of a symbol:
type family Denotation a
type instance Denotation (Full a) = a
type instance Denotation (a :-> b) = a -> Denotation b
*Main> :kind! Denotation (Int :-> Bool :-> Full Char)
Denotation (Int :-> Bool :-> Full Char) :: *
= Int -> Bool -> Char
Generic code:
class Eval expr where
eval :: expr a -> Denotation a
instance Eval dom => Eval (AST dom) where
eval (Sym s) = eval s
eval (s :$ a) = eval s $ eval a
instance (Eval sub1, Eval sub2) => Eval (sub1 :+: sub2) where
eval (InjL s) = eval s
eval (InjR s) = eval s
Domain-specific code:
instance Eval Lit2 where
eval (Lit2 a) = a
instance Eval Num2 where
eval Add2 = (+)
eval Mul2 = (*)
instance Eval Equ2 where
eval Equ2 = (==)
instance Eval Cond2 where
eval Cond2 = \c t f -> if c then t else f
instance Eval Logic2 where
eval Inv2 = not
eval And2 = (&&)
eval Or2 = (||)
ex2 = cond (lit 10 <=> a) a a
where
a = lit 5 <+> lit (6 :: Int)
*Main> :t ex2
ex2
:: (Cond2 :<: dom, Equ2 :<: dom, Lit2 :<: dom, Num2 :<: dom) =>
ASTF dom Int
*Main> eval (ex2 :: Arith2 Int)
11
Generic code:
class Equality expr where
equal :: expr a -> expr b -> Bool
instance Equality dom => Equality (AST dom) where
equal (Sym s1) (Sym s2) = equal s1 s2
equal (s1 :$ a1) (s2 :$ a2) = equal s1 s2 && equal a1 a2
equal _ _ = False
instance (Equality sub1, Equality sub2) =>
Equality (sub1 :+: sub2) where
equal (InjL s1) (InjL s2) = equal s1 s2
equal (InjR s1) (InjR s2) = equal s1 s2
equal _ _ = False
Domain-specific code:
instance Equality Lit2 where
equal (Lit2 a) (Lit2 b) = show a == show b
instance Equality Num2 where
equal Add2 Add2 = True
equal Mul2 Mul2 = True
equal _ _ = False
instance Equality Equ2 where
equal Equ2 Equ2 = True
instance Equality Cond2 where
equal Cond2 Cond2 = True
instance Equality Logic2 where
equal Inv2 Inv2 = True
equal And2 And2 = True
equal Or2 Or2 = True
equal _ _ = False
The whole generic condition optimizer:
optimize :: (Cond2 :<: dom, Equality dom) => AST dom a -> AST dom a
optimize (cond :$ c :$ t :$ f)
| Just Cond2 <- prj cond
, equal t f = optimize t
optimize (s :$ a) = optimize s :$ optimize a
optimize sym = sym
(prj
from Data type à la carte)
Possible types:
optimize :: Arith2 a -> Arith2 a
optimize :: Signal2 a -> Signal2 a
What have we achieved so far?
Lit2
/lit
and Cond2
/cond
)eval
)optimize
)Some syntactic noise has been introduced (e.g. :$
and prj
), but the overhead is quite small.
The syntactic core library described in my ICFP submission:
Application of AST
s not enough to make a useful EDSL.
Need to make use of existing Haskell features.
Fibonacci in Feldspar:
fib :: Data Index -> Data Index
fib n = fst $ forLoop n (1,1) $ \i (a,b) -> (b,a+b)
Conversion between “sugar” types and AST
s:
class Typeable (Internal a) => Syntactic a dom | a -> dom where
type Internal a
desugar :: a -> ASTF dom (Internal a)
sugar :: ASTF dom (Internal a) -> a
(desugar . sugar)
must preserve semantics (eval
), but not necessarily syntax.data Pair a where
Pair :: Pair (a :-> b :-> Full (a,b))
Fst :: Pair ((a,b) :-> Full a)
Snd :: Pair ((a,b) :-> Full b)
instance (Syntactic a dom, Syntactic b dom, Pair :<: dom) =>
Syntactic (a,b) dom where
type Internal (a,b) = (Internal a, Internal b)
desugar (a,b) = appSym Pair (desugar a) (desugar b)
sugar ab = (sugar (appSym Fst ab), sugar (appSym Snd ab))
All variable binding can be handled by a single construct:
lambda :: ... =>
(ASTF dom a -> ASTF dom b) -> ASTF dom (a -> b)
How to implement?
Two options:
Unfortunately, can’t be implemented as a non-recursive AST
symbol.
Sufficiently good hack:
type VarId = Integer
data Variable a where
Variable :: Typeable a => VarId -> Variable (Full a)
data HOLambda dom a where
HOLambda :: (Typeable a, Typeable b)
=> (ASTF (HODomain dom) a -> ASTF (HODomain dom) b)
-> HOLambda dom (Full (a -> b))
type HODomain dom = HOLambda dom :+: Variable :+: dom
lambda :: (Typeable a, Typeable b)
=> (ASTF (HODomain dom) a -> ASTF (HODomain dom) b)
-> ASTF (HODomain dom) (a -> b)
lambda = appSym . HOLambda
First-order lambda:
data Lambda a where
Lambda :: Typeable a => VarId -> Lambda (b :-> Full (a -> b))
HOAST to “FOAST”:
reify :: Typeable a
=> AST (HODomain dom) a
-> AST (Lambda :+: Variable :+: dom) a
HODomain
is only used when constructing termsreify
reify
guarantees no HOLambda
s in resultThe Syntactic library provides interpretations and transformations that correctly deal with variables, e.g:
alphaEq :: AlphaEq dom dom dom [(VarId,VarId)] =>
ASTF dom a -> ASTF dom b -> Bool
instance (AlphaEq dom dom dom env, VarEqEnv env) =>
AlphaEq Lambda Lambda dom env
instance (AlphaEq dom dom dom env, VarEqEnv env) =>
AlphaEq Variable Variable dom env
...
By using lambda
for all binding constructs the DSL implementor gets important parts of the implementation for free!
forLoop
(soon)instance (Syntactic a (HODomain dom), Syntactic b (HODomain dom)) =>
Syntactic (a -> b) (HODomain dom) where
type Internal (a -> b) = Internal a -> Internal b
desugar f = lambda (desugar . f . sugar)
sugar = error "I don't make sense :("
http://hackage.haskell.org/package/feldspar-language
http://hackage.haskell.org/package/feldspar-compiler
Discrete cosine transform, naive implementation:
dct2 :: Vector (Data Float) -> Vector (Data Float)
dct2 vec = mat *** vec
where
n = length vec
mat = indexedMat n n $ \k l ->
let n' = i2f n
k' = i2f k
l' = i2f l
in cos ((k'*(2*l'+1)*pi)/(2*n'))
Vector
is another example of syntactic sugarVector
enables powerful optimizations on-the-fly (before generating an AST)void dct2(struct array * v0, struct array * out)
{
uint32_t v4;
float v5;
v4 = getLength(v0);
v5 = ((float)(v4) * 2.0f);
initArray(out, sizeof(float), v4);
for(uint32_t v1 = 0; v1 < v4; v1 += 1)
{
float v6;
float v3;
v6 = (float)(v1);
at(float,out,v1) = 0.0f;
for(uint32_t v2 = 0; v2 < v4; v2 += 1)
{
v3 = (at(float,out,v1) + (cosf((((v6 * (((float)(v2)
* 2.0f) + 1.0f)) * 3.1415927410125732f) / v5))
* at(float,v0,v2)));
at(float,out,v1) = v3;
}
}
}
Want to hide AST
and Syntactic
from the user:
newtype Data a = Data { unData :: ASTF FeldDomainAll a }
instance Type a => Syntactic (Data a) FeldDomainAll where
type Internal (Data a) = a
desugar = unData
sugar = Data
class (Syntactic a FeldDomainAll, Type (Internal a)) => Syntax a
instance Type a => Syntax (Data a)
For loop:
data Loop a where
ForLoop :: Type a =>
Loop (Length :-> a :-> (Index -> a -> a) :-> Full a)
type FeldDomainAll = HODomain (... :+: Loop :+: ...)
Semantics of ForLoop
:
instance Semantic Loop where
semantics ForLoop = Sem "forLoop" forLoop
where
forLoop 0 init _ = init
forLoop l init body = foldl (flip body) init [0..l-1]
instance ExprEq Loop where exprEq = exprEqSem
instance Render Loop where renderPart = renderPartSem
instance ToTree Loop
instance Eval Loop where evaluate = evaluateSem
instance EvalBind Loop where evalBindSym = evalBindSymDefault
instance AlphaEq dom dom dom env => AlphaEq Loop Loop dom env where
alphaEqSym = alphaEqSymDefault
Semantic
instanceforLoop :: Syntax st =>
Data Length -> st -> (Data Index -> st -> st) -> st
forLoop len init step =
sugar $ appSym ForLoop
(desugar len)
(desugar init)
(desugar step) -- Introduces two nested lambdas
The function sugarSym
automates the process:
forLoop2 :: Syntax st =>
Data Length -> st -> (Data Index -> st -> st) -> st
forLoop2 = sugarSym ForLoop
Another possible type:
genericForLoop ::
( Syntax len, Syntax st, Syntax ix
, Internal len ~ Index, Internal ix ~ Index) =>
len -> st -> (ix -> st -> st) -> st
genericForLoop = sugarSym ForLoop
Feldspar contains lots of constructs, all implemented more or less in the same way as ForLoop
.
Most of the implementation is straight-forward (ignoring the C back-end)!
Syntactic has made it very easy to experiment with new language constructs.
Downsides:
Mostly suited for expression languages
Mutually recursive data types somewhat unnatural to represent
Needs support for declarative rewrite rules
Complicated type hackery sometimes needed
Language support for composable data types?
Relation to Scala, Modelyze?
Embedding makes life easier for the implementor
…at the cost of the end-user
How to improve errors?
How to improve syntax?
fib :: Data Index -> Data Index
fib n = fst for _ < n
init (a,b) = (0,1)
step (b,a+b)
instead of
fib :: Data Index -> Data Index
fib n = fst $ forLoop n (1,1) $ \i (a,b) -> (b,a+b)
Syntax must be easily extensible (even by users)
Should be able to access derived combinators like
unroll :: Index -> st -> (Data Index -> st -> st) -> st
unroll n init step =
Prelude.foldl (\st i -> step (value i) st) init [0 .. n-1]
(Feldspar uses derived combinators extensively.)
Custom syntax (for example):
ex = unroll i < n
init s = 0
step s+i
instead of
ex = unroll n 0 (+)