% GenExp.lhs - generate expressions for the HPG.

% @(#)GenExp.lhs	1.20 dated 92/07/20 at 17:30:38

% Crown Copyright 1992

\section{Expression generation}

This module generates random expressions which evaluate to a given value.
The generation procedure is as follows:
\begin{enumerate}
\item Decide what sorts of expression could give the value.
    This is based on the value's type: for example integers can be
    generated by addition, multiplication etc.
\item Randomly choose one of the possible sorts of expression, for example
    addition.
\item Produce values which, when substituted into the expression, give the
    value to be generated: for example $3 + 1 = 4$ in the case of an addition
    expression to generate the value 4.
    There is a check at this stage to ensure that the desired value can be
    produced by the chosen sort of expression, for example we cannot
    produce zero by division.
    If this check fails, we return to step 2.
\item Recursively generate expressions evaluating to these new values.
\end{enumerate}
The complexity of an expression is governed by a depth parameter: a basic
value, or an expression consisting of a value, has depth one; a list, tuple,
constructor, array or application expression has depth one greater than that
of its deepest argument.

It is easy to extend the scope of the \HPG\ by adding extra kinds of expression
to be selected at stage 2 above.

\begin{haskell}

> module GenExp (
>     gen_exps
>     ) where

> import Config
> import Types
> import Env
> import Utils
> import GenVal

\end{haskell}

\prog{gen\_exps dp vnesc} applies \prog{vnesc} to a list of
(value name, expression) pairs where the value names are those of
the declarations in the value environment and the expressions evaluate
to the values corresponding to their value names.
\begin{haskell}

> gen_exps :: Int -> Xscont (Val_name, Expression) -> Cont
> gen_exps dp vnesc
>     =  get_all_val_decls
>        (\vds -> cmap [gen_exp dp v . c1 vn | (vn,v) <- vds] vnesc)
>        where
>        c1 vn vnec e  =  vnec (vn,e)

\end{haskell}

It is convenient to introduce a couple of new types at this stage,
for use later on.

\prog{Genfn x} is the type of functions which take a depth argument, \prog{dp},
and an element, \prog{v}, of the type \prog{x} and apply their \prog{Econt}
to an expression of depth $\leq$ \prog{dp} and value \prog{v}.

\prog{Fnlist x} is a type of lists of pairs, where the first element of
each pair is a weighting and the second is a pair whose first element is
an element of \prog{Genfn x} and whose second element is a filter on elements
of \prog{x}.
See the end of section~\ref{exp.num} for an example of its use.
\begin{haskell}

> type Genfn x   =  Int -> x -> Econt -> Cont
> type Fnlist x  =  [(Int, (Genfn x, x -> Bool))]

\end{haskell}

\prog{gen\_exp dp v ec} applies \prog{ec} to an expression of depth
$\leq$ \prog{dp} and value \prog{v}.
If \prog{dp} $\leq$ 1, then \prog{v} is returned, otherwise one of
the elements of \prog{general\_fns} is used to generate an expression.
\begin{haskell}

> gen_exp :: Genfn Value
> gen_exp dp v ec | dp <= one  =  ec (Val_exp v)
>                 | otherwise  =  pick_exp general_fns dp v ec

\end{haskell}

\prog{general\_fns} contains functions which are capable of generating
any type of expression.
There are currently two of these: \prog{gen\_exp'} generates an expression
based on the type of the value to be generated; \prog{gen\_lambda\_exp}
generates a lambda expression which evaluates to the value to be generated.
\begin{haskell}

> general_fns :: Fnlist Value
> general_fns  =  [(1, (gen_exp',ok)), (1, (gen_lambda_exp,ok))]

\end{haskell}

\prog{gen\_exp' dp v ec} applies \prog{ec} to an expression evaluating to
\prog{v}, of depth $\leq$ \prog{dp}, based on the type of \prog{v}.
The values \prog{intfns}, \prog{integerfns} etc are lists of functions
for generating values of a particular type; \prog{gen\_exp'} uses
\prog{pick\_exp} to select a random value from the appropriate list,
and applies it to \prog{v}.
\begin{haskell}

> gen_exp' :: Genfn Value
> gen_exp' dp n@(Num_val Int_type _) ec      =  pick_exp intfns dp n ec
> gen_exp' dp n@(Num_val Integer_type _) ec  =  pick_exp integerfns dp n ec
> gen_exp' dp n@(Num_val Float_type _) ec    =  pick_exp floatfns dp n ec
> gen_exp' dp n@(Num_val Double_type _) ec   =  pick_exp doublefns dp n ec
> gen_exp' dp (Bool_val b) ec       =  pick_exp boolfns dp b ec
> gen_exp' dp (Char_val c) ec       =  pick_exp charfns dp c ec
> gen_exp' dp (List_val vs) ec      =  pick_exp listfns dp vs ec
> gen_exp' dp (Tuple_val vs) ec     =  pick_exp tuplefns dp vs ec
> gen_exp' dp (Tagged_val c vs) ec  =  pick_exp taggedfns dp (c,vs) ec
> gen_exp' dp (Array_val vp avs) ec =  pick_exp arrayfns dp (vp,avs) ec

\end{haskell}
% When randomly generated functions are included they will have to be
% included in intfns etc, which will therefore have to be incorporated
% into the environment.

\subsection{Numeric expressions}
\label{exp.num}
\prog{gen\_plus\_exp dp n} generates a ``plus'' expression with value
\prog{n}.
It does this by generating expressions with values of 1 and \prog{n-1} and
adding them together.
The other functions in this section work in a similar manner.
\begin{haskell}

> gen_plus_exp :: Genfn Value
> gen_plus_exp dp (Num_val k n)
>     =  binary_exp plus_name (dp-one) (Num_val k 1) (Num_val k (n-one))

\end{haskell}

\prog{gen\_minus\_exp dp n} generates a ``minus'' expression with value
\prog{n}.
\begin{haskell}

> gen_minus_exp :: Genfn Value
> gen_minus_exp dp (Num_val k n)
>     =  binary_exp minus_name (dp-one) (Num_val k (n+one)) (Num_val k 1)

\end{haskell}

\prog{gen\_mult\_exp dp n} generates a ``multiply'' expression with value
\prog{n}.
\begin{haskell}

> gen_mult_exp :: Genfn Value
> gen_mult_exp dp n'@(Num_val k n)
>     =  binary_exp mult_name (dp-one) n' (Num_val k 1)

\end{haskell}

\prog{gen\_div\_exp dp n} generates a ``divide'' expression with value
\prog{n}.
It is only applicable to numbers of class \prog{Integral}.
There is an associated filter, \prog{div\_filter},
to ensure that \prog{n} is non-zero.
\begin{haskell}

> gen_div_exp :: Genfn Value
> gen_div_exp dp n'@(Num_val k n)
>     =  binary_exp div_name (dp-one) n' (Num_val k 1)

> div_filter :: Value -> Bool
> div_filter (Num_val _ n)  =  n /= 0

\end{haskell}

\prog{gen\_neg\_exp dp n} generates a ``negate'' expression with value
\prog{n}.
\begin{haskell}

> gen_neg_exp :: Genfn Value
> gen_neg_exp dp (Num_val k n)
>     =  unary_exp negate_name (dp-one) (Num_val k (negate n))

\end{haskell}

\prog{gen\_int\_id\_exp dp n} generates a ``plus'' expression whose first
element is an identifier from the lambda environment.
If the lambda environment is empty, it falls back on \prog{gen\_plus\_exp}.
See section~\ref{exp.lambda} for more on the contents of the lambda
environment.
\begin{haskell}

> gen_int_id_exp :: Genfn Value
> gen_int_id_exp dp n'@(Num_val k n) ec
>     =  get_all_lambdas (\vvs ->
>        case vvs of
>            []    -> gen_plus_exp dp n' ec
>            (_:_) -> choose vvs (\(vn, Num_val Int_type n'') ->
>                     gen_exp (dp-one) (int_val (n-n'')) (\e ->
>                     ec (Apply_exp (Apply_exp (Id_exp plus_name) (Id_exp vn))
>                                   e))))

\end{haskell}

\prog{intfns} is the collected list of functions for generating \prog{Int}
expressions, with their relative probabilities of selection.
\begin{haskell}

> intfns :: Fnlist Value
> intfns  =  [(1, (gen_plus_exp,ok)), (1, (gen_minus_exp,ok)),
>             (1, (gen_mult_exp,ok)), (1, (gen_div_exp,div_filter)),
>             (1, (gen_neg_exp,ok)),  (1, (gen_int_id_exp,ok))]

\end{haskell}

\prog{integerfns} is the collected list of functions for generating
\prog{Integer} expressions, with their relative probabilities of selection.
\begin{haskell}

> integerfns :: Fnlist Value
> integerfns  =  intfns

\end{haskell}

\prog{floatfns} is the collected list of functions for generating \prog{Float}
expressions, with their relative probabilities of selection.
\begin{haskell}

> floatfns :: Fnlist Value
> floatfns  =  [(1, (gen_plus_exp,ok)), (1, (gen_minus_exp,ok)),
>               (1, (gen_mult_exp,ok)), (1, (gen_neg_exp,ok)),
>               (1, (gen_int_id_exp,ok))]

\end{haskell}

\prog{doublefns} is the collected list of functions for generating
\prog{Double} expressions, with their relative probabilities of selection.
\begin{haskell}

> doublefns :: Fnlist Value
> doublefns  =  floatfns

\end{haskell}

\subsection{Boolean expressions}
\prog{gen\_not\_exp dp b} generates a ``not'' expression
with value \prog{b}.
\begin{haskell}

> gen_not_exp :: Genfn Bool
> gen_not_exp dp b  =  unary_exp not_name (dp-one) (Bool_val (not b))

\end{haskell}

\prog{gen\_and\_exp dp b} generates an ``and'' expression with value
\prog{b}.
\begin{haskell}

> gen_and_exp :: Genfn Bool
> gen_and_exp dp b   =  bool_binary_exp and_name (dp-one) True b

\end{haskell}
The definition above raises a nice point regarding lazy evaluation: we could
as easily have written \prog{b~True} as \prog{True~b} but, when \prog{b}
is \prog{False}, the former would give an expression in which the second
operand of the expression was not evaluated.
The latter forces evaluation of both operands, and thus tests the compiler
more comprehensively.
The same remark applies to the following function.

\prog{gen\_or\_exp dp b} generates an ``or'' expression with value
\prog{b}.
\begin{haskell}

> gen_or_exp :: Genfn Bool
> gen_or_exp dp b   =  bool_binary_exp or_name (dp-one) False b

\end{haskell}

\prog{gen\_less\_exp dp b}  generates an integer ``less than'' expression
with value \prog{b}.
Clearly it is simple to extend the \HPG\ with other comparison expressions
yielding a boolean result.
\begin{haskell}

> gen_less_exp :: Genfn Bool
> gen_less_exp dp True   =  int_binary_exp less_name (dp-one) 0 1
> gen_less_exp dp False  =  int_binary_exp less_name (dp-one) 1 1

\end{haskell}

\prog{boolfns} is the collected list of functions for generating boolean
expressions, with their relative probabilities of selection.
\begin{haskell}

> boolfns :: Fnlist Bool
> boolfns  =  [(1, (gen_not_exp,ok)), (1, (gen_and_exp,ok)),
>              (1, (gen_or_exp,ok)),  (1, (gen_less_exp,ok))]

\end{haskell}

\subsection{Character expressions}
\prog{gen\_decode\_exp dp c} generates a int-to-char expression with
value \prog{c}.
\begin{haskell}

> gen_decode_exp :: Genfn Char
> gen_decode_exp dp c  =  unary_exp int_to_char_name (dp-one) (int_val (fromEnum c))

\end{haskell}

\prog{charfns} is the collected list of functions for generating character
expressions, with their relative probabilities of selection.
\begin{haskell}

> charfns :: Fnlist Char
> charfns  =  [(1, (gen_decode_exp,ok))]

\end{haskell}

\subsection{List expressions}
\prog{gen\_list\_exp dp vs ec} generates an enumerated list expression.
We decrease the depth counter by one on recursion to prevent explosion of
the generated expression, since the other list functions increase the
size considerably.
\begin{haskell}

> gen_list_exp :: Genfn [Value]
> gen_list_exp dp vs ec  =  cmap (map (gen_exp (dp-one)) vs)
>                           (\es -> ec (List_exp es))

\end{haskell}

\prog{gen\_drop\_exp dp vs} generates a ``drop'' expression.
This definition just concatenates the list to itself and then drops the
first half.
\begin{haskell}

> gen_drop_exp :: Genfn [Value]
> gen_drop_exp dp vs
>     =  binary_exp drop_name (dp-one) (int_val (length vs)) (List_val (vs++vs))

\end{haskell}

\prog{gen\_take\_exp dp vs} generates a ``take'' expression.
This definition just concatenates the list to itself and then takes the
first half.
\begin{haskell}

> gen_take_exp :: Genfn [Value]
> gen_take_exp dp vs
>     =  binary_exp take_name (dp-one) (int_val (length vs)) (List_val (vs++vs))

\end{haskell}

%This subsection also needs something to generate ``ZF''expressions.

\prog{listfns} is the collected list of functions for generating list
expressions, with their relative probabilities of selection.
\prog{gen\_list\_exp} has a high relative probability of selection to
prevent explosion of the generated expression --- \prog{gen\_drop\_exp}
and \prog{gen\_take\_exp} each double the size of the expression generated.
\begin{haskell}

> listfns :: Fnlist [Value]
> listfns  =  [(8, (gen_list_exp,ok)), (1, (gen_drop_exp,ok)),
>              (1, (gen_take_exp,ok))]

\end{haskell}

\subsection{Tuple expressions}
\prog{gen\_tuple\_exp dp vs ec} generates a ``tuple'' expression by just
enumerating the elements.
The \HPG\ does not currently generate any expressions containing functions
whose result is a tuple.
\begin{haskell}

> gen_tuple_exp :: Genfn [Value]
> gen_tuple_exp dp vs ec  =  cmap (map (gen_exp dp) vs)
>                            (\es -> ec (Tuple_exp es))

\end{haskell}

\prog{tuplefns} is the collected list of functions for generating tuple
expressions, with their relative probabilities of selection.
\begin{haskell}

> tuplefns :: Fnlist [Value]
> tuplefns  =  [(1, (gen_tuple_exp,ok))]

\end{haskell}

\subsection{Tagged expressions}
\prog{gen\_tuple\_exp dp (c,vs)} generates a ``tagged'' expression.
It does this by just generating expressions which evaluate to the
arguments of the constructor of the value to be generated.
\begin{haskell}

> gen_tagged_exp :: Genfn (Constructor, [Value])
> gen_tagged_exp dp (c,vs) ec  =  cmap (map (gen_exp dp) vs)
>                                 (\es -> ec (Tagged_exp c es))

\end{haskell}

\prog{taggedfns} is the collected list of functions for generating tagged
expressions, with their relative probabilities of selection.
\begin{haskell}

> taggedfns :: Fnlist (Constructor, [Value])
> taggedfns  =  [(1, (gen_tagged_exp,ok))]

\end{haskell}

\subsection{Array expressions}

\prog{gen\_array\_exp dp ((v,v'), avs)} generates an array expression.
It separately generates expressions evaluating to the array bounds and
to the association values comprising the array, and then reassembles
them to form the array.
\begin{haskell}

> type Assoc a b = (a,b)
>
> gen_array_exp :: Genfn ((Value,Value), [Assoc Value Value])
> gen_array_exp dp ((v,v'), avs) ec
>     =  gen_exp (dp-one) (Tuple_val [v,v'])
>        (\e -> cmap [binary_exp assoc_name (dp-one) v1 v2
>                         | (v1, v2) <- avs]
>        (\es -> ec (Array_exp e (List_exp es))))

\end{haskell}

\prog{arrayfns} is the collected list of functions for generating array
expressions, with their relative probabilities of selection.
\begin{haskell}

> arrayfns :: Fnlist ((Value,Value), [Assoc Value Value])
> arrayfns  =  [(1, (gen_array_exp,ok))]

\end{haskell}

\subsection{Lambda expressions}
\label{exp.lambda}
\prog{gen\_lambda\_exp dp v ec} generates a lambda expression with value
\prog{v}.
It does this by generating a \prog{Val\_name}, \prog{vn}, and an integer
value, \prog{v'}, which it pushes onto the lambda environment; it then
generates an expression, \prog{e}, with value \prog{v} in the modified
environment, pops the top element of the lambda environment and applies
\prog{ec} to one of two expressions.
The first expression is \prog{($\backslash$vn -> e) vn}, where \prog{(vn,v')}
has been added to the value environment;
the second expression is \prog{($\backslash$vn -> e) e'} where \prog{e'} is
an expression evaluating to \prog{v'}.
\begin{haskell}

> gen_lambda_exp :: Genfn Value
> gen_lambda_exp dp v ec
>     = get_val_names one (\[vn] ->
>       gen_id_val (dp-one) (Basic_type (Num_type Int_type)) (\v' ->
>       push_lambda (vn, v') (
>       gen_exp (dp-one) v (\e ->
>       pop_lambda (\ _ ->
>       choose [extend_val_env [(vn,v')] (ec (Apply_exp (Lambda_exp vn e)
>                                                       (Id_exp vn))),
>               gen_exp (dp-one) v'
>                       (\e' -> ec (Apply_exp (Lambda_exp vn e) e'))]
>              id)))))

\end{haskell}

\subsection{Auxiliary functions}
\prog{pick\_exp fs dp x ec} applies \prog{ec} to an expression of depth
$\leq$ \prog{dp} with value
\prog{x}, generated by a function chosen at random from \prog{fs}.
Each function in \prog{fs} comes with a filter which determines whether
the function is capable of generating the value \prog{x}; if not, then
\prog{pick\_exp} is called again.
\begin{haskell}

> pick_exp :: [(Int, (Genfn x, x -> Bool))] -> Genfn x
> pick_exp fs dp x ec
>     =  choosew fs (\(f,p) ->
>            if p x then f dp x ec else pick_exp fs dp x ec)

\end{haskell}

\prog{unary\_exp vn dp v ec} applies \prog{ec} to an ``apply'' expression
whose first element is an ``identifier'' expression with value name
\prog{vn} and whose second element is an expression of depth \prog{dp}
and value \prog{v}.
\begin{haskell}

> unary_exp :: Val_name -> Genfn Value
> unary_exp vn dp v ec  =  gen_exp dp v (\e -> ec (Apply_exp (Id_exp vn) e))

\end{haskell}

\prog{binary\_exp vn dp v v'ec} applies \prog{ec} to an ``apply'' expression
whose first element is itself an ``apply'' expression with first element
an ``identifier'' expression with value name
\prog{vn} and second element an expression of depth \prog{dp}
and value \prog{v}; the second element is an expression of depth \prog{dp}
and value \prog{v'}.
\begin{haskell}

> binary_exp :: Val_name -> Int -> Value -> Value -> Econt -> Cont
> binary_exp vn dp v v' ec
>     =  unary_exp vn dp v
>        (\e -> gen_exp dp v'
>        (\e' -> ec (Apply_exp e e')))

\end{haskell}

\prog{int\_binary\_exp vn dp n n'} and \prog{bool\_binary\_exp vn dp b b'}
are abbreviations for uses of \prog{binary\_exp} for generating integers
and booleans respectively.
\begin{haskell}

> int_binary_exp :: Val_name -> Int -> Int -> Int -> Econt -> Cont
> int_binary_exp vn dp n n'   =  binary_exp vn dp (int_val n) (int_val n')

> bool_binary_exp :: Val_name -> Int -> Bool -> Bool -> Econt -> Cont
> bool_binary_exp vn dp b b'  =  binary_exp vn dp (Bool_val b) (Bool_val b')

\end{haskell}

\prog{ok} is a filter which always returns \prog{True}.
\begin{haskell}

> ok :: x -> Bool
> ok x  =  True

\end{haskell}