% GenExp.lhs - generate expressions for the HPG. % @(#)GenExp.lhs 1.20 dated 92/07/20 at 17:30:38 % Crown Copyright 1992 \section{Expression generation} This module generates random expressions which evaluate to a given value. The generation procedure is as follows: \begin{enumerate} \item Decide what sorts of expression could give the value. This is based on the value's type: for example integers can be generated by addition, multiplication etc. \item Randomly choose one of the possible sorts of expression, for example addition. \item Produce values which, when substituted into the expression, give the value to be generated: for example $3 + 1 = 4$ in the case of an addition expression to generate the value 4. There is a check at this stage to ensure that the desired value can be produced by the chosen sort of expression, for example we cannot produce zero by division. If this check fails, we return to step 2. \item Recursively generate expressions evaluating to these new values. \end{enumerate} The complexity of an expression is governed by a depth parameter: a basic value, or an expression consisting of a value, has depth one; a list, tuple, constructor, array or application expression has depth one greater than that of its deepest argument. It is easy to extend the scope of the \HPG\ by adding extra kinds of expression to be selected at stage 2 above. \begin{haskell} > module GenExp ( > gen_exps > ) where > import Config > import Types > import Env > import Utils > import GenVal \end{haskell} \prog{gen\_exps dp vnesc} applies \prog{vnesc} to a list of (value name, expression) pairs where the value names are those of the declarations in the value environment and the expressions evaluate to the values corresponding to their value names. \begin{haskell} > gen_exps :: Int -> Xscont (Val_name, Expression) -> Cont > gen_exps dp vnesc > = get_all_val_decls > (\vds -> cmap [gen_exp dp v . c1 vn | (vn,v) <- vds] vnesc) > where > c1 vn vnec e = vnec (vn,e) \end{haskell} It is convenient to introduce a couple of new types at this stage, for use later on. \prog{Genfn x} is the type of functions which take a depth argument, \prog{dp}, and an element, \prog{v}, of the type \prog{x} and apply their \prog{Econt} to an expression of depth $\leq$ \prog{dp} and value \prog{v}. \prog{Fnlist x} is a type of lists of pairs, where the first element of each pair is a weighting and the second is a pair whose first element is an element of \prog{Genfn x} and whose second element is a filter on elements of \prog{x}. See the end of section~\ref{exp.num} for an example of its use. \begin{haskell} > type Genfn x = Int -> x -> Econt -> Cont > type Fnlist x = [(Int, (Genfn x, x -> Bool))] \end{haskell} \prog{gen\_exp dp v ec} applies \prog{ec} to an expression of depth $\leq$ \prog{dp} and value \prog{v}. If \prog{dp} $\leq$ 1, then \prog{v} is returned, otherwise one of the elements of \prog{general\_fns} is used to generate an expression. \begin{haskell} > gen_exp :: Genfn Value > gen_exp dp v ec | dp <= one = ec (Val_exp v) > | otherwise = pick_exp general_fns dp v ec \end{haskell} \prog{general\_fns} contains functions which are capable of generating any type of expression. There are currently two of these: \prog{gen\_exp'} generates an expression based on the type of the value to be generated; \prog{gen\_lambda\_exp} generates a lambda expression which evaluates to the value to be generated. \begin{haskell} > general_fns :: Fnlist Value > general_fns = [(1, (gen_exp',ok)), (1, (gen_lambda_exp,ok))] \end{haskell} \prog{gen\_exp' dp v ec} applies \prog{ec} to an expression evaluating to \prog{v}, of depth $\leq$ \prog{dp}, based on the type of \prog{v}. The values \prog{intfns}, \prog{integerfns} etc are lists of functions for generating values of a particular type; \prog{gen\_exp'} uses \prog{pick\_exp} to select a random value from the appropriate list, and applies it to \prog{v}. \begin{haskell} > gen_exp' :: Genfn Value > gen_exp' dp n@(Num_val Int_type _) ec = pick_exp intfns dp n ec > gen_exp' dp n@(Num_val Integer_type _) ec = pick_exp integerfns dp n ec > gen_exp' dp n@(Num_val Float_type _) ec = pick_exp floatfns dp n ec > gen_exp' dp n@(Num_val Double_type _) ec = pick_exp doublefns dp n ec > gen_exp' dp (Bool_val b) ec = pick_exp boolfns dp b ec > gen_exp' dp (Char_val c) ec = pick_exp charfns dp c ec > gen_exp' dp (List_val vs) ec = pick_exp listfns dp vs ec > gen_exp' dp (Tuple_val vs) ec = pick_exp tuplefns dp vs ec > gen_exp' dp (Tagged_val c vs) ec = pick_exp taggedfns dp (c,vs) ec > gen_exp' dp (Array_val vp avs) ec = pick_exp arrayfns dp (vp,avs) ec \end{haskell} % When randomly generated functions are included they will have to be % included in intfns etc, which will therefore have to be incorporated % into the environment. \subsection{Numeric expressions} \label{exp.num} \prog{gen\_plus\_exp dp n} generates a ``plus'' expression with value \prog{n}. It does this by generating expressions with values of 1 and \prog{n-1} and adding them together. The other functions in this section work in a similar manner. \begin{haskell} > gen_plus_exp :: Genfn Value > gen_plus_exp dp (Num_val k n) > = binary_exp plus_name (dp-one) (Num_val k 1) (Num_val k (n-one)) \end{haskell} \prog{gen\_minus\_exp dp n} generates a ``minus'' expression with value \prog{n}. \begin{haskell} > gen_minus_exp :: Genfn Value > gen_minus_exp dp (Num_val k n) > = binary_exp minus_name (dp-one) (Num_val k (n+one)) (Num_val k 1) \end{haskell} \prog{gen\_mult\_exp dp n} generates a ``multiply'' expression with value \prog{n}. \begin{haskell} > gen_mult_exp :: Genfn Value > gen_mult_exp dp n'@(Num_val k n) > = binary_exp mult_name (dp-one) n' (Num_val k 1) \end{haskell} \prog{gen\_div\_exp dp n} generates a ``divide'' expression with value \prog{n}. It is only applicable to numbers of class \prog{Integral}. There is an associated filter, \prog{div\_filter}, to ensure that \prog{n} is non-zero. \begin{haskell} > gen_div_exp :: Genfn Value > gen_div_exp dp n'@(Num_val k n) > = binary_exp div_name (dp-one) n' (Num_val k 1) > div_filter :: Value -> Bool > div_filter (Num_val _ n) = n /= 0 \end{haskell} \prog{gen\_neg\_exp dp n} generates a ``negate'' expression with value \prog{n}. \begin{haskell} > gen_neg_exp :: Genfn Value > gen_neg_exp dp (Num_val k n) > = unary_exp negate_name (dp-one) (Num_val k (negate n)) \end{haskell} \prog{gen\_int\_id\_exp dp n} generates a ``plus'' expression whose first element is an identifier from the lambda environment. If the lambda environment is empty, it falls back on \prog{gen\_plus\_exp}. See section~\ref{exp.lambda} for more on the contents of the lambda environment. \begin{haskell} > gen_int_id_exp :: Genfn Value > gen_int_id_exp dp n'@(Num_val k n) ec > = get_all_lambdas (\vvs -> > case vvs of > [] -> gen_plus_exp dp n' ec > (_:_) -> choose vvs (\(vn, Num_val Int_type n'') -> > gen_exp (dp-one) (int_val (n-n'')) (\e -> > ec (Apply_exp (Apply_exp (Id_exp plus_name) (Id_exp vn)) > e)))) \end{haskell} \prog{intfns} is the collected list of functions for generating \prog{Int} expressions, with their relative probabilities of selection. \begin{haskell} > intfns :: Fnlist Value > intfns = [(1, (gen_plus_exp,ok)), (1, (gen_minus_exp,ok)), > (1, (gen_mult_exp,ok)), (1, (gen_div_exp,div_filter)), > (1, (gen_neg_exp,ok)), (1, (gen_int_id_exp,ok))] \end{haskell} \prog{integerfns} is the collected list of functions for generating \prog{Integer} expressions, with their relative probabilities of selection. \begin{haskell} > integerfns :: Fnlist Value > integerfns = intfns \end{haskell} \prog{floatfns} is the collected list of functions for generating \prog{Float} expressions, with their relative probabilities of selection. \begin{haskell} > floatfns :: Fnlist Value > floatfns = [(1, (gen_plus_exp,ok)), (1, (gen_minus_exp,ok)), > (1, (gen_mult_exp,ok)), (1, (gen_neg_exp,ok)), > (1, (gen_int_id_exp,ok))] \end{haskell} \prog{doublefns} is the collected list of functions for generating \prog{Double} expressions, with their relative probabilities of selection. \begin{haskell} > doublefns :: Fnlist Value > doublefns = floatfns \end{haskell} \subsection{Boolean expressions} \prog{gen\_not\_exp dp b} generates a ``not'' expression with value \prog{b}. \begin{haskell} > gen_not_exp :: Genfn Bool > gen_not_exp dp b = unary_exp not_name (dp-one) (Bool_val (not b)) \end{haskell} \prog{gen\_and\_exp dp b} generates an ``and'' expression with value \prog{b}. \begin{haskell} > gen_and_exp :: Genfn Bool > gen_and_exp dp b = bool_binary_exp and_name (dp-one) True b \end{haskell} The definition above raises a nice point regarding lazy evaluation: we could as easily have written \prog{b~True} as \prog{True~b} but, when \prog{b} is \prog{False}, the former would give an expression in which the second operand of the expression was not evaluated. The latter forces evaluation of both operands, and thus tests the compiler more comprehensively. The same remark applies to the following function. \prog{gen\_or\_exp dp b} generates an ``or'' expression with value \prog{b}. \begin{haskell} > gen_or_exp :: Genfn Bool > gen_or_exp dp b = bool_binary_exp or_name (dp-one) False b \end{haskell} \prog{gen\_less\_exp dp b} generates an integer ``less than'' expression with value \prog{b}. Clearly it is simple to extend the \HPG\ with other comparison expressions yielding a boolean result. \begin{haskell} > gen_less_exp :: Genfn Bool > gen_less_exp dp True = int_binary_exp less_name (dp-one) 0 1 > gen_less_exp dp False = int_binary_exp less_name (dp-one) 1 1 \end{haskell} \prog{boolfns} is the collected list of functions for generating boolean expressions, with their relative probabilities of selection. \begin{haskell} > boolfns :: Fnlist Bool > boolfns = [(1, (gen_not_exp,ok)), (1, (gen_and_exp,ok)), > (1, (gen_or_exp,ok)), (1, (gen_less_exp,ok))] \end{haskell} \subsection{Character expressions} \prog{gen\_decode\_exp dp c} generates a int-to-char expression with value \prog{c}. \begin{haskell} > gen_decode_exp :: Genfn Char > gen_decode_exp dp c = unary_exp int_to_char_name (dp-one) (int_val (fromEnum c)) \end{haskell} \prog{charfns} is the collected list of functions for generating character expressions, with their relative probabilities of selection. \begin{haskell} > charfns :: Fnlist Char > charfns = [(1, (gen_decode_exp,ok))] \end{haskell} \subsection{List expressions} \prog{gen\_list\_exp dp vs ec} generates an enumerated list expression. We decrease the depth counter by one on recursion to prevent explosion of the generated expression, since the other list functions increase the size considerably. \begin{haskell} > gen_list_exp :: Genfn [Value] > gen_list_exp dp vs ec = cmap (map (gen_exp (dp-one)) vs) > (\es -> ec (List_exp es)) \end{haskell} \prog{gen\_drop\_exp dp vs} generates a ``drop'' expression. This definition just concatenates the list to itself and then drops the first half. \begin{haskell} > gen_drop_exp :: Genfn [Value] > gen_drop_exp dp vs > = binary_exp drop_name (dp-one) (int_val (length vs)) (List_val (vs++vs)) \end{haskell} \prog{gen\_take\_exp dp vs} generates a ``take'' expression. This definition just concatenates the list to itself and then takes the first half. \begin{haskell} > gen_take_exp :: Genfn [Value] > gen_take_exp dp vs > = binary_exp take_name (dp-one) (int_val (length vs)) (List_val (vs++vs)) \end{haskell} %This subsection also needs something to generate ``ZF''expressions. \prog{listfns} is the collected list of functions for generating list expressions, with their relative probabilities of selection. \prog{gen\_list\_exp} has a high relative probability of selection to prevent explosion of the generated expression --- \prog{gen\_drop\_exp} and \prog{gen\_take\_exp} each double the size of the expression generated. \begin{haskell} > listfns :: Fnlist [Value] > listfns = [(8, (gen_list_exp,ok)), (1, (gen_drop_exp,ok)), > (1, (gen_take_exp,ok))] \end{haskell} \subsection{Tuple expressions} \prog{gen\_tuple\_exp dp vs ec} generates a ``tuple'' expression by just enumerating the elements. The \HPG\ does not currently generate any expressions containing functions whose result is a tuple. \begin{haskell} > gen_tuple_exp :: Genfn [Value] > gen_tuple_exp dp vs ec = cmap (map (gen_exp dp) vs) > (\es -> ec (Tuple_exp es)) \end{haskell} \prog{tuplefns} is the collected list of functions for generating tuple expressions, with their relative probabilities of selection. \begin{haskell} > tuplefns :: Fnlist [Value] > tuplefns = [(1, (gen_tuple_exp,ok))] \end{haskell} \subsection{Tagged expressions} \prog{gen\_tuple\_exp dp (c,vs)} generates a ``tagged'' expression. It does this by just generating expressions which evaluate to the arguments of the constructor of the value to be generated. \begin{haskell} > gen_tagged_exp :: Genfn (Constructor, [Value]) > gen_tagged_exp dp (c,vs) ec = cmap (map (gen_exp dp) vs) > (\es -> ec (Tagged_exp c es)) \end{haskell} \prog{taggedfns} is the collected list of functions for generating tagged expressions, with their relative probabilities of selection. \begin{haskell} > taggedfns :: Fnlist (Constructor, [Value]) > taggedfns = [(1, (gen_tagged_exp,ok))] \end{haskell} \subsection{Array expressions} \prog{gen\_array\_exp dp ((v,v'), avs)} generates an array expression. It separately generates expressions evaluating to the array bounds and to the association values comprising the array, and then reassembles them to form the array. \begin{haskell} > type Assoc a b = (a,b) > > gen_array_exp :: Genfn ((Value,Value), [Assoc Value Value]) > gen_array_exp dp ((v,v'), avs) ec > = gen_exp (dp-one) (Tuple_val [v,v']) > (\e -> cmap [binary_exp assoc_name (dp-one) v1 v2 > | (v1, v2) <- avs] > (\es -> ec (Array_exp e (List_exp es)))) \end{haskell} \prog{arrayfns} is the collected list of functions for generating array expressions, with their relative probabilities of selection. \begin{haskell} > arrayfns :: Fnlist ((Value,Value), [Assoc Value Value]) > arrayfns = [(1, (gen_array_exp,ok))] \end{haskell} \subsection{Lambda expressions} \label{exp.lambda} \prog{gen\_lambda\_exp dp v ec} generates a lambda expression with value \prog{v}. It does this by generating a \prog{Val\_name}, \prog{vn}, and an integer value, \prog{v'}, which it pushes onto the lambda environment; it then generates an expression, \prog{e}, with value \prog{v} in the modified environment, pops the top element of the lambda environment and applies \prog{ec} to one of two expressions. The first expression is \prog{($\backslash$vn -> e) vn}, where \prog{(vn,v')} has been added to the value environment; the second expression is \prog{($\backslash$vn -> e) e'} where \prog{e'} is an expression evaluating to \prog{v'}. \begin{haskell} > gen_lambda_exp :: Genfn Value > gen_lambda_exp dp v ec > = get_val_names one (\[vn] -> > gen_id_val (dp-one) (Basic_type (Num_type Int_type)) (\v' -> > push_lambda (vn, v') ( > gen_exp (dp-one) v (\e -> > pop_lambda (\ _ -> > choose [extend_val_env [(vn,v')] (ec (Apply_exp (Lambda_exp vn e) > (Id_exp vn))), > gen_exp (dp-one) v' > (\e' -> ec (Apply_exp (Lambda_exp vn e) e'))] > id))))) \end{haskell} \subsection{Auxiliary functions} \prog{pick\_exp fs dp x ec} applies \prog{ec} to an expression of depth $\leq$ \prog{dp} with value \prog{x}, generated by a function chosen at random from \prog{fs}. Each function in \prog{fs} comes with a filter which determines whether the function is capable of generating the value \prog{x}; if not, then \prog{pick\_exp} is called again. \begin{haskell} > pick_exp :: [(Int, (Genfn x, x -> Bool))] -> Genfn x > pick_exp fs dp x ec > = choosew fs (\(f,p) -> > if p x then f dp x ec else pick_exp fs dp x ec) \end{haskell} \prog{unary\_exp vn dp v ec} applies \prog{ec} to an ``apply'' expression whose first element is an ``identifier'' expression with value name \prog{vn} and whose second element is an expression of depth \prog{dp} and value \prog{v}. \begin{haskell} > unary_exp :: Val_name -> Genfn Value > unary_exp vn dp v ec = gen_exp dp v (\e -> ec (Apply_exp (Id_exp vn) e)) \end{haskell} \prog{binary\_exp vn dp v v'ec} applies \prog{ec} to an ``apply'' expression whose first element is itself an ``apply'' expression with first element an ``identifier'' expression with value name \prog{vn} and second element an expression of depth \prog{dp} and value \prog{v}; the second element is an expression of depth \prog{dp} and value \prog{v'}. \begin{haskell} > binary_exp :: Val_name -> Int -> Value -> Value -> Econt -> Cont > binary_exp vn dp v v' ec > = unary_exp vn dp v > (\e -> gen_exp dp v' > (\e' -> ec (Apply_exp e e'))) \end{haskell} \prog{int\_binary\_exp vn dp n n'} and \prog{bool\_binary\_exp vn dp b b'} are abbreviations for uses of \prog{binary\_exp} for generating integers and booleans respectively. \begin{haskell} > int_binary_exp :: Val_name -> Int -> Int -> Int -> Econt -> Cont > int_binary_exp vn dp n n' = binary_exp vn dp (int_val n) (int_val n') > bool_binary_exp :: Val_name -> Int -> Bool -> Bool -> Econt -> Cont > bool_binary_exp vn dp b b' = binary_exp vn dp (Bool_val b) (Bool_val b') \end{haskell} \prog{ok} is a filter which always returns \prog{True}. \begin{haskell} > ok :: x -> Bool > ok x = True \end{haskell}