÷ƒ’À;è TeX output 1999.06.22:1358‹ÿÿÿÿ •ºâ ý? £ þŸŒ÷‘s=óX«Q ff cmr12ºHaskš›¼ell–³/and“XML:“Generic“Com˜binators“or“T˜ypdCe-Based“T‘þÓ4ranslation?ŽŸy”’VÜóKñ`y ó3 cmr10»Malcolm–¦fW‘ÿeallace“and“Colin“RuncimanŽŽŽŽŽŸï#’²²Univ•²!ersit“y–¦fof“Y‘ÿeork,“UKŽŽ ú þ"‘íºâót ‰: cmbx9ÄAbstractŽ©阑íºâóo´‹Ç cmr9¹W‘ÿ:«e–¾£presenš¾9t“t˜w˜o“complemen˜tary“approac˜hes“to“writing“XMLŽ¤ ‘íºâdo•AÇcumen¾9t-pro“cessing–Tapplications“in“a“functional“language.Ž¡‘û:âIn–Úthe“ rst“approac¾9h,‘»the“generic“tree“structure“of“XMLŽ¡‘íºâdoAÇcumen¾9ts–ô«is“used“as“the“basis“for“the“design“of“a“libraryŽ¡‘íºâof–ÔMcom¾9binators“for“generic“proAÇcessing:‘ûíselection,‘áOgeneration,Ž¡‘íºâand–Ttransformation“of“XML“trees.Ž¡‘û:âThe–wsecond“approacš¾9h“is“to“use“a“t˜ypAÇe-translation“frame-Ž¡‘íºâwš¾9ork–,†for“treating“XML‘,KdoAÇcumen˜t“t˜ypAÇe“de nitions“(DTDs)“asŽ¡‘íºâdeclarations–ãžof“algebraic“data“t¾9ypAÇes,‘íand“a“deriv‘ÿ|ration“of“theŽ¡‘íºâcorrespšAÇonding–+jfunctions“for“reading“and“writing“do˜cumen¾9tsŽ¡‘íºâas–Ttš¾9ypAÇed“v‘ÿ|ralues“in“Hask˜ell.ŽŸü‘íºâÄ1Ž‘ý´oIn´CtroK¼ductionŽ¦‘íºâ1.1Ž‘üñDoK¼cumen´Ct–ŒÊmarkup“languagesŽŸÏþ‘íºâ¹XML‘¢(Extensible–žMarkup“Language)“[1Ž‘Ÿþ]“is“a“recen¾9t“sim-Ž¡‘íºâpli cation–û•of“the“older“SGML‘û(Standardised“GeneralisedŽ¡‘íºâMarkup–­3Language)“standard“that“is“widely“used“in“the“pub-Ž¡‘íºâlishing–„äindustry‘ÿ:«.‘kIt“is“a“markup“language,‘àÇmeaning“thatŽ¡‘íºâit–cadds“structural“information“around“the“text“of“a“doAÇcu-Ž¡‘íºâmenš¾9t.‘O:It–{—is“extensible,‘•(meaning“that“the“v˜oAÇcabulary“of“theŽ¡‘íºâmarkup–§is“not“ xed“{“eacš¾9h“doAÇcumen˜t“can“con˜tain“or“refer-Ž¡‘íºâence–žùa“meta-došAÇcumen¾9t,‘¶¥called“a“DTD‘žÛ(Do˜cumenš¾9t“T˜ypAÇe“Def-Ž¡‘íºâinition),‘0¸whic¾9h–+=describAÇes“the“particular“markup“capabilitiesŽ¡‘íºâused.Ž¡‘û:âThe–…Ause“of“XML‘…$is“not“ho•¾9w“ev“er–…Arestricted“to“the“tradi-Ž¡‘íºâtional–Ðidea“of“a“došAÇcumen•¾9t.‘ÉDMan“y–Ðorganisations“are“prop˜osingŽ¡‘íºâto–3¾use“XML‘3„as“an“in•¾9terc“hange–3¾format“for“pure“data“proAÇducedŽ¡‘íºâbš¾9y–©¢applications“lik˜e“graph-plotters,–εspreadsheets,“and‘©¢rela-Ž¡‘íºâtional‘Tdatabases.Ž¡‘û:âHTML‘éŽ(HypAÇer-T‘ÿ:«ext–êIMarkup“Language)“is“one“w¾9ell-Ž¡‘íºâknoš¾9wn–¹ƒexample“of“an“instance“of“SGML‘¹{“ev˜ery“HTMLŽ¡‘íºâdošAÇcumen¾9t–is“an“SGML‘ûdo˜cumen¾9t“conforming“to“a“particu-Ž¡‘íºâlar–oDTD.“Where“XML‘mimpro•¾9v“es›oo“v“er˜SGML‘mis˜in˜remo“vingŽ¡‘íºâshorthand–Pforms“that“require“an“application“to“ha•¾9v“e‘Pkno“wl-Ž¡‘íºâedge–ûHof“a“doAÇcumen¾9t's“DTD.“F‘ÿ:«or“instance,‘4Äin“HTML‘û someŽ¡‘íºâmarkup–T(sucš¾9h“as“a“n˜um˜bAÇered“list)“requires“an“end“mark˜er;Ž¡‘íºâother–à forms“(sucš¾9h“as“paragraphs)“ha˜v˜e“implicit“end“mark˜ersŽ¡‘íºâundersto•AÇo“d–’–when“the“next“similar“form“starts;‘¾+and“y¾9et“otherŽ¡‘íºâmarkup–(sucš¾9h“as“in-line“images)“is“self-con˜tained“and“needsŽ‘íºâŸÀ‰ff_ÿ Ÿ™šŸ÷@ó|{Ycmr8¼T‘ÿJªo–mÂappš¹,‘ÎsandŽ¡’õºâend–…tags“ƹ,‘[Ñwhere“Ætag“¹is“an“arbitrary“name.‘,ThereŽ¡’õºâis–ì×spAÇecial“synš¾9tax“for“an“empt˜y“elemen˜t:‘ËuÆ“¹is“exactlyŽ¡’õºâequiv‘ÿ|ralenš¾9t–×_to“ƹ.‘ÉThe“start“and“end“tags“for“eac˜hŽ¡’õºâelemen•¾9t›«Écon“tain˜a˜tag˜name,‘Àæwhic“h˜iden“ti es˜seman“tic˜infor-Ž¡’õºâmation–‰abAÇout“the“structure,‘Gindicating“ho¾9w“the“enclosed“con-Ž¡’õºâtenš¾9t–Köshould“bAÇe“in˜terpreted.‘ÀUThe“start“tag“ma˜y“also“con˜tainŽ¡’õºâattributes,›‰Éwhic¾9h–fåare“simple“name/v‘ÿ|ralue“bindings,˜pro¾9vidingŽ¡’õºâfurther–Þinformation“abAÇout“the“elemenš¾9t.‘¶Figure“1“sho˜ws“anŽ¡’õºâexample–2XML“došAÇcumen¾9t,‘Ÿillustrating“all“these“comp˜onen¾9ts.Ž¦’õºâÄ1.3Ž’ üñRepresenš´Cting–ŒÊXML“in“Hask˜ellŽŸÏþ’õºâ¹This–papšAÇer“is“ab˜out“pro˜cessing“XML‘Óusing“the“functionalŽ¡’õºâlanguage–ýKHaskš¾9ell.Ÿü-=ó¹Aa¨cmr6½1ŽŽ‘ þû¹MoAÇdern“functional“languages“are“w˜ell-Ž¡’õºâequippšAÇed–}½to“deal“with“tree-structured“data,‘—Øso“one“exp˜ectsŽ¡’õºâthe–‘Òlanguage“to“bšAÇe“a“go˜o˜d“ t“for“the“application.‘‘êEv¾9enŽ¡’õºâso,‘Ò4a–¬nkš¾9ey“issue“is“just“ho˜w“to“represen˜t“doAÇcumen˜ts,‘Ò4and“inŽ¡’õºâparticular–GIhoš¾9w“to“reconcile“the“DTD‘Fúdatat˜ypAÇe“de nitionsŽ¡’õºâincluded–WKin“XML‘WdoAÇcumenš¾9ts“with“the“data“t˜ypšAÇes“that“can“b˜eŽ¡’õºâde ned– ïin“Haskš¾9ell.‘ÊùW‘ÿ:«e“ha˜v˜e“in˜v˜estigated“t˜w˜o“complemen˜taryŽ¡’õºâapproac¾9hes:ŽŸ33’øé"(1)ŽŽŽ’ ºäDe ne–‚an“inš¾9ternal“data“structure“that“represen˜ts“con-Ž¡’ ºätenš¾9ts–©Žof“Åany“¹XML‘©sdo•AÇcumen˜t,‘¿indep“enden˜t–©Žof“all“DTDs.ŽŸ34’øé"(2)ŽŽŽ’ ºäGiv¾9en–\vthe“DTD›\dfor“some“XML˜doAÇcumenš¾9ts“of“in˜terest,Ž¡’ ºäsystematically–#ÏÅderive“¹de nitions“for“inš¾9ternal“Hask˜ellŽ’õºâŸ@‰ff_ÿ Ÿ× ‘ r}Ÿüûró†›Zcmr5°1ŽŽŽ‘YóÙ“ Rcmr7±The–8pXML‘8Mtoš7olkit“from“this“pap˜er“is“aÈãv‘Çailable“on“the“WWW‘8MatŽŸóßCÊscmtt8Éhttp://www.cs.york.ac.uk/fp/HaXml/ŽŽŽŽŽŽŽŒ‹* •ºâ ý? £Ÿéff ý£™š‘íºâ„ffðŽŸ33‘íºâÆŽ¤ ‘íºâŽ¡‘íºâŽ¡‘÷.Time‘¹–OutŽ¡‘÷.Dave–¹–Brubeck“QuartetŽ¡‘÷.Ž¡‘¡:Ž¡‘÷.Ž¡¡‘÷.Ž¡‘÷.Ž¡‘÷.Ž¡‘÷.Ž¡¡‘÷.Ž¡‘¡:Ž¡‘¡:Ž¡‘¡:Ž¡‘¡:Ž¡‘÷.Ž¡¡‘÷.Ž¡‘¡:Ž¡‘¡:Ž¡‘¡:Ž¡‘¡:Ž¡‘¡:Ž¡‘¡:Ž¡‘¡:Ž¡‘÷.Ž¡¡‘÷.Ž¡‘¡:Possibly–¹–the“DBQ's“most“famous“album,Ž¡‘¡:this‘¹–containsŽ¡‘¡:Take“Five,Ž¡‘¡:the–¹–most“famous“jazz“track“of“that“period.Ž¡‘¡:These–¹–experiments“in“different“timeŽ¡‘¡:signatures–¹–are“what“Dave“Brubeck“is“mostŽ¡‘¡:remembered–¹–for.‘ s,Recorded“Jun-Aug“1959Ž¡‘¡:in–¹–NYC.‘ s,See“also“the“sequel,Ž¡‘ fŽ¡‘‡’Time–¹–Further“Out.Ž¡‘÷.Ž¡‘íºâŽŸ33‘…†¹Figure–T1:‘pAn“example“XML“doAÇcumen¾9t.ŽŽ¡‘íºâ„ffðŽŽŽŽ ý€’ ºädata–m@tš¾9ypAÇes“to“represen˜t“them.‘$4These“de nitions“areŽ¤ ’ ºäclosely–Tbased“on“the“spAÇeci c“DTD.ŽŸ33’:âAdv‘ÿ|ran¾9tages–ùof“(1)“include“Ågenericity“¹and“Åfunction-levelŽ¡’õºâscripting¹.‘DÕGeneric–"Ëapplications“handle“a“wide“class“of“XMLŽ¡’õºâdošAÇcumen¾9ts,‘RÐnot–F„just“those“sharing“a“sp˜eci c“DTD.“One“ex-Ž¡’õºâample–RPof“a“completely“generic“application“is“searc¾9hing“doAÇc-Ž¡’õºâumenš¾9ts–u Ø|–¹–CText“StringŽ¦‘íºâ¹Because–?functional“languages“are“go•AÇo“d–?at“proAÇcessing“tree-Ž¡‘íºâstructured–“–data,‘ó&there“is“a“natural“ t“bAÇet•¾9w“een–“–the“XMLŽ¡‘íºâdoAÇcumenš¾9t–Ädomain“and“Hask˜ell“tree“datat˜ypAÇes.‘•In“simpli edŽ¡‘íºâform,‘™Ithe–~åmain“datatš¾9ypAÇes“whic˜h“mošAÇdel“an“XML‘~Êdo˜cumen¾9tŽ¡‘íºâare–7 ÆElement“¹and“ÆContent¹,‘€3whose“de nitions“are“m¾9utuallyŽ¡‘íºârecursivš¾9e,–Ttogether“forming“a“m˜ulti-branc˜h“tree“structure.ŽŸ—ü‘íºâÄThe–ŒÊ lter“t´CypK¼eŽ¦‘¡:Ætype–¹–CFilter“=“Content“->“[Content]Ž¦‘íºâ¹Our–)†basic“t¾9ypšAÇe“for“all“do˜cumen¾9t“pro˜cessing“functions“is“theŽ¡‘íºâÅc‡ontent‘áü lter¹,‘ÞOwhic•¾9h›¶tak“es˜a˜fragmen“t˜of˜the˜con“ten“t˜of˜anŽ¡‘íºâXML‘9—došAÇcumen¾9t–9 (whether“that“b˜e“some“text,‘B³or“a“completeŽ¡‘íºâtagged–p»elemenš¾9t),‘‘§and“returns“some“sequence“of“con˜ten˜t.‘å’TheŽ¡‘íºâresult–³ylist“mighš¾9t“bAÇe“empt˜y‘ÿ:«,‘Ç it“migh˜t“con˜tain“a“single“item,‘Ç orŽ¡‘íºâit–Tcould“con¾9tain“a“large“collection“of“items.Ž¡‘û:âSome–¨ lters“are“used“to“select“parts“of“the“input“doAÇcu-Ž¡‘íºâmen¾9t,‘\xand–N=others“are“used“to“construct“parts“of“the“outputŽ¡‘íºâdoAÇcumenš¾9t.‘ÆTThey–all“share“the“same“basic“t˜yp•AÇe,‘F«b“ecause‘whenŽ¡‘íºâbuilding–±ïa“new“doAÇcumenš¾9t,‘Ùthe“in˜ten˜tion“is“to“re-use“or“ex-Ž¡‘íºâtract–Éšinformation“from“parts“of“the“old“doAÇcumen¾9t.‘9AWhereŽ¡‘íºâthe–óƒresult“of“a“ lter“is“either“empt¾9y“or“a“singleton,‘úFthe“ lterŽ¡‘íºâcan– sometimes“bAÇe“though¾9t“of“as“a“Åpr•‡e“dic“ate¹,‘Odeciding‘ whetherŽ¡‘íºâor–Tnot“to“k¾9eep“its“input.ŽŸ—ü‘íºâÄProgram‘ŒÊwrappK¼erŽ¦‘¡:ÆprocessXMLwith–¹–::“CFilter“->“IO“()Ž¦‘íºâ¹W‘ÿ:«e–!¤assume“a“top-levš¾9el“wrappAÇer“function,‘$·whic˜h“getsŽ¡‘íºâcommand-line–'argumenš¾9ts,‘¿Úparses“an“XML‘e le“in˜to“theŽ¡‘íºâÆContent–Ÿ¹t¾9ypAÇe,›ë±applies“a“ lter,˜and“prett•¾9y-prin“ts–Ÿthe“out-Ž¡‘íºâput–Ò·doAÇcumenš¾9t.‘T™The“giv˜en“ lter“is“applied“to“the“top-lev˜elŽ¡‘íºâenclosing–Telemenš¾9t“of“the“doAÇcumen˜t.ŽŸ—ü‘íºâÄBasic‘C lters‘ ?ü¹A‘Ãcomplete–álist“of“prede ned“ lters“is“sho¾9wnŽ¡‘íºâin–ÝÀFigure“2.‘u³The“simplest“pAÇossible“ lters:‘­HÆnone“¹takš¾9es“an˜yŽ¡‘íºâcon•¾9ten“t–:†and“returns“nothing;‘ÍÆkeep“¹takš¾9es“an˜y“con˜ten˜t“andŽ¡‘íºâreturns–Ì(just“that“item.‘ Algebraically‘ÿ:«,‘ÚËthese“are“the“zero“andŽ¡‘íºâunit‘T lters.ŽŸ33‘øæÈŽŽŽ‘ºäÅPr•‡e“dic“ate–&‰and“sele‡ction“ lters¹.‘ The–ê lter“Æelm“¹is“a“pred-Ž¡‘ºäicate,›‡returning–Cäjust“this“item“if“it“is“an“elemen¾9t,˜orŽ¡‘ºänothing›z otherwise.Ÿü-=½4ŽŽ‘ uC¹Con•¾9v“ersely‘ÿ:«,‘“<Ætxt˜¹returns˜this˜itemŽ¡‘ºäonly–׬if“is“plain“text,Ÿü-=½5ŽŽ‘¨¹and“nothing“otherwise.‘ãThe“ lterŽ¡‘ºäÆchildren–Ÿ¹returns“the“immediate“cš¾9hildren“of“an“elemen˜tŽ¡‘ºäif–Eit“has“anš¾9y‘ÿ:«,‘«Aor“nothing“if“this“con˜ten˜t-item“is“not“anŽ¡‘ºäelemen¾9t.‘ÖThe–û† lter“Ætag‘¹–t“¹returns“this“item“only“if“it“isŽ‘íºâŸ€‰ff_ÿ Ÿ× ‘ r}Ÿüûr°4ŽŽŽ‘Y±The––}shortened“name“Éelm“±wšÈãas“c˜hosen“to“a˜v˜oid“a“clash“with“theŽ¤Standard–±ÈPrelude“function“Éelem±.ŽŸ£Ù‘ r}Ÿüûr°5ŽŽŽ‘Y±F‘ÿZªor–î those“familiar“with“the“detail“of“XML,“en•Èãtit“y–î references“withinŽ¡the–±Èdo7cumenÈãt“are“treated“as“plain“text.ŽŽŽ þìÌÌ þ‰™š’õºâ„ffðŽŸ33’õºâÄPredicatesŽŽ¤ ’¡:Ænone,‘:$Åzer•‡o/failur“eŽŽ¡’¡:Ækeep,‘:$Åidentity/suc•‡c“essŽŽ¡’¡:Æelm,‘>Ý°Åtagge‡d‘NÝ°Åname‡d‘NÝ°Åelement–N“CFilterŽŽ¡’¡:attrval‘0°îÅelement–N“CFilterŽŽ¡¡’õºâÄSelectionŽŽ¡’¡:Æchildren‘+÷XÅchildr‡en–N“CFilterŽŽ¡¡’õºâÄConstructionŽŽ¡’¡:Æliteral,‘+÷XÅbuild–N“CFilterŽŽ¡’¡:mkElem‘5j„Åbuild‘N“[CFilter]“->“CFilterŽŽ¡’¡:mkElemAttrs‘Ê–Åbuild–N“[(String,CFilter)]ŽŽ¡’?°->–¹–[CFilter]“->“CFilterŽŽ¡’¡:replaceTag‘"„,År•‡eplac“e–N“CFilterŽŽ¡’¡:replaceAttrs‘År•‡eplac“e–N“CFilterŽŽ¡Ÿ33’.[¹Figure–T2:‘pBasic“con•¾9ten“t‘T lters.ŽŽ¡’õºâ„ffðŽŽŸšc’ ºäan–lÔelemen¾9t“whose“tag“name“is“the“string“Æt¹.‘"ïThe“ lterŽ¤ ’ ºäÆattr‘¹–a–³“¹returns“this“item“only“if“it“is“an“elemen¾9t“con-Ž¡’ ºätaining–KEthe“attribute“name“Æa¹.‘ÙThe“ lter“Æattrval‘¹–(a,v)Ž¡’ ºä¹returns–À2this“item“only“if“is“an“elemenš¾9t“con˜taining“theŽ¡’ ºäattribute–TÆa“¹with“the“v‘ÿ|ralue“Æv¹.ŽŸf’æÈŽŽŽ’ ºäÅConstruction‘†% lters¹.‘ÒøThe–R,function“Æliteral‘¹–s“¹mak¾9es“aŽ¡’ ºätext›ë&con•¾9ten“t˜con“taining˜just˜the˜string˜Æs¹.‘aThe˜functionŽ¡’ ºäÆmkElem–¹–t“fs–iå¹builds“a“con•¾9ten“t›iåelemen“t˜with˜the˜tag˜Æt¹;Ž¡’ ºäthe–eargumenš¾9t“Æfs“¹is“a“list“of“ lters,‘¹ eac˜h“of“whic˜h“isŽ¡’ ºäapplied–%×to“the“curren¾9t“item,‘i÷and“all“their“results“areŽ¡’ ºäcollected–û}to“bAÇecome“the“cš¾9hildren“of“the“new“elemen˜t.Ž¡’ ºäThe– žfunction“ÆmkElemAttrs–¹–t“avs“fs– ž¹is“just“lik¾9e“ÆmkElemŽ¡’ ºä¹except–q5that“its“extra“parameter“Æavs“¹is“a“list“of“attributeŽ¡’ ºäv‘ÿ|raluesŸü-=½6ŽŽ‘?û¹to–TbAÇe“attac¾9hed“to“the“tag.ŽŸ´/’:âA‘Kuseful–‹ lter“whicš¾9h“in˜v˜olv˜es“bAÇoth“selection“and“construc-Ž¡’õºâtion–|eis“ÆshowAttr‘¹–a¹,‘šûwhic¾9h“extracts“the“v‘ÿ|ralue“of“the“attributeŽ¡’õºâÆa–ÛY¹from“the“currenš¾9t“elemen˜t“and“returns“just“that“string“as“aŽ¡’õºâpiece–Tof“con•¾9ten“t.Ž¡’:âWhen–¦constructing“a“new“doAÇcumen¾9t“(e.g.“the“scriptŽ¡’õºâin–¡sFigure“4“whic¾9h“generates“HTML),“the“ÆmkElem“¹func-Ž¡’õºâtion–àìošAÇccurs“rep˜eatedly–ÿ:«.‘7W“e–àìde ne“and“use“a“small“libraryŽ’õºâŸ 2‰ff_ÿ Ÿ× ‘ r}Ÿüûr°6ŽŽŽ‘Y±Actually‘ÿZª,‘ø®a–ê€list“of“attribute/ lter“pairs.‘8cEacÈãh“ lter“is“applied“toŽ¤the–£FcurrenšÈãt“elemen˜t“and“the“resultan˜t“con˜ten˜t“is“ attened“to“a“stringŽ¡v‘Çalue–±ÈwhicÈãh“is“assigned“to“the“named“attribute.ŽŽŽŽŽŸ’çjã¹3ŽŽŒ‹4 •ºâ ý? £ ý€‘íºâ¹of–0Ôfunctions“sucš¾9h“as“Æhtable¹,–·´Æhrow¹,“and–0ÔÆhcol“¹whic˜h“areŽ¤ ‘íºâjust–*synon¾9yms“for“particular“applications“of“ÆmkElem“¹andŽ¡‘íºâÆmkElemAttrs–@(¹to“di erenš¾9t“tagnames,‘JÝreducing“v˜erbAÇosit˜y“andŽ¡‘íºâmaking–Tthe“syn¾9tax“rather“more“readable.Ž¡‘û:âAlso–߸for“con•¾9v“enience,‘êqw“e–߸de ne“the“new“opAÇerators“Æ?‘ ‘¹andŽ¡‘íºâÆ!‘îï¹as–ŒÒsynonš¾9yms“for“ÆshowAttr“¹and“Æliteral“¹respAÇectiv˜ely:‘Ø/theyŽ¡‘íºâare–sused“in“a“brac•¾9k“eted–spAÇost x“notation,Ÿü-=½7ŽŽ‘>z¹a“st¾9yle“some“pro-Ž¡‘íºâgrammers‘Tprefer.Ž©U‘íºâÄ2.2Ž‘üñCom´CbinatorsŽŸÏþ‘íºâ¹The–€±comš¾9binators“used“as“in˜termediate“coAÇde“in“compilers“canŽ¡‘íºârender–¹programs“`totally“un t“for“h¾9uman“consumption'“[11Ž‘ ?ü]!Ž¡‘íºâHo•¾9w“ev“er,‘¯Ethe–•Áidea“of“a“com¾9binator“library“for“a“spAÇeci c“classŽ¡‘íºâof–œÍapplications“is“to“ac•¾9hiev“e–œÍa“form“of“expression“that“is“nat-Ž¡‘íºâural–ûofor“the“problem.‘ÎA‘ûhcomš¾9binator“library“should“bAÇe“lik˜e“aŽ¡‘íºâlanguage–£Žextension“tailored“to“the“problem“domain“[4Ž‘Ÿþ].‘ÇInŽ¡‘íºâthis–Ø»sense,›äÚfunctional“languages“are“extensible,˜just“as“XMLŽ¡‘íºâitself–)5is“extensible.‘XThe“com¾9binators“are“higher-order“op-Ž¡‘íºâerators–{¼serving“as“`glue'[6Ž‘Ÿþ]“to“assemš¾9ble“functions“in˜to“moreŽ¡‘íºâpAÇo•¾9w“erful›“com“binations.‘ñ W‘ÿ:«e˜aim˜to˜k“eep˜the˜t“yp•AÇes˜of˜comp“o-Ž¡‘íºânenš¾9t–¥°functions“as“uniform“as“pAÇossible“so“that“an˜y“functionŽ¡‘íºâcan–Z3bšAÇe“comp˜osed“with“an¾9y“other.‘ë Within“the“lexical“limitsŽ¡‘íºâof–Þ*the“host“language,‘é2cš¾9hoice“of“notation“should“follo˜w“appli-Ž¡‘íºâcation›ßacon•¾9v“en“tions:‘°‰in˜Hask“ell˜w“e˜can,‘ãwhere˜appropriate,Ž¡‘íºâde ne–Tnew“in x“opšAÇerator“sym¾9b˜ols“for“com¾9binators.Ž¡‘û:âSo,›Ø÷ha¾9ving–~£de ned“some“basic“ lters“already‘ÿ:«,˜in“whatŽ¡‘íºâw•¾9a“ys–©?can“these“usefully“bAÇe“comš¾9bined“in˜to“more“in˜terestingŽ¡‘íºâand–Tcomplex“ lters?‘p(See“Figure“3.)Ž¡‘û:âThe–õmost“impAÇortanš¾9t“and“useful“ lter“com˜binator“is“Æ`o`¹.Ž¡‘íºâW‘ÿ:«e–ѱcall“this“opšAÇerator“Irish“comp˜osition,‘Éfor“reasons“whic¾9hŽ¡‘íºâshould–5fbAÇe“obš¾9vious.‘ÑËIt“plugs“t˜w˜o“ lters“together:‘¬ythe“left“ lterŽ¡‘íºâis–“f“:>“g–™ò¹is“a“functional“c¾9hoice“opAÇerator;Ž¡‘íºâif–»¬the“(predicate)“ lter“Æp“¹is“proAÇductiv¾9e,‘åBthen“the“ lter“Æf“¹isŽ¡‘íºâapplied,‘ìrotherwise–ÁlÆg“¹is“applied.‘ ¸F‘ÿ:«rom“this“is“deriv¾9ed“a“di-Ž¡‘íºârected–¾öcš¾9hoice“opAÇerator:‘o³Æf–¹–|>|“g–¾ö¹giv˜es“either“the“results“ofŽ¡‘íºâÆf¹,–Tor“those“of“Æg“¹only“if“Æf“¹is“unproAÇductiv¾9e.Ž¦‘íºâÄGeneralised–·³P´Cath“Selectors‘ ?ü¹Selection–÷Kof“subtrees“b¾9yŽ¡‘íºâÅp•‡ath‘•œp“atterns–y”¹is“familiar“to“users“of“the“Unix“ le-system,Ž¡‘íºâwhere–X suc¾9h“patterns“are“used“to“access“directory“structure,Ž¡‘íºâusing–z9a“Æ/“¹notation“to“indicate“the“`con¾9taining'“relation.‘è¼Sim-Ž¡‘íºâilar–¹patterns“are“used“in“XSL‘ÿ:«T,“an“XML‘¸–transformationŽ¡‘íºâlanguage–‰[3Ž‘Ÿþ].‘wÁIn“this“connection,‘æ wš¾9e“de ne“t˜w˜o“path“se-Ž¡‘íºâlection–,¥comš¾9binators“Æ/>“¹and“Æ“¹isŽ¡‘íºâan–ýŒ`in¾9terior'“selector,‘Nreturning“the“inner“structure;‘zÆ),‘1ë–Åinterior‘N|)‘1ë–Ådir•‡e“cte“d‘N“CFilter“->“CFilterŽŽ¡¡’ÿ.f–¹–`o`“g‘%Ì°=“concat“.“map“f“.“gŽŽ¡’ÿ.f–¹–|||“g‘%Ì°=“\c->“f“c“++“g“cŽŽ¡’ÿ.f–¹–`with`“g‘Ÿî=“filter“(not.null.g)“.“fŽŽ¡’ÿ.f–¹–`without`“g‘ s,=“filter“(null.g)“.“fŽŽ¡’ÿ.f–¹–/>“g‘*†F=“g“`o`“children“`o`“fŽŽ¡’ÿ.f–¹–|“g‘%Ì°=“f“?>“f“:>“gŽŽ¡¡’ÿ.cat‘;^ÂÅc•‡onc“atenate‘N“CFilterŽŽ¡¡’ÿ.cat–¹–fs‘æX=“\c->“concat.“map“(\f->f“c)“fsŽŽ¡¡’ÿ.et‘@XÅdisjoint‘NCFilter)“->“CFilter“->“CFilterŽŽ¡¡’ÿ.f–¹–`et`“g‘ s,=“(f“`oo`“tagged“elm)ŽŽ¡’7á|>|–¹–(g“`o`“txt)ŽŽ¡¡’ÿ.(?>)‘6¥,Åif-then-else‘N“ThenElse“CFilter“->“CFilterŽŽ¡¡’ÿ.data–¹–ThenElse“a“=“a“:>“aŽŽ¡’ÿ.p–¹–?>“f“:>“g“=“\c->“if“(not.null.p)“cŽŽ¡’Xô0then–¹–f“c“else“g“cŽŽ¡¡’ÿ.chip,‘1ë–Å\in-plac•‡e"›N“CFilterŽŽ¡¡’ÿ.deep–¹–f‘æX=“f“|>|“(deep“f“`o`“children)ŽŽ¡’ÿ.deepest–¹–f“=“(deepest“f“`o`“children)“|>|“fŽŽ¡’ÿ.multi–¹–f‘,Â=“f“|||“(multi“f“`o`“children)ŽŽ¡’ÿ.foldXml–¹–f“=“f“`o`“(chip“(foldXml“f))ŽŽ¡Ÿ33’Vû¹Figure–T3:‘pFilter“com¾9binators“and“their“de nitions.ŽŽ¡’õºâ„ffðŽŽŸ’õºâÄAn–bIediting“com´Cbinator‘ ?ü¹Aside–from“predicates,‘Eëselectors,Ž¤ ’õºâc•¾9hoice,‘¦$and›ŠWconstructiv“e˜ lters,‘¦$there˜is˜one˜v“ery˜useful˜com-Ž¡’õºâbinator–r«whicš¾9h“stands“in“its“o˜wn“category“{“an“editing“com˜bi-Ž¡’õºânator.‘ÛÆchip‘¹–f–”¹proAÇcesses“the“cš¾9hildren“of“an“elemen˜t“in-place:Ž¡’õºâthe–VK lter“Æf“¹is“applied“to“its“c¾9hildren;‘vÆthe“results“are“rebuiltŽ¡’õºâas–Tthe“new“cš¾9hildren“of“that“same“elemen˜t.ŽŸ—ü’õºâÄRecursion‘ ?ü¹It–Ôâis“often“useful“to“express“recursiv¾9e“transfor-Ž¡’õºâmations–Ôkon“XML‘Ô[doAÇcumenš¾9ts:‘ûütransformations“whic˜h“can“bAÇeŽ¡’õºâapplied–Tat“manš¾9y“di eren˜t“lev˜els“of“the“doAÇcumen˜t“tree.Ž¡’:âOne–8yfamily“of“suc¾9h“expressions“is“useful“primarily“in“se-ŽŽŽŽŽŸ’çjã4ŽŽŒ‹OÛ •ºâ ý? £ ý€‘íºâ¹lecting–‘«a“subtree“from“an“arbitrarily“deep“loAÇcation,‘¬althoughŽ¤ ‘íºâthey–¨can“of“course“bAÇe“used“for“editing“and“ ltering“as“w¾9ellŽ¡‘íºâas–selection.‘!The“recursivš¾9e“com˜binator“Ædeep‘¹–f“Åp‡otentialxälyŽ¡‘íºâ¹pushes–Uthe“action“of“ lter“Æf“¹deep“inside“the“doAÇcumen¾9t“sub-Ž¡‘íºâtree.‘NÆIt–& rst“tries“the“givš¾9en“ lter“on“the“curren˜t“item:‘=ÿifŽ¡‘íºâit–xýis“proAÇductiv¾9e“then“it“stops“here,‘‘èbut“if“no“results“are“re-Ž¡‘íºâturned,‘£then–Çit“mo•¾9v“es–Çto“the“c¾9hildren“and“tries“again“recur-Ž¡‘íºâsivš¾9ely‘ÿ:«.‘º¢When–Ÿeused“with“a“predicate,‘Áéthis“strategy“searc˜hesŽ¡‘íºâfor–Éìthe“topmost“matcš¾9hing“elemen˜ts“in“the“tree.‘:9There“areŽ¡‘íºâv‘ÿ|rariations:‘ÊUÆdeepest–lF¹searcš¾9hes“for“the“bAÇottommost“matc˜hingŽ¡‘íºâelemenš¾9ts;‘ ×Æmulti–ºV¹returns“all“matc˜hes,‘ã—ev˜en“those“whic˜h“areŽ¡‘íºâsub-trees–@ of“other“matc•¾9hes.‘œœHo“w“ev“er,›J¼as–@ already“noted,˜theŽ¡‘íºâaction–•øof“these“com¾9binators“is“not“restricted“to“predicates“orŽ¡‘íºâselectors.Ž¡‘û:âAnother›rÀpAÇo•¾9w“erful˜recursion˜com“binator˜is˜ÆfoldXml¹:‘×HtheŽ¡‘íºâexpression–^iÆfoldXml‘¹–f“¹applies“the“ lter“Æf“¹to“evš¾9ery“lev˜el“of“theŽ¡‘íºâtree,‘•æfrom–|/the“lea•¾9v“es›|/up“w“ards˜to˜the˜roAÇot˜(at˜least˜concep-Ž¡‘íºâtually–¿6{“of“course“lazy“ev‘ÿ|raluation“makš¾9es“this“more“ecien˜t).ŽŸ—ü‘íºâÄ2.3Ž‘üñExampleŽŸÏþ‘íºâ¹The–&use“of“these“ lters“and“com¾9binators“is“illustrated“in“anŽ¡‘íºâexample–Y1script“in“Figure“4.‘èThis“program“transforms“anŽ¡‘íºâÆ–|"¹elemenš¾9t“in˜to“an“HTML‘|doAÇcumen˜t“that“pro˜vides“aŽ¡‘íºâformatted–Ü=summary‘ÿ:«.‘q*The“HTML‘Ü output,‘ ÷rendered“b¾9y“theŽ¡‘íºâNetscapAÇe–#Qbroš¾9wser,‘fÐis“illustrated“in“Figure“5.‘FhSuc˜h“a“taskŽ¡‘íºâmigh¾9t–TbAÇe“fairly“common“in“e-commerce“applications.Ž¡‘û:âW‘ÿ:«e–m§noš¾9w“describAÇe“some“of“the“salien˜t“features“of“the“ex-Ž¡‘íºâample.Ž¤33‘¡:Æ(albumf–¹–`o`“deep“(tag“"album"))Ž¡‘íºâ¹The–Û¼script“ rst“searcš¾9hes“recursiv˜ely“for“the“topmost“ele-Ž¤ ‘íºâmen¾9t–ªåtagged“ƹ,‘ÐIbAÇefore“applying“the“ lter“Æalbumf“¹toŽ¡‘íºâit.‘yjTh•¾9us,‘<it›4Rw“orks˜equally˜w“ell˜with˜an“y˜XML‘4Jsource˜doAÇcu-Ž¡‘íºâmenš¾9t–ãÉthat“con˜tains“an“Æ“¹elemen˜t“an˜ywhere“within“it,Ž¡‘íºâand–ÍÂ(correctly)“prošAÇduces“no“output“for“do˜cumenš¾9ts“whic˜h“doŽ¡‘íºânot–Tcon¾9tain“album“data.Ž¡‘û:âThe–ã¼output“doAÇcumenš¾9t's“Æ“¹section“con˜tains“theŽ¡‘íºâartist–3 name“and“album“title“separated“b¾9y“a“colon.‘uœW‘ÿ:«e“noteŽ¡‘íºâthat–Tthe“expression,Ž©33‘¡:Ætxt–¹–`o`“children“`o`“tag“"artist"Ž¡‘‡’`o`–¹–children“`o`“tag“"album"Ž¦‘íºâ¹whicš¾9h–Rygrabs“the“textual“con˜ten˜t“of“the“Æ“¹elemen˜tŽ¡‘íºâwithin–…rthe“Æ“¹elemenš¾9t,‘¡yis“somewhat“un˜wieldy‘ÿ:«.‘lÊMore-Ž¡‘íºâo•¾9v“er–ô¾its“trailing“test“for“the“Æ“¹tag“is“redundan¾9t,‘ûBsinceŽ¡‘íºâthe–@Rcalling“ lter“has“already“pAÇerformed“that“matc¾9h.‘iTheŽ¡‘íºâexpression–Tcan“bAÇe“simpli ed“b¾9y“using“path“selectors“to:Ž¦‘¡:Ækeep–¹–/>“tag“"artist"“/>“txtŽ¦‘íºâ¹and–©this“st¾9yle“is“used“elsewhere“in“the“example.‘ín(The“al-Ž¡‘íºâgebraic–t×laš¾9ws“in“Section“2.5“guaran˜tee“that“this“rewriting“isŽ¡‘íºâsafe.)Ž¡‘û:âSucš¾9h–Ý™expressions“mak˜e“some“assumptions“abAÇout“theŽ¡‘íºâstructure–jof“the“data“within“the“Æ“¹elemen¾9t.‘ãZIn“this“in-Ž¡‘íºâstance,‘c}the–7assumption“is“that“an“Æ“¹elemen¾9t“is“an“im-Ž¡‘íºâmediate–Ücš¾9hild,‘çƒand“that“Åits“¹immediate“c˜hildren“include“text.Ž¡‘íºâIf–ñJsucš¾9h“assumptions“pro˜v˜e“incorrect“for“a“particular“doAÇcu-Ž¡‘íºâmenš¾9t,–Tthe“ lter“is“simply“unproAÇductiv˜e;“no“error“is“ agged.Ž¡‘û:âWith–·Xa“suitable“de nition,‘Ê$Æhbody–¹–=“mkElemAttr“"BODY"Ž¡‘íºâ¹the‘TexpressionŽ¦‘¡:Æhbody‘ s,[("bgcolor",("white"!))]‘,Â[...]ŽŽŽ ý€’õºâ¹can–©©bšAÇe“understo˜o˜d“to“set“the“bac¾9kground“colour“attribute“ofŽ¤ ’õºâthe–²:Æ“¹tag“to“the“literal“v‘ÿ|ralue“Æwhite¹.‘ó#Notice“ho¾9w“theŽ¡’õºâattribute–q2v‘ÿ|ralue“is“itself“describAÇed“b¾9y“a“ lter.‘åºIn“this“case,‘’theŽ¡’õºâ lter–Ÿis“not“v¾9ery“exciting,‘òbut“the“later“de nition“of“ÆmkLinkŽ¡’õºâ¹illustrates–+àthe“generation“of“an“HTML‘+Úreference“b¾9y“loAÇokingŽ¡’õºâup–lvthe“v‘ÿ|ralue“of“a“supplied“Ælink“¹attribute“(using“the“Æ?‘ä&¹ lter).Ž¡’:âWhen– `the“script“is“used“on“the“particular“doAÇcumen¾9tŽ¡’õºâfrom– Figure“1,‘\“the“output“is“a“re-ordering“of“the“in¾9ternalŽ¡’õºâcompAÇonen¾9ts–H¢of“the“input:‘ƒ in“the“Æ“¹part“of“the“out-Ž¡’õºâput,‘“¹section“is“selected“and“transformed“b¾9yŽ¡’õºâÆnotesf–ÕɹbšAÇefore“the“Æ“¹elemen¾9ts“are“pro˜cessed“b¾9yŽ¡’õºâthe–ëbÆsummaryf“¹ lter.‘uAlthough“in“the“absence“of“a“DTD‘ëWit“isŽ¡’õºâimpšAÇossible–-vto“b˜e“sure“of“an¾9y“input“ordering,‘3~the“script“hereŽ¡’õºâensures–Tthat“the“output“ordering“is“consisten¾9t.Ž¡’:âThe–qäde nition“of“the“Ænotesf“¹ lter“is“in¾9teresting“bAÇe-Ž¡’õºâcause–NÛit“makš¾9es“few˜er“assumptions“abAÇout“the“con˜ten˜t“of“aŽ¡’õºâÆ–G ¹structure,‘“{and“in“addition“it“preserv¾9es“the“inputŽ¡’õºâordering.‘„WThe–Lcš¾9hained“if-then-else“c˜hoice“within“the“recur-Ž¡’õºâsivš¾9e–íhÆfoldXml“¹com˜binator“causes“all“in˜ternal“structure“of“theŽ¡’õºâÆ–V‹¹elemenš¾9t“to“bAÇe“retained“except“for“the“replacemen˜tŽ¡’õºâof–{#ƹs“bš¾9y“emphasised“text,‘Ô–and“ƹs“b˜yŽ¡’õºâHTML‘Tlinks.Ž¡’:âOne–&of“the“most“striking“features“of“the“example“as“aŽ¡’õºâwhole–¢ is“hoš¾9w“selection“and“testing“of“old“con˜ten˜t“and“con-Ž¡’õºâstruction–öof“new“con•¾9ten“t–öare“uniform,‘£and“can“bAÇe“com¾9binedŽ¡’õºâalmost‘Tin•¾9terc“hangeably‘ÿ:«.Ž¡’:âW‘ÿ:«e–]ìwill“return“to“the“treatmenš¾9t“of“Æ“¹elemen˜tsŽ¡’õºâin–4)Section“2.4“after“inš¾9troAÇducing“some“extra“Ålab‡elxäling“¹com˜bi-Ž¡’õºânators.ŽŸ]ž’õºâÄ2.4Ž’ üñLabK¼ellingsŽŸÏþ’õºâ¹One–6kfeature“that“is“oAÇccasionally“useful“is“the“abilitš¾9y“to“attac˜hŽ¡’õºâlabAÇels–%]to“items“in“a“sequence,›)`for“instance,˜to“n•¾9um“bAÇer–%]a“listŽ¡’õºâof–¸1items,‘àéor“to“treat“the“ rst/last“item“of“a“list“di eren¾9tlyŽ¡’õºâfrom–Ý¥the“other“items.‘ àF‘ÿ:«or“this“purpAÇose,‘èÈthe“library“pro¾9videsŽ¡’õºâsp•AÇecial›ußlab“elling˜com•¾9binators.‘çIW‘ÿ:«e˜c“hoAÇose˜to˜in“troAÇduce˜a˜newŽ¡’õºât¾9ypAÇe:Ž¤ñ’ÿ.Ætype–¹–LabelFilter“a“=“Content“->“[“(a,Content)“]Ž¡’õºâ¹A‘hÆLabelFilter–h.¹is“likš¾9e“a“ÆCFilter“¹except“it“attac˜hes“a“labAÇelŽ¤ ’õºâto–Ìeacš¾9h“of“its“results.‘@xW‘ÿ:«e“migh˜t“ha˜v˜e“c˜hosen“to“fold“labAÇelŽ¡’õºâv‘ÿ|ralues–@tinside“the“ÆContent“¹t¾9ypAÇe,‘K“LabelFilter“IntŽ¡’ÿ.interspersed–¹–::“a“->“CFilter“->“aŽ¡’‘§8->–¹–LabelFilter“aŽ¡’ÿ.tagged‘!::–¹–CFilter“->“LabelFilter“StringŽ¡’ÿ.attributed‘,Â::–¹–CFilter“->Ž¡’T:šLabelFilter‘¹–[(String,String)]Ž¦’õºâ¹These–àºlabAÇelling“functions“lift“a“ÆCFilter“¹to“the“ÆLabelFilterŽ¡’õºâ¹tš¾9ypAÇe:‘ç¸Ænumbered‘¹–f–zø¹transforms“the“ordinary“ lter“Æf“¹in˜to“aŽ¡’õºânew–H± lter“that“attacš¾9hes“in˜tegers“(from“1“up˜w˜ards)“to“theŽ¡’õºâresults–^of“Æf¹;‘)Æinterspersed–¹–a“f“z–^¹attac¾9hes“the“labAÇel“Æa“¹toŽ¡’õºâall–©øof“the“results“of“Æf“¹except“the“last,‘Ï!whic¾9h“gets“the“labAÇelŽ¡’õºâÆz¹;‘‡AÆtagged‘¹–f–aH¹labAÇels“evš¾9ery“tagged“elemen˜t“with“its“tag“nameŽ¡’õºâ(and–Wnon-elemenš¾9ts“with“the“empt˜y“string);‘ø®Æattributed‘¹–fŽ¡’õºâ¹labAÇels–&¼evš¾9ery“tagged“elemen˜t“with“its“attribute/v‘ÿ|ralue“pairsŽ¡’õºâ(and–Tnon-elemenš¾9ts“with“the“empt˜y“list).ŽŸº ’ÿ.Æ`oo`–¹–::“(a->CFilter)“->“LabelFilter“a“->“CFilterŽŽŽŽŽŸ’çjã¹5ŽŽŒ‹l; •ºâ ý? £Ÿîff ý™™š‘íºâ„ffðŽŸ33‘íºâÆmodule–¹–Main“whereŽ¤ ‘íºâimport‘¹–XmlŽ¡‘íºâmain‘¹–=Ž¡‘÷.processXMLwith–¹–(albumf“`o`“deep“(tag“"album"))Ž¡‘íºâalbumf‘¹–=Ž¡‘÷.htmlŽ¡‘¡:[‘¹–hheadŽ¡‘ f[‘¹–htitleŽ¡‘‡’[–¹–txt“`o`“children“`o`“tag“"artist"Ž¡‘/á`o`–¹–children“`o`“tag“"album"Ž¡‘‡’,–¹–literal“":“"Ž¡‘‡’,–¹–keep“/>“tag“"title"“/>“txtŽ¡‘‡’]Ž¡‘ f]Ž¡‘¡:,–¹–hbody“[("bgcolor",("white"!))]Ž¡‘ f[‘¹–hcenterŽ¡‘ú¾[–¹–h1“[“keep“/>“tag“"title"“/>“txt“]“]Ž¡‘ f,–¹–h2“[“("Notes"!)“]Ž¡‘ f,–¹–hpara“[“notesf“`o`“(keep“/>“tag“"notes")“]Ž¡‘ f,‘¹–summaryfŽ¡‘ f]Ž¡‘¡:]Ž¡‘íºânotesf‘¹–=Ž¡‘÷.foldXml–¹–(txt›8³?>“keep˜:>Ž¡‘!´Ttag–¹–"trackref"“?>“replaceTag“"EM"“:>Ž¡‘!´Ttag–¹–"albumref"“?>“mkLink‘/?Ü:>Ž¡‘!´Tchildren)Ž¡‘íºâsummaryf‘¹–=Ž¡‘÷.htable‘¹–[("BORDER",("1"!))]Ž¡‘¡:[–¹–hrow“[“hcol“[“("Album“title"!)“]Ž¡‘!´T,–¹–hcol“[“keep“/>“tag“"title"“/>“txt“]Ž¡‘!´T]Ž¡‘¡:,–¹–hrow“[“hcol“[“("Artist"!)“]Ž¡‘!´T,–¹–hcol“[“keep“/>“tag“"artist"“/>“txt“]Ž¡‘!´T]Ž¡‘¡:,–¹–hrow“[“hcol“[“("Recording“date"!)“]Ž¡‘!´T,–¹–hcol“[“keep“/>Ž¡‘Zg\tag–¹–"recordingdate"“/>“txt“]Ž¡‘!´T]Ž¡‘¡:,–¹–hrow“[“hcola“[“("VALIGN",("top"!))“]Ž¡‘G[–¹–("Catalog“numbers"!)“]Ž¡‘!´T,‘¹–hcolŽ¡‘+'€[‘¹–hlistŽ¡‘4š¬[–¹–catno“`oo`Ž¡‘BÇnnumbered–¹–(deep“(tag“"catalogno"))Ž¡‘4š¬]Ž¡‘+'€]Ž¡‘!´T]Ž¡‘¡:]Ž¡‘íºâcatno–¹–n“=Ž¡‘÷.mkElem‘¹–"LI"Ž¡‘¡:[–¹–((show“n++".“")!),– s,("label"?),“("number"?)Ž¡‘¡:,–¹–("“("!),– s,("format"?),“(")"!)‘¹–]Ž¡‘íºâmkLink‘¹–=Ž¡‘÷.mkElemAttr–¹–"A"“[“("HREF",("link"?))“]Ž¡‘¡:[–¹–children“]ŽŸ33‘íºâ¹Figure–ÈÞ4:‘ö5An“example“do•AÇcumen¾9t-pro“cessing–ÈÞscript“using“theŽ¡‘íºâgeneric–T lter“com¾9binators.Ž¡‘íºâ„ffðŽŽŽŽ þÀ¨¥ŸØ’õºâï¹s“in“the“example“of“Figure“4:Ž¤33’ÿ.Æcatno–¹–`oo`“numbered“(deep“(tag“"catalogno"))Ž¡’õºâ¹First,‘·Öthe– vdesired“elemen¾9ts“are“extracted“from“their“topmostŽ¤ ’õºâpAÇositions–GÉin“the“tree,‘”fthen“they“are“givš¾9en“n˜umeric“labAÇels,Ž¡’õºâand–B nally“the“Æcatno“¹ lter“incorpšAÇorates“the“lab˜el“in¾9to“someŽ¡’õºâgenerated–Ýætext.‘v'Another“example“can“bAÇe“seen“in“the“de -Ž¡’õºânition–Qiof“the“Æ`et`“¹comš¾9binator“in“Figure“3.‘Я(Æ`et`“¹com˜binesŽ¡’õºâa–C² lter“Æf“¹on“elemen¾9ts“with“a“ lter“Æg“¹on“text.‘§‰Æf“¹pattern-Ž¡’õºâmatcš¾9hes–•against“tagnames“{“the“tagnames“are“extracted“b˜yŽ¡’õºâthe–TlabAÇelling“function“Ætagged¹.)Ž¡’:âF‘ÿ:«urthermore,‘K‡it– }is“pšAÇossible“to“com¾9bine“lab˜ellings.‘ìTheŽ¡’õºâÆ`x`–Óƒ¹comš¾9binator“glues“t˜w˜o“labAÇelling“functions“together,‘à­pair-Ž¡’õºâing–Tthe“labšAÇels“they“pro˜duce.ŽŸ33’ÿ.Æ`x`–¹–::“(CFilter->LabelFilter“a)Ž¡’)´T->–¹–(CFilter->LabelFilter“b)Ž¡’)´T->–¹–(CFilter->LabelFilter“(a,b))ŽŸ—ü’õºâÄ2.5Ž’ üñAlgebraic–ŒÊlaš´Cws“of“com˜binatorsŽŸÏþ’õºâ¹W‘ÿ:«e–õvbrie y“shoš¾9w“ho˜w“com˜binators“are“de ned“in“suc˜h“a“w˜a˜yŽ¡’õºâthat–{+v‘ÿ|rarious“algebraic“laš¾9ws“hold.‘MôThe“complete“set“of“la˜wsŽ¡’õºâis–Tgiv¾9en“in“Figure“6.Ž¡’:âGiving–3all“con•¾9ten“t–3 lters“the“same“t¾9ypAÇe“maximises“theŽ¡’õºâusefulness–äof“com¾9binators“for“plugging“together“functions“ofŽŽŽŽŽŸ’çjã6ŽŽŒ‹‹ë •ºâ ý? £Ÿøff ý…™š‘íºâ„ffðŽŸ33‘‡’ÄIrish‘ŒÊcompK¼ositionŽŽ¤ ‘íºâÆf–¹–`o`“(g“`o`“h)– s,=“(f–¹–`o`“g)“`o`“h‘µBÅasso‡ciativityŽŽ¡‘íºâÆnone–¹–`o`“f‘!=› s,f“`o`“none˜=˜none‘ÎêÅzer‡oŽŽ¡‘íºâÆkeep–¹–`o`“f‘!=› s,f“`o`“keep˜=˜f‘û¬ÅidentityŽŽ¡¡‘‡’ÄGuardsŽŽ¡‘íºâÆf–¹–`with`“keep– s,=“f‘fN¢ÅidentityŽŽ¡‘íºâÆf–¹–`with`“none– s,=“none–¹–`with`“f“=“none‘ ˆ€Åzer‡oŽŽ¡‘íºâÆ(f–¹–`with`“g)“`with`“g– s,=“f–¹–`with`“g‘û¬Åidemp•‡otenc“eŽŽ¡‘íºâÆ(f–¹–`with`“g)“`with`“hŽŽ¡‘&mê=‘ s,(f–¹–`with`“h)“`with`“g‘û¬Åpr‡omotionŽŽ¡‘íºâÆ(f–¹–`o`“g)“`with`“hŽŽ¡‘&mê=‘ s,(f–¹–`with`“h)“`o`“g‘$(nÅpr‡omotionŽŽ¡¡‘íºâÆf–¹–`without`“keep– s,=“none–¹–`without`“fŽŽ¡‘BÇn=‘ s,none‘IõÅzer‡oŽŽ¡‘íºâÆf–¹–`without`“none– s,=“keep‘IõÅidentityŽŽ¡‘íºâÆ(f–¹–`without`“g)“`without`“gŽŽ¡‘‡’=‘ s,f–¹–`without`“g‘N®´Åidemp•‡otenc“eŽŽ¡‘íºâÆ(f–¹–`without`“g)“`without`“hŽŽ¡‘‡’=‘ s,(f–¹–`without`“h)“`without`“g‘ ˆ€Åpr‡omotionŽŽ¡‘íºâÆ(f–¹–`o`“g)“`without`“hŽŽ¡‘‡’=‘ s,(f–¹–`without`“h)“`o`“g‘(âÅpr‡omotionŽŽ¡¡‘‡’ÄP´Cath‘ŒÊselectorsŽŽ¡‘íºâÆf–¹–/>“(g“/>“h)– s,=“(f–¹–/>“g)“/>“h‘-›šÅasso‡ciativityŽŽ¡‘íºâÆnone–¹–/>“f‘Y„=› s,f“/>“none˜=˜none‘û¬Åzer‡oŽŽ¡‘íºâÆkeep–¹–/>“f‘Y„=‘ s,f“`o`“childrenŽŽ¡‘íºâf–¹–/>“keep‘Y„=‘ s,children“`o`“fŽŽ¡‘íºâkeep–¹–/>“keep‘,Â=‘ s,childrenŽŽ¡‘íºânone–¹–“g– s,=“f–¹–/>“g‘N®´Åidemp•‡otenc“eŽŽ¡¡‘íºâÆ(f–¹–/>“g)““(g““h)‘Ÿî=‘ s,g“/>“(f“`o`“h)‘û¬Åpr‡omotionŽŽ¡‘íºâÆ(f–¹–/>“g)“`o`“h‘Ÿî=‘ s,(f“`o`“h)“/>“g‘û¬Åpr‡omotionŽŽ¡‘íºâÆ(f–¹–/>“g)“`with`“h– s,=“f–¹–/>“(g“`with`“h)‘ÎêÅpr‡omotionŽŽ¡‘íºâÆ(f–¹–|“g)“|>|“h– s,=“f–¹–|>|“(g“|>|“h)‘µBÅasso‡ciativityŽŽ¡‘íºâÆkeep–¹–|>|“f‘!=‘ s,keepŽŽ¡‘íºânone–¹–|>|“f‘!=› s,f“|>|“none˜=˜f‘û¬ÅidentityŽŽ¡‘íºâÆf‘æX|>|‘¹–f‘!=‘ s,f‘\ÛvÅidemp•‡otenc“eŽŽ¡¡‘‡’ÄRecursionŽŽ¡‘íºâÆdeep‘¹–keep‘Y„=‘ s,keep‘X!àÅsimpli c‡ationŽŽ¡‘íºâÆdeep‘¹–none‘Y„=‘ s,none‘X!àÅsimpli c‡ationŽŽ¡‘íºâÆdeep‘¹–children– s,=“children‘E;ˆÅsimpli c‡ationŽŽ¡‘íºâÆdeep–¹–(deep“f)– s,=“deep‘¹–f‘N®´Ådepth‘N|“txt“=“txt“|>|“elm“=“keep‘(âÅc‡ompletenessŽŽ¡‘íºâÆelm–¹–`o`“txt“=“txt“`o`“elm“=“none‘(âÅexcl.‘N“¹path“selector“is“assoAÇciativ¾9e“but“ƹ,“Æ|“¹view˜ed“b˜y“itself“ap-Ž¡’õºâpšAÇears–Á@to“b˜e“algebraically“sensible,‘ì:but“it“do˜es“not“seem“toŽ¡’õºâha•¾9v“e–xuseful“algebraic“propAÇerties“in“connection“with“otherŽ¡’õºâcomš¾9binators–;DbAÇecause“of“its“bias“to˜w˜ards“the“left“opAÇerand.Ž¡’õºâThe–É”simpler“result-appšAÇending“com¾9binator“Æ|||“¹could“b˜e“anŽ¡’õºâalternativš¾9e–´Üto“the“directed“c˜hoice“opAÇerator,‘È'and“w˜ould“prob-Ž¡’õºâably–)Ölead“to“more“la¾9ws,‘.öbut“it“has“less“`application“bite'.‘YõAŽ¡’õºâpAÇotenš¾9tially–µÌserious“problem“is“that“the“Æ|||¹-com˜bination“ofŽ¡’õºât•¾9w“o–Tselectors“is“not“necessarily“a“selector.Ž¡’:âThe–ºKrecursion“opAÇerator“Ædeep“¹has“some“minor“la¾9ws,‘ãˆoneŽ¡’õºâof–Éwhicš¾9h,‘bfthe“depth“la˜w,‘bfis“more“profound.‘;ÏW‘ÿ:«e“ha˜v˜e“notŽ¡’õºâyš¾9et–Öfully“in˜v˜estigated“the“propAÇerties“of“Ædeepest¹,–¼Æmulti¹,“andŽ¡’õºâÆfoldXml¹.ŽŸü’õºâÄ3Ž’´oT‘ÿÌranslation–ŒÊof“DTDs“to“T´CypK¼esŽŸ阒õºâ3.1Ž’ üñDTDsŽŸÏþ’õºâ¹So–µVfar“wš¾9e“ha˜v˜e“considered“do•AÇcumen˜t-pro“cessing–µVb˜y“genericŽ¡’õºâtree–´Xtransformations,‘Ǿwhere“markup“is“matc¾9hed“textually“atŽ¡’õºârunš¾9time,‘:^and–2õno“accoun˜t“is“tak˜en“of“an˜y“deepAÇer“meaning“ofŽ¡’õºâtags.Ž¡’:âHo•¾9w“ev“er,‘§¨when–ŒŽ¤ ‘íºâ Øcoverart,‘¹–(catalogno)+,Ž¡‘> Øpersonnel,–¹–tracks,“notes)“>Ž¡‘íºâŽ¡‘íºâŽ¡‘íºâŽ¡‘¡:Ž¡‘íºâŽ¡‘¡:Ž¡‘íºâŽ¡‘¡:Ž¡‘íºâŽ¡‘¡:Ž¡‘íºâŽ¡‘íºâŽ¡‘¡:Ž¡‘íºâŽ¡‘íºâŽ¡‘¡:Ž¡‘íºâŽ¡‘¡:Ž¡‘íºâŽ¡‘¡:Ž¡‘íºâŽ¡‘¡:Ž¡‘íºâ]>ŽŸ33‘*Ž¹Figure–T7:‘pAn“example“DTD.ŽŽ¡‘íºâ„ffðŽŽŸ‘û:âXML‘ §doAÇcumenš¾9t– ìv‘ÿ|ralidators“are“readily“a˜v‘ÿ|railable.‘?8Ho˜w-Ž¤ ‘íºâev•¾9er,‘D9w“e–¥go“further“and“de ne“the“idea“of“Åvalid‘,édo‡cumentŽ¡‘íºâpr•‡o“c“essing¹.‘¨ A‘Dv‘ÿ|ralid–DprošAÇcessing“script“is“one“whic¾9h“pro˜ducesŽ¡‘íºâa–!%v‘ÿ|ralid“doAÇcumenš¾9t“as“output,‘Qügiv˜en“a“v‘ÿ|ralid“doAÇcumen˜t“as“input.Ž¡‘íºâW‘ÿ:«e›Z7ac•¾9hiev“e˜this˜b“y˜demonstrating˜a˜corresp•AÇondence˜b“et•¾9w“eenŽ¡‘íºâthe–@DTD‘?öof“a“doAÇcumen¾9t“and“the“de nition“of“a“set“of“alge-Ž¡‘íºâbraic–¥¸tš¾9ypAÇes“in“Hask˜ell,‘ÉÐand“the“consequen˜t“correspAÇondenceŽ¡‘íºâbšAÇet•¾9w“een–áthe“do˜cumenš¾9t's“con˜ten˜t“and“a“structured“Hask˜ellŽ¡‘íºâv‘ÿ|ralue.‘F#Hence,‘Ñ^bš¾9y–xwriting“doAÇcumen˜t“proAÇcessing“scripts“toŽ¡‘íºâmanipulate–i—the“tš¾9ypAÇed“Hask˜ell“v›ÿ|ralue,‘¾§the“script“v˜alidationŽ¡‘íºâproblem–V”is“just“an“instance“of“normal“Haskš¾9ell“t˜ypAÇe“infer-Ž¡‘íºâence.Ÿü-=½9ŽŽŽŸ—ü‘íºâÄ3.2Ž‘üñDTD‘ŒÊtranslations.ŽŸÏþ‘íºâ¹An–†Ôexample“DTD‘†¯for“the“doAÇcumenš¾9t“sho˜wn“earlier“is“giv˜en“inŽ¡‘íºâFigure–¶ó7.‘üúThe“immediate“features“to“note“are:‘í?(1)“F‘ÿ:«or“ev¾9eryŽ‘íºâŸÀ‰ff_ÿ Ÿ× ‘ r}Ÿüûr°9ŽŽŽ‘Y±W–ÿZªell,‘\nearly!‘j†V“aliditšÈãy–F¬also“encompasses“some“other“minor“c˜hec˜ks,ŽŸfor–±Èinstance“that“IDREF“attributes“mÈãust“b7e“globally“unique.ŽŽŽ ÿZÌÌ þ™š’õºâ„ffðŽŸ33’õºâÆmodule–¹–AlbumDTD“whereŽ¤ ¡’õºâdata–¹–Album“=Ž¡’¡:Album–¹–Title“Artist“(Maybe“Recordingdate)Ž¡’$ú¾Coverart–¹–[Catalogno]“PersonnelŽ¡’$ú¾Tracks‘¹–NotesŽ¡’õºânewtype–¹–Title“=“Title“StringŽ¡’õºânewtype–¹–Artist“=“Artist“StringŽ¡’õºânewtype–¹–Recordingdate“=Ž¡’ATBRecordingdate‘¹–Recordingdate_AttrsŽ¡’õºâdata–¹–Recordingdate_Attrs“=“Recordingdate_Attrs“{Ž¡’¡:date–¹–::“Maybe“String,Ž¡’¡:place–¹–::“Maybe“String“}Ž¡’õºânewtype–¹–Coverart“=“Coverart“(String,“Maybe“Location)Ž¡’õºânewtype–¹–Location“=“Location“Location_AttrsŽ¡’õºâdata–¹–Location_Attrs“=“Location_Attrs“{Ž¡’¡:thumbnail–¹–::“Maybe“String,Ž¡’¡:fullsize‘ s,::–¹–Maybe“String“}Ž¡’õºânewtype–¹–Catalogno“=“Catalogno“Catalogno_AttrsŽ¡’õºâdata–¹–Catalogno_Attrs“=“Catalogno_Attrs“{Ž¡’¡:label–¹–::“String,Ž¡’¡:number–¹–::“String,Ž¡’¡:format–¹–::“Maybe“Format,Ž¡’¡:releasedate–¹–::“Maybe“String,Ž¡’¡:country–¹–::“Maybe“String“}Ž¡’õºâdata–¹–Format“=“CD“|“LP“|“MiniDiscŽ¡’õºânewtype–¹–Personnel“=“Personnel“[Player]Ž¡’õºânewtype–¹–Player“=“Player“Player_AttrsŽ¡’õºâdata–¹–Player_Attrs“=“Player_Attrs“{Ž¡’¡:name–¹–::“String,Ž¡’¡:instrument–¹–::“String“}Ž¡’õºânewtype–¹–Tracks“=“Tracks“[Track]Ž¡’õºânewtype–¹–Track“=“Track“Track_AttrsŽ¡’õºâdata–¹–Track_Attrs“=“Track_Attrs“{Ž¡’¡:title–¹–::“String,Ž¡’¡:credit–¹–::“Maybe“String,Ž¡’¡:timing–¹–::“Maybe“String“}Ž¡’õºânewtype–¹–Notes“=“Notes“(Maybe“String,“[Notes_])Ž¡’õºâdata–¹–Notes_“=Ž¡’¡:Notes_Str‘¹–StringŽ¡’ÿ.|–¹–Notes_Albumref“AlbumrefŽ¡’ÿ.|–¹–Notes_Trackref“TrackrefŽ¡’õºânewtype–¹–Albumref“=“Albumref“(String,String)Ž¡’õºânewtype–¹–Trackref“=“Trackref“(Maybe“String,String)ŽŸ33’øC]¹Figure–T8:‘pThe“example“DTD“translated“to“Haskš¾9ell“t˜ypAÇes.ŽŽ¡’õºâ„ffðŽŽŸ’õºâelemenš¾9t,‘KÆthere– °is“a“spAÇeci cation“of“allo˜w˜ed“inner“elemen˜tsŽ¤ ’õºâ(ÆELEMENT‘òŹdeclaration),‘*hand–òþpšAÇossibly“also“a“sp˜eci cation“ofŽ¡’õºâallo•¾9w“ed–vÚattribute“v‘ÿ|ralues“(ÆATTLIST‘vÁ¹declaration).‘A(2)“F‘ÿ:«or“in-Ž¡’õºâner›²ãcon•¾9ten“t,‘Æ”the˜grammar˜allo“ws˜sequence˜(commas),‘Æ”c“hoiceŽ¡’õºâ(v•¾9ertical› bar),‘Koptionalit“y˜(question˜mark),‘Kand˜repAÇetitionŽ¡’õºâ(star–‚ or“plus).‘b(3)“Where“the“inner“con•¾9ten“t–‚ declaration“al-Ž¡’õºâloš¾9ws–àæfree“text“(Æ#PCDATA¹),“c˜hoice“bAÇet˜w˜een“text“and“other“ele-Ž¡’õºâmenš¾9ts–Òis“pAÇermitted,‘Sbut“sequencing“of“those“elemen˜ts“is“notŽ¡’õºâpAÇermitted.‘íç(4)–‰ºIn“attribute“lists,‘¥¦some“v‘ÿ|ralues“are“mandatoryŽ¡’õºâ(Æ#REQUIRED¹)‘i]and–iµsome“are“optional“(Æ#IMPLIED¹);“attributeŽ¡’õºâv‘ÿ|ralues–#ƒcan“either“bAÇe“unconstrained“strings“(ÆCDATA¹)‘#Eor“a“mem-Ž¡’õºâbAÇer–Tof“some“pre-de ned“set“of“string“v‘ÿ|ralues.Ž¡’:âThere–v¨seem“to“bšAÇe“some“ob¾9vious“corresp˜ondences“b˜et•¾9w“eenŽ¡’õºâthis–/‹vš¾9ery“restricted“form“of“t˜ypAÇe“language“and“the“ric˜her“t˜ypAÇeŽŽŽŽŽŸ’çjã8ŽŽŒ‹ µÕ •ºâ ý? £ ý€‘íºâ¹language–9 of“Hask•¾9ell.‘‡ÕEac“h›9 elemen“t˜declaration˜is˜roughlyŽ¤ ‘íºâspšAÇeaking–­1a“new“datat¾9yp˜e“declaration.‘ùºSequence“is“lik¾9e“pro˜d-Ž¡‘íºâuct–å tš¾9ypAÇes“(i.e.“single-constructor“v‘ÿ|ralues).‘ XChoice“is“lik˜e“sumŽ¡‘íºâtš¾9ypAÇes–è=(i.e.“m˜ulti-constructor“v‘ÿ|ralues).‘•*Optionalit˜y“is“just“aŽ¡‘íºâÆMaybe›T¹t¾9yp•AÇe.‘pRep“etition˜is˜lists.Ž¡‘û:âAš¾9ttribute–Álists“also“ha˜v˜e“a“translation:‘-IbAÇecause“theyŽ¡‘íºâare–“sunordered“and“accessed“bš¾9y“name,‘²ûHask˜ell“named- eldsŽ¡‘íºâlošAÇok–åclik¾9e“a“go˜o˜d“represen•¾9tation.‘ŒOptionalit“y–åccan“again“b˜eŽ¡‘íºâexpressed–5Ras“ÆMaybe“¹t•¾9ypAÇes.‘|iA“ttribute–5Rv‘ÿ|ralues“that“are“con-Ž¡‘íºâstrained–°to“a“particular“v‘ÿ|ralue-set“can“bšAÇe“mo˜delled“b¾9y“de n-Ž¡‘íºâing–~a“new“enš¾9umeration“t˜ypšAÇe“encompassing“the“p˜ermittedŽ¡‘íºâstrings.Ž©—ü‘íºâÄ3.3Ž‘üñImplemen´CtationŽŸÏþ‘íºâ¹These–~1rules“are“formalised“in“the“appAÇendix“(Figure“9).‘WAnŽ¡‘íºâimplemen¾9tation–‰•of“these“rules“(with“some“additional“rules“toŽ¡‘íºâeliminate–ýredundancy)“translated“the“DTD‘ýin“Figure“7“in¾9toŽ¡‘íºâthe–THaskš¾9ell“t˜ypAÇe“declarations“sho˜wn“in“Figure“8.Ž¡‘û:âAlso–0needed,›kalong“with“the“t¾9ypAÇe“declarations,˜are“func-Ž¡‘íºâtions–‘dwhicš¾9h“read“and“write“v‘ÿ|ralues“of“these“t˜ypAÇes“to“and“fromŽ¡‘íºâactual–ÊSXML‘Ê@doAÇcumen¾9ts.‘pThese“are“generated“automaticallyŽ¡‘íºâfrom–JYthe“t¾9ypAÇe“declarations“alone.‘»Using“an“appropriate“setŽ¡‘íºâof–eïpre-de ned“tš¾9ypAÇe“classes,‘‰w˜e“deriv˜e“a“new“instance“for“eac˜hŽ¡‘íºâgenerated–Tt¾9ypšAÇe“using“a“to˜ol“lik¾9e“DrIFT“[16Ž‘ ?ü].Ž¦‘íºâÄ3.4Ž‘üñDiscussionŽŸÏþ‘íºâ¹Although–÷Cthis“t¾9ypšAÇe-based“translation“lo˜oks“straigh•¾9tforw“ard,Ž¡‘íºâit–Tturns“out“that“there“are“sevš¾9eral“tric˜ky“issues.Ž¡‘û:âFirst,‘´Ùthe–”òtš¾9ypAÇe“translation“ma˜y“only“use“datat˜ypAÇes“andŽ¡‘íºânewt•¾9ypšAÇes,‘¤nev“er‘Çt“yp˜e›Çsynon“yms.‘&ÊThis˜is˜a˜result˜of˜needingŽ¡‘íºâto–ewrite“v‘ÿ|ralues“out“as“XML‘&{“a“tš¾9ypAÇe“synon˜ym“in“Hask˜ellŽ¡‘íºâis–Tcindistinguishable“from“the“t¾9ypAÇe“it“abbreviates,‘¤&but“theŽ¡‘íºâgenerated–Ûétš¾9ypAÇes“m˜ust“bšAÇe“distinct“in“order“to“b˜e“able“toŽ¡‘íºâre-in¾9troAÇduce–â«enclosing“start“and“end“tags“with“the“correctŽ¡‘íºâmarkup.Ž¡‘û:âA‘´…separate–´®tš¾9ypAÇe“is“in˜troAÇduced“for“eac˜h“collection“of“at-Ž¡‘íºâtributes.‘ö&Hence,‘Úµan–³;elemenš¾9t“is“represen˜ted“b˜y“a“pairing“ofŽ¡‘íºâthe–òZattributes“and“the“con•¾9ten“t.‘ÇWhere–òZa“tagged“elemen¾9t“di-Ž¡‘íºârectly–Qmconš¾9tains“an“optional“t˜ypAÇe“or“a“sequence“of“t˜ypAÇes“whic˜hŽ¡‘íºâare–ɳthemselvš¾9es“sum-t˜ypAÇes,‘ØÔit“is“necessary“to“in˜terpAÇose“a“sep-Ž¡‘íºâarate–XHaskš¾9ell“t˜ypAÇe,‘aØe.g.“ÆNotes“¹con˜tains“a“Æ[Notes_]“¹whereŽ¡‘íºâthe–Tauxiliary“tš¾9ypAÇe“ÆNotes_“¹has“three“alternativ˜es.Ž¡‘û:âNaming–ÍÚis“a“big“issue.‘Case“matters“in“XML,“so“a“ÆŽ¡‘íºâ¹di ers–Åàfrom“a“Æ“¹and“attribute“Æattr“¹di ers“from“ÆAttr¹.Ž¡‘íºâIn–<"Haskš¾9ell“ho˜w˜ev˜er,‘EÖt˜ypAÇes“m˜ust“bšAÇegin“with“upp˜er-case,‘EÖandŽ¡‘íºâ eld-names–_Emš¾9ust“bAÇegin“with“lo˜w˜er-case.‘úBWhere“auxiliaryŽ¡‘íºâtš¾9ypAÇes–"Ïare“necessary‘ÿ:«,‘f-w˜e“ha˜v˜e“c˜hosen“to“appAÇend“an“under-Ž¡‘íºâscore–K\c¾9haracter“to“the“name.‘¾‰All“of“these“factors“impAÇoseŽ¡‘íºârestrictions–ñÿon“the“use“of“this“translation,‘ùdue“to“the“pAÇoten-Ž¡‘íºâtial–Tname“con icts.Ž¡‘û:âF‘ÿ:«urthermore,‘i½there–¿Cis“a“mismatcš¾9h“bAÇet˜w˜een“Hask˜ell'sŽ¡‘íºânamed–g~ elds“and“the“attribute“naming/scoping“rules“inŽ¡‘íºâXML.–0yIn“XML,“di erenš¾9t“elemen˜ts“ma˜y“ha˜v˜e“attributes“ofŽ¡‘íºâthe–”‡same“name“and“tš¾9ypAÇe,‘®Jwhereas“Hask˜ell's“named“ elds“areŽ¡‘íºârestricted–ú†to“use“within“a“single“tš¾9ypAÇe.‘ÌA‘úJsystem“of“t˜ypAÇedŽ¡‘íºâextensible–Trecords“[5Ž‘Ÿþ]“wš¾9ould“bAÇe“a“m˜uc˜h“bAÇetter“ t.Ž¡‘û:âDespite–Öthese“problems“in“expressing“DTDs“within“theŽ¡‘íºâHask•¾9ell›rÐt“ypAÇesystem,‘Ê/the˜latter˜is˜v“ery˜m“uc“h˜more˜pAÇo“w“er-Ž¡‘íºâful–ìthan“DTDs“{“for“instance,‘aÒDTDs“ha•¾9v“e–ìno“notion“ofŽ¡‘íºâpšAÇolymorphism.‘[LIndeed,‘Úthere–Ôóare“frequen¾9t“o˜ccasions“whenŽ¡‘íºâDTD‘alwriters–aÂresort“to“textual“macrosŸü-=½10ŽŽ‘ 7¹to“indicate“moreŽ‘íºâŸ€‰ff_ÿ Ÿ× ‘ ]Ÿüûr°10ŽŽŽ‘Y±That–±Èis,“parameter“en•Èãtit“y‘±Èreferences.ŽŽŽ ý€’õºâ¹detailed–µöstructuring“than“DTDs“pšAÇermit“(including“p˜olymor-Ž¤ ’õºâphism–Ãand“quali ed“t•¾9yping),‘Fzev“en–Ãthough“suc¾9h“implicit“struc-Ž¡’õºâturing–¤cannot“bšAÇe“v‘ÿ|ralidated“b¾9y“XML‘£ñto˜ols.‘ȶIt“is“signi can¾9tŽ¡’õºâto–±Çnote“the“XML‘±Ÿcomm•¾9unit“y's–±Çrecognition“of“these“limita-Ž¡’õºâtions–ú§of“DTDs“{“recen¾9t“propAÇosals“for“ÅschemasŸü-=½11ŽŽ‘ Ïõ¹address“theŽ¡’õºâquestion–Tof“ricš¾9her“t˜yping“in“a“more“disciplined“manner.Ž¡’:âOne–GÛarea“in“whicš¾9h“the“t˜ypAÇe“system“of“Hask˜ell“in“partic-Ž¡’õºâular–#(as“oppAÇosed“to“other“functional“languages)“is“exploitedŽ¡’õºâis–âÉtš¾9ypAÇe“classes.‘„ÏThis“systematic“o˜v˜erloading“mec˜hanism“isŽ¡’õºâvš¾9ery–Tuseful“for“coAÇdifying“the“I/O“con˜v˜ersions.ŽŸü’õºâÄ4Ž’´oPros–ŒÊand“cons“of“the“t•´Cw“o‘ŒÊsc“hemesŽŸ阒õºâ4.1Ž’ üñCom´CbinatorsŽŸÏþ’õºâ¹Compared–d;with“the“mainstream“solution“for“XML‘d'proAÇcess-Ž¡’õºâing,‘­_namely–[Änew“domain-spAÇeci c“languages“for“expressingŽ¡’õºâand–‹pscripting“transformations,‘§the“comš¾9binator“approac˜h“hasŽ¡’õºâsev•¾9eral‘Tadv‘ÿ|ran“tages:Ž©—ü’õºâÄEase–Ìkof“extension“and“v‘ÿh‰ariation‘ ?ü¹Scripting‘*ñlanguagesŽ¡’õºâsometimes–ÔÑlacš¾9k“useful“facilities,‘°or“pro˜vide“them“in“con˜v˜o-Ž¡’õºâluted›½¬w•¾9a“ys.–ÿ8Extending˜the˜language˜is˜dicult.“A‘½–com¾9bina-Ž¡’õºâtor‘<.library‘ÿ:«,›gho•¾9w“ev“er,˜can–<.bAÇe“enlarged“comparativš¾9ely“straigh˜t-Ž¡’õºâforw¾9ardly–^À{“the“de nitions“are“accessible,‘ƒEand“most“are“shortŽ¡’õºâand‘Tsimple.Ž¦’õºâÄComputational‘«pK¼o•´Cw“er‘ ?ü¹Scripting–ªdlanguages“tend“to“o erŽ¡’õºâeither–p¸a“vš¾9ery“limited“expression“language,‘‡’or“a“hoAÇok“in˜to“aŽ¡’õºâprogramming–ÌÞsystem“at“a“completely“di erenš¾9t“lev˜el“of“ab-Ž¡’õºâstraction.‘þBut–»Ùif“XML‘»®scripts“are“programs“in“a“languageŽ¡’õºâsucš¾9h–ñkas“Hask˜ell,‘ø™the“full“pAÇo˜w˜er“of“the“nativ˜e“language“is“im-Ž¡’õºâmediately‘Ta¾9v‘ÿ|railable.Ž¦’õºâÄAbstraction,‘ÀSgeneralit´Cy–5and“reuse‘ ?ü¹Almost–7Xan¾9y“patternŽ¡’õºâošAÇccurring–5—in“a“com¾9binator“program“can“b˜e“isolated“and“de-Ž¡’õºâ ned–Mcas“a“separate“re-usable“idea“[6Ž‘Ÿþ].‘ÙÊThis“also“applies“at“theŽ¡’õºâapplication–·úlev¾9el,‘ʦwhere“common“ideas“from“similar“applica-Ž¡’õºâtions–Eæmighš¾9t“easily“bAÇe“de ned“in“a“higher-lev˜el“library‘ÿ:«.‘®'ThisŽ¡’õºâform–VTof“re-use“makš¾9es“program“dev˜elopmen˜t“m˜uc˜h“quic˜k˜erŽ¡’õºâand–Tless“error-prone.Ž¦’õºâÄLa´Cws–úXfor“reasoning“abK¼out“scripts‘ ?ü¹The–t|seman¾9tics“of“aŽ¡’õºâscripting–•language“are“often“de ned“b¾9y“illustration.‘3So“itŽ¡’õºâis–}#hard“to“reason“with“con dence“abAÇout“the“meanings“ofŽ¡’õºâscripts.‘X½Is–~Ãó5ùž" cmmi9ÇA“¹just“a“st¾9ylistic“v‘ÿ|rariation“of“ÇB‘ñx¹or“are“there“in-Ž¡’õºâputs–for“whicš¾9h“the“t˜w˜o“could“giv˜e“di eren˜t“results?‘É^But“whenŽ¡’õºâthe–´fseman¾9tics“of“scripts“can“bAÇe“de ned“in“terms“of“the“equa-Ž¡’õºâtions–æfor“the“comš¾9binators,‘?‹propAÇerties“suc˜h“as“assoAÇciativit˜yŽ¡’õºâand–Tdistribution“can“often“bAÇe“demonstrated“simply‘ÿ:«.Ž¦’õºâÄImplemen´Ctation–|for“free‘ ?ü¹DoAÇes–(|a“scripting“language“ha•¾9v“eŽ¡’õºâan›6in•¾9teractiv“e˜in“terpreter?–~¦A›6compiler?“A˜t•¾9ypAÇe-c“hec“k“er?‘~¦AŽ¡’õºâpro ler?‘þAll–º6these“things“are“immediately“a¾9v‘ÿ|railable“to“XMLŽ¡’õºâscripts–Tdirectly“expressed“as“Hask¾9ell“programs.Ž’õºâŸ@‰ff_ÿ Ÿ× ‘ ]Ÿüûr°11ŽŽŽ‘YÉhttp://www.w3.org/TR/xmlschema-1–±È±for“structures,ŽŸand–±ÈÉhttp://www.w3.org/TR/xmlschema-2“±for“datatÈãyp7es.ŽŽŽŽŽŸ’çjã¹9ŽŽŒ‹ ÍC •ºâ ý? £ ý€‘íºâ¹Of–Tcourse,“there“are“disadv‘ÿ|ran¾9tages“toAÇo.Ž©é'‘íºâÄDistance–Þófrom“target“language‘ ?ü¹XSL‘ÿ:«T‘~1[3Ž‘Ÿþ]–~Whas“the“prop-Ž¤ ‘íºâert¾9y–ôthat“a“script“is“an“expression“in“the“target“language:Ž¡‘íºâit–ö¥uses“exactly“the“XML‘öksynš¾9tax“for“building“new“con˜ten˜t.Ž¡‘íºâComš¾9binator-based–iscripts“m˜ust“use“a“di eren˜t“syn˜tax“dueŽ¡‘íºâto–Fthe“underlying“language.‘:EThe“linguistic“gap“migh¾9t“causeŽ¡‘íºâconfusion–Tand“increase“learning“costs.Ž¦‘íºâÄLiving–——in“an“unfamiliar“w´Corld‘ ?ü¹Com¾9binator‘ýprogramsŽ¡‘íºâÅlo‡ok‘¯žlike–h·¹scripts“in“a“small“domain-spAÇeci c“language.‘âæW‘ÿ:«ritersŽ¡‘íºâma¾9y–bšAÇe“b˜eguiled“bš¾9y“this“apparen˜t“simplicit˜y‘ÿ:«,‘Emak˜e“a“small“er-Ž¡‘íºâror,‘ˆand–d¼drop“inš¾9to“an“unkno˜wn“corner“of“Hask˜ell.‘á“ErrorŽ‘ Omes-Ž¡‘íºâsages–Ž_maš¾9y“bAÇe“incomprehensible,‘¬¡or“w˜orse,‘¬¡the“script“migh˜tŽ¡‘íºâw¾9ork–Tbut“do“something“utterly“strange.Ž¦‘íºâÄ4.2Ž‘üñT´CypK¼e-based‘ŒÊtranslationŽŸÏþ‘íºâ¹Some–‰of“the“adv‘ÿ|ranš¾9tages“of“the“fully-t˜ypAÇed“represen˜tation“ofŽ¡‘íºâXML–TdoAÇcumenš¾9ts“ha˜v˜e“already“bAÇeen“men˜tioned.Ž¦‘íºâÄV‘ÿÌalidit´Cy‘ ?ü¹The–çabilit¾9y“for“the“system“to“spAÇot“errors“auto-Ž¡‘íºâmatically‘ÿ:«,›ÆÂnot–³just“in“the“data,˜but“in“the“program,˜and“alsoŽ¡‘íºâto›Tprev•¾9en“t˜the˜generation˜of˜incorrect˜doAÇcumen“t˜markup.Ž¦‘íºâÄDirect–9Iprogramming“st´Cyle‘ ?ü¹F‘ÿ:«unctional–gÙlanguages“en-Ž¡‘íºâcourage–œßthe“use“of“pattern-matc¾9hing“(binding“v›ÿ|ralues“to“v˜ari-Ž¡‘íºâables)–+Mon“the“left-hand-side“of“equations.‘^[Ho•¾9w“ev“er,‘pËusingŽ¡‘íºâhigher-order–vQcom¾9binators,‘Îdata“structures“tend“not“to“bAÇeŽ¡‘íºâmen¾9tioned–T•in“equations“at“all.‘Ú3The“DTD‘TCtranslation“ap-Ž¡‘íºâproacš¾9h– õis“m˜uc˜h“more“in“k˜eeping“with“the“pattern-bindingŽ¡‘íºâst•¾9yle,‘˜whic“h–©sometimes“leads“to“shorter“programs!‘âWhereasŽ¡‘íºâwith–çucomš¾9binators,‘ýit“is“sometimes“necessary“to“re-tra˜v˜erseŽ¡‘íºâthe–MGsame“selection“path“with“sligh¾9t“v‘ÿ|rariations,‘[Cthe“pattern-Ž¡‘íºâbinding–Tgiv¾9es“direct“access“for“free.ŽŸ9ó‘íºâDisadv‘ÿ|ran¾9tages‘Tare:Ž¦‘íºâÄHigh–ëstartup“cost‘ ?ü¹Before–èscripting“doAÇcumen¾9t“transfor-Ž¡‘íºâmations,›ä=it–ºÜis“necessary“to“acquire,˜c•¾9hec“k,˜and–ºÜproAÇcess“theŽ¡‘íºâDTD.–ÔAlthough“the“generation“of“Haskš¾9ell“t˜ypAÇes“is“auto-Ž¡‘íºâmated,‘Åhfew–nþpšAÇeople“are“familiar“enough“with“DTDs“to“b˜eŽ¡‘íºâable–ý to“start“using“them“immediately‘ÿ:«.‘ÓšThey“require“care-Ž¡‘íºâful– ?study“and“understanding“bšAÇefore“correct“scripts“can“b˜eŽ¡‘íºâwritten–Tand“the“initial“in•¾9v“estmen“t–Tof“e ort“pa¾9ys“o .Ž¦‘íºâÄIncomplete–ít´CypšK¼e“mo˜del‘ ?ü¹The–ŠŸgrammar“of“DTDs“is“smallŽ¡‘íºâand–RŸrestrictivš¾9e“compared“to“the“sophisticated“t˜ypAÇe“systemsŽ¡‘íºâaš¾9v‘ÿ|railable–˜±in“functional“languages.‘¦‡Better“means“of“t˜ypAÇe-Ž¡‘íºâspAÇeci cation–)in“XML‘(Öare“still“under“dev•¾9elopmen“t.‘WËIn‘)theŽ¡‘íºâmean¾9time,‘O there–Nis“little“scopšAÇe“for“using“the“full“p˜o•¾9w“er‘NofŽ¡‘íºâfeatures–Tlik¾9e“pAÇolymorphism.ŽŸá'‘íºâÄ5Ž‘ý´oRelated‘ŒÊW‘ÿÌorkŽŸ阑íºâXML‘­ ProK¼cessing‘ ?ü¹There–1`are“infan¾9t“proAÇcessing“languagesŽ¡‘íºâsurrounding–TXML.“Of“most“in¾9terest“here“are:ŽŸË;‘øæÈŽŽŽ‘ºä¹XSL›ÿ:«T‘ž[3Ž‘Ÿþ]–ž±(eXtensible“St¾9yle“Language“for“T˜ransforma-Ž¡‘ºätion)–ß„is“a“W3C-propAÇosed“declarativ¾9e“language“for“ex-Ž¡‘ºäpressing–a“limited“form“of“transformations“on“XML‘ÌdoAÇc-Ž¡‘ºäumen•¾9ts,‘Ñîoriginally›¬5in“tended˜for˜rendering˜to˜a˜la“y“out-Ž¡‘ºäbased–¾format,›AØe.g.‘í­HTML,“P¾9ostScript,˜etc.,˜but“no¾9wŽ¡‘ºäwidely–Tused“for“XMLÈ!¹XML“transformations.ŽŽŽ ý€’æÈŽŽŽ’ ºä¹DSSSL‘Žø[12Ž‘ ?ü]–Y(DoAÇcumenš¾9t“St˜yle“Seman˜tics“and“SpAÇeci -Ž¤ ’ ºäcation–NøLanguage)“is“a“mature“ISO‘N§standard“with“noŽ¡’ ºäcomplete–Û implemen¾9tations.‘m‘It“is“similar“in“essence“toŽ¡’ ºäXSL‘ÿ:«T,–—but“deals“with“full“SGML‘Âjinput,‘íèand“is“basedŽ¡’ ºäon‘TSc¾9heme.ŽŸ33’:âNot– manš¾9y“functional“language“researc˜hers“are“visibly“en-Ž¡’õºâgaged–1kin“XML-related“wš¾9ork,‘_but“t˜w˜o“other“toAÇolkits“for“XML-Ž¡’õºâproAÇcessing–aßare“Christian“Lindig's“XML‘aËparser“in“O'CamlŸü-=½12ŽŽŽ¡’õºâ¹and–]éAndreas“Neumann's“v‘ÿ|ralidating“XML‘]Öparser“in“SMLŸü-=½13ŽŽ‘ÕN¹.Ž¡’õºâT‘ÿ:«o–'our“knoš¾9wledge,‘+neither“of“these“pro˜vides“transformationŽ¡’õºâcapabilities–7åin“either“a“comš¾9binator“st˜yle“or“a“t˜ypAÇe-translationŽ¡’õºâstš¾9yle.‘ú¼Philip–°8W‘ÿ:«adler“has“written“a“short“formal“seman˜tics“ofŽ¡’õºâXSL–Tselection“patterns“[15Ž‘ ?ü].Ž©—ü’õºâÄApplication-based‘õµcom´Cbinators‘ ?ü¹P¾9arsing–-'is“the“mostŽ¡’õºâextensivš¾9ely–studied“application“for“com˜binator“libraries.Ž¡’õºâSince–Žõthe“original“treatmenš¾9t“b˜y“Burge“[2Ž‘Ÿþ],‘­]there“ha˜v˜e“bAÇeenŽ¡’õºâmanš¾9y–ÿv‘ÿ|rariations“on“the“theme.‘*rSwierstra“and“DupAÇonc˜heel'sŽ¡’õºâmetho•AÇd›¥incorp“orating˜on-the- y˜grammar˜analysis˜andŽ¡’õºâerror-correction–€Eis“a“notable“recen¾9t“example“[10Ž‘ ?ü].‘]BW‘ÿ:«e“hopAÇeŽ¡’õºâit–Ìma¾9y“bšAÇe“p˜ossible“to“incorp˜orate“DTD-analysis“in“our“com-Ž¡’õºâbinators–Tin“a“similar“st¾9yle.Ž¡’:âAlthough–manš¾9y“other“libraries“of“application“com˜bina-Ž¡’õºâtors›Pha•¾9v“e˜bAÇeen˜devised,‘wythe˜general˜design˜principles˜for˜suc“hŽ¡’õºâlibraries–£jare“scarcely“referred“to“in“the“literature.‘ƳHughes'Ž¡’õºâexpAÇosition–-#of“a“design“for“prett•¾9y-prin“ting›-#com“binators˜[7Ž‘Ÿþ]˜isŽ¡’õºâa–´vunique“resource“in“this“respAÇect,‘ÇÖand“wš¾9e“ha˜v˜e“y˜et“to“exploitŽ¡’õºâit‘Tfully‘ÿ:«.Ž¦’õºâÄT‘ÿÌree-pro•K¼cessing‘‚•op“erators‘ ?ü¹An–.earlier“v¾9ersion“of“this“pa-Ž¡’õºâpšAÇer–­prompted“more“than“one“p˜oinš¾9ter“to“the“w˜ork“of“EelcoŽ¡’õºâVisser–‘ðand“colleagues“[13Ž‘ ?ü].‘’CTheir“motiv‘ÿ|rating“application“isŽ¡’õºâspAÇeci cation–ü–extension“simpli es“v‘ÿ|rarious“tree-proAÇcessing“tasks,Ž¡‘íºâand–{éalso“hoš¾9w“it“can“bAÇe“translated“in˜to“standard“Hask˜ell.Ž¡‘íºâThis–óWwš¾9ork“could“pro˜vide“one“compAÇonen˜t“of“a“h˜ybrid“solu-Ž¡‘íºâtion,‘ïNwith–åÌDTD-spAÇeci c“represen¾9tation“Åand“¹generic“forms“ofŽ¡‘íºâtra•¾9v“ersal–Tand“matc¾9hing.Ž¡‘û:âVisser–ŽCet.“al.“[13Ž‘ ?ü]“also“discuss“sevš¾9eral“other“approac˜hes“toŽ¡‘íºâthe–Ttree“transformation“problem.ŽŸü‘íºâÄ6Ž‘ý´oConclusions–ŒÊand“F›ÿÌuture“W˜orkŽŸ阑íºâ¹In–9)our“expAÇerience,‘‚Haskš¾9ell“is“a“v˜ery“suitable“language“forŽ¡‘íºâXML‘:×proAÇcessing.‘ÞF‘ÿ:«or–;#generic“applications,‘„—a“small“set“ofŽ¡‘íºâcom¾9binators–Ô2designed“with“algebraic“propAÇerties“in“mind“canŽ¡‘íºâb•AÇe› Óp“o•¾9w“erful˜enough˜and˜ exible˜enough˜to˜describAÇe˜a˜fullŽ¡‘íºârange–:of“selection,–R³testing,“and–:construction“opAÇerations“inŽ¡‘íºâa–÷uniform“framew¾9ork.‘ ÁF‘ÿ:«or“applications“where“the“DTDŽ¡‘íºâis–éÙ xed,‘^ùa“tošAÇol“deriving“corresp˜onding“t¾9yp˜es“and“asso˜ci-Ž¡‘íºâated–»ƒI/O›»Xroutines“turns“XML˜proAÇcessing“inš¾9to“Hask˜ell“pro-Ž¡‘íºâgramming›¨6o•¾9v“er˜t“ypAÇed˜data˜structures,‘¾ and˜the˜Hask“ell˜t“ypAÇe-Ž¡‘íºâc•¾9hec“k“er–Tv‘ÿ|ralidates“scripts.Ž¡‘û:âHo•¾9w“ev“er,›¨)there–W™is“plen•¾9t“y–W™of“scopAÇe“for“further“w¾9ork,˜inŽ¡‘íºâsev¾9eral‘Tdirections:ŽŸ—ü‘íºâÄGeneralitš´Cy–y“of“com˜binators‘ ?ü¹Though–âýwš¾9e“ha˜v˜e“had“gen-Ž¡‘íºâeralitš¾9y–èas“a“design“aim“for“our“presen˜t“com˜binator“libraryŽ¡‘íºâthere–Tis“scopAÇe“for“generalising“it“further.ŽŸ33‘øæÈŽŽŽ‘ºäÅWider‘;\functionality.‘"¹Most›_con•¾9ten“t˜ lters˜in˜our˜cur-Ž¡‘ºären¾9t– [library“are“either“pure“selectors“(with“results“thatŽ¡‘ºäare–º”sequences“of“sub-trees“from“the“full“doAÇcumen¾9t“tree)Ž¡‘ºäor–Z1pure“constructors“(creating“doAÇcumenš¾9t“con˜ten˜t“fromŽ¡‘ºäv‘ÿ|ralues–$£of“other“t¾9ypšAÇes).‘J\The“design“could“usefully“b˜eŽ¡‘ºäextended–to“include“a“more“general“class“of“Ådeletion“¹op-Ž¡‘ºäerations–îªin“whic¾9h“sub-trees“can“bAÇe“thinned“and“prunedŽ¡‘ºäin–ðov‘ÿ|rarious“w•¾9a“ys.‘$More–ðogeneral“still“are“com¾9binators“forŽ¡‘ºäÅeš‡diting–,ôand“tr˜ansforming¹,‘DIwhere–±some“of“the“ideas“inŽ¡‘ºäVisser's–Tw¾9ork“could“usefully“bAÇe“transferred.Ž©34‘øæÈŽŽŽ‘ºäÅMultiple–±¿inputs“and“outputs.‘aN¹An–žin¾9teresting“extensionŽ¡‘ºäof–Ã,single-doAÇcumenš¾9t“scripting“is“the“handling“of“m˜ulti-Ž¡‘ºäple›< do•AÇcumen¾9ts.‘ÔPro“ducing˜more˜than˜one˜output˜do“c-Ž¡‘ºäumenš¾9t–«$is“no“great“problem.‘ù But“it“is“far“more“c˜halleng-Ž¡‘ºäing–ê¤to“design“appropriate“com¾9binators“for“dealing“withŽ¡‘ºäsev¾9eral‘Tinputs.Ž¦‘øæÈŽŽŽ‘ºäÅMor•‡e›ïgener“al˜typ“es.‘)b¹The–ÄOlabAÇelling“scš¾9heme“has“pro˜v˜edŽ¡‘ºäuseful–¤Òfor“some“applications,‘ȱbut“the“need“for“a“sepa-Ž¡‘ºärate–¡9ÆLabelFilter“¹t¾9ypšAÇe“is“a“blemish.‘õ¼W‘ÿ:«e“hop˜e“to“gener-Ž¡‘ºäalise–xôthe“ÆCFilter“¹t¾9ypšAÇe“to“incorp˜orate“ÆLabelFilter“¹asŽ¡‘ºäa–SspšAÇecial“case.‘ÖöBy“making“the“ÆCFilter“¹t¾9yp˜e“paramet-Ž¡‘ºäric–« it“mighš¾9t“ev˜en“bšAÇe“p˜ossible“to“incorp˜orate“the“t¾9yp˜e-Ž¡‘ºätranslation–Œof“DTDs“within“the“comš¾9binator“framew˜ork.ŽŸ—ü‘íºâÄEciency–Êof“com´Cbinators‘ ?ü¹The–(Ùcurrenš¾9t“com˜binator“li-Ž¡‘íºâbrary–‘|is“quite“usable,‘°†but“here“are“some“pAÇossible“routes“toŽ¡‘íºâgreater‘Teciency‘ÿ:«.ŽŸ33‘øæÈŽŽŽ‘ºäÅAÃŽlgebr‡aic‘ðùnormalisation–Æl¹So“far“wš¾9e“ha˜v˜e“merely“estab-Ž¡‘ºälished–ûuthat“la¾9ws“hold,‘týand“ošAÇccasionally“app˜ealed“toŽ¡‘ºäthem–H€when“writing“scripts.‘Ø)The“implemen¾9tation“simplyŽ¡‘ºäde nes–„êthe“comš¾9binators“b˜y“their“spAÇecifying“equations.ŽŽŽ ý€’ ºäInstead,‘¾#laš¾9ws–¨Wcould“bAÇe“exploited“at“the“implemen˜tationŽ¤ ’ ºälev•¾9el.‘ ÇF‘ÿ:«ollo“wing–àYHughes“[7Ž‘Ÿþ],‘êòwš¾9e“ha˜v˜e“in“mind“an“imple-Ž¡’ ºämenš¾9tation–Ï›that“automatically“reduces“all“com˜binationsŽ¡’ ºäto–…§a“Ånormal‘Ê6form¹,‘¢dthat“is“the“least“expAÇensivš¾9e“equiv‘ÿ|ralen˜tŽ¡’ ºäcomputationally‘ÿ:«.Ž©34’æÈŽŽŽ’ ºäÅSp•‡ac“e-ecient‘Qkformulation–/`¹Some“lazy“functional“pro-Ž¡’ ºägrams–Ï‹that“proAÇcess“trees“in“pre-order“left-to-righ¾9t“fash-Ž¡’ ºäion–†úcan“bAÇe“form¾9ulated“to“run“in“log(N)‘†›space.‘qcTheŽ¡’ ºäpart–8of“the“tree“that“is“held“in“memory“correspAÇonds“toŽ¡’ ºäa–[path“from“the“rošAÇot“to“some“no˜de“that“is“curren¾9tlyŽ¡’ ºäthe–µOfoAÇcus“of“computation:‘ìnto“the“left“are“`garbage'“sub-Ž¡’ ºätrees–áalready“proAÇcessed,‘što“the“righ¾9t“are“subtrees“notŽ¡’ ºäy•¾9et›){ev‘ÿ|raluated.‘XåHo“w“ev“er,‘.…our˜curren“t˜com“binators˜ha“v“eŽ¡’ ºänot–¸ebAÇeen“formš¾9ulated“to“guaran˜tee“this“sort“of“space“bAÇe-Ž¡’ ºäha•¾9viour,‘ev“en–Šin“fa•¾9v“ourable–Šcases.‘#This“problem“migh¾9tŽ¡’ ºäbAÇe–Ttacš¾9kled“b˜y“the“normalisation“approac˜h.Ž¦’æÈŽŽŽ’ ºäÅDTD-awar•‡e‘+‡c“ombinators–¹¹The“currenš¾9t“com˜binator“li-Ž¡’ ºäbrary–just“ignores“DTDs.‘ 0£Com¾9binators“that“main-Ž¡’ ºätain–RDTD‘Rinformation“migh¾9t,›aPfor“example,˜ac•¾9hiev“e‘RfarŽ¡’ ºämore–oDecienš¾9t“searc˜h“in“some“cases“b˜y“pruning“branc˜hesŽ¡’ ºäbšAÇound–}$to“fail.‘éµThey“could“also“b˜e“used“to“pro˜duce“ rst-Ž¡’ ºäclass–†ŸXML‘†{doAÇcumen¾9ts“as“the“results“of“queries,‘£*not“justŽ¡’ ºäraš¾9w–‚€extracts“of“unkno˜wn“t˜ypAÇe.‘cõAs“w˜e“ha˜v˜e“alreadyŽ¡’ ºänoted,‘~äDTDs–6”could“pšAÇerhaps“b˜e“attac¾9hed“as“lab˜els“inŽ¡’ ºäthe–Þsense“of“Èx¹2.4:‘5either“as“explicit“v‘ÿ|ralues“or“implicitlyŽ¡’ ºäin–Tt¾9ypAÇe“information.ŽŸ—ü’õºâÄRelations›rrbK¼et•´Cw“een˜DTDs‘ ?ü¹As– wš¾9e“ha˜v˜e“seen,‘Q&in“the“DTD-Ž¡’õºâdirected–Èapproacš¾9h“with“kno˜wn“ xed“DTDs“for“input“and“out-Ž¡’õºâput,‘: v‘ÿ|ralidation–ÿ€translates“to“static“t•¾9ypAÇe-c“hec“king;‘t—whereasŽ¡’õºâgeneric–×ôcom¾9binators“could“in“principle“acquire“and“computeŽ¡’õºâDTDs–ðûdynamically‘ÿ:«.‘¯eThese“represen¾9t“extremes“with“disad-Ž¡’õºâv‘ÿ|ranš¾9tages–±•of“in exibilit˜y“on“the“one“hand“and“some“insecu-Ž¡’õºâritš¾9y– jon“the“other.‘½²There“are“man˜y“other“w˜a˜ys“of“handlingŽ¡’õºârelations›TbAÇet•¾9w“een˜DTDs.‘pF‘ÿ:«or˜example:ŽŸ33’æÈŽŽŽ’ ºäÅPolymorphic–d and“higher-or‡der“scripts.‘ ë&¹The‘Z;genericŽ¡’ ºäapproac•¾9h›Ï w“ould˜gain˜securit“y˜if˜one˜could˜Åinfer˜¹aŽ¡’ ºäDTDÈ!¹DTD‘pšfunction.‘å•By–pÄanalogy“with“functional“pro-Ž¡’ ºägrams–\»it“is“then“natural“to“assign“scripts“pAÇolymorphicŽ¡’ ºäand–uýhigher-order“DTDs,‘•Ümaking“explicit“their“degree“ofŽ¡’ ºägenericit¾9y‘ÿ:«.Ž¦’æÈŽŽŽ’ ºäÅInclusion›““b•‡etwe“en˜DTDs.‘þϹThis–`Éhas“bAÇeen“implicitly“as-Ž¡’ ºäsumed–ªalready‘ÿ:«,‘¿èbut“has“practical“impAÇortance“in“its“o¾9wnŽ¡’ ºärigh•¾9t.‘UAs›}ŠstoAÇc“k˜DTDs˜are˜re ned,‘×—XML‘}-doAÇcumen“tsŽ¡’ ºäwill–®inhabit“a“hierarc•¾9h“y–®of“spAÇecialisation.‘ú Givš¾9en“sev˜eralŽ¡’ ºäsimilar–afDTDs,‘…cone“wš¾9ould“lik˜e“to“deriv˜e“a“DTD‘a8for“a“vir-Ž¡’ ºätual–?¬common“roAÇot“(inš¾9tersection)“or“common“descenden˜tŽ¡’ ºä(union).‘rØThis–2!gošAÇes“w¾9ell“b˜eyš¾9ond“the“abilities“of“curren˜tŽ¡’ ºätš¾9ypAÇe-inference–)µsystems,‘.Îbut“w˜ould“mak˜e“a“useful“addi-Ž¡’ ºätion–Tto“our“functional“tošAÇolkit“for“XML“pro˜cessing.ŽŸü’õºâÄAc•´Ckno“wledgemen“tsŽŸ阒õºâ¹Canon–ÖgResearcš¾9h“Cen˜tre“(EuropAÇe)“Ltd.‘_©suggested“this“lineŽ¡’õºâof–¦¹]˜]Ž’`®=Ž’r?Ænewtype–¹–Ëm“Æ=ŽŽ¡’„þ—Ëm–¹–Æ(Ëm‘‘$‰ffÕÂŽ›fæÆAttrs,“Ëm‘‘$‰ffÕÂŽ˜Æ)ŽŽ¡’r?newtype–¹–Ëm‘‘$‰ffÕÂŽ‘ |Æ=“ÈDAǹ[–þuV[Ëspec¹]“]‘¹–ËmŽŽ¡’Ÿ,a¹where‘TËm–¹–¹=“ÈM¹[–þuV[Æn¹]“]ŽŽ¡’û!IÈT‘Nî¹[‘þuV[ƹ]‘þuV]Ž’`®=Ž’r?Ædata–¹–Ëm‘‘$‰ffÕÂŽ‘fæÆAttrs“=ŽŽ¡’{‹kËm‘‘$‰ffÕÂŽ‘fæÆAttrs–¹–Èf“F‘对[–þuV[Åde‡clŸÿÿÌ0ŽŽ‘V¹]“]ŽŽ¡’¡XÆ,‘T¹.–Šª.“.ŽŽ¡’¡XÆ,ÈF‘对[–þuV[Åde‡clŸÌkŽŽ‘†¹]“]‘¹–ÈgŽŽ¡’Ÿ,a¹where‘TËm–¹–¹=“ÈM¹[–þuV[Æn¹]“]ŽŽ¡’r?ÈA¹[–þuV[Åde‡clŸÿÿÌ0ŽŽ‘V¹]“]ŽŽ¡’r?.–Šª.“.ŽŽ¡’r?ÈA¹[–þuV[Åde‡clŸÌkŽŽ‘†¹]“]ŽŽ¡¡’5ÍôÄRHS–ŒÊof“t´CypK¼e“declarationsŽŽ¡’û!IÈDAǹ[‘þuV[(ÇxŸÿÿ½0Ž–*§Ç;›ŠªxŸÿÿ½1Ž“Ç;˜:˜:˜:Ž‘ ßú;˜xŸó;Îcmmi6ÀkŽ‘—¹)]‘þuV]ÇmŽ’`®¹=Ž’r?ÈC‘f¹[–þuV[Çm‘TxŸÿÿ½0Ž‘Ê¥Ç:–Šª:“:Ž‘5IxŸÀkŽ‘—¹]“]ŽŽ¤ (’~ ¿ÈDAÇŸü-=óq¡% cmsy6Ã0Ž›óŽ¹[–þuV[ÇxŸÿÿ½0Ž‘*§¹]“]‘TÈDAÇŸü-=Ã0Ž˜¹[“[ÇxŸÿÿ½1Ž‘*§¹]“]‘ŸþÇ:–Šª:“:Ž‘ ¢ÈDAÇŸü-=Ã0Ž˜¹[“[ÇxŸÀkŽ‘—¹]“]ŽŽ¡’û!IÈDAǹ[›þuV[(ÇxŸÿÿ½0Ž–*§ÈjÇxŸÿÿ½1Ž“Èj–ŠªÇ:“:“:Ž‘ ßúÈjÇxŸÀkŽ‘—¹)]˜]ÇmŽ’`®¹=Ž’xBçÈC‘f¹[–þuV[Çm›TxŸÿÿ½0Ž‘*§¹]“]˜ÈDAÇŸü-=Ã0Ž‘óŽ¹[“[ÇxŸÿÿ½0Ž‘*§¹]“]ŽŽ¡’r?Æ|–TÈC‘f¹[›þuV[Çm“xŸÿÿ½1Ž‘*§¹]˜]“ÈDAÇŸü-=Ã0Ž‘óŽ¹[˜[ÇxŸÿÿ½1Ž‘*§¹]˜]ŽŽ© ’r?Æ|‘T¹.–Šª.“.ŽŽ¡’r?Æ|–TÈC‘f¹[›þuV[Çm“xŸÀkŽ‘—¹]˜]“ÈDAÇŸü-=Ã0Ž‘óŽ¹[˜[ÇxŸÀkŽ‘—¹]˜]ŽŽ¡’û!IÈDAǹ[–þuV[(Çx¹)?]“]ÇmŽ’`®¹=Ž’r?ÆMaybe‘TÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[Çx¹]“]ŽŽ¡’û!IÈDAǹ[–þuV[(Çx¹)+]“]ÇmŽ’`®¹=Ž’r?ÆList1‘TÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[Çx¹]“]ŽŽ¡’û!IÈDAǹ[–þuV[(Çx¹)ȹ]“]ÇmŽ’`®¹=Ž’r?Æ[›TÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[Çx¹]“]˜Æ]ŽŽ¦’û!IÈDAǹ[–þuV[Çx¹]“]ÇmŽ’`®¹=Ž’r?ÈC‘f¹[–þuV[Çm‘Tx¹]“]ŽŽ¦¦’< -ÄInner–ŒÊt´CypK¼e“expressionsŽŽ¡’û!IÈDAÇŸü-=Ã0Ž‘óŽ¹[‘þuV[(ÇxŸÿÿ½0Ž–*§Ç;›ŠªxŸÿÿ½1Ž“Ç;˜:˜:˜:Ž‘ ßú;˜xŸÀkŽ‘—¹)]‘þuV]Ž’`®=Ž’r?Æ(›TÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[ÇxŸÿÿ½0Ž‘*§¹]“]˜Æ,˜ÈDAÇŸü-=Ã0Ž‘óŽ¹[“[ÇxŸÿÿ½1Ž‘*§¹]“]ŽŽ¡’~ ¿Æ,‘TÇ:–Šª:“:Ž‘øÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[ÇxŸÀkŽ‘—¹]“]‘TÆ)ŽŽ¡’û!IÈDAÇŸü-=Ã0Ž‘óŽ¹[›þuV[(ÇxŸÿÿ½0Ž–*§ÈjÇxŸÿÿ½1Ž“Èj–ŠªÇ:“:“:Ž‘ ßúÈjÇxŸÀkŽ‘—¹)]˜]Ž’`®=Ž’r?Æ(OneOfŸÿÿÀnŽ‘ HßÈDAÇŸü-=Ã0Ž›óŽ¹[–þuV[ÇxŸÿÿ½0Ž‘*§¹]“]‘TÈDAÇŸü-=Ã0Ž˜¹[“[ÇxŸÿÿ½1Ž‘*§¹]“]ŽŽ¡’¢É-Ç:–Šª:“:Ž’²3ÑÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[ÇxŸÀkŽ‘—¹]“]‘TÆ)ŽŽ¡’û!IÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[(Çx¹)?]“]Ž’`®=Ž’r?Æ(Maybe›TÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[Çx¹]“]˜Æ)ŽŽ¡’û!IÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[(Çx¹)+]“]Ž’`®=Ž’r?Æ(List1›TÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[Çx¹]“]˜Æ)ŽŽ¡’û!IÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[(Çx¹)ȹ]“]Ž’`®=Ž’r?Æ[›TÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[Çx¹]“]˜Æ]ŽŽ¡’û!IÈDAÇŸü-=Ã0Ž‘óŽ¹[–þuV[Çx¹]“]Ž’`®=Ž’r?ÈC‘f¹[–þuV[Çx¹]“]ŽŽ¦¦’L†ÄName‘ŒÊmanglingŽŽ¦’û!IÈC‘f¹[–þuV[Çm‘TxŸÿÿ½0Ž‘?ûÇxŸÿÿ½1Ž‘Ê¥Ç:–Šª:“:Ž‘5IxŸÀkŽ‘—¹]“]Ž’`®=Ž’r?.–Šª.“.“Åunique–N