This program aligns a collection of transcribed speech files with
their hidden Markov models.  For each speech file, a plain-text
alignment file is produced.
\begin{haskell}{BatchAlign}

> module Main where

\end{haskell}

The module \verb~Printf~ is a library module provided with the
Chalmers and Glasgow Haskell compilers.  It allows C-like printing of
integers, floating point numbers, and strings.
\begin{verbatim}

> import Printf        -- needed if you want to use hbc; comment
>                      -- out this import if you use ghc v0.19
> -- import GhcPrintf  -- needed if you want to use ghc v0.19;
>                      -- comment this import out if you use hbc

\end{verbatim}

The following modules are from a general library and are described in
later chapters in Part~\ref{part:library}.
\begin{verbatim}

> import NativeIO
> import PlainTextIO

\end{verbatim}

The following modules are specifically for training HMMs.  They were
documented in earlier chapters (Part~\ref{part:modules}).
\begin{verbatim}

> import Phones
> import Pronunciations
> import HmmDigraphs
> import HmmDensities
> import Viterbi
> import HmmConstants
>#if __HASKELL1__ >= 3
> import System ( getArgs )
> import Array
> import IO
> type Assoc a b = (a,b)
> (=:) a b = (a,b)
>#endif

#if __HASKELL1__ < 5
#define amap map
#else
#define amap fmap
#endif

\end{verbatim}

In the main expression, the function \verb~build_tmt~
(Section~\ref{sc:tmt}) builds the tied-mixture table and then applies
a continuation.
\begin{verbatim}

>#if __HASKELL1__ < 3
> main = getArgs exit $ \args ->
>        case args of
>        [gms_dir,      -- Gaussian mixtures directory
>         dmap_file,    -- density map file
>         dgs_file,
>         utts_file]    -> readFile dmap_file exit $ \cs0 ->
>                          readFile dgs_file  exit $ \cs1 ->
>                          readFile utts_file exit $ \cs2 ->
>                          let
>
>                            density_map = concat (
>                                          map restructure (
>                                          readElements cs0))
>
>                            hmm_dgs = get_log_probs (
>                                      build_hmm_array (
>                                      readHmms cs1))
>
>                            file_names = lines cs2
>
>                          in
>                            build_tmt gms_dir density_map [] $
>                            \hmm_tms -> align_each_file hmm_tms hmm_dgs
>                                          0 0.0 file_names
>
>        _              -> error usage
>#else
> main = getArgs >>= \args ->
>        case args of
>        [gms_dir,      -- Gaussian mixtures directory
>         dmap_file,    -- density map file
>         dgs_file,
>         utts_file]    -> readFile dmap_file >>= \cs0 ->
>                          readFile dgs_file  >>= \cs1 ->
>                          readFile utts_file >>= \cs2 ->
>                          let
>
>                            density_map = concat (
>                                          map restructure (
>                                          readElements cs0))
>
>                            hmm_dgs = get_log_probs (
>                                      build_hmm_array (
>                                      readHmms cs1))
>
>                            file_names = lines cs2
>
>                          in
>                            build_tmt gms_dir density_map [] >>=
>                            \hmm_tms -> align_each_file hmm_tms hmm_dgs
>                                          0 0.0 file_names
>
>        _              -> error usage
>#endif
> usage = "usage: BatchAlign "
> -- partain: got rid of string gap because of doing -cpp

\end{verbatim}


%----------------------------------------------------------------------
\section {The Density Map}
%----------------------------------------------------------------------

The association list that tells us which HMM states have their own
mixtures and which are tied to other HMM states is stored in a {\em
density map file}.  The name of this file is the second argument on
this program's command line.  Figure~\ref{fg:density-map-file} shows
the first six lines of an example file.  In this example, all HMMs
have three states and all states have their own mixture, indicated by
the data constructor \verb~Mix~.
If the density of any state were {\em tied\/} to that of another state
(Chapter~\ref{ch:HmmDensities}), the constructor \verb~Mix~ would be
replaced by the constructor \verb~TiedM~ along with the ``targeted''
phone and state values.  The targeted HMM state must have its own
mixture; that is, a state must either have its own density or be tied
to a state that does.  Chains of ties are not allowed.  (This program
does not explicitly check for this; it is the responsibility of the
user to make sure the density map file is correctly structured.)
\begin{figure}
\begin{verbatim}
(AA, [1 =: Mix, 2 =: Mix, 3 =: Mix])
(AE, [1 =: Mix, 2 =: Mix, 3 =: Mix])
(AH, [1 =: Mix, 2 =: Mix, 3 =: Mix])
(AO, [1 =: Mix, 2 =: Mix, 3 =: Mix])
(AW, [1 =: Mix, 2 =: Mix, 3 =: Mix])
(AX, [1 =: Mix, 2 =: Mix, 3 =: Mix])
\end{verbatim}
\caption[]{The first six lines of an example density map file.}
\label{fg:density-map-file}
\end{figure}

\begin{haskell}{DensityMap}

> data DensityMap = Mix | TiedM Phone HmmState  deriving (Read, Show)

\end{haskell}

The function \verb~restructure~ restructures the input list, pairing
the phone with each of its state entries.  Hence, the lines shown in
Figure~\ref{fg:density-map-file} would be restructured as shown in
Figure~\ref{fg:restructure}.
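
As a small illustration of this restructuring and of the concatenation
that \verb~main~ applies afterwards, the following standalone sketch
can be run on its own; it is not part of the program.  The cut-down
\verb~Phone~ type, the use of \verb~Int~ for the state index, and the
name \verb~demo~ are assumptions made only for this example, and the
second entry shows one state tied to a state that has its own mixture.
\begin{verbatim}
-- Stand-in types, assumed for illustration only; the real ones
-- come from the modules imported by the program.
data Phone = AA | AE deriving (Eq, Show)
data DensityMap = Mix | TiedM Phone Int deriving (Eq, Show)

type Assoc a b = (a, b)

(=:) :: a -> b -> (a, b)
a =: b = (a, b)

-- Pair the phone with each of its (state, density) entries.
restructure :: (Phone, [Assoc Int DensityMap])
            -> [(Phone, Assoc Int DensityMap)]
restructure (p, is) = [(p, k) | k <- is]

-- Two density map entries, restructured and concatenated.
demo :: [(Phone, Assoc Int DensityMap)]
demo = concat (map restructure
         [ (AA, [1 =: Mix, 2 =: Mix])
         , (AE, [1 =: Mix, 2 =: TiedM AA 2]) ])

main :: IO ()
main = mapM_ print demo
\end{verbatim}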
\begin{figure}
\begin{verbatim}
[(AA, 1 =: Mix), (AA, 2 =: Mix), (AA, 3 =: Mix)]
[(AE, 1 =: Mix), (AE, 2 =: Mix), (AE, 3 =: Mix)]
[(AH, 1 =: Mix), (AH, 2 =: Mix), (AH, 3 =: Mix)]
[(AO, 1 =: Mix), (AO, 2 =: Mix), (AO, 3 =: Mix)]
[(AW, 1 =: Mix), (AW, 2 =: Mix), (AW, 3 =: Mix)]
[(AX, 1 =: Mix), (AX, 2 =: Mix), (AX, 3 =: Mix)]
\end{verbatim}
\caption[]{The results of applying the function {\tt restructure} to
each of the first six lines of the example density map file.}
\label{fg:restructure}
\end{figure}

\begin{haskell}{restructure}

> restructure :: (Phone, [Assoc HmmState DensityMap]) ->
>                [(Phone, Assoc HmmState DensityMap)]
> restructure (p,is) = [(p,k) | k<-is ]

\end{haskell}

These individual lists can be concatenated to form the density map
structure that is used to read a set of Gaussian mixture files.


%----------------------------------------------------------------------
\section {Building the Tied Mixture Table}
\label{sc:tmt}
%----------------------------------------------------------------------

The mixtures are stored in separate files, one mixture per file, all
in the same directory.  The name of that directory is the first
argument on this program's command line.  Each file has a name with
the syntax
\begin{quote}\it
$<$phone$>$.$<$state$>$.gm
\end{quote}
where {\it $<$phone$>$} is one of the legal values of \verb~Phone~ and
{\it $<$state$>$} is the state index.  Keeping the densities in
separate files is natural since the densities are estimated
individually by collecting all the feature vectors that aligned with a
given HMM state (or those that are tied to it) into a single
file\footnote{This redistributing of feature vectors is done using a C
program not described in this report.} and then estimating the density
parameters using a C program (not described in this report).

The function \verb~get_gm_fname~ is used to build the filename of a
file containing Gaussian mixture parameters.
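
For example, the naming scheme can be exercised with the standalone
sketch below, which is not part of the program.  The directory name
\verb~gms~ and the two-constructor \verb~Phone~ type are hypothetical
stand-ins; the body of \verb~get_gm_fname~ repeats the definition used
in this section.
\begin{verbatim}
-- Stand-in for the real Phone type (assumed for illustration).
data Phone = AA | AE deriving Show

-- Build "<dir>/<phone>.<state>.gm" from its three parts.
get_gm_fname :: String -> Phone -> Int -> String
get_gm_fname dir p k = dir ++ "/" ++ shows p ('.' : shows k ".gm")

main :: IO ()
main = putStrLn (get_gm_fname "gms" AA 1)   -- prints gms/AA.1.gm
\end{verbatim}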
\begin{haskell}{get_gm_fname}

> get_gm_fname :: String -> Phone -> Int -> String
> get_gm_fname dir p k = dir ++ "/" ++ shows p ('.' : shows k ".gm")

\end{haskell}

The function \verb~build_tmt~ builds the tied mixture table.  If the
density for HMM $p$, state $k$, is tied to another mixture, then we
just pass that information through.  If, however, the density for HMM
$p$, state $k$, is its own mixture, then we open the appropriate file
and read the mixture parameters, converting them to an internal
representation to improve efficiency (Chapter~\ref{ch:HmmDensities}).
The variable ``\verb~tc~'' in the definition stands for ``tied-mixture
continuation.''
\begin{haskell}{build_tmt}

>#if __HASKELL1__ < 3
> build_tmt :: String ->
>              [(Phone, Assoc HmmState DensityMap)] ->
>              [Assoc Phone (Assoc Int TiedMixture)] ->
>              (TmTable -> Dialogue) ->
>              Dialogue
> build_tmt dir ((p,k:=t):rps) as tc =
>       case t of
>       TiedM q s -> build_tmt dir rps ((p:=(k:=Tie q s)):as) tc
>       Mix       -> let
>                      file = get_gm_fname dir p k
>                    in
>                      readFile file exit $ \bs ->
>                      case readMixture bs of
>                      Nothing -> error (can't_read file)
>                      Just (m,bs') -> if null bs'
>                                      then let
>                                             m' = extern_to_intern m
>                                           in
>                                             build_tmt dir rps
>                                               ((p:=(k:=Gm m')):as) tc
>                                      else error (can't_read file)
> build_tmt _ [] as tc = tc (make_tm_table as)
>#else
> build_tmt :: String ->
>              [(Phone, Assoc HmmState DensityMap)] ->
>              [Assoc Phone (Assoc Int TiedMixture)] ->
>              IO TmTable
> build_tmt dir ((p,(k,t)):rps) as =
>       case t of
>       TiedM q s -> build_tmt dir rps ((p=:(k=:Tie q s)):as)
>       Mix       -> let
>                      file = get_gm_fname dir p k
>                    in
>                      readFile file >>= \bs ->
>                      case readMixture bs of
>                      Nothing -> error (can't_read file)
>                      Just (m,bs') -> if null bs'
>                                      then let
>                                             m' = extern_to_intern m
>                                           in
>                                             build_tmt dir rps
>                                               ((p=:(k=:Gm m')):as)
>                                      else error (can't_read file)
> build_tmt _ [] as = return (make_tm_table as)
>#endif
> can't_read :: String -> String
> can't_read file = " can't read the file " ++ file
> make_tm_table = amap (\as -> array (1, length as) as) .
>                   accumArray (flip (:)) [] phone_bounds

\end{haskell}


%----------------------------------------------------------------------
\section {Performing the Viterbi Alignment}
%----------------------------------------------------------------------

\begin{haskell}{align_each_file}

>#if __HASKELL1__ < 3
> align_each_file :: TmTable ->
>                    HmmNetworkDic ->
>                    Int ->            -- number of files aligned
>                    Float ->          -- cumulative alignment score
>                    StrListCont
> align_each_file hmm_tms hmm_dgs nfs ts (fn:rfns) =
>       readFile (fn ++ ".ppm") exit $ \cs ->
>       readFile (fn ++ ".fea") exit $ \bs ->
>       let
>         Just (pnet,_) = readsPrnNetwork cs
>         hmm = buildHmm hmm_dgs pnet
>         fvs = readVectors observation_dimen bs
>         lts = map (eval_log_densities hmm_tms) fvs
>         (score,states) = align hmm lts
>         nfs' = nfs + 1
>         ts' = ts + score
>       in
>         appendChan stderr (printf "%4d%7.2f%7.2f %s\n" [
>           UInt nfs', UFloat score, UFloat (ts' / fromIntegral nfs'),
>           UString fn ]) exit $
>         writeFile (fn ++ ".algn") (showAlignment states) exit
>         (align_each_file hmm_tms hmm_dgs nfs' ts' rfns)
> align_each_file _ _ _ _ [] = done
>#else
> align_each_file :: TmTable ->
>                    HmmNetworkDic ->
>                    Int ->            -- number of files aligned
>                    Float ->          -- cumulative alignment score
>                    [String] ->       -- file names
>                    IO ()
> align_each_file hmm_tms hmm_dgs nfs ts (fn:rfns) =
>       readFile (fn ++ ".ppm") >>= \cs ->
>       readFile (fn ++ ".fea") >>= \bs ->
>       let
>         Just (pnet,_) = readsPrnNetwork cs
>         hmm = buildHmm hmm_dgs pnet
>         fvs = readVectors observation_dimen bs
>         lts = map (eval_log_densities hmm_tms) fvs
>         (score,states) = align hmm lts
>         nfs' = nfs + 1
>         ts' = ts + score
>       in
>         hPutStr stderr (printf "%4d%7.2f%7.2f %s\n" [
>           UInt nfs', UFloat score, UFloat (ts' / fromIntegral nfs'),
>           UString fn ]) >>
>         writeFile (fn ++ ".algn") (showAlignment states) >>
>         align_each_file hmm_tms hmm_dgs nfs' ts' rfns
> align_each_file _ _ _ _ [] = return ()
>#endif
> showAlignment :: [(Phone, HmmState)] -> String
> showAlignment = unlines . map showIndexedState . zip [0..]
> showIndexedState (i, x) = shows i (' ' : showState x)
> showState (p, j) = show p ++ " " ++ show j

\end{haskell}
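
To make the layout of the alignment files concrete, the standalone
sketch below (not part of the program) applies the same three
formatting functions to a short, made-up state sequence.  The cut-down
\verb~Phone~ type, the \verb~Int~ synonym for \verb~HmmState~, and the
sample sequence are assumptions for illustration only.  Each output
line carries the frame index, the phone, and the state index.
\begin{verbatim}
-- Stand-in types, assumed for illustration only.
data Phone = AA | AE deriving Show
type HmmState = Int

-- One line per frame: "<frame index> <phone> <state>".
showAlignment :: [(Phone, HmmState)] -> String
showAlignment = unlines . map showIndexedState . zip [0 :: Int ..]

showIndexedState :: (Int, (Phone, HmmState)) -> String
showIndexedState (i, x) = shows i (' ' : showState x)

showState :: (Phone, HmmState) -> String
showState (p, j) = show p ++ " " ++ show j

main :: IO ()
main = putStr (showAlignment [(AA, 1), (AA, 1), (AA, 2)])
-- prints:
-- 0 AA 1
-- 1 AA 1
-- 2 AA 2
\end{verbatim}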