I think these problems stem from an incorrect application of the enumerator model of I/O. When using enumerators, a file (or other data source) is a resource that can be enumerated over to process data, just as a list can be enumerated over to access its elements. Compare the following:
foldl f init xs
enumFd "SomeFile" ==<< stream2list
In the first expression, 'xs' is the data to be processed, 'foldl' tells how to access individual items in the data collection, and 'f' and 'init' do the actual processing. In the second, "SomeFile" is the data, 'enumFd' tells how to access the data, and 'stream2list' does the processing. So how does writing fit in? The output file obviously isn't the data source, and it doesn't make sense to enumerate over your output file, as there's no data there to process. So the writing must go within the iteratee. It turns out that making an iteratee to write data is relatively simple:
> import Data.Iteratee
> import System.IO
> import Control.Monad
> -- Depending on library versions, liftIO may need an explicit import
> -- (e.g. from Control.Monad.Trans).
>
> writeOut :: FilePath -> IterateeGM [] Char IO ()
> writeOut file = do
>     h <- liftIO $ openFile file WriteMode
>     loop h
>   where
>     -- Write one element at a time until the stream is exhausted.
>     loop :: Handle -> IterateeGM [] Char IO ()
>     loop h = do
>       next <- Data.Iteratee.head
>       case next of
>         Just c  -> liftIO (hPutChar h c) >> loop h
>         Nothing -> liftIO $ hClose h
Add some error handling and you've got a writer. This version could be polymorphic over different StreamChunk instances by generalizing the type (FlexibleContexts may be required as well). Other stream-specific versions could be written that would take advantage of the specific StreamChunk instance (e.g. using Data.ByteString.hPut instead of hPutChar).
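To make the error-handling point concrete, here is a minimal, self-contained sketch. It does not use the iteratee package at all: `Stream`, `Iter`, `enumChunks`, `run`, `writeTo` and `writeFileIter` are toy names invented for illustration, and the real library's types are richer. The one deliberate change from the writer above is that the driver, not the iteratee, owns the handle, so that `bracket` can close it even if a write throws.

```haskell
import Control.Exception (bracket)
import System.IO

-- Toy stream and iteratee types (assumptions for this sketch,
-- not the iteratee package's definitions).
data Stream = Chunk String | EOF

data Iter a
  = Done a
  | Cont (Stream -> IO (Iter a))

-- Toy enumerator: feed each chunk of a list to the iteratee,
-- stopping early if the iteratee is already done.
enumChunks :: [String] -> Iter a -> IO (Iter a)
enumChunks (c:cs) (Cont k) = k (Chunk c) >>= enumChunks cs
enumChunks _      it       = return it

-- Signal end of stream and extract the result, if there is one.
run :: Iter a -> IO (Maybe a)
run (Done a) = return (Just a)
run (Cont k) = do
  it <- k EOF
  case it of
    Done a -> return (Just a)
    Cont _ -> return Nothing

-- Writer iteratee: writes every chunk to the handle and
-- returns the number of characters written.
writeTo :: Handle -> Iter Int
writeTo h = go 0
  where
    go n = Cont step
      where
        step (Chunk s) = hPutStr h s >> return (go (n + length s))
        step EOF       = return (Done n)

-- Open the file, drive the writer, and close the handle
-- whether or not an exception is raised along the way.
writeFileIter :: FilePath -> [String] -> IO (Maybe Int)
writeFileIter path chunks =
  bracket (openFile path WriteMode) hClose $ \h ->
    enumChunks chunks (writeTo h) >>= run
```

For example, `writeFileIter "out.txt" ["hello, ", "world"]` writes both chunks to the (hypothetical) file `out.txt` and returns `Just 12`. Working on whole chunks rather than single characters is also where a ByteString-specific writer would gain its efficiency.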
I hope this will serve as a very basic introduction to output when using enumerators. In addition to a generic writer like this, it may frequently be beneficial to define special-purpose writers. In a future post I will show a writer that seeks within the output file using a threaded State monad.